# ArtifactNet：基于物理伪影提取的AI音乐检测方法

- 来源：HuggingFace Daily Papers（社区热门论文）
- 发布时间：2026-04-17 08:00
- AIHOT 链接：https://aihot.virxact.com/items/cmo6mad9l04tnsl4rykvq21xk
- 原文链接：https://arxiv.org/abs/2604.16254

## AI 摘要

研究团队提出轻量级框架ArtifactNet，通过提取神经音频编解码器遗留的物理伪影识别AI音乐。该框架采用3.6M参数UNet提取残差并分解为7通道特征，经0.4M参数CNN分类，总参数量仅4.0M。配套发布含6,183首曲目的ArtifactBench基准（涵盖22个AI生成器）。在2,263首测试集上，该方法取得F1=0.9829、FPR=1.49%，远超CLAM等方法，参数量仅为其1/49。多格式增强训练使跨编解码器概率漂移降低83%。

## 正文

We present ArtifactNet, a lightweight framework that detects AI-generated music by reframing the problem as forensic physics -- extracting and analyzing the physical artifacts that neural audio codecs inevitably imprint on generated audio. A bounded-mask UNet (ArtifactUNet, 3.6M parameters) extracts codec residuals from magnitude spectrograms, which are then decomposed via HPSS into 7-channel forensic features for classification by a compact CNN (0.4M parameters; 4.0M total). We introduce ArtifactBench, a multi-generator evaluation benchmark comprising 6,183 tracks (4,383 AI from 22 generators and 1,800 real from 6 diverse sources). Each track is tagged with bench_origin for fair zero-shot evaluation. On the unseen test partition (n=2,263), ArtifactNet achieves F1 = 0.9829 with FPR = 1.49%, compared to CLAM (F1 = 0.7576, FPR = 69.26%) and SpecTTTra (F1 = 0.7713, FPR = 19.43%) evaluated under identical conditions with published checkpoints. Codec-aware training (4-way WAV/MP3/AAC/Opus augmentation) further reduces cross-codec probability drift by 83% (Delta = 0.95 -> 0.16), resolving the primary codec-invariance failure mode. These results establish forensic physics -- direct extraction of codec-level artifacts -- as a more generalizable and parameter-efficient paradigm for AI music detection than representation learning, using 49x fewer parameters than CLAM and 4.8x fewer than SpecTTTra.
