# Measuring Model Robustness via Fisher Information: Spectral Bounds, Theoretical Guarantees, and Practical Algorithms

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-03 08:00
- AIHOT score: 49
- AIHOT link: https://aihot.virxact.com/items/cmq5e15ca072hslt27owlc1ve
- Original link: https://arxiv.org/abs/2606.04767

## AI Summary

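The paper proposes an attack-agnostic robustness metric based on the spectral norm of the Fisher Information Matrix (FIM), which quantifies the worst-case sensitivity of a model's output to input perturbations. On the theory side, it proves that the FIM equals the variance of the input Jacobian and derives closed-form spectral bounds for architectures such as VGG, ResNet, DenseNet, and Transformer, giving the first theoretical robustness ranking. It develops efficient algorithms based on power iteration and Hutchinson estimation that support both white-box and black-box settings. Experiments on datasets including CIFAR, ImageNet, and medical images show that the metric correlates strongly with adversarial vulnerability. The code is open source.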

## Full Text

The robustness of deep neural networks is crucial for safety-critical deployments, yet existing evaluation methods are often attack-dependent and lack interpretability. We propose a principled, attack-agnostic robustness metric based on the spectral norm of the Fisher Information Matrix (FIM), which quantifies the worst-case sensitivity of the model's output distribution to input perturbations. Theoretically, we establish that the FIM equals the variance of the input Jacobian and derive closed-form spectral bounds for common architectures, including VGG, ResNet, DenseNet, and Transformer, providing the first theoretical robustness ranking. To enable scalable evaluation, we develop efficient algorithms, including power iteration and Hutchinson-based estimation, that support both white-box and black-box settings. Extensive experiments across multiple datasets, including CIFAR, ImageNet, and medical images, and across multiple architectures show a strong correlation between our metric and adversarial vulnerability. Our framework serves as an interpretable diagnostic tool that complements attack-based evaluations, offering insights into architectural sensitivity and guiding the design of more robust models. Code is available at: https://github.com/franz-chang/SRP/.
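To make the two estimators named in the abstract concrete, here is a minimal sketch of power iteration and Hutchinson estimation applied to an input-space FIM. It is not the authors' SRP code. It assumes a toy linear softmax classifier `p(y|x) = softmax(Wx + b)`, for which the input-space FIM has the exact form `F = W^T (diag(p) - p p^T) W`; all function names, dimensions, and iteration counts below are illustrative choices. Both estimators touch `F` only through matrix-vector products, which is the property that would let them scale to networks where `F` is too large to form.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fim_vector_product(W, p, v):
    """Return F v for F = W^T (diag(p) - p p^T) W without forming F."""
    u = W @ v                      # push the input direction to logit space
    u = p * u - p * (p @ u)        # apply diag(p) - p p^T
    return W.T @ u                 # pull back to input space

def fim_spectral_norm(W, p, iters=500, seed=0):
    """Power iteration: estimate the largest eigenvalue of F."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        Fv = fim_vector_product(W, p, v)
        norm = np.linalg.norm(Fv)
        if norm == 0.0:            # F is the zero matrix along this direction
            return 0.0
        v = Fv / norm
    return float(v @ fim_vector_product(W, p, v))  # Rayleigh quotient

def fim_trace_hutchinson(W, p, samples=2000, seed=0):
    """Hutchinson estimator: tr(F) ~ mean of z^T F z over Rademacher z."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        z = rng.choice([-1.0, 1.0], size=W.shape[1])
        total += z @ fim_vector_product(W, p, z)
    return total / samples

# Hypothetical toy model: 3 classes, 8 input features.
rng = np.random.default_rng(42)
K, d = 3, 8
W = rng.standard_normal((K, d))
b = rng.standard_normal(K)
x = rng.standard_normal(d)
p = softmax(W @ x + b)

lam_max = fim_spectral_norm(W, p)
trace_est = fim_trace_hutchinson(W, p)

# Dense reference, feasible only because the toy problem is tiny.
F_dense = W.T @ (np.diag(p) - np.outer(p, p)) @ W
print("power iteration:", lam_max, "exact:", np.linalg.eigvalsh(F_dense)[-1])
print("Hutchinson trace:", trace_est, "exact:", np.trace(F_dense))
```

For a deep network, `fim_vector_product` would presumably be replaced by Jacobian-vector and vector-Jacobian products from an autodiff framework; that substitution is an assumption here and is not taken from the paper.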
