# Reducing Hallucinations in Diffusion Models via Score Control

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-29 08:00
- AIHOT score: 52
- AIHOT link: https://aihot.virxact.com/items/cmpyw41ob03vusli37kti01q3
- Original link: https://arxiv.org/abs/2606.00377

## AI Summary

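Diffusion models can produce hallucinated samples that fall outside the true data distribution. Taking a density-based perspective, the authors empirically confirm the previously proposed hypothesis that score smoothness causes these hallucinations, and they formally link the hallucination probability mass to the Lipschitz constant of the learned score function. They propose Variance-Guided Score Modulation (VSM), which controls the score Jacobian to reduce score smoothness and better approximate the ground-truth score. On synthetic and real-world datasets it reduces hallucinations by up to about 25% while maintaining high fidelity and diversity. The paper also introduces two benchmark datasets with extreme semantic variation for systematic hallucination evaluation; code and datasets are publicly available.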

## Full Text

Diffusion models have emerged as the backbone of modern generative AI, powering advances in vision, language, audio, and other modalities. Despite their success, they suffer from hallucinations: implausible samples that lie outside the support of the true data distribution and that degrade reliability and trust. In this work, we first empirically confirm the previously proposed hypothesis that score smoothness causes hallucinations in image-generation diffusion models, and we provide a density-based perspective. We further formalize this notion by linking the hallucination probability mass to the Lipschitz constant of the learned score function. Motivated by this, we introduce a Variance-Guided Score Modulation (VSM) strategy that controls the score Jacobian, in turn reducing score smoothness and better approximating the ground-truth score, which decreases hallucinations. Empirical results on synthetic and real-world datasets demonstrate that our approach reduces hallucinations (by up to ~25%) while maintaining high fidelity and diversity, providing a principled step toward more reliable diffusion-based image generation. We also propose two benchmark datasets with extreme semantic variation for systematic hallucination evaluation. Code and datasets are publicly available at https://github.com/bhosalems/VSM.
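
To make "the Lipschitz constant of the score function" concrete, here is a minimal toy sketch. It is not the paper's VSM method and is not taken from the linked repository. It assumes a 1D mixture of two equal-weight Gaussians at `-mu` and `+mu`, for which the exact score is `(mu * tanh(mu * x / sigma**2) - x) / sigma**2`, and it estimates the Lipschitz constant by finite differences on a grid. The function names `mixture_score` and `lipschitz_estimate` and all parameter values are hypothetical choices for illustration. Widening `sigma` stands in for a smoother score: it flattens the sharp transition between the two modes, which is the regime the abstract associates with samples landing between modes.

```python
import math


def mixture_score(x, mu, sigma):
    """Exact score d/dx log p(x) for p = 0.5*N(-mu, sigma^2) + 0.5*N(+mu, sigma^2)."""
    return (mu * math.tanh(mu * x / sigma**2) - x) / sigma**2


def lipschitz_estimate(score_fn, lo, hi, n):
    """Finite-difference lower bound on the Lipschitz constant of score_fn on [lo, hi].

    Evaluates score_fn on n evenly spaced points and returns the largest
    absolute slope between neighbouring grid points.
    """
    dx = (hi - lo) / (n - 1)
    values = [score_fn(lo + i * dx) for i in range(n)]
    return max(abs(values[i + 1] - values[i]) / dx for i in range(n - 1))


if __name__ == "__main__":
    mu = 2.0  # assumed mode location (modes at -2 and +2)

    # Sharp score: narrow modes, steep transition around x = 0.
    sharp = lipschitz_estimate(lambda x: mixture_score(x, mu, 0.25), -4.0, 4.0, 8001)

    # Smoother score: same modes with a wider sigma, used here as a stand-in
    # for an over-smoothed score estimate (an assumption of this sketch).
    smooth = lipschitz_estimate(lambda x: mixture_score(x, mu, 0.5), -4.0, 4.0, 8001)

    # Analytic slope at the midpoint x = 0 is (mu^2 / sigma^2 - 1) / sigma^2.
    print(f"sharp score:  L ~ {sharp:.1f}")
    print(f"smooth score: L ~ {smooth:.1f}")
```

In this toy setting the steepest slope of the score sits at the midpoint between the modes, so a score estimate with a much smaller Lipschitz constant cannot reproduce that transition. For a learned, high-dimensional score network the analogous quantity would be a norm of the score Jacobian, which would need automatic differentiation rather than a 1D grid.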
