# When Background Matters: Breaking Medical Vision-Language Models with Transferable Attacks

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-04-19 08:00
- AIHOT link: https://aihot.virxact.com/items/cmo933pac00f2sls28elm2cgb
- Original link: https://arxiv.org/abs/2604.17318

## AI Summary

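Researchers propose MedFocusLeak, a highly transferable black-box multimodal attack that injects coordinated perturbations into non-diagnostic background regions and uses an attention distraction mechanism, causing medical vision-language models to produce incorrect yet clinically plausible diagnoses. In tests across six medical imaging modalities, the method achieves state-of-the-art attack success rates while keeping the perturbations imperceptible. The study also introduces a unified evaluation framework and new metrics, revealing a critical weakness in the reasoning capabilities of modern clinical vision-language models.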

## Main Text

Vision-Language Models (VLMs) are increasingly used in clinical diagnostics, yet their robustness to adversarial attacks remains largely unexplored, posing serious risks. Existing medical attacks focus on secondary objectives such as model stealing or adversarial fine-tuning, while transferable attacks from natural images introduce visible distortions that clinicians can easily detect. To address this, we propose MedFocusLeak, a highly transferable black-box multimodal attack that induces incorrect yet clinically plausible diagnoses while keeping perturbations imperceptible. The method injects coordinated perturbations into non-diagnostic background regions and employs an attention distraction mechanism to shift the model's focus away from pathological areas. Extensive evaluations across six medical imaging modalities show that MedFocusLeak achieves state-of-the-art performance, generating misleading yet realistic diagnostic outputs across diverse VLMs. We further introduce a unified evaluation framework with novel metrics that jointly capture attack success and image fidelity, revealing a critical weakness in the reasoning capabilities of modern clinical VLMs.
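
The abstract states that perturbations are confined to non-diagnostic background regions and kept imperceptible, but it gives no implementation details. The sketch below illustrates only that constraint, under stated assumptions: a boolean `roi_mask` marking the diagnostic region, an L-infinity budget `epsilon`, and PSNR as a stand-in fidelity measure. All three are hypothetical choices for illustration and are not taken from the paper. The attention distraction mechanism and the optimization that would produce `delta` are not shown.

```python
import numpy as np


def project_background_perturbation(image, delta, roi_mask, epsilon):
    """Confine a perturbation to background pixels under an L-infinity budget.

    image:    float array with values in [0, 1]
    delta:    candidate perturbation, same shape as image
    roi_mask: boolean array, True inside the diagnostic region (assumed input)
    epsilon:  maximum absolute change allowed per pixel (assumed budget)
    """
    background = ~roi_mask
    # Zero the perturbation wherever the diagnostic region lies.
    delta = np.where(background, delta, 0.0)
    # Enforce the per-pixel budget.
    delta = np.clip(delta, -epsilon, epsilon)
    # Keep the result inside the valid intensity range.
    return np.clip(image + delta, 0.0, 1.0)


def psnr(reference, perturbed, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = np.mean((reference - perturbed) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Toy usage with synthetic data (no real medical image is involved).
rng = np.random.default_rng(0)
image = rng.random((64, 64))
roi_mask = np.zeros((64, 64), dtype=bool)
roi_mask[24:40, 24:40] = True            # pretend the lesion sits in the centre
delta = rng.normal(scale=0.1, size=image.shape)
epsilon = 8.0 / 255.0

perturbed = project_background_perturbation(image, delta, roi_mask, epsilon)
print("ROI unchanged:", np.array_equal(perturbed[roi_mask], image[roi_mask]))
print("max |change|:", float(np.max(np.abs(perturbed - image))))
print("PSNR (dB):", psnr(image, perturbed))
```

Separating the mask and budget projection from whatever produces `delta` also makes the constraint easy to check on its own, which is useful when evaluating a defence against this class of perturbation.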
