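An ICML 2026 paper shows that the performance of long-context large language models does not decline linearly as more wrong information is added; instead it exhibits a "first drop of ink" effect. The study finds that once just 10% of the context consists of hard, incorrect text, the damage is already largely done. For example, in a 128K-token Qwen2.5 setup, that first 10% of incorrect text caused a 58% performance loss. The mechanism is that softmax attention assigns excessive weight to text that is close to the question but wrong, so this 10% of hard distractor text alone contributes about 97% of the distraction pressure. As a result, the gains from filtering documents may come mainly from shortening the effective context rather than from removing "bad content".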
A long-context AI can be poisoned by a few plausible wrong passages, not gradually worn down by many.
At just 10% bad context, the damage is already almost done.
This is the "first drop of ink" effect: a single drop of ink is enough to contaminate the water.
The mistake is to picture context as storage.
In a long prompt, the model is not calmly filing facts into separate boxes; it is running a competition over which pieces of text deserve attention when the answer is generated.
Hard distractors are dangerous because they are not random junk.
They are close enough to the question to look useful, but wrong enough to pull the model away from the gold evidence.
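The competition described above can be illustrated with a toy softmax calculation. This is a minimal sketch, not the paper's setup: the function names (`softmax`, `attention_shares`) and all relevance scores (5.0 for the gold passage, 4.0 for hard distractors, 0.0 for easy ones) are made-up assumptions chosen only to show the shape of the effect, and the numbers it produces are not the paper's results.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_shares(n_hard, n_easy, gold_score=5.0, hard_score=4.0, easy_score=0.0):
    """Return the attention mass on (gold, hard distractors, easy distractors).

    All scores are hypothetical query-to-passage similarities:
    hard distractors score close to the gold passage, easy ones score low.
    """
    scores = [gold_score] + [hard_score] * n_hard + [easy_score] * n_easy
    weights = softmax(scores)
    gold = weights[0]
    hard = sum(weights[1:1 + n_hard])
    easy = sum(weights[1 + n_hard:])
    return gold, hard, easy


if __name__ == "__main__":
    # 100 distractor passages in both cases; only the mix changes.
    for n_hard in (0, 10):
        gold, hard, easy = attention_shares(n_hard, 100 - n_hard)
        print(f"hard={n_hard:3d}  gold={gold:.3f}  hard_mass={hard:.3f}  easy_mass={easy:.3f}")
```

In this toy setting, swapping only 10 of the 100 distractors for near-miss passages pulls most of the attention mass away from the gold passage, and those 10 passages hold far more of the distractor mass than the other 90 combined. That happens because softmax exponentiates the scores, so a passage that scores close to the gold evidence competes with it directly, while low-scoring junk barely registers.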