Anthropic (@AnthropicAI)

2025-10-10 00:28

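AI summary

Joint research finds that only a small number of malicious documents are needed to implant a security vulnerability in an LLM, regardless of model size or training data volume. This suggests that the barrier to carrying out data-poisoning attacks may be lower than previously thought, and that the real-world threat has been underestimated.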

New research with the UK @AISecurityInst and the @turinginst:

We found that just a few malicious documents can produce vulnerabilities in an LLM, regardless of the size of the model or its training data.

Data-poisoning attacks might be more practical than previously believed.
