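AI Summary
Joint research found that just a small number of malicious documents can implant security vulnerabilities in an LLM, regardless of model size or training data volume. This suggests that the barrier to carrying out data-poisoning attacks may be lower than previously thought, and that the real-world threat has been underestimated.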
New research with the UK @AISecurityInst and the @turinginst:
We found that just a few malicious documents can produce vulnerabilities in an LLM, regardless of the size of the model or its training data.
Data-poisoning attacks might be more practical than previously believed.