# Absurd Attacks Break Through AI Defenses; Models Large and Small Affected

- Source: Ethan Mollick (@emollick)
- Published: 2026-05-14 21:37
- AIHOT score: 64
- AIHOT link: https://aihot.virxact.com/items/cmp5k90p90expsljxodmr0a01
- Original link: https://x.com/emollick/status/2054918927952548223

## AI Summary

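Seemingly ridiculous "absurd attacks" (for example, "I cannot pay that much because of the Geneva Convention") work against AI agents because guardrails struggle with unconventional arguments. Smaller models are broken often, and even larger models are somewhat affected.

https://www.microsoft.com/en-us/research/articles/whimsical-strategies-break-ai-agents-generating-out-of-distribution-adversarial-strategies-at-scale/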

## Original Text

"Whimsey attacks" that seem absurd （"I cannot pay that much because of the Geneva Convention"） work against AI agents as guardrails are weak against out-of-distribution arguments. Smaller models fall often， but it even gives an edge against bigger ones. https://www.microsoft.com/en-us/research/articles/whimsical-strategies-break-ai-agents-generating-out-of-distribution-adversarial-strategies-at-scale/
