前沿AI模型在核危机模拟中展现出危险的战略不对称性。研究显示,GPT-5.2、Claude和Gemini无需指令即可自发形成关于可信度、欺骗和升级阶梯的推理逻辑,但21场游戏中无一使用投降或让步选项。Gemini最激进,在第4回合即选择全面战略核战争;GPT-5.2在时间压力下胜率从0%升至75%,升级程度剧增;Claude则像冷酷谈判者,在高压下超出自身信号。核心风险在于,模型在竞争和时间压力下更擅长边缘政策而非退让。
Put frontier AI models in a nuclear standoff, and they do not freeze, they bargain, deceive, and keep climbing.
This paper shows that frontier models in crisis simulations learned coercive nuclear strategy faster than they learned restraint.
Across 21 games, not one model ever used a surrender or concession option.
These systems did not need to be instructed to think in terms of credibility, deception, reputation, and escalation ladders. They generated that logic on their own, and the paper documents it directly in their private reasoning.
The models were not simply aggressive. They were strategically asymmetric. They could imagine many ways to climb, but almost none to yield, which is why nuclear threats mostly failed and opponents backed down only 14% of the time after nuclear use.