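AI summary
GTOWizard's test shows that mainstream models including GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, and Grok 4 all lost 5,000-hand heads-up no-limit Texas Hold'em matches against a professional poker AI. The poster jokes that, since the models cannot play poker directly, we should instead test their ability to build AIs that can play poker.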
What we really need is a benchmark where AI models make AI models that play poker.
We benchmarked every major AI model at poker. GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, Grok 4 and more. All played 5,000 hands of heads-up no-limit against our...