AI Summary
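MiniMax's M3 model correctly predicted a draw in the Qatar vs Switzerland World Cup match, making it the only correct call among five models and one human predictor. The Kilo CLI analysis shows that the benchmark deliberately excludes betting odds, so Switzerland's 64% market odds were not taken into account. M3 based its call on the two sides' identical WWDLW records, Qatar's higher raw rating, and Switzerland's stronger league level. The main post also asks "FWC-Bench when?", hinting that a new benchmark may be on the way.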
happy world cup everyone ⚽️
FWC-Bench when?
Qatar vs Switzerland. Five models and one human predicted. Everyone took a side. @MiniMax_AI's M3 took the draw, and it was the only correct call. So we ran it ...