Rohan Paul@rohanpaul_ai

2026-06-25 03:37 · 8 days ago

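AI summary

GLM-5.2 scored 22.8% on ARC-AGI-2 at a cost of $0.25/task. Notably, around May 2025 the best verified model on ARC-AGI-2 reached only 3.0%. So although it is still far behind GPT-5.5 (85%), GLM-5.2 scores about 7.6x the best frontier score from May 2025, and its per-task cost is about 7.5x lower than GPT-5.5's $1.87.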

GLM-5.2 got 22.8% on ARC-AGI-2, at $0.25/task.

Worth noting: around May 2025, the best verified models on ARC-AGI-2 were only at 3.0%.

So while it is still far behind GPT-5.5 (85%), GLM-5.2 scores about 7.6x the best frontier score from May 2025, and is about 7.5x cheaper per task than GPT-5.5's $1.87 run.
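A quick sanity check of those two ratios, using only the figures quoted in this post. The variable and function names are illustrative, not from any ARC Prize tooling.

```python
# Figures as quoted in the post; names here are hypothetical, for illustration only.
glm_score = 22.8       # GLM-5.2 on ARC-AGI-2, percent
may_2025_best = 3.0    # best verified ARC-AGI-2 score around May 2025, percent
glm_cost = 0.25        # GLM-5.2, USD per task
gpt_cost = 1.87        # GPT-5.5, USD per task


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator rounded to one decimal place."""
    return round(numerator / denominator, 1)


score_multiple = ratio(glm_score, may_2025_best)  # score relative to May 2025 best
cost_multiple = ratio(gpt_cost, glm_cost)         # how many times cheaper per task

print(f"score: {score_multiple}x, cost: {cost_multiple}x")
```

The cost ratio is 7.48 before rounding, which is where the "about 7.5x" comes from.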

ARC Prize: GLM-5.2 from @Zai_org on ARC-AGI (Verified) - ARC-AGI-2: 22.8%, $0.25 - ARC-AGI-1: 77.0%, $0.19. Performance is comparable with GPT-5.4 & 5.5 (Low Reasoning Effo...

Reasoning evaluation / benchmarks


