François Chollet (@fchollet)

2026-05-02 05:37 · 62 days ago

The latest crop of models remains below 1% on ARC-AGI-3 -- for now.

Where will the scores be by the end of the year?

ARC Prize: GPT-5.5 & Opus 4.7 on ARC-AGI-3
- GPT-5.5: 0.43%
- Opus 4.7: 0.18%

We found 3 failure modes:
- True local effect, false world model
- Wrong level of abstraction...

Anthropic, OpenAI, Reasoning, Evaluation/Benchmarks
