The latest crop of models remains below 1% on ARC-AGI-3 -- for now.
Where will the scores be by the end of the year?
GPT-5.5 & Opus 4.7 on ARC-AGI-3
- GPT-5.5: 0.43%
- Opus 4.7: 0.18%

We found 3 failure modes:
- True local effect, false world model
- Wrong level of abstraction...