Chubby♨️@kimmonismus

2026-06-13 03:58 · 20 days ago

Looking at the graph, I think Fable 5 will only maintain its lead up to GPT-5.6.

And secondly, I think the benchmark will soon be completely saturated.

Epoch AI: Claude Fable 5 scores very well on FrontierMath: Tiers 1-4 (v2), reaching 87% on Tiers 1-3 and 88% on Tier 4. This continues a streak of Anthropic models improv...

Anthropic OpenAI Reasoning Evaluation/Benchmarks
