Chubby♨️@kimmonismus

2026-07-01 05:39 · 2 days ago

AI Summary

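Claude Sonnet 5 scores 53 on the Artificial Analysis Intelligence Index, within 2-3 points of GPT-5.5 (xhigh) and Opus 4.8 (max). At standard pricing ($3/$15 per 1M tokens), cost per task is $2.29, about 2x Sonnet 4.6 and about 15% more than Opus 4.8. It trails Opus 4.8 on reasoning and knowledge-intensive benchmarks (e.g. only 17% on CritPt physics reasoning), but matches or exceeds Opus 4.8 on agentic knowledge work (AA-Briefcase and GDPval-AA). The context window is 1 million tokens, and Anthropic is offering promotional pricing of $2/$10 until September 1. A new xhigh effort setting has been added. Overall the result is disappointing and not a good release.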
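The per-task numbers in the summary above can be cross-checked with a bit of arithmetic. The sketch below is only a back-of-the-envelope reading of the quoted figures, not Artificial Analysis's methodology: it assumes "about 15% more" and "about 2x" are exact ratios, that token usage per task is unchanged under promo pricing, and that no caching or other discounts apply. The function names are made up for illustration.

```python
# Figures quoted in the summary (USD).
SONNET5_COST_PER_TASK = 2.29   # at standard pricing
STANDARD_PRICE = (3.0, 15.0)   # (input, output) per 1M tokens
PROMO_PRICE = (2.0, 10.0)      # promotional (input, output) per 1M tokens


def implied_baseline(cost: float, relative_increase: float) -> float:
    """Baseline cost if `cost` is `relative_increase` higher than the baseline.

    Assumption: the quoted "about X%" is treated as an exact ratio.
    """
    return cost / (1.0 + relative_increase)


def promo_scale(standard: tuple, promo: tuple) -> float:
    """Promo/standard price ratio.

    When input and output prices shrink by the same factor, per-task cost
    shrinks by that factor too, whatever the input/output token mix is.
    """
    ratio_in = promo[0] / standard[0]
    ratio_out = promo[1] / standard[1]
    if abs(ratio_in - ratio_out) > 1e-9:
        raise ValueError("input and output prices scale differently")
    return ratio_in


# Implied per-task costs for the comparison models (estimates, not quoted data).
opus48_cost = implied_baseline(SONNET5_COST_PER_TASK, 0.15)   # "about 15% more"
sonnet46_cost = implied_baseline(SONNET5_COST_PER_TASK, 1.0)  # "about 2x"

# Sonnet 5 per-task cost at the $2/$10 promo price, same token usage assumed.
promo_cost = SONNET5_COST_PER_TASK * promo_scale(STANDARD_PRICE, PROMO_PRICE)

print(f"Implied Opus 4.8 cost per task:   ${opus48_cost:.2f}")
print(f"Implied Sonnet 4.6 cost per task: ${sonnet46_cost:.2f}")
print(f"Sonnet 5 at promo pricing:        ${promo_cost:.2f}")
```

Under these assumptions Opus 4.8 comes out at roughly $1.99 per task and Sonnet 5 at promo pricing at roughly $1.53, which fits the claim that Sonnet 5 only costs more per task than Opus 4.8 once the promotion ends.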

tl;dr: Sonnet 5 is cheaper per token, but more expensive per solved problem - and still lags behind Opus 4.8 in overall intelligence.

That's honestly disappointing and not a good release.

Artificial Analysis: Claude Sonnet 5 achieves 53 on the Artificial Analysis Intelligence Index, but without promotional pricing will cost more per task than Opus 4.8. We supported @A...

Anthropic · Reasoning · Model release · Evaluation/Benchmarks
