Chubby♨️@kimmonismus

2026-04-17 16:23 · 76 days ago

Opus 4.7 consumes approximately 1.3 times as many tokens as before. Instructions must be very precise. Many are complaining about a "rushed release." In the Bullshit Benchmark, it performs worse than Opus 4.6. The mood is very mixed.

Anthropic may have done OpenAI a big favor with this. Spud is expected next week. And if the release is done right, it could overshadow Opus and catapult ChatGPT back to the top.

h/t @petergostev for the benchmark and image

Chubby♨️: The mood regarding the Opus 4.7 update has shifted. If I had to guess, I'd say 60% are disappointed with the latest update, while 40% are positive. I'm still un...

Anthropic · OpenAI · Reasoning · Evaluation/Benchmarks
