DogeDesigner @cb_doge · X

2026-04-12 23:39 · 81 days ago

AI summary

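Anthropic's Claude Opus is slipping. The latest benchmarks show its accuracy fell from 83.3% to 68.3% in just a few days, which the post describes as a sharp rise in its hallucination rate during coding. Grok 4.20 still holds first place, unsurpassed. https://t.co/FA5nbKKeS0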

Anthropic's Claude Opus is FALLING.

Latest benchmarks show its accuracy dropped from 83.3% → 68.3% in just days.

That's a major spike in hallucinations during coding.

Grok 4.20 still holds the #1 spot. Undefeated.
