Anthropic 确认 Claude Code 存在问题并承诺实施更严格的质量控制

2026-04-24 18:52·69天前·Maximilian Schreiner

AI 摘要

Anthropic 确认其编程助手 Claude Code 出现质量问题，用户反馈其性能下降。公司已识别并修复了三个独立的错误源。为应对此次问题，Anthropic 承诺未来将执行更严格的质量控制措施，以保障产品输出的稳定性和可靠性。

原文 · 未翻译

Anthropic confirms Claude Code problems and promises stricter quality controls

Key Points

Following user complaints about declining quality in Claude Code, Anthropic identified and fixed three separate bugs affecting reasoning depth, caching, and text length restrictions.

To prevent similar incidents, the company is tightening internal testing before shipping updates. As compensation, Anthropic has reset usage limits for all subscribers.

The problems highlight an industry-wide compute crunch. This bottleneck is increasingly causing outages and forcing AI providers to raise prices for compute-intensive tools.

Users complained about declining quality in Claude Code. Anthropic identified and fixed three separate sources of error. The company promises stricter quality controls going forward.

Over the past month, a growing number of users reported that Anthropic's coding tool Claude Code was producing noticeably worse results. Anthropic has now laid out the causes in a detailed post-mortem: three independent changes to Claude Code, the Claude Agent SDK, and Claude Cowork combined to create a widely felt quality drop. The API itself was not affected, according to Anthropic. All three issues have been fixed as of April 20 with version 2.1.116.

Lower reasoning effort, caching bugs, and prompt restrictions caused the problems

The first issue dates back to March 4. Anthropic lowered the default reasoning effort from "high" to "medium" because some users were experiencing extreme latency in high mode. Internal testing had shown that medium mode delivered only slightly worse results on most tasks while significantly reducing latency. The trade-off didn't pay off: users quickly reported that Claude Code felt less intelligent. On April 7, Anthropic permanently rolled back the change.

The second problem was a bug in a caching optimization shipped on March 26. The plan was to delete older reasoning sections once after an hour of inactivity to reduce latency when resuming a session. A coding error caused the reasoning history to be wiped on every subsequent turn instead.

Claude progressively lost context about its own decisions. Users noticed forgetfulness, repetition, and strange tool choices. On top of that, the resulting cache misses burned through usage limits faster than expected. According to Anthropic, the bug slipped through reviews undetected and wasn't fixed until April 10.

A third issue appeared on April 16: a system prompt instruction meant to curb the well-known verbosity of Opus 4.7. The line read: "Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail." Later testing with a broader eval suite revealed a 3 percent quality drop. Anthropic rolled back the change on April 20.

The Decoder：AI News（RSS）

49导出 Markdown