原文 · 未翻译
Anthropic confirms Claude Code problems and promises stricter quality controls
Key Points
Following user complaints about declining quality in Claude Code, Anthropic identified and fixed three separate bugs affecting reasoning depth, caching, and text length restrictions.
To prevent similar incidents, the company is tightening internal testing before shipping updates. As compensation, Anthropic has reset usage limits for all subscribers.
The problems highlight an industry-wide compute crunch. This bottleneck is increasingly causing outages and forcing AI providers to raise prices for compute-intensive tools.
Users complained about declining quality in Claude Code. Anthropic identified and fixed three separate sources of error. The company promises stricter quality controls going forward.
Over the past month, a growing number of users reported that Anthropic's coding tool Claude Code was producing noticeably worse results. Anthropic has now laid out the causes in a detailed post-mortem: three independent changes to Claude Code, the Claude Agent SDK, and Claude Cowork combined to create a widely felt quality drop. The API itself was not affected, according to Anthropic. All three issues have been fixed as of April 20 with version 2.1.116.
Lower reasoning effort, caching bugs, and prompt restrictions caused the problems
The first issue dates back to March 4. Anthropic lowered the default reasoning effort from "high" to "medium" because some users were experiencing extreme latency in high mode. Internal testing had shown that medium mode delivered only slightly worse results on most tasks while significantly reducing latency. The trade-off didn't pay off: users quickly reported that Claude Code felt less intelligent. On April 7, Anthropic permanently rolled back the change.
The second problem was a bug in a caching optimization shipped on March 26. The plan was to delete older reasoning sections once after an hour of inactivity to reduce latency when resuming a session. A coding error caused the reasoning history to be wiped on every subsequent turn instead.
Claude progressively lost context about its own decisions. Users noticed forgetfulness, repetition, and strange tool choices. On top of that, the resulting cache misses burned through usage limits faster than expected. According to Anthropic, the bug slipped through reviews undetected and wasn't fixed until April 10.
A third issue appeared on April 16: a system prompt instruction meant to curb the well-known verbosity of Opus 4.7. The line read: "Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail." Later testing with a broader eval suite revealed a 3 percent quality drop. Anthropic rolled back the change on April 20.