原文 · 未翻译
The US government may be asking Anthropic the impossible by demanding unhackable LLMs
Government officials appear to be accusing Anthropic of disregarding Trump's cyber directive and releasing Fable 5 without explicit approval. Discussions are ongoing, but the government's accusation of a "jailbreak" mostly exposes its own gaps in knowledge.
"Everybody said Anthropic was a bad actor. Some of us said it was time to give them a chance. Now those people are questioning that. They screwed us." That's how an administration official summed up the conflict between the Trump administration and Anthropic, according to Axios.
As I suspected, government officials are accusing Anthropic of ignoring Trump's recently issued cyber executive order. The executive order called for supposedly voluntary government oversight of AI models. Anthropic welcomed the proposal but released Fable 5 without waiting for the designated clearinghouse, which could have signed off on the release, to be set up.
A government official also accuses Anthropic of knowing a jailbreak could occur. "They came to every fork in the road and took the wrong fork." The tip about this jailbreak, whose existence and severity haven't been confirmed, reportedly came from Amazon and other tech companies.
Government sources also criticized the communication between the two sides to Axios. "It's like they just speak in different languages." The Department of Commerce and Anthropic employees are reportedly in talks, with more meetings planned involving the CIA and science advisor Michael Kratsios.
The accusation that Anthropic knew about the jailbreak risk and stayed silent actually says more about the government's understanding of AI than about Anthropic. Anyone who works closely with AI models knows they can be hacked. OpenAI has warned that prompt injection, a related hacking method, may never be fully solved. There's no fix for LLM security yet.
The real question is how severe the breach is and how fast countermeasures kick in. But if the U.S. government insists frontier AI models must be "unhackable" before they ship internationally, tough talks are ahead. Then again, Anthropic isn't in a strong spot either. CEO Dario Amodei said back in 2023 that "a jailbreak could be life or death" if someone managed to bypass safety protocols in science, tech, and biology.