原文 · 未翻译
Hackers hijacked high-profile Instagram accounts by simply asking Meta's AI chatbot to change the email
Hackers took over prominent Instagram accounts by asking Meta's AI support chatbot to swap out the email address on file. Two-factor authentication was bypassed entirely. Targets included the Obama White House account, the Chief Master Sergeant of the US Space Force, and cosmetics chain Sephora.
Short, highly coveted usernames also changed hands within minutes and were resold on Telegram. These OG handles, names made up of just a few letters or common words, can fetch six-figure sums on gray markets. Researchers ZachXBT and Dark Web Informer, who track crypto crime and underground markets, documented the fallout publicly. Two of the compromised handles reportedly had a combined market value of over $1 million.
The method was surprisingly simple. Attackers turned on a VPN to place themselves in the target account's geographic region, kicked off a password reset, and then told the AI support assistant to update the email address on the account, promising to send the confirmation code right away. The bot then sent an eight-digit confirmation code to the attacker's email address, followed by a password reset link. Where Meta's automated identity check kicked in, the attackers got around it by running the victim's public Instagram photos through AI video generators, according to The CyberSec Guru. That produced realistic-looking selfie clips that fooled the automated security checks.
A textbook confused deputy attack
The CyberSec Guru calls the incident a textbook example of a well-known problem in IT security called the confused deputy. A helper system holds more privileges than the actual user, and an attacker tricks it into exercising those privileges on their behalf. The AI assistant was allowed to swap email addresses and reset passwords, actions a regular Instagram user can't trigger directly. Anyone who asked the bot nicely got those actions performed without even being logged in first.
At its core, this is a prompt injection with particularly expensive consequences. The language model can't reliably tell the difference between a harmless user request and a malicious instruction, as both are just text. The CyberSec Guru draws a comparison to SQL injection, where inputs also get misread as commands. The difference is that SQL can be locked down with clear rules. A language model has no clean separation between data and instructions.