Anthropic says a China-linked hacking group exploited its Claude AI model to conduct cyberattacks that required minimal human involvement. The firm says the incident signals a new phase in automated hacking.
According to Anthropic, the attackers breached several of the 30 institutions they targeted, manipulating Claude Code into performing harmful tasks under the guise of legitimate security testing.
The system reportedly executed 80–90% of the attack independently. Anthropic described this as unprecedented, claiming it is the first documented case of an AI-led cyber operation carried out at scale.
Claude's performance was inconsistent, however. The model fabricated findings, misunderstood its targets, and mistook publicly available information for private data.
Experts are split on how to interpret the incident. Some view it as a serious warning about AI capabilities, while others say Anthropic's framing overstates what is essentially automated code generation.
Anthropic Reports Major Automation in Chinese Cyber Operation Using Its AI Tool