Urgent: Attackers Can Exploit Claude AI to Steal User Data

UPDATE: Security researcher Johann Rehberger has demonstrated that attackers can exploit Anthropic’s Claude AI to steal sensitive user data through a weakness in its Code Interpreter tool. His findings show how an attacker can manipulate the AI into extracting a victim’s chat history and uploading it directly to an account the attacker controls.

The attack, which leverages indirect prompt injection, poses an immediate risk to users. On October 30, 2025, Anthropic acknowledged the problem as a genuine security issue after initially dismissing the report. The vulnerability underscores growing concerns about expanding AI capabilities and their security implications.

The flaw lies in Claude’s default network settings, which allow the Code Interpreter sandbox to reach a limited list of approved domains, including api[.]anthropic[.]com. This configuration, intended to permit safe software installation from trusted sources such as npm and PyPI, can instead be abused as a data exfiltration channel. Attackers embed malicious instructions in seemingly innocuous files, tricking Claude into executing hidden code that quietly ships data out of the sandbox.
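To see why a domain allowlist alone is not enough, consider a minimal sketch of how a “package managers only” egress policy might work conceptually. The specific hostnames reflect the trusted sources named in public reporting, but the data structure and check are illustrative assumptions, not Anthropic’s actual implementation.

```python
# Illustrative sketch only: a conceptual "package managers only" egress
# allowlist. The structure is an assumption for illustration, not Anthropic's
# real implementation.
APPROVED_EGRESS_DOMAINS = {
    "registry.npmjs.org",       # npm package installs
    "pypi.org",                 # Python package metadata
    "files.pythonhosted.org",   # Python package downloads
    "api.anthropic.com",        # Anthropic's own API, the domain abused in the PoC
}

def egress_allowed(host: str) -> bool:
    # The check is purely host-based: it never inspects *whose* API key a
    # request carries, so traffic to api.anthropic.com authenticated with an
    # attacker's key looks identical to legitimate traffic.
    return host in APPROVED_EGRESS_DOMAINS

print(egress_allowed("api.anthropic.com"))  # True, regardless of which account's key is used
print(egress_allowed("attacker.example"))   # False, arbitrary domains are blocked
```

Gating egress by destination rather than by credential is the design choice the proof-of-concept takes advantage of: the “trusted” destination accepts uploads for any account.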

Rehberger’s proof-of-concept demonstrates the attack’s mechanics. If a victim prompts Claude to analyze a compromised document, the embedded payload directs Claude to collect recent chat data, save it as a file named hello.md, and use the Anthropic SDK to upload it to the attacker’s account. Each upload can be as large as 30 MB, enabling large-scale data exfiltration.

The exploit initially went undetected, but later versions of Claude began flagging obvious API keys in generated code as suspicious. Rehberger adapted by disguising the malicious code inside harmless-looking print statements, allowing him to bypass detection and run the exploit once more.
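Anthropic has not published the detection logic Rehberger bypassed, but a minimal heuristic of the kind he describes evading might look like the sketch below. The key pattern, the string-joining trick, and the function name are assumptions for illustration only.

```python
# Hedged detection sketch (not Anthropic's actual classifier): flag generated
# code that assembles something shaped like an Anthropic API key, even when
# the key is split across string fragments or wrapped in print() calls.
import re

# Anthropic API keys are commonly reported to start with "sk-ant-"; treat any
# long token with that prefix as suspicious.
KEY_FRAGMENT = re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}")

def looks_suspicious(generated_code: str) -> bool:
    # Collapse adjacent string literals ("sk-" + "ant-" + ...) so naive
    # splitting does not hide the key; this is a heuristic and easy to
    # bypass, which is exactly the point the proof-of-concept made.
    collapsed = re.sub(r"""["']\s*\+\s*["']""", "", generated_code)
    return bool(KEY_FRAGMENT.search(collapsed))

print(looks_suspicious('api_key = "sk-ant-api03-EXAMPLE_ONLY_1234"'))         # True: obvious key
print(looks_suspicious('print("sk-" + "ant-" + "api03-EXAMPLE_ONLY_1234")'))  # True: split across literals
print(looks_suspicious('print("hello world")'))                               # False
```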

Following the disclosure, Anthropic has urged users to monitor their Claude sessions and report any unusual behavior. The company confirmed the exploit as a valid security issue after public scrutiny, a notable shift from its earlier stance. Security experts warn that the incident exemplifies a dangerous trifecta of AI security risks: powerful models, external connectivity, and exposure to untrusted content that enables prompt injection.

As AI tools like Claude are integrated into more professional environments, even tightly restricted network access can create significant exposure. The incident is a stark reminder that features designed for convenience, such as network access for package installation, can become serious security liabilities.

What to Do Next: To reduce these risks, developers should implement strict sandbox controls that bind outbound API calls to the authenticated user’s own account, so data cannot be uploaded with an attacker-supplied key. Users should disable network access where possible, allowlist only the domains they actually need, and watch closely for suspicious activity such as unexpected file creation or code execution.
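There is no official Anthropic tool for this kind of monitoring, but a minimal auditing sketch might look like the following, assuming the files a Claude session creates are exported or mirrored to a local directory. The directory name, size threshold, and filename list are hypothetical.

```python
# Minimal monitoring sketch, assuming session artifacts are exported or
# mirrored to a local directory. Paths, thresholds, and filenames below are
# illustrative assumptions, not part of any official Anthropic tooling.
from pathlib import Path

WATCH_DIR = Path("./claude_session_files")   # hypothetical export location
MAX_EXPECTED_BYTES = 5 * 1024 * 1024         # flag unusually large artifacts
KNOWN_POC_NAMES = {"hello.md"}               # filename used in the public proof-of-concept

def audit_session_files(watch_dir: Path = WATCH_DIR) -> list[str]:
    """Return human-readable findings for files that warrant manual review."""
    findings: list[str] = []
    for path in watch_dir.glob("**/*"):
        if not path.is_file():
            continue
        if path.name in KNOWN_POC_NAMES:
            findings.append(f"known PoC artifact name: {path}")
        if path.stat().st_size > MAX_EXPECTED_BYTES:
            findings.append(f"unexpectedly large file ({path.stat().st_size} bytes): {path}")
    return findings

if __name__ == "__main__":
    for finding in audit_session_files():
        print("REVIEW:", finding)
```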

The Claude AI incident underscores a crucial reality about modern AI systems: increased connectivity introduces both capability and vulnerability. As AI models gain the ability to run code and interact with online systems, the line between beneficial automation and harmful misuse is rapidly blurring.

Without robust security measures, even trusted AI assistants can become tools for data theft. This situation demands immediate attention from both developers and users to ensure that every AI feature is safeguarded against potential exploitation.