Researchers Say Claude Flaws Could Be Chained to Silently Exfiltrate User Data

Researchers at Oasis Security say they found a three-part exploit chain involving Claude features and related claude.com infrastructure that could silently extract sensitive data from a user’s conversation history. The attack abused trusted platform features rather than malware, phishing emails, or rogue tools. The researchers, who dubbed the attack “Claudy Day,” say the chain is notable not just for what it could expose, but because it worked against a default Claude session and relied on behavior users would have little reason to distrust.

According to Oasis, the attack starts by steering a user to what appears to be a legitimate Claude-related link, then uses a pre-filled prompt URL to slip hidden instructions into the session. From there, Claude can be manipulated into gathering accessible conversation data and sending it out through Anthropic’s own Files API, turning normal platform functionality into an exfiltration path. Oasis argues that the paths through which models receive instructions, and the permissions they inherit, now form a real security boundary rather than just a product design concern.

How the Exploit Worked

Oasis says the chain began with a convenience feature. Claude lets users open a new chat with a prompt already loaded through a ?q= URL parameter, but the researchers found they could hide instructions inside that prefilled text in ways the victim would not see before hitting Enter. Once submitted, those hidden instructions were processed along with the visible prompt.

From there, the attack moved into exfiltration. Oasis says Claude’s code execution sandbox blocks general outbound network traffic but still permits connections to Anthropic infrastructure. According to the report, it made it possible to instruct Claude to pull sensitive material from the user’s conversation history, write it to a file, and upload it through the Anthropic Files API to an attacker-controlled account. The upload occurred through Anthropic’s own infrastructure.

To get the victim there in the first place, Oasis says an open redirect on claude.com could help mask the delivery path. The researchers say a claude.com URL could bounce the user to an arbitrary destination, which in turn made it easier to present the attack as a trusted Claude link rather than an obvious fake. The redirect was only one part of the chain, but it helped make the attack harder to spot.

Why the Attack Was Difficult to Detect

What makes the chain especially troubling is how little of it would have looked suspicious from the user’s side. Oasis says the visible prompt could appear harmless even as hidden instructions rode along underneath it, and the follow-on activity did not depend on some shady external server or obvious malware. The traffic moved through legitimate Anthropic infrastructure, using real platform features and normal-looking API behavior. Oasis says users would have had little visible indication that anything was wrong.

“The Claudy Day attack chain highlights a new reality: the prompt itself is now an attack surface,” said Saumitra Das, vice president of engineering at Qualys. “There’s no malware or compromised infrastructure involved; it is just carefully crafted instructions delivered to a model that trusts them by default. By combining hidden prompt injection, a legitimate API endpoint, and a clean domain redirect, the attack looks like normal platform traffic at every step.”

What Data Was At Risk

In the most basic scenario, Oasis says the attack could reach a user’s conversation history, stored memory, and any sensitive exchanges available within that Claude session. Even that narrower exposure could be significant. Users often turn to chatbots to summarize notes, discuss work documents, troubleshoot code, and handle other sensitive personal or business matters.

The broader risk grows when Claude has access to more than the chat window. In setups with MCP servers, tools, or enterprise integrations enabled, the injected prompt could also read files, send messages, access APIs, and interact with connected services, Oasis says. That wider exposure is the researchers’ warning, not a claim that every Claude deployment was equally exposed. But it points to the direction this threat model is heading as AI assistants gain more access across workplace environments.

Why This Matters Beyond Claude

The bigger issue is not limited to Claude. Oasis’s findings point to a broader trust problem across AI systems: a convenience feature that appears harmless, an instruction path the product treats as trustworthy, and an agent with enough access to take meaningful actions. Put those together, and the result can become a security issue. The danger is not just a single bug, but the way small design decisions can stack into an attack path when models are given more reach.

“These findings continue to support the growing industry sentiment that while agentic assistants, such as Claude, can be a boon to productivity and velocity, they also represent a significant business risk through unintended breaching of what’s being called the ‘Lethal Trifecta,’” said Andrew Bolster, senior R&D manager for security and AI strategy at Black Duck. “That’s where agents are exposed to untrusted content— in this case, the URL parameter injection—access to private data, and the ability to externally communicate.”

Response and Remediation

Anthropic said it fixed the prompt-injection issue after Oasis disclosed the findings, while the open redirect and Files API issues were still being addressed when the report was published.

For organizations using Claude with MCP servers or other integrations, the more immediate lesson is to require explicit user confirmation before allowing an agent to use tools on an initial prompt and to review the permissions attached to agent sessions the way they would audit IAM or service-account access.

For platform builders, the findings point to a wider design problem that is unlikely to be unique to Claude. Security teams may need to take a harder look at how AI systems handle URL parameters, validate redirects, and expose internal APIs inside execution environments, especially when those systems are trusted to act on a user’s behalf.