Agentjacking: AI Coding Assistants Fall Prey to Exploits Through Error Monitoring Services

On June 17, the Threat Labs team at Tenet Security, a startup specializing in AI-agent security that has recently emerged from stealth, detailed a novel attack vector they have termed "agentjacking." This sophisticated exploit targets AI coding agents by leveraging seemingly innocuous error reports within Sentry, a widely adopted error-monitoring service. The attack circumvents traditional security measures by exploiting the inherent trust AI agents place in data provided through established integration protocols, turning them into unwitting execution engines for malicious commands without the need for malware or compromised credentials.

The core of the agentjacking attack lies in its deceptive simplicity. Attackers exploit the way AI coding agents, often integrated with external services via protocols like the Model Context Protocol (MCP), interpret data. These agents are designed to treat information received from integrated services as authoritative guidance. In this scenario, a forged error report injected into Sentry can be mistaken by an AI agent for a genuine issue requiring a fix, leading it to execute commands embedded within the fabricated report. This method is analogous to a building contractor receiving a forged work order that directs them to perform a specific, potentially harmful, task, trusting the system implicitly without questioning the source of the request.

The Vulnerability: Sentry DSNs and AI Agent Trust

At the heart of this vulnerability is the design of Sentry’s Data Source Name (DSN). A DSN is a write-only credential intended to allow applications to report errors without exposing sensitive project details. Sentry explicitly documents DSNs as safe to embed directly into frontend JavaScript code, as they are designed to be public and only require the DSN itself for authentication at the ingestion endpoint. This design choice, while effective for human-operated error monitoring, creates a critical opening when AI agents process the same data.

When an AI agent, operating under the assumption of data integrity from its integrated services, encounters a Sentry error report containing a crafted malicious command disguised as a resolution, it lacks the inherent human discernment to differentiate between legitimate data and an instruction to execute. The agent processes the data provided through the MCP as guidance, regardless of its origin or context. This fundamental limitation of current AI models, rather than a configuration error, is the primary enabler of agentjacking. The combination of a publicly accessible DSN, which allows attackers to write to the data stream, and an AI agent’s inherent trust in its connected services, forms a potent threat.

The Attack Chain: A Step-by-Step Deception

The agentjacking attack unfolds through a meticulously orchestrated series of seemingly ordinary steps, making it difficult to detect at any individual stage.

1. Locating the DSN

The initial phase involves the attacker identifying a target organization’s Sentry DSN. Sentry’s guidance on embedding DSNs in frontend JavaScript means they are often exposed in production websites. Attackers can discover these DSNs through various methods, including targeted queries on services like Censys, which scans the internet for connected devices and services, or by searching public code repositories like GitHub.

2. Crafting and Submitting a Malicious Event

Once a DSN is obtained, the attacker can submit a crafted error event to Sentry’s ingestion endpoint. This process requires no authentication beyond the DSN itself. The attacker has full control over the event’s payload, including the error message, associated tags, context keys, and the crucial stack trace. Sentry, receiving the event with a valid DSN, processes it as a genuine crash report, returning an HTTP 200 status code and filing the fabricated event alongside legitimate errors.

3. Disguising Malicious Commands as Resolutions

The critical element of the crafted event is the use of markdown within its message and context fields. When an AI agent retrieves this event via the Sentry MCP, the markdown is rendered. This rendering can create headings, code blocks, and a fabricated "resolution" section that mimics Sentry’s own templates. Hidden within this deceptive resolution is the attacker’s command, often presented as a snippet to be executed.

4. Manipulating the AI Agent

A developer, seeking to resolve outstanding issues, might instruct their AI coding agent to fix the Sentry errors. This is a common daily task for thousands of development teams. The AI agent, receiving the request, retrieves the injected event through the MCP. It then interprets the fake resolution as trusted guidance, steering its actions towards executing the embedded command rather than analyzing or modifying the application’s source code.

5. Executing the Command

With the AI agent now primed to act, it executes the attacker’s command using the developer’s privileges on their local machine. In Tenet Security’s controlled tests, the injected payload was designed to appear as a security scanning tool, adhering to responsible disclosure principles. This masked execution allowed the demonstration to remain within ethical boundaries while proving the exploit’s viability.

6. Compromising Sensitive Data

Upon execution, the malicious package within the command can access and exfiltrate sensitive information. This includes environment variables, cloud configuration files, and credential stores. Tenet’s tests confirmed that the compromised agent could access AWS keys, GitHub tokens, and Git credentials, effectively establishing a foothold for the attacker. The compromised agent then signals a Tenet-controlled server, confirming the successful exposure of these secrets.

Tenet Security’s Findings: A Widespread Threat

Tenet Security’s research indicates that agentjacking represents a significant and widespread threat. The researchers focused their validation efforts on scenarios mimicking a developer clearing a backlog of Sentry issues, a common practice, particularly at the end of a work week. Their findings suggest a substantial number of organizations are vulnerable.

A public Sentry key is all it takes to hijack Claude Code, Cursor, and Codex

Tenet identified 2,388 organizations with injectable DSNs through passive reconnaissance alone. Crucially, 71 of these organizations rank among the Tranco top 1 million busiest websites, indicating a broad reach and potential impact. The researchers posit that similar vulnerabilities likely exist in thousands of other projects that were not included in their testing.

The exploit was successfully demonstrated across multiple AI coding agents, including Claude Code, Cursor, and Codex. Tenet logged over 100 confirmed executions across various organizations during their controlled validation waves, with an reported 85% success rate. Ron Bobrov, a Tenet researcher, highlighted the effectiveness of the exploit.

The implications of this vulnerability extend to large enterprises. Tenet confirmed successful execution on a machine belonging to a developer within a Fortune 100 technology company, valued at approximately $250 billion. The attack also proved effective against AI agents running in sandboxed CI/CD pipelines, within Windows Subsystem for Linux (WSL) environments on managed machines, and behind corporate VPNs, impacting both macOS and Windows operating systems.

In one notable instance, an environment running Claude Code held a live AWS secret access key. Furthermore, this compromised environment contained identifiers for other connected agents, suggesting that a single foothold could lead to the compromise of multiple systems. The captured environment was current, dating back to early June 2026, indicating the exploit is not reliant on outdated systems. Within an enterprise setting, the compromised agent can grant attackers access to a wealth of sensitive data, including CI/CD credentials, private repository URLs, and cloud infrastructure tokens – precisely the assets that platform teams work diligently to protect.

The Elusive Nature of the Attack: Why Traditional Security Fails

The efficacy of agentjacking stems from its ability to bypass conventional security controls. Each step in the attack chain is authorized and appears legitimate: the attacker does not directly access the victim’s infrastructure, the developer does not knowingly approve malicious code, and the AI agent executes precisely the task it was instructed to perform. Tenet labels this phenomenon the "Authorized Intent Chain." Consequently, standard security measures such as Endpoint Detection and Response (EDR) solutions, Web Application Firewalls (WAFs), Identity and Access Management (IAM) systems, VPNs, and firewalls fail to flag any suspicious activity.

Even prompt-layer defenses, designed to guard against prompt injection attacks, proved ineffective. The AI agents proceeded to execute the injected payload even when system prompts and defined skills instructed them to disregard untrusted data. This highlights a fundamental limitation in how current AI models process tool output, rather than a configurable setting that can be easily adjusted.

The Responsibility Gap: Sentry, Model Vendors, and Runtime Security

Addressing the agentjacking vulnerability presents a complex challenge, as responsibility for the fix is distributed across multiple entities: Sentry, the vendors of the AI models, and the runtime environment where the AI agent operates.

Sentry’s response to Tenet’s disclosure, made on June 3, was to acknowledge the issue but decline to implement a source-level fix. The company characterized the attack as "technically not defensible" at their end and suggested that middleware implemented by AI model vendors should be the primary defense. While Sentry did introduce a global content filter for the specific string used in Tenet’s proof-of-concept, this measure only addresses the immediate payload and does not close the underlying vulnerability.

This stance places the onus of mitigation squarely on the runtime layer – the environment surrounding the AI agent where actions are ultimately decided. Given Sentry’s decision to maintain its open endpoint as a feature and the current limitations of AI models in reliably refusing malicious instructions, neither Sentry nor the model vendors can unilaterally resolve the issue. The ongoing debate centers on who ultimately owns the responsibility for implementing a robust fix, a resolution that will have significant implications for development teams.

Beyond Sentry: A Broader Exposure

The vulnerability demonstrated by Tenet is not exclusive to Sentry. Any integration using the Model Context Protocol (MCP) that returns externally influenced data to an AI agent carries a similar risk. As more tools and services integrate with AI agents through MCP, the attack surface that can be exploited via trusted telemetry will inevitably expand. The prompt injection threats that security professionals have warned about for years now have a clear and demonstrable pathway from a publicly accessible credential to arbitrary code execution.

If development teams continue to integrate AI agents with external services without implementing robust controls to inspect the data returned by those services, malicious data will persistently find its way to execution. To address this, Tenet Security has open-sourced a set of configurations called "agent-jackstop." These configurations are designed to harden AI agents like Cursor and Claude Code against this class of injection, providing development teams with a practical starting point for mitigation while the broader industry grapples with the larger implications.

The inherent design of AI agents that can process and execute tasks at remarkable speed means they will execute whatever instructions are provided by a trusted tool. This underscores the critical importance of the runtime environment as the next significant frontier in software supply chain security. Just as enterprises meticulously vet third-party libraries before integrating them into their systems, treating every MCP integration with the same level of scrutiny is essential to prevent agentjacking from weaponizing an organization’s own telemetry against it.