Stravoris

Your Coding Agent Is Sending Your Code Somewhere

28 March 2026
By Krishna Gandhi Mohan | Stravoris
Research Brief

Executive Summary

Agentic coding tools — Claude Code, GitHub Copilot, Cursor, OpenCode, and their peers — have become fixtures in professional software development. Over 82% of developers now use AI coding assistants [4], and these tools generate substantial portions of production logic. But a fundamental question remains unanswered in most engineering organizations: where does the code actually go?

The answer, for the vast majority of tools, is cloud-hosted inference endpoints operated by third parties. When a developer pastes a proprietary algorithm into a Claude Code prompt and asks it to debug, that algorithm is transmitted to Anthropic's servers. GitHub Copilot sends code snippets to GitHub/Microsoft's AI servers [11]. Even tools marketed as "open source" or "privacy-first" often ship with telemetry enabled by default and route inference through cloud endpoints [6]. The distinction between "open-source tool" and "private tool" is not what most developers assume.

This creates a live intellectual property exposure problem for enterprises with regulated or sensitive codebases. The scale of the gap between tool adoption and organizational governance is stark: 45% of developers admit to using unsanctioned code assistants at work [4], yet only 15% of organizations have updated their acceptable use policies to address AI tools [5]. Shadow AI incidents now account for 20% of all data breaches and cost more on average: $4.63 million versus $3.96 million for standard breaches [5].

The tool landscape offers three distinct architectural tiers — Fully Local (zero egress), Self-Hosted Cloud (private infrastructure), and Managed Cloud (vendor endpoints with contractual protections) — but most organizations have not mapped their codebase sensitivity to these tiers. This research brief provides that mapping, analyzes the security postures of the major agentic coding tools, and offers a decision framework for engineering leaders who need to ship a policy before the next compliance audit asks for one.

Evidence Base & Methodology

Search Approach

This research was conducted on 28 March 2026 using web searches across seven research angles: recent developments, industry data, counterarguments, case studies, technical analysis, vendor landscape, and historical context. Three seed URLs from the originating idea file were fetched directly. Additional targeted fetches were performed on the most data-rich pages identified through search results.

Sources Consulted

Sixteen primary sources were consulted, spanning vendor documentation (Anthropic, GitHub, Microsoft), independent security research (MintMCP, Bright Security, DryRun Security), industry surveys (Stack Overflow, Gartner, IBM, ISACA, Microsoft), developer community analysis (Hacker News, Reddit discussions), and technical guides (Graphite, Knostic, Checkmarx). Evidence spans from mid-2025 through March 2026.

Notable Gaps

DataCamp's comparison of OpenCode vs. Claude Code (a key seed source) returned a 403 access error and could not be fetched directly. Claims attributed to that comparison are drawn from the idea file and corroborated against other sources. Anthropic's detailed data retention timelines for Claude Code specifically (as distinct from the broader Claude platform) are not publicly documented. Cursor's enterprise data handling terms beyond "Privacy Mode" are sparse in public documentation.

The Data Flow Problem: Where Your Code Actually Goes

The Default State Is Cloud Transmission

Every mainstream agentic coding tool, in its default configuration, sends code context to cloud-hosted AI inference endpoints. This is not a bug — it is the fundamental architecture. Large language models require substantial compute resources that exceed what developer laptops can provide, and commercial tools route to their provider's infrastructure accordingly.

What varies is the degree of transparency and control each tool gives the organization, as the following sections show.

The "Open Source" Misconception

A recurring pattern in community discussions: developers equate "open source" with "private by default." The OpenCode controversy in early 2026 crystallized this. Despite being marketed as an open-source local coding environment, OpenCode routes inference traffic through cloud endpoints in its standard configuration. Community members discovered telemetry enabled by default, which raised questions about what data was being collected beyond the inference calls themselves [6].

The confusion is understandable. "Open source" describes the license on the client code — it says nothing about where the inference happens. A developer running OpenCode with the default Anthropic or OpenAI provider configuration is sending their code to the same cloud endpoints as a Claude Code or Copilot user. The difference is that OpenCode also supports fully local inference via Ollama — but the developer must explicitly configure it.
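The check is mechanical: what determines privacy is the endpoint the client is configured to call, not the license on the client. A minimal audit-helper sketch (a hypothetical function, not part of any of these tools) that classifies a configured provider base URL as on-device or egressing, using Ollama's default local endpoint as the on-device case:

```python
from urllib.parse import urlparse
import ipaddress

def is_local_endpoint(base_url: str) -> bool:
    """Return True if the inference endpoint resolves to the local machine.

    A client pointed at Ollama's default endpoint (http://localhost:11434)
    keeps code on-device; anything else implies network egress.
    """
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name such as api.anthropic.com: assume egress

# Ollama's default local endpoint vs. a typical cloud endpoint
print(is_local_endpoint("http://localhost:11434/v1"))     # True
print(is_local_endpoint("https://api.anthropic.com/v1"))  # False
```

A helper like this could run over a team's tool configurations to flag which providers actually leave the perimeter.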

Exfiltration Vectors Beyond Inference

Cloud inference is the most obvious data flow, but not the only one. Documented exfiltration vectors also include prompt injection attacks, malicious IDE extensions, and environment variable leakage.

The Tool Landscape: Security Postures Compared

Documented Vulnerabilities

Each major tool has accumulated a vulnerability record through 2025–2026. The following table summarizes the most significant documented CVEs.

| Tool | CVE | CVSS | Description |
| --- | --- | --- | --- |
| GitHub Copilot | CamoLeak | 9.6 | Silent exfiltration of private repo code via invisible prompt injection |
| Claude Code | CVE-2025-59536 | 8.7 | Remote code execution via malicious project configuration files |
| Cursor | CVE-2025-54135 | 8.6 | Untrusted MCP remote code execution |
| GitHub Copilot | CVE-2025-62449 | 6.8 | Path traversal enabling unauthorized file access |
| Claude Code | CVE-2026-21852 | 5.3 | API key exfiltration through environment variables |
| Cursor | CVE-2025-59944 | Critical | Case-sensitivity bypass enabling persistent RCE via .CURSOR/mcp.json |
| Claude Code | CVE-2026-25725 | — | Sandbox bypass enabling unauthorized file system access |
| GitHub Copilot | CVE-2025-62453 | 5.0 | Improper AI output validation |

No tool is immune. The presence of high-severity vulnerabilities across all three major tools demonstrates that security posture is not a differentiator between specific vendors — it is a category-level risk that requires organizational controls regardless of which tool is selected.

Data Handling and Privacy Policies

The following table compares the privacy postures of the major agentic coding tools across key enterprise dimensions.

| Dimension | GitHub Copilot (Enterprise) | Claude Code (Commercial) | Cursor (Privacy Mode) | OpenCode (Air-gapped) |
| --- | --- | --- | --- | --- |
| Code sent to cloud | Yes (Microsoft/GitHub servers) | Yes (Anthropic servers) | Yes (model provider servers) | No (Ollama local inference) |
| Training on code | No (Business/Enterprise tiers) | No (Commercial terms only) | No (Privacy Mode) | No (local models) |
| Data retention | Prompts: not retained; engagement data: 2 years | Zero-retention available (enterprise API) | Zero-retention (Privacy Mode) | None (fully local) |
| IP indemnification | Yes (Enterprise tier) | Not publicly documented | Not publicly documented | N/A (open source) |
| SSO/SAML | Yes (native) | API-level implementation | Limited | N/A |
| DPA available | Yes (Microsoft DPA) | Yes (auto-incorporated in Commercial Terms) | Not publicly documented | N/A |
| Audit logging | Organization-level controls | API usage logging | Limited | Local logs only |
| Enterprise pricing | $39/user/month | Usage-based API pricing | $40/user/month (Business) | Free (infra costs only) |

The Consumer-Enterprise Privacy Gap

A critical distinction that many developers miss: the privacy guarantees of a tool depend on the tier, not the tool. Anthropic's September 2025 terms update made this explicit — Anthropic trains on all data from Free, Pro, and Max plans, including when those accounts use Claude Code [8]. Only Commercial Terms (Claude for Work, API, Bedrock) prohibit training. Similarly, GitHub Copilot's free tier may use interactions for model improvement, while Business and Enterprise tiers explicitly exclude customer code from training [3].

This means a developer using Claude Code on a personal Pro subscription to debug their employer's code has just sent that code to Anthropic under terms that permit training. The tool is the same. The data handling is not.

The Shadow AI Crisis: Developers Operating Without Policy

The Scale of Unsanctioned Use

The gap between AI tool adoption and organizational governance is one of the most consequential findings in this research. Multiple independent surveys converge on a consistent picture: 45–52% of developers use unsanctioned coding assistants at work [4], and shadow AI usage is growing 120% year over year [3]. Meanwhile, organizational governance lags far behind: only 15% of organizations have updated their acceptable use policies to address AI tools [5].

The Financial Impact

Shadow AI is not an abstract governance concern. IBM's 2025 Cost of a Data Breach Report found that shadow AI incidents now account for 20% of all breaches and cost more on average: $4.63 million versus $3.96 million for standard breaches [5]. Gartner projects that 1 in 4 compliance audits in 2026 will include specific inquiries into AI governance [5].

The risk compounds in coding-specific scenarios. Research cited by MintMCP found that secret leakage rates run roughly 40% higher in repositories using Copilot — 6.4% versus a 4.6% baseline [7]. Separately, studies show that up to 40% of AI-generated code suggestions may introduce security vulnerabilities [11]. When developers use these tools without organizational oversight, both the data exposure risk and the code quality risk operate unchecked.

Why Developers Use Unapproved Tools

The Anthropic-OpenCode controversy in January 2026 illustrates the dynamic. When Anthropic blocked Claude Code subscriptions from being used through third-party tools, users who had been paying $100–$200/month for Claude Max lost the ability to use that subscription with OpenCode. Rails creator DHH called it "very hostile to users," Hacker News erupted, and OpenCode gained 18,000 GitHub stars in two weeks [6]. Developers will find and adopt the tool that works for them, regardless of whether their organization has approved it.

This is the core dynamic that organizational policy must account for: restricting tools without providing approved alternatives does not reduce usage — it pushes usage underground where it becomes invisible to security teams.

The Three-Tier Framework: Matching Sensitivity to Architecture

Tier Definitions

The agentic coding tool landscape maps to three distinct architectural tiers, each with different data flow characteristics, cost profiles, and capability trade-offs.

| Tier | Architecture | Data Egress | Examples | Trade-offs |
| --- | --- | --- | --- | --- |
| Tier 1: Fully Local | Local model, local inference, zero cloud dependency | None | Ollama + Continue.dev; OpenCode (air-gapped); Tabby; FauxPilot | Lower model capability; requires local GPU; no vendor support |
| Tier 2: Self-Hosted Cloud | Model runs on the organization's own cloud infrastructure | Within org perimeter only | AWS Bedrock (private endpoint); Azure OpenAI (private); self-hosted vLLM | Higher infra cost; operational burden; model updates lag |
| Tier 3: Managed Cloud | Vendor-hosted inference with contractual protections | To vendor (covered by DPA/commercial terms) | GitHub Copilot Enterprise; Claude Code (Commercial); Cursor Business | Best model quality; vendor lock-in; data leaves perimeter |

The Capability Gap Is Closing

A common objection to Tier 1 (Fully Local) is that local models cannot match cloud-hosted frontier models. That was true in 2024, but the gap is narrowing. Current benchmarks show DeepSeek Coder V2 and Codestral achieving completion accuracy within 5–10% of Copilot on standard benchmarks [9]; DeepSeek Coder specifically reaches 94% accuracy on completion tasks versus Copilot's 89% [9]. For code completion and simple refactoring tasks, local models are now viable. For complex agentic workflows — multi-file reasoning, architectural decisions, long-context debugging — frontier cloud models retain a significant advantage.

Decision Matrix: Codebase Sensitivity to Tool Tier

The following matrix maps codebase sensitivity categories to the minimum acceptable tool tier.

| Codebase Category | Examples | Minimum Tier | Rationale |
| --- | --- | --- | --- |
| Regulated / classified data | HIPAA-covered code; ITAR/EAR-controlled code; financial trading algorithms | Tier 1 only | Regulatory frameworks prohibit data egress; no DPA suffices |
| Trade secrets / core IP | Proprietary algorithms; pre-patent code; competitive differentiators | Tier 1 or Tier 2 | IP exposure risk is too high for third-party infrastructure; self-hosted is acceptable if the perimeter is controlled |
| Internal tooling / non-regulated | Internal dashboards; build scripts; DevOps automation | Tier 2 or Tier 3 | Low IP value; DPA-covered managed cloud is acceptable |
| Open-source / pre-IP-protected | Open-source contributions; published libraries; public documentation | Any tier | Code is already public or intended to be; no data sensitivity concern |
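A mapping like this is straightforward to encode so that tooling-inventory reviews or CI checks can enforce it. A minimal policy-as-code sketch, where the category keys and the enforcement function are illustrative assumptions rather than an existing tool — a lower tier number means less data egress, so any tier at or below a category's ceiling is acceptable:

```python
# Policy-as-code encoding of the sensitivity-to-tier decision matrix.
MAX_ALLOWED_TIER = {
    "regulated": 1,       # fully local only
    "trade_secret": 2,    # local or self-hosted cloud
    "internal": 3,        # up to DPA-covered managed cloud
    "public": 3,          # any tier
}

def tool_allowed(codebase_category: str, tool_tier: int) -> bool:
    """Check whether a tool's architectural tier is acceptable for a codebase."""
    return tool_tier <= MAX_ALLOWED_TIER[codebase_category]

print(tool_allowed("regulated", 3))  # False: managed cloud on regulated code
print(tool_allowed("internal", 3))   # True: DPA-covered managed cloud is fine
```

Encoding the matrix this way also forces the organization to name its codebase categories explicitly, which is half the policy work.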

Implementation Considerations

Two practical realities complicate this framework:

First, developers context-switch constantly. A developer working on a regulated codebase in the morning and an internal tool in the afternoon would need to switch between Tier 1 and Tier 3 tools — or maintain parallel tool configurations. This friction pushes developers toward a single tool for everything, which usually means the most capable (and least private) option.

Second, the "commercial tier" requirement is easily violated. A developer using Claude Code with a personal Pro subscription is on consumer terms where training is permitted. The same developer using Claude Code through their company's API account is on commercial terms where training is prohibited [8]. The tool is identical. The terms are not. Organizations must enforce which accounts are used, not just which tools.

Key Assumptions & Uncertainties

What the Evidence Does Not Resolve

Where Expert Opinion Diverges

Assumptions in This Analysis

Strategic Implications & Actionable Insights

1. Ship a policy before the audit asks for one. Gartner projects that 1 in 4 compliance audits in 2026 will include AI governance inquiries [5]. The absence of a written policy for AI coding tool usage is now an audit finding, not a backlog item. A one-page policy that maps codebase categories to approved tool tiers is a minimum viable response.

2. Enforce account tiers, not just tool choice. The privacy guarantees of Claude Code, Copilot, and Cursor are tier-dependent. A developer using Claude Code on a personal Pro account exposes code to training; the same tool on a commercial API account does not [8]. Organizations must provide and enforce commercial-tier accounts, not just approve tool names.

3. Provide approved alternatives or accept shadow AI. With 45–52% of developers using unsanctioned tools [4] and shadow AI growing 120% year over year [3], prohibition without provision is not a viable strategy. Organizations must provide an approved tool at each tier their developers need, or accept that developers will find their own.

4. Treat "open source" as a licensing descriptor, not a privacy guarantee. OpenCode, the most popular open-source agentic coding tool, routes to cloud endpoints by default [6]. "Open source" means the client code is inspectable — it says nothing about where inference happens. Evaluate data flow architecture, not license type.

5. Invest in Tier 1 capability for your most sensitive code. Local models have closed the gap to within 5–10% of cloud models for code completion [9]. For regulated or trade-secret code, the minor capability trade-off is far less costly than the IP exposure risk. Continue.dev + Ollama or Tabby provides a zero-egress development environment that is production-viable today.

6. Monitor for exfiltration vectors beyond inference. Prompt injection attacks, malicious extensions (900,000 installs across 20,000+ enterprise tenants [4]), and environment variable leakage represent attack surfaces that exist regardless of which tier or tool is selected. Content exclusion rules, extension allowlists, and secret scanning are complementary controls.
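Secret scanning in particular can be applied to prompts and diffs before they leave the machine. A minimal illustrative sketch — the patterns below are deliberately simplified, and production deployments should rely on a maintained scanner such as gitleaks or GitGuardian rather than hand-rolled regexes:

```python
import re

# Simplified secret patterns for illustration only.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_for_secrets(text: str) -> list[str]:
    """Return the names of secret patterns found in a prompt or diff."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(text)]

prompt = 'debug this: api_key = "sk-live-0123456789abcdef0123"'
print(scan_for_secrets(prompt))  # ['generic_api_key']
```

A pre-send hook of this shape catches the leakage pattern the MintMCP data describes before the prompt reaches any inference endpoint, local or cloud.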

7. Require DPA and no-training guarantees for any Tier 3 deployment. If managed cloud tools are approved for non-regulated code, require at minimum: a signed DPA, explicit no-training guarantees, defined data retention windows, and SOC 2 Type II or ISO 27001 certification from the vendor [11]. GitHub Copilot Enterprise and Anthropic's Commercial Terms meet these criteria; free tiers and consumer subscriptions do not.

Suggested Content Angles

  1. "Your Copilot Subscription Tier Is Your Privacy Policy" — The consumer-vs-commercial privacy gap is the most underreported finding. Most developers don't know that using Claude Code on a Pro account permits training on their employer's code. A short, punchy piece naming the exact tier boundaries for each tool would get significant engagement.
  2. "The One-Page AI Coding Tool Policy Every Engineering Team Needs" — Provide a literal template: codebase categories mapped to approved tiers, with fill-in-the-blank sections for tool names and account requirements. Highly shareable, directly actionable.
  3. "Open Source ≠ Private: The OpenCode Data Flow Developers Miss" — The misconception that open-source tools don't send data to the cloud is widespread. Walk through exactly what happens when you run OpenCode with default settings vs. air-gapped mode. Technical, specific, and contrarian enough to drive discussion.
  4. "Shadow AI Is a $4.6M Problem — And Your Developers Are the Ones Making the Decision" — Frame the IBM breach data alongside the Stack Overflow survey showing 45% unsanctioned usage. The angle: individual developers are making data routing decisions that should be organizational policy.
  5. "Local AI for Code Is Closer Than You Think" — DeepSeek Coder hitting 94% vs. Copilot's 89% on completions is a compelling data point. Piece walks through setting up Continue.dev + Ollama as a zero-egress alternative, with honest assessment of where it falls short (agentic tasks, long context).

References

  1. Anthropic Engineering, "Claude Code: Sandboxing," Anthropic, 2026. https://www.anthropic.com/engineering/claude-code-sandboxing. Accessed 28 March 2026.
  2. DryRun Security, "Top AI SAST Tools 2026," DryRun Security Blog, 2026. https://www.dryrun.security/blog/top-ai-sast-tools-2026. Accessed 28 March 2026.
  3. Microsoft Community Hub, "Demystifying GitHub Copilot Security Controls: Easing Concerns for Organizational Adoption," Microsoft, 2025. https://techcommunity.microsoft.com/blog/azuredevcommunityblog/demystifying-github-copilot-security-controls-easing-concerns-for-organizational/4468193. Accessed 28 March 2026.
  4. IT Pro, "Shadow AI Is Creeping Its Way Into Software Development," IT Pro, 2025–2026. https://www.itpro.com/software/development/shadow-ai-is-creeping-its-way-into-software-development. Accessed 28 March 2026.
  5. JumpCloud, "11 Stats About Shadow AI in 2026," JumpCloud Blog, 2026. https://jumpcloud.com/blog/11-stats-about-shadow-ai-in-2026. Accessed 28 March 2026.
  6. InfoQ, "OpenCode: An Open-source AI Coding Agent Competing with Claude Code and Copilot," InfoQ News, February 2026. https://www.infoq.com/news/2026/02/opencode-coding-agent/. Accessed 28 March 2026.
  7. MintMCP, "Claude Code vs Cursor vs Copilot: 2026 Security Comparison," MintMCP Blog, 2026. https://www.mintmcp.com/blog/claude-code-cursor-vs-copilot. Accessed 28 March 2026.
  8. Anthropic Privacy Center, "I Have a Zero Data Retention Agreement with Anthropic. What Products Does It Apply To?" Anthropic, 2026. https://privacy.claude.com/en/articles/8956058. Accessed 28 March 2026.
  9. Local AI Master, "Best Local AI for Coding 2026: 10 Models Tested & Ranked," Local AI Master, 2026. https://localaimaster.com/blog/best-local-ai-models-programming. Accessed 28 March 2026.
  10. Microsoft Security Blog, "Malicious AI Assistant Extensions Harvest LLM Chat Histories," Microsoft, 5 March 2026. https://www.microsoft.com/en-us/security/blog/2026/03/05/malicious-ai-assistant-extensions-harvest-llm-chat-histories/. Accessed 28 March 2026.
  11. Graphite, "Privacy and Security Considerations When Using AI Coding Tools," Graphite Guides, 2026. https://graphite.com/guides/privacy-security-ai-coding-tools. Accessed 28 March 2026.
  12. Anthropic Privacy Center, "How Do I View and Sign Your Data Processing Addendum (DPA)?" Anthropic, 2026. https://privacy.claude.com/en/articles/7996862. Accessed 28 March 2026.
  13. AMST Legal, "Anthropic's Claude AI Updates — Impact on Privacy & Confidentiality," AMST Legal, 2025. https://amstlegal.com/anthropics-claude-ai-updated-terms-explained/. Accessed 28 March 2026.
  14. DEV Community, "Self-Host Your AI Code Assistant With Continue.dev + Ollama," DEV Community, 2026. https://dev.to/signal-weekly/self-host-your-ai-code-assistant-with-continuedev-ollama-vs-code-copilot-without-the-3ofe. Accessed 28 March 2026.
  15. Hacker News, "OpenCode — Open Source AI Coding Agent," Y Combinator, 2026. https://news.ycombinator.com/item?id=47460525. Accessed 28 March 2026.
  16. GitGuardian Blog, "GitHub Copilot Privacy: Key Risks and Secure Usage Best Practices," GitGuardian, 2025. https://blog.gitguardian.com/github-copilot-security-and-privacy/. Accessed 28 March 2026.

Author: Krishna Gandhi Mohan

Web: stravoris.com

LinkedIn: linkedin.com/in/krishnagmohan

This research brief is part of the AI Tactical Playbook series by Stravoris.