[policy] 5 min · Apr 2, 2026

GitHub Copilot — Your Private Repo Sessions Are Training Data Now

GitHub defaults Copilot Free, Pro, and Pro+ users into AI model training from April 24, 2026. Here is what gets collected, who is exempt, and how to opt out.

#github-copilot #developer-privacy #ai-training-data #policy

On March 25, 2026, GitHub Chief Product Officer Mario Rodriguez announced that starting April 24, Copilot Free, Pro, and Pro+ users will have their interaction data fed into AI training pipelines by default. GitHub does not promise an additional consent prompt when Copilot is used inside private repos — the policy change itself is the notice. You have until April 24 to opt out. Most developers working on proprietary code have not noticed.

TL;DR

  • What: GitHub defaults Copilot Free, Pro, and Pro+ users into AI model training from April 24, 2026
  • Scope: Prompts, accepted completions, code context, file names, repo structure, chat interactions — everything in an active Copilot session
  • The trap: GitHub does not train on code “at rest” in private repos, but the moment Copilot is open inside one, that session is fair game unless you have opted out
  • Action: github.com/settings/copilot/features → Privacy → disable “Allow GitHub to use my data for AI model training” — do it now

GitHub Copilot Training Data — What Actually Happened

The announcement is framed as a product improvement. GitHub wants to use real-world interaction data to make Copilot better. That is a reasonable goal. The problem is the framing around consent and the scope of what “interaction data” actually means.

The data types GitHub will collect include:

  • code snippets
  • outputs accepted or modified by users
  • code context surrounding the cursor position
  • comments and documentation
  • file names, repository structure, and navigation patterns
  • interactions with Copilot features like chat and inline suggestions
  • feedback signals like thumbs-up or thumbs-down ratings

That is not a narrow telemetry signal. That is a detailed behavioral record of how you write code, what you accept, what you reject, and what your private project structure looks like while Copilot is running. The data may be shared with GitHub affiliates — companies in Microsoft’s corporate family — but according to the announcement, will not be shared with third-party AI model providers.

GitHub draws a distinction between code “at rest” and active session data. The policy is explicit: “We use the phrase ‘at rest’ deliberately because Copilot does process code from private repositories when you are actively using Copilot.” Opening Copilot inside a private repo is enough to make that session eligible for training. The privacy you thought your private repo afforded you does not extend to your Copilot sessions inside it.

Why This Matters

The “at rest vs. active session” distinction is the real trap, and I want to be direct about it: this is a unilateral policy change that redefines what using a paid developer tool means.

If you are on Copilot Pro or Pro+, you have been paying for a coding assistant. Starting April 24, you are also implicitly agreeing to be a training data contributor unless you explicitly refuse. The product you pay for is now also feeding the next version of the product — using your proprietary code context to do it.

This would be less objectionable if the default were reversed. Ask developers to opt in to training data contribution. Let them make that choice deliberately. Instead, GitHub is betting that most users will not notice the policy update, will not change the default, and will continue generating training data indefinitely. That is not a neutral product decision. It is a data acquisition strategy.

The competitive context matters here too — and GitHub’s own FAQ muddies it. GitHub cites Anthropic and Microsoft as examples of providers with similar opt-out data policies. What the FAQ leaves out: JetBrains explicitly does not use user data to train models at all. Their terms state: “We undertake that We will not use Your Inputs, Data, Outputs, and Suggestions to train any language models.” That is not a similar approach to GitHub’s — it is the opposite stance. Grouping JetBrains alongside opt-out policies to suggest industry-wide consensus is, at minimum, misleading.

Cursor’s position is meaningfully different in a specific way. When Privacy Mode is enabled in Cursor, zero data retention is enforced at the model provider level — meaning nothing is retained by the underlying model provider, not just excluded from training. Copilot’s opt-out stops your data from being used in training, but does not make equivalent guarantees about what is retained for service operation. These are different levels of protection, and the difference matters if you are working on code under NDA or in a regulated industry. The Cursor vs. Copilot comparison covers where this distinction becomes a real factor in the decision.

If your personal GitHub account is a member or outside collaborator of a paid GitHub organization, you may already be protected. GitHub’s updated policy applies only to individual consumer accounts. Business and Enterprise accounts remain governed by their Data Protection Agreements, which prohibit using Copilot interaction data for model training. Check whether your account is organization-affiliated before assuming you need to act — but opt out anyway if you are not certain.

There is also a segment of developers who are unknowingly already exempt: students and teachers accessing Copilot through GitHub Education are excluded from this policy. So are Copilot Business and Enterprise users, whose contracts explicitly prohibit this use of interaction data.

The practical consequence for everyone else — individual developers on Free, Pro, or Pro+ plans — is that the default setting is now working against your interests. GitHub will improve its models partly on the back of your work, your clients’ code, and your project architecture, unless you go change a setting on a page most people have never visited.

The opt-out is not retroactive in the way you might hope. GitHub confirms that once you opt out, collection stops from that point forward — but data collected before your opt-out is not deleted. If you are reading this after April 24, opt out immediately and accept that some session data has already been collected.

The Take

Opt out immediately. The setting is at github.com/settings/copilot/features — go to Privacy, find “Allow GitHub to use my data for AI model training,” set it to Disabled. It takes 30 seconds.

Beyond the immediate action: I would treat any tool that defaults to training-on without explicit opt-in as one that has already told you where its priorities sit. GitHub made a deliberate choice about what the default should be. That choice reveals something about how GitHub thinks about its users relative to its model improvement roadmap.

This is not a reason to stop using Copilot if it is the right tool for your workflow. The tool profile covers what Copilot actually delivers, and the comparison with Cursor gets into where each wins. But you should use it with eyes open — and with the opt-out enabled.

The broader pattern here is worth watching. GitHub is not alone in moving toward opt-out-by-default training data collection. Gemini’s free tier handling set an earlier precedent for how AI tool providers structure defaults to extract data value from free and low-cost tiers. The developers who notice these policy changes and act on them are the ones whose code does not end up contributing to someone else’s product improvements without meaningful consent. The rest are opting in by inertia.