AI Coding Bill Shock — Uber Capped, Microsoft Pulled Claude
Uber burned its 2026 AI coding budget in four months and capped engineers at $1,500/month. Microsoft is revoking Claude Code licenses by June 30. DeepSeek topped...
Uber blew through its entire 2026 AI coding budget in four months. Bloomberg confirmed on June 2 that the company has capped every engineer at $1,500/month per agentic coding tool — Claude Code and Cursor tracked separately — after per-engineer bills ran between $500 and $2,000 monthly across roughly 5,000 engineers with 84–95% adoption. That same week, Microsoft’s Experiences and Devices division began revoking Claude Code licenses entirely, hard deadline June 30. And Ramp’s June corporate spend report placed DeepSeek — the paid API, not the self-hosted open-weight model — at the number one trending slot across more than 50,000 US businesses. Three events in the same week. Same root cause.
TL;DR
- Uber: Entire 2026 AI budget gone by May. $1,500/month cap per tool, per engineer. COO admits no proven link between token spend and shipped features.
- Microsoft: Revoking Claude Code licenses across Windows, M365, Outlook, Teams, and Surface divisions by June 30. Replacing with GitHub Copilot CLI.
- DeepSeek: Now #1 trending AI vendor on Ramp. US companies paying for the API directly — not self-hosting — to escape Anthropic/OpenAI pricing.
- Pattern: GitHub moved to credit billing June 1. Cursor restructured Teams pricing effective July 1. The flat-rate era is over.
What Happened
The Uber story reads like a cautionary tale about runaway spend. That framing is too charitable. This is a structural failure of enterprise AI procurement. Uber deployed Claude Code to 5,000 engineers, watched monthly usage climb to 84–95% by April, and cheered. CEO Dara Khosrowshahi publicly noted that about 10% of the company’s code is now AI-generated. The incentive structure was simple: use more, ship faster, win.
Nobody tied the token bill to shipped product value.
COO Andrew Macdonald made that explicit on the Rapid Response podcast. When asked whether rising Claude Code token consumption was translating to consumer-facing features, his answer was blunt: “That link is not there yet.” The $1,500/month cap — which creates an $18,000/year ceiling per tool, per engineer — is what happens when the bill arrives before the proof does.
Uber’s cap does not solve the underlying problem. It just makes the underlying problem smaller. If you cannot measure productivity at $2,000/month per engineer, you also cannot measure it at $1,500/month per engineer. The cap buys time. It does not buy insight.
Microsoft’s response is structurally different and more revealing. The Experiences and Devices division — Windows, Microsoft 365, Outlook, Teams, Surface — is not capping Claude Code. It is killing it. An internal memo from EVP Rajesh Jha confirmed that most Claude Code licenses across the division will be revoked by June 30, with engineers directed to switch to GitHub Copilot CLI. The stated reason is telling: Claude Code had become “perhaps a little too popular” inside Microsoft, with engineers choosing Anthropic’s tool over Microsoft’s own product.
This is not cost management. This is product strategy disguised as fiscal discipline. Microsoft cannot let its own engineering org validate a competitor’s agentic coding tool while trying to sell GitHub Copilot to the Fortune 500. The financial year-end timing is not coincidental.
Why This Matters
These are not isolated incidents. They are the first visible fractures of a pricing model that the entire AI coding industry built on a bet: that flat-rate subscriptions would scale alongside agentic usage. That bet lost.
GitHub switched all Copilot plans to usage-based credit billing on June 1. Usage is now calculated on token consumption — input, output, and cached tokens — priced at listed API rates per model. Cursor restructured its Teams pricing with a two-tier seat model: Standard at $40/user/month, Premium at $120/user/month with 5x usage, with separate pools for first-party and third-party models. Those changes hit new customers immediately and renewal customers on July 1. Both moves reflect the same realization: when agents run autonomously, context windows expand, and multi-turn sessions compound token costs, flat-rate pricing hemorrhages money.
Agentic sessions are structurally more expensive than copilot sessions. Industry estimates suggest input tokens dominate agentic workflows — multi-turn sessions across full codebases consume token budgets at rates that traditional autocomplete never approached. This is why Uber’s per-engineer bills hit $2,000/month with high adoption: agentic workflows are not autocomplete. They are compute-intensive loops that scale with ambition.
If your team runs agentic coding tools, start tracking cost-per-merged-PR now — not cost-per-seat. Seat-based budgets obscure the actual relationship between spend and output. The metric that would have saved Uber’s budget is one they apparently were not tracking until the money was gone.
Enter DeepSeek. Ramp’s June report shows DeepSeek surpassed PheedLoop and Fireworks AI on the trending software vendors list. The significant detail: US companies are making direct payments to DeepSeek’s API. Not downloading the open-weight model and self-hosting. Not routing through a US intermediary. Paying DeepSeek directly for API access and wiring it into production workflows.
Ramp and multiple procurement teams report materially lower API rates from DeepSeek for common coding tasks compared to Anthropic and OpenAI. For agentic sessions where token costs compound across turns, even a moderate per-token discount multiplies into substantial savings. Engineering teams that hit internal spending caps — or that simply watched what happened to Uber — are redirecting traffic to the cheapest model that clears their quality bar.
This is where the story shifts from FinOps to something harder. DeepSeek is a China-based AI provider, and when enterprise teams wire its API directly into CI/CD pipelines, proprietary source code flows through infrastructure outside US regulatory reach. The cost optimization problem becomes a data residency problem. Every engineering team choosing DeepSeek’s API over self-hosting the open-weight model is making an implicit bet that the geopolitical environment will not change faster than their architecture can adapt. Given current US-China technology tensions, that is not a bet I would take without explicitly acknowledging the tradeoff.
The GitHub Copilot credits transition and Cursor’s Teams restructuring are both symptoms of the same repricing. The industry simultaneously moved from “unlimited usage for a fixed monthly fee” to “you pay for what you burn” within a single billing cycle. For any team running agentic workflows at scale, your AI coding budget is no longer a line item. It is a variable cost that scales with engineering activity, context window size, and model selection.
The Take
The Uber story is being read as “company spent too much on AI.” That misses the point entirely. The real failure is that nobody in procurement, engineering leadership, or finance had a framework to evaluate whether the spend was working. Macdonald’s “that link is not there yet” is not an admission of overspend — it is an admission that Uber optimized for adoption without building the instrumentation to measure return. They tracked usage. They did not track value.
Microsoft’s move is colder but more honest. They are not pretending Claude Code does not work. They are saying it works too well for a competitor’s product, and they are pulling the plug before it becomes entrenched. If you are an enterprise customer relying on Claude Code inside a Microsoft-adjacent stack, June 30 is your planning horizon.
And DeepSeek’s rise to the top of the Ramp chart should genuinely worry anyone thinking about this at the infrastructure level. The market found its pressure release: a cheaper model, hosted by a China-based provider, with enterprise source code flowing through APIs that sit outside US data residency frameworks. That is not a FinOps win. That is a risk transfer that nobody priced in.
If you run a team of 10 or more engineers using agentic coding tools, here is what I would do this week: calculate your actual cost-per-merged-PR for the last 90 days, set a dollar-threshold alert (not a seat count), and have a real conversation about which models your data is flowing through. The flat-rate era is over. What replaces it is either disciplined metering or surprise invoices. Pick one.