[funding] 5 min · Apr 20, 2026

Factory's $150M Bet — When Coding Agents Run for 16 Days Straight

Factory raised $150M at $1.5B for Missions, a multi-agent system where Droids run autonomously for days or weeks. The real story is the lock-in, not the models.

#ai-agents #enterprise-ai #coding-agents #multi-agent-orchestration

Factory closed a $150M Series C on April 16, led by Khosla Ventures with Sequoia Capital, Blackstone, Insight Partners, NEA, Mantis VC, and others participating. Valuation: $1.5B. Keith Rabois joined the board. The numbers are interesting, but the funding headline is a distraction. The actual product shift — Missions, a multi-agent orchestration system where autonomous Droids execute software projects over hours, days, and weeks — is what matters. The longest recorded mission ran for 16 days without a human in the loop. That is not a coding copilot. That is a different product category entirely.

TL;DR

  • What: Factory raised $150M Series C at $1.5B valuation; the real news is Missions, a multi-agent orchestration layer where Droids run autonomously for days
  • Architecture: Coordinator agent decomposes long-horizon goals into subtasks dispatched to specialized Droids (code, review, test, docs, knowledge) — median run ~2 hours, 14% exceed 24 hours
  • Lock-in vector: Not models. Workflow integrations across Slack, Linear, CI/CD, and your backlog create extraction costs that have nothing to do with which LLM sits underneath
  • Action: Watch Missions closely if you’re an enterprise eng leader — but map your exit costs before you onboard

Factory Missions — What Happened

The funding round itself is straightforward venture math. Factory claims monthly revenue has doubled for six consecutive months, alongside 200% quarter-over-quarter growth. At $1.5B, they're valued at a fraction of Cursor's rumored $50B, which either means they're undervalued or Cursor's number is absurd. Probably both.

But Missions is where the structural argument lives. Standard Droid sessions (the interactive coding mode you'd compare to Cursor or Claude Code) have a median session time of about 8 minutes, with 60% finishing within 15 minutes. That is copilot territory. Missions is something else.

A Mission starts with a coordinator agent that decomposes a long-horizon engineering goal into subtasks. Each subtask gets dispatched to a specialized Droid: one writes code, another handles review, another runs tests, another writes documentation, another does research. The orchestrator selects different models per role from any provider — the coordinator might use one model family while the code-writing worker uses another and the validator uses a third. Factory’s thesis is that systems locked to a single model family are always constrained by that family’s weakest capability.
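
To make that loop concrete, here is a minimal sketch of coordinator-style decomposition in Python. This is my illustration under stated assumptions, not Factory's implementation: Subtask, plan_mission, and run_mission are hypothetical names, and a real coordinator would generate the plan with a model rather than hard-code it.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    role: str                     # "research", "code", "test", "review", "docs"
    description: str              # the decomposed unit of work
    depends_on: list[int] = field(default_factory=list)  # prerequisite indices

def plan_mission(goal: str) -> list[Subtask]:
    """Coordinator step: break a long-horizon goal into role-tagged
    subtasks. A real coordinator would ask a planning model for this;
    the stub hard-codes one plausible decomposition."""
    return [
        Subtask("research", f"Survey the codebase for: {goal}"),
        Subtask("code", f"Implement: {goal}", depends_on=[0]),
        Subtask("test", "Write and run tests for the change", depends_on=[1]),
        Subtask("review", "Review the diff for correctness", depends_on=[2]),
        Subtask("docs", "Document the change", depends_on=[3]),
    ]

def run_mission(goal: str) -> None:
    """Walk the plan in dependency order, handing each subtask to the
    specialized Droid for its role (print is a stand-in for dispatch)."""
    plan = plan_mission(goal)
    done: set[int] = set()
    while len(done) < len(plan):
        for i, task in enumerate(plan):
            if i not in done and all(d in done for d in task.depends_on):
                print(f"[{task.role} droid] {task.description}")
                done.add(i)

run_mission("add rate limiting to the public API")
```

The structural point is the depends_on graph: the coordinator's output is a dependency-ordered plan that outlives any single agent session, not a chat transcript.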

The runtime numbers tell the story. The median mission runs about 2 hours. 37% run longer than 4 hours. 14% exceed 24 hours. The longest recorded mission ran for 16 days. These are not prompt-response cycles. These are autonomous engineering programs operating on the timescale of actual project work.

Enterprise adoption backs this up: Factory reports daily Droid usage by hundreds of thousands of developers at Nvidia, Adobe, EY, Palo Alto Networks, Adyen, MongoDB, Bayer, Zapier, Morgan Stanley, and others.

Why This Matters

Every enterprise coding agent I've covered, from Cursor (including the recently announced Cursor 3 agent orchestration features) to Claude Code and Codex, is fundamentally built around the single-session prompt-response loop. You open a context window, you describe what you want, the agent does work, you review, you close the session. The context window is the unit of work.

Factory starts from the opposite assumption: real engineering work fragments across time, and one agent context-switching between code, review, docs, and tests within a single context window is structurally wrong. A mission that takes 16 days cannot fit in any context window. It requires persistent state, task decomposition, handoffs between specialized agents, and autonomous error recovery across sessions. This is architecturally closer to a CI/CD pipeline than to a chatbot.
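
Mechanically, "persistent state across sessions" looks less like chat history and more like a pipeline checkpoint. Here is a hedged sketch of that pattern; every name in it (STATE_FILE, resume_mission) is invented for illustration, and Factory has not published how its runtime actually does this.

```python
import json
from pathlib import Path

STATE_FILE = Path("mission_state.json")  # hypothetical durable store

def load_state() -> dict:
    """Mission progress lives outside any model context, the way a
    CI/CD pipeline's progress survives any single runner."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"completed": [], "failures": {}}

def checkpoint(state: dict) -> None:
    STATE_FILE.write_text(json.dumps(state))

def resume_mission(subtasks: list[str], max_retries: int = 3) -> None:
    """Each invocation picks up after the last checkpoint; a failed
    subtask is retried (crude error recovery) instead of sinking the
    whole mission. print() is a stand-in for dispatching a Droid."""
    state = load_state()
    for task in subtasks:
        if task in state["completed"]:
            continue  # finished in an earlier session
        try:
            print(f"running: {task}")
            state["completed"].append(task)
        except Exception as err:
            state["failures"][task] = state["failures"].get(task, 0) + 1
            if state["failures"][task] >= max_retries:
                raise RuntimeError(f"{task} failed {max_retries} times") from err
        finally:
            checkpoint(state)  # progress is durable even when a step fails
```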

The model-agnostic layer deserves scrutiny. On the surface, it sounds like LiteLLM routing — pick the cheapest or fastest model per request. But Factory’s implementation is more opinionated than that. The orchestrator doesn’t just route; it assigns roles. The coordinator model needs strong planning capability. The code-writing worker needs strong code generation. The validator needs strong reasoning about correctness. This is role-based casting, not cost-based routing. Whether that delivers meaningfully better outcomes than single-model approaches is an empirical question Factory hasn’t published benchmarks to answer.
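
The distinction is easier to see side by side. A sketch with invented model names and capability tiers, not Factory's actual assignments: routing asks what the cheapest model is that clears a bar for this one request; casting pins each role to the family judged strongest at it, price aside.

```python
# Cost-based routing (the LiteLLM-style pattern): cheapest model
# that clears a capability bar for this one request.
PRICE_PER_MTOK = {"model-a": 15.00, "model-b": 3.00, "model-c": 0.50}
CAPABILITY_TIER = {"model-a": 3, "model-b": 2, "model-c": 1}

def route_by_cost(required_tier: int) -> str:
    eligible = [m for m, t in CAPABILITY_TIER.items() if t >= required_tier]
    return min(eligible, key=PRICE_PER_MTOK.get)

# Role-based casting (the pattern Factory describes): each role is
# pinned to whichever family is strongest at that role. Illustrative
# pairings only.
CASTING = {
    "coordinator": "model-a",  # strongest long-horizon planning
    "code":        "model-b",  # strongest code generation
    "validator":   "model-a",  # strongest correctness reasoning
    "docs":        "model-c",  # good enough, and cheap, for prose
}

def cast_by_role(role: str) -> str:
    return CASTING[role]

print(route_by_cost(2))           # "model-b": cheapest that clears the bar
print(cast_by_role("validator"))  # "model-a": pinned regardless of price
```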

Factory’s runtime statistics (median 2 hours, longest 16 days) are self-reported from their own platform. No independent benchmarks exist for multi-agent mission success rates, cost per mission, or error recovery quality. Enterprise buyers should request audit data before committing.

The competitive dynamics are a three-way squeeze. From below, IDE-native tools like Cursor are expanding upward — Cursor 3 already adds agent orchestration features that start to look like lightweight missions. From above, model providers like Anthropic (Claude Code) and OpenAI (Codex) are racing to own the enterprise agent relationship directly, cutting out intermediaries. From the side, Cognition’s Devin operates with higher autonomy but less enterprise governance. Factory’s bet is that none of these players will build the full workflow integration layer that makes multi-day missions possible inside actual enterprise engineering processes.

That bet might be right. But it creates a lock-in dynamic that deserves more skepticism than it’s getting.

If you’re evaluating Factory against Cursor or Claude Code, the comparison only makes sense at the session level. For interactive coding, they compete directly. For multi-day autonomous engineering missions, Factory currently has no direct competitor — which means no switching option either.

The Take

I think Factory is right about the architecture. Single-session, single-context-window agents will hit a ceiling for complex engineering projects. Decomposing work across specialized agents operating over days or weeks is closer to how human engineering teams actually function. The runtime data — 14% of missions exceeding 24 hours — suggests they’re not just theorizing about this; they’re already operating in the multi-day regime.

But “model agnosticism” is doing a lot of rhetorical work in Factory’s positioning. The implied message is: you’re not locked in, because you can swap models underneath. That misses where the actual lock-in happens. When Droids are embedded in your Slack workspace, connected to your Linear board, triggering your CI/CD pipelines, reading your backlog, and generating documentation that feeds back into your knowledge base — the exit cost has nothing to do with which model is underneath. It has everything to do with ripping out workflow integrations that touch every surface of your engineering process. This is the same mechanism that made enterprise SaaS sticky long before AI entered the picture. Salesforce didn’t lock you in with a database. It locked you in with 200 custom workflows your team built over three years.

Factory at $1.5B is either the infrastructure layer that sits between model providers and enterprise engineering teams — capturing value from both sides — or it’s an integration layer that gets compressed when model providers build their own orchestration and IDE tools add their own multi-session capabilities. The next 18 months will determine which.

If you’re an enterprise eng leader, Missions is worth evaluating seriously. But before you onboard, draw the integration map. Count every workflow connection point. That’s your exit cost — and nobody at Factory is going to draw it for you.