Table of Contents
- The Compute Wars Just Went Public
- Inside the AI Compute Crisis: Revenue Up, Margins Down
- Anthropic’s Growth Defies Gravity — And Economics
- OpenAI Fires Back: The Usage Limit Arms Race
- The $690 Billion Infrastructure Sprint
- What This Means for Enterprise AI Buyers
- The Investor Verdict Is Already In
- FAQ
- What Comes Next
Anthropic hit $19 billion in annualized revenue this month. Fifteen months ago, that number was $1 billion. No enterprise software company in history has scaled this fast. And yet Anthropic can’t serve its own customers without throttling them during peak hours. Welcome to the AI compute crisis — where the fastest-growing companies on the planet are burning cash faster than they can collect it, and the math only gets worse as they win more customers.
On April 2, Axios reported that Anthropic and OpenAI have entered a full-blown compute war. When Anthropic capped usage during peak demand, OpenAI immediately doubled its own limits. It’s the kind of competitive maneuvering that looks aggressive from the outside — and desperate from the inside.
The Compute Wars Just Went Public
Here’s what actually happened. Anthropic’s server capacity couldn’t keep pace with demand. Paying customers — including Pro and Max subscribers — hit usage limits and experienced outages during peak hours between 5:00 AM and 11:00 AM Pacific Time. Allowances designed to last a five-hour session were exhausted well before the window closed. For a company charging $20 to $200 per month for access, this is a product problem masquerading as an infrastructure problem.
OpenAI saw the opening and took it. When Anthropic capped usage, OpenAI announced it would double limits for its own users. The message was clear: if you can’t get compute from Claude, come to ChatGPT.
But this isn’t just competitive posturing. It’s a symptom of a structural problem that every AI lab is now confronting: the more customers you win, the more you spend on the compute to serve them, and the harder it becomes to turn revenue into profit.
Inside the AI Compute Crisis: Revenue Up, Margins Down
The numbers tell the story. Anthropic expects to spend approximately $12 billion training models in 2026 and another $7 billion running inference. That’s $19 billion in compute costs against roughly $19 billion in revenue. Management has conceded the company will not reach cash-flow break-even until 2028.
The margin structure is the real issue. Revenue at 40% gross margins and revenue at 77% gross margins are fundamentally different businesses. Anthropic sits closer to the former. Getting to the latter — which is what public market investors will demand — requires one of the most aggressive margin expansion assumptions ever embedded in a private technology valuation.
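To see why those two margin profiles are different businesses, run the numbers at Anthropic’s current scale (a quick illustration using only the figures above, not disclosed financials):

```python
# Gross profit at Anthropic's ~$19B run rate under the two margin
# profiles cited above. Illustrative figures, not disclosed financials.
revenue = 19e9

for gross_margin in (0.40, 0.77):
    gross_profit = revenue * gross_margin
    cost_of_revenue = revenue * (1 - gross_margin)
    print(f"{gross_margin:.0%} margin: "
          f"${gross_profit / 1e9:.1f}B gross profit, "
          f"${cost_of_revenue / 1e9:.1f}B cost of revenue")

# 40% margin: $7.6B gross profit, $11.4B cost of revenue
# 77% margin: $14.6B gross profit, $4.4B cost of revenue
```

Same revenue, a $7 billion swing in gross profit. That swing is what the margin expansion assumption has to deliver.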
This isn’t unique to Anthropic. OpenAI lost $5 billion on $3.7 billion in revenue in its most recently reported period. The entire AI lab business model operates on the same fundamental tension: token prices are falling, but total compute demand is rising faster than prices drop.
Here’s the paradox in two stats: AI token costs have dropped 280x in two years, yet enterprise AI bills have gone up 320%. Agentic workloads — the fastest-growing segment — consume 15x more tokens than standard chat interactions. So even a 70% cut in unit price gets obliterated by a 15x increase in volume.
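The napkin math, using the figures above, shows how the bill still balloons:

```python
# Why cheaper tokens still mean bigger bills. Inputs are the figures
# cited above: a 70% unit price cut, 15x token volume for agentic work.
unit_price_cut = 0.70
volume_multiplier = 15

new_bill_multiple = (1 - unit_price_cut) * volume_multiplier
print(f"New bill: {new_bill_multiple:.1f}x the old one")   # 4.5x
print(f"Increase: {(new_bill_multiple - 1) * 100:.0f}%")   # +350%
```

A roughly 350% increase lines up with the 320% jump in enterprise bills: volume, not unit price, is what drives spend.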
Anthropic’s Growth Defies Gravity — And Economics
The revenue trajectory is genuinely unprecedented:
- December 2024: $1 billion annualized revenue
- July 2025: $4 billion
- December 2025: $9 billion
- February 2026: $14 billion
- March 2026: $19 billion
That’s a 19x increase in 15 months. Bloomberg reported the surge was driven largely by enterprise adoption and Claude Code, which now powers GitHub Copilot and has become the default coding assistant for a significant chunk of the developer market.
But Anthropic CEO Dario Amodei has been unusually candid about the risk. He said there’s “no hedge on earth” against overbuying compute. Buying too much capacity would bankrupt the company if demand falls short. Buying too little leaves paying customers staring at rate limit errors — which is exactly what’s happening now.
The company committed $50 billion to American computing infrastructure, building data centers with FluidStack in Texas and New York. It also signed a deal with Google Cloud for 1 million TPUv7 Ironwood chips — over 1 gigawatt of compute capacity — valued at approximately $52 billion. These aren’t small bets. They’re existential ones.
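The scale of the Google Cloud deal is easier to grasp per chip. A rough back-of-envelope, dividing only the headline figures above (the actual deal terms aren’t public line items):

```python
# Per-chip napkin math on the reported Google Cloud TPU deal.
# Rough divisions of headline figures, not disclosed terms.
chips = 1_000_000        # TPUv7 Ironwood chips
deal_value = 52e9        # ~$52 billion
capacity_watts = 1e9     # over 1 gigawatt of compute capacity

print(f"~${deal_value / chips:,.0f} per chip")                     # ~$52,000
print(f"~{capacity_watts / chips:,.0f} W of facility power/chip")  # ~1,000 W
```

Fifty-two thousand dollars and a kilowatt of facility power per chip, a million times over.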
OpenAI Fires Back: The Usage Limit Arms Race
OpenAI is playing the same game with even bigger numbers. After raising $122 billion in the largest private funding round in history, OpenAI has the capital to subsidize usage in ways that Anthropic can’t yet match. The strategy is straightforward: use the cash pile to undercut competitors on availability while they’re capacity-constrained.
But OpenAI faces the same structural problem. The company generates $2 billion per month in revenue — and burns through it on compute. Every doubled usage limit is a doubled cost. Every new customer acquired through aggressive capacity promises is a customer that needs to be served at a loss until margins improve.
The usage limit arms race is a game that nobody wins. It’s the AI equivalent of ride-sharing companies subsidizing $5 rides across Manhattan — it buys market share, but the bill comes due eventually.
The $690 Billion Infrastructure Sprint
The compute crisis extends well beyond any single lab. According to Futurum Group, aggregate annual AI infrastructure commitments from the five largest US cloud and technology companies have surged from approximately $380 billion in 2025 to a projected $660-690 billion in 2026. That’s a near-doubling of spending in a single year.
This spending is reshaping the entire technology supply chain:
- NVIDIA remains the primary beneficiary, though analysts project its inference market share could fall from 90%+ to 20-30% by 2028 as TPUs and specialized ASICs gain ground.
- Google is leveraging its TPU advantage — Midjourney switched from NVIDIA GPUs to Google TPUs and achieved a 65% cost reduction.
- Meta just unveiled four custom MTIA chips (300, 400, 450, and 500) with a new chip every six months, explicitly designed to reduce NVIDIA dependency.
- H100 cloud pricing fell roughly 63-70%, from $8-10/hour in Q4 2024 to $2.99/hour in Q1 2026.
GPU prices are cratering. But total compute spending is skyrocketing. This is the paradox of AI infrastructure economics: every efficiency gain unlocks demand that overwhelms the savings.
What This Means for Enterprise AI Buyers
If you’re running AI workloads in production — and I am, at a telecom building sovereign AI infrastructure — the compute wars have direct implications for your planning.
Budget realities have shifted. The average enterprise AI budget grew from $1.2 million per year in 2024 to $7 million in 2026. Inference now accounts for 85% of the enterprise AI budget. If you’re still budgeting primarily for training, you’re planning for last year’s problem.
The 15-20x inference multiplier is real. For every $1 billion spent training a model, organizations face $15-20 billion in inference costs over the model’s production lifetime. Agentic workloads — which are becoming the default deployment pattern — multiply this further.
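Put the two budget claims together and the planning picture is stark (a quick sketch using only the figures above):

```python
# Where enterprise AI money goes, using the figures cited above.

# Budget side: $7M average annual AI budget, 85% of it on inference.
annual_budget = 7_000_000
inference_share = 0.85
print(f"Inference: ${annual_budget * inference_share:,.0f}/yr")          # $5,950,000
print(f"Everything else: ${annual_budget * (1 - inference_share):,.0f}/yr")

# Model side: every $1B of training implies $15B-$20B of lifetime inference.
training_spend = 1e9
for multiplier in (15, 20):
    print(f"{multiplier}x -> ${training_spend * multiplier / 1e9:.0f}B lifetime inference")
```

Training is a one-time line item; inference is the mortgage.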
Rate limits are a feature, not a bug. When Anthropic throttles peak-hour usage, it’s making a deliberate business decision to protect margins over customer experience. Enterprise buyers need redundant AI provider strategies, the same way you’d never run production workloads on a single cloud provider.
Multi-provider is now mandatory. The usage limit arms race between Anthropic and OpenAI means pricing, availability, and capability will shift quarter to quarter. Lock-in to a single provider’s API means you inherit their capacity constraints and their margin pressure.
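What does multi-provider redundancy look like in code? A minimal sketch of the abstraction-layer idea follows; the provider functions are placeholders, not any vendor’s actual SDK, so swap in your real integrations:

```python
import time

class RateLimited(Exception):
    """Raised by a provider adapter when it hits a usage cap."""

# Placeholder adapters. In production these would wrap your real
# provider SDK calls; here one simulates a peak-hour throttle.
def call_provider_a(prompt: str) -> str:
    raise RateLimited("provider A: peak-hour cap reached")

def call_provider_b(prompt: str) -> str:
    return f"provider B answered: {prompt!r}"

PROVIDERS = [call_provider_a, call_provider_b]  # preference order

def complete(prompt: str, retries_per_provider: int = 2) -> str:
    """Try each provider in order, backing off briefly on rate limits."""
    last_error = None
    for provider in PROVIDERS:
        for attempt in range(retries_per_provider):
            try:
                return provider(prompt)
            except RateLimited as err:
                last_error = err
                time.sleep(0.5 * (attempt + 1))  # simple linear backoff
    raise RuntimeError(f"all providers exhausted: {last_error}")

print(complete("summarize today's capacity incidents"))
```

A production version would add health tracking and cost-aware routing, but the principle holds: no single provider’s capacity constraints become your outage.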
The Investor Verdict Is Already In
The secondary market is rendering its verdict in real time. Bloomberg reported that approximately $600 million in OpenAI shares can’t find buyers. Institutional investors — including hedge funds and venture capital firms — approached secondary market brokers and were told there was literally no demand.
Meanwhile, buyers have stated they have up to $20 billion in cash ready to invest in Anthropic. Goldman Sachs charges 15-20% carry on Anthropic share profits while offering OpenAI shares to wealth management clients with no carry at all, waiving its cut just to move the inventory.
The valuation gap makes the divergence stark: OpenAI is marked at $852 billion versus Anthropic’s $380 billion, yet the demand is flowing toward the smaller company. Investors are betting that Anthropic’s trajectory — despite the compute crisis — positions it better for the coming AI IPO race. The reasoning: Anthropic’s growth rate is accelerating while its safety-first brand gives it regulatory advantages that OpenAI lacks.
But here’s the uncomfortable truth for both companies: neither has solved the margin problem. The closer these labs get to IPOs, the harder it becomes to hide a structural deficit where compute costs grow in lockstep with revenue.
FAQ
Why is Anthropic throttling usage if revenue is growing so fast?
Revenue growth doesn’t solve a capacity problem. Anthropic’s server infrastructure can’t scale as fast as its customer base. Building data centers and securing GPU/TPU allocations takes months to years, while customer acquisition can spike overnight — especially when a product like Claude Code goes viral in the developer community.
How much does it cost to run AI inference at scale in 2026?
For frontier models, enterprises should budget $15-20 in inference costs for every $1 spent on training over a model’s production lifetime. The average enterprise AI budget has grown to $7 million annually, with inference consuming 85% of that spend. Agentic workloads that chain multiple AI calls consume 15x more tokens than standard chat.
Will AI compute costs keep falling?
Per-token costs will continue to decline — they’ve dropped 280x in two years. But total compute spending is rising because demand growth outpaces price declines. It’s similar to how falling data costs didn’t reduce telecom bills — they just enabled more data consumption. Expect the same dynamic with AI inference.
Should enterprises use multiple AI providers to avoid rate limits?
Yes. The compute wars between Anthropic and OpenAI mean that availability, pricing, and capability will shift frequently. Running production AI workloads on a single provider creates the same risks as single-cloud dependency. Build abstraction layers and maintain active integrations with at least two frontier model providers.
Is the AI industry in a compute bubble?
The revenue growth is real — Anthropic alone went from $1B to $19B in 15 months. But the margin structure is unsustainable at current compute costs. Whether it’s a bubble depends on whether labs can achieve 70%+ gross margins before investor patience runs out. The path exists through custom silicon and inference optimization, but it’s narrow and execution-dependent.
What Comes Next
The AI compute crisis isn’t a temporary growing pain. It’s a structural feature of an industry where the core product — intelligence — requires expensive physical infrastructure to deliver. Every efficiency gain in model architecture or chip design gets consumed by growing demand for more capable, more agentic workloads.
For enterprise buyers, the immediate action is clear: audit your inference costs, build multi-provider redundancy, and budget for 2027 compute spend at 3-5x your current levels. The labs are subsidizing your usage today. That subsidy has an expiration date.
For investors, the question isn’t which lab will grow fastest — they’re all growing. The question is which lab will solve the margin problem first. Right now, nobody has a convincing answer. And the clock is ticking toward IPOs that will demand one.
The compute wars are just getting started. The companies that win won’t be the ones with the most revenue. They’ll be the ones that figure out how to keep the revenue without spending all of it on the infrastructure to earn it.
