
Google Is Spending $190 Billion on AI This Year. We Ran the Per-Token, Per-User, and Per-Agent Math.

At Google I/O 2026, Sundar Pichai dropped two numbers that define the AI infrastructure era: $190 billion in annual capex and 3.2 quadrillion tokens processed per month. Nobody on stage divided one by the other. We did.


Six times. That's how much Google's annual infrastructure spending has multiplied since 2022, a period during which capital expenditure ballooned from $31 billion to an expected $180-$190 billion. Sundar Pichai delivered that figure today at Google I/O 2026 with the practiced calm of someone reading a weather forecast while standing in the path of a fiscal hurricane.

He also shared the demand side: Google now processes 3.2 quadrillion tokens per month across its surfaces, up from 480 trillion a year ago and a mere 9.7 trillion a year before that. That works out to a 330-fold increase in 24 months, the kind of growth curve that breaks Excel charts and stress-tests the electrical grid simultaneously.
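The growth multiples can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the three monthly token counts quoted above; the variable names are ours, not Google's.

```python
# Monthly token volumes as quoted at I/O (tokens per month).
TOKENS_2024 = 9.7e12   # roughly 24 months ago
TOKENS_2025 = 480e12   # roughly 12 months ago
TOKENS_2026 = 3.2e15   # today

first_year_growth = TOKENS_2025 / TOKENS_2024    # 2024 -> 2025
second_year_growth = TOKENS_2026 / TOKENS_2025   # 2025 -> 2026
two_year_growth = TOKENS_2026 / TOKENS_2024      # 2024 -> 2026

print(f"First-year growth:  {first_year_growth:.1f}x")
print(f"Second-year growth: {second_year_growth:.1f}x")
print(f"Two-year growth:    {two_year_growth:.0f}x")
```

Split by year, most of the 330x arrived in the first twelve months (about 49x), while the most recent year came in just under 7x.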

But the most revealing numbers weren't on Pichai's slides, because they emerge only when you start dividing what Google is spending by who and what it's spending it on, calculations that Google conspicuously declined to perform on stage.

The Per-User Infrastructure Bill

Google disclosed three audience sizes at I/O: 900 million monthly active Gemini app users, 2.5 billion AI Overview users, and 1 billion AI Mode users. Take the Gemini app figure, the most direct measure of active AI engagement, and divide the capex number by it: the result is $211 per user per year in infrastructure spending alone. That is not revenue and not operating cost, just the concrete-and-silicon bill for building the capacity underneath each person who opens the Gemini app.

For context, Google's average revenue per user across all products was roughly $65 in 2025, calculated from approximately $350 billion in total revenue divided by the 5.4 billion people who touch at least one Google product annually. The infrastructure cost per active Gemini user is now 3.2 times the average revenue Google earns from all of its users, a ratio that would terrify any CFO who hadn't already internalized the bet that AI users will eventually become dramatically more valuable than non-AI users.

Using the broader AI Overviews audience of 2.5 billion drops the per-user capex to $76, which is more comfortable but still extraordinary: Google is spending more on infrastructure per AI user than Netflix charges for an annual Standard plan.
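The per-user figures are simple divisions, sketched here so the inputs can be swapped out. All numbers are the ones quoted in this section; the helper name `per_user` is our own.

```python
# Inputs quoted in the article.
CAPEX_USD = 190e9           # upper end of 2026 capex guidance
GEMINI_APP_USERS = 900e6    # monthly active Gemini app users
AI_OVERVIEW_USERS = 2.5e9   # AI Overviews users
TOTAL_REVENUE_USD = 350e9   # approximate 2025 total revenue
TOTAL_USERS = 5.4e9         # people touching at least one Google product


def per_user(total_usd: float, users: float) -> float:
    """Spread a dollar total evenly across a user count."""
    return total_usd / users


capex_per_gemini_user = per_user(CAPEX_USD, GEMINI_APP_USERS)
capex_per_overview_user = per_user(CAPEX_USD, AI_OVERVIEW_USERS)
revenue_per_user = per_user(TOTAL_REVENUE_USD, TOTAL_USERS)
capex_to_revenue_ratio = capex_per_gemini_user / revenue_per_user

print(f"Capex per Gemini app user:  ${capex_per_gemini_user:,.0f}")
print(f"Capex per AI Overview user: ${capex_per_overview_user:,.0f}")
print(f"Revenue per Google user:    ${revenue_per_user:,.0f}")
print(f"Capex-to-revenue ratio:     {capex_to_revenue_ratio:.2f}x")
```

With unrounded inputs the ratio lands between 3.2 and 3.3. Note that the even split is itself an assumption: it treats a heavy API developer and an occasional user identically.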

This math only works if AI users generate substantially more revenue than non-AI users, whether through deeper engagement with Search ads, adoption of subscription tiers, or enterprise cloud spending. Pichai hinted at the first of these when he noted that "when people use our AI-powered features in Search, they use Search more," a claim that translates to: more usage means more ad impressions means more revenue. The bet is that $211 per user in infrastructure creates a flywheel that returns multiples of the investment. Nobody outside Google knows if it does yet.

The Per-Token Economics

Processing 3.2 quadrillion tokens monthly means Google handles approximately 38.4 quadrillion tokens per year. Spread $190 billion across that volume and you get about $0.005 per thousand tokens in raw infrastructure cost, roughly $5 per million tokens, a figure that establishes the floor beneath every pricing decision Google makes about AI services.
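A short sketch of that division, using only the two figures from the keynote. It assumes the full $190 billion is attributed to one year of token processing, which overstates the per-token cost if much of the hardware serves other workloads or future years.

```python
# Inputs quoted in the article.
TOKENS_PER_MONTH = 3.2e15   # 3.2 quadrillion tokens
CAPEX_USD = 190e9           # upper end of 2026 capex guidance

tokens_per_year = TOKENS_PER_MONTH * 12
usd_per_token = CAPEX_USD / tokens_per_year

# Express the same cost at the units pricing pages normally use.
usd_per_thousand_tokens = usd_per_token * 1e3
usd_per_million_tokens = usd_per_token * 1e6

print(f"Tokens per year:        {tokens_per_year:.3e}")
print(f"Capex per 1K tokens:    ${usd_per_thousand_tokens:.4f}")
print(f"Capex per 1M tokens:    ${usd_per_million_tokens:.2f}")
```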

Compare that to what Google charges developers and enterprises through its APIs. Gemini 3.5 Flash pricing starts at approximately $0.075 per million input tokens and $0.30 per million output tokens for prompts under 200K context, meaning the per-token capex figure exceeds even the output list price by more than an order of magnitude. Part of that gap is expected: this year's $190 billion buys hardware that will serve tokens for several years, and not all of it serves tokens at all. The figure also leaves out operating expenses like electricity, cooling, and personnel, which push the true cost in the other direction. But the calculation establishes something important: Google cannot offer free unlimited AI services indefinitely without ads or subscriptions covering the infrastructure bill, roughly $5 per million tokens at this year's spending and volume, plus the OpEx layer that sits on top of every hardware dollar.

The directional math is what should keep investors awake. Token volumes grew 330x in two years while capex grew 6x since 2022, which means Google is finding massive efficiency gains and processing on the order of 55 times more tokens per dollar of capex than it did in 2024, and likely more, since the capex multiple spans four years rather than two. TPU 8i's 2x performance-per-watt improvement and 3.5 Flash's 4x speed advantage represent real engineering wins that compound beautifully in spreadsheet models. But 330x demand growth outrunning 6x spending growth means either efficiency gains continue at this pace indefinitely, which has never happened in the history of computing, or the spending curve steepens further.
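The tokens-per-dollar figure is the ratio of the two growth multiples. A minimal sketch from the raw numbers quoted earlier follows; note that the capex multiple is measured from 2022 and the token multiple from 2024, so the result is a rough estimate rather than a like-for-like comparison.

```python
# Inputs quoted in the article.
TOKENS_NOW = 3.2e15     # tokens per month today
TOKENS_2024 = 9.7e12    # tokens per month two years ago
CAPEX_NOW = 190e9       # upper end of 2026 guidance, USD
CAPEX_2022 = 31e9       # 2022 capex, USD

token_growth = TOKENS_NOW / TOKENS_2024
capex_growth = CAPEX_NOW / CAPEX_2022

# How many times more tokens each capex dollar now supports.
tokens_per_dollar_gain = token_growth / capex_growth

print(f"Token growth:           {token_growth:.0f}x")
print(f"Capex growth:           {capex_growth:.1f}x")
print(f"Tokens-per-dollar gain: {tokens_per_dollar_gain:.0f}x")
```

Using unrounded inputs gives a gain slightly below the 55x obtained from the rounded 330x and 6x multiples.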

The Gemini Spark Loss-Leader Problem

The most interesting announcement at I/O wasn't a model or a benchmark. Gemini Spark is Google's new personal AI agent that runs 24/7 on dedicated virtual machines in Google Cloud, powered by Gemini 3.5 Flash and the Antigravity coding harness, handling long-horizon tasks in the background with no need for the user to keep a laptop open or a browser tab active.

The price for access to this always-on digital employee: $100 per month, as part of the new Google AI Ultra plan that has been dropping steadily from $250 at launch to $200 and now to $100, a pricing trajectory that screams penetration strategy, because the economics of running a persistent VM for each subscriber simply do not work at that price point.

Run the VM math yourself. A comparable always-on instance on Google Cloud, even something modest like an e2-standard-8 with 8 vCPUs and 32 GB of RAM, costs roughly $193/month at on-demand rates, and that is just the compute shell without the inference cost of running Gemini 3.5 Flash continuously, without the Antigravity agent framework overhead, and without the storage for persistent user state. A more realistic fully loaded per-agent cost, accounting for burst inference on TPU or GPU backends when the agent executes tasks, likely exceeds $300/month for active users.

At $100/month, Google is subsidizing every Gemini Spark user by somewhere between $100 and $200 per month. For a company processing 3.2 quadrillion tokens monthly, that is a rounding error on the overall infrastructure bill, but it reveals the strategic intent with crystalline clarity: get consumers dependent on a persistent AI agent that lives inside the Google ecosystem, touches Gmail, Chrome, Search, YouTube, and Workspace, and makes switching costs so enormous that even a doubling of the subscription price would feel cheaper than rebuilding your digital life elsewhere. This is the most expensive consumer lock-in strategy ever attempted, and it might simultaneously be the cheapest per user retained, because the lifetime value of a fully entangled Spark user, one whose entire digital workflow routes through Google, could dwarf any monthly subsidy.
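The subsidy range follows from the two cost estimates above. This sketch uses the article's public-price proxies ($193 for the bare VM, $300 fully loaded); Google's internal cost is not public, so the constants are assumptions rather than measurements.

```python
# Monthly figures from the article (USD). Costs are public-price proxies.
SPARK_PRICE = 100.0         # Google AI Ultra subscription
VM_ONLY_COST = 193.0        # e2-standard-8 at on-demand rates
FULLY_LOADED_COST = 300.0   # article's estimate including inference


def monthly_subsidy(cost: float, price: float) -> float:
    """Amount of monthly cost not covered by the subscription price."""
    return cost - price


subsidy_low = monthly_subsidy(VM_ONLY_COST, SPARK_PRICE)
subsidy_high = monthly_subsidy(FULLY_LOADED_COST, SPARK_PRICE)

print(f"Subsidy per user per month: ${subsidy_low:.0f} to ${subsidy_high:.0f}")
print(f"Subsidy per user per year:  ${subsidy_low * 12:,.0f} to ${subsidy_high * 12:,.0f}")
```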

The Energy Elephant

Google did not disclose the energy consumption of processing 3.2 quadrillion tokens per month, a gap in transparency that leaves analysts estimating from published research. A comprehensive study from researchers at Universidad Carlos III de Madrid, covering 155 model architectures across 21 GPU configurations with over 32,500 measurements, found that energy per token varies from roughly 0.001 to 0.01 watt-hours depending on model size, batch optimization, and the specific hardware doing the computation.

Google's TPUs are specifically designed for inference efficiency, making the lower end of the published range a reasonable baseline for bulk workloads. At 0.001 Wh per token, 3.2 quadrillion tokens per month consume 3.2 terawatt-hours monthly, or 38.4 TWh annually. That number rises to 115 TWh per year at 0.003 Wh per token, roughly the geometric midpoint of the published range, which accounts for the diversity of workloads including computationally heavier Omni and Pro model inference alongside the lighter Flash calls.

The International Energy Agency estimates total global data center electricity consumption at approximately 1,000 TWh for 2026, which means that if the midpoint estimate is accurate, Google's token processing alone, not counting Search indexing, YouTube transcoding, Gmail delivery, or any of the company's other massive workloads, would represent roughly 12% of all data center electricity consumed on Earth.
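The energy estimates reduce to one multiplication and a unit conversion. The sketch below reproduces them; the per-token energy values come from the cited study's range, and applying them to Google's TPU fleet is an assumption, not a measurement.

```python
# Inputs quoted in the article.
TOKENS_PER_MONTH = 3.2e15        # 3.2 quadrillion tokens
GLOBAL_DATACENTER_TWH = 1000.0   # IEA estimate for 2026
WH_PER_TWH = 1e12                # watt-hours in one terawatt-hour


def annual_twh(wh_per_token: float) -> float:
    """Annual energy in TWh for a given energy cost per token."""
    return TOKENS_PER_MONTH * 12 * wh_per_token / WH_PER_TWH


low_twh = annual_twh(0.001)   # lower end of the published range
mid_twh = annual_twh(0.003)   # the article's midpoint scenario
mid_share = mid_twh / GLOBAL_DATACENTER_TWH

print(f"Low estimate: {low_twh:.1f} TWh/year")
print(f"Mid estimate: {mid_twh:.1f} TWh/year")
print(f"Mid share of global data center use: {mid_share:.1%}")
```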

In human-scale terms: 115 TWh per year exceeds the entire annual electricity consumption of the Philippines, a nation of 117 million people, while even the conservative 38 TWh estimate surpasses Sri Lanka. Pichai noted that both TPU 8t and 8i deliver "up to two times better performance-per-watt" than the previous generation, and if the entire fleet upgrades, the energy footprint per token halves. But a 2x efficiency gain buys roughly four months of headroom against 7x annual token growth before you're right back where you started, consuming the same watts to serve the demand that materialized while you were celebrating the improvement.
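The "four months of headroom" figure can be derived by asking how long demand takes to double at 7x per year. This is a sketch that assumes smooth exponential growth, which real demand will not follow exactly; the function name is ours.

```python
import math


def months_until_absorbed(efficiency_gain: float, annual_growth: float) -> float:
    """Months for demand growing `annual_growth`x per year to grow by `efficiency_gain`x."""
    return 12 * math.log(efficiency_gain) / math.log(annual_growth)


# A 2x performance-per-watt gain set against 7x annual token growth.
headroom_months = months_until_absorbed(2.0, 7.0)

print(f"Headroom from a 2x efficiency gain: {headroom_months:.1f} months")
```

The same function shows how sensitive the result is to the growth rate: slower demand growth stretches the headroom, faster growth shrinks it.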

The Big Picture: $725 Billion and Counting

Google isn't the only company writing checks that rival sovereign budgets. Combined 2026 capex guidance from the five Big Tech companies in the table below totals approximately $725 billion, a number that TrendForce inflates to $830 billion when you add the next five largest cloud service providers.

| Company | 2026 Capex Guidance | Primary Focus |
| Amazon | $200B | AWS, custom silicon (Trainium) |
| Microsoft | $190B | Azure, OpenAI infrastructure |
| Google | $180-190B | TPUs, Gemini, Cloud |
| Meta | $125-145B | Llama training, AR/VR |
| Apple | $14B | On-device AI, Apple Intelligence |
| Combined | ~$725B | |
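The combined figure can be reproduced by summing the rows, taking the midpoint wherever guidance is a range. A minimal sketch, with the table values entered as (low, high) pairs in billions of dollars:

```python
# 2026 capex guidance from the table, in billions of USD, as (low, high).
capex_guidance_busd = {
    "Amazon": (200, 200),
    "Microsoft": (190, 190),
    "Google": (180, 190),
    "Meta": (125, 145),
    "Apple": (14, 14),
}

low_total = sum(low for low, _ in capex_guidance_busd.values())
high_total = sum(high for _, high in capex_guidance_busd.values())
mid_total = (low_total + high_total) / 2

print(f"Low end:  ${low_total}B")
print(f"High end: ${high_total}B")
print(f"Midpoint: ${mid_total:.0f}B")
```

The midpoint matches the ~$725B total only when all five rows, Apple included, are counted.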

For perspective: NASA's entire budget from its founding in 1958 through 2024, adjusted for inflation, totals approximately $810 billion, meaning Google alone will spend in one year what took NASA 23 years to accumulate through Mercury, Gemini, Apollo, Skylab, and the Space Shuttle's inaugural flights. The combined Big Tech figure approaches NASA's entire 66-year total in a single calendar year, which raises the question of whether building AI infrastructure is the new space race, except this time the rockets are made of silicon and the destination is revenue rather than the Moon.

The U.S. federal research and development budget for 2026 is approximately $220 billion across all agencies, every dollar from the NIH to DARPA to the NSF combined, which means Google's capex alone represents 86% of what the entire U.S. government spends on R&D. These numbers feel abstract until you trace them to physical reality: $830 billion buys roughly 800-900 data centers globally, according to TrendForce, each requiring land, water, power grid interconnection, and years of construction, with electrical demand that would require new generation capacity equivalent to multiple nuclear power plants per company per year.

The Strongest Case For It

Before dismissing this as irrational exuberance, consider Pichai's most important disclosure: 375 Google Cloud customers each processed more than one trillion tokens in the past year, Cloud revenue grew 63% year-over-year, and the growth was capacity-constrained. That last phrase is an investor relations way of saying the company left billions in revenue on the table because it couldn't build servers fast enough to serve existing demand: not hypothetical future demand, not projected demand in some optimistic model, but demand that showed up with purchase orders that Google physically could not fulfill.

The contracted-revenue argument is genuinely strong. If a meaningful share of $190 billion deploys against multi-year enterprise commitments with guaranteed minimum consumption, this is less a bet and more a fulfillment operation, one where the capital expenditure is essentially pre-sold before the concrete is poured. Amazon's parallel $200 billion capex rests on identical logic: AWS backlog exceeds deliverable capacity.

The consumer side is where speculation creeps in. Nine hundred million Gemini users are impressive as a headline, but the revenue per free-tier user remains opaque, and the path from "uses Gemini occasionally to settle a dinner argument" to "pays $100/month for Spark" is long, winding, and littered with the corpses of previous consumer subscription services that learned the hard way that conversion rates typically land in single-digit percentages.

Limitations

Several important caveats constrain this analysis. The $190 billion capex figure covers all of Google's infrastructure, not exclusively AI: it includes Search indexing clusters, YouTube content delivery networks, cloud storage, and other non-AI workloads, meaning the true AI-specific fraction is impossible to isolate with public data, though Pichai's framing tied it overwhelmingly to AI. Our energy-per-token estimates rely on published academic research using NVIDIA GPUs, and Google's custom TPUs may be significantly more efficient in ways the company has chosen not to quantify publicly. The per-user capex calculation treats all Gemini users equally when a developer processing millions of API tokens generates orders of magnitude more infrastructure load than someone asking Gemini a single daily question about the weather. And the Gemini Spark VM cost comparison uses public GCP pricing as a proxy, while Google's internal allocation cost for dedicated infrastructure on its own hardware in its own data centers is likely substantially lower than what it charges external customers, making the actual loss-leader subsidy unknowable from outside the company.

The Bottom Line

Google just told the world it will spend more money on AI infrastructure this year than most countries generate in GDP, process more tokens in May 2026 alone than it did in the entirety of 2024, and offer a 24/7 personal AI agent at a subscription price that almost certainly doesn't cover the compute cost of running it.

The question is no longer whether AI demand is real, because 330x growth in two years settles that definitively. The question is whether $190 billion per year, sustained across the remainder of this decade, generates returns that justify the capital, and at current scale that means Google needs every one of those 900 million Gemini users to become measurably more valuable, clicking more ads, buying more subscriptions, generating more cloud contracts, doing more of everything that converts infrastructure expense into revenue, to make the math work before the depreciation schedule catches up with the spending.

If you're an enterprise customer: negotiate now, because capacity constraints mean Google wants your business badly enough to offer favorable terms on multi-year commitments, and that leverage window closes when the new data centers come online. If you're a developer: watch the 3.5 Flash pricing closely, because Google explicitly said companies could save over $1 billion annually by shifting 80% of workloads to Flash, which is less a product pitch than an acquisition tool priced to pull market share from OpenAI and Anthropic before they can respond. And if you're a consumer wondering whether $100/month for Gemini Spark is worth it: someone at Google did the VM math too, they know they're paying more than you are to run your agent, and the only question is how long the subsidy lasts and what data they expect in return for funding the difference.