The Agentic Stack Wars: Part One - CONFESSION
Google (finally) just said the quiet part out loud. The day the AI industrial complex stopped apologizing for its own business model
This is Part 1 of a four-part series that I will be publishing over the course of a week. It covers how the major vendors are managing and positioning the agentic stack in the Agentic Stack Wars, and what you can do about it. Please keep watching your inbox for the next three parts.
Last week, Sundar Pichai stood on a stage at Google I/O and announced that Gemini would start billing you by compute consumed rather than messages sent. Your usage budget refreshes every five hours. Hit the usage ceiling, and you get bumped down from Flash to Flash-Lite.
Want more? Pay more.
He said this out loud…in a keynote…with slides. In front of journalists, customers, and investors.
Anthropic and OpenAI have been more subtle, not wanting to spook customers while quietly trying to impress investors and corporate bankers. They have been making the same commercial moves for months now to improve their respective IPO prospects.
The difference is that they did it through support page updates, policy emails, and even X posts. Each one was dressed in the language of operational necessity rather than presented as a category pricing decision.
What Google actually announced
Here is the plain-English version, without the marketing gloss.
Google now runs five consumer AI tiers:
Free → AI Plus (roughly $8/month) → AI Pro ($20/month) → AI Ultra at $100/month → AI Ultra at $200/month.
Compute-based billing means your usage is no longer counted by messages or prompts. It is calculated based on the actual weight of what you are asking for.
A simple text query is priced differently from a deep research run, a video generation, or an agentic workflow running repeatedly in the background while you sleep.
When you hit the ceiling on the big models, you get auto-routed to Flash-Lite instead of being cut off entirely.
And if you want to keep running on the good stuff, Google will sell you top-up credits.
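To make the mechanics concrete, here is a minimal Python sketch of a compute budget that refreshes every five hours and downgrades you instead of cutting you off. Everything numeric in it is my assumption: Google has not published per-task weights, so the `TASK_COST` values, the `limit`, and the `ComputeBudget` class itself are purely illustrative, not Google's implementation.

```python
from dataclasses import dataclass

# Hypothetical relative weights. Google has not published per-task costs.
TASK_COST = {
    "text_query": 1,
    "agent_run": 25,
    "deep_research": 40,
    "video_generation": 120,
}

WINDOW_SECONDS = 5 * 60 * 60  # the five-hour refresh described in the keynote


@dataclass
class ComputeBudget:
    limit: int                  # compute units per window (assumed figure)
    used: int = 0
    window_start: float = 0.0   # timestamp, in seconds, when the window opened

    def route(self, task: str, now: float) -> str:
        """Return which model serves the request under this toy policy."""
        # Open a fresh window once five hours have elapsed.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start = now
            self.used = 0
        cost = TASK_COST[task]
        if self.used + cost <= self.limit:
            self.used += cost
            return "flash"
        # Over the ceiling: downgrade rather than refuse the request.
        return "flash-lite"


budget = ComputeBudget(limit=100)
for second, task in [(0, "deep_research"), (60, "deep_research"), (120, "deep_research")]:
    print(task, "->", budget.route(task, now=second))
```

With these made-up numbers, two deep research runs fit inside the window and the third is routed to the lighter model. The point is not the figures but the shape: what you ask for, not how many times you ask, decides when you hit the ceiling.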
Gemini Spark (the always-on background agent that runs in Gmail and Workspace even when your laptop is closed) is currently gated behind the AI Ultra tier at $100 and above.
Because of course it is. Running an autonomous agent that polls your inbox and executes workflows at 2 am genuinely costs more than asking it to summarize a meeting. Charging accordingly is not a betrayal. It is just an honest statement of what the thing actually costs to run.
What is new is that Google said it out loud, in public, at its biggest event of the year, with a slide deck.
The Ladder
Think of the pricing structure as a ladder.
When Gemini was first introduced as a consumer product, Google’s ladder was short: free access with basic Google One storage plans, and, later, a roughly $20-per-month AI premium tier.
At that point, the $20 rung felt like the top tier in consumer AI.
Each year since, Google has added rungs above, below, and around it. The $20 rung stayed roughly where it was. The rest of the ladder grew taller.
Depending on whether you count only the AI-branded plans or the broader Google One storage plan options alongside them, that is now a five- to seven-rung structure.
The $20 rung didn’t necessarily get worse. It definitely includes more features than it did before.
But it is no longer the top, and the more capable (and desirable) features lie above.
The frontier model access, highest usage limits, early access features, and the largest bundled storage now sit above it. To replicate the old feeling of being on the top consumer rung, you now need to spend several times more per month, and, depending on your usage, potentially close to ten times as much.
That is the entire “AI shrinkflation” story told honestly. Your $20 rung has not lost features. But what $20 buys, and what Google now considers its best consumer AI product, have moved further apart. The gap is real and continuing to grow.
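The gap is easy to put numbers on. The sketch below takes the monthly prices listed earlier in this article and expresses each rung as a multiple of the $20 rung. The "(lower)" and "(upper)" labels are mine, added only to tell the two Ultra prices apart, and AI Plus is entered at the approximate $8 figure.

```python
# Monthly prices in USD as listed earlier in this article.
# AI Plus is approximate ("roughly $8"); the Ultra labels are illustrative.
TIERS = [
    ("Free", 0),
    ("AI Plus", 8),
    ("AI Pro", 20),
    ("AI Ultra (lower)", 100),
    ("AI Ultra (upper)", 200),
]
BASELINE = 20  # the old top rung


def multiples_of_baseline(tiers, baseline):
    """Map each tier name to its price as a multiple of the baseline price."""
    return {name: price / baseline for name, price in tiers}


for name, multiple in multiples_of_baseline(TIERS, BASELINE).items():
    print(f"{name}: {multiple:g}x the $20 rung")
```

On list prices alone, the two Ultra rungs come out at five and ten times the $20 rung, which is where the "several times more" and "close to ten times" figures come from.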
The others were doing it too, just more quietly
OpenAI tried to retire GPT-4o in August 2025, midway through the GPT-5 launch, with minimal warning. Users pushed back immediately. The model came back within days, and Sam Altman publicly promised “plenty of notice” before any future retirement.
When the formal retirement came in January 2026, it arrived via a blog post with two weeks’ notice. It was covered by TechCrunch and CNBC and framed as an orderly product transition.
The new $100 Pro tier followed in April, announced via a post on X.
Announcements, yes. But the kind you bump into rather than the kind presented to you.
I covered Anthropic’s approach in detail back in April: the four-week sequence of limit changes and the OpenClaw removal, each move dressed in the language of infrastructure management and fair usage.
You can find that here:
The AI Cage Just Locked: Functionality Constraints Are No Longer Coming — They’re Here
This is breaking news stuff that I am getting out and making free to everybody.
None of it was secret. All of it was managed. The commercial logic was identical to what Google announced last week. The communication choice was simply designed to ‘reduce friction’ along the way.
Google just skipped friction management and ripped the Band-Aid off.
That is what is actually new here.
Conclusion
Over the next three articles, I’m going to talk more about the agentic stack and why, from a commercial mechanics perspective, nothing here is particularly new.
Then in Part 4, we’ll bring it all together to consider how you can respond and prepare for it.
Spoiler Alert: To jump to the end, here are the three critical things you can do to survive this all playing out.
Three things worth doing rather than fuming
1. Work out which tier you actually need, not which tier you’re used to.
Most people are paying for a tier they chose when the structure was simpler, and haven’t reconsidered since. The $20 tier is still genuinely useful for much professional work. The $100 tier makes sense if you are running agentic workflows, working with long-context documents, or doing heavy coding. The $200 tier is for people whose AI usage is genuinely continuous throughout the working week, or who are proto-entrepreneurs.
Being honest with yourself about which category you fall into is more productive than resenting the new structure.
2. Understand what is being metered before you rely on it.
The compute-based model rewards knowing which tasks are expensive. A background agent running all day is expensive. A text summary is not. A deep research run is expensive. A quick draft is not.
Claude Extended Thinking at maximum depth is expensive, as I covered in the Opus 4.7 piece, and it is not always necessary for the task in front of you. Understanding the cost profile of the tools you use most is now basic AI literacy.
3. Do not build workflows you cannot export or easily migrate.
The tighter the metering gets, the more valuable it is to own your prompts, frameworks, and outputs as portable assets rather than platform content.
If a price change or a tier restructure would break something critical to how you work, that is a dependency worth addressing now.
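For the third point, one way to keep a workflow portable is to store it as plain data that you own, outside any one platform. The sketch below is one possible shape, not a standard: the `PortableWorkflow` record, its field names, and the `export_workflows` / `import_workflows` helpers are all hypothetical names I have made up for illustration. It uses only Python's standard library and assumes Python 3.9 or later.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PortableWorkflow:
    name: str
    prompt_template: str   # plain text with {placeholders}, no vendor syntax
    inputs: list[str]      # placeholder names the template expects
    notes: str = ""        # anything you would need to rebuild it elsewhere

    def render(self, **values: str) -> str:
        """Fill the template, failing loudly if a required input is absent."""
        missing = [key for key in self.inputs if key not in values]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        return self.prompt_template.format(**values)


def export_workflows(workflows: list[PortableWorkflow]) -> str:
    """Serialize workflows to a JSON string you can keep in your own files."""
    return json.dumps([asdict(w) for w in workflows], indent=2)


def import_workflows(text: str) -> list[PortableWorkflow]:
    """Rebuild workflows from JSON produced by export_workflows."""
    return [PortableWorkflow(**item) for item in json.loads(text)]


summary = PortableWorkflow(
    name="meeting-summary",
    prompt_template="Summarize this meeting for {audience}:\n{transcript}",
    inputs=["audience", "transcript"],
    notes="Works on small models; no tool use required.",
)
backup = export_workflows([summary])
restored = import_workflows(backup)
print(restored[0].render(audience="the sales team", transcript="(paste transcript here)"))
```

If your prompts live in a file like this, a tier restructure changes where you paste them, not whether you still have them.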
What happened at Google I/O 2026 was not surprising. Anyone watching what Anthropic and OpenAI had been doing quietly for months could see exactly where this was heading.
But what’s more enlightening is that we have entered the part of the cycle in which the industry has decided it no longer needs to manage the message.
The politeness phase has ended.
Keep haverin’.
SM
The Agentic Stack Wars — the full series:
Part One — Confession: (this article)
Part Two — Architecture: (Coming June 2)
Part Three — Extraction: (Coming June 4)
Part Four — Reckoning: The AI Free Lunch Was Always a Fairy Tale (Coming June 6)




