The Agentic Stack Wars: Part Four - RECKONING
The AI Free Lunch was always a Fairy Tale. The old software playbook is back, but this time the mechanics are much more expensive
This is Part 4, the conclusion of our four-part series. It offers advice on how the major vendors in the Agentic Stack Wars are managing and positioning the agentic stack. If you arrived here before reading the other parts, start with Part 1: CONFESSION.
A Tale as Old as Time
The technology industry has a remarkable gift for cultural amnesia. Maybe that’s because the first version of any wave gets built by the young and the starry-eyed, the ones who haven’t been around long enough to recognize the rerun.
Every generation gets told the same thing. That this latest wave of technology will finally free us from the constraints of the last one. More productivity. More leisure. More creativity. More democratization. More abundance.
I still remember being told a version of this in high school in the 1970s. I was thirteen, sitting in a Modern Studies class, being invited to challenge the assumption that technology would inevitably free humanity from drudgery and scarcity. (It was wrapped inside a communism-versus-capitalism debate, which tells you something about both the decade and the teacher...plus ça change!)
Half a century later, and forty years into a career in this industry, I’m still waiting.
The tools changed. The promises changed. The brands doing the promising DEFINITELY changed.
But the economic behavior did not.
Which is why the current shock over AI pricing and tiering feels a wee bit theatrical. Everyone is acting as if Google, OpenAI, Anthropic, Microsoft, Perplexity, and the rest have just invented some strange new commercial playbook.
They haven’t. It is all entirely predictable, once you know the patterns.
It is what software companies have always done.
They carve up value, bundle complexity, and gate features. They build premium access paths, cultivate dependency, and keep the true cost of the stack politely out of sight until you are already standing inside it.
The labels are new. The tactics are as old as the hills.
The Old Software Tricks Never Went Away
For decades, software vendors have made a comfortable living monetizing scarcity, including a fair amount of manufactured scarcity.
A capability already sitting in the code became an “advanced module.”
A function you could switch on with a license key became a “premium feature.”
A workflow that customers assumed was included turned out to live in the enterprise edition.
Maintenance crept up faster than inflation, every year, like clockwork.
Support tiers multiplied.
Implementation services became essential.
Integration work became unavoidable.
Consulting became the shadow product standing quietly behind the software product, with the invoice already drawn up.
Then SaaS arrived and changed the vocabulary.
Licenses became subscriptions. Maintenance became customer success. Version upgrades became continuous delivery. Server installs became platforms. Professional services became Transformation Enablement. Integration became Ecosystem Architecture.
The names changed, but the doing did not:
Land. Expand. Bundle. Gate. Meter. Escalate.
Now AI has arrived, wearing a hoodie, speaking the soft language of assistance, creativity, and human empowerment through effortless coding and agentic workflow.
But underneath the hoodie is the same old commercial instinct: find the control point, own it and then price and monetize it.
I watched these exact relabelling mechanics play out a few weeks ago with the phrase “harness engineering,” a shiny new label bolted onto a discipline the industry has practiced, under other names, for thirty years.
Shiny new word, same work.
The Stack Is the Diagnostic Framework
This is why the architecture matters. Not because the world needs another lovingly arranged taxonomy of agentic tools (we did the actual teardown in Part 2: ARCHITECTURE and I won’t make you sit through it twice). It matters because the architecture shows you what the marketing language is working so hard to obfuscate.
A serious AI agent is not a chatbot, or a model, or a clever interface. It is a stack. Data at the foundation, an orchestrator deciding what needs to happen, a connectivity layer reaching into your other systems, an action layer doing the work, and a runtime burning compute to keep the whole contraption alive.
That is the reality behind the agentic magic trick for both vendors and customers.
Every single layer is a future toll booth.
This Is Not Just Software Anymore
The old packaged-software model had gorgeous, elegant economics for the vendor and investor.
Building it was expensive, and so were sales and support. But once the product existed, the next copy cost almost nothing. A CD cost pennies. A download cost less. A license key, emailed, cost nothing at all.
It meant you could sell something for hundreds of thousands of dollars while the marginal cost of delivery was effectively zero. That was the beautiful, slightly absurd magic of almost 100% gross margins on software.
Then SaaS came and dented the magic a little without destroying it. Now the vendor had to host, secure, scale, patch, monitor, and back up the application. Real cost-to-serve.
But in a mature SaaS business, the economics still worked beautifully, and high gross margins of 65-80% were entirely possible.
AI-as-a-service is a different animal.
With AI-as-a-service, every serious interaction burns real compute. Every long-context session incurs memory and retrieval costs. Every grounded answer may need indexing, storage, search, permissions, and governance across several cycles.
Every agentic workflow may call anywhere from one to dozens of tools, invoke several models, cross a dozen APIs and MCPs, and run in a cloud environment that someone has to secure and monitor.
This is not a static copy of code sitting quietly on your instance in a server somewhere. It is an active operating fabric.
And active operating fabrics cost real money. I ran the numbers on exactly this for a single clinical-reasoning workflow back in July of last year, and a single orchestrated multi-model query came in at five to ten times the cost of a plain single-model one. Extrapolate that across a 3,000-clinician health system and you are into six and seven figures a year, before anyone has so much as mentioned a license fee.
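To make that extrapolation concrete, here is a minimal back-of-the-envelope sketch. Only the 3,000 clinicians and the five-to-ten-times multiplier come from the text above. The baseline cost per query, the queries per clinician per day, and the working days per year are illustrative assumptions, not the figures from the original analysis, so swap in your own.

```python
# Hypothetical inputs: only CLINICIANS and MULTIPLIERS come from the article.
BASELINE_COST_CENTS = 1             # assumed cost of one plain single-model query, in US cents
QUERIES_PER_CLINICIAN_PER_DAY = 20  # assumed usage
WORKING_DAYS_PER_YEAR = 250         # assumed
CLINICIANS = 3_000
MULTIPLIERS = (5, 10)               # orchestrated query cost vs. plain query cost


def annual_cost_usd(multiplier: int) -> int:
    """Yearly inference spend, in whole dollars, if every query is orchestrated."""
    queries_per_year = CLINICIANS * QUERIES_PER_CLINICIAN_PER_DAY * WORKING_DAYS_PER_YEAR
    return queries_per_year * BASELINE_COST_CENTS * multiplier // 100


for m in MULTIPLIERS:
    print(f"{m}x orchestration: ${annual_cost_usd(m):,} per year")
# prints:
# 5x orchestration: $750,000 per year
# 10x orchestration: $1,500,000 per year
```

Even with a one-cent baseline, the bill straddles the line between six and seven figures, and that is inference alone.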
A Power User Is Now a Financial Risk
This is the part a lot of people still haven’t caught up to.
In old software, a power user was great news. Sticky and almost certain to renew. They were the best advocates and generated great case studies. The more they used the product, the better the customer story looked.
In AI, a power user becomes a variable-cost machine.
You see, a casual user asking for a dinner recipe is cheap and easy to serve.
A professional running deep research, analyzing long documents, generating code, comparing contracts, mining the CRM, and orchestrating tasks across half a dozen systems is not cheap. They are building and expanding, with no control gates. That user isn’t just engaged. That user is consuming AI infrastructure and data access like a drunken sailor. The meter runs fast!
And in any serious enterprise organization, the bill doesn’t stop at inference. Grounded data means storage, cleansing, classification, indexing, access control, retention, auditability, and governance. Connectivity means APIs and MCPs, vendor system connectors, identity management, permission boundaries, monitoring, and maintenance that never quite ends.
The vendor has to build and maintain that fabric. Gross margins can no longer be predicted in those heady, high double-digit numbers. The ‘Gravy’ train has slowed to a ‘Treacle’ train. That’s the new AI Stack Wars reality.
So when a vendor introduces usage caps, credits, model routing, or consumption pools, that is not pure cynicism. It is also margin protection. It looks like an upsell. It’s actually planning for survival and the long war!
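The margin-protection logic can be sketched in a few lines. Everything here is an illustrative assumption: the flat $20 price, the serving costs for each subscriber, and the 70% target margin are made-up numbers, and `Subscriber` and `serving_cost_ceiling` are hypothetical names, not any vendor's actual model.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    label: str
    monthly_price_usd: float         # flat subscription price (assumed)
    monthly_serving_cost_usd: float  # inference, retrieval, tooling (assumed)

    def gross_margin(self) -> float:
        """Gross margin as a fraction of revenue; negative means a loss."""
        return (self.monthly_price_usd - self.monthly_serving_cost_usd) / self.monthly_price_usd


def serving_cost_ceiling(price_usd: float, target_margin: float) -> float:
    """Largest monthly serving cost that still meets the target margin."""
    return price_usd * (1.0 - target_margin)


casual = Subscriber("casual", 20.0, 2.0)
power = Subscriber("power", 20.0, 50.0)

for s in (casual, power):
    print(f"{s.label}: margin {s.gross_margin():.0%}")
print(f"cost ceiling for a 70% margin: ${serving_cost_ceiling(20.0, 0.70):.2f}")
```

Under these assumptions the casual subscriber is wonderfully profitable and the power subscriber loses the vendor money every month. A usage cap is simply the serving-cost ceiling expressed in credits.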
The customer, meanwhile, has to configure, govern, and operate that fabric in a messy real-world environment, where costs flex depending on how aggressively users adopt and deploy it.
For the customer’s CIO and CFO, the new reality is one where AI-based tooling becomes a volatile maelstrom of highly dynamic operating costs. Budget predictability just went out the window, with nary a whisper of ROI justification to offer the board when it asks what just happened to the cost base!
The Utopian Story Has Always Been Too Convenient
This is where the conversation gets bigger than pricing and software.
For decades, each new technology wave has arrived carrying a story of liberation. Automation will free us. Computers will free us. The internet will democratize everything. Cloud will make infrastructure effortless. And now? AI will create abundance.
Some of that is true at the level of capability. Technology does make extraordinary things possible.
But capability is not the same as social outcome.
Steam did not end the concentration of power and money.
Neither did electricity when it arrived.
Computing did not end it.
The internet did not end it.
Cloud did not end it.
A Brief Economic and Philosophical Digression
The better angels of human behavior and “The Invisible Hand” could certainly direct positive change, but they tend to be elusive.
Each previous wave changed the machinery of power. None of them abolished it, and I suspect AI will not be any different.
In capitalist systems, the gains will accrue to those who own the platforms, models, data centers, chips, capital, and distribution.
In state-capitalist and authoritarian systems, the same dynamic shows up wearing a uniform: state elites, favored firms, surveillance infrastructure, and politically controlled access.
Different costumes. Similar selfish human behavior.
There will always be people and institutions that want more. More control, more margin, more leverage, more dependency, more power.
That is not a bug in the technology.
It is a feature of human systems.
The Real Question Is Who Owns Each Toll Booth on This Toll Road
So buyers should stop asking only which AI tool has the best demo or which model is best for this or that.
The better questions are harder, and most of them rhyme.
Who owns the data layer I am becoming dependent on?
Who controls the orchestrator and its architecture?
Who maintains the connectors?
Who governs the APIs and MCPs?
Who pays for the compute?
Who can change the limits without notice?
Who can move the feature I rely on into a tier I don’t have access to?
Who can bundle that one tool I need with three I don’t?
Who can raise the price once my workflow is wired in too deeply to move?
Just exactly who controls each and every toll booth?
That is the conversation worth having. Because this was never only a feature race. It is a stack war, and every major player is fighting to own a different layer of the operating environment that the concept of work itself is moving onto.
“Are you ready, Player One?”
Google wants the ecosystem center. Microsoft wants the governance and workflow layer. OpenAI and Anthropic want the intelligence layer. Perplexity wants the browser-native middle ground. The open-source faithful want sovereignty, and inherit the maintenance bill that comes with it.
Different strategies. Same underlying question. Where does the dependency collect?
Because that is exactly where the pricing power will show up.
Now read that list of Toll Road questions back, because there is a second way to use it. Every question points to a booth that somebody else owns. Turn them around, and they become the only move that has ever worked in a market built like this.
Don’t just work out who owns the toll booths. Work out who is trying to convince you to take the new expressway around several toll booths, only to lock you into their part of the stack.
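One way to put the toll booth questions to work is to keep them as a simple register, one row per layer of the stack. This is a minimal sketch, not a governance framework. The `TollBooth` fields, the vendor names, and every value in the register are hypothetical placeholders to be replaced with your own answers.

```python
from dataclasses import dataclass


@dataclass
class TollBooth:
    layer: str
    owner: str                           # "us" or a vendor name (all values here are made up)
    terms_can_change_unilaterally: bool  # limits, tiers, or price can move without notice
    exit_path_tested: bool               # we have actually rehearsed migrating off it


def exposed_layers(booths: list[TollBooth]) -> list[str]:
    """Layers someone else owns, can reprice at will, and we have never rehearsed leaving."""
    return [
        b.layer
        for b in booths
        if b.owner != "us" and b.terms_can_change_unilaterally and not b.exit_path_tested
    ]


register = [
    TollBooth("data", "us", False, True),
    TollBooth("orchestrator", "VendorA", True, False),
    TollBooth("connectivity", "VendorA", True, True),
    TollBooth("action", "VendorB", True, False),
    TollBooth("runtime", "VendorC", False, False),
]

print(exposed_layers(register))  # ['orchestrator', 'action']
```

The layers that come back are where the dependency is collecting, and therefore where the pricing power is most likely to show up first.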
Institutional Memory
And don’t forget. You already own part of that Toll Road and have half forgotten about it. Your data is a toll booth, too. The proprietary record of how your organization actually works, which no general model can reconstruct from the outside. So is the institutional memory wrapped around it, the hard-won understanding of which exception matters and which precedent applies, the part that currently lives in your best people’s heads and walks out the door when they leave or you RIF them.
And there is a newer one that almost nobody is collecting yet. The moment you put agents to work, they leave a trace of every step taken and every point where a human stepped in to overrule them. That trace gets quietly filed away.
Most firms will keep those traces at first, mainly to justify the bill, and they’ll argue they are encoding that experience into the new operating paradigm. Then they will notice the trace is worth more than the AI stack bill, because it is the truest record they have ever held of how the place actually decides, operates, and innovates.
Unlike the model you rent and the platform you rent it on, this asset, this toll booth, is yours, and it cannot be repossessed when the contract ends. Plan to hold it close because, whether you believe it or not, institutional memory is already leaking out of your organization at a rate of knots. It’s leaving through the front door as your knowledge workers leave or get RIF’d. And it’s leaking out the back door to the very vendors and models you have servicing the rest of your AI stack.
There Is No Free Lunch
The free phase was never the destination. It was always the on-ramp.
It trained and habituated us. It taught us to bring our work, our questions, our files, our half-formed ideas into the machine. That wasn’t charity. It was market formation and, more surreptitiously, harvesting.
Now the infrastructure bill is coming due.
There will be a great deal of talk about responsible scaling, premium experiences, and enterprise-grade controls, and some of it is materially true. But the plain version is shorter.
The free lunch was promotional. Next comes the real invoice.
The only real difference from every cycle before it is that this lunch isn’t being copied onto a CD for pennies. It is being cooked in data centers, served through models, grounded in your corporate data, piped through brittle APIs, orchestrated across cloud runtimes, and powered by warehouses full of GPUs.
Somebody is going to pay for it.
Probably you.
Probably your employer.
Probably the taxpayer.
Probably all of the above.
I have watched that part come true in every technology cycle since the 1970s, and this one will be no kinder about it.
The Disney ending?
But paying the villains is not the only thing that happens in this story, and it never was. The question that decides where you actually stand is not whether the invoice arrives. It is whether you kept any ground of your own worth charging for (the data and the hard-won judgment a model cannot copy), and whether a human stayed in charge of the decisions that genuinely matter. That is the one input that no vendor meter has ever found a way to price. Otherwise, they’d be in your business as your competitor.
That part is still yours. Holding onto it on purpose, rather than by luck, is the whole game now, and it is a longer story than this one. It is the next one I might tell.
Conclusion
In Part 1, I gave you a Spoiler to the series. I’ll repeat those three pieces of advice here, because they bear repeating as we close out.
Three things worth doing rather than fuming
1. Work out which tier of each layer you actually need, not which tier you’re used to or being offered.
Most people are paying for a tier they chose when the structure was simpler, and haven’t reconsidered since. The $20 tier is still genuinely useful for much professional work. The $100 tier makes sense if you are running agentic workflows, working with long-context documents, or doing heavy coding. The $200 tier is for people whose AI usage is genuinely continuous throughout the working week, or who are proto-entrepreneurs.
Being honest with yourself about which category you fall into is more productive than resenting the new structure.
2. Understand what is being metered before you rely on it.
The compute-based model rewards knowing which tasks are expensive. A background agent running all day is expensive. A text summary is not. A deep research run is expensive. A quick draft is not.
Claude Extended Thinking at maximum depth is expensive, as I covered in the Opus 4.7 piece, and it is not always necessary for the task in front of you. Understanding the cost profile of the tools you use most is now basic AI literacy.
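A rough cost profile is easy to sketch. The example below assumes, purely for illustration, that metering scales with the number of model calls multiplied by the tokens per call. The credit rate and the shape of each task are invented numbers, not any vendor's published pricing.

```python
# Hypothetical rate and task shapes, for illustration only.
CREDITS_PER_1K_TOKENS = 1

TASKS = {
    # name: (model calls, tokens per call)
    "quick draft": (1, 2_000),
    "text summary": (1, 8_000),
    "deep research run": (40, 20_000),
    "background agent, all day": (500, 10_000),
}


def task_credits(calls: int, tokens_per_call: int) -> int:
    """Credits consumed by a task under the assumed per-token rate."""
    return calls * tokens_per_call * CREDITS_PER_1K_TOKENS // 1_000


# Most expensive first.
ranked = sorted(TASKS, key=lambda name: task_credits(*TASKS[name]), reverse=True)
for name in ranked:
    print(f"{name}: {task_credits(*TASKS[name]):,} credits")
```

The absolute numbers matter less than the ratios. Under these assumptions the all-day agent costs thousands of times what the quick draft does, and that gap is worth knowing before you leave one running.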
3. Do not build workflows you cannot export or easily migrate.
The tighter the metering gets, the more valuable it is to own your prompts, skills, frameworks, and outputs as portable assets rather than platform content. Understand how you’d migrate them if the day comes when you have to.
If a price change or a tier restructure would break something critical to how you work, that is a dependency worth addressing now.
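Portability can be as plain as keeping your prompts in a format no platform owns. The following is a minimal sketch using only the Python standard library. The `PromptAsset` fields are a hypothetical schema chosen for illustration, not a standard, and the example prompt is made up.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PromptAsset:
    name: str
    purpose: str
    prompt_text: str
    tags: list[str] = field(default_factory=list)


def export_assets(assets: list[PromptAsset]) -> str:
    """Serialize prompts to plain JSON that any tool or platform can read."""
    return json.dumps([asdict(a) for a in assets], indent=2, sort_keys=True)


def import_assets(payload: str) -> list[PromptAsset]:
    """Rebuild the prompt library from a JSON export."""
    return [PromptAsset(**item) for item in json.loads(payload)]


library = [
    PromptAsset(
        name="contract-compare",
        purpose="Summarize the differences between two contract drafts",
        prompt_text="Compare the two documents below and list every changed clause.",
        tags=["legal", "review"],
    )
]

payload = export_assets(library)
restored = import_assets(payload)
print(restored == library)  # True
```

If the round trip works outside the platform, the asset is yours. If it only exists inside a vendor's workspace, it is platform content.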
Keep haverin’.
SM
The Agentic Stack Wars — the full series:
Part One — Confession: Google (Finally) Just Said The Quiet Part Out Loud
Part Two — Architecture: Same Stack, Different Hoodie
Part Three — Extraction: Your AI Budget Is Already Wrong
Part Four — Reckoning: The AI Free Lunch Was Always a Fairy Tale (this article)





