The AI Layoff Spreadsheet Had One Assumption. It Just Inverted.

Eighteen months ago, the business case for replacing a human team with software fit on a single slide. Salary, benefits, and overhead on one side; a per-token API rate on the other. The API rate was a rounding error, and it was getting cheaper every quarter. The decision looked obvious, which is exactly the kind of decision that should make a contrarian nervous.

The slide was built on one load-bearing assumption: that the cost of running the machine would keep falling toward zero. That assumption has now reversed in two places at once — the labor outcome and the hardware floor underneath it — and most of the people who approved those layoffs have not yet noticed the second one.

The canary in the org chart

Start with what is already visible. The companies that moved fastest to swap headcount for automation are quietly walking it back. A February 2026 survey of 600 HR professionals who had executed layoffs in the prior year found that two-thirds of employers who cut roles for AI reasons have already rehired for them, with 32.7% rehiring a quarter to half of eliminated positions and another 35.6% restaffing more than half. The reversals came fast — over half of these leaders rebuilt within six months of cutting.

The reason is not sentimental. More than half of those leaders reported their automated systems demanded far more human oversight than projected, and a fifth said the tools simply underperformed. This is the AI boomerang now landing in customer service, copywriting, software, and HR, where the payroll “saving” evaporated into the hidden cost of hallucinated policies, regulatory exposure, and crisis management. Gartner’s projection that half of all firms cutting service roles for AI will rehire similar functions by 2027 is no longer a forecast; it is a description of a process already underway.

This is the part of the story the labor press has covered well, and it is also the part most likely to be dismissed as an execution problem — a few overeager executives who will be replaced by smarter ones who automate correctly next time. That dismissal misses the structural shift sitting one layer down.

The cost curve everyone extrapolated

The layoff math assumed inference would behave like every other software cost: marginal, deflationary, asymptotically free. And on a pure unit basis, that story is technically true and getting truer. Enterprise data shows the blended cost of a million tokens fell roughly 67% year over year, from $18.40 to $6.07 between early 2025 and early 2026. Gartner expects inference on a trillion-parameter model to cost nearly 90% less by 2030. If you only read the unit-price line, the original spreadsheet still looks vindicated.

The invoice tells the opposite story. The same providers watching per-token prices collapse are sending bills that climb every quarter: token consumption grew 1,001%, fast enough that total spend rose 497% over the same fifteen-month window even as the unit price fell. Agentic workflows fire ten to twenty model calls per task, retrieval pipelines inflate every context window, and always-on monitoring agents burn compute around the clock whether or not a human asked for anything. Replacing a worker does not generate one tidy API call; it generates a continuous, branching, retrying stream of them. The unit got cheaper and the bill got bigger, and it is the bill that hits the income statement.
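
The arithmetic is worth doing by hand, because a bill is just volume times unit price. The sketch below is a back-of-envelope check in Python, not a model of anyone's actual invoice. It uses only the figures quoted above, the helper names are illustrative, and combining the year-over-year price figure with the fifteen-month consumption figure is an assumption, since the two windows may not line up.

```python
def to_multiple(pct_change: float) -> float:
    """Convert a percent change (e.g. 1001 for +1,001%) into a multiplier."""
    return 1.0 + pct_change / 100.0


def to_pct(multiple: float) -> float:
    """Convert a multiplier (e.g. 0.33) back into a percent change."""
    return (multiple - 1.0) * 100.0


# Figures quoted in the article.
price_start, price_end = 18.40, 6.07       # blended $ per million tokens
volume_multiple = to_multiple(1001)        # token consumption, +1,001%
reported_spend_multiple = to_multiple(497) # total spend, +497%

# Unit-price change on its own.
price_multiple = price_end / price_start
unit_price_change = to_pct(price_multiple)

# Assumption: apply the quoted price drop to the quoted volume growth
# as if both covered the same window.
implied_spend_change = to_pct(volume_multiple * price_multiple)

# Effective price change implied by the reported spend and volume figures.
implied_price_change = to_pct(reported_spend_multiple / volume_multiple)

# How far the unit price would need to fall to keep spend flat
# at this level of volume growth.
break_even_price_drop = -to_pct(1.0 / volume_multiple)

print(f"Unit price change:          {unit_price_change:.1f}%")
print(f"Implied spend change:       {implied_spend_change:+.1f}%")
print(f"Implied effective price:    {implied_price_change:.1f}%")
print(f"Break-even price drop:      {break_even_price_drop:.1f}%")
```

Under those assumptions, a 67% price cut against an elevenfold rise in volume still leaves the bill more than three and a half times its starting size, and the unit price would have had to fall by about 91% just to hold spend flat. The reported 497% increase suggests the effective price paid per token fell by less than the blended list price did, though differing measurement windows could account for some of that gap.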

There is a deeper trap inside the cheap-unit narrative. Today’s per-token prices are not a market-clearing price; they are a subsidized one. The frontier labs are pricing inference below cost to capture share, funded by venture capital and hyperscaler cross-subsidies, creating what is best understood as a false floor. A floor held up by capital rather than economics is a floor that moves the moment capital discipline returns. Any business model that booked permanent savings against a temporarily subsidized input has mispriced its own foundation.

The floor just inverted

Here is the piece the layoff spreadsheets never modeled, because in early 2024 it would have sounded absurd: the physical cost of running the machine is now rising, and the thing pushing it up is the same AI buildout that justified the cuts.

The mechanism is memory. The wafer capacity used to manufacture high-bandwidth memory (HBM) for AI accelerators is largely the same capacity used to make the commodity DRAM that goes into every server. Manufacturers chase the premium, and HBM commands more than four times the price of conventional DDR5 server DRAM, so Samsung, SK Hynix, and Micron have reallocated production toward it. Commodity memory gets starved. Server DRAM contract prices rose 43 to 48% in a single quarter at the end of 2025, with further increases above 60% projected through the first half of 2026.

This is not a chip-cycle wobble that mean-reverts in two quarters. The pressure persists as long as accelerator demand keeps priority access to wafer capacity, which is why the European hosting market is repricing wholesale. Hetzner, long the reference point for cheap compute, attributed a 30-to-50% price increase to memory and component costs it called barely comprehensible, with DRAM up roughly 171% year over year. OVHcloud, Netcup, Scaleway, and IONOS are all moving the same direction. The cheap-compute era that made “just run it on a server” a throwaway line is being repriced to a higher plateau through 2027 and possibly 2028.

Stack these movements together and the original decision looks very different. The labor-replacement case assumed the machine would get cheaper to run. Instead, the unit price is subsidized and due to normalize upward, the consumption pattern multiplies the bill regardless of unit price, and the hardware floor beneath all of it is rising on a multi-year structural shortage. Three independent vectors, all pointing the wrong way for anyone who treated automation as a fixed-cost substitution for a variable-cost workforce.

What the contrarian does with this

The consensus has already moved from “AI will replace everyone” to “AI is more expensive than expected.” That is not the contrarian edge — that is yesterday’s contrarianism, now priced in. The edge is recognizing what kind of cost this is. A business that automated a function and dissolved the team did not convert a variable cost into a fixed one. It converted a relatively stable, predictable labor cost into a volatile input exposed to subsidized pricing, consumption explosions, and a global memory shortage simultaneously. The human team you could budget. The new cost structure you cannot, and you no longer have the people who knew how to do the work without it.

The actionable read is not “AI doesn’t work.” It plainly does, in the augmentation pattern the rehiring firms are converging on — machine for the first pass, human for judgment, exceptions, and trust. The read is about which businesses absorbed a hidden short position on compute cost and called it a productivity gain. The ones that indexed their savings to a falling cost curve are the ones whose margins compress when the curve turns, and the curve has turned. Watch for it first in the firms that automated most aggressively, reported the cleanest efficiency stories to shareholders, and have the least pricing power to pass a doubled infrastructure bill downstream. The reckoning will not announce itself as an AI story. It will show up as a margin miss, three weeks to three months after the slide that promised the opposite.