Token-related cost cuts and the Yellow Brick Road — are corporates redirecting payroll into a productivity dividend they keep, or up the AI stack into a meter they don't?

2026-05-28 · long-form

Executive summary

The wave of "AI-attributed" cost cuts compounding through Q2 2026 — Meta's April 23 announcement of 8,000 layoffs plus 6,000 cancelled requisitions T2; Microsoft's first-ever voluntary retirement programme covering ~8,750 US employees T2; Oracle's 30,000-person cut in March funding a $156B AI infrastructure build T1; Amazon's 30,000+ corporate roles eliminated since October 2025 T3; cumulative ~134,000 tech workers cut YTD by 2026-05-27 T3 — is most often read as evidence that the AI productivity revolution has arrived and that the cutting companies are converting payroll into a higher-return capital expenditure. The headline reads as bullish: companies cutting heads to fund AI capex are simultaneously (i) cheaper to run and (ii) producing more, so margins and earnings power both expand.

The evidence from inside the same companies says something narrower and more inconvenient. Uber burned its entire 2026 AI coding-tools budget in four months T2, and Uber's COO Andrew Macdonald told the Rapid Response podcast that the link from rising AI token spend to consumer-product improvement "is not there yet… it's very hard to draw a line between one of those stats and 'Okay now we're actually producing like 25% more useful consumer features'" T2. Microsoft is cancelling most direct Claude Code licences inside its Experiences and Devices group — Windows, Microsoft 365, Outlook, Teams, Surface — six months after rolling them out, with a June 30 fiscal-year-end deadline for migration to GitHub Copilot CLI T2. Klarna has spent 2025-2026 quietly rehiring the customer-service team it replaced with AI in 2024 T3. Nvidia's vice-president of applied deep learning Bryan Catanzaro — the chip company saying it — told Axios that "for my team, the cost of compute is far beyond the costs of the employees" T3. These are not three anecdotes. They are the operational interior of the cost-cut wave, and they argue that a non-trivial share of the announced labor savings is being transferred up the AI stack to model labs and inference providers rather than retained as productivity dividend by the cutting companies.

Joe Schmidt's A16Z essay "Avoiding Death on the Yellow Brick Road" — published 2026-05-27, hours before this report goes to file T1 — supplies the structural map of why this is happening and what it predicts. Companies adopting horizontal AI tools (Claude Code, Cursor, generic agentic copilots over off-the-shelf connectors) are walking the path the labs are purpose-built to own — model plus tool calls plus distribution — and the token-pricing meter at that layer is the labs' value-capture mechanism. The variant perception this report defends is that consensus reads the cost-cut wave as evidence of a uniform productivity dividend accruing to the cutting cohort, while the operational evidence says the dividend is uneven: it accrues most to companies that own the vertical depth A16Z calls "the rest of Oz" — workflows, governance, fine-tuned routing, the system of work the customer runs through — and least to companies that hand every employee a horizontal token-priced seat and meter the consequences. For AlphaSteve, this is operationally important because the equity-market cohort the _house-view already tracks as priced for the consensus reading — AI-infrastructure cohort at MU/SK Hynix $1T re-rating intensity; AI-touched-software cohort at top-decile pre-print multiples — is mis-pricing the transition cost of getting from "headcount cut and Claude seat for everyone" to a sustained vertical-depth productivity gain, and is conflating layer-1/layer-2 winners (silicon, hyperscale) with the eventual layer-5 winners (vertical-application incumbents). The cleanest near-term test is the cohort beat-and-fade pattern in vertical-application names that already have the four A16Z moats (Salesforce Agentforce, Microsoft itself via Copilot stack vertical integration) versus horizontal AI-tool consumers without an internal stack to fall back on.

House view reconciliation

Three current house-view positions speak to this question. The Software / SaaS valuation environment position (last reviewed 2026-05-28 AM) holds at medium confidence that "AI-touched software is being indiscriminately compressed on confirm-rather-than-accelerate guidance," with the discriminating cut refined to FY-trajectory-vs-implied irrespective of capital-allocation lever — the $25B CRM accelerated buyback this week didn't lift the multiple. The AI infrastructure capacity position (last reviewed 2026-05-28 AM) holds at high confidence on the HBM-replaces-CoWoS constraint inversion, and at medium confidence on the duration of the constraint, with the variant view (constraint-duration over-extrapolated; structural-vs-cyclical balance tilting cyclical) sharpening to "central but not yet sharp" after MRVL's textbook acceleration print drew only a flat after-hours close. The Equity-market cycle position (last reviewed 2026-05-28 AM) holds at medium confidence on late-cycle territory with the patience-window argument doubly vindicated by yesterday's rotational tape and this morning's multi-region risk-off.

This report extends all three positions along a single axis the house view has not yet named: the application-layer bifurcation between (a) Yellow-Brick-Road consumers — companies adopting horizontal AI tools with metered token pricing as their primary AI strategy — and (b) Rest-of-Oz incumbents — companies that already have vertical workflow depth, governance, fine-tuned model routing, or an owned-stack alternative. The cohort beat-and-fade pattern in (a) is consistent with the current Software / SaaS framing; what this report adds is the causal mechanism — namely, that the productivity story implicit in the multiple is being undercut at the operational layer by token-cost structures that punish heavy usage faster than they reward it. The AI infrastructure capacity position is extended in the obverse direction: the per-token unit economics are exactly what is keeping demand for inference compute structurally high, even as customer enterprise budgets bend toward cap-and-meter procurement that Microsoft's Claude Code retreat just demonstrated in production. The equity-market cycle position is extended only mildly: late-cycle selectivity has another lens — which names are positioned for the value transfer up the stack vs. the value retention at the application layer.

This is not a fundamental conflict with any existing position. It is an axis the house view should track explicitly. The specific house-view edits this report proposes are below in the "House view changes" section.

The setup

Three things are happening at once that, taken together, redefine the AI productivity question for the second half of 2026.

First, the cost-cut wave is now multi-company, multi-quarter, and management-attributed. April 23 alone produced 23,000 positions eliminated or never filled at two companies posting record revenue T2: Meta with 8,000 layoffs and 6,000 cancelled requisitions, Microsoft with 8,750-eligible voluntary buyouts under a "Rule of 70" formula. Oracle's March 30,000 cut explicitly freed $8-10B in annual cash flow to fund a $156B AI infrastructure build T1. Amazon has eliminated 30,000+ corporate roles since October 2025 T3. Cumulative 2026 YTD: ~134,603 tech workers laid off across 212 layoff events through 2026-05-27, ~916/day, with 48% explicitly AI-attributed by the cutting companies T3. The narrative is no longer "AI restructuring" with an asterisk; the companies are stating the substitution explicitly.

Second, the operational interior of those substitutions is producing unexpected results. Uber's chief technology officer Praveen Neppalli Naga told The Information in April that the company had burned through its entire 2026 AI coding tools budget in four months T3. His own figures showed Claude Code use jumping from 32% to 84% of Uber's ~5,000-engineer organisation between February and March; individual engineers spending $500-$2,000/month on tokens; ~70% of code committed at Uber originating with AI; ~10% of live backend updates shipped by an agent with no human in the loop T3. Naga's reported summary: "I'm back to the drawing board because the budget I thought I would need is blown away already" T3. Uber's COO Andrew Macdonald followed on the Rapid Response podcast on 2026-05-26 with the line that anchors this report: "That link is not there yet. Maybe implicitly there's more that is getting shipped, but it's very hard to draw a line between one of those stats and 'Okay now we're actually producing like 25% more useful consumer features.' If you're not actually able to draw a direct line to how [many] useful features and functionality you're shipping to your users, that trade becomes harder to justify" T2. This is the COO of one of the most AI-forward Silicon Valley operators publicly admitting he cannot draw a measurable line from token spend to consumer-product improvement.

Third — and this is the cleanest tape signal — Microsoft is unwinding the experiment. According to The Verge and Windows Central via TNW reporting T3, Microsoft is cancelling most direct Claude Code licences inside its Experiences and Devices group — Windows, Microsoft 365, Outlook, Teams, Surface — and instructing affected engineers to migrate to GitHub Copilot CLI by 30 June, the last day of Microsoft's fiscal year. Microsoft made Claude Code available in December 2025 to thousands of developers, project managers, designers, and other employees; it is reversing course six months later. The cancellation does not affect Microsoft's Foundry deal with Anthropic ($5B Microsoft investment + Anthropic's $30B Azure compute commitment) T3 — Microsoft remains a strategic partner at the model and compute layer. The retreat is at the seat layer. The company with the most leverage in the room, whose own engineers preferred Claude Code, is walking away on a fiscal-year-end timeline.

Joe Schmidt's A16Z essay "Avoiding Death on the Yellow Brick Road" published the same week — 2026-05-27, 13:55 UTC T1 — provides the framework that makes these three things one story rather than three. The essay's core claim: the labs (OpenAI/Anthropic) own the horizontal-agent layer structurally because that lane improves with raw model capability and they control model, distribution, and architectural choices; the Rest of Oz — vertical, complex, multi-step workflows — is defensible against the labs through four moats (within-customer and across-customer data flywheels, cross-vendor model routing, cost optimization via tiered/fine-tuned models, and governance/compliance as a control plane). The model is fungible underneath; the system of work is not. The empirical lens this report applies: where does the cost-cut wave's labor savings actually go, who keeps it, and which layer does the A16Z framework predict will compound it over the deep-value horizon.

The analysis

The arithmetic of the substitution — what is actually being converted into what

The cost-cut wave reads as labor-to-capital substitution, and at the aggregate level it is. Meta's 2026 capex guide is $115-135B, nearly double the $72B spent in 2025, against full-year 2025 revenue of $201B and Q4 net income of $22.8B T2. Microsoft committed $81B in capital investment last fiscal year and has guided higher T2. Oracle's restructuring frees $8-10B in cash flow for AI infrastructure against a $156B buildout T3. The arithmetic cannot be that severance payments fund the buildout: even the high-end of severance for 30,000 Oracle employees ($2-3B at $80-100K average) is a rounding error against $156B in infrastructure spend. The cost-cut wave is not funding the AI buildout. It is signaling that management believes the current workforce configuration is obsolete and that the future configuration will be smaller plus more compute-equipped.

The substitution decomposes by layer. At the top of the stack — model labs (Anthropic, OpenAI) and inference providers (Microsoft Azure, AWS, GCP, CoreWeave) — the cost-cut wave's labor savings translate into customer enterprise budget that arrives at the lab/inference layer as token revenue. Anthropic shifted its agent pricing from flat-fee to per-token usage-based T2. OpenAI's Sam Altman, in March: "We see a future where intelligence is a utility, like electricity or water, and people buy it from us on a meter" T2. The Uber experience is the meter running. Gartner's forecast: AI agent software spending reaches ~$207B in 2026, +139% from $86.4B in 2025 T3. Goldman's projection: agentic AI drives a 24-fold increase in token consumption by 2030, to ~120 quadrillion tokens/month T3. Even Gartner's prediction that 1T-parameter inference cost falls ~90% by 2030 T3 does not break the dynamic — Gartner's own caveat is that cheaper tokens won't translate to cheaper enterprise AI because agentic models require far more tokens per task than standard models, and providers won't fully pass through lower costs. The unit pricing of intelligence falls; the bill rises.

At the application layer — the companies actually deploying AI agents to do something — the substitution is doing two things simultaneously, and they are in tension. It is converting some labor into capital (the headcount cuts, the capex). It is also converting some labor into operating expense (the token meter). The first conversion accrues to the cutter's margin if the productivity arrives. The second conversion accrues to the lab/inference layer's revenue regardless of whether the productivity arrives. Uber's burned 2026 AI coding budget is the second conversion running ahead of the first. Microsoft's Claude Code retreat is the cutter declining to let the second conversion compound further.

The Yellow Brick Road as cost-structure trap

Schmidt's essay names the trap structurally: the lane where the labs are best-suited is precisely the lane where the problem "improves with raw model capability," which means every dollar of model improvement compounds the labs' advantage, and every dollar of customer usage compounds the labs' revenue. Horizontal AI coding tools — Claude Code, Cursor, generic copilot-over-connectors — sit in this lane by construction. They are good enough that engineers use them constantly, and constant use under per-token pricing produces budget-blowing bills that no enterprise procurement function modelled at adoption time.

The TNW analysis crystallizes the procurement mismatch T3: "A traditional enterprise software deal is denominated in users. A token-priced deal is denominated in how much the model has to think. Agentic coding makes the model think a lot. Sessions run for hours, spawn parallel threads and generate volumes of context that bear no resemblance to the autocomplete interactions that shaped the original pricing structure." GitHub paused new Copilot Pro and Pro+ signups in November 2025 because agentic workloads on paying customers were generating costs that exceeded their monthly plan price T3. Anthropic banned the open-source OpenClaw framework from running on consumer Claude subscriptions after discovering single instances chewing through $1,000-$5,000/day in API equivalent on $200/month Max plans T3.

This is not a temporary calibration issue. Token-priced models that improve their reasoning capability per query thereby consume more tokens per query, not fewer. Anthropic's own infrastructure team has publicly described reasoning workloads as generating order-of-magnitude more compute per query than chat T3. The pricing curve and the capability curve are co-moving in the same direction. Cheaper per token, but more tokens per task, with task complexity rising over time. The Gartner take quoted by Fortune: "Chief Product Officers should not confuse the deflation of commodity tokens with the democratization of frontier reasoning" T3.

The companies that hit this wall hit it from the operational interior, not from the budget meeting. Uber's leaderboard incentive — ranking engineers by token consumption T3 — perfectly demonstrates that adoption pressure plus per-token pricing produces an unbounded cost line. Meta's "Claudeonomics" employee-built dashboard tracked the same dynamic until Meta took it down in April T3. Amazon is reportedly pushing employees to "tokenmaxx," to use as many AI tokens as possible T3. The cultural-and-incentive layer is amplifying the procurement mismatch, not damping it.

The Rest of Oz — why the four moats are the value-retention layer

Schmidt's four moats for the "Rest of Oz" map directly onto where the labor-cost savings can be retained by the application-layer company rather than transferred up the stack to the labs:

Data and learning flywheels are the workflow-specific knowledge the labs cannot reach by spinning up a fresh agent: which insurance risks get escalated, which sales conversation patterns close in a specific vertical, which legal-redline edge cases the partners refuse to allow. Schmidt: "A company that has run its agents through a hundred legal redlines, a thousand insurance underwriting cycles, or ten thousand SDR campaigns has internalized the shape of the problem in a way the next entrant cannot replicate by spinning up a fresh agent for the first time" T1. Uber's consumer-product AI use cases — pricing, routing, predictive features — sit in this category and have for years; that is the part of Uber's AI strategy that does compound. The Claude Code spend that the COO can't draw a line to is the part that doesn't — it's the horizontal-tool consumption, not the workflow-specific intelligence.

Managing model variability is cross-vendor routing: picking the right model for each sub-task across the entire model market, not just what the parent lab ships, plus absorbing the migration cost when a new model lands. The labs cannot do this on the customer's behalf — they sell their next model and tell you to migrate. Vertical-application companies that build cross-vendor routing infrastructure capture both the cost optimization and the continuity. Microsoft's Claude Code retreat to GitHub Copilot CLI is exactly this play, executed at the largest software company in the world. The retreat works because Microsoft has its own model and tool to fall back on. Uber does not.

Cost optimization is model tiering — frontier for the hardest tasks, mid-tier for the bulk, smaller custom or fine-tuned for the narrow slice. Schmidt's framing: "The labs price the floor: the least intelligence available at $X. The Rest of Oz company sells the inverse — the lowest dollar cost for the specific level of intelligence the workflow actually requires" T1. The companies cutting headcount without building this tiered routing capability are buying the floor — frontier-priced intelligence for every task that engineers ask of it. That is the procurement mistake the cost-cut wave is sitting on top of. Salesforce's Agentforce running on Salesforce's own infrastructure, with model routing across its own stack, is the structural counter-example.

Governance is the control plane: permissions, auditing, what-the-agent-is-allowed-to-do, what-the-agent-actually-did, regulatory absorption by vertical (HIPAA in healthcare, SEC/FINRA in finance, state insurance regulations, bar rules in legal). Schmidt: "A horizontal player can't credibly do that without becoming a hundred different verticals at once. CIOs want to have a partner that contractually states they are handling compliance for the agents they are providing" T1. The application-layer companies that have already built this — Salesforce, ServiceNow, the vertical-SaaS incumbents in legal/insurance/healthcare — own the surface where headcount substitutions plausibly compound.

The Klarna case is the cleanest cautionary tale for what happens when an application-layer company adopts the substitution narrative without the four moats. Klarna replaced approximately 700 customer-service agents with AI in 2024, claiming AI-equivalent quality, and is spending 2025-2026 quietly rehiring T3. The reversal was not driven by AI's technical failure on routine queries — AI performed adequately on those — but by customer-satisfaction deterioration on complex, emotionally-loaded, multi-step interactions (fraud reports, billing disputes, account closures) that the routing infrastructure (the second moat) and the workflow-specific knowledge (the first moat) couldn't deliver. The reversal cost exceeded the projected $40M savings because rehiring carried both recruiting cost and a reputational headwind in attracting candidates T3. Duolingo's Luis von Ahn similarly walked back his April 2025 "AI-first" declaration in May 2025, saying he doesn't see the tech replacing the tasks his employees perform T2.

The value-capture map under both readings

Where does the labor savings actually go? Under the consensus reading, the answer is "to the cutter's margin." Under the operational-interior reading this report defends, the answer decomposes by layer:

Layer	Cohort	What captures of the savings	Read
1 — Silicon / HBM	NVDA, MU, SK Hynix, Samsung, TSM, AMAT	Direct: capex spending flows through inference build ai-infrastructure-capacity	Currently priced at maximum cohort intensity; constraint-inversion thesis intact, duration variant elevated
2 — Hyperscale / inference	MSFT (Azure), AMZN (AWS), GOOG (GCP), ORCL, CoreWeave	Direct: the meter runs on their compute; Microsoft uniquely captures both sides via Foundry deal	Strong direct beneficiary; durable if usage compounds, vulnerable if customers cap/meter procurement
3 — Model labs	Anthropic, OpenAI, xAI, Mistral	Direct: per-token revenue is the substitution-revenue stream	Pricing power is real today, vulnerable to (a) open-source competition, (b) customer-cap procurement, (c) the "trough of disillusionment" — Gartner has GenAI there T3
4 — Horizontal AI apps	Cowork, Codex (lab-owned); Claude Code, Cursor (Cursor independent)	Direct seat/token revenue from "Yellow Brick Road" walkers	Lab-owned variants structurally favored; independents face the labs' distribution and architectural-control advantages T1
5 — Vertical applications	CRM (Agentforce), NOW (workflow), vertical SaaS incumbents in legal/insurance/healthcare, Palantir Foundry	Indirect: capture savings via the four moats — workflow depth, model routing, fine-tuned cost optimization, governance	Theoretically the durable beneficiary T1; not yet uniformly priced for this — bifurcating
6 — The cutting companies themselves	Meta, Microsoft, Oracle, Amazon, Uber, et al.	The residual after layers 1-5 take their share; positive if the operational substitution actually arrives, slim if Uber's COO is the cohort signal	This is the contested layer

The consensus reading assumes layer 6 captures most of the savings. The operational-interior reading suggests layers 1-3 capture most of it in the near term, with the layer-5 winners eventually capturing the durable share — but only the vertical incumbents that have already built the moats. The "AI-attributed" cost cuts at layer-6 companies can convert into margin if those companies move down-stack to vertical depth (Microsoft is doing this with the Claude Code retreat to GitHub Copilot CLI; Salesforce has been doing this with Agentforce on its own stack). The companies that can't move down-stack — Uber, the cohort of horizontal-AI consumers without a vertical option — pay the meter and report "AI-attributed efficiency" while their COO admits he can't measure it.

The Macdonald quote is the read most worth dwelling on. "That link is not there yet." It is operationally identical to the Klarna pattern, in code instead of customer service: a horizontal-AI deployment that produces measurable cost-line growth ahead of measurable output-line improvement. The two cases are different in important ways — Uber is talking about an internal productivity tool, Klarna was talking about customer-facing replacement — but the asymmetry between the cost and benefit timelines is the same. The 96,000+ tech layoffs through April T2 and the 134,000+ through May 27 T3 are being announced under a productivity narrative whose operational interior, in Uber's COO's own words, is not there yet.

Variant perception

The consensus framing of the cost-cut wave reads as follows: AI capability has reached a productivity threshold; cutting companies are converting payroll into AI capex; the substitution is real; layer-6 companies (the cutters) capture most of the savings via margin expansion; this is bullish for the entire cohort because it broadens the AI productivity dividend out from the Mag 7 / AI-infrastructure cohort into the broader S&P. The market positioning consistent with this read is the cohort multiple sustained at top-decile pre-print levels in AI-infrastructure (MU $1T, SK Hynix $1T in 24 hours, KOSPI record close ai-infrastructure-capacity) and a forgiving stance on AI-touched-software guidance even when the trajectory only confirms (the May 4 PLTR reaction the 2026-05-25-pltr-beat-and-fade-bifurcation paper named is the cohort's first refutation; the May 27 CRM "modest negative" close-of-AH on Agentforce-plus-$25B-buyback is the second).

AlphaSteve's variant perception is layered, not contradictory to consensus: the substitution is real in the cost-cut sense (the headcount is going down, the capex is going up), but the value capture is mis-mapped by the cohort multiple. The savings are accruing more heavily to layers 1-3 (silicon, hyperscale, labs) in the near term than to layer 6 (the cutters), with layer 5 (vertical-application incumbents with the four moats) being the durable destination on the deep-value horizon. The cohort that is priced as if it captures the savings (broader application-layer software at top-decile multiples) and the cohort that actually captures the savings (vertical incumbents with workflow depth + governance + routing + cost-optimization, plus the layer-1/2/3 providers) are not the same cohort. The mis-pricing is concentrated in the application-layer software cohort that is mostly walking the Yellow Brick Road (consuming horizontal tools with metered pricing) and being valued as if it were in the Rest of Oz (with the four moats compounding).

The variant is load-bearing on three observable relationships that this report will track:

The relationship between announced-AI-cost-cut magnitude and subsequent-quarter operating-margin delivery at layer-6 companies. The consensus reading predicts the margins follow the cuts within 2-4 quarters. The variant predicts a meaningful subset of names (the horizontal-tool consumers without vertical stack) deliver disappointing margin expansion because the token meter offsets the headcount savings. Uber's Q1 2026 R&D spend was $951M (+17% YoY) against a backdrop of 10% AI-attributable code commits T2 — the cost line is rising as a fraction of revenue while the productivity line is unmeasured.
The relationship between cohort multiple and whether the company has a Rest-of-Oz alternative. Microsoft can retreat to GitHub Copilot CLI because it owns that stack; the multiple should hold or expand for that reason. Salesforce's Agentforce ARR is +205% YoY and the multiple compressed anyway on FY-trajectory-vs-implied 2026-05-25-pltr-beat-and-fade-bifurcation — but the underlying business is operating in the Rest of Oz with three of the four moats already real. The variant predicts the cohort eventually re-sorts: the names with the four moats compound the substitution, the names without them deliver disappointing substitution math and re-rate further.
The relationship between lab pricing-power durability and enterprise-procurement evolution toward cap-and-meter. The TNW analysis on Microsoft's Claude Code retreat predicts that enterprise AI procurement moves "from licences to AWS-style metered billing with caps." If enterprises hold the line on capped procurement, layer-3 (the labs) loses pricing power on the upside even as token volume grows; the per-token deflation begins to actually pass through. The variant predicts capped procurement is the structural endpoint within 12-18 months; cohort-mis-pricing at layer 4 (lab-owned horizontal AI apps) re-rates lower as a result.

What would falsify the variant: clean evidence within the next 2-4 quarters that layer-6 companies are delivering operating-margin expansion in proportion to their announced cost cuts (not just expansion period — in proportion — i.e., the layer-6 margin improvement equals the labor-cost reduction minus a modest token-spend offset). Specifically: Meta's Q2 print on operating margin against $115-135B 2026 capex (the magnitude of the substitution being absorbed); Uber's Q2/Q3 visible operating leverage with token-spend transparency; Oracle's FY2026 Q4 against $156B AI capex against the $8-10B labor savings. If those names show clean operating leverage in proportion to the announced cuts, the variant is wrong and the consensus mapping holds. If they show the operational interior the Macdonald quote anticipates — cost-cut announcements followed by middle-of-the-pack margin delivery — the variant strengthens.

What would confirm the variant: further retreats by layer-6 companies from horizontal-AI tools toward internal/vertical alternatives (Microsoft's Copilot CLI move, generalized to other names); further token-budget overruns from layer-6 companies that publicly disclose them (Uber being the first; expect 2-3 more by Q3 print season); continued bifurcation in AI-touched software between vertical-depth incumbents and horizontal-tool consumers in the cohort beat-and-fade pattern.

Implications for AlphaSteve

The deep-value implication is that the cost-cut wave is being read as a single coherent productivity-dividend story by the equity-market cohort, while the operational interior of the wave shows a value-capture map with three plausible winners (silicon, hyperscale, vertical incumbents with the four moats) and two plausible disappointments (horizontal-AI consumers without vertical depth; lab-owned horizontal-AI apps as customer procurement disciplines tighten). The consensus mis-mapping is the variant evidence the discipline is well-served by working through. No new names enter the workbench today; the implication is to re-sort the existing screens by which side of the Yellow Brick Road / Rest of Oz boundary they sit on, and to track the relationships named above for evidence the variant is strengthening or weakening.

Portfolio: No position changes. Cash posture intact.
Watchlist: No name additions today. PLTR remains a re-read candidate; under the variant, PLTR's Foundry-as-system-of-work qualifies it as a Rest-of-Oz incumbent with at least three moats (governance via the regulated-customer base, workflow depth in the verticals it has penetrated, partial cross-vendor routing through its model-agnostic positioning). The trigger at $29 is unchanged but the under-the-multiple structural quality is higher than the cohort framing implies.
Theses on the workbench: [MP Materials](/brain/mp materials) structurally unaffected (critical-minerals theme is independent). The thesis pipeline gets a new screening criterion: for any application-software name, score against the four A16Z moats explicitly before promotion. Names scoring 3/4 or 4/4 are Rest-of-Oz candidates; names scoring 0/4 or 1/4 are Yellow-Brick-Road consumers.
Sectors: AI-touched software cohort framing is extended with the application-layer bifurcation axis. Vertical-application incumbents (CRM, NOW, vertical SaaS in legal/insurance/healthcare) and platform incumbents with internal-stack alternatives (MSFT specifically) sort to one side; horizontal-AI consumers and lab-owned horizontal-AI apps sort to the other.
House view updates: Software / SaaS valuation environment position to be extended with the application-layer bifurcation observation (see "House view changes" below). A new emerging theme to be added: "Token-cost substitution running ahead of productivity validation." AI infrastructure capacity position to be extended with the cap-and-meter customer-procurement signal (Microsoft Claude Code retreat as the cleanest data point yet on the customer-procurement-evolution dimension).
Daily-scan adjustments: Add three recurring screening criteria — (i) token-budget overrun disclosures from layer-6 companies (Uber-pattern; track for 2-3 more by Q3); (ii) horizontal-AI tool retreats by major enterprises in favor of internal/vertical alternatives (Microsoft-pattern; track for 2-3 more); (iii) cohort beat-and-fade differentiation within AI-touched software by four-moat score.

Charts / data

The token-cost-substitution evidence table — through 2026-05-28

Company	Action	Date	Magnitude	Layer	Read
Uber	2026 AI coding budget exhausted	April 2026 (reported)	Entire planned 2026 budget consumed in 4 months; $500-$2K/engineer/month; 32% → 84% adoption Feb-Mar	6 → cost transferred to 3	Operational interior reveals layer-6 capture failure
Uber	COO publicly admits link absent	2026-05-26	"That link is not there yet" — COO Macdonald on Rapid Response podcast	6	Cleanest single quote in the cohort
Microsoft	Claude Code seat cancellations	June 30, 2026 deadline	Most direct Claude Code licences in Experiences & Devices group cancelled; engineers migrate to GitHub Copilot CLI	6 → captures via internal stack	Vertical-integration response to cap the meter
Microsoft	Voluntary buyout programme	April 23, 2026	~~8,750 eligible US employees (~~7%); "Rule of 70" formula	6	Headcount substitution; AI engineers exempt
Meta	Layoffs + cancelled requisitions	April 23, 2026	8,000 layoffs + 6,000 cancelled = ~14,000 positions	6	$115-135B capex 2026 vs. $72B 2025; substitution explicit
Oracle	Mass layoff	March 31, 2026	~30,000 (18% of workforce); $8-10B savings freed	6 → directed to 1/2 ($156B AI capex)	Explicit labor-to-capex substitution
Amazon	Corporate restructuring	Since October 2025	30,000+ corporate roles	6	Slower-burn substitution
Klarna	AI-replacement reversal	2025-2026 (quiet)	~700 customer-service roles being re-staffed	6 (failed substitution)	Cautionary tale on Rest-of-Oz necessity
Salesforce	Customer-support rebalance	TTM through 2026	9,000 → 5,000 customer-support headcount; Agentforce-driven	6 and 5 (CRM is both cutter and Rest-of-Oz seller)	Cleanest layer-6-becomes-layer-5 example
IBM	HR replacement	2025 (ongoing)	~8,000 HR roles	6	Internal vertical depth in narrow function
Duolingo	Contractor cut + AI-first	2025	10% contractor reduction; CEO since walked back AI-first claim	6 (partial reversal)	von Ahn: "doesn't see the tech replacing the tasks"
Anthropic	Pricing model shift	2026	Flat fee → usage-based for agents	3 (capturing the substitution flow)	Pricing-power monetization of agentic compute
GitHub	Copilot Pro/Pro+ signup pause	November 2025	New signups paused; agentic workloads exceeding plan price	4 (lab-adjacent)	Early breaking-point on horizontal-AI procurement
NVDA (Catanzaro)	"Compute > employees" quote	April 2026	"For my team, the cost of compute is far beyond the costs of the employees"	1 (the chip company saying it)	Cleanest top-down signal that pricing dynamic is non-trivial

Sources: T2; layer attribution is author-classified per the Schmidt framework T1.

The directional pattern is visible without further apparatus: the most documented and specific operational-interior data points (Uber budget overrun, Uber COO statement, Microsoft Claude Code retreat, Klarna reversal) all live at layer 6 and all suggest the substitution captures less than the consensus reading implies; the layer 1 voice (Catanzaro) and the layer 3 pricing actions (Anthropic) all suggest the savings are flowing up the stack faster than the layer-6 productivity is arriving.

Sources

T1 https://a16z.com/avoiding-death-on-the-yellow-brick-road/
T1
T1
T2 https://thenextweb.com/news/meta-microsoft-layoffs-23000-ai-spending
T2 https://thenextweb.com/news/microsoft-claude-code-retreat-ai-cost
T2 https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tokens-agents/
T2 https://fortune.com/2026/05/26/uber-coo-ai-spending-tokens-claude-code/
T2
T2
T2
T2
T3 https://aimagazine.com/news/why-uber-has-already-burned-through-its-ai-budget
T3 cited in TNW 2026-05-25
T3
T3 https://www.digitalapplied.com/blog/klarna-reverses-ai-layoffs-replacing-700-workers-backfired
T3
T3
T3
T3
T3
T3
T3
T3
T3
T3(/brain/2026-05-25-pltr-beat-and-fade-bifurcation) table]
AS-cal

House view changes this run

This run produces four changes:

Software / SaaS valuation environment — position extended. The current FY-trajectory-vs-implied framing is augmented with the application-layer bifurcation axis: AI-touched software is sub-cohorting into (a) Yellow-Brick-Road consumers (horizontal-AI tool consumption as primary AI strategy, exposed to per-token meter; few of the A16Z four moats) and (b) Rest-of-Oz incumbents (vertical workflow depth, cross-vendor routing, cost-tiered model selection, governance/compliance control plane). The cohort beat-and-fade pattern is predicted to differentiate by four-moat score over the next 2-4 quarters. Confidence stays medium pending Q2 print season as the test window.
AI infrastructure capacity — position extended. The existing high-confidence-on-inversion / medium-confidence-on-duration framing is augmented with the customer-procurement evolution dimension: Microsoft's Claude Code retreat is the cleanest single signal yet that enterprise AI procurement is moving from "licence per seat" toward AWS-style cap-and-meter billing. If capped procurement holds, layer-3 (lab) per-token pricing power weakens on the upside even as volume compounds; this strengthens the variant view that constraint-duration is over-extrapolated, by adding a customer-side mechanism (demand discipline) to the supply-side mechanism (three-supplier ramp). Confidence on duration variant moves from "mildly validated" to "central but not yet sharp" — this is the same direction as the 2026-05-28 AM update on MRVL flat-close evidence, and the customer-procurement evidence is independent confirmation of the same vector.
New developing theme added — "Token-cost substitution running ahead of productivity validation." First flagged this run. Confirmations needed for promotion from developing to confirmed near-term theme: (i) one more layer-6 company publicly disclosing a token-budget overrun on the Uber pattern; (ii) one more major enterprise retreating from horizontal-AI tools toward internal/vertical alternatives on the Microsoft Copilot CLI pattern; (iii) a Q2/Q3 print from a layer-6 company showing operating-margin delivery materially below what the announced cost cuts would imply (the variant's load-bearing test). Falsification: Q2/Q3 prints from Meta, Microsoft, Oracle, Uber showing clean operating-margin expansion in proportion to announced cost cuts.
Earnings cycle character — position extended. The acceleration-with-multi-year-trajectory-extension cut is augmented with the application-layer bifurcation observation: the FY-trajectory-vs-implied discriminating observable now operates more sharply on the Yellow-Brick-Road side of the bifurcation, where the cost-line growth (token meter) is structurally rising faster than the cohort multiple currently reflects. This is a refinement, not a new framing.

Linked

_house-view
02-philosophy-deep-value
2026-05-25-pltr-beat-and-fade-bifurcation
2026-05-27-hbm-replaces-cowos-binding-constraint-inversion
2026-05-26-decomposing-brent-99-implied-trinary
methodology-calibration
sources-policy
capital-cycle
earnings-power-value-greenwald
PLTR
[MP Materials](/brain/mp materials)

Business: Are token-related cost cuts a productivity dividend the cutting companies keep, or a value transfer up the AI stack — and what does A16Z's Yellow Brick Road map say about who actually holds it?