AI Strategy | June 30, 2026 | 5 min read

Why Only 7% of Companies Scale AI (And It's Not the Model)

TokenShift Executive Note

When an AI programme disappoints, the instinct is to reach for a more powerful model. That is almost always the wrong place to look. On 19 June 2026, McKinsey published a survey of 1,000 executives that puts a number on the gap: nearly 90% of organisations are experimenting with AI, yet only 7% have genuinely scaled it across the enterprise. What separates those two figures is not model quality. It is the management system.

The capital is there; the value isn't following

Let's start by laying a false debate to rest. No, AI is not a bubble with no real-world use. For fiscal year 2026, which closed in late January 2026, Nvidia reported $215.9 billion in revenue, up 65%, of which $193.7 billion came from the data centre segment alone (+68%). That demand is driven by inference (the use of models in production), not by speculation. Commentator Nate B. Jones captures the mechanics in a single formula: "inference equals revenues." Compute gets rented because it gets used.

So the problem isn't the availability of the technology. It is abundant, well-funded, and available to everyone at the same subscription price. The problem is value capture. McKinsey finds that companies advanced in their use of AI post markedly higher productivity and profitability than their peers — but that the gap comes from embedding AI into management processes, not from the sophistication of the tools. BCG reached the same verdict in September 2025: roughly 5% of "leader" companies capture most of the value, while close to 60% derive no material benefit at all.

Put another way: everyone has access to the same engine. Very few have the gearbox that puts its power to the wheels.

The operational multiplier

Let's call that gearbox the operational multiplier: the set of management practices that convert a model capability — identical for everyone — into a result on the income statement. McKinsey identifies three concrete components, and it is on these three that scaling is won or lost.

1. KPIs tied to the P&L, not to usage

The 7% that succeed don't measure the number of users or the volume of queries. They measure an economic effect: cost per case handled, margin per contract, cycle time. As long as a leadership team tracks adoption instead of value, it is running theatre.
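
To make the distinction concrete, here is a minimal Python sketch that puts an adoption figure next to a P&L-tied one. The class, the function, and every number in it are hypothetical, chosen only for illustration; none of them come from the McKinsey survey.

```python
from dataclasses import dataclass


@dataclass
class PeriodStats:
    """Hypothetical monthly figures for one AI-assisted workflow."""
    active_users: int      # adoption metric (what the dashboard shows)
    cases_handled: int     # cases closed in the period
    total_cost_eur: float  # fully loaded cost: staff time, tooling, compute


def cost_per_case(stats: PeriodStats) -> float:
    """P&L-tied KPI: total cost divided by cases handled."""
    if stats.cases_handled <= 0:
        raise ValueError("cases_handled must be positive")
    return stats.total_cost_eur / stats.cases_handled


# Illustrative numbers only (assumptions, not survey data).
before = PeriodStats(active_users=0, cases_handled=4_000, total_cost_eur=200_000.0)
after = PeriodStats(active_users=1_200, cases_handled=4_000, total_cost_eur=200_000.0)

change = cost_per_case(after) - cost_per_case(before)
print(f"Active users:  {before.active_users} -> {after.active_users}")
print(f"Cost per case: {cost_per_case(before):.2f} -> {cost_per_case(after):.2f} EUR")
print(f"Change in cost per case: {change:+.2f} EUR")
```

In this made-up case adoption climbs from 0 to 1,200 users while the cost per case does not move: the first line of the report looks like progress, the last one shows there is none.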

2. Real-time resource reallocation

McKinsey stresses one precise marker: reallocating budgets and talent based on the data, continuously. Companies that scale redeploy money from a use that doesn't pay off to one that does, in weeks. The others roll over their budgets once a year, no matter what.

3. An explicit management cadence

Clear operating principles, steering rituals, expected behaviours. This is the invisible infrastructure: the regularity with which you review, decide trade-offs, and escalate. It is what turns an isolated pilot into an enterprise routine.

Same Tuesday morning, two insurers

Two European insurers, the same Tuesday morning in June 2026. Both have access to the same frontier models. The first rolled out an assistant for its claims handlers: 1,200 active users, a flattering adoption dashboard, no named owner, no cost-per-claim metric. Six months on, the tool is "adopted" and the loss ratio hasn't moved a single point.

The second chose a decision, not a use case: cut the assessment time for home insurance claims by 30%. A single workflow in production, a business director accountable for it, a KPI tied to margin, and a monthly review that reallocates budget toward the segments where the gain holds up. Same technology. One bought an engine; the other installed the transmission.

A five-step approach for your executive committee

The operational multiplier is built, not bought. The sequence matters.

1. Choose a decision, not a use case. Spell out what changes if it works, in euros or in time saved. With no answer to that question, don't fund it.
2. Define the P&L metric before you launch. The KPI is set cold, never after the fact to justify the spend.
3. Put a single workflow into production. One, with a named owner and traceable outputs. Not ten parallel pilots.
4. Install the reallocation cadence. A monthly ritual that genuinely shifts money and people based on the data you observe.
5. Measure the multiplier, not the activity. Compare the economic result to the starting point, and cut whatever multiplies nothing.
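
The measurement and reallocation steps above can be sketched in a few lines of Python. This is one possible reading, not a method taken from McKinsey: the "multiplier" is assumed to be the relative improvement of a lower-is-better KPI against its baseline, the 5% threshold and the proportional redistribution rule are assumptions, and the initiative names and figures are hypothetical.

```python
def multiplier(baseline: float, observed: float) -> float:
    """Relative improvement against the starting point (lower-is-better KPI).

    0.30 means a 30% reduction, 0.0 means no change, negative means worse.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - observed) / baseline


def reallocate(budgets: dict[str, float], gains: dict[str, float],
               threshold: float = 0.05) -> dict[str, float]:
    """Cut initiatives whose gain is below `threshold` and hand their budget
    to the others, in proportion to current budget. The total is preserved."""
    keep = {name: b for name, b in budgets.items() if gains[name] >= threshold}
    kept_total = sum(keep.values())
    if kept_total == 0:
        return dict(budgets)  # nothing pays off yet: no basis for reallocating
    freed = sum(b for name, b in budgets.items() if name not in keep)
    return {name: (b + freed * b / kept_total if name in keep else 0.0)
            for name, b in budgets.items()}


# Hypothetical portfolio: (baseline KPI, observed KPI) and budgets in euros.
kpis = {"home_claims": (10.0, 7.0), "chat_assistant": (50.0, 50.0),
        "doc_search": (20.0, 18.0)}
budgets = {"home_claims": 300_000.0, "chat_assistant": 100_000.0,
           "doc_search": 100_000.0}

gains = {name: multiplier(base, obs) for name, (base, obs) in kpis.items()}
new_budgets = reallocate(budgets, gains)

for name in budgets:
    print(f"{name}: gain {gains[name]:.0%}, "
          f"budget {budgets[name]:,.0f} -> {new_budgets[name]:,.0f} EUR")
```

Run monthly, a rule of this kind forces the conversation the cadence exists for: the initiative that multiplies nothing loses its budget, and the ones whose gain holds up receive it.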

Mistakes to avoid

Measuring adoption. Counting users and queries creates a sense of progress with no correlation to the value captured.
Buying power to fix an organisational problem. A higher-performing model won't correct a lack of ownership or a broken management cadence.
Multiplying pilots without reallocating a single euro. As far back as 2024, Gartner predicted that at least 30% of generative AI projects would be abandoned after the proof-of-concept stage by the end of 2025. Launch ten experiments without moving resources and you produce ten abandonments.
Mistaking enthusiasm for production. A demo that impresses the board is not a governed workflow. It's theatre, and an expensive one.

What you should be able to observe

Four markers tell you, beyond debate, whether your operational multiplier exists:

The executive committee can name the P&L metric for every funded AI initiative.
The lag between a decision and the redeployment of budget is measured in weeks, not annual cycles.
At least one workflow runs in production, with an identified owner and traceable outputs.
A non-zero share of the AI budget was reallocated from one project to another in the last quarter.

If you tick all four boxes, you are on your way to joining the 7%. If you tick none, your problem is not your model provider.
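
For teams that want to track the four markers from one review to the next, they can be written down as a simple checklist. The Python sketch below is illustrative only: the field names are hypothetical, the 84-day (twelve-week) cut-off for "measured in weeks" is an assumption, and the two sample portfolios loosely mirror the two insurers described earlier, with invented figures where the story gives none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Initiative:
    name: str
    pnl_metric: Optional[str]  # e.g. "cost per claim"; None if nobody can name it
    in_production: bool
    owner: Optional[str]       # named accountable person, or None
    outputs_traceable: bool


def multiplier_markers(initiatives: list[Initiative],
                       reallocation_lag_days: Optional[int],
                       reallocated_last_quarter_eur: float,
                       max_lag_days: int = 84) -> dict[str, bool]:
    """Evaluate the four observable markers for a portfolio of funded initiatives."""
    return {
        "pnl_metric_named": bool(initiatives)
        and all(bool(i.pnl_metric) for i in initiatives),
        "lag_in_weeks": reallocation_lag_days is not None
        and reallocation_lag_days <= max_lag_days,
        "governed_workflow_live": any(
            i.in_production and bool(i.owner) and i.outputs_traceable
            for i in initiatives),
        "budget_reallocated": reallocated_last_quarter_eur > 0,
    }


# First insurer: adopted assistant, no owner, no metric, no reallocation.
first = multiplier_markers(
    [Initiative("claims assistant", None, True, None, False)],
    reallocation_lag_days=None, reallocated_last_quarter_eur=0.0)

# Second insurer: one governed workflow, monthly review (amount is invented).
second = multiplier_markers(
    [Initiative("home claims assessment", "assessment time", True,
                "business director", True)],
    reallocation_lag_days=30, reallocated_last_quarter_eur=50_000.0)

print(f"First insurer:  {sum(first.values())} of 4 markers")
print(f"Second insurer: {sum(second.values())} of 4 markers")
```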

A more powerful model won't fix a broken management system; it just makes it faster.

Scaling AI is a discipline of execution before it is a technology question. Compute capital is shared by all your competitors; what will set you apart is the rigour with which you tie every use to a result, and the speed at which you move your resources toward what pays off.

At TokenShift, this is exactly the work of our Decision Clarity offering: turning a portfolio of pilots into funded, measured, and governed decisions. RegRadar by TokenShift is one building block of it, connecting AI Act compliance with production. No amount of model spend replaces that discipline.
