AI ROI: the five lines your CFO should demand before the first euro of deployment

TokenShift Executive Note

The comfortable assumption goes like this: deploy first, measure later. ROI will show up "as adoption grows."

The numbers say otherwise. According to S&P Global Market Intelligence (2025), the share of companies scrapping the majority of their AI initiatives jumped from 17% to 42% in a single year. The average organization buried 46% of its proofs of concept before they reached production.

This isn't a technology problem. It's a measurement problem. And measurement is the CFO's home turf.

The real issue: ROI isn't a verdict, it's an entry condition

Most AI programs don't fail because the model is weak. They fail because no one has answered a far simpler question: what changes if this works?

McKinsey (The State of AI, 2025) lays it bare: 88% of organizations use AI in at least one function, yet only 39% report any impact on EBIT — and roughly 6% reach "high performer" status, with an attributable EBIT impact of at least 5%. Adoption has become universal. Value remains rare.

Gartner sounded the alarm back in July 2024: at least 30% of generative AI projects would be abandoned after the proof-of-concept by the end of 2025 — for four reasons that have nothing to do with algorithms: data quality, risk control, costs, and fuzzy business value.

Our view: AI ROI isn't something you observe after the fact. It's something you put in writing beforehand. We call it the Five-Line Proof: five indicators, each fitting on a single line of a memo, mandatory before any deployment. If one line is missing, the project isn't ready for production. It's ready for the stage.
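
As a rough illustration, the gate can be sketched as a record with five fields, where any empty field blocks production. This is a minimal sketch under stated assumptions, not a TokenShift tool: the class name, field names, units, and sample values are all hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class FiveLineProof:
    """One memo line per field; None means the line has not been written yet."""
    baseline_unit_cost_eur: Optional[float] = None            # line 1
    attributable_delta_eur_per_month: Optional[float] = None  # line 2
    full_monthly_cost_eur: Optional[float] = None             # line 3
    escalation_rate_cap: Optional[float] = None               # line 4, share of cases (0..1)
    payback_date: Optional[date] = None                       # line 5

    def missing_lines(self) -> list[str]:
        """Names of the lines still empty, in memo order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def ready_for_production(self) -> bool:
        """The project passes the gate only when all five lines are filled in."""
        return not self.missing_lines()


# Hypothetical project with only two of the five lines written.
proof = FiveLineProof(baseline_unit_cost_eur=9.17, full_monthly_cost_eur=12_000.0)
print(proof.ready_for_production())  # False
print(proof.missing_lines())
```

The point of returning the missing lines, rather than a bare yes or no, is that the project goes back for scoping with a concrete list of what to produce.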

The Five-Line Proof

1. The quantified baseline

What does the workflow cost today, without AI? Processing time, error rate, unit cost, monthly volume. Measured, dated, and signed off by the business — not guessed at in a meeting.

This is the line most often missing. Without a baseline, any claimed gain is unverifiable by design. ROI without a starting point isn't ROI — it's an opinion.
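
To make the baseline concrete, here is one way the line could be recorded and turned into a unit cost. It is a sketch only: the 11 minutes per file comes from the insurer example later in this note, while the hourly cost, monthly volume, error rate, dates, and sign-off name are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Baseline:
    minutes_per_case: float
    error_rate: float        # share of cases needing rework (0..1); recorded, not priced here
    hourly_cost_eur: float   # loaded labor cost, an assumption
    monthly_volume: int
    measured_from: date      # "dated": the measurement window
    measured_to: date
    signed_off_by: str       # "signed off by the business"

    def unit_cost_eur(self) -> float:
        """Labor cost of one case at the measured processing time."""
        return self.minutes_per_case / 60 * self.hourly_cost_eur

    def monthly_cost_eur(self) -> float:
        """Unit cost multiplied by the measured monthly volume."""
        return self.unit_cost_eur() * self.monthly_volume


baseline = Baseline(
    minutes_per_case=11.0,
    error_rate=0.04,
    hourly_cost_eur=50.0,
    monthly_volume=4000,
    measured_from=date(2025, 1, 1),
    measured_to=date(2025, 3, 31),
    signed_off_by="Head of Claims",
)
print(round(baseline.unit_cost_eur(), 2))     # 9.17
print(round(baseline.monthly_cost_eur(), 2))  # 36666.67
```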

2. The attributable delta

What gain can be credited to AI — and to AI alone? Not to the reorganization that came with it, not to the data cleanup done along the way, not to the simple effect of paying attention to the process.

The MIT study (NANDA initiative, The GenAI Divide, August 2025) made waves by claiming that only about 5% of generative AI pilots produce a measurable acceleration in revenue. A caveat is in order: the measure concerns short-term P&L impact, not technical failure, and the methodology has been debated. But the underlying signal holds: most pilots simply can't demonstrate their delta. Demanding attribution is how you give yourself a shot at being in the 5% — or at stopping in time.
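
One common way to isolate a delta is a difference-in-differences comparison: measure a comparison team that went through the same reorganization and data cleanup but without the AI, and subtract its change from the change seen by the AI team. The sketch below is one possible attribution method, not the one used in any of the studies cited here, and all figures in it are hypothetical.

```python
def attributable_delta(
    treated_before: float,
    treated_after: float,
    control_before: float,
    control_after: float,
) -> float:
    """Difference-in-differences on a per-case metric such as minutes per file.

    The control group's change stands in for everything that is not the AI
    (reorganization, data cleanup, extra attention paid to the process).
    """
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical: the AI team drops from 11 to 7 minutes per file,
# the comparison team from 11 to 10 minutes without any AI.
delta = attributable_delta(11.0, 7.0, 11.0, 10.0)
print(delta)  # -3.0 minutes per file credited to the AI, not the full -4.0
```

The method only holds if the two teams are genuinely comparable and would have followed the same trend without the AI, which is itself something to state in the memo.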

3. The full cost of production

The cost of a pilot is not the cost of a production system. The real bill includes integration with existing systems, inference consumption at scale, human oversight, monitoring, compliance documentation — the European AI Act imposes its own transparency and traceability requirements — and maintenance as models evolve.

A CFO who sees only the "licenses" line sees only a fraction of the cost. This is precisely the "cost escalation" Gartner flagged as a cause of abandonment.
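
The arithmetic is simple, which is the point: the sketch below refuses to produce a total until every cost line named above has an estimate. The line names mirror the list in this section; the amounts are invented for illustration.

```python
# Cost lines from the paragraph above; the identifiers are hypothetical.
REQUIRED_LINES = (
    "licenses",
    "integration",
    "inference",
    "human_oversight",
    "monitoring",
    "compliance_documentation",
    "maintenance",
)


def full_monthly_cost(lines: dict[str, float]) -> float:
    """Sum all required cost lines; fail loudly if any line was never estimated."""
    missing = [name for name in REQUIRED_LINES if name not in lines]
    if missing:
        raise ValueError(f"cost lines not estimated: {missing}")
    return sum(lines[name] for name in REQUIRED_LINES)


# Hypothetical monthly amounts in euros.
costs = {
    "licenses": 3000.0,
    "integration": 2500.0,
    "inference": 1800.0,
    "human_oversight": 2200.0,
    "monitoring": 600.0,
    "compliance_documentation": 900.0,
    "maintenance": 1000.0,
}
total = full_monthly_cost(costs)
license_share = costs["licenses"] / total
print(total)          # 12000.0
print(license_share)  # 0.25
```

In this made-up case, the "licenses" line is a quarter of the real bill.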

4. The escalation rate

What share of cases does the system hand back to a human? This figure tells you two things at once: the system's real quality and the residual human cost.

An escalation rate that isn't tracked is a risk that isn't managed. An escalation rate of zero is even more alarming: it means no one is watching. The right question isn't "how many cases does the AI handle?" but "who handles the ones it doesn't, and what do they cost?"
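
A minimal sketch of how this line could be tracked: it computes the rate, prices the residual human work, and raises a flag both when the rate exceeds its cap and when it sits at zero. The function name, the threshold, and every figure are assumptions made for illustration.

```python
def escalation_report(
    total_cases: int,
    escalated_cases: int,
    minutes_per_escalation: float,
    hourly_cost_eur: float,
    rate_cap: float,
) -> dict:
    """Escalation rate, residual human cost, and two alert flags for one period."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    rate = escalated_cases / total_cases
    residual_cost = escalated_cases * minutes_per_escalation / 60 * hourly_cost_eur
    return {
        "rate": rate,
        "residual_human_cost_eur": residual_cost,
        "above_cap": rate > rate_cap,
        # Zero escalations more likely means nobody is checking than a perfect system.
        "suspicious_zero": escalated_cases == 0,
    }


# Hypothetical month: 4000 files, 600 handed back to a human at 11 minutes each.
report = escalation_report(
    total_cases=4000,
    escalated_cases=600,
    minutes_per_escalation=11.0,
    hourly_cost_eur=50.0,
    rate_cap=0.20,
)
print(report["rate"])                     # 0.15
print(report["residual_human_cost_eur"])  # 5500.0
print(report["above_cap"])                # False
```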

5. The payback date

On what date does the cumulative delta overtake the full cost? A date, not a horizon. "Within 18 to 24 months" isn't an answer for governed production; it's a pitch answer.

If the date runs past the solution's likely lifespan — in a market where models change every six months — the project isn't funding a transformation. It's funding an experiment.
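
The date itself is a short calculation once lines 2 and 3 exist. The sketch below assumes a flat monthly delta and a flat monthly run cost, which is a simplification, and returns no date at all if payback does not occur within the assumed lifespan. The function names and all amounts are hypothetical.

```python
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """First day of the month that falls `months` months after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def payback_date(
    start: date,
    upfront_cost_eur: float,
    monthly_delta_eur: float,
    monthly_run_cost_eur: float,
    lifespan_months: int,
) -> Optional[date]:
    """Date on which the cumulative delta first covers the cumulative full cost.

    Returns None when that never happens within the assumed lifespan.
    """
    cumulative_delta = 0.0
    cumulative_cost = upfront_cost_eur
    for month in range(1, lifespan_months + 1):
        cumulative_delta += monthly_delta_eur
        cumulative_cost += monthly_run_cost_eur
        if cumulative_delta >= cumulative_cost:
            return add_months(start, month)
    return None


# Hypothetical: 60,000 EUR upfront, 20,000 EUR/month delta, 12,000 EUR/month run cost.
start = date(2026, 1, 1)
print(payback_date(start, 60_000.0, 20_000.0, 12_000.0, lifespan_months=24))  # 2026-09-01
print(payback_date(start, 60_000.0, 20_000.0, 12_000.0, lifespan_months=6))   # None
```

With a six-month lifespan the same project never pays back, which is the experiment-versus-transformation distinction expressed in code.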

The same Tuesday morning, two companies

Two insurers launch the same pilot: automated claims triage.

The first deploys fast. Six months later, the team "senses" a gain, the business disputes it, the CFO rules: shut it down. The project joins S&P Global's 42%.

The second demanded the Five-Line Proof before the first euro. Baseline: 11 minutes per file, measured over three months. Then a target delta, a full cost, a capped escalation rate, and a payback date, all in writing. Six months later, the boardroom debate isn't "does this work?" but "how fast do we scale?"

Same technology. Same budget. The difference isn't in the model — it's in the measurement contract signed before starting.

What this changes for your board

The Five-Line Proof isn't a reporting tool. It's a decision tool, with three immediate effects:

  • It filters before you spend. A project unable to produce its five lines isn't penalized — it's sent back for scoping. That's cheaper than discovering the gap in production.
  • It creates ownership. Every line has an owner: the baseline belongs to the business, the full cost to IT and the CFO, the escalation rate to the workflow lead. The ownership map precedes deployment.
  • It prepares for compliance. The AI Act's transparency and documentation obligations assume exactly what these indicators produce: traceability of what the system does and what is expected of it. ROI discipline and regulatory discipline are one and the same.

An ROI you can't calculate before deployment won't become calculable after.

The five questions for your next board meeting

  1. Do we have a measured baseline — not an estimate — for every workflow that's a candidate for AI?
  2. Does our attribution method separate the AI gain from the reorganization gain?
  3. Does our full cost include human oversight, compliance, and maintenance — or just licenses?
  4. Who tracks the escalation rate, and at what threshold does it trigger an alert?
  5. Does every AI project have a written payback date — and what do we do if it slips?

If three of the five answers are missing, the topic for your next board meeting isn't a new pilot. It's putting the measuring instrument in place.

And in your own company: does your CFO currently have the authority to block an AI deployment for lack of a baseline?

Follow TokenShift for the rest of this series on moving from pilot to governed production.

Sources:

  • S&P Global Market Intelligence, Voice of the Enterprise: AI & Machine Learning (2025) — spglobal.com
  • McKinsey & Company, The State of AI (2025) — mckinsey.com
  • Gartner, press release dated July 29, 2024 — gartner.com
  • MIT NANDA, The GenAI Divide (August 2025), via Fortune, August 18, 2025

#AIGovernance #EnterpriseAI #Boardroom #ProductionDeployment
