
The AI Bill Your CFO Didn't Budget For

How token economics are reshaping enterprise cost structures and why most finance teams aren't ready.


There's a well-known economic principle called the Jevons Paradox. Formulated in 1865, it observes that when a resource becomes more efficient or cheaper to use, total consumption of that resource tends to increase, not decrease, because the lower price makes it viable for more applications at greater scale.


It is playing out in enterprise AI right now, with striking precision.


Token prices have fallen 280-fold in two years. Enterprise AI bills have tripled. The unit cost of AI is collapsing. The total cost of AI is compounding. For most organisations, the financial frameworks to understand, forecast, or govern that gap simply don't exist yet.


What Tokens Actually Are and Why They're a Finance Problem

A token is the fundamental unit of AI computation. It's how language models process and generate text, not word by word, but in fragments. The sentence "Hello, how are you?" breaks into six tokens. Unlike a human reader who might skim a long document for the relevant parts, an AI model reads everything it's sent: every word, every comma, every extra space.


Every employee prompt, every chatbot exchange, every automated workflow running in the background consumes tokens. And every one of those tokens has a price.
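The arithmetic is simple but unforgiving. A minimal sketch, with per-token rates that are assumptions for illustration rather than any vendor's actual price list:

```python
# Illustrative only: the per-token rates below are assumed, not a real price list.
INPUT_RATE = 3.00 / 1_000_000    # dollars per input token (assumed)
OUTPUT_RATE = 15.00 / 1_000_000  # dollars per output token (assumed)

def exchange_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single prompt/response exchange."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 2,000-token prompt with a 500-token reply:
per_exchange = exchange_cost(2_000, 500)
monthly = per_exchange * 10_000 * 22  # 10,000 exchanges a day, 22 working days
print(f"${per_exchange:.4f} per exchange, ${monthly:,.0f} per month")
# → $0.0135 per exchange, $2,970 per month
```

A cent and a half per exchange looks like rounding error; multiplied across an organisation's daily usage, it is a budget line.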


Traditional enterprise software offered predictable pricing: annual licences, multi-year agreements, seat-based models that finance teams could forecast with reasonable accuracy. Even cloud computing eventually settled into patterns procurement could model. Token-based AI pricing is breaking that model open.


Unlike a software licence, token spend isn't fixed. It scales with usage, with the complexity of prompts, with the length of context windows, and with the number of autonomous steps an AI agent takes. A surge in internal experimentation, a new product feature, or even a poorly optimised prompt can cause costs to spike in ways that are difficult to anticipate.
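Because spend is a product of several usage drivers, growth in two of them compounds. A toy model (all figures assumed) shows how one new feature that doubles both query volume and context length nearly triples the bill:

```python
def monthly_spend(users, queries_per_user_day, avg_context_tokens,
                  avg_output_tokens, rate_in, rate_out, days=30):
    """Monthly token bill as a product of usage drivers (all inputs assumed)."""
    queries = users * queries_per_user_day * days
    return (queries * avg_context_tokens * rate_in
            + queries * avg_output_tokens * rate_out)

baseline = monthly_spend(500, 20, 1_500, 400, 3e-6, 15e-6)
# Same headcount, but a new feature doubles query volume and context windows:
surge = monthly_spend(500, 40, 3_000, 400, 3e-6, 15e-6)
print(f"baseline ${baseline:,.0f} -> surge ${surge:,.0f} ({surge / baseline:.1f}x)")
# → baseline $3,150 -> surge $9,000 (2.9x)
```

No single driver changed dramatically, yet the bill almost tripled. That multiplicative structure is why seat-based forecasting instincts fail here.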


The Scale of What's Already Happening

This isn't a future problem.


In 2025, organisations spent an average of $1.2 million on AI-native applications, more than double the prior year. Nearly 8 in 10 IT leaders report being hit with unexpected charges tied to consumption-based AI pricing.


The average enterprise AI budget has grown from $1.2 million per year in 2024 to $7 million in 2026. Some Fortune 500 companies are reporting monthly AI inference bills in the tens of millions of dollars.


Uber burned through its entire planned AI budget for 2026 within the first few months of the year, after encouraging engineers to use AI coding tools aggressively. This wasn't a technology failure. It was a governance and forecasting failure.


Based on a Deloitte survey of 550 US enterprise leaders, many organisations already process more than 10 billion tokens per month. The proportion expected to exceed 100 billion tokens per month is projected to triple between 2025 and 2028.


The Pricing Shift Coming for Enterprise Contracts

Many organisations locked in flat-fee or seat-based AI contracts in 2023 and 2024. That pricing model is being phased out.


Anthropic has already moved its enterprise billing model from flat fees to fully token-based pricing. Other providers are expected to follow within six months.


Google has introduced a tiered system, AI Pro at $19.99 per month and AI Ultra at $249.99 per month, with a credits mechanism that meters usage rather than offering unlimited access. The shift signals that even a company with Google's infrastructure cannot sustain unlimited token consumption at flat-rate pricing across hundreds of millions of users.


There is also a structural risk embedded in current pricing that very few finance teams have modelled: the current API pricing that enterprises have budgeted around is subsidised by venture capital and hyperscaler cross-subsidies. In 2025, OpenAI generated $3.7 billion in revenue and lost an estimated $5 billion, spending roughly $2.35 for every dollar it earned, driven by the cost of serving inference requests at scale. When that subsidy normalises, enterprise AI costs will reprice. Business cases built on today's token rates carry that risk.


The Agentic Multiplier

The cost dynamics that applied to early AI deployments, a query in and a response out, no longer reflect how most organisations are using AI in 2026.


Agentic AI consumes tokens in ways that no traditional budget model anticipated. Enterprises have moved from experimental chatbots to production-scale agentic deployments, and AI inference now represents 85% of the enterprise AI budget.


For the first time, running agents costs more than building models. Inference now captures 55% of all AI cloud infrastructure spend.


An agentic workflow, where an AI model reasons through a problem, calls external tools, checks its own outputs, and iterates toward a result, can consume hundreds of times more tokens than a single query. Most budget models were not built with that multiplier in mind.
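The multiplier comes from a structural property of agent loops: at each step the model re-reads the entire accumulated history, so input tokens grow roughly quadratically with the number of steps. A simplified sketch (the loop shape and all figures are assumptions for illustration):

```python
def agent_tokens(steps, base_context, tool_output_tokens, reasoning_tokens):
    """Token consumption of an iterative agent loop (assumed shape).

    Each step re-reads the full accumulated context, so input tokens
    grow roughly quadratically with the number of steps.
    """
    total_in = total_out = 0
    context = base_context
    for _ in range(steps):
        total_in += context                    # model re-reads everything so far
        total_out += reasoning_tokens
        context += reasoning_tokens + tool_output_tokens  # history accumulates
    return total_in + total_out

single_query = 1_500 + 400                     # one prompt, one reply
agent = agent_tokens(steps=15, base_context=1_500,
                     tool_output_tokens=2_000, reasoning_tokens=400)
print(f"{agent / single_query:.0f}x the tokens of a single query")
# → 148x the tokens of a single query
```

Fifteen steps, modest tool outputs, and the workflow already consumes two orders of magnitude more tokens than the chatbot exchange the budget was modelled on.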


The ROI Accountability Problem

Spending is rising. The ability to account for it is not keeping pace.


Fewer than 15% of AI decision-makers reported a measurable improvement in EBITDA from AI investments in the last twelve months. Fewer than a third can link the value of their AI spending to concrete changes in the profit and loss statement.


According to Grant Thornton's CFO survey for Q1 2026, 68% of CFOs expect to further increase IT and digital transformation spending, the highest figure recorded across 21 quarters of the survey. Investment confidence is high. Measurement capability is not.


Forrester predicts that organisations will defer 25% of planned AI spending from 2026 to 2027, a market correction driven by CFOs unable to connect AI expenditure to P&L outcomes.


What the Cost Structure Actually Looks Like

Four cost categories account for the bulk of unmanaged AI spend:


Inference costs, the per-token cost of running a model, now represent the dominant line item for most organisations and scale non-linearly as agentic use increases.


Token bloat: every poorly worded prompt, every unnecessarily long context window, every retry loop due to errors incurs costs regardless of whether the result is usable. Prompt quality is now a financial variable.


Infrastructure and egress: early decisions about whether to self-host, use the cloud, or use third-party infrastructure can dictate as much as 40% of AI expenses. One construction company moved from a cloud-hosted AI tool costing under $200 per month to over $10,000 per month once the system went into production use, simply because the hosting decision hadn't been made with scale in mind.


Model selection: not all tasks require frontier models. Defaulting to the most capable model for every query is the equivalent of using a Formula 1 engine to drive to the shops. FinOps teams should identify the cost per token for both training and inference, and match model capability to task requirements to avoid unnecessary spend.
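Model tiering can be as simple as a routing table ordered cheapest-first. The tier names, rates, and task labels below are hypothetical, purely to show the shape of the policy:

```python
# Hypothetical tiers and rates; real model names and prices vary by provider.
TIERS = {
    "small":    {"rate_out": 0.6e-6, "good_for": {"classify", "extract", "summarise"}},
    "mid":      {"rate_out": 3e-6,   "good_for": {"draft", "translate"}},
    "frontier": {"rate_out": 15e-6,  "good_for": {"reason", "code", "plan"}},
}

def cheapest_tier(task: str) -> str:
    """Route a task to the cheapest tier whose capability set covers it."""
    for name, tier in TIERS.items():  # dict preserves cheapest-first insertion order
        if task in tier["good_for"]:
            return name
    return "frontier"                 # unknown tasks default to the most capable model
```

Under these assumed rates, routing a summarisation task to the small tier costs 25 times less per output token than defaulting every query to the frontier model.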


Where Finance Teams Need to Focus

A 20% reduction in token price is fully offset by a 25% increase in usage (0.80 × 1.25 = 1.00), and usage is growing far faster than that. Structural cost reductions come through governance, not through better purchase prices.


The practical actions that make a material difference are unglamorous but effective: real-time spend attribution by team and use case; prompt standards that treat context window length as a cost variable; model tiering policies that match task complexity to model capability; and AI cost scenario modelling built into financial planning cycles, specifically stress-testing against pricing normalisation.
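Stress-testing against pricing normalisation is the piece most planning cycles skip. A minimal projection, assuming usage compounds monthly and a one-off price reset applies from the first month (both parameters are assumptions the finance team would set):

```python
def stress_test(current_monthly_spend, usage_growth_pct, repricing_pct, months=12):
    """Project spend under compounding usage growth plus a one-off price reset.

    Assumed model: usage grows usage_growth_pct per month; the repricing
    applies from month 1 onward.
    """
    growth = 1 + usage_growth_pct / 100
    reprice = 1 + repricing_pct / 100
    return [round(current_monthly_spend * reprice * growth ** t, 2)
            for t in range(months)]

# $100k/month today, 8% monthly usage growth, subsidised prices rising 50%:
projection = stress_test(100_000, 8, 50)
print(f"month 1: ${projection[0]:,.0f}, month 12: ${projection[-1]:,.0f}")
```

Even moderate inputs push the twelve-month figure to more than triple today's bill, which is exactly the kind of scenario a planning cycle built on flat-rate assumptions never surfaces.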


The proportion of enterprise spend going to AI is already significant, with 50% of leaders reporting they're spending 21 to 50% of their digital transformation budgets on AI. At that scale, the absence of a financial governance framework isn't a gap. It's a risk.


The question for finance leaders in 2026 isn't whether AI is worth investing in. It's whether the organisation has the visibility to know if it already is.


Sources: Deloitte AI Infrastructure 2028 Survey; Menlo Ventures State of Generative AI 2025; FinOps Foundation 2026 State of FinOps Report; Grant Thornton CFO Survey Q1 2026; Forrester Research; Artefact; PYMNTS.
