The AI Tooling Budget: How to Spend Wisely in 2026

TL;DR

The AI tooling budget should be 30% or less of total AI program cost. The other 70% is people, evals, governance, and change. Budget against the anchor metric, not against revenue. Audit before you buy. Centralize procurement under one owner. Reserve 10 to 15 percent for experimentation. Most brands are already paying for 70% of what they need; they just do not know it.

The 30/70 rule: tooling is the small line.
Budget against the anchor metric, not revenue.
Audit first, then buy.
Centralize procurement under one owner.

The 30/70 rule

The most expensive mistake in AI budgeting is treating tooling as the program. Tooling is the visible line, the easy thing to compare across vendors, and the thing CFOs ask about first. It is also, in a well-run program, 30 percent or less of total cost.

The other 70 percent is people (engineers, strategists, the owner of the anchor metric), evals and observability infrastructure, governance and compliance work, and change management to get the org to actually use the AI. None of those are line items on a vendor invoice. Together they are larger and more important than the vendor invoice.

The brands that put 70% of total AI cost into tooling end up with a procurement-led program: a lot of software, a lot of dashboards, and no movement on the anchor metric. The brands that hold tooling to 30% end up with an operating program that ships.
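
A quick way to test a budget against the 30/70 rule is to compute tooling as a share of total program cost. The sketch below is only an illustration: the line-item names and dollar figures are hypothetical, and the only number taken from this article is the 30 percent threshold.

```python
def tooling_share(costs: dict[str, float], tooling_key: str = "tooling") -> float:
    """Return tooling as a fraction of total AI program cost."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total program cost must be positive")
    return costs[tooling_key] / total


# Hypothetical annual figures in dollars, for illustration only.
program = {
    "tooling": 300_000,
    "people": 450_000,
    "evals_and_observability": 100_000,
    "governance": 75_000,
    "change_management": 75_000,
}

share = tooling_share(program)
within_rule = share <= 0.30  # the article's 30 percent ceiling
print(f"tooling share: {share:.0%}, within 30/70 rule: {within_rule}")
```

If the share comes out well above 30 percent, that is the signal to look at what is missing from the other lines before adding more software.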

Tooling is the smallest line in a serious AI budget. If it is the largest, the program is not serious.

The allocation framework

Inside the tooling line, allocate across four buckets. These are not rigid percentages but the relative weights I have seen produce real outcomes.

Foundation model spend (35 to 50% of tooling)

This is the variable cost of the actual model calls. It is the line that scales with usage. Two principles here: do not over-optimize for cost-per-call early (the model running the anchor workflow should be the best one for the job), and do build the observability to track unit economics from day one. A single model call is cheap; tens of thousands of unmonitored calls are not.
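
Tracking unit economics can start very small. The sketch below assumes each model call is logged with a workflow name and token counts, then rolls cost up per workflow. The `CallRecord` shape, the workflow names, and the per-token prices are all hypothetical; substitute your own vendor's rates and logging fields.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CallRecord:
    workflow: str       # which workflow made the call
    input_tokens: int
    output_tokens: int


# Hypothetical prices in dollars per million tokens, not any vendor's real rates.
PRICE_PER_M_INPUT = 3.00
PRICE_PER_M_OUTPUT = 15.00


def call_cost(record: CallRecord) -> float:
    """Dollar cost of one call under the assumed prices."""
    return (record.input_tokens * PRICE_PER_M_INPUT
            + record.output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000


def cost_by_workflow(records: list[CallRecord]) -> dict[str, dict[str, float]]:
    """Aggregate call count, total cost, and cost per call for each workflow."""
    totals = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    for record in records:
        totals[record.workflow]["calls"] += 1
        totals[record.workflow]["cost"] += call_cost(record)
    return {
        name: {**t, "cost_per_call": t["cost"] / t["calls"]}
        for name, t in totals.items()
    }


records = [
    CallRecord("cx_deflection", 1_000, 500),
    CallRecord("cx_deflection", 2_000, 500),
    CallRecord("creative_copy", 500, 1_000),
]
summary = cost_by_workflow(records)
for name, stats in summary.items():
    print(name, stats)
```

The point of the roll-up is the per-workflow view: cost per call on the anchor workflow is the number to watch, not the total invoice.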

Vertical SaaS (25 to 40% of tooling)

Tools purpose-built for a workflow: CX automation, creative production, retention, lifecycle, analytics. These are the ROI tools because they bundle the AI with the workflow context the team already operates in. The trap is letting every team buy its own. Vertical SaaS sprawl is the most common waste in AI tooling budgets.

Infrastructure (10 to 20% of tooling)

Hosting, orchestration, vector storage, retrieval, security tooling, identity. Boring, necessary, undervalued. Underspending here is a near-term saving and a long-term tax. The brand that skips observability tooling pays for it in production incidents later.

Monitoring and evals (10 to 15% of tooling)

Eval tooling, prompt management, output tracking, conversion attribution. This is the smallest and most undervalued bucket. It is also the bucket that determines whether your foundation model spend is working. Eval discipline is the difference between a pipeline that improves quarter over quarter and one that drifts.
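
The four buckets can be turned into a simple allocation check. The sketch below uses the ranges stated above; the specific weights and the $300K tooling budget are hypothetical, one possible point inside those ranges rather than a recommendation. Because the ranges are relative weights and not rigid percentages, treat an out-of-range error as a prompt for a conversation, not a hard stop.

```python
# Ranges from the article, expressed as fractions of the tooling line.
RANGES = {
    "foundation_models": (0.35, 0.50),
    "vertical_saas": (0.25, 0.40),
    "infrastructure": (0.10, 0.20),
    "monitoring_and_evals": (0.10, 0.15),
}


def allocate(tooling_budget: float, weights: dict[str, float]) -> dict[str, float]:
    """Split a tooling budget by weight after validating the weights."""
    if set(weights) != set(RANGES):
        raise ValueError("weights must cover exactly the four buckets")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    for bucket, weight in weights.items():
        low, high = RANGES[bucket]
        if not low <= weight <= high:
            raise ValueError(
                f"{bucket} weight {weight:.0%} is outside {low:.0%} to {high:.0%}"
            )
    return {b: round(tooling_budget * w, 2) for b, w in weights.items()}


# Hypothetical weights and budget, for illustration only.
weights = {
    "foundation_models": 0.40,
    "vertical_saas": 0.30,
    "infrastructure": 0.15,
    "monitoring_and_evals": 0.15,
}
plan = allocate(300_000, weights)
for bucket, dollars in plan.items():
    print(f"{bucket}: ${dollars:,.0f}")
```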

The audit-first principle

Before approving any new AI tool, audit what is already in the building. Every consumer brand I have walked into in the last three years was already paying for AI features it was not using:

The marketing automation tool has AI features in the next tier up that were never turned on.
The CX platform has AI deflection built in that is sitting dormant.
The analytics tool has an LLM-based insight layer included.
Two different teams are paying for two different AI copy tools that do the same thing.

The audit produces three artifacts: the existing AI footprint, the duplicated lines, and the dormant capabilities ready to turn on. Often 60 to 70 percent of what a brand needs in year one is already paid for. The audit makes that visible. The procurement freeze (see below) creates the room for the audit to happen.
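
The three audit artifacts fall out of a plain inventory table. The sketch below assumes each row records the owning team, the tool, the capability it provides, its annual cost, and whether the AI feature is actually in use. Every tool name, capability label, and cost is made up for illustration.

```python
from collections import defaultdict

# Hypothetical inventory rows; names and costs are invented for this example.
inventory = [
    {"team": "marketing", "tool": "CopyToolA", "capability": "ai_copy",
     "annual_cost": 24_000, "in_use": True},
    {"team": "ecommerce", "tool": "CopyToolB", "capability": "ai_copy",
     "annual_cost": 18_000, "in_use": True},
    {"team": "cx", "tool": "CXPlatform", "capability": "ai_deflection",
     "annual_cost": 60_000, "in_use": False},
    {"team": "analytics", "tool": "InsightSuite", "capability": "llm_insights",
     "annual_cost": 30_000, "in_use": False},
]


def audit(rows: list[dict]) -> dict:
    """Produce the footprint total, duplicated capabilities, and dormant tools."""
    by_capability = defaultdict(list)
    for row in rows:
        by_capability[row["capability"]].append(row["tool"])
    return {
        "footprint_total": sum(row["annual_cost"] for row in rows),
        "duplicated": {
            cap: tools for cap, tools in by_capability.items() if len(tools) > 1
        },
        "dormant": [row["tool"] for row in rows if not row["in_use"]],
    }


report = audit(inventory)
print(report)
```

In practice the hard part is collecting honest rows from every department, which is what the freeze is for; the analysis itself is a few lines.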

Budget against the anchor, not revenue

Most AI tooling budget conversations start with the wrong number. Someone benchmarks "AI spend as a percentage of revenue" against an industry report and proposes a target. That is the wrong anchor.

The right anchor is the dollar value of the P&L line the AI program is trying to move. If the anchor metric is worth $10M annually, the program budget gets sized against that. If the anchor is worth $50M annually, the program is much larger. Revenue does not enter the math.

This is the same principle I write about in The AI Transformation Playbook for Consumer Brands. The anchor metric is the most important decision in the entire program, and it is also the most important input to the budget. Skip the anchor, and the budget becomes guesswork.
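
As a rough sketch of the sizing math: start from the anchor's dollar value, apply a program rate, then cap tooling at 30 percent of the result. The 5 to 15 percent year-one range and the 30 percent ceiling come from this article; the 10 percent default below is an assumed midpoint, and the function name is hypothetical.

```python
def size_budget(
    anchor_value: float,
    program_rate: float = 0.10,   # assumed midpoint of the 5 to 15 percent range
    tooling_cap: float = 0.30,    # the article's 30 percent ceiling
) -> dict[str, float]:
    """Size the program against the anchor's dollar value, then cap tooling."""
    if anchor_value <= 0:
        raise ValueError("anchor value must be positive")
    program_budget = anchor_value * program_rate
    return {
        "program_budget": program_budget,
        "max_tooling_budget": program_budget * tooling_cap,
    }


small = size_budget(10_000_000)   # the $10M anchor from the text
large = size_budget(50_000_000)   # the $50M anchor from the text
for label, result in (("$10M anchor", small), ("$50M anchor", large)):
    print(
        f"{label}: program ${result['program_budget']:,.0f}, "
        f"tooling up to ${result['max_tooling_budget']:,.0f}"
    )
```

Note that company revenue never appears as an input, which is the whole argument of this section.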

The procurement freeze pattern

One of the highest-leverage moves a leadership team can make is a 60-day AI tool procurement freeze. No new AI tool purchases for 60 days. The freeze does four things:

It surfaces the existing footprint. Every department lists what they have and what they pay.
It exposes duplication. Two teams are usually paying for the same capability.
It forces prioritization. The genuinely needed tool gets bought after the freeze. The nice-to-have does not.
It creates room for the audit. Without the freeze, the audit gets out-paced by new purchases.

The freeze is unpopular. Teams that have been told they cannot buy the tool they want to buy will push back. Hold the line. Sixty days is short. The audit and the consolidation produce far more value than the foregone purchase.

Vendor consolidation math

After the freeze and the audit comes the consolidation. The math is simple. Three overlapping AI tools at $40K each cost $120K. One purpose-built tool that covers the same surface at $90K saves $30K, plus the time saved on integrations, training, and vendor management.
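
The arithmetic above only holds if the replacement really does cover the same surface. The sketch below makes that condition explicit: it returns a saving only when the replacement covers every capability the current tools provide. Tool names and capability labels are hypothetical; the $40K and $90K figures are the ones from the example.

```python
def consolidation_saving(current_tools: list[dict], replacement: dict):
    """Return the annual saving, or None if the replacement leaves a gap."""
    needed = set().union(*(tool["capabilities"] for tool in current_tools))
    missing = needed - replacement["capabilities"]
    if missing:
        return None  # replacement does not cover the same surface
    return sum(tool["cost"] for tool in current_tools) - replacement["cost"]


# Hypothetical tools with overlapping capabilities.
current_tools = [
    {"name": "ToolA", "cost": 40_000, "capabilities": {"ai_copy", "subject_lines"}},
    {"name": "ToolB", "cost": 40_000, "capabilities": {"ai_copy", "image_variants"}},
    {"name": "ToolC", "cost": 40_000, "capabilities": {"subject_lines", "image_variants"}},
]
full = {
    "name": "Platform",
    "cost": 90_000,
    "capabilities": {"ai_copy", "subject_lines", "image_variants"},
}
narrow = {"name": "CopyOnly", "cost": 50_000, "capabilities": {"ai_copy"}}

saving = consolidation_saving(current_tools, full)
gap_case = consolidation_saving(current_tools, narrow)
print(saving, gap_case)
```

This captures license cost only. The time saved on integrations, training, and vendor management is real but is not modeled here.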

The consolidation has limits. Buying one monolithic platform that does ten things badly is worse than maintaining three focused tools that each do one thing well. The consolidation principle is: collapse the redundant, keep the differentiated. Apply judgment to which tools actually do something different.

Document the consolidation decisions. Two years from now, when a new VP arrives and wants to "evaluate the stack," the documentation prevents re-doing the same audit and re-buying the same redundant tools.

Every consumer brand I have audited was already paying for roughly 70 percent of what it needed. Buying the other 30 percent comes after the audit, not before.

Reserve for experimentation

The last 10 to 15 percent of the tooling budget is for experimentation. Not pilots tied to the anchor metric (those are funded directly), but the early-stage capability exploration that surfaces the next anchor. Two principles:

Reserve it explicitly. If experimentation does not have its own line, it gets defunded the first time the main program needs more.
Cap it. Experimentation that grows to half the budget is a sign the team is avoiding the operating work. Cap it at 15 percent.

The reserve is what lets the program stay current as the model landscape shifts. The cap is what keeps the program disciplined.
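
The reserve-and-cap rule amounts to clamping the experimentation share between a floor and a ceiling. The sketch below assumes a 10 percent floor and a 15 percent cap, taken from the range above; the function name and the $300K tooling budget are hypothetical.

```python
def experimentation_reserve(
    tooling_budget: float,
    requested_share: float,
    floor: float = 0.10,   # reserve it explicitly: never below 10 percent
    cap: float = 0.15,     # cap it: never above 15 percent
) -> float:
    """Clamp the requested experimentation share, then convert to dollars."""
    share = min(max(requested_share, floor), cap)
    return tooling_budget * share


# Hypothetical $300K tooling budget.
defunded = experimentation_reserve(300_000, 0.00)   # someone tried to zero it out
runaway = experimentation_reserve(300_000, 0.50)    # experimentation grew to half
print(f"floor applied: ${defunded:,.0f}, cap applied: ${runaway:,.0f}")
```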

The bottom line

AI tooling budgets in 2026 should be 30 percent or less of total program cost, allocated across foundation models, vertical SaaS, infrastructure, and evals. Budget against the anchor metric, not revenue. Audit before you buy. Freeze procurement for 60 days to make the audit possible. Consolidate where there is duplication. Reserve 10 to 15 percent for experimentation, with a cap.

The brands that do this end up spending less than they expected and getting more from the spend than the brands that do not. The brands that skip it end up with a large vendor invoice, a small set of dormant tools, and no movement on the metric that actually matters. Start with the anchor. Audit second. Buy last.

FAQ

How much should I budget for AI tooling?

Budget AI tooling against the anchor metric, not against revenue. If the AI program is targeting a P&L line worth ten million dollars, allocating 5 to 15 percent of that value to the full transformation in year one is defensible. Tooling itself is usually 30 percent or less of that program total.

What is a typical AI tooling spend at a consumer brand?

There is no typical spend, and benchmarks against revenue are misleading. The right anchor is the dollar value the program is trying to move. A brand with a five million dollar retention opportunity should not be spending the same as a brand with a fifty million dollar one.

Should AI tooling spend grow as a percentage of budget over time?

Tooling spend should grow in absolute dollars and shrink as a percentage of total AI cost. Year one is tooling-heavy because the foundation is being laid. By year three, the percentage in people, evals, and platform work should be larger than the tooling line.

What is the highest-ROI line item in an AI tooling budget?

The highest-ROI line item is usually a tie between foundation model spend on the workflow tied to the anchor metric and the eval and observability tooling that makes those calls accountable. Cheap models running unmonitored produce worse outcomes than slightly more expensive models running with real evals.

How do you avoid AI vendor sprawl?

Centralize AI procurement under one named owner. Run a 60-day procurement freeze and audit what you have before approving anything new. Most brands are already paying for 70 percent of what they need. The other 30 percent gets bought after the audit, not before.

Where do most AI budgets fail?

Most AI budgets fail by funding tooling before the anchor metric is defined. Software gets bought, dashboards get built, and no one can tell you what number is moving as a result. The fix is to define the anchor first, then allocate tooling against it.

About the author

Nicholas Harris is an AI-native operator at the intersection of generative AI and consumer growth. He is President at CreativeOS, an AI-powered SaaS platform serving 25,000+ brands, and Founder at Automatic, an AI consultancy for consumer brands. He has delivered three exits and built consumer-brand operations from SMB through nine-figure scale, including 110.6% e-commerce revenue growth at NASM, an 11x EBITDA exit at SplitTesting.com, and ~17% MoM revenue growth at Veyl Ventures on a flat media budget.

He is currently open to VP AI, AI Transformation, Head of Growth, and Fractional CTO roles at consumer-facing companies. Based in Mesa, AZ. Remote or Phoenix metro preferred.
