AI costs are turning into a consulting finance problem.
Not because the tooling is wildly expensive in absolute terms, but because it breaks the muscle memory of how most firms run commercially: sell time, manage utilization, and protect margin through predictable delivery effort.
AI flips that. Some work gets dramatically faster. Some becomes more iterative. And the cost drivers stop looking like headcount and start looking like licenses, tokens, model tiers, and “one more run” cycles.
If you’re a partner, practice lead, COO, or finance lead, the question shows up quickly:
Do we treat AI as overhead, allocate it to projects, or package it into a new pricing model?
This post gives you a practical way to decide, without turning your delivery teams into token accountants or quietly eroding client trust.
The real issue isn’t cost. It’s consistency.
In most firms, “AI adoption” starts as personal usage: someone uses a chat assistant to tidy a deck storyline, summarize notes, or draft an RFP response. The cost is either invisible or buried in a SaaS bill on a company card.
Then two things happen:
- Teams begin using AI directly in deliverable workflows (research, analysis, drafting, synthesis).
- Usage shifts from fixed licenses to variable API spend (and those bills can spike fast).
When that happens, firms run into three commercial problems:
- Uneven margins. One team quietly becomes “AI-native” and ships with fewer hours; another team keeps working the old way. Both bill from the same rate card.
- Uneven risk controls. Some teams use consumer tools and paste sensitive material; others are locked down. Both sell under the same brand.
- Client trust questions. Buyers start asking (explicitly or implicitly) what changed in how work is produced, reviewed, and evidenced.
The aim is not to track every prompt. It’s to create a consistent operating model that protects margin and quality while staying defensible with clients.
Three models firms default to (and what breaks)
Model 1: Treat AI as overhead
This is the most common starting point: AI is “just another SaaS line item,” covered by standard rates.
When overhead works:
- AI usage is low and mostly internal (e.g., meeting prep, early brainstorming).
- You’re in a market where clients expect you to use modern tools, and value-based pricing already dominates.
- You have enough scale that the cost disappears into the P&L noise.
What breaks:
- High-variance API usage (especially if teams experiment without guardrails).
- Internal resentment (“why is my project margin punished by someone else’s token habit?”).
- A governance gap: nobody owns policy, and tool sprawl grows.
Model 2: Allocate AI costs to projects
The second instinct is to tag spend per project: licenses, API usage, or both.
When project allocation works:
- You’re doing heavy, deliverable-shaping AI work (e.g., large research sweeps, iterative drafting, translation/localization, code generation for analytics).
- You need to protect margins on fixed-fee projects.
- You have clients who expect transparent cost-plus structures (more common in delivery/managed services than strategy).
What breaks:
- You create a new admin tax (teams policing usage instead of shipping outcomes).
- You invite the wrong client conversation (“Are you billing me for your tools?”).
- You often measure the wrong thing: tokens instead of value created or hours avoided.
Model 3: Price AI as a capability (recommended for most firms)
This is the model that scales: don’t itemize tokens. Design an “AI-enabled delivery” commercial posture that you can explain and defend.
Two common patterns:
- Capability uplift: your standard rate assumes modern tooling and AI-assisted delivery (with governance), and you compete on outcomes, speed, and quality.
- Packaging: you create an offer tier (e.g., “rapid diligence sprint,” “AI-assisted proposal accelerator,” “evidence-backed report factory”) with a defined workflow, review gates, and a fixed commercial structure.
This model aligns incentives:
- You invest in tooling and knowledge systems because it increases throughput.
- You keep the client conversation on value, not token counts.
- You can define what “AI-enabled” actually means in terms of quality controls.
It also matches what’s happening in parts of the market: some consultancies are experimenting with subscription or managed-capacity constructs that decouple delivery from pure time-and-materials. (See Sources.)
A decision framework: what should be overhead vs. allocatable?
You don’t have to pick one model for the whole firm. You can set a policy that splits spend into three buckets.
Bucket A: Always overhead (baseline enablement)
Treat these as table stakes:
- Secure enterprise chat access for non-sensitive internal work
- Standard productivity assistants
- Approved prompt libraries and templates
- Baseline training and enablement
The goal is to avoid fragmentation and “shadow AI.” If teams can’t access an approved baseline, they will improvise.
Bucket B: Allocatable when usage is material (project acceleration)
Allocate costs when they are clearly driven by a project and clearly change the unit economics:
- API-based large runs (e.g., systematic document extraction across a large corpus)
- Specialist model usage for heavy reasoning or long-context analysis
- Intensive automation (e.g., custom retrieval pipelines, scripted evaluation runs)
Even here, keep the policy simple: set thresholds (e.g., “allocatable above €X per week or €Y per deliverable”) rather than tracking every call.
Bucket C: Capitalizable as a reusable asset (firm IP)
If you’re building reusable internal assets (templates, evaluation suites, structured knowledge bases, deliverable blueprints), the commercial logic is different:
- These assets benefit many future projects.
- The cost belongs with your capability building, not a single engagement.
Treat this as “firm R&D” and manage it as an investment portfolio with owner accountability and adoption metrics.
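As a rough illustration, the three buckets can be written as a small classification rule. This is a minimal sketch, not a prescribed policy: the `SpendItem` fields, the `classify_spend` function, and the threshold values in the example are all hypothetical, and the rule that project spend below the threshold stays in overhead is an assumption drawn from the "allocatable above a threshold" idea.

```python
from dataclasses import dataclass


@dataclass
class SpendItem:
    """One line of AI spend. All field names are illustrative assumptions."""
    description: str
    weekly_cost_eur: float
    deliverable_cost_eur: float
    project_driven: bool   # clearly caused by a single engagement
    reusable_asset: bool   # builds something many future projects will use


def classify_spend(item: SpendItem,
                   weekly_threshold_eur: float,
                   deliverable_threshold_eur: float) -> str:
    """Assign a spend item to bucket A, B, or C."""
    # Bucket C: reusable assets are firm R&D, whichever project triggered them.
    if item.reusable_asset:
        return "C: firm asset"
    # Bucket B: project-driven AND material against either threshold.
    material = (item.weekly_cost_eur > weekly_threshold_eur
                or item.deliverable_cost_eur > deliverable_threshold_eur)
    if item.project_driven and material:
        return "B: allocate to project"
    # Bucket A: everything else stays in baseline overhead.
    return "A: overhead"


# Hypothetical thresholds; each firm would set its own values.
WEEKLY, PER_DELIVERABLE = 500.0, 2000.0

items = [
    SpendItem("Enterprise chat seat", 10.0, 0.0, False, False),
    SpendItem("Document extraction run", 900.0, 900.0, True, False),
    SpendItem("Light API use on a project", 50.0, 100.0, True, False),
    SpendItem("Evaluation suite build", 700.0, 0.0, False, True),
]
for item in items:
    print(item.description, "->", classify_spend(item, WEEKLY, PER_DELIVERABLE))
```

The useful property is that nobody has to track individual calls: a spend line only needs two flags and two totals to land in a bucket.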
The metric that matters: “AI cost per hour avoided”
Finance teams often want a simple unit to reason about. Consultants already speak in hours.
Instead of “tokens per project,” use a metric that connects to margin decisions:
- AI cost per hour avoided (or per deliverable produced)
This forces the right conversation:
- Are we using AI to reduce low-value effort?
- Are we spending AI cycles where senior judgement is still the bottleneck?
- Are we accidentally increasing iteration by generating more options than anyone can review?
You don’t need perfect measurement. You need directional clarity so you can set guardrails.
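To make the metric concrete, here is a minimal sketch of one way to compute it. It assumes you can estimate a baseline (hours the work would have taken without AI) and compare it to actual hours; the function name, parameters, and example numbers are hypothetical and only meant to show the arithmetic.

```python
from typing import Optional


def ai_cost_per_hour_avoided(ai_spend_eur: float,
                             baseline_hours: float,
                             actual_hours: float) -> Optional[float]:
    """AI spend divided by hours avoided.

    Returns None when no hours were avoided, because the ratio is
    meaningless (and the spend did not buy any time back).
    """
    hours_avoided = baseline_hours - actual_hours
    if hours_avoided <= 0:
        return None
    return ai_spend_eur / hours_avoided


# Illustrative numbers only: 400 EUR of AI spend, 60 estimated hours
# without AI, 40 hours actually logged.
cost = ai_cost_per_hour_avoided(400.0, baseline_hours=60.0, actual_hours=40.0)
print(f"{cost:.2f} EUR per hour avoided")  # prints "20.00 EUR per hour avoided"
```

Comparing that figure to the blended cost of a consultant hour gives the directional signal: if an avoided hour costs far less in AI spend than the hour itself, the usage is paying for itself.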
The “hidden” cost: review time and evidence
The fastest way to kill AI ROI is to move the bottleneck from drafting to verification.
If a team produces a report draft three times faster but then spends days validating what the model said, you didn’t accelerate delivery; you changed the shape of the work.
So the cost model has to include:
- Review hours (especially partner/SME review)
- Evidence collection (what sources back each claim?)
- Quality assurance and sign-off
This is why governance and cost policy are linked. A firm that cannot produce evidence trails will end up doing one of three things:
- under-review (risk), or
- over-review (cost), or
- quietly ban AI (missed opportunity).
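A quick way to see whether the bottleneck has simply moved is to net the added verification effort against the drafting hours saved. The sketch below assumes you can estimate each component in hours; the function name, parameters, and figures are hypothetical.

```python
def net_hours_saved(drafting_hours_saved: float,
                    added_review_hours: float,
                    added_evidence_hours: float,
                    added_qa_hours: float) -> float:
    """Drafting time saved minus the verification effort AI introduced.

    A result at or below zero means the work changed shape
    without actually getting faster.
    """
    added_verification = (added_review_hours
                          + added_evidence_hours
                          + added_qa_hours)
    return drafting_hours_saved - added_verification


# Illustrative numbers only: 30 drafting hours saved, offset by
# 12 hours of extra review, 8 of evidence collection, and 4 of QA.
print(net_hours_saved(30.0, 12.0, 8.0, 4.0))  # prints 6.0
```

In this example, most of the headline saving is absorbed by verification, which is exactly the pattern a cost model that ignores review time would miss.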
If you want a governance-first view, see: AI governance for consulting firms: policies that ship.
What to say to clients (without triggering the wrong debate)
Clients rarely want to pay for your tools. They want confidence that the deliverable is correct, defensible, and produced responsibly.
A useful client-facing position sounds like this:
- “We use AI-assisted workflows to accelerate drafting and analysis, so we can spend more time on judgement and recommendations.”
- “We have controls around what data can be used, how outputs are reviewed, and how evidence is tracked.”
- “Where AI meaningfully changes delivery speed, we structure pricing around outcomes and timelines, not keystrokes.”
Avoid:
- “We bill you for tokens.”
- “We don’t use AI at all.” (Often not credible, and it can be commercially disadvantageous.)
Transparency is about the workflow and controls, not line-item inference costs.
Where Altea fits: fewer “one-off prompts,” more reusable delivery
The most sustainable way to manage AI costs is to reduce repeated work.
In practice, that means moving from:
- ad-hoc prompting against scattered files
to:
- a structured knowledge layer that can reliably reuse methods, evidence, and prior deliverables across projects.
That is the point where AI becomes a firm capability rather than an expense line.
Altea’s positioning is built for exactly this: connecting to internal sources (e.g., SharePoint and operational tools), building an indexed, explainable knowledge base, and supporting controlled, source-backed outputs for deliverable work.
A simple starting policy you can ship this quarter
If you want something you can implement quickly, start with five decisions:
- Approved tools and tiers: which tools are allowed for which data classes.
- Overhead baseline: what every consultant gets by default.
- Allocatable threshold: when usage becomes project-charged (keep it simple).
- Quality gates: what must be reviewed, evidenced, and signed off before client delivery.
- Packaging: which offers become explicitly “AI-enabled” and how they are priced.
The firms that win won’t be the ones with the strictest controls or the most experimentation.
They’ll be the ones who can explain a consistent model that protects quality, margin, and trust, while making delivery meaningfully faster.
Closing: make AI spend a competitive advantage, not a surprise bill
If your AI usage is rising, don’t wait for a finance fire drill.
Put a lightweight cost policy in place, tie it to governance and quality controls, and design a client-facing narrative that keeps the conversation on value and defensibility.
If you want to see what an “AI-enabled delivery” operating model could look like in your firm, Altea can help you build a trusted knowledge layer that accelerates proposals and deliverables with source-backed outputs.
Sources
- Business Insider (Apr 2026): Big Four leaders share how they use AI at work
- Business Insider (Jun 5, 2025): Globant changes billing model for the AI era (“AI Pods” concept)
- r/consulting (Feb 24–25, 2026): Discussion thread on treating AI tools as overhead vs allocating to projects
- KPMG (Sep 5, 2025): Customer accounting for SaaS arrangements (implementation cost considerations)