Accounting for LLM Costs: Per-Feature Amortization Models for SaaS

Building AI features into a SaaS product is exciting — and expensive if you don’t account for where that cost actually lands. Most FinOps writeups focus on high-level savings: batching, quantization, or switching models. That’s useful, but it misses the question product teams desperately need answered: How do you allocate LLM costs to features and customers so pricing, margins, and product decisions are sane?

This article walks through exactly that: the cost components you must track, the math for cost-per-action and cost-per-outcome, amortization approaches (one-time vs recurring), a practical spreadsheet + worked example (Customer X), pricing strategies you can use, and monitoring practices to catch cost leakage early. Keywords you’ll see throughout: LLM cost allocation, per-feature cost, AI pricing model, amortize inference cost.

Why per-feature accounting matters for AI products

When an engineer says “the model call costs $0.03 per 1k tokens,” that fact is necessary but not sufficient. Product teams need unit economics at the feature and customer level. Why?

Pricing clarity. If a feature's unit economics are negative (or razor-thin), you need to charge more, limit usage, or find cheaper technical alternatives.
Feature profitability. Some features are “loss leaders” that increase retention; others must pay for themselves. You can only tell the difference if you know the true cost per feature.
Fair cross-customer allocation. Enterprise customers often use many more tokens than small customers. Without per-feature allocation, your customer-level margins will be wrong.
Informed tradeoffs. Should you use a cheaper model, add caching, or gate usage? Per-feature cost models show the tradeoffs in dollars and cents, not vague percentages.
Financial forecasting & fundability. Investors and finance teams want predictable unit economics. LLM costs muddy that predictability unless carefully amortized and tracked.

Per-feature accounting is about turning raw model invoices into business signals — what to charge, what to subsidize, and where to optimize.

Cost components (tokens, model tier, infra, network)

To allocate cost correctly, break it into discrete buckets. Every action your app takes touches one or more of these.

Token-based inference costs (model cost).
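- Most vendors price inference by tokens (quoted per 1,000 or per 1,000,000 tokens) or by compute. Tokens are the core variable cost: both prompt (input) and output tokens count, and many vendors charge a higher rate for output tokens than for input tokens.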
- Model tier matters: small, medium, large models trade accuracy and capability for cost. Choose the tier that meets product requirements — and reflect the tier in cost allocation.
Infra & orchestration costs.
- This includes GPU or CPU time if you self-host, container runtimes, autoscaling overhead, and control plane services (orchestrator, autoscaler, Kubernetes nodes).
- For hosted APIs, this can be invisible but often shows up in higher cloud bills (egress, logging, monitoring).
Network & egress.
- Especially for heavy output or multi-media features (images, video), bandwidth and CDN costs matter. Even simple text outputs have egress costs at scale.
Storage & telemetry.
- Prompt logs, embeddings, vector indexes, and result storage are all recurring. If you store conversation history or large embeddings for personalization, these are material.
Engineering & model development amortization.
- Model fine-tuning, building prompt libraries, running evaluation suites, and integrating function calling — these are upfront costs that should be amortized across expected usage.
Safety & moderation costs.
- Content filters, human review for flagged results, and legal/compliance reviews add incremental per-action or per-flag costs.
Operational overhead & margins.
- Support and SRE on-call are business costs. Fold them, together with your margin target, into pricing decisions.

Each of these should be tracked and, where possible, expressed on a per-action or per-outcome basis so they can be rolled up into a single unit economics number.
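As a minimal sketch of the token bucket, the function below prices input and output tokens separately. The function name and the example rates are assumptions for illustration, not any vendor's actual price list; pass the same rate twice if your vendor quotes a single blended price.

```python
def model_cost_per_action(input_tokens: int, output_tokens: int,
                          input_price_per_1k: float,
                          output_price_per_1k: float) -> float:
    """Return the model (token) cost in USD for one action."""
    input_cost = input_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost


# Hypothetical rates: $0.01 per 1k input tokens, $0.03 per 1k output tokens.
split_rate_cost = model_cost_per_action(800, 150, 0.01, 0.03)

# Single blended rate of $0.03 per 1k tokens for both directions.
blended_cost = model_cost_per_action(800, 150, 0.03, 0.03)

print(f"split rates:  ${split_rate_cost:.4f} per action")
print(f"blended rate: ${blended_cost:.4f} per action")
```

Keeping input and output as separate arguments makes it easier to see which side of the prompt is driving cost when token counts drift.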

Formula: cost-per-action, cost-per-outcome, amortization over time

Below are the formulas you’ll use to convert invoices and one-time costs into per-feature economics.

1) Basic cost per action (CPA)

An action is the atomic unit of feature usage: one summarize call, one conversation turn, one image generation, etc.

$\text{CPA} = \text{ModelCost\_perAction} + \text{Infra\_perAction} + \text{Network\_perAction} + \text{Storage\_perAction} + \text{Ops\_perAction}$

2) Amortization of upfront costs (per action)

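When you spend $U$ dollars up front (fine-tuning, building a special pipeline), amortize it over a period of $T$ months and the expected number of actions per month, $N$:

$\text{Amort\_perAction} = \frac{U}{T \times N}$

Add this term to the basic CPA to get a fully loaded CPA.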

3) Cost per outcome (CPO)

Often you care about the cost per business outcome (e.g., a published summary, a generated lead). If $k$ actions are needed to produce one outcome:

$\text{CPO} = k \times \text{CPA}$

4) Cost per customer per month (CPC)

To forecast at customer granularity:

$\text{CPC} = \text{CPA} \times \text{avg\_actions\_per\_customer\_per\_month}$

These formulas let you run scenarios: what if a customer doubles usage? What if tokens per action increase? You can quickly see where margins erode.
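The formulas above could be sketched as a few small functions. The function names and the input numbers are placeholders chosen for illustration; amortization is treated as an optional extra term on top of the basic CPA.

```python
def amort_per_action(upfront_usd: float, months: int,
                     actions_per_month: float) -> float:
    """Spread an upfront cost evenly over a period and the expected monthly actions."""
    return upfront_usd / months / actions_per_month


def cost_per_action(model: float, infra: float, network: float,
                    storage: float, ops: float = 0.0,
                    amort: float = 0.0) -> float:
    """CPA: the sum of every per-action bucket (amort makes it fully loaded)."""
    return model + infra + network + storage + ops + amort


def cost_per_outcome(cpa: float, actions_per_outcome: float) -> float:
    """CPO = k x CPA, where k actions produce one business outcome."""
    return actions_per_outcome * cpa


def cost_per_customer(cpa: float, actions_per_month: float) -> float:
    """CPC = CPA x average actions per customer per month."""
    return cpa * actions_per_month


# Placeholder numbers, variable costs only (no amortization).
cpa = cost_per_action(model=0.02, infra=0.005, network=0.001, storage=0.0005)
baseline = cost_per_customer(cpa, 10_000)
doubled_usage = cost_per_customer(cpa, 20_000)

# Scenario: prompts grow 50%, so model cost per action rises from 0.02 to 0.03.
bigger_prompts = cost_per_customer(
    cost_per_action(model=0.03, infra=0.005, network=0.001, storage=0.0005),
    10_000,
)

print(f"baseline:       ${baseline:,.2f}/month")
print(f"doubled usage:  ${doubled_usage:,.2f}/month")
print(f"bigger prompts: ${bigger_prompts:,.2f}/month")
```

Note that doubling usage doubles only the variable part of the cost. Once an amortized term is included, more actions lower the amortization per action.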

Practical spreadsheet & worked example (Customer X)

Below is a worked example you can copy into a spreadsheet. I’ll walk through one consistent scenario (numbers are illustrative — replace with your actual invoices and usage).

Scenario (Customer X)

Feature: “Smart Summary” → one action = take a document and return a concise 5-bullet summary.
Average tokens per action: Input 800 tokens, Output 150 tokens → Total 950 tokens.
Model tier cost: $0.03 per 1,000 tokens (medium tier).
Infra overhead per action: $0.005 (autoscaled API/compute amortized).
Network per action: $0.001.
Storage/logging per action: $0.0005.
Fine-tune / engineering up-front cost: $25,000, amortized over 12 months.
Customer X usage: 5,000 active users, 8 actions per user per month → 40,000 actions / month.

Step 1 — model cost per action

$\text{ModelCost} = \frac{950}{1000} \times 0.03 = 0.0285\ \text{USD}$

Step 2 — amortization per action

Monthly amortization = $25{,}000 / 12 = 2{,}083.33$ USD per month.

$\text{Amort\_perAction} = 2{,}083.33 / 40{,}000 = 0.0520833\ \text{USD}$

Step 3 — combine all per-action costs

$\begin{aligned} \text{CPA} &= 0.0285\ (\text{model}) + 0.005\ (\text{infra}) + 0.001\ (\text{network}) \\ &\quad + 0.0005\ (\text{storage}) + 0.0520833\ (\text{amort}) \\ &= 0.0870833\ \text{USD per action} \end{aligned}$

Step 4 — monthly & per-user math

Monthly cost for Customer X:

$\text{MonthlyTotal} = 0.0870833 \times 40{,}000 = 3{,}483.33\ \text{USD}$

Cost per user per month:

$\text{CostPerUserPerMonth} = 3{,}483.33 / 5{,}000 = 0.69667\ \text{USD}$

Put that into a spreadsheet with columns:

Item	Value
Input tokens/action	800
Output tokens/action	150
Total tokens/action	950
Model price per 1k tokens	$0.03
Model cost/action	$0.0285
Infra cost/action	$0.0050
Network cost/action	$0.0010
Storage cost/action	$0.0005
Monthly amortization (engineering)	$2,083.33
Actions/month	40,000
Amortization per action	$0.05208
CPA (fully loaded)	$0.08708
Monthly total cost	$3,483.33
Cost per user per month	$0.6967
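If you prefer to check the spreadsheet in code, the Customer X numbers can be reproduced with a short script. The variable names are illustrative; the inputs are the scenario values listed above.

```python
# Inputs from the Customer X scenario.
input_tokens, output_tokens = 800, 150
price_per_1k_tokens = 0.03
infra, network, storage = 0.005, 0.001, 0.0005
upfront_cost, amort_months = 25_000, 12
users, actions_per_user = 5_000, 8

# Derived values, following Steps 1 to 4.
tokens_per_action = input_tokens + output_tokens
model = tokens_per_action / 1000 * price_per_1k_tokens
actions_per_month = users * actions_per_user
monthly_amort = upfront_cost / amort_months
amort = monthly_amort / actions_per_month
cpa = model + infra + network + storage + amort
monthly_total = cpa * actions_per_month
cost_per_user = monthly_total / users

rows = [
    ("Model cost/action", model),
    ("Amortization per action", amort),
    ("CPA (fully loaded)", cpa),
    ("Monthly total cost", monthly_total),
    ("Cost per user per month", cost_per_user),
]
for label, value in rows:
    print(f"{label:<28}${value:,.5f}")
```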

Lesson from the example: Engineering amortization was the biggest single line here (about $0.052 per action, roughly 60% of the fully loaded CPA). If your fine-tune or integration costs are high and usage is low, amortization dominates. Conversely, as action volume grows, amortization per action shrinks and token/model cost dominates. Both are levers you can manage.
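To make that lesson concrete, here is a rough sketch that finds the monthly volume at which model cost per action overtakes amortization per action, using the Customer X inputs. It assumes the other per-action costs stay flat as volume grows, which is a simplification.

```python
MONTHLY_AMORT = 25_000 / 12                  # USD per month
MODEL = 0.0285                               # model cost per action
VARIABLE = MODEL + 0.005 + 0.001 + 0.0005    # all non-amortized per-action costs


def amort_share(actions_per_month: float) -> float:
    """Fraction of the fully loaded CPA that is engineering amortization."""
    amort = MONTHLY_AMORT / actions_per_month
    return amort / (VARIABLE + amort)


# Volume at which amortization per action equals model cost per action.
crossover_actions = MONTHLY_AMORT / MODEL

print(f"crossover: {crossover_actions:,.0f} actions/month")
for volume in (40_000, 100_000, 400_000):
    print(f"{volume:>8,} actions/month -> amortization is {amort_share(volume):.0%} of CPA")
```

Below the crossover, spreading the upfront cost over more usage (or a longer period) moves CPA the most. Above it, token reduction and model tier choices matter more.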

Pricing strategies using amortized cost (credits, subscriptions)

Armed with CPA and CPC, you can pick a pricing strategy. Here are practical approaches:

1) Add-on per-feature subscription

Charge a fixed monthly fee that covers the average CPC for the expected user profile plus margin.
When to use: Predictable usage, enterprise customers, and when onboarding friction must be low.
Calculation: Set price = CPC × markup factor (e.g., 2×–3×) to cover variability and support.

Example: if the cost per user is $0.70/month (as for Customer X), price the add-on at $1.95 per user per month (about 2.8×) to leave room for margin and variability.

2) Pay-per-use / credits

Sell credits (e.g., 1 credit = 1 action) or charge per action. Credits are good for fair allocation.
When to use: Variable user usage and developer APIs.
Calculation: Price per credit = CPA × markup. Decide whether to volume-discount (tiered pricing) for large customers.

Example: CPA = $0.087 → charge $0.20 per action, with discounts at 1,000+ actions.
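A tiered credit price could be sketched as follows. The $0.20 base price and the 1,000-action threshold come from the example above; the discounted rate of $0.17 is purely a hypothetical value, as is the choice of graduated tiers (the discount applies only to actions beyond the threshold).

```python
CPA = 0.087  # fully loaded cost per action from the worked example

# (upper bound of tier in actions, price per action).
# The second-tier price is a hypothetical discount for illustration.
TIERS = [(1_000, 0.20), (float("inf"), 0.17)]


def credit_bill(actions: int) -> float:
    """Graduated pricing: each tier's rate applies only to the actions inside it."""
    bill, lower = 0.0, 0
    for upper, price in TIERS:
        in_tier = max(0, min(actions, upper) - lower)
        bill += in_tier * price
        lower = upper
    return bill


for actions in (500, 1_000, 5_000):
    bill = credit_bill(actions)
    margin = bill - actions * CPA
    print(f"{actions:>5} actions: bill ${bill:,.2f}, margin ${margin:,.2f}")
```

Whatever discount you choose, keep the lowest tier price above CPA so that high-volume customers do not turn margin negative.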

3) Hybrid (quota + overage)

Give a subscription that includes N actions, then charge per action over the quota. This balances predictability and fairness.
When to use: Consumer freemium models, where you want to encourage usage but limit abuse.

Example: $5/month includes 30 actions; extra at $0.15/action.
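The quota-plus-overage example can be checked against CPA with a small sketch. The prices are the ones in the example above; the CPA of $0.087 is carried over from the Customer X scenario, and the function names are illustrative.

```python
def hybrid_bill(actions: int, base_fee: float = 5.00,
                included: int = 30, overage_price: float = 0.15) -> float:
    """Monthly bill: flat fee covers `included` actions, the rest are charged per action."""
    return base_fee + max(0, actions - included) * overage_price


def margin(actions: int, cpa: float = 0.087) -> float:
    """Revenue minus fully loaded cost for one subscriber in one month."""
    return hybrid_bill(actions) - actions * cpa


for actions in (10, 30, 100):
    print(f"{actions:>3} actions: bill ${hybrid_bill(actions):.2f}, "
          f"margin ${margin(actions):.2f}")
```

With these numbers the margin is thinnest when a subscriber uses exactly the included quota, because the overage price sits above CPA and every extra action adds margin again.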

4) Bundling & credits as retention tool

Include feature in higher tier plans (bundle) and treat cost as part of customer LTV. Bundles can justify higher ACV for enterprise clients.
Remember: bundling masks variable costs. Use per-customer CPC to model margin erosion if adoption rises.

5) Cost-shielding with model engineering

Offer a high-volume discount but require customers to accept a cheaper model (or batched inference) for lower price tiers.
Use amortization to justify initial low margin — e.g., subsidize early adopters while usage grows.

How to pick markups?

For per-action micro-features: 2.5×–4× markup to cover support, fraud, variability.
For enterprise add-ons: margin can be lower because total contract value and stickiness offset risk.
Always model worst-case usage (95th percentile) when sizing guarantees.
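One way to size a flat price against worst-case usage is to compare the mean user with the 95th-percentile user. The sketch below uses a nearest-rank percentile and a made-up usage sample; both are assumptions for illustration, so substitute your own usage data.

```python
import math

CPA = 0.087  # fully loaded cost per action from the worked example


def percentile(values, pct):
    """Nearest-rank percentile, with pct given on a 0-100 scale."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical monthly actions per user (20 users).
usage = [2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8, 8, 9, 10, 10, 12, 14, 18, 25, 60]

mean_actions = sum(usage) / len(usage)
p95_actions = percentile(usage, 95)

mean_cost = mean_actions * CPA
p95_cost = p95_actions * CPA

# Smallest markup on the mean cost that still breaks even on a p95 user.
min_markup = p95_cost / mean_cost

print(f"mean user cost: ${mean_cost:.3f}, p95 user cost: ${p95_cost:.3f}")
print(f"markup needed to cover a p95 user: {min_markup:.2f}x")
```

If the required markup lands above the range you are comfortable charging, that is a signal to add a quota or overage rather than raise the flat price.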

Monitoring & alerts for cost leakage

Unit economics are only useful if you can detect drift. Build observability specifically for LLM costs.

Metrics to emit

Cost per action (rolling 7/30 day): if this increases, the likely cause is a rise in token count or a model change.
Tokens per action distribution: median, p75, p95. Large jumps mean prompt changes or customers feeding larger inputs.
Actions per customer per month (by cohort): track top consumers (the heavy tail).
Monthly amortization utilization: are up-front costs being covered by expected usage?
Blocked / moderated results rate and human review costs: rising moderation costs are often hidden.
Anomalies in model tier usage: a switch to a bigger model, or a routing mistake, can spike cost.
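As a sketch of the tokens-per-action metric, the snippet below computes the median, p75, and p95 with the standard library and flags a shift against a baseline. The sample values and the 25% tolerance are hypothetical.

```python
import statistics


def token_percentiles(tokens_per_action):
    """Return the median, p75, and p95 of tokens per action."""
    # n=100 yields 99 cut points; index i-1 is the i-th percentile.
    cuts = statistics.quantiles(tokens_per_action, n=100, method="inclusive")
    return {"p50": cuts[49], "p75": cuts[74], "p95": cuts[94]}


def shifted(current, baseline, key="p95", tolerance=0.25):
    """True if the chosen percentile rose by more than `tolerance` (a fraction)."""
    return current[key] > baseline[key] * (1 + tolerance)


# Hypothetical samples: a normal day vs. a day where two users pasted huge documents.
baseline = token_percentiles([900, 950, 940, 1000, 980, 960, 930, 970, 990, 1010])
today = token_percentiles([900, 950, 940, 1000, 980, 960, 930, 970, 2400, 3100])

print("baseline:", baseline)
print("today:   ", today)
print("p95 shifted:", shifted(today, baseline))
```

In this sample the median does not move while the p95 jumps sharply, which is why the upper percentiles are worth tracking separately from the average.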

Alerts to configure

Cost per action > X% increase vs baseline (24h/7d/30d) — investigate prompt drift or user behavior change.
Top 10 customers consume > Y% of monthly tokens — consider special pricing or negotiation.
Daily spend > budget forecast — pause experimental features automatically.
Model pricing changes (vendor) — vendor price updates must trigger a review of pricing and amortization assumptions.
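The first three alerts could be evaluated with a function along these lines. The thresholds X and Y are left as parameters because the right values depend on your baseline; the 20% and 50% used in the example call are hypothetical, as are the alert names. Vendor price changes are not covered here since they usually arrive as a notice rather than a metric.

```python
def check_alerts(cpa_now, cpa_baseline, cpa_increase_pct,
                 tokens_by_customer, top10_share_pct,
                 daily_spend, daily_budget):
    """Return the names of the alerts that fired."""
    fired = []

    # Cost per action rose more than X% over baseline.
    if cpa_now > cpa_baseline * (1 + cpa_increase_pct / 100):
        fired.append("cpa_increase")

    # Top 10 customers consume more than Y% of tokens.
    total = sum(tokens_by_customer.values())
    top10 = sum(sorted(tokens_by_customer.values(), reverse=True)[:10])
    if total > 0 and top10 / total * 100 > top10_share_pct:
        fired.append("top10_concentration")

    # Daily spend exceeds the budget forecast.
    if daily_spend > daily_budget:
        fired.append("daily_spend_over_budget")

    return fired


# Hypothetical data: 30 customers, one of them far heavier than the rest.
usage = {f"cust{i}": 100_000 for i in range(1, 31)}
usage["cust1"] = 4_000_000

alerts = check_alerts(
    cpa_now=0.11, cpa_baseline=0.087, cpa_increase_pct=20,   # X = 20 (hypothetical)
    tokens_by_customer=usage, top10_share_pct=50,            # Y = 50 (hypothetical)
    daily_spend=100.0, daily_budget=120.0,
)
print(alerts)
```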

Practical tooling

Integrate cost metrics into your internal dashboard (Datadog/Looker/Grafana). Emit a daily cost breakdown by feature, customer, and model tier.
Build a “cost sandbox” alert that shows how much a single buggy backfill or a bad prompt could cost in 24 hours.
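A back-of-the-envelope version of the cost sandbox might look like this. It estimates model spend only, assumes a constant request rate, and uses a hypothetical rate of 5 requests per second with the token count and price from the Customer X scenario.

```python
def runaway_cost_24h(requests_per_second: float, tokens_per_request: float,
                     price_per_1k_tokens: float) -> float:
    """Upper-bound model spend in USD if a job runs unchecked for 24 hours."""
    requests = requests_per_second * 24 * 60 * 60
    return requests * tokens_per_request / 1000 * price_per_1k_tokens


# Hypothetical buggy backfill: 5 requests/second at 950 tokens each, $0.03 per 1k tokens.
worst_case = runaway_cost_24h(5, 950, 0.03)
print(f"worst-case 24h spend: ${worst_case:,.2f}")
```

Comparing this figure with the daily budget gives a quick sense of how fast an automatic pause needs to kick in.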

Operational & governance notes

Version your prompts and model tiers. Tag every deployed prompt and model version; this lets you attribute cost and performance regressions.
Treat amortization as a first-class budget item. Put fine-tune and project costs into capital planning so product managers can see true ROI.
Automate quota enforcement. Feature flags + enforcement middleware let you control runaway tests.
Negotiate vendor contracts with spikes in mind. If you have a few heavy users, negotiate volume discounts or committed usage to stabilize unit economics.
Run cost postmortems monthly. Any variance of more than 10% from forecast should trigger an investigation and an action item.
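For the quota enforcement note above, a minimal in-memory guard could look like the sketch below. The class name, the keys, and the limit of 3 are hypothetical; a real deployment would likely need shared storage and a reset at each billing period, neither of which is shown.

```python
from collections import defaultdict


class QuotaGuard:
    """In-memory per-customer, per-feature action quota for one billing period."""

    def __init__(self, limits):
        # limits maps (customer, feature) to the maximum allowed actions.
        self.limits = limits
        self.used = defaultdict(int)

    def allow(self, customer: str, feature: str) -> bool:
        """Record one action and return True, or return False if over quota."""
        key = (customer, feature)
        limit = self.limits.get(key)
        if limit is not None and self.used[key] >= limit:
            return False  # over quota: block before calling the model
        self.used[key] += 1
        return True


guard = QuotaGuard({("customer_x", "smart_summary"): 3})
results = [guard.allow("customer_x", "smart_summary") for _ in range(5)]
print(results)
```

In this sketch a customer with no configured limit is treated as unlimited; flipping that default to deny is the safer choice for experimental features.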

Final checklist — what to implement this week

Instrument tokens per action and calculate model cost per action automatically.
Add infra, network, and storage per-action estimates into your billing dashboard.
Add amortization rows for fine-tune and dev costs and compute amort_per_action per feature.
Build alerts for tokens per action p95 shifts and top-consumer spikes.
Draft pricing sketches for add-on, credit, and hybrid models based on CPA × markup scenarios.
Run one worked example for a representative customer (like Customer X) and validate in finance.

Closing thought

LLM fees are not an abstract engineering concern — they are a recurring business cost that must be translated into product decisions. The difference between a sustainable AI feature and a costly experiment is often just one disciplined spreadsheet: model tokens, infra, amortized engineering, and clear pricing. Do that work once, embed it in dashboards, and your product team will be able to make decisions with dollars instead of gut feelings.

RomoTech