Token cost calculator

How to estimate AI API token costs before a request runs

A token cost calculator lets developers estimate spend before sending a request. A useful estimate treats input tokens, output tokens, and cached reads as separate quantities, each with its own per-model unit price.

The basic formula

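A practical estimate is the sum of three terms: input tokens multiplied by the input unit price, output tokens multiplied by the output unit price, and cached read tokens multiplied by the cached read unit price. Unit prices are often quoted per million tokens, so convert them to a per-token rate before multiplying, and count cached reads separately from uncached input so the same tokens are not priced twice.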
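The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `ModelPrice` type, the `estimate_cost` function, and the example prices are all hypothetical, and the sketch assumes prices are quoted in USD per one million tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical per-model prices, in USD per one million tokens."""
    input_per_mtok: float
    output_per_mtok: float
    cached_read_per_mtok: float


def estimate_cost(price: ModelPrice,
                  input_tokens: int,
                  output_tokens: int,
                  cached_read_tokens: int = 0) -> float:
    """Return the estimated request cost in USD.

    input_tokens is assumed to EXCLUDE cached reads, so no token
    is priced twice.
    """
    total_per_mtok = (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
        + cached_read_tokens * price.cached_read_per_mtok
    )
    # Convert from "per million tokens" to a per-request amount.
    return total_per_mtok / 1_000_000


# Made-up prices for illustration only; substitute the real rates
# for the model being called.
EXAMPLE_PRICE = ModelPrice(input_per_mtok=3.00,
                           output_per_mtok=15.00,
                           cached_read_per_mtok=0.30)

estimate = estimate_cost(EXAMPLE_PRICE,
                         input_tokens=2_000,
                         output_tokens=500,
                         cached_read_tokens=10_000)
print(f"Estimated cost: ${estimate:.4f}")
```

Keeping the three token counts as separate arguments mirrors the three terms of the formula, so a change to any one unit price touches only one term.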

Why output caps matter

Output tokens are usually the least predictable part of a request: input length is known before sending, but response length is not. Setting a maximum output token limit gives teams an upper bound on cost before routing, which keeps an unexpectedly long response from using more budget than planned.

  • Estimate worst-case output before sending
  • Use smaller caps for testing and trial keys
  • Compare estimates with request-level logs after completion
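As a rough sketch of the first two points, the worst case can be computed by treating the output cap as the output token count. The function name `worst_case_cost`, the token counts, and the prices below are assumptions for illustration, with prices again taken to be USD per one million tokens.

```python
def worst_case_cost(input_tokens: int,
                    max_output_tokens: int,
                    input_price_per_mtok: float,
                    output_price_per_mtok: float) -> float:
    """Upper bound on request cost in USD, assuming the response
    uses every token allowed by the output cap."""
    return (
        input_tokens * input_price_per_mtok
        + max_output_tokens * output_price_per_mtok
    ) / 1_000_000


# Same prompt, two different caps (hypothetical prices).
small_cap = worst_case_cost(1_200, 256, 3.00, 15.00)
large_cap = worst_case_cost(1_200, 4_096, 3.00, 15.00)

print(f"cap=256:  at most ${small_cap:.5f}")
print(f"cap=4096: at most ${large_cap:.5f}")
```

With these example numbers the prompt is identical, so the whole difference between the two bounds comes from the output cap, which is why a small cap is a cheap safeguard for testing and trial keys.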

Prepaid credit planning

Prepaid gateways make estimates easier to reason about: requests stop when the account balance or a key's budget is exhausted, so spend cannot grow into an open-ended invoice.
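One way such a check could work is sketched below. It is an assumption about how a gateway might enforce limits, not a description of any particular product: `PrepaidKey`, `can_send`, and `settle` are hypothetical names, and amounts are held as integer micro-dollars so repeated deductions do not accumulate floating-point drift.

```python
from dataclasses import dataclass


@dataclass
class PrepaidKey:
    """Hypothetical prepaid state, in micro-USD (1 USD = 1_000_000)."""
    balance_micro_usd: int      # shared account balance
    key_budget_micro_usd: int   # remaining budget for this API key


def can_send(key: PrepaidKey, worst_case_micro_usd: int) -> bool:
    """Allow the request only if its worst-case cost fits within
    both the account balance and the key's remaining budget."""
    limit = min(key.balance_micro_usd, key.key_budget_micro_usd)
    return worst_case_micro_usd <= limit


def settle(key: PrepaidKey, actual_micro_usd: int) -> None:
    """Deduct the actual cost once the request has completed."""
    key.balance_micro_usd -= actual_micro_usd
    key.key_budget_micro_usd -= actual_micro_usd


key = PrepaidKey(balance_micro_usd=50_000, key_budget_micro_usd=20_000)

print(can_send(key, 7_440))    # fits within the key budget
print(can_send(key, 65_040))   # exceeds the key budget, so it is refused

settle(key, 6_300)
print(key)
```

Checking the worst case before sending, then settling with the actual cost afterwards, means the balance can never go negative as long as the output cap is respected.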

FAQ

Are token estimates exact?

No. Estimates are useful before routing, but final billing should use reported usage from the completed request when available.
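A small sketch of that rule: prefer the usage reported by the completed request, and fall back to the estimate only when no usage is available. The field names in `reported_usage` and the keys in `prices` are hypothetical; real providers and gateways name these fields differently.

```python
from typing import Optional


def billable_cost(estimated_cost: float,
                  reported_usage: Optional[dict],
                  prices: dict) -> float:
    """Return the cost in USD to record for a request.

    prices holds USD per one million tokens under the hypothetical
    keys "input", "output", and "cached_read".
    """
    if reported_usage is None:
        # No usage came back (for example, the request failed early),
        # so the pre-request estimate is the best figure available.
        return estimated_cost
    return (
        reported_usage["input_tokens"] * prices["input"]
        + reported_usage["output_tokens"] * prices["output"]
        + reported_usage.get("cached_read_tokens", 0) * prices["cached_read"]
    ) / 1_000_000


prices = {"input": 3.00, "output": 15.00, "cached_read": 0.30}
usage = {"input_tokens": 1_200, "output_tokens": 180}

final = billable_cost(estimated_cost=0.00744,
                      reported_usage=usage,
                      prices=prices)
print(f"Estimated $0.00744, billed ${final:.5f}")
```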

Why track cached tokens separately?

Cached reads are usually priced differently from uncached input, typically at a lower rate, so tracking them separately keeps estimates accurate and makes billing more transparent.
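To illustrate the difference, the sketch below prices the same prompt twice: once as if every token were uncached input, and once with cached reads split out. The prices and token counts are invented for the example.

```python
INPUT_PRICE_PER_MTOK = 3.00        # hypothetical, USD per 1M tokens
CACHED_READ_PRICE_PER_MTOK = 0.30  # hypothetical, USD per 1M tokens

prompt_tokens = 10_000
cached_read_tokens = 8_000         # portion of the prompt served from cache
uncached_tokens = prompt_tokens - cached_read_tokens

# Treating the whole prompt as uncached input overstates the cost.
lumped = prompt_tokens * INPUT_PRICE_PER_MTOK / 1_000_000

# Pricing each portion at its own rate gives the intended estimate.
split = (
    uncached_tokens * INPUT_PRICE_PER_MTOK
    + cached_read_tokens * CACHED_READ_PRICE_PER_MTOK
) / 1_000_000

print(f"lumped: ${lumped:.4f}")
print(f"split:  ${split:.4f}")
```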