How Many Tokens?

How much does GPT-4o cost per million tokens?

The short answer

As of 2026-04-26, OpenAI charges:

input:  $2.50 per million tokens
output: $10.00 per million tokens

Output is 4× the price of input. This is typical for OpenAI and Anthropic models — generation is more expensive than reading.

What that means in real money

For a typical chat exchange (1,000-token prompt, 200-token reply):

input:   1,000 / 1,000,000 × $2.50  = $0.0025
output:    200 / 1,000,000 × $10.00 = $0.0020
total per call:                       $0.0045
total per 1 million calls:            $4,500

For a longer RAG-style query (10,000 input, 500 output):

input:   10,000 / 1,000,000 × $2.50  = $0.025
output:     500 / 1,000,000 × $10.00 = $0.005
total per call:                        $0.030
total per 1 million calls:             $30,000
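The arithmetic in both examples is the same formula. A minimal sketch in Python, using GPT-4o's prices as defaults (the function name `call_cost` is chosen here for illustration):

```python
def call_cost(input_tokens, output_tokens,
              input_per_m=2.50, output_per_m=10.00):
    """Per-call cost in dollars from token counts and $/M prices."""
    return (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)

print(f"chat: ${call_cost(1_000, 200):.4f}")   # the 1,000/200 exchange
print(f"RAG:  ${call_cost(10_000, 500):.2f}")  # the 10,000/500 query
```

Multiply the per-call figure by your expected call volume to get the monthly number that actually matters.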

The longer the input, the more dominant the input cost becomes — and the more your model choice matters for total spend.
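A quick check of that claim at GPT-4o's prices, holding the reply fixed at 200 tokens while the prompt grows:

```python
# Input's share of per-call cost as the prompt grows, reply fixed
# at 200 tokens, at GPT-4o prices ($2.50/M in, $10.00/M out).
for n in (1_000, 10_000, 100_000):
    inp = n / 1_000_000 * 2.50
    out = 200 / 1_000_000 * 10.00
    print(f"{n:>7,} input tokens: input is {inp / (inp + out):.0%} of cost")
```

At 1,000 input tokens the split is roughly even; by 100,000 the output price is almost irrelevant and only the input rate matters.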

Cheaper alternatives

If you're cost-sensitive, GPT-4o is mid-range:

Model             Input ($/M)  Output ($/M)  When to consider
GPT-4o mini       $0.15        $0.60         Most workloads — 17× cheaper, smaller quality gap than you'd expect
Gemini 2.5 Flash  $0.075       $0.30         Cheapest exact-tokenizer option, 1M context
Claude Haiku 4.5  $0.80        $4.00         When you want Claude's instruction-following at low cost
DeepSeek V3       $0.27        $1.10         Cheapest frontier-tier model (subject to compliance fit)
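To make the comparison concrete, here is the earlier 1,000-in / 200-out chat exchange priced at each model's rates (the prices are taken from the table above; the dict is just a sketch):

```python
# Per-call cost of the 1,000-in / 200-out exchange at each model's
# listed prices (input $/M, output $/M), from the table above.
prices = {
    "GPT-4o":           (2.50, 10.00),
    "GPT-4o mini":      (0.15, 0.60),
    "Gemini 2.5 Flash": (0.075, 0.30),
    "Claude Haiku 4.5": (0.80, 4.00),
    "DeepSeek V3":      (0.27, 1.10),
}
for model, (p_in, p_out) in prices.items():
    cost = 1_000 / 1_000_000 * p_in + 200 / 1_000_000 * p_out
    print(f"{model:<17} ${cost:.6f}")
```

GPT-4o mini comes out at roughly 1/17th of GPT-4o's per-call cost, matching the ratio in the table.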

More expensive but stronger on hard prompts

Model              Input ($/M)  Output ($/M)  When to consider
Claude Sonnet 4.6  $3.00        $15.00        Better instruction-following on nuanced tasks
Claude Opus 4.7    $15.00       $75.00        Frontier reasoning, when output quality justifies the bill
Gemini 2.5 Pro     $1.25        $10.00        Long-context (2M tokens), multimodal

Get a live cost estimate

Paste your actual prompt into the counter. It will show exact token counts across every model and the per-call cost based on your expected output ratio.
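The "expected output ratio" folds into a single formula: if replies average r output tokens per input token, per-call cost is input_tokens × (p_in + r × p_out) / 1,000,000. A sketch at GPT-4o's prices (the function name is illustrative, not the counter's API):

```python
# Per-call cost from an input token count and an expected output
# ratio r (output ≈ r × input). Defaults are GPT-4o prices in $/M.
def cost_with_ratio(input_tokens, r, input_per_m=2.50, output_per_m=10.00):
    return input_tokens / 1_000_000 * (input_per_m + r * output_per_m)

print(f"${cost_with_ratio(1_000, 0.2):.4f}")  # r=0.2 ≈ the chat example
```

With r = 0.2 this reproduces the $0.0045 chat-exchange figure from the worked example above.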

Try this on every model

Try the live counter →