How Many Tokens?

What is the cheapest AI model?

The short answer

As of April 27, 2026, GPT-5 Nano at $0.05 per million input tokens / $0.40 per million output tokens is the cheapest exact-tokenizer model from a major provider.

For multimodal workloads or longer context, Gemini 2.5 Flash-Lite ($0.10 / $0.40) is the closest GA alternative: same output price and double the input rate, but with a 1M-token context window versus GPT-5 Nano's 400K.

Cheapest models, ranked

Sorted by cost per 1M calls on a typical workload of 1,000 input tokens and 200 output tokens per call:

| Rank | Model | Input ($/M) | Output ($/M) | Cost per 1M calls | Notes |
|---|---|---|---|---|---|
| 1 | GPT-5 Nano | $0.05 | $0.40 | $130 | Cheapest input rate of any exact-tokenizer model. 400K context. |
| 2 | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $180 | 1M context, multimodal, GA |
| 3 | GPT-4.1 Nano | $0.10 | $0.40 | $180 | 1M context, exact o200k_base tokenizer |
| 4 | Llama 3.1 8B (Together) | $0.18 | $0.18 | $216 | Open weights; ≈±3% tokenizer estimate |
| 5 | GPT-5.4 Nano | $0.20 | $1.25 | $450 | Reasoning-tier nano; cached input $0.02/M |
| 6 | DeepSeek V3 | $0.27 | $1.10 | $490 | Frontier capability at this price |
| 7 | Gemini 3.1 Flash-Lite Preview | $0.25 | $1.50 | $550 | Preview tier; cheapest Gemini 3 |
| 8 | GPT-5 Mini | $0.25 | $2.00 | $650 | Step-up from Nano; better reasoning |
| 9 | Gemini 2.5 Flash | $0.30 | $2.50 | $800 | Was the cheap-flash default; Flash-Lite is now cheaper |
| 10 | Gemini 3 Flash Preview | $0.50 | $3.00 | $1,100 | Pro-tier capability at Flash prices |
| 11 | Claude Haiku 4.5 | $1.00 | $5.00 | $2,000 | Best Claude instruction-following at low cost |
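
The per-1M-call figures follow directly from the two per-million prices. As a rough sketch (the function and variable names here are illustrative, not part of the counter itself), the arithmetic can be reproduced like this, assuming the 1,000-input / 200-output workload and the prices listed in the table:

```python
# Prices in USD per million tokens (input, output), copied from the table.
PRICES = {
    "GPT-5 Nano": (0.05, 0.40),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "GPT-4.1 Nano": (0.10, 0.40),
    "Gemini 3.1 Flash-Lite Preview": (0.25, 1.50),
    "GPT-5.4 Nano": (0.20, 1.25),
    "Llama 3.1 8B (Together)": (0.18, 0.18),
    "DeepSeek V3": (0.27, 1.10),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "GPT-5 Mini": (0.25, 2.00),
    "Gemini 3 Flash Preview": (0.50, 3.00),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def cost_per_million_calls(input_price, output_price,
                           input_tokens=1000, output_tokens=200):
    """Return the USD cost of 1M calls at the given per-million-token prices.

    1M calls x N tokens per call is N million tokens, so the tokens-per-call
    count multiplies the per-million price directly.
    """
    return input_tokens * input_price + output_tokens * output_price


# Rank models from cheapest to most expensive for this workload.
# sorted() is stable, so models with equal cost keep their listed order.
ranked = sorted(PRICES.items(),
                key=lambda item: cost_per_million_calls(*item[1]))

for rank, (model, (inp, out)) in enumerate(ranked, start=1):
    print(f"{rank:>2}. {model:<30} ${cost_per_million_calls(inp, out):,.0f}")
```

Changing `input_tokens` and `output_tokens` to match your own workload can reorder the list, since output-heavy workloads penalize models with a high output rate.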

"Cheap" depends on what you need

The cheapest model isn't always the right answer. Consider:

Cheaper than this list

If you really want sub-$0.05 per million tokens:

Get a real cost comparison

Paste your prompt into the counter. It shows the actual token count and per-call cost across every model, so you can choose by total cost on your workload instead of by the per-million headline price.
