How Many Tokens?

Cheapest long-context AI model

Long context unlocks codebase-scale analysis, multi-document Q&A, and large agent state. We filtered to models with context windows of at least 100,000 tokens, then ranked them by cost on a realistic workload of 50,000 input tokens and 1,000 output tokens per call.

GPT-5 Nano (OpenAI) is the cheapest model for this workload at $2,900 per 1M calls. GPT-4.1 Nano (OpenAI) and Gemini 2.5 Flash-Lite (Google) tie for second at $5,400.

Ranked cheapest first

| # | Model | Provider | Input $/M | Output $/M | Per 1M calls |
|---|-------|----------|-----------|------------|--------------|
| 1 | GPT-5 Nano | OpenAI | $0.05 | $0.40 | $2,900 |
| 2 | GPT-4.1 Nano | OpenAI | $0.10 | $0.40 | $5,400 |
| 3 | Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | $5,400 |
| 4 | GPT-4o mini | OpenAI | $0.15 | $0.60 | $8,100 |
| 5 | Llama 3.1 8B | Meta | $0.18 | $0.18 | $9,180 |
| 6 | GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | $11,250 |
| 7 | Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 | $14,000 |
| 8 | GPT-5 Mini | OpenAI | $0.25 | $2.00 | $14,500 |
| 9 | DeepSeek V3 | DeepSeek | $0.27 | $1.10 | $14,600 |
| 10 | Gemini 2.5 Flash | Google | $0.30 | $2.50 | $17,500 |
| 11 | GPT-4.1 Mini | OpenAI | $0.40 | $1.60 | $21,600 |
| 12 | Gemini 3 Flash | Google | $0.50 | $3.00 | $28,000 |
| 13 | Llama 3.1 70B | Meta | $0.59 | $0.79 | $30,290 |
| 14 | DeepSeek V3.1 | DeepSeek | $0.60 | $1.70 | $31,700 |
| 15 | Qwen 2.5 Coder 32B | Alibaba | $0.80 | $0.80 | $40,800 |
| 16 | GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | $42,000 |
| 17 | Llama 3.3 70B | Meta | $0.88 | $0.88 | $44,880 |
| 18 | Qwen 2.5 72B | Alibaba | $0.90 | $0.90 | $45,900 |
| 19 | Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | $55,000 |
| 20 | o3-mini | OpenAI | $1.10 | $4.40 | $59,400 |
| 21 | o4-mini | OpenAI | $1.10 | $4.40 | $59,400 |
| 22 | GPT-5.1 | OpenAI | $1.25 | $10.00 | $72,500 |
| 23 | GPT-5 | OpenAI | $1.25 | $10.00 | $72,500 |
| 24 | Gemini 2.5 Pro | Google | $1.25 | $10.00 | $72,500 |
| 25 | GLM-5.1 | Zhipu | $1.40 | $4.40 | $74,400 |
| 26 | GPT-5.3 | OpenAI | $1.75 | $14.00 | $101,500 |
| 27 | GPT-5.2 | OpenAI | $1.75 | $14.00 | $101,500 |
| 28 | Qwen3 Coder 480B | Alibaba | $2.00 | $2.00 | $102,000 |
| 29 | Mistral Large | Mistral | $2.00 | $6.00 | $106,000 |
| 30 | GPT-4.1 | OpenAI | $2.00 | $8.00 | $108,000 |
| 31 | o3 | OpenAI | $2.00 | $8.00 | $108,000 |
| 32 | Gemini 3.1 Pro | Google | $2.00 | $12.00 | $112,000 |
| 33 | GPT-4o | OpenAI | $2.50 | $10.00 | $135,000 |
| 34 | GPT-5.4 | OpenAI | $2.50 | $15.00 | $140,000 |
| 35 | DeepSeek R1 | DeepSeek | $3.00 | $7.00 | $157,000 |
| 36 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | $165,000 |
| 37 | Llama 3.1 405B | Meta | $3.50 | $3.50 | $178,500 |
| 38 | Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | $275,000 |
| 39 | GPT-5.5 | OpenAI | $5.00 | $30.00 | $280,000 |
| 40 | GPT-4 Turbo | OpenAI | $10.00 | $30.00 | $530,000 |
| 41 | Claude Opus 4.8 (Fast Mode) | Anthropic | $10.00 | $50.00 | $550,000 |
| 42 | GPT-5 Pro | OpenAI | $15.00 | $120.00 | $870,000 |
| 43 | o3-pro | OpenAI | $20.00 | $80.00 | $1,080,000 |
| 44 | GPT-5.2 Pro | OpenAI | $21.00 | $168.00 | $1,218,000 |
| 45 | GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | $1,680,000 |
| 46 | GPT-5.4 Pro | OpenAI | $30.00 | $180.00 | $1,680,000 |

Workload assumption: 50,000 input tokens + 1,000 output tokens per call, scaled to 1M calls. Pricing as of 2026-05-31.
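The "Per 1M calls" column appears to follow directly from the per-token prices and the workload assumption. A minimal Python sketch of that arithmetic is below; the function names are illustrative and not part of this site, and prices are passed as strings so that `Decimal` keeps the dollar amounts exact.

```python
from decimal import Decimal

# Prices in the table are quoted in dollars per 1M tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def cost_per_call(input_price, output_price,
                  input_tokens=50_000, output_tokens=1_000):
    """Dollar cost of one call at the given $/M-token prices.

    Defaults match the workload assumption: 50,000 input + 1,000 output tokens.
    """
    input_cost = Decimal(input_price) * input_tokens / TOKENS_PER_PRICE_UNIT
    output_cost = Decimal(output_price) * output_tokens / TOKENS_PER_PRICE_UNIT
    return input_cost + output_cost


def cost_per_million_calls(input_price, output_price,
                           input_tokens=50_000, output_tokens=1_000):
    """Scale the per-call cost to 1M calls (hypothetical helper)."""
    per_call = cost_per_call(input_price, output_price,
                             input_tokens, output_tokens)
    return per_call * 1_000_000


# GPT-5 Nano row: $0.05 input, $0.40 output
gpt5_nano = cost_per_million_calls("0.05", "0.40")
```

For the GPT-5 Nano row this works out to $2,500 of input plus $400 of output, which matches the $2,900 listed. At this workload shape the input price dominates the total, since input tokens outnumber output tokens 50 to 1.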

How to use this ranking

The winner is the cheapest at the listed workload shape, which is not the same as "best for the use case." Cheaper models often have shallower reasoning, smaller context windows, or weaker instruction-following. Use this ranking as the cost baseline, then test the top two or three candidates on your real prompts via the live counter.
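Because the ranking depends on the workload shape, it can be worth recomputing it for your own input/output mix before shortlisting candidates. The sketch below is one way to do that in Python, using four rows copied from the table; the `rank_by_cost` helper and the 50,000 input + 20,000 output "output-heavy" shape are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# (model, input $/M tokens, output $/M tokens), copied from the ranking table
PRICES = [
    ("GPT-5 Nano", "0.05", "0.40"),
    ("GPT-4.1 Nano", "0.10", "0.40"),
    ("GPT-4o mini", "0.15", "0.60"),
    ("Llama 3.1 8B", "0.18", "0.18"),
]


def rank_by_cost(prices, input_tokens, output_tokens):
    """Return (model, dollars per 1M calls) pairs, cheapest first.

    With prices in $/M tokens, the cost of 1M calls is simply
    tokens-per-call times price: the two factors of one million cancel.
    """
    costs = [
        (model, Decimal(inp) * input_tokens + Decimal(out) * output_tokens)
        for model, inp, out in prices
    ]
    return sorted(costs, key=lambda pair: pair[1])


# The page's workload, then a hypothetical output-heavy variant.
baseline = rank_by_cost(PRICES, 50_000, 1_000)
output_heavy = rank_by_cost(PRICES, 50_000, 20_000)

for model, dollars in output_heavy:
    print(f"{model}: ${dollars:,.0f} per 1M calls")
```

Under these assumptions GPT-5 Nano stays cheapest, but Llama 3.1 8B moves from fourth to second among these four models because its output price is lower than the others'. A model that wins on an input-heavy workload will not necessarily win on yours.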

Pricing snapshots come from each provider's published rate cards and are tracked in the full pricing changelog. Tokenizer accuracy per model is documented in the methodology.
