Cheapest long-context AI model
Long context unlocks codebase-scale analysis, multi-document Q&A, and large agent state. We filtered to models with at least 100,000-token context windows, then ranked by a realistic 50,000 input + 1,000 output workload.
Ranked cheapest first
| # | Model | Input $/M | Output $/M | Per 1M calls |
|---|---|---|---|---|
| #1 | GPT-5 Nano OpenAI |
$0.05 | $0.40 | $2,900 |
| #2 | GPT-4.1 Nano OpenAI |
$0.10 | $0.40 | $5,400 |
| #3 | Gemini 2.5 Flash-Lite |
$0.10 | $0.40 | $5,400 |
| #4 | GPT-4o mini OpenAI |
$0.15 | $0.60 | $8,100 |
| #5 | Llama 3.1 8B Meta |
$0.18 | $0.18 | $9,180 |
| #6 | GPT-5.4 Nano OpenAI |
$0.20 | $1.25 | $11,250 |
| #7 | Gemini 3.1 Flash-Lite |
$0.25 | $1.50 | $14,000 |
| #8 | GPT-5 Mini OpenAI |
$0.25 | $2.00 | $14,500 |
| #9 | DeepSeek V3 DeepSeek |
$0.27 | $1.10 | $14,600 |
| #10 | Gemini 2.5 Flash |
$0.30 | $2.50 | $17,500 |
| #11 | GPT-4.1 Mini OpenAI |
$0.40 | $1.60 | $21,600 |
| #12 | Gemini 3 Flash |
$0.50 | $3.00 | $28,000 |
| #13 | Llama 3.1 70B Meta |
$0.59 | $0.79 | $30,290 |
| #14 | DeepSeek V3.1 DeepSeek |
$0.60 | $1.70 | $31,700 |
| #15 | Qwen 2.5 Coder 32B Alibaba |
$0.80 | $0.80 | $40,800 |
| #16 | GPT-5.4 Mini OpenAI |
$0.75 | $4.50 | $42,000 |
| #17 | Llama 3.3 70B Meta |
$0.88 | $0.88 | $44,880 |
| #18 | Qwen 2.5 72B Alibaba |
$0.90 | $0.90 | $45,900 |
| #19 | Claude Haiku 4.5 Anthropic |
$1.00 | $5.00 | $55,000 |
| #20 | o3-mini OpenAI |
$1.10 | $4.40 | $59,400 |
| #21 | o4-mini OpenAI |
$1.10 | $4.40 | $59,400 |
| #22 | GPT-5.1 OpenAI |
$1.25 | $10.00 | $72,500 |
| #23 | GPT-5 OpenAI |
$1.25 | $10.00 | $72,500 |
| #24 | Gemini 2.5 Pro |
$1.25 | $10.00 | $72,500 |
| #25 | GLM-5.1 zhipu |
$1.40 | $4.40 | $74,400 |
| #26 | GPT-5.3 OpenAI |
$1.75 | $14.00 | $101,500 |
| #27 | GPT-5.2 OpenAI |
$1.75 | $14.00 | $101,500 |
| #28 | Qwen3 Coder 480B Alibaba |
$2.00 | $2.00 | $102,000 |
| #29 | Mistral Large Mistral |
$2.00 | $6.00 | $106,000 |
| #30 | GPT-4.1 OpenAI |
$2.00 | $8.00 | $108,000 |
| #31 | o3 OpenAI |
$2.00 | $8.00 | $108,000 |
| #32 | Gemini 3.1 Pro |
$2.00 | $12.00 | $112,000 |
| #33 | GPT-4o OpenAI |
$2.50 | $10.00 | $135,000 |
| #34 | GPT-5.4 OpenAI |
$2.50 | $15.00 | $140,000 |
| #35 | DeepSeek R1 DeepSeek |
$3.00 | $7.00 | $157,000 |
| #36 | Claude Sonnet 4.6 Anthropic |
$3.00 | $15.00 | $165,000 |
| #37 | Llama 3.1 405B Meta |
$3.50 | $3.50 | $178,500 |
| #38 | Claude Opus 4.8 Anthropic |
$5.00 | $25.00 | $275,000 |
| #39 | GPT-5.5 OpenAI |
$5.00 | $30.00 | $280,000 |
| #40 | GPT-4 Turbo OpenAI |
$10.00 | $30.00 | $530,000 |
| #41 | Claude Opus 4.8 (Fast Mode) Anthropic |
$10.00 | $50.00 | $550,000 |
| #42 | GPT-5 Pro OpenAI |
$15.00 | $120.00 | $870,000 |
| #43 | o3-pro OpenAI |
$20.00 | $80.00 | $1,080,000 |
| #44 | GPT-5.2 Pro OpenAI |
$21.00 | $168.00 | $1,218,000 |
| #45 | GPT-5.5 Pro OpenAI |
$30.00 | $180.00 | $1,680,000 |
| #46 | GPT-5.4 Pro OpenAI |
$30.00 | $180.00 | $1,680,000 |
Workload assumption: 50,000 input tokens + 1,000 output tokens per call, scaled to 1M calls. Pricing as of 2026-05-31.
How to use this ranking
The winner is mathematically cheapest at the listed workload shape — that's not the same as "best for the use case." Cheaper models often have lower reasoning depth, smaller context windows, or worse instruction-following. Use this as the cost baseline, then test the top 2-3 candidates on your real prompts via the live counter.
Pricing snapshots come from each provider's published rate cards and are tracked in the full pricing changelog. Tokenizer accuracy per model is documented in the methodology.