GPT-4o mini: token counter & pricing
OpenAI · exact (uses official tokenizer) · pricing as of 2026-05-31.
- Provider: OpenAI
- API model ID: gpt-4o-mini
- Context window: 128,000 tokens
- Input price: $0.15 per 1M tokens
- Output price: $0.60 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-05-31
Open the counter to count tokens for GPT-4o mini in real time.
What is GPT-4o mini?
GPT-4o mini is OpenAI's small, fast, low-cost model, designed for high-volume workloads where GPT-4o is overkill. It is roughly 17× cheaper than GPT-4o on both input and output ($0.15 vs $2.50 and $0.60 vs $10.00 per 1M tokens), while keeping the same o200k_base tokenizer and 128k context window.
How tokens are counted here
GPT-4o mini uses OpenAI's o200k_base tokenizer. We compute counts in your browser via js-tiktoken, so your prompt never leaves your device for OpenAI counts. Counts are exact.
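For readers who want to reproduce a count outside the page, here is a minimal sketch. It is not this site's actual code. The `Encoder` interface, `countTokens`, and `fitsContext` are hypothetical names introduced for illustration. The sketch assumes that prompt and reply share the 128,000-token window, and it counts raw text only, so any per-message overhead added by the chat format is not included.

```typescript
// Minimal shape of a tokenizer: anything with encode(text) -> token ids.
// Hypothetical interface for this sketch; a js-tiktoken encoder fits it.
interface Encoder {
  encode(text: string): number[];
}

// Context window for gpt-4o-mini, as listed on this page.
const CONTEXT_WINDOW_TOKENS = 128_000;

// Number of tokens the encoder produces for the raw text.
function countTokens(text: string, enc: Encoder): number {
  return enc.encode(text).length;
}

// True if the prompt plus a reserved reply budget fits in the window.
// Assumption: input and output tokens share the same window.
function fitsContext(
  text: string,
  enc: Encoder,
  reservedOutputTokens: number = 0
): boolean {
  return countTokens(text, enc) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}

// Usage with js-tiktoken (assumes the package is installed):
//   import { getEncoding } from "js-tiktoken";
//   const enc = getEncoding("o200k_base");
//   console.log(countTokens("Hello, world!", enc));
//   console.log(fitsContext("Hello, world!", enc, 1_000));
```

Passing the encoder in as a parameter keeps the helper independent of any one tokenizer package, so the same check can be reused for models with a different encoding.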
When to use GPT-4o mini
- High-volume classification and extraction. Same accuracy as GPT-4o on most labeling tasks.
- Short chat replies where the model isn't doing heavy reasoning.
- First-pass filtering in pipelines that escalate harder cases to GPT-4o or Claude Opus.
- RAG over routine documents where the retrieval did the heavy lifting.
When not to use it: anything requiring careful multi-step reasoning, structured planning, or nuanced instruction-following on subtle constraints. The quality gap to GPT-4o is real on these.
Pricing notes
At $0.15 input / $0.60 output per million tokens, GPT-4o mini is one of the cheapest frontier-tier models. Direct competitors:
- Claude Haiku 4.5 ($0.80/$4), better instruction-following, much more expensive.
- Gemini 2.5 Flash ($0.075/$0.30), half the price of GPT-4o mini, 1M-token context.
- Llama 3.1 8B (~$0.18/$0.18), comparable price, hosted via Together/Groq, lower quality on most benchmarks.
For most price-sensitive workloads, the choice is GPT-4o mini vs Gemini Flash. Test both; the winner depends entirely on your prompt distribution.
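To see how the listed prices play out for a specific workload, the sketch below ranks the four models by total cost. The prices are the ones quoted on this page. The workload shape (1M calls, 500 input and 100 output tokens each) and the names `ModelPrice`, `Workload`, `workloadCostUsd`, and `rankByCost` are assumptions made for illustration. It compares cost only and says nothing about quality.

```typescript
// Per-million-token prices as quoted on this page (USD).
interface ModelPrice {
  name: string;
  inputPerMillion: number;
  outputPerMillion: number;
}

const PRICES: ModelPrice[] = [
  { name: "GPT-4o mini", inputPerMillion: 0.15, outputPerMillion: 0.6 },
  { name: "Claude Haiku 4.5", inputPerMillion: 0.8, outputPerMillion: 4.0 },
  { name: "Gemini 2.5 Flash", inputPerMillion: 0.075, outputPerMillion: 0.3 },
  { name: "Llama 3.1 8B", inputPerMillion: 0.18, outputPerMillion: 0.18 },
];

// Hypothetical workload description: call volume and average tokens per call.
interface Workload {
  calls: number;
  inputTokensPerCall: number;
  outputTokensPerCall: number;
}

// Total USD cost of running the workload on one model.
function workloadCostUsd(model: ModelPrice, w: Workload): number {
  const inputTokens = w.calls * w.inputTokensPerCall;
  const outputTokens = w.calls * w.outputTokensPerCall;
  return (
    (inputTokens * model.inputPerMillion +
      outputTokens * model.outputPerMillion) /
    1_000_000
  );
}

// Models sorted from cheapest to most expensive for this workload.
function rankByCost(
  models: ModelPrice[],
  w: Workload
): { name: string; costUsd: number }[] {
  return models
    .map((m) => ({ name: m.name, costUsd: workloadCostUsd(m, w) }))
    .sort((a, b) => a.costUsd - b.costUsd);
}

const example: Workload = {
  calls: 1_000_000,
  inputTokensPerCall: 500,
  outputTokensPerCall: 100,
};

for (const row of rankByCost(PRICES, example)) {
  console.log(`${row.name}: $${row.costUsd.toFixed(2)}`);
}
```

Because output tokens cost several times more than input tokens on most of these models, changing the input/output ratio can reorder the ranking, so it is worth plugging in measured averages rather than the placeholder values.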
Common questions
Is GPT-4o mini just GPT-3.5 with new branding?
No. It's a distinct model with substantially better benchmark scores than the old gpt-3.5-turbo line, and it is priced at roughly 30% of the cost. OpenAI deprecated gpt-3.5-turbo in favor of GPT-4o mini.
Does mini support function calling and structured outputs?
Yes, same OpenAI features as GPT-4o (function calling, JSON mode, structured outputs with schema). The capability surface is the same; the only difference is reasoning quality on hard prompts.
What's a typical cost for a chat exchange?
A 500-token prompt with a 100-token reply: $0.000075 input + $0.00006 output = $0.000135 per call, or $135 per million calls. Use the calculator above with your real prompt for an accurate number.
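The arithmetic above can be written as a small helper. This is a sketch using the prices listed on this page. The constant and function names are hypothetical, and the token counts are passed in directly rather than computed from text.

```typescript
// GPT-4o mini prices from this page, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.15;
const OUTPUT_USD_PER_MILLION = 0.6;

// Cost in USD of one call, given its input and output token counts.
function callCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000
  );
}

const perCall = callCostUsd(500, 100);
console.log(perCall.toFixed(6)); // 0.000135
console.log((perCall * 1_000_000).toFixed(2)); // 135.00 (per million calls)
```

Rounding only at display time, as `toFixed` does here, avoids compounding floating-point error when the per-call figure is multiplied up to large volumes.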
Compare GPT-4o mini to other models
- GPT-5.5 (OpenAI, $5.00/$30.00)
- GPT-5.5 Pro (OpenAI, $30.00/$180.00)
- GPT-5.4 (OpenAI, $2.50/$15.00)
- GPT-5.4 Mini (OpenAI, $0.75/$4.50)
- GPT-5.4 Nano (OpenAI, $0.20/$1.25)
- GPT-5.4 Pro (OpenAI, $30.00/$180.00)
- GPT-5.3 (OpenAI, $1.75/$14.00)
- GPT-5.2 (OpenAI, $1.75/$14.00)
- GPT-5.2 Pro (OpenAI, $21.00/$168.00)
- GPT-5.1 (OpenAI, $1.25/$10.00)
- GPT-5 (OpenAI, $1.25/$10.00)
- GPT-5 Mini (OpenAI, $0.25/$2.00)
- GPT-5 Nano (OpenAI, $0.05/$0.40)
- GPT-5 Pro (OpenAI, $15.00/$120.00)
- GPT-4.1 (OpenAI, $2.00/$8.00)
- GPT-4.1 Mini (OpenAI, $0.40/$1.60)
- GPT-4.1 Nano (OpenAI, $0.10/$0.40)
- o3 (OpenAI, $2.00/$8.00)
- o3-mini (OpenAI, $1.10/$4.40)
- o3-pro (OpenAI, $20.00/$80.00)
- o4-mini (OpenAI, $1.10/$4.40)
- GPT-4o (OpenAI, $2.50/$10.00)
- GPT-4 Turbo (OpenAI, $10.00/$30.00)
- Llama 3.1 8B (Meta, $0.18/$0.18)
- Gemini 2.5 Flash-Lite (Google, $0.10/$0.40)
- Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)