Gemini 2.5 Flash-Lite: token counter & pricing
Google · exact (uses official tokenizer) · pricing as of 2026-04-27.
- Provider: Google
- API model ID: gemini-2.5-flash-lite
- Context window: 1,000,000 tokens
- Input price: $0.10 per 1M tokens
- Output price: $0.40 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-04-27
Open the counter to count tokens for Gemini 2.5 Flash-Lite in real time.
What is Gemini 2.5 Flash-Lite?
Gemini 2.5 Flash-Lite is Google's smallest GA Gemini model — designed for the highest-volume, lowest-cost workloads where you still want an exact tokenizer and a 1M-token context window.
At $0.10 input / $0.40 output per 1M tokens, it is among the cheapest GA models with a vendor-official tokenizer: it ties GPT-4.1 Nano on price, and only GPT-5 Nano ($0.05 input) undercuts it on raw input price.
How tokens are counted here
Counts for Gemini 2.5 Flash-Lite come from Google's official models.countTokens endpoint, called through our serverless proxy, so they are exact rather than estimated.
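For readers who want to call the endpoint themselves, here is a minimal sketch of a direct countTokens request using only the Python standard library. It does not reproduce this site's proxy. The `v1beta` REST path, the `x-goog-api-key` header, and the `totalTokens` response field are written from the public Gemini API as best understood here and should be checked against Google's current documentation; the helper names `build_count_request` and `count_tokens` are hypothetical.

```python
import json
import urllib.request

# Assumed base URL and path shape for the public Gemini REST API.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL_ID = "gemini-2.5-flash-lite"


def build_count_request(text: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a countTokens request for a plain-text prompt."""
    body = {"contents": [{"parts": [{"text": text}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:countTokens",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def count_tokens(text: str, api_key: str) -> int:
    """Send the request and return the totalTokens field of the JSON response."""
    request = build_count_request(text, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return int(json.load(response)["totalTokens"])
```

Calling `count_tokens("Hello, world", api_key)` with a valid key should return an integer token count. Building the request separately from sending it keeps the payload easy to inspect without network access.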
Pricing notes
$0.10 input / $0.40 output per 1M tokens. Cached input is $0.01 per 1M tokens.
For a 1,000-token prompt and a 200-token reply: $0.000180 per call, or $180 per 1M calls.
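The per-call figure can be reproduced with a few lines of arithmetic. The sketch below uses the prices quoted on this page. The function name `call_cost` is hypothetical, and it assumes cached input tokens are billed at the cached rate in place of the normal input rate, ignoring any cache storage fees.

```python
# Prices in USD per 1M tokens, as listed on this page.
INPUT_PER_M = 0.10
OUTPUT_PER_M = 0.40
CACHED_INPUT_PER_M = 0.01


def call_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return the USD cost of one call.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache and billed at the cached rate instead of the normal input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


per_call = call_cost(1_000, 200)
print(f"${per_call:.6f} per call, ${per_call * 1_000_000:.0f} per 1M calls")
```

With the whole 1,000-token prompt served from cache, the same call would cost half as much under this assumption, because the 200 output tokens ($0.000080) then dominate.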
The closest direct comparisons:
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Cost of 1M calls @ 1k in / 200 out |
|---|---|---|---|
| GPT-5 Nano | $0.05 | $0.40 | $130 |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $180 |
| GPT-4.1 Nano | $0.10 | $0.40 | $180 |
| Llama 3.1 8B | $0.18 | $0.18 | $216 |
GPT-5 Nano wins on raw price. Flash-Lite wins on its 1M-token context window and multimodal input. Choose by what your workload needs.
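The last column of the table can be recomputed for a different prompt and reply size. This is a small sketch using the prices listed above. The name `cost_per_million_calls` is hypothetical. Because prices are quoted per 1M tokens, the cost of 1M calls simplifies to tokens multiplied by price.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "GPT-5 Nano": (0.05, 0.40),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "GPT-4.1 Nano": (0.10, 0.40),
    "Llama 3.1 8B": (0.18, 0.18),
}


def cost_per_million_calls(input_price: float, output_price: float,
                           prompt_tokens: int = 1_000, reply_tokens: int = 200) -> float:
    """USD cost of 1M identical calls.

    Per-call cost is tokens * price / 1e6; multiplying by 1e6 calls cancels
    the divisor, leaving tokens * price.
    """
    return prompt_tokens * input_price + reply_tokens * output_price


# Cheapest first; ties keep the table's original order (sorted is stable).
ranked = sorted(PRICES, key=lambda name: cost_per_million_calls(*PRICES[name]))
for name in ranked:
    print(f"{name}: ${cost_per_million_calls(*PRICES[name]):.0f}")
```

Passing other values for `prompt_tokens` and `reply_tokens` shows how the ranking shifts for output-heavy workloads, where Llama 3.1 8B's lower output price starts to matter.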
When to use Gemini 2.5 Flash-Lite
- High-volume classification and extraction at the lowest possible Google price.
- Multimodal at scale — image labeling, video frame tagging, audio transcription metadata.
- Long-context retrieval where you need >400K tokens (GPT-5 family caps at 400K).
- Production-stable cheap tier — GA, unlike the Gemini 3 Preview models.
When not to use it:
- Pure text reasoning where GPT-5 Nano is cheaper.
- Frontier-tier tasks. Use Gemini 3 Pro or GPT-5.5.
Common questions
How does Flash-Lite compare to Gemini 3 Flash?
Flash-Lite ($0.10/$0.40) is much cheaper. Gemini 3 Flash ($0.50/$3.00) is much more capable, positioned as a Pro replacement. Test both — Flash-Lite is often enough.
Is the countTokens endpoint free?
Yes — Google's countTokens is separate from generation billing. Our proxy adds a 30-day cache so we don't burn quota on identical inputs.
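One way such a cache could work is sketched below. This is an illustration, not the proxy's actual implementation. The class `TokenCountCache`, the in-memory dictionary store, and the SHA-256 key over model and text are all assumptions made for the example; a real serverless proxy would more likely use a shared key-value store.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple

TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days


class TokenCountCache:
    """Memoize token counts for identical (model, text) inputs for 30 days."""

    def __init__(self, count_fn: Callable[[str, str], int],
                 clock: Callable[[], float] = time.time) -> None:
        self._count_fn = count_fn  # hypothetical upstream call, e.g. to countTokens
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, int]] = {}

    @staticmethod
    def _key(model: str, text: str) -> str:
        # Hash model and text together so the key has a fixed size.
        return hashlib.sha256(f"{model}\0{text}".encode("utf-8")).hexdigest()

    def count(self, model: str, text: str) -> int:
        key = self._key(model, text)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < TTL_SECONDS:
            return entry[1]  # fresh hit: no upstream call, no quota used
        tokens = self._count_fn(model, text)
        self._entries[key] = (now, tokens)
        return tokens
```

Caching is safe here because a token count depends only on the model's tokenizer and the input, so identical inputs yield identical counts until the tokenizer changes.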
Does Flash-Lite support tools and function calling?
Yes — same Gemini API surface as the Pro and Flash tiers. Reliability is somewhat lower than Pro on complex multi-tool workflows; verify with your prompts.
Compare Gemini 2.5 Flash-Lite to other models
- Gemini 3.1 Pro (Google, $2.00/$12.00)
- Gemini 3 Flash (Google, $0.50/$3.00)
- Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)
- Gemini 2.5 Pro (Google, $1.25/$10.00)
- Gemini 2.5 Flash (Google, $0.30/$2.50)
- GPT-4.1 Nano (OpenAI, $0.10/$0.40)
- GPT-4o mini (OpenAI, $0.15/$0.60)
- GPT-5 Nano (OpenAI, $0.05/$0.40)