Gemini 2.5 Flash-Lite: token counter & pricing

Google · exact (uses official tokenizer) · pricing as of 2026-07-26.

Updated 2026-07-26 · By Clinton Patrick · Methodology

Provider: Google
API model ID: gemini-2.5-flash-lite
Context window: 1,000,000 tokens
Input price: $0.10 per 1M tokens
Output price: $0.40 per 1M tokens
Tokenizer accuracy: exact (uses official tokenizer)
Pricing as of: 2026-07-26

Open the counter to count tokens for Gemini 2.5 Flash-Lite in real time.

What is Gemini 2.5 Flash-Lite?

Gemini 2.5 Flash-Lite is Google's smallest GA Gemini model, designed for the highest-volume, lowest-cost workloads where you still want an exact tokenizer and a 1M-token context window.

At $0.10 input / $0.40 output per 1M tokens, it's the cheapest GA model with a vendor-official tokenizer, undercut only by GPT-5 Nano ($0.05 input) on raw input price.

How tokens are counted here

Gemini 2.5 Flash-Lite uses Google's official models.countTokens endpoint via our serverless proxy. Exact.

Pricing notes

$0.10 input / $0.40 output per 1M tokens. Cached input $0.01/M.

For 1,000-token prompt + 200-token reply: $0.000180 per call, $180 per 1M calls.

The closest direct comparisons:

Model	Input	Output	1M calls @ 1k/200
GPT-5 Nano	$0.05	$0.40	$130
Gemini 2.5 Flash-Lite	$0.10	$0.40	$180
GPT-4.1 Nano	$0.10	$0.40	$180
Llama 3.1 8B	$0.18	$0.18	$216

GPT-5 Nano wins on raw price. Flash-Lite wins on 1M context window and multimodal input. Choose by what your workload needs.

When to use Gemini 2.5 Flash-Lite

High-volume classification and extraction at the lowest possible Google price.
Multimodal at scale, image labeling, video frame tagging, audio transcription metadata.
Long-context retrieval where you need >400K tokens (GPT-5 family caps at 400K).
Production-stable cheap tier. GA, unlike the Gemini 3 Preview models.

When not to use it:

Pure text reasoning where GPT-5 Nano is cheaper.
Frontier-tier tasks. Use Gemini 3 Pro or GPT-5.5.

Common questions

How does Flash-Lite compare to Gemini 3 Flash?

Flash-Lite ($0.10/$0.40) is much cheaper. Gemini 3 Flash ($0.50/$3.00) is much more capable, positioned as a Pro replacement. Test both. Flash-Lite is often enough.

Is the count_tokens endpoint free?

Yes. Google's countTokens is separate from generation billing. Our proxy adds a 30-day cache so we don't burn quota on identical inputs.

Does Flash-Lite support tools and function calling?

Yes, same Gemini API surface as the Pro and Flash tiers. Reliability is somewhat lower than Pro on complex multi-tool workflows; verify with your prompts.

Compare Gemini 2.5 Flash-Lite to other models

Gemini 3.1 Pro (Google, $2.00/$12.00)
Gemini 3 Flash (Google, $0.50/$3.00)
Gemini 3.1 Flash-Lite (Google, $0.25/$1.50)
Gemini 2.5 Pro (Google, $1.25/$10.00)
Gemini 2.5 Flash (Google, $0.30/$2.50)
GPT-4.1 Nano (OpenAI, $0.10/$0.40)
GPT-4o mini (OpenAI, $0.15/$0.60)
GPT-5 Nano (OpenAI, $0.05/$0.40)