#tHow Many Tokens?

Pricing changelog

Every change to AI model pricing tracked by this site, with the source link. Newest first.

DateModelFieldFromToSource
2026-05-31 claude-opus updated claude-opus-4-7 claude-opus-4-8 https://www.anthropic.com/news/claude-opus-4-8
Anthropic released Claude Opus 4.8 on 2026-05-28. Standard mode pricing unchanged at $5 input / $25 output per 1M tokens. The claude-opus entry now resolves to 4.8 (apiId: claude-opus-4-8). Inherits the Opus 4.7 tokenizer behavior (up to 35% more tokens than legacy Claude models for the same text). Anthropic describes 4.8 as 'a modest but tangible improvement' with gains in agentic coding, reasoning, knowledge work, and honesty.
2026-05-31 claude-opus-fast added https://www.anthropic.com/news/claude-opus-4-8
Added Claude Opus 4.8 Fast Mode as a separate entry: $10 input / $50 output per 1M tokens, producing tokens at ~2.5x normal speed. Anthropic dropped the fast-tier price 3x vs Opus 4.7 (was $30/$150). Useful for latency-sensitive workloads where Opus quality is needed but the standard tier's throughput is the bottleneck.
2026-05-27 llama-family tokenizer-accuracy approx-3pct exact https://www.npmjs.com/package/llama-tokenizer-js
Shipped real BPE tokenization for the Llama family via llama-tokenizer-js (lazy-loaded ~2MB chunk on first Llama count). All llama-* models now labeled 'exact' instead of '≈±3%'. Mistral / Qwen / DeepSeek / GLM still use heuristic (now character-class-aware — buckets text into ASCII / digit / CJK / whitespace and applies per-class ratios — more accurate than the prior constant-ratio version, still labeled ≈±3%).
2026-04-27 oss-batch added https://www.together.ai/pricing
Added 5 OSS models: Llama 3.3 70B ($0.88/$0.88, current Together flagship Meta), DeepSeek V3.1 ($0.60/$1.70 Together listing), DeepSeek R1 ($3/$7 reasoning), Qwen3 Coder 480B ($2/$2 current Alibaba coding flagship), GLM-5.1 ($1.40/$4.40 new Zhipu provider). Llama 3.1 entries kept but flagged 'no longer on Together's main page — verify provider'. Llama 4 NOT added — not on Together's current pricing page.
2026-04-27 openai-batch added https://developers.openai.com/api/docs/pricing
Added 21 OpenAI models: full GPT-5 family (5, 5.1, 5.2, 5.2 Pro, 5.3, 5.4, 5.4 Mini, 5.4 Nano, 5.4 Pro, 5.5, 5.5 Pro, mini, nano, Pro tiers), GPT-4.1 family (4.1, mini, nano), and o-series reasoning (o3, o3-mini, o3-pro, o4-mini).
2026-04-27 google-batch added https://ai.google.dev/gemini-api/docs/pricing
Added Gemini 3 family: 3.1 Pro Preview ($2/$12), 3 Flash Preview ($0.50/$3), 3.1 Flash-Lite Preview ($0.25/$1.50). Added Gemini 2.5 Flash-Lite GA ($0.10/$0.40).
2026-04-27 gemini-2-5-flash input 0.075 0.3 https://ai.google.dev/gemini-api/docs/pricing
Correction — Gemini 2.5 Flash priced higher than initial entry. Live Google pricing page lists $0.30 input / $2.50 output.
2026-04-27 gemini-2-5-flash output 0.3 2.5 https://ai.google.dev/gemini-api/docs/pricing
Correction — see input change above.
2026-04-27 claude-opus input 15 5 https://platform.claude.com/docs/en/about-claude/pricing
Correction — initial Opus 4.7 entry was based on prior-generation Opus pricing. Anthropic prices Opus 4.7 at $5 input / $25 output.
2026-04-27 claude-opus output 75 25 https://platform.claude.com/docs/en/about-claude/pricing
Correction — see input change above.
2026-04-27 claude-haiku input 0.8 1 https://platform.claude.com/docs/en/about-claude/pricing
Correction — Haiku 4.5 priced higher than initial entry suggested.
2026-04-27 claude-haiku output 4 5 https://platform.claude.com/docs/en/about-claude/pricing
Correction — see input change above.
2026-04-26 all initial site launch
Initial pricing snapshot — 15 models across 7 providers.