Pricing changelog
Every change to AI model pricing tracked by this site, with the source link. Newest first.
| Date | Model | Field | From | To | Source |
|---|---|---|---|---|---|
| 2026-05-31 | claude-opus | updated | claude-opus-4-7 | claude-opus-4-8 | https://www.anthropic.com/news/claude-opus-4-8 |
| Anthropic released Claude Opus 4.8 on 2026-05-28. Standard mode pricing unchanged at $5 input / $25 output per 1M tokens. The claude-opus entry now resolves to 4.8 (apiId: claude-opus-4-8). Inherits the Opus 4.7 tokenizer behavior (up to 35% more tokens than legacy Claude models for the same text). Anthropic describes 4.8 as 'a modest but tangible improvement' with gains in agentic coding, reasoning, knowledge work, and honesty. | |||||
| 2026-05-31 | claude-opus-fast | added | — | — | https://www.anthropic.com/news/claude-opus-4-8 |
| Added Claude Opus 4.8 Fast Mode as a separate entry: $10 input / $50 output per 1M tokens, producing tokens at ~2.5x normal speed. Anthropic dropped the fast-tier price 3x vs Opus 4.7 (was $30/$150). Useful for latency-sensitive workloads where Opus quality is needed but the standard tier's throughput is the bottleneck. | |||||
| 2026-05-27 | llama-family | tokenizer-accuracy | approx-3pct | exact | https://www.npmjs.com/package/llama-tokenizer-js |
| Shipped real BPE tokenization for the Llama family via llama-tokenizer-js (lazy-loaded ~2MB chunk on first Llama count). All llama-* models now labeled 'exact' instead of '≈±3%'. Mistral / Qwen / DeepSeek / GLM still use heuristic (now character-class-aware — buckets text into ASCII / digit / CJK / whitespace and applies per-class ratios — more accurate than the prior constant-ratio version, still labeled ≈±3%). | |||||
| 2026-04-27 | oss-batch | added | — | — | https://www.together.ai/pricing |
| Added 5 OSS models: Llama 3.3 70B ($0.88/$0.88, current Together flagship Meta), DeepSeek V3.1 ($0.60/$1.70 Together listing), DeepSeek R1 ($3/$7 reasoning), Qwen3 Coder 480B ($2/$2 current Alibaba coding flagship), GLM-5.1 ($1.40/$4.40 new Zhipu provider). Llama 3.1 entries kept but flagged 'no longer on Together's main page — verify provider'. Llama 4 NOT added — not on Together's current pricing page. | |||||
| 2026-04-27 | openai-batch | added | — | — | https://developers.openai.com/api/docs/pricing |
| Added 21 OpenAI models: full GPT-5 family (5, 5.1, 5.2, 5.2 Pro, 5.3, 5.4, 5.4 Mini, 5.4 Nano, 5.4 Pro, 5.5, 5.5 Pro, mini, nano, Pro tiers), GPT-4.1 family (4.1, mini, nano), and o-series reasoning (o3, o3-mini, o3-pro, o4-mini). | |||||
| 2026-04-27 | google-batch | added | — | — | https://ai.google.dev/gemini-api/docs/pricing |
| Added Gemini 3 family: 3.1 Pro Preview ($2/$12), 3 Flash Preview ($0.50/$3), 3.1 Flash-Lite Preview ($0.25/$1.50). Added Gemini 2.5 Flash-Lite GA ($0.10/$0.40). | |||||
| 2026-04-27 | gemini-2-5-flash | input | 0.075 | 0.3 | https://ai.google.dev/gemini-api/docs/pricing |
| Correction — Gemini 2.5 Flash priced higher than initial entry. Live Google pricing page lists $0.30 input / $2.50 output. | |||||
| 2026-04-27 | gemini-2-5-flash | output | 0.3 | 2.5 | https://ai.google.dev/gemini-api/docs/pricing |
| Correction — see input change above. | |||||
| 2026-04-27 | claude-opus | input | 15 | 5 | https://platform.claude.com/docs/en/about-claude/pricing |
| Correction — initial Opus 4.7 entry was based on prior-generation Opus pricing. Anthropic prices Opus 4.7 at $5 input / $25 output. | |||||
| 2026-04-27 | claude-opus | output | 75 | 25 | https://platform.claude.com/docs/en/about-claude/pricing |
| Correction — see input change above. | |||||
| 2026-04-27 | claude-haiku | input | 0.8 | 1 | https://platform.claude.com/docs/en/about-claude/pricing |
| Correction — Haiku 4.5 priced higher than initial entry suggested. | |||||
| 2026-04-27 | claude-haiku | output | 4 | 5 | https://platform.claude.com/docs/en/about-claude/pricing |
| Correction — see input change above. | |||||
| 2026-04-26 | all | initial | — | — | site launch |
| Initial pricing snapshot — 15 models across 7 providers. | |||||