AI pricing news & live model leaderboard

May 31, 2026 Live ranking

AI cost leaderboard, ranked cheapest first

Cost to run a 1,000-token-input / 200-token-output prompt 1,000 times, across the eight cheapest non-deprecated models we track. Updates automatically with every pricing snapshot.

#	Model	Per 1M calls
#1	GPT-5 Nano	$130.00
#2	GPT-4.1 Nano	$180.00
#3	Gemini 2.5 Flash-Lite	$180.00
#4	Llama 3.1 8B	$216.00
#5	GPT-4o mini	$270.00
#6	GPT-5.4 Nano	$450.00
#7	DeepSeek V3	$490.00
#8	Gemini 3.1 Flash-Lite	$550.00

full pricing changelog · try your own prompt

May 31, 2026 Editorial

Claude Opus 4.8: what actually changed and what to do

Anthropic's modest Opus 4.8 upgrade lands at the same standard price but quietly drops the fast-tier 3x. Worth the model-string swap; worth a serious look if latency is your bottleneck.

Read the full article →

May 31, 2026 Update

claude-opus: updated

Anthropic released Claude Opus 4.8 on 2026-05-28. Standard mode pricing unchanged at $5 input / $25 output per 1M tokens. The claude-opus entry now resolves to 4.8 (apiId: claude-opus-4-8). Inherits the Opus 4.7 tokenizer behavior (up to 35% more tokens than legacy Claude models for the same text). Anthropic describes 4.8 as 'a modest but tangible improvement' with gains in agentic coding, reasoning, knowledge work, and honesty.

source

May 31, 2026 Models added

Added: claude-opus-fast

Added Claude Opus 4.8 Fast Mode as a separate entry: $10 input / $50 output per 1M tokens, producing tokens at ~2.5x normal speed. Anthropic dropped the fast-tier price 3x vs Opus 4.7 (was $30/$150). Useful for latency-sensitive workloads where Opus quality is needed but the standard tier's throughput is the bottleneck.

source

May 27, 2026 Accuracy upgrade

llama-family tokenizer: approx-3pct → exact

Shipped real BPE tokenization for the Llama family via llama-tokenizer-js (lazy-loaded ~2MB chunk on first Llama count). All llama-* models now labeled 'exact' instead of '≈±3%'. Mistral / Qwen / DeepSeek / GLM still use heuristic (now character-class-aware — buckets text into ASCII / digit / CJK / whitespace and applies per-class ratios — more accurate than the prior constant-ratio version, still labeled ≈±3%).

source

April 27, 2026 Models added

Added: oss

Added 5 OSS models: Llama 3.3 70B ($0.88/$0.88, current Together flagship Meta), DeepSeek V3.1 ($0.60/$1.70 Together listing), DeepSeek R1 ($3/$7 reasoning), Qwen3 Coder 480B ($2/$2 current Alibaba coding flagship), GLM-5.1 ($1.40/$4.40 new Zhipu provider). Llama 3.1 entries kept but flagged 'no longer on Together's main page — verify provider'. Llama 4 NOT added — not on Together's current pricing page.

source

April 27, 2026 Models added

Added: openai

Added 21 OpenAI models: full GPT-5 family (5, 5.1, 5.2, 5.2 Pro, 5.3, 5.4, 5.4 Mini, 5.4 Nano, 5.4 Pro, 5.5, 5.5 Pro, mini, nano, Pro tiers), GPT-4.1 family (4.1, mini, nano), and o-series reasoning (o3, o3-mini, o3-pro, o4-mini).

source

April 27, 2026 Models added

Added: google

Added Gemini 3 family: 3.1 Pro Preview ($2/$12), 3 Flash Preview ($0.50/$3), 3.1 Flash-Lite Preview ($0.25/$1.50). Added Gemini 2.5 Flash-Lite GA ($0.10/$0.40).

source

April 27, 2026 Price change