Claude Opus 4.8: what actually changed and what to do
Anthropic released Claude Opus 4.8 on May 28, calling it "a modest but tangible improvement" on 4.7. The marketing copy buried the actually interesting news: Fast Mode dropped from $30/$150 to $10/$50 per 1M tokens. That's a 3x cut on the high-throughput tier while standard mode held at $5/$25.
For most of you the answer is: change claude-opus-4-7 to claude-opus-4-8 in your model string and move on. Same tokenizer, same price, claimed-better outputs. But two cases deserve real consideration.
When the Fast Mode price cut actually matters
If you're running Opus in an interactive agent loop — Cursor-style autocomplete, multi-turn chat with thinking, code review where a developer is waiting — latency cost is real. Opus 4.7 Fast at $30/$150 was hard to justify against just running Sonnet 4.6 at $3/$15. Opus 4.8 Fast at $10/$50 changes that math: you're now paying 2x the standard Opus price for tokens at 2.5x the speed, which is a clean trade for human-in-the-loop workflows where wall-clock time directly hits productivity.
If you're running Opus in batch (summarization, classification, overnight processing), Fast Mode is still the wrong tier. Standard is half the price and end-to-end time doesn't matter when no one's watching.
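To put numbers on that trade, here is a minimal sketch that prices a single request on both tiers and works out what each saved second costs. The per-1M-token prices and the 2.5x speed figure are the ones quoted above. The tier labels, function names, and the assumption that the whole request finishes 2.5x sooner are illustrative simplifications, not anything Anthropic publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str               # hypothetical label, not an API model id
    input_per_mtok: float   # USD per 1M input tokens
    output_per_mtok: float  # USD per 1M output tokens
    speedup: float          # assumed throughput relative to standard Opus


# Prices and the 2.5x figure come from this post; treat them as inputs to check.
STANDARD = Tier("opus-4.8-standard", 5.0, 25.0, 1.0)
FAST = Tier("opus-4.8-fast", 10.0, 50.0, 2.5)


def request_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request on the given tier."""
    return (
        input_tokens * tier.input_per_mtok
        + output_tokens * tier.output_per_mtok
    ) / 1_000_000


def fast_mode_tradeoff(input_tokens: int, output_tokens: int,
                       standard_seconds: float) -> tuple[float, float]:
    """Return (extra_cost_usd, seconds_saved) for moving one request to Fast Mode.

    Assumes the whole request speeds up by FAST.speedup, which is optimistic
    if network or queueing time dominates.
    """
    extra = (request_cost(FAST, input_tokens, output_tokens)
             - request_cost(STANDARD, input_tokens, output_tokens))
    saved = standard_seconds - standard_seconds / FAST.speedup
    return extra, saved


def breakeven_hourly_rate(input_tokens: int, output_tokens: int,
                          standard_seconds: float) -> float:
    """Hourly value of a person's time at which Fast Mode pays for itself."""
    extra, saved = fast_mode_tradeoff(input_tokens, output_tokens, standard_seconds)
    return extra / saved * 3600


if __name__ == "__main__":
    # Example: 8,000 input tokens, 1,500 output tokens, 30 s on standard.
    rate = breakeven_hourly_rate(8_000, 1_500, 30.0)
    print(f"Fast Mode breaks even at about ${rate:.2f}/hour of waiting time")
```

If the break-even rate comes out well below what the waiting person's time is worth, Fast Mode is probably the right tier for that path. For a batch job nobody is waiting on, the seconds saved are worth nothing and the extra cost is pure overhead.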
What didn't change (and probably should have)
Opus 4.8 inherits the Opus 4.7 tokenizer, the one that produces up to 35% more tokens than legacy Claude models for the same text. Anthropic hasn't acknowledged this as a regression for Opus 4.7 and 4.8 relative to Sonnet 4.6 or Haiku; it's just baked into the per-call cost.
The practical effect: when you migrate workloads from Sonnet to Opus to get reasoning quality, your input bill goes up more than the headline price ratio suggests. A prompt that comes to 1,000 tokens on Sonnet might come to 1,250-1,350 tokens on Opus. Multiply that by Opus's input price premium over Sonnet 4.6 ($5 vs $3 per 1M tokens, about 1.67x) and you're at roughly 2.1x to 2.25x the actual input cost per query, not the 1.67x the rate card implies.
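The arithmetic is just token inflation multiplied by the rate-card ratio. A small sketch, using the $3 and $5 per-1M input prices quoted earlier in this post as defaults; the function names are made up for illustration, and you should substitute the prices and inflation you actually measure on your own prompts.

```python
def effective_input_multiplier(token_inflation: float,
                               cheaper_price_per_mtok: float = 3.0,
                               pricier_price_per_mtok: float = 5.0) -> float:
    """How many times more the same prompt costs on the pricier model.

    token_inflation: tokens on the pricier model divided by tokens on the
    cheaper one for identical text (1.25 means 25% more tokens).
    Default prices are the Sonnet 4.6 and standard Opus input rates
    quoted in this post.
    """
    rate_card_ratio = pricier_price_per_mtok / cheaper_price_per_mtok
    return token_inflation * rate_card_ratio


def input_cost_usd(tokens: int, price_per_mtok: float) -> float:
    """USD cost of the input side of one request."""
    return tokens * price_per_mtok / 1_000_000


if __name__ == "__main__":
    for inflation in (1.25, 1.35):
        mult = effective_input_multiplier(inflation)
        print(f"{inflation:.2f}x tokens -> {mult:.2f}x input cost")
```

The point of keeping the two factors separate is that only one of them appears on the rate card. The inflation factor has to be measured by counting tokens for your own prompts on both models.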
The counter on this site shows the live Opus 4.8 count, so what you see is what you'll be billed — but it's worth knowing why a prompt looks heavier on Opus than on its Claude siblings.
What I'd do this week
- Migrate the model string from claude-opus-4-7 to claude-opus-4-8 in any code that's still pinned to the old version. No regression risk at the API level (same shape, same tokenizer, same standard price), and Anthropic typically deprecates n-2 within a quarter.
- Run a Fast Mode pilot if you have a latency-sensitive workload currently on standard Opus. Pick one user-facing path, switch to claude-opus-4-8-fast, and measure p95 response time and developer perception over a week. If it doesn't move a metric you care about, you gained nothing by paying 2x.
- Don't migrate Sonnet workloads to Opus just because Opus got "better." Sonnet 4.6 is still about 40% cheaper on input ($3 vs $5 per 1M tokens), before counting the extra tokens Opus produces, and is usually indistinguishable for chatbot, classification, and short-form generation tasks. Spend the savings on running a real benchmark on your prompts.
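For the pilot, the p95 comparison can be as simple as the sketch below. It assumes you already log per-request latencies in seconds for the same user-facing path on both tiers; the function names and the nearest-rank percentile method are my choices for illustration, not part of any Anthropic tooling.

```python
def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def pilot_summary(standard_latencies: list[float],
                  fast_latencies: list[float]) -> dict[str, float]:
    """Compare p95 latency between the standard and Fast Mode arms of a pilot."""
    standard_p95 = p95(standard_latencies)
    fast_p95 = p95(fast_latencies)
    return {
        "standard_p95": standard_p95,
        "fast_p95": fast_p95,
        # Values above 1.0 mean Fast Mode was faster at the tail.
        "p95_speedup": standard_p95 / fast_p95,
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the output.
    summary = pilot_summary([8.0, 9.5, 12.0, 30.0], [3.5, 4.0, 5.0, 11.0])
    print(summary)
```

A week of real traffic will give a much steadier p95 than a handful of samples. If the measured speedup at the tail is well short of the headline figure, the 2x price is harder to defend.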
Where to compare
- Claude Opus 4.8 vs GPT-4o — the frontier-class head-to-head
- GPT-5 vs Claude Opus — for teams choosing between current flagships
- Cheapest model for code review — Opus isn't the cheapest at this workload, but Fast Mode brings it back into consideration
- Cheapest model for a chatbot — spoiler: not Opus, not even close
- Pricing changelog — full audit trail of every price change we've tracked
The live counter on the home page ranks all 46 models by your actual prompt cost. Paste in a real prompt before committing to any migration plan.