Claude Sonnet 4.6: token counter & pricing
Anthropic · exact (uses official tokenizer) · pricing as of 2026-04-26.
- Provider: Anthropic
- API model ID: claude-sonnet-4-6
- Context window: 200,000 tokens
- Input price: $3.00 per 1M tokens
- Output price: $15.00 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-04-26
Open the counter to count tokens for Claude Sonnet 4.6 in real time.
What is Claude Sonnet 4.6?
Claude Sonnet 4.6 is Anthropic's mid-tier model, the workhorse most production Claude workloads should default to. It delivers strong reasoning, code, and writing at one-fifth of Opus's price on both input ($3.00 vs. $15.00 per 1M tokens) and output ($15.00 vs. $75.00).
How tokens are counted here
This counter uses Anthropic's official /v1/messages/count_tokens endpoint via our serverless proxy. Counts are exact: identical to what Anthropic's billing system charges.
The proxy sends the prompt to Anthropic's tokenization endpoint only. The prompt is never logged, never stored, and never used for training (per Anthropic's policy on count_tokens).
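For reference, a minimal sketch of calling the count_tokens endpoint yourself, without the proxy. The URL, version header, and `input_tokens` response field follow Anthropic's published API docs; the model ID is the one listed on this page.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages/count_tokens"

def build_count_request(prompt: str, model: str = "claude-sonnet-4-6") -> dict:
    """Build the JSON payload for Anthropic's count_tokens endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def count_tokens(prompt: str, api_key: str) -> int:
    """POST the prompt to the tokenization endpoint and return the exact count."""
    data = json.dumps(build_count_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["input_tokens"]
```

Counting before sending lets you budget a request against the 200,000-token window instead of discovering an overflow at generation time.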
When to use Sonnet over Opus or Haiku
- Most production workloads. Sonnet is the default; reach for Opus only when you've measured Sonnet falling short on your task.
- Long-form writing where quality matters but Opus is overkill — most blog posts, emails, summaries.
- Code generation and review on routine diffs. Opus is worth the 5× premium only on architecture-class problems.
- RAG with substantial context — the 200,000-token window handles most documents in one shot.
If your workload is high-volume classification, extraction, or short Q&A, Claude Haiku is about 4× cheaper ($0.80/$4.00 vs. $3.00/$15.00 per million), with quality differences invisible to most users.
Common questions
How does Sonnet compare to GPT-4o on price?
Sonnet: $3/$15 per million. GPT-4o: $2.50/$10. GPT-4o is ~17% cheaper on input, 33% cheaper on output. For input-heavy workloads with short replies, GPT-4o usually wins on raw cost. Sonnet often wins on instruction-following nuance — measure with your prompts.
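"Measure with your prompts" applies to cost as well as quality. A quick sketch of the per-request comparison, using the prices quoted above (the example token counts are arbitrary):

```python
def cost(input_tokens: int, output_tokens: int,
         in_price: float, out_price: float) -> float:
    """USD cost of one request, given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

SONNET = (3.00, 15.00)  # input, output USD per 1M tokens
GPT4O = (2.50, 10.00)

# Input-heavy workload: 50k-token prompt, 300-token reply.
sonnet_cost = cost(50_000, 300, *SONNET)  # $0.1545
gpt4o_cost = cost(50_000, 300, *GPT4O)    # $0.1280
```

For this input-heavy shape, GPT-4o comes in roughly 17% cheaper per request, tracking its input-price advantage; output-heavy shapes widen the gap toward 33%.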
Does the 200,000-token context window cost more?
No. Input is billed per token regardless of context-window position. A 100k-token prompt costs the same per token as a 1k-token prompt — the total just scales with the number of tokens you send.
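The billing rule is simple enough to sketch as arithmetic, using the prices from the table at the top of this page:

```python
INPUT_PRICE_PER_M = 3.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: strictly linear in tokens, no long-context surcharge."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A 100k-token prompt with a 500-token reply:
#   input:  100_000 * $3 / 1M  = $0.30
#   output:     500 * $15 / 1M = $0.0075
```

Because the function is linear, a 100k-token prompt costs exactly 100× a 1k-token prompt; position in the context window never changes the per-token rate.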
Is prompt caching available on Sonnet?
Yes — Anthropic's prompt caching reduces cost on repeated long-context prompts (e.g., the same RAG document across many queries). Cached input tokens cost ~10% of the standard rate. Not reflected in the calculator above; factor it in manually if your workload qualifies.
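To factor caching in manually, discount the cached share of input tokens. This sketch uses the ~10% cache-read rate stated above; the exact multiplier is an assumption to verify against Anthropic's current pricing page.

```python
INPUT_PRICE_PER_M = 3.00      # USD per 1M input tokens
CACHE_READ_MULTIPLIER = 0.10  # assumption: cached reads bill at ~10% of base rate

def input_cost_with_cache(cached_tokens: int, fresh_tokens: int) -> float:
    """USD input cost when part of the prompt is served from the prompt cache."""
    effective_tokens = cached_tokens * CACHE_READ_MULTIPLIER + fresh_tokens
    return effective_tokens * INPUT_PRICE_PER_M / 1_000_000

# Same 100k-token RAG prompt, 90k of it a cached document:
#   cached:   (90_000 * 0.10 + 10_000) * $3 / 1M = $0.057
#   uncached:  100_000 * $3 / 1M                 = $0.30
```

For repeated long-context prompts the savings dominate: in the example above, caching cuts input cost by over 80% per query after the first.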
Compare Claude Sonnet 4.6 to other models
- Claude Opus 4.7 (Anthropic, $15.00/$75.00)
- Claude Haiku 4.5 (Anthropic, $0.80/$4.00)
- GPT-4o (OpenAI, $2.50/$10.00)
- Llama 3.1 405B (Meta, $3.50/$3.50)
- Mistral Large (Mistral, $2.00/$6.00)