Claude Sonnet 4.6: token counter & pricing
Anthropic · exact (uses official tokenizer) · pricing as of 2026-04-26.
- Provider: Anthropic
- API model ID: claude-sonnet-4-6
- Context window: 200,000 tokens
- Input price: $3.00 per 1M tokens
- Output price: $15.00 per 1M tokens
- Tokenizer accuracy: exact (uses official tokenizer)
- Pricing as of: 2026-04-26
Open the counter to count tokens for Claude Sonnet 4.6 in real time.
What is Claude Sonnet 4.6?
Claude Sonnet 4.6 is Anthropic's mid-tier model, the workhorse most production Claude workloads should default to. It delivers strong reasoning, code, and writing at one-fifth of Opus's price on both input ($3.00 vs. $15.00 per 1M tokens) and output ($15.00 vs. $75.00).
How tokens are counted here
This counter uses Anthropic's official /v1/messages/count_tokens endpoint via our serverless proxy. Counts are exact: identical to what Anthropic's billing system charges.
The proxy sends the prompt to Anthropic's tokenization endpoint only. The prompt is never logged, never stored, and never used for training (per Anthropic's policy on count_tokens).
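For reference, a minimal sketch of calling the count_tokens endpoint yourself, without the proxy. The URL, version header, and `input_tokens` response field follow Anthropic's published API docs; the model ID is the one listed on this page.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages/count_tokens"

def build_count_request(prompt: str, model: str = "claude-sonnet-4-6") -> dict:
    """Build the JSON payload for Anthropic's count_tokens endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def count_tokens(prompt: str, api_key: str) -> int:
    """POST the prompt to the tokenization endpoint and return the exact count."""
    data = json.dumps(build_count_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["input_tokens"]
```

Counting before sending lets you budget a request against the 200,000-token window instead of discovering an overflow at generation time.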
When to use Sonnet over Opus or Haiku
- Most production workloads. Sonnet is the default; reach for Opus only when you've measured Sonnet falling short on your task.
- Long-form writing where quality matters but Opus is overkill — most blog posts, emails, summaries.
- Code generation and review on routine diffs. Opus is worth the 5× premium only on architecture-class problems.
- RAG with substantial context — the 200,000-token window handles most documents in one shot.
If your workload is high-volume classification, extraction, or short Q&A, Claude Haiku is about 4× cheaper ($0.80/$4.00 vs. $3.00/$15.00 per million), with quality differences invisible to most users.
Common questions
How does Sonnet compare to GPT-4o on price?
Sonnet: $3/$15 per million. GPT-4o: $2.50/$10. GPT-4o is ~17% cheaper on input, 33% cheaper on output. For input-heavy workloads with short replies, GPT-4o usually wins on raw cost. Sonnet often wins on instruction-following nuance — measure with your prompts.
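"Measure with your prompts" applies to cost as well as quality. A quick sketch of the per-request comparison, using the prices quoted above (the example token counts are arbitrary):

```python
def cost(input_tokens: int, output_tokens: int,
         in_price: float, out_price: float) -> float:
    """USD cost of one request, given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

SONNET = (3.00, 15.00)  # input, output USD per 1M tokens
GPT4O = (2.50, 10.00)

# Input-heavy workload: 50k-token prompt, 300-token reply.
sonnet_cost = cost(50_000, 300, *SONNET)  # $0.1545
gpt4o_cost = cost(50_000, 300, *GPT4O)    # $0.1280
```

For this input-heavy shape, GPT-4o comes in roughly 17% cheaper per request, tracking its input-price advantage; output-heavy shapes widen the gap toward 33%.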
Does the 200,000-token context window cost more?
No. Input is billed per token regardless of context-window position. A 100k-token prompt costs the same per token as a 1k-token prompt — the total just scales with the number of tokens you send.
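The billing rule is simple enough to sketch as arithmetic, using the prices from the table at the top of this page:

```python
INPUT_PRICE_PER_M = 3.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: strictly linear in tokens, no long-context surcharge."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A 100k-token prompt with a 500-token reply:
#   input:  100_000 * $3 / 1M  = $0.30
#   output:     500 * $15 / 1M = $0.0075
```

Because the function is linear, a 100k-token prompt costs exactly 100× a 1k-token prompt; position in the context window never changes the per-token rate.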
Is prompt caching available on Sonnet?
Yes — Anthropic's prompt caching reduces cost on repeated long-context prompts (e.g., the same RAG document across many queries). Cached input tokens cost ~10% of the standard rate. Not reflected in the calculator above; factor it in manually if your workload qualifies.
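To factor caching in manually, discount the cached share of input tokens. This sketch uses the ~10% cache-read rate stated above; the exact multiplier is an assumption to verify against Anthropic's current pricing page.

```python
INPUT_PRICE_PER_M = 3.00      # USD per 1M input tokens
CACHE_READ_MULTIPLIER = 0.10  # assumption: cached reads bill at ~10% of base rate

def input_cost_with_cache(cached_tokens: int, fresh_tokens: int) -> float:
    """USD input cost when part of the prompt is served from the prompt cache."""
    effective_tokens = cached_tokens * CACHE_READ_MULTIPLIER + fresh_tokens
    return effective_tokens * INPUT_PRICE_PER_M / 1_000_000

# Same 100k-token RAG prompt, 90k of it a cached document:
#   cached:   (90_000 * 0.10 + 10_000) * $3 / 1M = $0.057
#   uncached:  100_000 * $3 / 1M                 = $0.30
```

For repeated long-context prompts the savings dominate: in the example above, caching cuts input cost by over 80% per query after the first.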
Compare Claude Sonnet 4.6 to other models
- Claude Opus 4.7 (Anthropic, $15.00/$75.00)
- Claude Haiku 4.5 (Anthropic, $0.80/$4.00)
- GPT-4o (OpenAI, $2.50/$10.00)
- Llama 3.1 405B (Meta, $3.50/$3.50)
- Mistral Large (Mistral, $2.00/$6.00)