How many tokens is a word?
The short answer
In typical English text, 1 word ≈ 1.3 tokens across most modern AI tokenizers (GPT-4o, Claude, Gemini). Or, conversely, 1,000 tokens ≈ 750 words.
Single short common words ("the", "and", "cat") are usually 1 token. Longer or less common words ("antidisestablishmentarianism", "tokenization") split into several pieces, typically 2-5, and very long rare words can split into more.
Why words don't map cleanly to tokens
Tokenizers don't split on whitespace; they split on subword units learned from training data. The most common letter sequences become single tokens, while rare sequences split into multiple tokens.
A few examples (GPT-4o's o200k_base tokenizer):
- cat → 1 token
- tokenizer → 1 token (common in AI training data)
- tokenization → 2 tokens (token + ization)
- pneumonoultramicroscopicsilicovolcanoconiosis → 13 tokens
Punctuation, leading whitespace, and capitalization all affect tokenization too: " cat" (with a leading space) is a different token than "cat".
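To make the subword idea concrete, here is a minimal sketch of a toy tokenizer. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are invented for illustration, and the greedy longest-match lookup is a simplification: real tokenizers such as o200k_base apply learned byte-pair merges over a vocabulary of roughly 200,000 entries, so their splits will differ from this toy.

```python
# Hypothetical mini-vocabulary, made up for this example only.
# Note that "cat" and " cat" (with a leading space) are separate entries.
TOY_VOCAB = {"the", "cat", " cat", "token", "izer", "ization", "tokenizer"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedily split text into the longest matching vocabulary entries.

    Characters not covered by any entry fall back to single-character tokens.
    This is NOT byte-pair encoding, just a simple stand-in for it.
    """
    max_len = max(len(entry) for entry in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so common long sequences stay whole.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


print(toy_tokenize("tokenizer"))     # ['tokenizer']
print(toy_tokenize("tokenization"))  # ['token', 'ization']
print(toy_tokenize("the cat"))       # ['the', ' cat']
```

Because "tokenizer" is in the toy vocabulary as a whole, it stays a single token, while "tokenization" falls back to two shorter pieces. The last call shows the leading-space effect: the second token is " cat", not "cat".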
Practical estimation
Rough rules of thumb for English:
| Unit | Tokens |
|---|---|
| 1 word | ~1.3 |
| 1 sentence | ~15-25 |
| 1 paragraph (~5 sentences) | ~75-125 |
| 1 page (~500 words) | ~650 |
| 1,000 words | ~1,300 |
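The table can be turned into a quick estimator. This is only a sketch of the 1.3 tokens-per-word rule of thumb, assuming English text and whitespace-separated words; the names `TOKENS_PER_WORD`, `estimate_tokens`, and `words_that_fit` are illustrative and not part of any library.

```python
TOKENS_PER_WORD = 1.3  # rule-of-thumb ratio for English, not an exact figure


def estimate_tokens(text):
    """Estimate the token count of English text from its word count."""
    words = len(text.split())
    return round(words * TOKENS_PER_WORD)


def words_that_fit(token_budget):
    """Approximate how many English words fit into a token budget."""
    return int(token_budget / TOKENS_PER_WORD)


page = " ".join(["word"] * 500)   # stand-in for a 500-word page
print(estimate_tokens(page))      # 650
print(words_that_fit(1000))       # 769
```

The second result, 769, is the same rule run in reverse; it lands close to the rounder "1,000 tokens ≈ 750 words" figure. Treat both directions as rough estimates, since real counts depend on the tokenizer and the text.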
For non-English languages and other kinds of content, the ratio shifts substantially:
- Chinese / Japanese with the o200k_base tokenizer: ~2-3 tokens per character (vs. ~1.3 per English word)
- Code: 10-20% more tokens than equivalent prose
- Numbers and digit-heavy identifiers: usually 1 token per ~3 digits
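The digit and code adjustments above reduce to simple arithmetic. The sketch below assumes those two rules of thumb hold as stated; `estimate_number_tokens` and `estimate_code_tokens` are hypothetical helper names, and separators such as commas or decimal points (which usually cost extra tokens in practice) are ignored.

```python
import math

DIGITS_PER_TOKEN = 3          # "1 token per ~3 digits"
CODE_OVERHEAD = (1.10, 1.20)  # "10-20% more tokens than equivalent prose"


def estimate_number_tokens(number):
    """Estimate tokens for a number given as a string, counting digits only."""
    digits = sum(ch.isdigit() for ch in number)
    return math.ceil(digits / DIGITS_PER_TOKEN)


def estimate_code_tokens(prose_tokens):
    """Return a (low, high) token range for code of similar length to the prose."""
    low, high = CODE_OVERHEAD
    return round(prose_tokens * low), round(prose_tokens * high)


print(estimate_number_tokens("1234567"))  # 3
print(estimate_code_tokens(1000))         # (1100, 1200)
```

Returning a range for code rather than a single number keeps the uncertainty of the 10-20% figure visible to whoever uses the estimate.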
Get an exact count
Paste your text into the counter — it shows the exact token count for your input across every supported model. Different models have different tokenizers, so the count varies by 5-20% across providers for the same text.