How Many Tokens?

How many tokens is a word?

The short answer

In typical English text, 1 word ≈ 1.3 tokens for the tokenizers used by most modern AI models (GPT-4o, Claude, Gemini); equivalently, 1,000 tokens ≈ 750 words.

Single short common words ("the", "and", "cat") are usually 1 token. Longer or less common words ("antidisestablishmentarianism", "tokenization") split into 2-5 pieces.

Why words don't map cleanly to tokens

Tokenizers don't split text on whitespace; they split it into subword units learned from training data. The most common character sequences become single tokens, while rarer sequences break into several tokens.

A few examples (GPT-4o's o200k_base tokenizer):
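The splits are easy to reproduce locally. A minimal sketch, assuming OpenAI's tiktoken package is installed and that its o200k_base encoding matches the production GPT-4o tokenizer:

    # pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")  # GPT-4o's encoding

    for word in ["the", "cat", "tokenization", "antidisestablishmentarianism"]:
        ids = enc.encode(word)
        pieces = [enc.decode([i]) for i in ids]
        print(f"{word!r}: {len(ids)} token(s) -> {pieces}")

Short, common words come back as a single token; the long or rare ones come back as several subword pieces.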

Punctuation, leading whitespace, and capitalization all affect tokenization too: "cat" with a leading space (" cat") is a different token than "cat" without one.
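A quick way to see this, under the same tiktoken/o200k_base assumption as above, is to compare the token IDs directly:

    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")

    # Leading whitespace and capitalization change the token IDs.
    for text in ["cat", " cat", "Cat", " Cat"]:
        print(f"{text!r} -> {enc.encode(text)}")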

Practical estimation

Rough rules of thumb for English:

Unit                          Tokens
1 word                        ~1.3
1 sentence                    ~15-25
1 paragraph (~5 sentences)    ~75-125
1 page (~500 words)           ~650
1,000 words                   ~1,300
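When you only need a ballpark figure and can't run a tokenizer, the table's 1.3 tokens-per-word rule can be applied directly. This is a rough sketch for English prose; the multiplier is the assumption, and real counts will drift for code, other languages, or unusual text:

    def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
        """Rough token estimate for English text via the ~1.3 tokens/word rule."""
        return round(len(text.split()) * tokens_per_word)

    # A ~500-word page estimates to roughly 650 tokens.
    print(estimate_tokens("word " * 500))  # -> 650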

For non-English languages, the ratio shifts substantially: most tokenizers are trained on English-heavy data, so text in other languages, particularly those written in non-Latin scripts, generally takes more tokens to express the same content.

Get an exact count

Paste your text into the counter — it shows the exact token count for your input across every supported model. Different models have different tokenizers, so the count varies by 5-20% across providers for the same text.
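Claude's and Gemini's tokenizers aren't exposed through tiktoken, but the provider-to-provider variance is easy to see in miniature by comparing two OpenAI encodings; this is a stand-in illustration, not how the counter itself works:

    import tiktoken

    text = "Tokenizers don't split on whitespace; they split on subword units."

    # GPT-4o's encoding vs. the older GPT-4/GPT-3.5 encoding.
    for name in ["o200k_base", "cl100k_base"]:
        enc = tiktoken.get_encoding(name)
        print(f"{name}: {len(enc.encode(text))} tokens")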

Try the live counter →