How Many Tokens?

How many tokens is a word?

The short answer

In typical English text, 1 word ≈ 1.3 tokens across most modern AI tokenizers (GPT-4o, Claude, Gemini). Or, conversely, 1,000 tokens ≈ 750 words.

Single short common words ("the", "and", "cat") are usually 1 token. Longer or less common words ("antidisestablishmentarianism", "tokenization") split into 2-5 pieces.

Why words don't map cleanly to tokens

Tokenizers don't split on whitespace; they split on subword units learned from training data. The most common letter sequences become single tokens, while rare sequences split into several tokens.

A few examples (GPT-4o's o200k_base tokenizer):

Punctuation, leading whitespace, and capitalization all affect tokenization too. " cat" (with a leading space) is a different token than "cat".
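
To make the subword idea concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are invented for illustration; this is not the o200k_base vocabulary, and real tokenizers such as o200k_base use byte-pair encoding with learned merges rather than this simplified matching. It only shows why a leading space or a capital letter can change the result.

```python
# Toy vocabulary, made up for this example. It is NOT a real model vocabulary.
TOY_VOCAB = {"the", " the", "cat", " cat", "token", " token", "ization"}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Split text greedily into the longest pieces found in vocab.

    Characters that no vocab entry covers fall back to one-character tokens.
    """
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens


print(toy_tokenize("the cat"))       # ['the', ' cat']
print(toy_tokenize("tokenization"))  # ['token', 'ization']
print(toy_tokenize("Cat"))           # ['C', 'a', 't']
```

In this toy setup "Cat" falls apart into single characters because only the lowercase form is in the vocabulary, and " cat" is matched as its own entry, separate from "cat".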

Practical estimation

Rough rules of thumb for English:

| Unit | Tokens |
|---|---|
| 1 word | ~1.3 |
| 1 sentence | ~15-25 |
| 1 paragraph (~5 sentences) | ~75-125 |
| 1 page (~500 words) | ~650 |
| 1,000 words | ~1,300 |
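
The rules of thumb above reduce to a single multiplication. The following is a rough sketch, assuming whitespace-separated words and the 1.3 ratio for English; the names `TOKENS_PER_WORD`, `estimate_tokens`, and `estimate_words` are made up for this example, and the result is an estimate, not an exact count.

```python
TOKENS_PER_WORD = 1.3  # rule-of-thumb ratio for typical English text


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from its word count."""
    word_count = len(text.split())  # words = whitespace-separated chunks
    return round(word_count * TOKENS_PER_WORD)


def estimate_words(token_count: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(token_count / TOKENS_PER_WORD)


page = " ".join(["word"] * 500)  # stand-in for a 500-word page
print(estimate_tokens(page))     # 650
print(estimate_words(1000))      # 769
```

The second result, 769, is the unrounded version of the "1,000 tokens ≈ 750 words" figure; both are approximations of the same ratio.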

For non-English languages, the ratio shifts substantially:

Get an exact count

Paste your text into the counter to see the exact token count for your input across every supported model. Different models use different tokenizers, so the count for the same text varies by 5-20% across providers.
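
If you collect counts for the same text from several models, the variation can be expressed as a percentage. This is a small sketch under one assumption: the spread is measured relative to the lowest count. The function `spread_percent` and the model names and numbers below are hypothetical placeholders, not measured values.

```python
def spread_percent(counts: dict[str, int]) -> float:
    """Return how far the highest count exceeds the lowest, in percent."""
    lowest = min(counts.values())
    highest = max(counts.values())
    return (highest - lowest) / lowest * 100


# Hypothetical counts for one text; replace with real counter output.
example_counts = {"model_a": 1000, "model_b": 1080, "model_c": 1150}
print(f"{spread_percent(example_counts):.1f}%")  # 15.0%
```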
