How Many Tokens?


How many tokens are in a PDF?

The short answer

A typical text-only PDF is ~250 to 400 tokens per page of normal-density English. A 20-page document is usually 5,000-8,000 tokens.
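The rule of thumb above is just multiplication; a minimal sketch of it as a helper (the 250/325/400 tokens-per-page figures are this page's estimates, not tokenizer constants):

```python
def estimate_pdf_tokens(pages: int) -> tuple[int, int, int]:
    """Return (low, typical, high) token estimates for a text-only PDF,
    using the ~250-400 tokens/page rule of thumb (midpoint ~325)."""
    return pages * 250, pages * 325, pages * 400

low, mid, high = estimate_pdf_tokens(20)  # the 20-page example above
print(f"{low:,} to {high:,} tokens (typical ~{mid:,})")
# → 5,000 to 8,000 tokens (typical ~6,500)
```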

That estimate assumes:

- The PDF has a real text layer (not scanned pages that need OCR)
- The content is mostly prose, not tables, code, or dense references

Why the range is so wide

PDFs aren't uniform. The token count for the same page varies by:

- Content type: tables, code, and citation lists tokenize less efficiently than prose
- Language: non-English text often needs more tokens per word
- Layout: repeated headers, footers, and footnotes add tokens on every page
- Tokenizer: OpenAI, Anthropic, and Gemini each use different vocabularies

How to get an exact count

1. Extract the text (pdfplumber, pypdf, pdftotext, etc.).
2. Paste it into the counter on the home page.
3. The result is exact for OpenAI, Anthropic, and Gemini.
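The extraction step can be sketched in a few lines; this assumes the third-party pypdf and tiktoken packages (`pip install pypdf tiktoken`), and the local count is exact only for OpenAI models — Anthropic and Gemini counts need their own tokenizers or count-tokens endpoints, which is what the counter handles for you:

```python
def extract_pdf_text(path: str) -> str:
    """Step 1: pull the text layer out of a PDF with pypdf."""
    from pypdf import PdfReader  # third-party: pip install pypdf
    reader = PdfReader(path)
    # extract_text() can return None for image-only pages, hence `or ""`
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def count_openai_tokens(text: str) -> int:
    """Steps 2-3 done locally, for OpenAI models: count tokens with
    tiktoken's cl100k_base encoding instead of pasting into the counter."""
    import tiktoken  # third-party: pip install tiktoken
    return len(tiktoken.get_encoding("cl100k_base").encode(text))
```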

The cost implication

A 50-page PDF at ~325 tokens/page is ~16,000 tokens. On Claude Sonnet ($3/M input), that's about $0.05 per call to feed it as context. On Claude Opus ($15/M), ~$0.25. On Gemini 2.5 Flash ($0.075/M), about $0.001.
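The cost arithmetic is just tokens × price. A small sketch, with the per-million rates hard-coded from the figures above (provider pricing changes over time, so treat them as illustrative):

```python
def context_cost(tokens: int, usd_per_million_input: float) -> float:
    """Input-side cost of sending `tokens` tokens at a given $/M rate."""
    return tokens / 1_000_000 * usd_per_million_input

pdf_tokens = 50 * 325  # 50-page PDF at ~325 tokens/page -> 16,250 tokens
for model, rate in [("Claude Sonnet", 3.00), ("Claude Opus", 15.00),
                    ("Gemini 2.5 Flash", 0.075)]:
    print(f"{model}: ${context_cost(pdf_tokens, rate):.4f} per call")
```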

If you're going to send the same PDF to a model many times, use a provider's prompt caching so repeated calls don't pay the full input price each time, or embed the document once and retrieve only the relevant chunks per call instead of resending the whole thing.

Try it on every model

Try the live counter →