How Many Tokens?


Llama 3.3 70B: token counter & pricing

Meta · exact (uses official tokenizer) · pricing as of 2026-05-31.

Provider: Meta
API model ID: meta-llama/Llama-3.3-70B-Instruct-Turbo
Context window: 128,000 tokens
Input price: $0.88 per 1M tokens
Output price: $0.88 per 1M tokens
Tokenizer accuracy: exact (uses official tokenizer)
Pricing as of: 2026-05-31

Open the counter to count tokens for Llama 3.3 70B in real time.

What is Llama 3.3 70B?

Llama 3.3 70B is Meta's current flagship 70B-class open-weights model, as advertised on Together.ai's pricing page. It uses the same tokenizer family as Llama 3.1, with improved instruction following and reasoning. Pricing is single-rate at $0.88 per 1M tokens: input and output cost the same, a common pattern among open-weights inference providers.

How tokens are counted here

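Llama 3.x uses a tiktoken-style byte-level BPE tokenizer with a vocabulary of about 128K tokens (Llama 2 and earlier used SentencePiece). We approximate in your browser using a family-tuned heuristic, accurate to within roughly ±3% of the reference tokenizer for typical English text. These estimates are marked ≈±3% in the results table.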

For exact counts, run AutoTokenizer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct") locally on your text.
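A minimal sketch of that local count, assuming the Hugging Face `transformers` package is installed and that your account has been granted access to the gated Meta repository. The helper names `load_llama_tokenizer` and `count_tokens` are illustrative and not part of any library.

```python
def load_llama_tokenizer(model_id="meta-llama/Llama-3.3-70B-Instruct"):
    """Download (or load from the local cache) the reference tokenizer."""
    # Imported lazily so count_tokens can be used with any tokenizer object.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id)


def count_tokens(text, tokenizer, add_special_tokens=False):
    """Return how many token IDs the tokenizer produces for `text`."""
    return len(tokenizer.encode(text, add_special_tokens=add_special_tokens))


# Usage (needs network access and an accepted license on the model page):
#   tok = load_llama_tokenizer()
#   print(count_tokens("How many tokens is this sentence?", tok))
```

With `add_special_tokens=False` the count covers only the text itself. A real chat request also carries template and special tokens, so the count a provider bills for will likely be somewhat higher.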

Pricing notes

$0.88 input / $0.88 output per 1M tokens (Together.ai indicative pricing; verify with your actual provider).

Single-rate pricing means output-heavy workloads (long generation) cost the same per token as input-heavy workloads (long RAG context). That's structurally different from OpenAI / Anthropic, where output typically costs 4-10× input.

For a call with 1,000 input and 200 output tokens: 1,200 × $0.88 / 1M = $0.001056 per call (about $0.00106), or $1,056 per 1M calls.
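The per-call arithmetic can be sketched as follows. The function name `cost_per_call` and its parameters are illustrative. The $0.88 rate is the indicative Together.ai figure quoted on this page, so substitute your provider's actual prices.

```python
from decimal import Decimal

# USD per 1M tokens; indicative rate from this page, identical for input and output.
PRICE_PER_MILLION = Decimal("0.88")


def cost_per_call(input_tokens, output_tokens,
                  input_price=PRICE_PER_MILLION,
                  output_price=PRICE_PER_MILLION):
    """Return the USD cost of one call, given per-1M-token prices."""
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / Decimal(1_000_000)


per_call = cost_per_call(1_000, 200)
per_million_calls = per_call * 1_000_000
print(f"per call: ${per_call}, per 1M calls: ${per_million_calls:,.2f}")
```

`Decimal` is used rather than floats so that small per-call amounts do not pick up binary rounding noise. Passing different `input_price` and `output_price` values lets the same function model providers that charge more for output.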

When to use Llama 3.3 70B

When not to use it:

Common questions

Llama 3.3 70B vs Llama 3.1 70B?

3.3 is Together's current advertised flagship Meta model. 3.1 70B is no longer prominently listed on Together's pricing page (kept in this catalog for SEO and for users who still call it on other providers). For new work, default to 3.3.

Llama 3.3 vs Qwen 3 Coder 480B for code?

Qwen 3 Coder 480B ($2 input / $2 output on Together) is specifically tuned for code generation. It outperforms Llama 3.3 on coding benchmarks but is more expensive. For general-purpose work, Llama 3.3 is the better default.

Where's Llama 4?

Llama 4 is not listed on Together's published pricing page as of April 2026. We don't ship Llama 4 entries, to avoid pretending it's available where it isn't. If your provider offers Llama 4, the tokenizer is similar enough to 3.x that this counter's estimate will be close, but verify against your provider's reported token counts.
