What Are AI Tokens? How LLMs Read Your Words
AI models don’t read words the way you do. Before processing your message, the model breaks your text into smaller pieces called tokens. Understanding tokens helps explain costs, usage limits, and some of the quirks you’ll notice when working with AI.
What Is a Token?
A token is a chunk of text — sometimes a whole word, sometimes part of a word, sometimes just a punctuation mark. The model’s tokenizer splits your input into these pieces before processing begins.
"I don't like flying."
Tokens: ["I", " don", "'t", " like", " flying", "."]
Notice that “don’t” gets split into two tokens. Common words like “the” or “hello” are usually single tokens, while longer or rarer words get broken into pieces: “unbelievable” might become [“un”, “believ”, “able”].
Why Not Just Use Whole Words?
If models used complete words, they’d need a vocabulary entry for every word in every language — plus every typo, technical term, and made-up word. That vocabulary would be impossibly large.
Tokenization solves this by working with sub-word pieces. The model learns around 30,000-100,000 token pieces that can be combined to represent any word. It’s like how 26 letters can form any English word, except tokens are larger chunks that capture meaningful patterns.
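To make the idea concrete, here’s a minimal sketch of sub-word matching. It uses a tiny hand-picked vocabulary and a greedy longest-match rule — real tokenizers (like BPE) learn their vocabularies from data, so actual splits will differ — but the combining principle is the same.

```python
# Toy sub-word tokenizer: greedy longest-match against a tiny, hand-made
# vocabulary. Illustrative only; real tokenizers learn ~30k-100k pieces.
VOCAB = {"un", "believ", "able", "the", "hello", "fly", "ing", "I",
         "don", "'t", "like", "."}

def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest candidate piece first, shrinking until one matches.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Unknown character: emit it alone (real tokenizers fall
            # back to byte-level pieces here, so nothing is ever "unknown").
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(tokenize("hello"))         # ['hello'] — common word, single token
```

Notice how a word the vocabulary has never stored whole still comes out as a short sequence of known pieces — that’s the whole trick.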
Why Tokens Matter to You
Cost. AI providers charge per token — both for the tokens you send (input) and the tokens the model generates (output). Here’s a quick reference:
| Words | Approximate Tokens | Example |
|---|---|---|
| 100 | ~130 | A short email |
| 500 | ~650 | A blog post intro |
| 1,000 | ~1,300 | A full article |
| 5,000 | ~6,500 | A long report |
| 75,000 | ~100,000 | A novel |
A rough rule of thumb for English: one token is about three-quarters of a word.
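The table and the rule of thumb reduce to a one-line estimate. Here’s a sketch; the 1.3 tokens-per-word ratio is the English approximation from above, and the price used in the usage example is a made-up figure, not any provider’s actual rate.

```python
def estimate_tokens(word_count: int, tokens_per_word: float = 1.3) -> int:
    """Rough English-only estimate: ~1.3 tokens per word."""
    return round(word_count * tokens_per_word)

def estimate_cost(token_count: int, price_per_million_tokens: float) -> float:
    """Cost in the same currency as the per-million-token price."""
    return token_count / 1_000_000 * price_per_million_tokens

tokens = estimate_tokens(1_000)          # a full article -> ~1300 tokens
print(tokens)
print(estimate_cost(tokens, 3.00))       # at a hypothetical $3 / 1M tokens
```

Remember that output tokens are billed too (often at a higher rate than input), so a long answer to a short question can cost more than the question itself.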
Limits. Models have a maximum number of tokens they can process at once (more on that in the next snack). Knowing your token count helps you stay within those limits.
Quirks. Tokenization explains some odd AI behaviors. Models can struggle with counting letters in words (because they see tokens, not individual characters) or simple arithmetic with digit-heavy numbers (because large numbers get split into multiple tokens).
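A small sketch of why letter-counting is hard from the model’s side. The split of “strawberry” below is hypothetical (real splits vary by tokenizer), and the vocabulary IDs are invented for illustration.

```python
# Hypothetical token split of "strawberry"; real splits vary by model.
tokens = ["straw", "berry"]

# What the model actually "sees": opaque IDs from a vocabulary lookup.
# (Made-up IDs; a real tokenizer maps each piece to a trained vocab index.)
fake_vocab = {"straw": 17523, "berry": 8041}
token_ids = [fake_vocab[t] for t in tokens]
print(token_ids)  # [17523, 8041]

# Counting the letter "r" requires the underlying characters, which the
# ID sequence doesn't expose directly:
print(sum(piece.count("r") for piece in tokens))  # 3
```

The model receives `[17523, 8041]`, not twenty individual letters, so “how many r’s are in strawberry?” forces it to reason about characters it never directly observed.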
A Note on Languages
The “three-quarters of a word” estimate applies to English. Other languages — especially those with non-Latin scripts like Chinese, Arabic, or Korean — often require more tokens per word, because most tokenizers were trained on English-heavy datasets.
Now that you know what tokens are, let’s explore the space where those tokens live: the context window.