What Is a Context Window in AI? (Simple Guide)
Every AI model has a limit on how much text it can consider at once. This limit is called the context window, and it’s one of the most important concepts to understand when working with AI.
What Is the Context Window?
Think of the context window as the model’s desk. Everything the model needs to reference — your conversation history, any instructions, documents you’ve pasted in, and the response it’s generating — must fit on this desk. If something doesn’t fit, it falls off.
The context window is measured in tokens — small chunks of text, typically a few characters or part of a word. It includes both input tokens (what you send) and output tokens (what the model generates). Both count against the same limit.
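Because input and output share one limit, you have to budget for both. Here is a minimal sketch of that check, using the common rule of thumb of roughly four characters per token (the limit and ratio here are illustrative assumptions; real applications should use the model's actual tokenizer for exact counts):

```python
CONTEXT_WINDOW = 128_000  # hypothetical model limit, in tokens

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits(prompt: str, max_output_tokens: int) -> bool:
    """Input and output share the same limit, so budget for both."""
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOW
```

For example, a prompt of a million characters (~250,000 estimated tokens) would not fit in this hypothetical 128,000-token window even before reserving room for the response.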
How Big Is the Desk?
Context windows have grown dramatically:
| Era | Model example | Context window |
|---|---|---|
| 2022 | GPT-3 | ~4,000 tokens (~3,000 words) |
| 2023 | GPT-4 Turbo | 128,000 tokens (~96,000 words) |
| 2025+ | Claude, Gemini | Up to 1,000,000 tokens (~750,000 words) |
Early models could barely hold a long email. Today’s models can process the equivalent of several novels in a single conversation.
The “Working Memory” Distinction
The context window is not the model’s knowledge. The model’s training data is like its education — everything it learned in school. The context window is its working memory — what it’s actively looking at right now.
This means a model can “know” a fact from training but still miss information you provided earlier in the conversation if it’s fallen out of the context window.
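To make the "falling off the desk" behavior concrete, here is a minimal sketch of how a chat interface might drop the oldest messages once a conversation exceeds the window. The token estimate and limit are illustrative assumptions, not any real product's behavior:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def trim_history(messages: list[str], limit: int) -> list[str]:
    """Keep only the most recent messages that fit within the token limit."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):   # walk from newest to oldest
        cost = estimate_tokens(msg)
        if total + cost > limit:
            break                    # older messages fall off the desk
        kept.append(msg)
        total += cost
    return list(reversed(kept))      # restore chronological order
```

Once an early message is trimmed this way, the model genuinely cannot see it, no matter how relevant it was.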
More Isn’t Always Better
You might assume bigger context windows are always better, but researchers have found that performance can degrade as context grows — a phenomenon sometimes called “lost in the middle.” Models tend to pay more attention to information at the beginning and end of their context, potentially overlooking details buried in the middle.
The takeaway: don’t just dump everything into the context window. Curate what you include. Put the most important information where the model is most likely to notice it — at the start or near the end.
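The ordering advice above can be sketched as a simple prompt-assembly function that puts key guidance first and the actual question last, with bulk reference material in the middle. The layout is an illustrative convention, not a documented API:

```python
def build_prompt(instructions: str, documents: list[str], question: str) -> str:
    """Place high-priority text at the start and end, bulk material in the middle."""
    parts = [instructions]      # key guidance first, where attention is strong
    parts.extend(documents)     # supporting documents in the middle
    parts.append(question)      # the actual ask last, near the model's focus
    return "\n\n".join(parts)
```

Reordering costs nothing, and it puts the pieces the model must not miss in the positions where attention is strongest.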
Why It Matters for You
Context windows affect what’s practical. Need to analyze a long document? Check if it fits. Having a long conversation? Earlier messages may get dropped. Using an AI coding assistant? It can only “see” so much of your codebase at once.
Understanding this limit helps you work with AI more effectively — and sets the stage for the next concept: how to control what kind of responses the model gives you.