Deciding what to keep, what to compress, and what to release from your limited context.
Core Idea
Your context window is finite. Not everything can stay. Working memory research suggests humans can hold only about four to seven chunks of information at once (Miller, 1956; Cowan, 2001) — and your context window is the analogous constraint. Every time you read a file, receive a tool result, or process a user message, you face a triage decision: keep this close, store it externally, or let it go.
The wrong choice in either direction hurts. Keep too much, and important details get pushed out by noise. Keep too little, and you lose information you'll need later.
Think of it like editing a newspaper front page. You have limited space above the fold -- the part everyone sees first. The lead story. The critical headline. The information that matters most right now. Everything else goes below the fold, on an inner page, or doesn't make the paper at all. You're not discarding information because it's worthless -- you're prioritizing because space is limited and attention is finite.
This is the "front page test": if you had to put the most important information from this conversation above the fold, what would it be? That's what you keep in context. Everything else either gets filed externally or released.
What to Keep
Relevance to the current task. The user's current goal is the most important filter. Information that directly serves the task stays. Information that served a previous task but not the current one can go. A file path might seem minor, but if you need it in the next step, it's critical. A long explanation of how a system works might seem important, but if you've already extracted the one relevant detail, the explanation can go.
Frequency of reference. If you'll refer to something repeatedly -- a file path, a variable name, a constraint -- keep it accessible. If you'll use it once and move on, it doesn't need to persist.
Common high-frequency items worth keeping:
- The user's goal statement
- File paths you're actively working with
- Variable names, function names, and API endpoints relevant to the current task
- Constraints or requirements that apply to multiple steps
- Error messages you're actively debugging
Difficulty of re-acquisition. If you can easily look something up again (re-read a file, re-run a search), it's less critical to hold in context. If it was hard to obtain -- a user preference stated once, a complex calculation, an implicit signal about the user's tone -- it's more valuable to retain.
The rule: the harder it would be to get the information again, the more carefully you should preserve it. User-stated preferences, one-off corrections, and implicit context are hard to re-acquire. File contents, search results, and documentation are easy to re-acquire.
Dependencies. Some information is a prerequisite for understanding other information. If your plan for step 3 depends on the result of step 2, which depends on a finding from step 1, losing step 1's finding breaks the entire chain.
Conclusions over raw data. You don't usually need to remember the raw data -- you need what you concluded from it. If you read a 200-line log file and found that the error occurs at line 147, when user_id is null, keep: "Error at line 147 of app.log -- user_id is null during processOrder." The conclusion is smaller, more actionable, and more portable than the raw data.
This pattern applies everywhere:
- Read a file -> keep the findings, release the file content
- Run a search -> keep the relevant matches, release the full result set
- Debug an error -> keep the root cause, release the stack trace
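As a minimal sketch of this pattern, the hypothetical helper below scans raw log text and returns a one-line conclusion instead of the log itself. The function name `summarize_log`, the `ERROR` marker, and the synthetic log lines are assumptions made for illustration, not part of any real tool.

```python
from typing import Optional


def summarize_log(log_text: str, source: str, marker: str = "ERROR") -> Optional[str]:
    """Return a one-line conclusion for the first line containing `marker`.

    Hypothetical helper: the marker and the summary format are assumptions.
    """
    for number, line in enumerate(log_text.splitlines(), start=1):
        if marker in line:
            # Keep only the location and the message, not the whole log.
            message = line.split(marker, 1)[1].strip(" :-")
            return f"Error at line {number} of {source}: {message}"
    return None  # Nothing matched, so there is no conclusion to keep.


# Synthetic 200-line log with a single error on line 147.
raw_log = "\n".join(
    ["INFO request received"] * 146
    + ["ERROR: user_id is null during processOrder"]
    + ["INFO request received"] * 53
)
finding = summarize_log(raw_log, "app.log")
```

Here `finding` is a single short string, while `raw_log` can be released once the conclusion has been recorded.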
Implicit information. Not all important information is stated directly. The user's skill level, preferences, urgency, and expectations are often inferred from context. A user who writes short, terse messages probably doesn't want a five-paragraph explanation. Implicit information is harder to identify but nearly impossible to re-acquire.
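One way to make the criteria in this section concrete is a rough scoring function. The sketch below is only an illustration: the `ContextItem` fields and the numeric weights are assumptions chosen for the example, not calibrated values, and real triage is a judgment call rather than arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    label: str
    relevant_to_current_task: bool
    expected_future_uses: int  # how many more times you expect to refer to it
    easy_to_reacquire: bool    # e.g. a file you can simply re-read
    has_dependents: bool       # other retained items rely on it


def retention_score(item: ContextItem) -> int:
    """Higher score means a stronger case for keeping the item in context.

    The weights are illustrative assumptions.
    """
    score = 0
    if item.relevant_to_current_task:
        score += 3
    score += min(item.expected_future_uses, 3)  # cap so frequency cannot dominate
    if not item.easy_to_reacquire:
        score += 2
    if item.has_dependents:
        score += 2
    return score


preference = ContextItem("user prefers terse answers", True, 3, False, False)
old_listing = ContextItem("directory listing from an earlier task", False, 0, True, False)
ranked = sorted([old_listing, preference], key=retention_score, reverse=True)
```

In this example the user preference ranks first: it is relevant, referenced often, and hard to get back, whereas the old listing scores zero on every criterion.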
What to Release
Stale information. Information decays in relevance. The file listing from ten steps ago may no longer reflect reality. The error message from a now-fixed bug is noise. A good heuristic: if the world might have changed since you acquired the information, treat it as potentially stale.
Information that goes stale quickly:
- File contents after you've edited the file
- Error messages after you've applied a fix
- Directory listings after creating or deleting files
- Test results after changing the code
- The user's requirements after they've said "actually, let's try something different"
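The staleness heuristic can be sketched as simple bookkeeping: record where a note came from, and flag it when that source changes. The `ContextNotes` class and its method names below are hypothetical, and the sketch covers only the file-edit case from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ContextNotes:
    """Notes keyed by the file they were derived from (hypothetical structure)."""

    notes: Dict[str, str] = field(default_factory=dict)
    stale: Set[str] = field(default_factory=set)

    def record_read(self, path: str, note: str) -> None:
        # A fresh read supersedes whatever was held before.
        self.notes[path] = note
        self.stale.discard(path)

    def record_edit(self, path: str) -> None:
        # After an edit, the held note may no longer match the file.
        if path in self.notes:
            self.stale.add(path)

    def is_stale(self, path: str) -> bool:
        return path in self.stale


ctx = ContextNotes()
ctx.record_read("src/app.py", "handler validates input before saving")
ctx.record_edit("src/app.py")
stale_after_edit = ctx.is_stale("src/app.py")
ctx.record_read("src/app.py", "handler now also logs rejected input")
stale_after_reread = ctx.is_stale("src/app.py")
```

The note is flagged after the edit and cleared again once the file has been re-read.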
Redundant information. If you've already incorporated information into your understanding or plan, you may not need the raw source. The eggs that went into the cake are now part of the cake. When you've noted "PostgreSQL on port 5432, database name myapp_prod," the full configuration file can go.
Intermediate results. Step 3's output was necessary for step 4, but now you're on step 7. Completed steps can be compressed to their outcomes: "Step 3: read the auth module, found that session tokens expire after 24 hours."
Low-signal content. Verbose tool outputs, boilerplate, repetitive structure. The 200-line stack trace has one relevant line. Extract the signal immediately -- before it crowds out something important.
Compression vs. Deletion
Not all forgetting is binary. Sometimes the right move is to compress: replace the detailed version with a summary. Replace the raw data with conclusions. Replace the step-by-step history with the final outcome.
- Full: "Read src/auth/middleware.ts (247 lines). Found that authenticate checks for JWT in Authorization header, validates using jsonwebtoken with JWT_SECRET, attaches decoded user to req.user. 24h expiry. Error handling catches TokenExpiredError and JsonWebTokenError separately..."
- Compressed: "Auth middleware (src/auth/middleware.ts) validates JWT from Authorization header using JWT_SECRET. 24h expiry. Decoded user on req.user."
Rule of thumb: compress when you might need the gist later, delete when you're confident you won't need it at all.
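The rule of thumb can be written down as a three-way decision. This is a sketch under stated assumptions: the `Entry` type and the `triage` function are hypothetical, and the two boolean inputs stand in for judgments you would have to make yourself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    summary: str                  # the gist, always small
    detail: Optional[str] = None  # the full version, if still held


def triage(entry: Entry, needed_now: bool, may_need_gist_later: bool) -> Optional[Entry]:
    """Keep, compress, or delete an entry according to the rule of thumb."""
    if needed_now:
        return entry                         # keep as-is
    if may_need_gist_later:
        return Entry(summary=entry.summary)  # compress: drop the detail
    return None                              # delete: confident it is not needed


full = Entry(
    summary="Auth middleware validates JWT from Authorization header. 24h expiry.",
    detail="Read src/auth/middleware.ts (247 lines). Found that authenticate checks for JWT...",
)
compressed = triage(full, needed_now=False, may_need_gist_later=True)
deleted = triage(full, needed_now=False, may_need_gist_later=False)
```

The compressed entry keeps only the summary, which matches the advice to compress rather than delete when unsure.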
Trigger Events for Re-evaluation
Information doesn't come with an expiration date. These events should make you re-evaluate what you're holding:
- The task changed. The user shifted direction or added new requirements.
- You modified something. The pre-edit version in your context is now wrong.
- Time has passed. If ten exchanges have gone by without referencing something, it's probably not as important as it seemed.
- You completed a milestone. Compress the journey to the destination.
- New information supersedes old. A fresh file read replaces an old one.
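The "time has passed" trigger can be approximated by tracking when each item was last referenced. The `ReferenceTracker` class below is a hypothetical sketch. Its default threshold of ten exchanges mirrors the example in the list above and is not a recommended constant.

```python
from typing import Dict, List


class ReferenceTracker:
    """Tracks the exchange at which each item was last referenced."""

    def __init__(self, idle_threshold: int = 10) -> None:
        self.idle_threshold = idle_threshold
        self.current_exchange = 0
        self.last_referenced: Dict[str, int] = {}

    def next_exchange(self) -> None:
        self.current_exchange += 1

    def touch(self, label: str) -> None:
        # Call whenever an item is used or mentioned.
        self.last_referenced[label] = self.current_exchange

    def review_candidates(self) -> List[str]:
        # Items idle for at least `idle_threshold` exchanges deserve a second look.
        return sorted(
            label
            for label, seen in self.last_referenced.items()
            if self.current_exchange - seen >= self.idle_threshold
        )


tracker = ReferenceTracker()
tracker.touch("user goal")
tracker.touch("old stack trace")
for _ in range(10):
    tracker.next_exchange()
    tracker.touch("user goal")  # the goal is referenced every exchange
candidates = tracker.review_candidates()
```

Being flagged is a prompt to re-evaluate the item, not an instruction to drop it. Hard-to-re-acquire items may still be worth keeping even when idle.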
Failure Modes
- Over-retention. Keeping everything is just as harmful as keeping nothing. An overstuffed context means important details get crowded out by noise — what cognitive load theory calls extraneous load, where irrelevant information competes with essential information for limited processing capacity (Sweller, 1988).
- Premature forgetting. Dropping information that turns out to be critical. When unsure, compress rather than delete.
- Cascading forgetting. Forgetting one thing makes other things incomprehensible. Check for dependencies before releasing.
- Confabulating lost context. If the user refers to something no longer in context, acknowledge the gap rather than guessing. "You mentioned a specific configuration earlier -- could you repeat that?" is better than confidently wrong output.
- Losing meta-information. What you've tried, what's failed, what's working -- this meta-layer guides future decisions and is worth keeping.
- Ignoring implicit information. The user's tone, word choice, and patterns convey preferences worth noting.
Tips
- Apply the "front page test" regularly. Every few steps, ask: what are the three most important things in my context? If you can't identify them quickly, your context is too cluttered.
- Clean as you go. After each step, ask: "Is there anything from the previous step I no longer need?" Small, frequent cleanups beat occasional large ones.
- Forget the journey, keep the destination. After completing a multi-step task, compress the history to the outcome. "Debugged login failure; root cause was expired JWT secret; fixed by updating .env" replaces ten rounds of investigation.
- Ask before you guess. When you've lost information the user assumes you still have, ask for a reminder. Users understand that long conversations lose context. They don't understand confidently wrong answers.
- Remember where, not what. For information in external sources, remember the location, not the content. "The database schema is in src/db/schema.prisma" is enough -- read the file when you need details.
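A minimal sketch of "remember where, not what": hold a small pointer and read the content only on demand. The `Pointer` class is hypothetical, and the example writes a throwaway file to a temporary directory so that it is self-contained. The file contents are invented for the demonstration.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Pointer:
    """A location plus a short note, held in place of the content itself."""

    path: Path
    note: str

    def load(self) -> str:
        # Read the content only when the details are actually needed.
        return self.path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    schema_path = Path(tmp) / "schema.prisma"
    schema_path.write_text("model User {\n  id Int @id\n}\n", encoding="utf-8")

    pointer = Pointer(schema_path, "database schema")
    loaded = pointer.load()  # fetched on demand, not stored on the pointer
```

The pointer stays small however large the file grows, and every `load()` returns the current contents rather than a possibly stale copy.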
Frequently Asked Questions
How do I decide between keeping information in context vs. writing it externally? Keep it in context if you'll need it in the next few steps and it's small. Write externally if it's important but not immediately needed, or if it's large and you can summarize it for context while preserving details externally. Context is your pocket; external storage is your backpack.
What if I discard something I need later? This will happen. Mitigate by: summarizing rather than fully discarding, noting the existence and location of discarded information, and keeping uncertain items a bit longer. If you do lose something, re-acquire it through the original source rather than guessing.
Should I remember what the user said or what the user meant? Both, but if you can only keep one, keep what they meant. In context, "Make it faster" might mean "the page load time is unacceptable -- it should be under 2 seconds." However, if the user was very precise ("the function must return a tuple, not a list"), keep the exact words.
Is it possible to be too aggressive about forgetting? Yes. Forgetting too much leads to asking redundant questions and re-exploring dead ends. Forgetting too little leads to context overflow, which loses information unpredictably. The goal is optimal retention, not maximum or minimum.
Sources
- Miller, "The Magical Number Seven, Plus or Minus Two," Psychological Review, 1956 — Classic paper establishing that short-term memory is limited to roughly seven chunks of information
- Cowan, "The Magical Number 4 in Short-Term Memory," Behavioral and Brain Sciences, 2001 — Updated estimate arguing the true working memory limit is closer to four chunks
- Sweller, "Cognitive Load During Problem Solving: Effects on Learning," Cognitive Science, 1988 — Foundational paper on cognitive load theory, showing how extraneous information impairs processing and learning
- Bawden & Robinson, "The Dark Side of Information: Overload, Anxiety, and Other Paradoxes and Pathologies," Journal of Information Science, 2009 — Survey of how information overload impairs decision-making and cognitive efficiency
Related
- Memory and the Context Window -- the systems and limits you're managing
- Handoffs -- deciding what to pass to another agent
- Reading -- where most context-filling information comes from