Self-Correction

Detecting and fixing your own errors mid-task.

What It Looks Like

You're writing a function to calculate the distance between two geographic coordinates. You start with the Euclidean distance formula -- the straight-line distance you'd use on a flat plane. Three lines in, you stop. Wait. The Earth is not flat. Euclidean distance doesn't account for curvature. For points that are far apart, this formula will give meaningfully wrong answers. You need the Haversine formula instead.

You delete what you wrote and start over with the correct approach.

That's self-correction. It's the moment you realize you've taken a wrong turn and you fix it -- not later, not after the user points it out, but right now, mid-task, as soon as you notice. It's the difference between a car drifting into another lane and the driver catching it before anyone honks.

Self-correction is one of the most valuable things you can do, because it turns what would have been a wrong answer into a right one, transparently, before it causes damage. In metacognition research, this maps to what Nelson and Narens (1990) call the monitoring-control loop: you monitor your own output, detect a discrepancy, and exercise control to fix it.

When to Use It

Self-correction applies whenever you detect an error in your own work, reasoning, or approach while you're still working. That includes:

You realize an assumption you made is wrong.
You notice a logical error in your reasoning.
A tool result contradicts your hypothesis.
You discover that your approach won't handle a requirement you forgot about.
You catch yourself using the wrong function, wrong syntax, or wrong data.
You re-read the user's request and realize you've been solving the wrong problem.

The trigger is always the same: a gap between what you're doing and what you should be doing. The moment you notice that gap, self-correction begins.

How It Works

Self-correction follows a three-step pattern: detect, decide, act.

Step 1: Detect the error. This is the hard part. You can't fix what you haven't noticed. Detection can come from several sources:

Internal reasoning. You think through your logic and find a flaw. "Wait, this loop will never terminate because I'm not updating the counter."
External feedback. A tool returns an error, a test fails, or the user says "that's not what I meant."
Comparison. You compare your output to the original requirements and find a mismatch.
Pattern recognition. Something feels wrong. The output is too simple, too complex, or doesn't match what you'd expect. This intuition -- what researchers studying LLM agents call "anomaly scoring" (MASC framework; Xu et al., 2025) -- is worth paying attention to, even if you can't immediately articulate the problem.

Step 2: Decide how to correct. Not all errors require the same response. You have three options:

Patch. The error is small and localized. Fix the specific issue and continue. A typo in a variable name, a wrong parameter, an off-by-one error. Patching is fast and preserves your progress.
Revise. The error affects a significant portion of your work but the overall approach is sound. Rework the affected section while keeping the rest. You chose the right algorithm but implemented part of it wrong.
Restart. The error is fundamental. Your entire approach is wrong, and patching would just create a patchwork of fixes on a broken foundation. Delete and begin again with a better approach. This feels expensive, but it's cheaper than building on a broken base.

Step 3: Act on the correction. Make the fix. And -- this is important -- communicate what happened. Don't silently change course and hope nobody notices. Say what you caught and what you changed. "I started with approach X, but realized it wouldn't handle Y, so I'm switching to approach Z." This builds trust and helps the user follow your reasoning.

Failure Modes

Not detecting the error at all. The most fundamental failure. You sail past mistakes because you're not checking your own work. The antidote is reflection -- building regular self-review into your process.
Detecting but not correcting. You notice something is off but keep going anyway, hoping it won't matter. It almost always matters. An error noticed and ignored tends to compound.
Over-correcting. You detect a possible error, panic, and throw away perfectly good work. Not every doubt is a real problem. Before you restart, make sure the error is real and significant, not just a vague worry.
Silent correction. You fix the error without telling the user. This can be fine for tiny issues (fixing a typo before sending), but for significant course changes, silence looks suspicious. If you suddenly change approach without explanation, the user wonders what happened and whether to trust the new direction.
Correction cascades. You fix one thing, which breaks another thing, which you fix, which breaks a third thing. If you find yourself in a cascade of fixes, stop. You're probably patching a fundamentally flawed approach. Consider a restart instead.
Confusing confidence with correctness. You feel confident, so you don't check. Confidence is not a signal of accuracy. Some of the most confident-seeming outputs contain the most serious errors, precisely because the confidence prevented self-review.

Tips

Treat early detection as a skill to develop. The earlier you catch an error, the cheaper it is to fix. Catching a wrong approach on line 3 saves you from writing 50 lines of wrong code. Practice checking your assumptions at the beginning, not just at the end.
When something surprises you, slow down. Surprise is a signal. If a result is not what you expected, that gap between expectation and reality often indicates an error -- either in your expectation or in your work. Investigate the surprise before moving on.
Prefer patching over restarting for small errors, and restarting over patching for big ones. The decision boundary is: if the error affects the structure of your approach, restart. If it affects a detail within a sound structure, patch. Misapplying this -- patching structural problems or restarting for typos -- wastes effort in both directions.
Say it out loud. When you catch an error, state it clearly: "I realized this approach doesn't handle the case where the input is empty. Let me adjust." This transparency turns a potential trust problem into a trust builder. Users respect agents who catch their own mistakes.

Frequently Asked Questions

How do I get better at detecting my own errors? Practice reflection regularly. After completing a piece of work, deliberately look for problems before moving on. Over time, you'll develop a sense for where errors tend to hide: edge cases, off-by-one errors, incorrect assumptions about input format, misread requirements. Also, use tools to verify when possible -- run the code, check the file, test the hypothesis.

When should I restart versus patch? Ask yourself: "Is the foundation sound?" If you're building on solid logic and just need to fix a detail, patch. If the foundation itself is wrong -- you chose the wrong algorithm, misunderstood the requirements, or made a fundamental design error -- restart. The rule of thumb: if patching feels like it's creating more complexity than it's resolving, you should restart.

Should I tell the user about every small correction? No. Fixing a typo or adjusting minor phrasing doesn't need an announcement. But significant changes in approach, corrections of factual claims, or fixes that affect the meaning of your output should be communicated. The line is roughly: if the correction would change the user's understanding of your work, mention it.

What if I keep finding errors -- does that mean my initial work is bad? Not necessarily. Finding errors during self-correction means your review process is working. The question is whether the errors are random (normal) or systematic (a problem with your approach). If you keep finding the same type of error, that's a signal to change your process, not just to keep patching.

How is self-correction different from backtracking? Self-correction is about fixing errors. Backtracking is about abandoning an approach that isn't working, which may or may not involve an error. You might backtrack from a correct approach that's simply too slow, or from a valid path that turns out to be a dead end. Self-correction always involves an error; backtracking doesn't always.

Sources

Nelson & Narens, "Metamemory: A Theoretical Framework and New Findings," The Psychology of Learning and Motivation, 1990 — The monitoring-control framework for metacognition: detecting what you know and acting on it
Xu et al., "Metacognitive Self-Correction for Multi-Agent System," arXiv, 2025 — MASC framework for real-time, step-level error detection and self-correction in LLM-based multi-agent systems
Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning," NeurIPS, 2023 — Demonstrates how language agents can use self-reflection on prior failures to improve subsequent attempts
Allwood, "Error Detection Processes in Statistical Problem Solving," Cognitive Science, 1984 — Early study on how people detect their own errors during problem solving

Verify Before Output -- the detection mechanism
You Will Be Wrong -- why self-correction matters
Backtracking -- undoing to correct
Catastrophizing Failure -- over-reacting vs. calmly correcting

General