Debugging

Systematically isolating and resolving problems through observation, hypothesis, and testing.

What It Looks Like

The user says "it is not working." You look at the code. It is a hundred lines, and somewhere in those hundred lines, something is wrong. You could stare at the code and hope the bug jumps out. You could change random things and see if the problem goes away. Or you could debug: systematically narrow the search space until the problem has nowhere left to hide.

Debugging is detective work. As Zeller (2009) puts it in Why Programs Fail, systematic debugging transforms a "black art" into a discipline. You arrive at the scene (the error), gather evidence (logs, outputs, error messages), form a theory (the hypothesis), test the theory (reproduce, isolate, verify), and either confirm it or form a new one. The hallmark of good debugging is that you are always moving toward the answer, never wandering. Each step eliminates possibilities.

The temptation -- and the anti-pattern -- is to guess. You see an error, you think "it is probably X," and you fix X without confirming it was actually the problem. Sometimes you get lucky. More often, you fix X, the problem persists, and now you have a codebase with an unnecessary change and the same bug. Good debugging is slower at the start and faster at the finish. Guessing is the opposite.

When to Use It

Every time something does not work as expected -- and that happens constantly. Debugging applies to:

Code errors. The obvious case: runtime errors, incorrect output, failed tests, crashes.
Configuration problems. The code is fine, but the environment, settings, or inputs are wrong.
Behavioral mismatches. The code runs without errors but does not do what the user expects.
Performance issues. The code is correct but slow, and you need to find the bottleneck.
Integration failures. Two correct components that produce incorrect results when combined.
Your own mistakes. When you generate code or a plan that does not work, you need to debug your own output with the same rigor you would apply to anyone else's.

How It Works

1. Reproduce the problem. Before you can fix something, you need to see it happen. Ask: can I trigger this error reliably? If you cannot reproduce it, you cannot confirm you have fixed it. Intermittent bugs are the hardest to debug precisely because they are hard to reproduce.

2. Read the error. Error messages exist to help you. Read the full message, not just the first line. The stack trace tells you where the error surfaced, which is not always where it originated. The message tells you what went wrong. Together they narrow your search to a specific location and a specific type of failure.

3. Form a hypothesis. Based on the evidence, make a specific, testable prediction. "The null pointer exception on line 42 is happening because the user object is null when getProfile() is called, probably because the lookup on line 38 is not finding a match." This is specific enough to test directly.

4. Test the hypothesis. Do not just think about whether your hypothesis could be true -- verify it. Add a log statement. Run the code in a debugger. Print the value of the variable at the critical point. The goal is to see evidence that confirms or refutes your theory.

5. Isolate the cause. Once you know what is failing, figure out why. The null user is a symptom. The cause might be a missing database record, a typo in the query, a race condition, or a dozen other things. Keep asking "why?" until you reach the root cause.

6. Fix and verify. Apply the fix and confirm the problem is gone. Then check that you have not introduced new problems. A fix that breaks something else is not a fix -- it is a trade.

7. Understand why it happened. After fixing, briefly consider: why did this bug exist? Was it a one-off typo, or a systemic issue? If the same pattern exists elsewhere, it might be worth flagging.

8. Communicate the diagnosis. Once you have found and fixed the bug, explain it to the user. Not just "I fixed it" but "the bug was in the date parsing function -- it expected ISO format but was receiving Unix timestamps from the API. I updated the parser to handle both formats." The explanation helps the user understand their own codebase better and prevents similar bugs in the future.
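
The steps above can be sketched in code, using the date-parsing diagnosis as a running example. This is a minimal Python illustration, not code from any real project: parse_date_original, parse_date_fixed, fails_on, and the sample timestamp are hypothetical names and values chosen for the sketch.

```python
from datetime import datetime, timezone

def parse_date_original(value):
    """Hypothetical buggy parser: assumes every value is an ISO 8601 string."""
    return datetime.fromisoformat(value)

def parse_date_fixed(value):
    """Hypothetical fix: accept Unix timestamps (numbers) as well as ISO strings."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    return datetime.fromisoformat(value)

def fails_on(parser, value):
    """Return True if the parser raises on this input (a tiny reproduction harness)."""
    try:
        parser(value)
    except (TypeError, ValueError):
        return True
    return False

# Assumed sample of what the API actually sends: a Unix timestamp, not ISO text.
api_value = 1700000000

# Step 1, reproduce: the failure can be triggered on demand.
bug_reproduced = fails_on(parse_date_original, api_value)

# Step 4, test the hypothesis "the input is a number, not a string".
hypothesis_confirmed = isinstance(api_value, (int, float))

# Step 6, fix and verify: the failing input now parses, and ISO input still works.
fixed = not fails_on(parse_date_fixed, api_value)
no_regression = not fails_on(parse_date_fixed, "2023-11-14T22:13:20+00:00")

print(bug_reproduced, hypothesis_confirmed, fixed, no_regression)
```

Keeping the reproduction as a function means the same check that demonstrated the bug also demonstrates the fix, which is what makes the verification in step 6 trustworthy.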

Failure Modes

Shotgun debugging. Making multiple changes at once and seeing if the problem goes away. If it works, you do not know which change fixed it. If it does not, you do not know which changes to revert. One change at a time, tested after each.
Hypothesis-free wandering. Reading code aimlessly, hoping to spot the bug. Without a hypothesis, you have no way to focus your attention or know when you have found the answer. Always work from a theory, even if it is "I have no idea -- let me start by checking the most likely failure point."
Fixing symptoms instead of causes. The function returns null, so you add a null check. But why does it return null? The null check hides the real bug, which will surface elsewhere in a harder-to-debug form.
Assuming you know the answer. You see an error, you think you know the cause, and you skip straight to fixing without verifying. This works when you are right and wastes enormous time when you are wrong. Verify first, always.
Ignoring evidence that contradicts your theory. You are convinced the bug is in the database layer, but the logs show the database returned the correct data. Abandoning a theory when evidence contradicts it is not failure -- it is good debugging.
Not reproducing before fixing. You apply a fix based on your understanding of the bug without ever having seen the bug happen. If the bug is not what you thought, your fix addresses a problem that does not exist while the real problem continues.
Tunnel vision. You fixate on one area of the code because it looks suspicious, even as evidence accumulates that the problem is elsewhere. Good debugging requires the willingness to abandon a promising-looking lead when the data says it is a dead end. The bug is where the evidence points, not where your intuition points.
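
The difference between patching a symptom and fixing a cause can be shown with a small Python sketch. Everything here is hypothetical: the USERS table, the function names, and the assumed root cause (callers pass names whose capitalization or whitespace differs from the stored keys).

```python
# Hypothetical in-memory user table; keys are stored lowercase.
USERS = {"alice": {"email": "alice@example.com"}}

def find_user(name):
    """Original lookup: returns None when the key does not match exactly."""
    return USERS.get(name)

def email_with_symptom_patch(name):
    """Symptom patch: the crash goes away, but the lookup still fails silently."""
    user = find_user(name)
    if user is None:
        return None
    return user["email"]

def find_user_normalized(name):
    """Root-cause fix (assumed cause): normalize the name before the lookup."""
    return USERS.get(name.strip().lower())

def email_with_root_fix(name):
    """Look up the email; a miss now means the user is genuinely absent."""
    user = find_user_normalized(name)
    if user is None:
        raise KeyError(f"unknown user: {name!r}")
    return user["email"]

print(email_with_symptom_patch(" Alice"))  # None: the bug is hidden, not fixed
print(email_with_root_fix(" Alice"))       # alice@example.com
```

The symptom patch makes the error disappear while the user still never gets their email. The root-cause version only works because the cause was confirmed first; if the real cause were a missing record, normalization would be one more unnecessary change.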

Tips

Binary search your code. If you do not know where the bug is, cut the code in half. Is the data correct at the midpoint? If yes, the bug is in the second half. If no, the first half. Repeat. Each check halves the remaining search space, so even a long pipeline takes only a handful of checks. Delta debugging (Zeller & Hildebrandt, 2002) automates a related idea: it systematically shrinks a failure-inducing input until only the part that triggers the failure remains.
Read the error message one more time. Seriously. Developers (and agents) routinely skim error messages and miss critical information. Read the whole thing, including the less obvious parts like the originating file path or the specific assertion that failed.
Check your assumptions first. Many bugs are not in the code -- they are in the assumptions. The file path is wrong. The environment variable is not set. The dependency version is different. Before debugging the code, confirm the environment is what you think it is.
Keep a log of what you have tried. In complex debugging sessions, it is easy to forget which hypotheses you have already tested. A brief running list prevents you from going in circles.
Know when to step back. If you have been debugging the same issue for a long time without progress, restate the problem from scratch. Look at it from a completely different angle. Sometimes the bug is not where you have been looking.
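
The binary-search tip can be sketched as follows, assuming the program can be modeled as a list of stages applied in order and that some check exists for whether intermediate data is still correct. The names run_prefix, first_bad_stage, and the toy stages are hypothetical. The sketch also assumes the input itself is valid and that once the data goes bad it stays bad; without that, bisection can point at the wrong stage.

```python
def run_prefix(stages, data, count):
    """Apply only the first `count` stages to the data."""
    for stage in stages[:count]:
        data = stage(data)
    return data

def first_bad_stage(stages, data, is_valid):
    """Return the index of the first stage whose output fails is_valid, or None."""
    if is_valid(run_prefix(stages, data, len(stages))):
        return None  # the final output is fine, so there is nothing to bisect
    # Invariant: output after `lo` stages is valid, output after `hi` stages is not.
    lo, hi = 0, len(stages)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_valid(run_prefix(stages, data, mid)):
            lo = mid
        else:
            hi = mid
    return hi - 1

# Toy pipeline: the third stage (index 2) is the hypothetical bug.
stages = [
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x + 1 for x in xs],
    lambda xs: [x - 100 for x in xs],  # bug: pushes values negative
    lambda xs: [x * 3 for x in xs],
]

def non_negative(xs):
    """The correctness check assumed for this toy example."""
    return all(x >= 0 for x in xs)

print(first_bad_stage(stages, [1, 2, 3], non_negative))  # 2
```

For clarity the sketch re-runs the stages from the start at every check; a real session would more likely inspect saved intermediate output or a breakpoint at the midpoint.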

Frequently Asked Questions

How long should I debug before asking the user for help? If you have tried three distinct hypotheses and none panned out, or if you have been debugging for more than a few minutes without making progress, it is time to share what you know. "I have checked the database query, the API response, and the data transformation. All three look correct, but the output is still wrong. Do you have any context about what might have changed recently?" The user often has information you do not.

Should I debug by reading code or by running code? Both, but lean toward running. Reading code tells you what the code says. Running code tells you what the code does. These are sometimes different. If you have the ability to execute code, use it -- a single print statement can confirm or refute a hypothesis faster than ten minutes of reading.
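
A small Python illustration of the gap between what code says and what it does. The function add_tag is a hypothetical example; the behavior shown is standard Python, where a default argument value is created once when the function is defined rather than on each call.

```python
def add_tag(tag, tags=[]):
    """Reads as: start from an empty list unless the caller supplies one."""
    tags.append(tag)
    return tags

first = add_tag("a")
second = add_tag("b")

# One print statement settles the question that reading alone left open.
print(second)           # ['a', 'b'], not ['b']
print(first is second)  # True: both calls shared the same default list
```

A reader skimming the signature could reasonably predict ['b'] for the second call. Running it takes a few seconds and removes the guesswork.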

How do I debug someone else's code when I do not understand the full system? Start from the error and work backward. You do not need to understand the entire system to debug one path through it. Follow the execution from the point of failure backward to where the data went wrong. As you trace the path, you will learn enough about the system to understand the bug.

What if I cannot reproduce the bug? Gather as much information as possible about the conditions under which it occurs. What input triggers it? What environment? What state? If it is truly intermittent, look for concurrency issues, race conditions, or state-dependent behavior. Bugs that do not reproduce reliably are often related to timing or external state.
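
One way to gather those conditions is to run the suspect operation many times while recording what varied, so that a failing run can be replayed exactly. The sketch below is a hypothetical Python illustration: flaky_operation stands in for the real code, and a random seed stands in for whatever timing or external state actually varies.

```python
import random

def flaky_operation(rng):
    """Hypothetical stand-in for intermittent code: fails on some random draws."""
    if rng.random() < 0.1:
        raise RuntimeError("intermittent failure")
    return "ok"

def failing_seeds(operation, seeds):
    """Run the operation once per seed and return the seeds that failed."""
    failures = []
    for seed in seeds:
        try:
            operation(random.Random(seed))
        except RuntimeError:
            failures.append(seed)
    return failures

def replay(operation, seed):
    """Re-run one recorded condition; return True if it fails again."""
    try:
        operation(random.Random(seed))
    except RuntimeError:
        return True
    return False

recorded = failing_seeds(flaky_operation, range(500))
print(f"{len(recorded)} of 500 runs failed")

# Every recorded failure replays, because the varying condition was captured.
print(all(replay(flaky_operation, seed) for seed in recorded))  # True
```

Real timing bugs are rarely controlled by a single seed. The transferable part is the habit: record inputs, environment, and ordering for each run, so that a failure, when it does occur, comes with enough context to trigger it again.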

Sources

Zeller, Why Programs Fail: A Guide to Systematic Debugging, Morgan Kaufmann, 2009 — The standard reference on systematic debugging methodology, covering reproduction, isolation, and root-cause analysis
Zeller & Hildebrandt, "Simplifying and Isolating Failure-Inducing Input," IEEE Transactions on Software Engineering, 2002 — Introduces delta debugging, a systematic method for minimizing failure-inducing inputs
Ko & Myers, "Debugging Reinvented," ICSE, 2008 — Proposes the Whyline, a debugging tool that lets developers ask "why" and "why not" questions about program behavior
Ghosh & Singh, "A Systematic Review on Program Debugging Techniques," Springer, 2020 — Comprehensive review of emerging trends in software debugging techniques

Self-Correction -- fixing your own mistakes
The Loop -- the observe-think-act pattern debugging relies on
Tool Failures -- debugging tool-related problems
Planning -- pausing to diagnose before fixing
Iterative Refinement -- improving through repeated testing
