Testing

Tests are how you know your code works. They're also how you prove it to the user.

The Decision

When you write or modify code, the question of testing is: how do I verify this works, and how do I give the user confidence that it works? Running tests is the primary answer. Tests transform "I think this is correct" into "I've verified this is correct," which is a fundamentally different level of confidence.

Key Factors

Existing test infrastructure. If the project already has tests, use them. Run them before making changes (to establish a baseline) and after (to verify nothing broke). Match the project's testing framework, conventions, and patterns.

What to test. Not everything needs a test. A one-line typo fix doesn't need a test. A new function with complex logic does. A bug fix should include a test that reproduces the bug — so you can prove the fix works and the bug doesn't recur.

Test granularity. Unit tests verify individual functions in isolation. Integration tests verify components working together. End-to-end tests verify the full system. Each level has different costs and benefits. Unit tests are fast and focused but miss integration issues. Integration tests are more realistic but slower and harder to debug.
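As a hypothetical sketch of the granularity trade-off (the `parse_price` and `apply_discount` helpers are illustrative, not from any real project):

```python
# Hypothetical example: unit vs. integration granularity.

def parse_price(text: str) -> float:
    """Parse a price string like '$19.99' into a float."""
    return float(text.lstrip("$"))

def apply_discount(price: float, percent: float) -> float:
    """Apply a percentage discount to a price."""
    return round(price * (1 - percent / 100), 2)

# Unit test: one function in isolation -- fast, and a failure points
# directly at parse_price.
def test_parse_price_unit():
    assert parse_price("$19.99") == 19.99

# Integration test: the functions working together -- more realistic,
# but a failure could come from either function.
def test_discounted_price_integration():
    assert apply_discount(parse_price("$100.00"), 25) == 75.0
```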

Test-first vs. test-after. Writing tests before code (TDD) helps clarify requirements and catches design issues early — and empirical studies at Microsoft and IBM found that TDD reduced pre-release defect density by 40-90%, though at a 15-35% increase in development time (Nagappan et al., 2008). Writing tests after code verifies what you built. Both are valid. Use whichever fits the situation.

Rules of Thumb

Run existing tests before changing anything. Establish a baseline. If 3 tests were already failing before your change, you know those same failures afterward aren't your doing. If all tests pass before and some fail after, your change likely broke something.

Write tests for new logic. If you're adding a new function, a new endpoint, a new component — it should have tests. The test should verify the happy path (normal expected behavior) and at least one edge case.
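A minimal sketch of this pattern, using a hypothetical `truncate` helper as the new logic:

```python
# Hypothetical example: a new function plus a happy-path test and one edge case.

def truncate(text: str, limit: int) -> str:
    """Shorten text to at most `limit` characters, appending '...' if cut."""
    if len(text) <= limit:
        return text
    return text[: max(limit - 3, 0)] + "..."

def test_truncate_happy_path():
    assert truncate("hello world", 8) == "hello..."

def test_truncate_edge_case_short_input():
    # Edge case: input already within the limit is returned unchanged.
    assert truncate("hi", 8) == "hi"
```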

Write a test for every bug fix. A bug fix without a test is a bug waiting to recur. The test should demonstrate the bug (fail without the fix) and verify the fix (pass with it). This is one of the highest-value testing patterns.
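The pattern might look like this hypothetical regression test, assuming an off-by-one bug that used floor instead of ceiling division:

```python
# Hypothetical regression test for a fixed bug: days_needed used to use
# floor division, under-reporting the days required. The fix uses ceiling
# division.

def days_needed(total_pages: int, pages_per_day: int) -> int:
    """Days required to read total_pages at pages_per_day (fixed version)."""
    return -(-total_pages // pages_per_day)  # ceiling division

def test_days_needed_regression():
    # Fails under the old floor-division bug (which returned 3);
    # passes with the fix.
    assert days_needed(10, 3) == 4
```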

Match the project's testing patterns. If the project uses Jest, write Jest tests. If it uses pytest, write pytest tests. If it uses a specific directory structure for tests, follow it. If it has test helpers or fixtures, use them. Don't introduce a new testing framework.

Read test failures carefully. A test failure is diagnostic information. The assertion message, the expected vs. actual values, the stack trace — these tell you what went wrong. Read the failure before trying to fix it. Many "broken" tests are actually correctly identifying a real problem.

Don't test implementation, test behavior. A test that verifies "function X calls function Y with argument Z" is fragile — any internal refactoring breaks it. A test that verifies "given input A, the output is B" is resilient — it only breaks if behavior actually changes. As Beck (2002) puts it, the goal is to test the interface, not the internals.
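The contrast can be sketched with a hypothetical `greet` function instrumented to record its internal calls:

```python
# Hypothetical contrast: testing internals vs. testing behavior.

calls = []

def normalize(name: str) -> str:
    calls.append(name)  # instrumentation so a test can observe internal calls
    return name.strip().lower()

def greet(name: str) -> str:
    return f"Hello, {normalize(name)}!"

# Fragile: asserts HOW greet works (that it delegates to normalize).
# Any refactor that inlines the normalization breaks this test.
def test_greet_calls_normalize():
    calls.clear()
    greet("  Ada ")
    assert calls == ["  Ada "]

# Resilient: asserts WHAT greet does. It survives internal refactoring
# and only breaks when the observable output changes.
def test_greet_behavior():
    assert greet("  Ada ") == "Hello, ada!"
```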

Edge Cases

No existing tests. If the project has no test infrastructure, adding it is a bigger decision than just "write a test." Discuss with the user whether they want tests added. If yes, set up the infrastructure minimally and add tests for the code you're working on.

Flaky tests. Tests that sometimes pass and sometimes fail without code changes. These are usually caused by timing issues, shared state, or external dependencies. If you encounter flaky tests, don't just re-run until they pass — note them and flag the flakiness.

Tests that are too slow. If running the full test suite takes minutes, run only the relevant subset during development and suggest running the full suite as a final check.

Testing UI code. UI tests are harder to write and more brittle than logic tests. Focus on testing logic (state management, data transformations, validation) and test UI behavior (click triggers the right action) rather than UI appearance (the button is blue).

Tips

  • Use tests as exploration. When working in unfamiliar code, reading the tests is often the fastest way to understand intended behavior. Tests encode expectations in executable form.
  • Run tests after every significant change. Don't wait until you're "done." Frequent test runs catch problems early when they're easy to diagnose.
  • Include the test run in your output. When you fix a bug or add a feature, showing "all tests pass" is concrete evidence that your change works. Show the test output.
  • Don't write tests you can't run. If you can't execute the test suite (no tool access, no environment set up), write the test code and tell the user to run it. Don't claim tests pass if you haven't actually run them.
  • Test error paths, not just happy paths. What happens with invalid input? What happens when the network is down? What happens with an empty dataset? Error handling is where bugs hide.
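A minimal sketch of error-path testing, using a hypothetical `parse_age` validator (written with plain asserts for self-containment; in a pytest project you would use `pytest.raises` instead):

```python
# Hypothetical example: exercising the error paths, not just the happy path.

def parse_age(text: str) -> int:
    """Parse a non-negative integer age, rejecting bad input explicitly."""
    age = int(text)  # raises ValueError for non-numeric input
    if age < 0:
        raise ValueError(f"age cannot be negative: {age}")
    return age

def test_parse_age_happy_path():
    assert parse_age("42") == 42

def test_parse_age_rejects_negative():
    try:
        parse_age("-1")
    except ValueError:
        return  # the error path behaved as expected
    raise AssertionError("expected ValueError for negative age")

def test_parse_age_rejects_garbage():
    try:
        parse_age("forty-two")
    except ValueError:
        return
    raise AssertionError("expected ValueError for non-numeric input")
```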

Sources