systematic-debugging

This page documents the systematic-debugging skill: its Iron Law, four mandatory phases, supporting techniques, and integration with the TDD and verification workflows. This skill governs how an agent approaches any bug, test failure, or unexpected behavior — it defines the investigation process, not the fix itself. For writing tests as part of a fix, see test-driven-development. For confirming a fix succeeded before declaring completion, see verification-before-completion.

Purpose

The systematic-debugging skill enforces a structured, evidence-first approach to resolving technical issues. Its premise: random fixes waste time, mask root causes, and introduce new bugs. The skill replaces guess-and-check behavior with a four-phase protocol that must be completed in order.

Skill location: skills/systematic-debugging/SKILL.md

Trigger description: Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes

The Iron Law

skills/systematic-debugging/SKILL.md17-22

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST

Phase 1 (Root Cause Investigation) is a hard gate. An agent cannot propose any fix until Phase 1 is complete. Violating the letter of the process is defined as violating the spirit of it.

When the Skill Applies

This skill applies to any technical issue. The skill is explicitly required in these higher-pressure situations where the temptation to skip is greatest:

Situation	Why Skipping Is Tempting	Why You Must Not
Time pressure / emergency	Guessing feels faster	Thrashing takes 2–3× longer
"Just one quick fix"	Fix seems obvious	Obvious fixes often miss root cause
Already tried multiple fixes	Exhaustion / sunk cost	Multiple failures signal architectural problem
Previous fix didn't work	Frustration	New information → restart Phase 1
Don't fully understand issue	Uncertainty	Uncertainty is the reason to investigate, not to guess

Sources: skills/systematic-debugging/SKILL.md26-45

The Four Phases

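Phase flow diagram: Phase 1 (Root Cause Investigation) → Phase 2 (Pattern Analysis) → Phase 3 (Hypothesis and Testing) → Phase 4 (Implementation), completed in order.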

Sources: skills/systematic-debugging/SKILL.md47-214

Phase 1: Root Cause Investigation

Five required activities before any fix is attempted:

Step	Activity	Key Actions
1	Read error messages carefully	Don't skip errors/warnings; read full stack traces; note line numbers, file paths, error codes
2	Reproduce consistently	Identify exact reproduction steps; if not reproducible, gather more data
3	Check recent changes	`git diff`, recent commits, new dependencies, config changes, environment differences
4	Gather evidence in multi-component systems	Add diagnostic logging at each component boundary; run once to collect evidence; identify the failing layer
5	Trace data flow	Work backward from the error: where does the bad value originate? See `root-cause-tracing.md`

Multi-component evidence gathering pattern:

The skill provides a concrete shell pattern for systems with multiple layers (e.g., CI → build → signing, API → service → database):

skills/systematic-debugging/SKILL.md76-108

For EACH component boundary:
  - Log what data enters the component
  - Log what data exits the component
  - Verify environment/config propagation
  - Check state at each layer

Run once to gather evidence → analyze → investigate the failing component

This produces a clear answer to "which layer fails" before any fix is written.
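The boundary-logging idea can be sketched in Python as follows. This is an illustration only, not the shell pattern from SKILL.md: the three layers (`api`, `service`, `database`), the `token` field, and the helper names are all hypothetical, chosen so that one layer silently drops a value.

```python
from typing import Callable, Optional

Layer = tuple[str, Callable[[dict], dict]]


def run_with_evidence(layers: list[Layer], data: dict):
    """Run each layer once, recording what enters and exits every boundary."""
    evidence = []
    for name, layer in layers:
        output = layer(data)
        evidence.append((name, data, output))
        data = output
    return data, evidence


def first_layer_losing(evidence, key: str) -> Optional[str]:
    """Return the first layer that received `key` but did not pass it on."""
    for name, entered, exited in evidence:
        if key in entered and key not in exited:
            return name
    return None


# Hypothetical three-layer system; the service layer silently drops the token.
def api(request: dict) -> dict:
    return {"user": request["user"], "token": request["token"]}


def service(payload: dict) -> dict:
    return {"user": payload["user"]}


def database(query: dict) -> dict:
    return {"status": "ok" if "token" in query else "denied"}


LAYERS = [("api", api), ("service", service), ("database", database)]

if __name__ == "__main__":
    result, evidence = run_with_evidence(LAYERS, {"user": "ada", "token": "t-123"})
    for name, entered, exited in evidence:
        print(f"[{name}] in={entered} out={exited}")
    print("failing layer:", first_layer_losing(evidence, "token"))
```

The symptom appears at the database layer, but the recorded evidence points at the layer where the value was actually lost, which is where the investigation should continue.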

Sources: skills/systematic-debugging/SKILL.md52-120

Phase 2: Pattern Analysis

Before hypothesizing a fix, find the structural pattern that distinguishes working from broken:

Find working examples — locate similar code in the same codebase that works correctly
Compare against references — read the reference implementation completely, not a skim
Identify differences — list every difference, no matter how small; never assume "that can't matter"
Understand dependencies — what settings, config, or environment does the component assume?
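As a rough illustration of the "list every difference" step, the sketch below compares the settings of a working component against a broken one. The configuration keys and the `list_differences` helper are hypothetical; real comparisons may involve code, environment variables, or dependency versions rather than flat dictionaries.

```python
ABSENT = "<absent>"  # marker for a key that exists on only one side


def list_differences(working: dict, broken: dict) -> list[tuple]:
    """Return (key, working_value, broken_value) for every key that differs."""
    differences = []
    for key in sorted(working.keys() | broken.keys()):
        working_value = working.get(key, ABSENT)
        broken_value = broken.get(key, ABSENT)
        if working_value != broken_value:
            differences.append((key, working_value, broken_value))
    return differences


if __name__ == "__main__":
    # Hypothetical settings for two similar components.
    working = {"timeout": 30, "retries": 3, "tls": True}
    broken = {"timeout": 30, "retries": 0, "region": "eu"}
    for key, good, bad in list_differences(working, broken):
        print(f"{key}: working={good!r} broken={bad!r}")
```

Nothing is filtered out as "too small to matter"; every mismatch is reported so it can be ruled in or out with evidence.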

Sources: skills/systematic-debugging/SKILL.md122-143

Phase 3: Hypothesis and Testing

This phase applies the scientific method with a strict one-variable rule.

Key constraints:

State the hypothesis explicitly and specifically
One variable at a time — never combine multiple changes
If you don't know, say "I don't understand X" rather than pretending
Do not add more fixes on top of a failed attempt
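The one-variable rule can be sketched as a small harness that applies each candidate change in isolation against the same baseline. The settings, the `reproduces_bug` predicate, and the helper name are assumptions made for this example; in practice the predicate would be the reproduction steps identified in Phase 1.

```python
from typing import Callable


def try_one_change_at_a_time(
    baseline: dict,
    candidates: dict,
    reproduces_bug: Callable[[dict], bool],
) -> dict:
    """Apply each candidate change alone and record whether the bug still occurs."""
    results = {}
    for key, value in candidates.items():
        trial = {**baseline, key: value}  # exactly one variable differs from baseline
        results[key] = reproduces_bug(trial)
    return results


if __name__ == "__main__":
    # Hypothetical scenario: requests hang whenever the connection pool is empty.
    baseline = {"pool_size": 0, "timeout": 5, "cache": True}
    candidates = {"pool_size": 4, "timeout": 30, "cache": False}

    def reproduces_bug(config: dict) -> bool:
        return config["pool_size"] == 0

    for change, still_broken in try_one_change_at_a_time(
        baseline, candidates, reproduces_bug
    ).items():
        print(f"{change}: bug still reproduces = {still_broken}")
```

Because every trial starts from the unmodified baseline, a failed attempt is discarded rather than stacked under the next one, and a change that makes the bug disappear can be attributed to a single variable.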

Sources: skills/systematic-debugging/SKILL.md145-169

Phase 4: Implementation

Order is mandatory: test first, then fix, then verify.
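A minimal sketch of that ordering, using a hypothetical email-normalization bug. The function names and the failing input are invented for illustration; the point is only that the regression test exists and fails before the fix is written, then passes afterwards.

```python
def normalize_email_before(value: str) -> str:
    """Hypothetical buggy version: lowercases but keeps surrounding whitespace."""
    return value.lower()


def normalize_email_after(value: str) -> str:
    """Hypothetical single fix addressing the root cause: strip, then lowercase."""
    return value.strip().lower()


def regression_test(normalize) -> bool:
    """Return True when the originally reported input is handled correctly."""
    return normalize("  Ada@Example.COM ") == "ada@example.com"


if __name__ == "__main__":
    # Step 1: the test must fail against the current code, proving it captures the bug.
    print("before fix, test passes:", regression_test(normalize_email_before))
    # Steps 2 and 3: apply the single fix, then verify with the same test.
    print("after fix, test passes:", regression_test(normalize_email_after))
```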

The three-failure architectural trigger:

If three or more fixes have failed, the skill requires stopping to question the architecture rather than attempting another fix. Signals that indicate an architectural problem:

Signal	Meaning
Each fix reveals new shared state or coupling problem	Root issue is structural
Fixes require "massive refactoring" to implement	Wrong abstraction layer
Each fix creates new symptoms elsewhere	System-level design flaw
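The three-failure trigger is a process rule rather than code, but it can be pictured as a simple counter that blocks further attempts. The `FixAttemptTracker` class below is a hypothetical illustration and is not part of the skill.

```python
class FixAttemptTracker:
    """Counts failed fixes and signals when to stop and question the architecture."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.failed: list[str] = []

    def record_failure(self, description: str) -> None:
        self.failed.append(description)

    def must_question_architecture(self) -> bool:
        # Three or more failed fixes: stop patching and revisit the design.
        return len(self.failed) >= self.limit


if __name__ == "__main__":
    tracker = FixAttemptTracker()
    for attempt in ["add null check", "increase timeout", "clear cache on start"]:
        tracker.record_failure(attempt)
        print(f"{attempt!r} failed; stop and rethink: {tracker.must_question_architecture()}")
```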

Sources: skills/systematic-debugging/SKILL.md171-213

Supporting Techniques

The systematic-debugging skill directory contains supplementary technique documents referenced from the main SKILL.md:

File	Purpose
`root-cause-tracing.md`	Complete backward-tracing technique for bugs deep in a call stack
`defense-in-depth.md`	Adding validation at multiple layers after root cause is found
`condition-based-waiting.md`	Replacing arbitrary timeouts with condition polling
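For orientation, condition polling in general looks something like the sketch below. The contents of `condition-based-waiting.md` are not reproduced on this page, so this is a generic illustration with a hypothetical `wait_for` helper, not the technique document's own code.

```python
import time
from typing import Callable


def wait_for(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one final check at the deadline


if __name__ == "__main__":
    started = time.monotonic()

    def ready() -> bool:
        # Hypothetical condition: becomes true roughly 0.2 seconds after start.
        return time.monotonic() - started > 0.2

    print("condition met:", wait_for(ready, timeout=2.0))
```

Compared with a fixed `time.sleep(2)`, the wait ends as soon as the condition holds and fails explicitly when it never does, instead of passing or failing depending on machine speed.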

Sources: skills/systematic-debugging/SKILL.md279-284

Integration with Other Skills

Integration Point	Related Skill	When
Creating the failing test case in Phase 4	`test-driven-development`	Before implementing the fix
Confirming the fix actually resolved the issue	`verification-before-completion`	After implementing the fix
Post-fix review for complex bugs	`requesting-code-review`	Optional, before proceeding

Sources: skills/systematic-debugging/SKILL.md286-288 skills/requesting-code-review/SKILL.md20-22

Red Flags

The following thoughts and behaviors indicate the agent has left the systematic process and must return to Phase 1:

Red Flag Thought	What It Signals
"Quick fix for now, investigate later"	Skipping Phase 1
"Just try changing X and see if it works"	Hypothesis without root cause
"Add multiple changes, run tests"	Violating one-variable rule (Phase 3)
"Skip the test, I'll manually verify"	Skipping Phase 4 Step 1
"It's probably X, let me fix that"	Symptom assumption, not evidence
"I don't fully understand but this might work"	Guess-and-check
"Here are the main problems: [list of fixes]"	Proposing fixes before investigation
"One more fix attempt" (after 2+ failures)	Should question architecture
Each fix reveals new problem in different place	Architectural problem, not a bug

Human partner signals that the agent is off-process:

Signal	Meaning
"Is that not happening?"	Agent assumed without verifying
"Will it show us...?"	Agent should have added evidence gathering
"Stop guessing"	Agent is proposing fixes without understanding
"Ultrathink this"	Question fundamentals, not just symptoms
"We're stuck?" (frustrated)	Agent's approach is not working

When any of these signals appear: STOP. Return to Phase 1.

Sources: skills/systematic-debugging/SKILL.md215-243

Common Rationalizations

Rationalization	Reality
"Issue is simple, don't need process"	Simple issues have root causes. The process is fast for simple bugs.
"Emergency, no time for process"	Systematic debugging is faster than guess-and-check thrashing.
"Just try this first, then investigate"	First fix sets the pattern. Do it right from the start.
"I'll write test after confirming fix works"	Untested fixes don't stick. Test first proves the fix.
"Multiple fixes at once saves time"	Impossible to isolate what worked. Causes new bugs.
"Reference too long, I'll adapt the pattern"	Partial understanding guarantees bugs. Read it completely.
"I see the problem, let me fix it"	Seeing symptoms ≠ understanding root cause.
"One more fix attempt" (after 2+ failures)	3+ failures = architectural problem. Question the pattern.

Sources: skills/systematic-debugging/SKILL.md245-256

Quick Reference

Phase	Key Activities	Success Criteria
1. Root Cause	Read errors, reproduce, check changes, gather evidence, trace data flow	Understand WHAT and WHY
2. Pattern	Find working examples, compare, list differences, check dependencies	Differences identified
3. Hypothesis	Form single theory, test minimally, one variable at a time	Hypothesis confirmed or new one formed
4. Implementation	Create failing test, implement single fix, verify	Bug resolved, all tests pass

Sources: skills/systematic-debugging/SKILL.md258-265

Real-World Impact

From debugging sessions documented in the skill:

Metric	Systematic Approach	Random Fixes
Time to fix	15–30 minutes	2–3 hours of thrashing
First-time fix rate	~95%	~40%
New bugs introduced	Near zero	Common

Sources: skills/systematic-debugging/SKILL.md291-296

Edge Case: "No Root Cause" Found

If systematic investigation concludes the issue is truly environmental, timing-dependent, or external:

The process is still complete — document what was investigated
Implement appropriate handling (retry logic, timeout, error message)
Add monitoring or logging for future investigation

The skill notes that 95% of "no root cause" conclusions actually reflect an incomplete investigation.
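When an issue really is external or timing-dependent, the handling and logging steps might be combined roughly as follows. The `call_with_retry` helper, its parameters, and the choice of exception types are assumptions for this sketch, not something the skill prescribes.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("retry")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    delay: float = 0.1,
    retry_on: tuple = (TimeoutError, ConnectionError),
) -> T:
    """Retry a flaky external call, logging every failure for later investigation."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
            log.warning("attempt %d/%d failed: %r", attempt, attempts, exc)
            if attempt < attempts:
                time.sleep(delay)
    # Surface a clear error message and keep the original cause attached.
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

Only the listed exception types are retried, so unrelated bugs still fail immediately, and each failed attempt leaves a log record that a future investigation can use.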

Sources: skills/systematic-debugging/SKILL.md267-276

