test-driven-development

This page documents the test-driven-development skill: its Iron Law, the RED/GREEN/REFACTOR cycle with mandatory verification gates, rationalization patterns to reject, and the companion testing-anti-patterns.md reference. This skill governs how agents write code during plan execution.

For how TDD fits into the broader development pipeline, see the Complete Workflow Pipeline. For applying TDD when authoring new skills (not production code), see Test-Driven Development for Skills. For the debugging counterpart, see systematic-debugging.

Skill Metadata

Field	Value
File	`skills/test-driven-development/SKILL.md`
Name	`test-driven-development`
Description trigger	`Use when implementing any feature or bugfix, before writing implementation code`
Companion reference	`skills/test-driven-development/testing-anti-patterns.md`

The Iron Law

skills/test-driven-development/SKILL.md:33-45

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

If production code was written before its test existed and was seen to fail, delete the code. There are no exceptions:

Do not keep it as a "reference"
Do not "adapt" it while writing tests alongside
Do not look at it while writing the test
Delete means delete. Implement fresh from the tests only.

RED / GREEN / REFACTOR Cycle

RED, GREEN, and REFACTOR form a strict loop. Each is followed by a mandatory verification step that must not be skipped.

Diagram: RED/GREEN/REFACTOR Cycle with Verification Gates

Sources: skills/test-driven-development/SKILL.md:47-68

Phase 1 — RED: Write the Failing Test

skills/test-driven-development/SKILL.md:71-112

Write one minimal test that describes the expected behavior. Requirements:

Requirement	Good	Bad
One behavior per test	Single `expect` on one outcome	Test verifies email AND domain AND whitespace
Clear name	`'retries failed operations 3 times'`	`'retry works'` or `'test1'`
Real code, not mocks	Calls the real function under test	Asserts that a mock was called N times

Tests should demonstrate the desired API as if writing a usage example. The name should describe behavior, not the mechanism.
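
As a minimal sketch of such a test, assume a hypothetical synchronous `retryOperation` helper. The `Retry` type, the helper's signature, and the single-call placeholder below are illustrative assumptions, not the skill's code. The test reads like a usage example, checks one behavior, and exercises a real operation rather than a mock:

```typescript
// Hypothetical signature for the function under test (an assumption for illustration).
type Retry = <T>(operation: () => T) => T;

// Placeholder standing in for the not-yet-written feature: it calls the
// operation once and never retries, so the test below is expected to fail.
const retryOperation: Retry = (operation) => operation();

// Test: "retries failed operations 3 times".
// One behavior, a descriptive name, and a real operation instead of a mock.
function testRetriesFailedOperations3Times(retry: Retry): void {
  let attempts = 0;
  const operation = (): string => {
    attempts += 1;
    if (attempts < 3) throw new Error(`attempt ${attempts} failed`);
    return "success";
  };

  const result = retry(operation);

  if (result !== "success" || attempts !== 3) {
    throw new Error(`expected success after 3 attempts, got ${result} after ${attempts}`);
  }
}
```

Run against the placeholder, the test throws because nothing retries yet, which is the state the verification step is meant to observe.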

Phase 2 — Verify RED (Mandatory)

skills/test-driven-development/SKILL.md:113-129

Run the test suite and confirm three things:

The test fails on its assertion rather than erroring out because of a typo or a missing import
The failure message is the expected one
The test fails because the feature is missing, not because of a test bug

Outcome	Action
Test passes immediately	You are testing existing behavior. Fix the test.
Test errors (syntax/import)	Fix the error, re-run. Do not proceed to GREEN.
Test fails correctly	Proceed to GREEN.
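
The three outcomes can be told apart mechanically. The sketch below is one possible way to do it, assuming a hypothetical `expectEqual` helper that tags its errors so a tiny runner can separate assertion failures from crashes. All names here are illustrative, and real test frameworks report this distinction in their own formats:

```typescript
type RedOutcome = "passes" | "errors" | "fails";

// Hypothetical assertion helper: tags its errors so a runner can tell an
// assertion failure apart from a typo, missing import, or other crash.
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    const failure = new Error(`expected ${String(expected)}, received ${String(actual)}`);
    failure.name = "AssertionFailure";
    throw failure;
  }
}

// Run one test and classify the result.
function classifyRed(test: () => void): RedOutcome {
  try {
    test();
    return "passes";
  } catch (error) {
    const isAssertion = error instanceof Error && error.name === "AssertionFailure";
    return isAssertion ? "fails" : "errors";
  }
}

// Map each outcome to the action listed in the table above.
function nextAction(outcome: RedOutcome): string {
  switch (outcome) {
    case "passes":
      return "Fix the test: it exercises existing behavior.";
    case "errors":
      return "Fix the error and re-run. Do not proceed to GREEN.";
    case "fails":
      return "Proceed to GREEN.";
  }
}

// Hypothetical validator that does not reject empty input yet.
const isEmailValid = (_email: string): boolean => true;
console.log(classifyRed(() => expectEqual(isEmailValid(""), false))); // "fails"
```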

Phase 3 — GREEN: Minimal Code

skills/test-driven-development/SKILL.md:132-166

Write the simplest code that makes the test pass. Do not:

Add optional parameters or configuration not required by the test
Refactor other code
Implement features the test does not yet demand (YAGNI)

The implementation exists only to pass the current test. Nothing more.
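
For illustration, here is a minimal sketch of "just enough" code for a test named 'retries failed operations 3 times'. The synchronous `retryOperation` function and its signature are assumptions made for this example. Note what is absent: no configurable attempt count, no backoff strategy, no callbacks, because the test does not ask for them.

```typescript
// Hypothetical function under test: retries up to a fixed 3 attempts because
// that is all the current test demands.
function retryOperation<T>(operation: () => T): T {
  let lastError: unknown;
  for (let attempt = 0; attempt < 3; attempt += 1) {
    try {
      return operation();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// The test that drove the code: "retries failed operations 3 times".
function testRetriesFailedOperations3Times(): void {
  let attempts = 0;
  const result = retryOperation((): string => {
    attempts += 1;
    if (attempts < 3) throw new Error("transient failure");
    return "success";
  });
  if (result !== "success" || attempts !== 3) {
    throw new Error(`expected success after 3 attempts, got ${result} after ${attempts}`);
  }
}

testRetriesFailedOperations3Times(); // completes without throwing
```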

Phase 4 — Verify GREEN (Mandatory)

skills/test-driven-development/SKILL.md:168-183

Run the full test suite and confirm:

The new test passes
All previously passing tests still pass
Output is pristine: no errors, no warnings

Outcome	Action
New test fails	Fix code, not test
Other tests fail	Fix regressions before continuing
All pass, clean output	Proceed to REFACTOR
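
The gate can be pictured as a small decision function. This is a sketch under stated assumptions: the `SuiteRun` shape and `verifyGreen` are hypothetical names, and treating noisy output as a blocker is inferred from the "output is pristine" requirement rather than from a row in the table.

```typescript
// Hypothetical summary of one full-suite run (all field names are illustrative).
interface SuiteRun {
  newTestPassed: boolean;
  regressions: string[]; // previously passing tests that now fail
  outputIssues: string[]; // error or warning lines printed during the run
}

// Decide the next step after running the full suite.
function verifyGreen(run: SuiteRun): string {
  if (!run.newTestPassed) return "Fix code, not test";
  if (run.regressions.length > 0) return "Fix regressions before continuing";
  // Assumption: noisy output blocks the gate, since pristine output is required.
  if (run.outputIssues.length > 0) return "Resolve errors and warnings in the output";
  return "Proceed to REFACTOR";
}

console.log(verifyGreen({ newTestPassed: true, regressions: [], outputIssues: [] }));
// "Proceed to REFACTOR"
```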

Phase 5 — REFACTOR: Clean Up

skills/test-driven-development/SKILL.md:185-196

Only after GREEN, improve the code's internal quality:

Remove duplication
Improve names
Extract helpers

Do not add behavior. After every change, re-run the suite to confirm all tests remain green.
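
A small before/after sketch may make "improve internals, keep behavior" concrete. The email validator and its checks are hypothetical examples. The point is that the same checks pass against both versions, so the refactor changed structure only.

```typescript
// Before: the trimming logic is duplicated and the intent is buried.
function isValidEmailBefore(raw: string): boolean {
  if (raw.trim().length === 0) return false;
  return raw.trim().indexOf("@") > 0;
}

// After: duplication removed, names improved, helper extracted.
function hasLocalPartBeforeAt(email: string): boolean {
  return email.indexOf("@") > 0;
}

function isValidEmail(raw: string): boolean {
  const email = raw.trim();
  return email.length > 0 && hasLocalPartBeforeAt(email);
}

// The same behavior checks run against either version and return any mismatches.
function runEmailChecks(validate: (raw: string) => boolean): string[] {
  const cases: Array<[string, boolean]> = [
    ["user@example.com", true],
    ["  user@example.com  ", true],
    ["", false],
    ["   ", false],
    ["@example.com", false],
  ];
  return cases
    .filter(([input, expected]) => validate(input) !== expected)
    .map(([input]) => `unexpected result for "${input}"`);
}

console.log(runEmailChecks(isValidEmailBefore).length, runEmailChecks(isValidEmail).length); // 0 0
```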

Scope: When TDD Applies

skills/test-driven-development/SKILL.md:18-29

Category	TDD Required
New features	Always
Bug fixes	Always
Refactoring	Always
Behavior changes	Always
Throwaway prototypes	Ask human partner
Generated code	Ask human partner
Configuration files	Ask human partner

The urge to skip TDD "just this once" is a rationalization signal, not a legitimate exception.

Rationalization Table

skills/test-driven-development/SKILL.md:257-270

The skill explicitly catalogs common rationalizations and their rebuttals:

Rationalization	Reality
"Too simple to test"	Simple code breaks. Test takes 30 seconds.
"I'll write tests after"	Tests-after pass immediately — proves nothing.
"Tests after achieve same goals"	Tests-after answer "what does this do?" Tests-first answer "what should this do?"
"Already manually tested"	Ad-hoc ≠ systematic. No record, can't re-run.
"Deleting X hours is wasteful"	Sunk cost fallacy. Keeping unverified code is technical debt.
"Keep as reference, write tests first"	You'll adapt it. That's testing after. Delete means delete.
"Need to explore first"	Fine. Throw away exploration. Start with TDD.
"Test is hard = design is unclear"	Listen to the test. Hard to test = hard to use.
"TDD will slow me down"	TDD is faster than debugging. Pragmatic = test-first.
"Existing code has no tests"	You're improving it. Add tests for existing code.

Red Flags — Stop and Start Over

skills/test-driven-development/SKILL.md:272-288

Any of the following means: Delete the code. Start over.

Code written before test
Test written after implementation
Test passes immediately without code changes
Cannot explain why the test failed
Tests planned to be added "later"
Rationalizing "just this once"
"I already manually tested it"
"Tests after achieve the same purpose"
"It's about spirit not ritual"
"Keep as reference" or "adapt existing code"
"Already spent X hours, deleting is wasteful"
"TDD is dogmatic, I'm being pragmatic"
"This is different because..."

Bug Fix Example

skills/test-driven-development/SKILL.md:291-325

The skill includes a concrete walkthrough for fixing a bug (empty email accepted):

Diagram: Bug Fix TDD Flow

Sources: skills/test-driven-development/SKILL.md:291-325
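
A bug of this kind could be worked through roughly as follows. This is a sketch, not the skill's own walkthrough: `SignupForm`, `SubmitResult`, `submitForm`, and the "Email required" message are assumptions chosen for illustration.

```typescript
// Hypothetical form and result shapes.
interface SignupForm {
  email: string;
}

interface SubmitResult {
  ok: boolean;
  error?: string;
}

// RED: a test that reproduces the bug, named for the behavior it demands.
function testRejectsEmptyEmail(submit: (form: SignupForm) => SubmitResult): void {
  const result = submit({ email: "" });
  if (result.error !== "Email required") {
    throw new Error(`expected "Email required", got ${String(result.error)}`);
  }
}

// The buggy version accepts anything, so the test fails against it.
function submitFormBuggy(_form: SignupForm): SubmitResult {
  return { ok: true };
}

// GREEN: the smallest change that makes the test pass.
function submitForm(form: SignupForm): SubmitResult {
  if (form.email.trim() === "") {
    return { ok: false, error: "Email required" };
  }
  return { ok: true };
}

testRejectsEmptyEmail(submitForm); // completes without throwing
```

The test stays in the suite afterward, so the same bug cannot return unnoticed.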

Verification Checklist

skills/test-driven-development/SKILL.md:327-340

Before marking any task complete:

Every new function/method has a test
Watched each test fail before implementing
Each test failed for the expected reason (feature missing, not typo)
Wrote minimal code to pass each test
All tests pass
Output pristine (no errors, warnings)
Tests use real code (mocks only if unavoidable)
Edge cases and errors covered

Failure to check all boxes means TDD was skipped. Start over.

When Stuck

skills/test-driven-development/SKILL.md:343-350

Problem	Solution
Don't know how to test	Write the wished-for API. Write the assertion first. Ask human partner.
Test too complicated	Design is too complicated. Simplify the interface.
Must mock everything	Code is too coupled. Use dependency injection.
Test setup is huge	Extract helpers. Still complex? Simplify the design.

Hard-to-test code is a design signal, not a TDD limitation.
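
To illustrate the dependency injection row, the sketch below contrasts a function that reads the system clock directly with one that receives its clock as a parameter. `Clock`, `greeting`, and `fixedClock` are hypothetical names. The injected version can be tested with a small hand-written stand-in, without patching globals.

```typescript
// Hypothetical dependency: anything that can report the current time.
interface Clock {
  now(): Date;
}

// Coupled version: reads the system clock directly, so a test cannot control it.
function greetingCoupled(): string {
  return new Date().getHours() < 12 ? "Good morning" : "Good afternoon";
}

// Injected version: the caller supplies the clock.
function greeting(clock: Clock): string {
  return clock.now().getHours() < 12 ? "Good morning" : "Good afternoon";
}

// Production wiring uses the real clock.
const systemClock: Clock = { now: () => new Date() };

// Tests pass a fixed clock set to a chosen local hour.
const fixedClock = (hour: number): Clock => ({
  now: () => new Date(2024, 0, 1, hour),
});

console.log(greeting(fixedClock(9))); // "Good morning"
console.log(greeting(fixedClock(15))); // "Good afternoon"
```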

Debugging Integration

skills/test-driven-development/SKILL.md:352-355

When a bug is found during development:

Write a failing test that reproduces the bug
Follow the full TDD cycle
The test proves the fix and prevents regression

Never fix a bug without a test.

Testing Anti-Patterns Reference

skills/test-driven-development/testing-anti-patterns.md:1-20

The companion file testing-anti-patterns.md is loaded when writing or changing tests, adding mocks, or considering test-only methods on production classes.

Diagram: Anti-Pattern Reference — Code Entity Map

Sources: skills/test-driven-development/testing-anti-patterns.md:1-20, skills/test-driven-development/SKILL.md:357-363

Anti-Pattern Summary Table

skills/test-driven-development/testing-anti-patterns.md:275-283

Anti-Pattern	Fix
Assert on mock elements (`*-mock` test IDs)	Test real component or unmock it
Test-only methods on production classes	Move cleanup to `test-utils/` helpers
Mock without understanding side effects	Understand dependency chain first; mock minimally
Incomplete mock data structures	Mirror full real API response schema
Tests as afterthought	TDD — tests first
Over-complex mock setup	Consider integration tests with real components
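
As one illustration of the second row, cleanup logic can live in a test utility instead of on the production class. The `Session` class and `createSessionTracker` helper below are hypothetical. The sketch assumes the production class already exposes a legitimate `close()` that tests can reuse, so no test-only method is needed.

```typescript
// Production class: exposes only what production code needs.
class Session {
  private active = true;
  private readonly closeListeners: Array<() => void> = [];

  onClose(listener: () => void): void {
    this.closeListeners.push(listener);
  }

  close(): void {
    if (!this.active) return;
    this.active = false;
    this.closeListeners.forEach((listener) => listener());
  }

  isActive(): boolean {
    return this.active;
  }
}

// Lives in test utilities, not in the production class (hypothetical helper).
function createSessionTracker() {
  const created: Session[] = [];
  return {
    create(): Session {
      const session = new Session();
      created.push(session);
      return session;
    },
    cleanupAll(): void {
      created.forEach((session) => session.close());
      created.length = 0;
    },
  };
}

// Usage inside a test: create through the tracker, clean up in teardown.
const tracker = createSessionTracker();
const session = tracker.create();
tracker.cleanupAll();
console.log(session.isActive()); // false
```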

Iron Laws of Testing

skills/test-driven-development/testing-anti-patterns.md:14-19

1. NEVER test mock behavior
2. NEVER add test-only methods to production classes
3. NEVER mock without understanding dependencies

Relationship to Other Skills

Diagram: TDD Skill in the Development Pipeline

Sources: skills/test-driven-development/SKILL.md:327-340, skills/test-driven-development/SKILL.md:357-363

writing-plans (see Writing Implementation Plans) structures each task with explicit RED/GREEN/REFACTOR steps.
subagent-driven-development (see Subagent-Driven Development) dispatches an implementer subagent that is required to follow TDD.
systematic-debugging (see systematic-debugging) takes over when a root cause must be investigated; TDD governs how the fix is applied afterward.
verification-before-completion (see Other Essential Skills) provides the final gate after the TDD checklist is satisfied.
