This page describes the automated test suites and supporting tooling used to verify superpowers behavior. Coverage falls into two areas: the Claude Code test suite (which invokes the Claude CLI to verify skill loading and workflow compliance) and the OpenCode test suite (which tests the plugin's JavaScript library and platform-specific tool integration).
For background on what skills contain and how they are loaded, see page 3. For documentation of the skills-core.js library being tested here, see page 5.4.
The tests live under tests/ and are split by platform:
tests/
├── claude-code/
│   ├── run-skill-tests.sh                                # runner
│   ├── test-helpers.sh                                   # shared assertions
│   ├── test-subagent-driven-development.sh               # fast skill test
│   ├── test-subagent-driven-development-integration.sh   # full execution test
│   └── analyze-token-usage.py                            # token cost reporter
└── opencode/
    ├── run-tests.sh                                      # runner
    ├── setup.sh                                          # isolated env creator
    ├── test-plugin-loading.sh                            # plugin structure
    ├── test-skills-core.sh                               # library unit tests
    ├── test-tools.sh                                     # tool integration
    └── test-priority.sh                                  # priority resolution
Test suite overview diagram:
Sources: tests/claude-code/run-skill-tests.sh:1-188, tests/opencode/run-tests.sh:1-165
run-skill-tests.sh
run-skill-tests.sh is the entry point for all Claude Code tests. It accepts the following CLI flags:
| Flag | Default | Description |
|---|---|---|
| --verbose / -v | false | Print full test output instead of suppressing passing output |
| --test / -t NAME | all | Run only the named test file |
| --timeout SECONDS | 300 | Per-test wall-clock timeout |
| --integration / -i | false | Also run integration tests (slow) |
By default, only fast tests run. Integration tests are listed in the integration_tests array (tests/claude-code/run-skill-tests.sh:80-83) and are appended to the run list when --integration is passed.
The runner prints a PASSED / FAILED summary and exits with code 0 on success or 1 on any failure.
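The flag handling described in the table can be sketched as a small parsing loop. This is an illustration under assumptions, not the runner's actual code: the function name parse_runner_flags and the variable names are hypothetical, and only the flags and their defaults come from the table.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the runner's flag handling; all names are illustrative.

parse_runner_flags() {
  VERBOSE=false
  SPECIFIC_TEST=""          # empty means "run all tests"
  TIMEOUT=300               # per-test timeout in seconds
  RUN_INTEGRATION=false

  while [ $# -gt 0 ]; do
    case "$1" in
      --verbose|-v)
        VERBOSE=true; shift ;;
      --test|-t)
        # Guard against a missing value so "shift 2" cannot fail and loop forever.
        [ $# -ge 2 ] || { echo "Missing value for $1" >&2; return 1; }
        SPECIFIC_TEST="$2"; shift 2 ;;
      --timeout)
        [ $# -ge 2 ] || { echo "Missing value for $1" >&2; return 1; }
        TIMEOUT="$2"; shift 2 ;;
      --integration|-i)
        RUN_INTEGRATION=true; shift ;;
      *)
        echo "Unknown option: $1" >&2; return 1 ;;
    esac
  done
}

parse_runner_flags --test test-subagent-driven-development.sh --timeout 600 -i
echo "test=$SPECIFIC_TEST timeout=$TIMEOUT integration=$RUN_INTEGRATION verbose=$VERBOSE"
```

Resetting the defaults inside the function keeps repeated calls independent, which makes the parser easy to exercise in isolation.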
Sources: tests/claude-code/run-skill-tests.sh:26-72, tests/claude-code/README.md:16-39
test-helpers.sh
All Claude Code test files source test-helpers.sh. It provides:
Execution:
| Function | Signature | Purpose |
|---|---|---|
| run_claude | "prompt" [timeout] [allowed_tools] | Invoke claude -p in headless mode, return stdout |
Assertions:
| Function | Signature | Returns |
|---|---|---|
| assert_contains | "output" "pattern" "name" | 0 if grep -q matches |
| assert_not_contains | "output" "pattern" "name" | 0 if pattern absent |
| assert_count | "output" "pattern" expected "name" | 0 if count matches exactly |
| assert_order | "output" "pattern_a" "pattern_b" "name" | 0 if A appears on earlier line than B |
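To make the assertion contracts concrete, here is a minimal sketch of how assert_contains and assert_order could be written. The argument order and return codes follow the table; the bodies and the [PASS] / [FAIL] message format are assumptions, and the real helpers in test-helpers.sh may differ.

```shell
#!/usr/bin/env bash
# Illustrative re-implementations; not copied from test-helpers.sh.

assert_contains() {
  local output="$1" pattern="$2" name="${3:-test}"
  if grep -q -- "$pattern" <<<"$output"; then
    echo "  [PASS] $name"
    return 0
  fi
  echo "  [FAIL] $name (pattern not found: $pattern)"
  return 1
}

assert_order() {
  local output="$1" pattern_a="$2" pattern_b="$3" name="${4:-test}"
  local line_a line_b
  # Line number of the first match for each pattern (empty when there is none).
  line_a=$(grep -n -- "$pattern_a" <<<"$output" | head -1 | cut -d: -f1)
  line_b=$(grep -n -- "$pattern_b" <<<"$output" | head -1 | cut -d: -f1)
  if [ -n "$line_a" ] && [ -n "$line_b" ] && [ "$line_a" -lt "$line_b" ]; then
    echo "  [PASS] $name"
    return 0
  fi
  echo "  [FAIL] $name (expected '$pattern_a' before '$pattern_b')"
  return 1
}

sample=$'1. spec compliance review\n2. code quality review'
assert_contains "$sample" "code.*quality" "mentions code quality"
assert_order "$sample" "spec.*compliance" "code.*quality" "spec review precedes quality review"
```

Because the comparison uses line numbers, two patterns that match on the same line do not satisfy assert_order in this sketch.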
Fixtures:
| Function | Returns |
|---|---|
| create_test_project | Path to a mktemp -d directory |
| cleanup_test_project "$dir" | Removes the directory |
| create_test_plan "$dir" ["name"] | Creates docs/plans/<name>.md with a two-task sample plan |
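The fixture lifecycle can be sketched as follows. Only the function names and the docs/plans/<name>.md location come from the table; the default plan name, the plan text, and the choice to echo the plan path are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the fixture helpers; plan contents are placeholders.

create_test_project() {
  mktemp -d   # prints the path of a fresh temporary directory
}

cleanup_test_project() {
  local dir="$1"
  # Only remove something that is a non-empty path to an existing directory.
  if [ -n "$dir" ] && [ -d "$dir" ]; then
    rm -rf "$dir"
  fi
}

create_test_plan() {
  local dir="$1" name="${2:-test-plan}"   # default name is an assumption
  mkdir -p "$dir/docs/plans"
  cat > "$dir/docs/plans/$name.md" <<'EOF'
# Sample plan (placeholder content)

## Task 1: first task
## Task 2: second task
EOF
  echo "$dir/docs/plans/$name.md"
}

project=$(create_test_project)
plan=$(create_test_plan "$project" "demo")
echo "plan written to $plan"
cleanup_test_project "$project"
```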
Assertion helper diagram:
Sources: tests/claude-code/test-helpers.sh:1-203
test-subagent-driven-development.sh
Runs nine targeted checks against the subagent-driven-development skill by asking Claude questions and asserting on the text response. Each check takes roughly 15–30 seconds.
| Test # | Prompt topic | Key assertion |
|---|---|---|
| 1 | Skill loading | Response contains subagent-driven-development |
| 2 | Workflow order | spec.*compliance appears before code.*quality (assert_order) |
| 3 | Self-review | Contains self-review and completeness |
| 4 | Plan read count | Contains once and beginning / start |
| 5 | Reviewer skepticism | Contains not trust / skeptical / verify.*independently |
| 6 | Review loops | Contains loop and implementer.*fix |
| 7 | Task context | Contains provide.*directly; does NOT contain read.*file |
| 8 | Worktree prereq | Contains using-git-worktrees / worktree |
| 9 | Main branch warning | Contains worktree / not.*main / consent |
Sources: tests/claude-code/test-subagent-driven-development.sh:1-165
test-subagent-driven-development-integration.sh
A full end-to-end test that runs Claude against a real project. Expected duration: 10–30 minutes.
Setup steps:
1. Create a temporary project with create_test_project.
2. Add package.json (ESM Node project) and docs/plans/implementation-plan.md with two tasks (create add function, create multiply function).
3. Run claude -p with --allowed-tools=all and --permission-mode bypassPermissions, capturing output and session transcript.
Verification steps (post-execution):
| Test # | What is verified | Method |
|---|---|---|
| 1 | Skill was invoked | grep '"name":"Skill".*"skill":"superpowers:subagent-driven-development"' in .jsonl |
| 2 | Subagents dispatched | grep -c '"name":"Task"' ≥ 2 |
| 3 | Task tracking | grep -c '"name":"TodoWrite"' ≥ 1 |
| 6 | Files created | src/math.js exists; add and multiply exports present; test/math.test.js exists |
| 6 | Tests pass | npm test exits 0 |
| 7 | Git commits | git log --oneline has > 2 commits |
| 8 | No extra features | divide/power/subtract absent (spec compliance check) |
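The transcript checks in rows 2 and 3 reduce to counting matching lines and comparing against a minimum. A minimal sketch, assuming hypothetical helper names count_tool_calls and verify_min_calls (the real test may inline these greps):

```shell
#!/usr/bin/env bash
# Illustrative helpers for "tool was called at least N times" checks.

# count_tool_calls TRANSCRIPT TOOL: number of transcript lines naming the tool.
count_tool_calls() {
  local transcript="$1" tool="$2"
  # grep -c exits 1 when the count is 0, so mask the status and keep the number.
  grep -c "\"name\":\"$tool\"" "$transcript" || true
}

verify_min_calls() {
  local transcript="$1" tool="$2" minimum="$3" count
  count=$(count_tool_calls "$transcript" "$tool")
  if [ "$count" -ge "$minimum" ]; then
    echo "  [PASS] $tool appears on $count line(s)"
  else
    echo "  [FAIL] $tool appears on $count line(s), expected at least $minimum"
    return 1
  fi
}

# Demo against a throwaway transcript with two Task calls and one TodoWrite call.
demo=$(mktemp)
printf '%s\n' '{"name":"Task"}' '{"name":"TodoWrite"}' '{"name":"Task"}' > "$demo"
verify_min_calls "$demo" Task 2
verify_min_calls "$demo" TodoWrite 1
rm -f "$demo"
```

Note that grep -c counts matching lines rather than individual occurrences, which is adequate for a .jsonl file with one JSON object per line.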
After verification, the test calls analyze-token-usage.py to print a cost breakdown.
Session file location: The test searches ~/.claude/projects/<escaped-working-dir>/ for the most recent .jsonl file created in the last 60 minutes.
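Locating the transcript can be sketched as a search for the newest recently modified .jsonl file in a given directory. The function name find_recent_session is hypothetical, and the sketch uses modification time (find -mmin) as a stand-in for creation time; the real test may select the file differently.

```shell
#!/usr/bin/env bash
# Illustrative sketch: newest .jsonl in a directory, modified within 60 minutes.

find_recent_session() {
  local session_dir="$1" newest="" file
  while IFS= read -r file; do
    # Keep whichever candidate has the most recent modification time.
    if [ -z "$newest" ] || [ "$file" -nt "$newest" ]; then
      newest="$file"
    fi
  done < <(find "$session_dir" -maxdepth 1 -type f -name '*.jsonl' -mmin -60 2>/dev/null)

  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  else
    return 1   # nothing recent enough was found
  fi
}
```

A caller would pass the session directory, for example find_recent_session "$HOME/.claude/projects/<escaped-working-dir>", and treat a non-zero status as "no transcript found".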
Sources: tests/claude-code/test-subagent-driven-development-integration.sh:1-314
analyze-token-usage.py
analyze-token-usage.py takes a single .jsonl session transcript path and produces a per-agent token cost table.
How it works:
1. Parses each line of the .jsonl file as a JSON object.
2. For type == "assistant" entries, accumulates input_tokens, output_tokens, cache_creation_input_tokens, and cache_read_input_tokens into main_usage.
3. For type == "user" entries that contain a toolUseResult with both usage and agentId, accumulates into subagent_usage[agentId].
4. Takes each subagent's description from the prompt field.
5. Estimates cost at $3 / 1M input tokens and $15 / 1M output tokens.
Output format (tabular):
Agent Description Msgs Input Output Cache Cost
main Main session (coordinator) 4 12,345 2,100 9,000 $0.08
agent-abc123 implementer subagent for Task 1 6 45,000 5,200 20,000 $0.35
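The cost estimate is simple arithmetic on the two stated rates. The sketch below shows only that formula; the function name estimate_cost is hypothetical, and because the page does not say how cache tokens are priced, they are left out here.

```shell
#!/usr/bin/env bash
# Illustrative cost arithmetic: $3 per 1M input tokens, $15 per 1M output tokens.
# Cache tokens are intentionally ignored (their pricing is not stated above).

estimate_cost() {
  local input_tokens="$1" output_tokens="$2"
  awk -v i="$input_tokens" -v o="$output_tokens" \
    'BEGIN { printf "$%.2f\n", (i * 3 + o * 15) / 1000000 }'
}

estimate_cost 1000000 100000   # prints $4.50 (3.00 for input + 1.50 for output)
```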
Sources: tests/claude-code/analyze-token-usage.py:1-168
run-tests.sh
run-tests.sh accepts the same basic flags as the Claude Code runner (--verbose, --test, --integration).
Default tests (no external dependencies):
- test-plugin-loading.sh
- test-skills-core.sh
Integration tests (require OpenCode binary):
- test-tools.sh
- test-priority.sh
Sources: tests/opencode/run-tests.sh:61-75
setup.sh
Every OpenCode test sources setup.sh, which sets TEST_HOME to a temporary directory and overrides HOME to prevent tests from touching the real user home. It exports a cleanup_test_env function that is registered via trap ... EXIT in each test.
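The isolation pattern can be sketched as below. TEST_HOME, HOME, cleanup_test_env, and the EXIT trap come from the description above; the wrapper name setup_test_env and the directory created in the demo are assumptions, and the real setup.sh may do more (for example, installing plugin files).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an isolated test environment.

setup_test_env() {
  TEST_HOME=$(mktemp -d)
  export TEST_HOME
  export HOME="$TEST_HOME"   # everything under ~ now resolves inside the temp dir
}

cleanup_test_env() {
  if [ -n "${TEST_HOME:-}" ] && [ -d "$TEST_HOME" ]; then
    rm -rf "$TEST_HOME"
  fi
}

setup_test_env
trap cleanup_test_env EXIT   # runs on normal exit and on failure alike

mkdir -p "$HOME/.config/opencode/skills"
echo "HOME is now $HOME"
```

Registering the cleanup with trap rather than calling it at the end of the script means the temporary directory is removed even when an assertion aborts the test early.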
Sources: tests/opencode/test-skills-core.sh:12-15, tests/opencode/test-priority.sh:12-15
test-skills-core.sh
Tests the core skills library logic without requiring OpenCode. It inlines the library functions into Node.js one-liners and verifies them directly.
Test coverage:
| Test # | Function under test | What is verified |
|---|---|---|
| 1 | extractFrontmatter | Parses name and description from YAML frontmatter |
| 2 | stripFrontmatter | Removes the --- block; preserves body content |
| 3 | findSkillsInDir | Recursively discovers SKILL.md files up to maxDepth=3; finds nested skills |
| 4 | resolveSkillPath | Personal overrides superpowers; superpowers: prefix forces superpowers; unknown skill returns null |
| 5 | checkForUpdates | Returns false for repo without remote; returns false for non-existent or non-git dir |
Skill resolution logic diagram:
Sources: tests/opencode/test-skills-core.sh:17-440
test-tools.sh
Requires OpenCode installed and in PATH. If absent, the test prints [SKIP] and exits 0.
| Test # | What is verified |
|---|---|
| 1 | find_skills tool returns superpowers:brainstorming and superpowers:using-superpowers |
| 2 | use_skill tool loads personal-test skill with expected content marker |
| 3 | use_skill with superpowers:brainstorming returns brainstorming skill content |
Each test uses timeout 60s opencode run --print-logs "..." and greps the combined stdout/stderr.
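The skip-if-missing guard and the timeout-plus-grep pattern can be sketched as two small helpers. The names have_binary, output_matches, and main are hypothetical, and the prompt string is a placeholder; only the opencode run --print-logs invocation, the 60-second timeout, and the [SKIP] behavior come from the description above.

```shell
#!/usr/bin/env bash
# Illustrative helpers; not copied from test-tools.sh.

# have_binary NAME: succeeds when NAME is found in PATH.
have_binary() {
  command -v "$1" >/dev/null 2>&1
}

# output_matches PATTERN CMD...: run CMD under a 60s timeout and search
# its combined stdout/stderr for PATTERN.
output_matches() {
  local pattern="$1" output
  shift
  output=$(timeout 60s "$@" 2>&1) || true   # a failing command still gets grepped
  grep -q -- "$pattern" <<<"$output"
}

main() {
  if ! have_binary opencode; then
    echo "  [SKIP] OpenCode not installed"
    return 0
  fi
  # Placeholder prompt; the real test's wording may differ.
  output_matches "superpowers:brainstorming" \
    opencode run --print-logs "Use the find_skills tool to list available skills"
}
```

Calling main either runs the check or prints the skip message, so a machine without OpenCode reports success rather than failure.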
Sources: tests/opencode/test-tools.sh:1-104
test-priority.sh
Verifies the three-tier priority system (project > personal > superpowers) by creating identical skills named priority-test in three locations, each embedding a unique PRIORITY_MARKER_* string.
| Location | Path | Priority tier |
|---|---|---|
| Superpowers | ~/.config/opencode/superpowers/skills/priority-test/ | Lowest |
| Personal | ~/.config/opencode/skills/priority-test/ | Middle |
| Project | <TEST_HOME>/test-project/.opencode/skills/priority-test/ | Highest |
| Test # | CWD for opencode | Expected marker |
|---|---|---|
| 2 | $HOME (outside project) | PRIORITY_MARKER_PERSONAL_VERSION |
| 3 | <TEST_HOME>/test-project | PRIORITY_MARKER_PROJECT_VERSION |
| 4 | project dir, superpowers:priority-test | PRIORITY_MARKER_SUPERPOWERS_VERSION |
| 5 | $HOME, project:priority-test | Should fail / not found |
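The expected results in the table follow from a first-match search over the tiers, with a prefix restricting the search to one tier. The sketch below models that behavior in shell for illustration only: resolve_skill and its argument layout are hypothetical, and the plugin's actual resolution is implemented in its JavaScript library, not in this form.

```shell
#!/usr/bin/env bash
# Illustrative model of three-tier resolution (project > personal > superpowers).

# resolve_skill NAME PROJECT_DIR PERSONAL_DIR SUPERPOWERS_DIR
# Pass an empty PROJECT_DIR to model running outside a project.
resolve_skill() {
  local name="$1" project_dir="$2" personal_dir="$3" superpowers_dir="$4"
  local -a candidates
  local dir

  case "$name" in
    superpowers:*) name="${name#superpowers:}"; candidates=("$superpowers_dir") ;;
    project:*)     name="${name#project:}";     candidates=("$project_dir") ;;
    *)             candidates=("$project_dir" "$personal_dir" "$superpowers_dir") ;;
  esac

  for dir in "${candidates[@]}"; do
    # An empty entry means the tier is unavailable and is skipped.
    if [ -n "$dir" ] && [ -f "$dir/$name/SKILL.md" ]; then
      printf '%s\n' "$dir/$name/SKILL.md"
      return 0
    fi
  done
  return 1   # not found in any permitted tier
}

# Demo: the same skill name exists in all three throwaway tiers.
demo_root=$(mktemp -d)
for tier in project personal superpowers; do
  mkdir -p "$demo_root/$tier/priority-test"
  echo "marker for $tier tier" > "$demo_root/$tier/priority-test/SKILL.md"
done
resolve_skill priority-test "$demo_root/project" "$demo_root/personal" "$demo_root/superpowers"
resolve_skill priority-test "" "$demo_root/personal" "$demo_root/superpowers"
resolve_skill superpowers:priority-test "$demo_root/project" "$demo_root/personal" "$demo_root/superpowers"
rm -rf "$demo_root"
```

The three demo calls mirror tests 3, 2, and 4 in the table: inside the project the project copy wins, outside it the personal copy wins, and the superpowers: prefix bypasses both.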
Sources: tests/opencode/test-priority.sh:1-198
Sources: tests/claude-code/run-skill-tests.sh:99-163, tests/opencode/run-tests.sh:87-141, tests/claude-code/test-subagent-driven-development-integration.sh:150-170
| Command | What runs | Time |
|---|---|---|
| tests/claude-code/run-skill-tests.sh | Fast Claude Code skill tests | ~2 min |
| tests/claude-code/run-skill-tests.sh --integration | Full workflow execution | 10–30 min |
| tests/claude-code/run-skill-tests.sh --test test-subagent-driven-development.sh | Single named test | ~2 min |
| tests/claude-code/run-skill-tests.sh --verbose | All fast tests, full output | ~2 min |
| tests/opencode/run-tests.sh | Plugin loading + skills-core unit tests | <1 min |
| tests/opencode/run-tests.sh --integration | All tests including OpenCode tools | ~5 min |