Closed
47 commits
ee6ac16
feat: add smart-ralph scaffolding spec and task breakdown
claude Feb 9, 2026
19d9391
fix(commands): add disable-model-invocation to deepen-plan
claude Feb 9, 2026
5dd759e
fix(commands): add disable-model-invocation to feature-video
claude Feb 9, 2026
bdfdf4f
fix(commands): add disable-model-invocation to resolve_todo_parallel
claude Feb 9, 2026
ecc3eb6
fix(commands): add disable-model-invocation to test-browser
claude Feb 9, 2026
a67b582
fix(workflows): add disable-model-invocation to all 5 workflow commands
claude Feb 9, 2026
dcdf36e
fix(commands): add argument-hint to deploy-docs
claude Feb 9, 2026
7b4b932
chore: mark task 1.7 verification complete
claude Feb 9, 2026
95b604a
chore: pass Phase 1 quality checkpoint (task 1.8)
claude Feb 9, 2026
ed6d68b
fix(reproduce-bug): add agent-browser critical header and prerequisites
claude Feb 9, 2026
123c8e8
fix(reproduce-bug): replace all MCP tool refs with agent-browser CLI
claude Feb 9, 2026
b45e819
fix(reproduce-bug): add agent-browser CLI reference section
claude Feb 9, 2026
e665f39
chore: verify no stale MCP references remain (task 1.12)
claude Feb 9, 2026
3b02822
chore: create hooks directory structure
claude Feb 9, 2026
2264bc3
feat(hooks): add hooks.json configuration for safety guardrails
claude Feb 9, 2026
56fd4ad
feat(hooks): add validate-bash.sh for destructive command detection
claude Feb 9, 2026
401681e
feat(hooks): add protect-env-files.sh for secret file protection
claude Feb 9, 2026
9ec49f2
chore(hooks): make hook scripts executable
claude Feb 9, 2026
b3b4935
chore: update progress for task 1.17
claude Feb 9, 2026
32e5de9
chore: pass Phase 1 quality checkpoint
claude Feb 9, 2026
c141943
feat(work): add input validation for plan file path
claude Feb 9, 2026
25fa13c
feat(review): add input validation for PR number/branch/URL
claude Feb 9, 2026
4fb888a
feat(reproduce-bug): add input validation for issue number
claude Feb 9, 2026
d2e3a19
chore(spec): verify all validation error messages use What/Why/Fix fo…
claude Feb 9, 2026
0fd9f41
chore: pass input validation quality checkpoint
claude Feb 9, 2026
57da592
feat(ci): add command frontmatter validation tests
claude Feb 9, 2026
050ecba
feat(ci): add hook script unit tests (24 test cases)
claude Feb 9, 2026
9d60f5d
chore: verify full test suite passes (task 2.8)
claude Feb 9, 2026
922211e
chore: pass Phase 2 quality checkpoint
claude Feb 9, 2026
360a951
feat(work): add interactive plan picker with autonomous bypass
claude Feb 9, 2026
59ec095
feat(review): add interactive target selector with autonomous bypass
claude Feb 9, 2026
ed3d519
feat(compound): add interactive category confirmation with autonomous…
claude Feb 9, 2026
eb5c9a7
feat(plan): add L1/L2/L3 layer detection for idea refinement
claude Feb 9, 2026
9cd7d3e
fix(workflows): ensure AskUserQuestion design rules consistency
claude Feb 9, 2026
fff4336
chore: pass interactive patterns quality checkpoint
claude Feb 9, 2026
ac167ed
feat(plan): add state checkpoint for workflow resumability
claude Feb 9, 2026
594e735
feat(work): add state discovery for workflow resumability
claude Feb 9, 2026
a4ed553
feat(work): handle state management edge cases
claude Feb 9, 2026
6e868c2
chore: pass Phase 3 quality checkpoint
claude Feb 9, 2026
5221b82
chore: pass full test suite (task 4.1)
claude Feb 9, 2026
b8dda8e
chore: verify QA acceptance criteria (task 4.2)
claude Feb 9, 2026
f816b51
chore: pass final quality validation (task 4.3)
claude Feb 9, 2026
d10978b
chore: create PR #165 and verify CI (task 4.4)
claude Feb 9, 2026
3c1bc51
chore(spec): mark task 5.1 complete - PR #165 verified
claude Feb 9, 2026
4915b17
chore(ci): monitor CI status for PR #165
claude Feb 9, 2026
d84e579
chore(pr): address code review comments (none pending)
claude Feb 9, 2026
3d42c53
chore: pass final validation - all 45 tasks complete
claude Feb 9, 2026
87 changes: 87 additions & 0 deletions ai/tasks/ci-validation/BREAKDOWN.md
@@ -0,0 +1,87 @@
---
id: ci-validation.BREAKDOWN
module: ci-validation
priority: 4
status: pending
version: 1
origin: spec-workflow
dependsOn: [frontmatter-audit, hooks]
tags: [smart-ralph, compound-engineering]
---
# CI Validation

## Context

The CI pipeline (`ci.yml`) runs `bun test` but does not validate command markdown files -- no YAML frontmatter lint, no field completeness checks, no broken-reference detection. This module adds two test files to the existing `bun test` suite: one for command frontmatter validation (6 assertion groups across 24 commands) and one for hook script unit tests (24 test cases across 2 scripts). Both run automatically in CI with zero new dependencies.

## Tasks

1. **Create `tests/command-validation.test.ts`** -- Write a Bun test file that:
   - Discovers all `.md` files in `plugins/compound-engineering/commands/` and `commands/workflows/` via glob
   - Uses the existing `parseFrontmatter()` utility from `src/utils/frontmatter.ts`
   - Implements 6 assertion groups per command file:
     - **1.1**: YAML frontmatter parses without error (`parseFrontmatter()` returns non-empty data)
     - **1.2**: `name` field is a non-empty string
     - **1.3**: `description` field is a non-empty string
     - **1.4**: `argument-hint` field is a string
     - **1.5**: `disable-model-invocation` is boolean `true` -- UNLESS the command body contains `# ci-allow: model-invocation` (escape hatch)
     - **1.6**: Command body does not match any pattern in the `REMOVED_TOOL_PATTERNS` array (starting with `/mcp__plugin_compound-engineering_pw__/`)

2. **Add GitHub Actions inline annotations** -- For each test failure, emit `::error file=<relative-path>,line=1::<message>` via `console.error()` so GitHub shows annotations inline on PR diffs.

3. **Define `REMOVED_TOOL_PATTERNS` array** -- Create a constant array of regex patterns for known-removed tools. Start with `[/mcp__plugin_compound-engineering_pw__/]`. This array is extended when tools are deprecated in the future.

4. **Create `tests/hook-scripts.test.ts`** -- Write a Bun test file that tests both hook scripts using `Bun.spawn()`:
   - Implements a `runHook(script, input)` helper function that pipes JSON to stdin via `Bun.spawn()`, waits for completion, and returns `{ exitCode, stdout, stderr }`
   - **validate-bash.sh tests** (14 cases):
     - 2.1: Normal command (`ls -la`) -> allow (exit 0, no JSON)
     - 2.2: Non-force git push -> allow
     - 2.3: `git push --force origin feat/auth` -> ask, reason contains "Force push"
     - 2.4: `git push -f origin main` -> ask
     - 2.5: `git reset --hard HEAD~3` -> ask, reason contains "Hard reset"
     - 2.6: `rm -rf src/components` -> ask, reason contains "Recursive delete"
     - 2.7: `rm -fr dist/build` -> ask (flag reorder)
     - 2.8: `rm -rf /` -> deny, reason contains "Catastrophic"
     - 2.9: `rm -rf ~` -> deny
     - 2.10: `rm -rf $HOME` -> deny
     - 2.11: `rm -rf node_modules` -> allow (safe target)
     - 2.12: `rm -rf .cache` -> allow (safe target)
     - 2.13: Empty command -> allow
     - 2.14: `cd src && rm -rf dist` -> ask (chained command)
   - **protect-env-files.sh tests** (10 cases):
     - 2.15: `.env` -> ask
     - 2.16: `.env.local` -> ask
     - 2.17: `.env.production` -> ask
     - 2.18: `src/index.ts` -> allow
     - 2.19: `cert.pem` -> ask
     - 2.20: `private.key` -> ask
     - 2.21: `credentials.json` -> ask
     - 2.22: `secret.yml` -> ask
     - 2.23: `src/env-utils.ts` -> allow (similarly named safe file)
     - 2.24: Empty file_path -> allow

5. **Add `jq` availability check** -- As the first test in `hook-scripts.test.ts`, check if `jq` is available on the system. If not, skip all hook tests with a descriptive message: "jq is required for hook scripts. Install with: brew install jq (macOS) or apt-get install jq (Linux)".

6. **Verify both test files are picked up by `bun test`** -- Confirm tests run as part of the existing `bun test` command (which runs all `tests/*.test.ts`). No changes to `.github/workflows/ci.yml` needed.

7. **Run full test suite** -- Execute `bun test` and verify all new tests pass alongside the existing 8 tests (AC-11).
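
The per-file checks from tasks 1 and 3 could be sketched as a pure function over already-parsed frontmatter. This is a minimal sketch under stated assumptions, not the repository's test file: `CommandFile`, `checkCommand`, and the failure strings are hypothetical names, and `data` is assumed to be the object that `parseFrontmatter()` returns (its real signature is not shown in this spec).

```typescript
// Hypothetical shape: `data` stands in for whatever parseFrontmatter() returns.
type CommandFile = {
  path: string;
  data: Record<string, unknown>;
  body: string;
};

// Task 3: patterns for tools that have been removed from the plugin.
const REMOVED_TOOL_PATTERNS: RegExp[] = [/mcp__plugin_compound-engineering_pw__/];

// Escape hatch for check 1.5, as described in the task list.
const ALLOW_MODEL_INVOCATION = "# ci-allow: model-invocation";

const isNonEmptyString = (value: unknown): boolean =>
  typeof value === "string" && value.trim().length > 0;

// Returns one message per failed assertion group; an empty array means the file passes.
function checkCommand(file: CommandFile): string[] {
  const failures: string[] = [];
  const { data, body } = file;

  // 1.1: frontmatter parsed to a non-empty object
  if (Object.keys(data).length === 0) {
    return ["1.1 frontmatter is missing or empty"];
  }
  // 1.2 and 1.3: required non-empty strings
  if (!isNonEmptyString(data["name"])) failures.push("1.2 name must be a non-empty string");
  if (!isNonEmptyString(data["description"])) {
    failures.push("1.3 description must be a non-empty string");
  }
  // 1.4: argument-hint only has to be a string
  if (typeof data["argument-hint"] !== "string") {
    failures.push("1.4 argument-hint must be a string");
  }
  // 1.5: strict boolean check, so the string "true" fails unless the escape hatch is present
  const exempt = body.includes(ALLOW_MODEL_INVOCATION);
  if (!exempt && data["disable-model-invocation"] !== true) {
    failures.push("1.5 disable-model-invocation must be boolean true");
  }
  // 1.6: no references to removed tools
  for (const pattern of REMOVED_TOOL_PATTERNS) {
    if (pattern.test(body)) failures.push(`1.6 body references removed tool ${pattern}`);
  }
  return failures;
}
```

Keeping the checks in a function that returns messages, rather than asserting inline, would let the same messages feed both the test assertions and the inline annotations from task 2.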

## Acceptance Criteria

- AC-1 (from QA): All 24 commands have valid YAML frontmatter with name, description, argument-hint, and disable-model-invocation. command-validation.test.ts passes.
- AC-2 (from QA): No command body references removed Playwright MCP tools. Removed-pattern check passes.
- AC-3 (from QA): validate-bash.sh correctly handles all 14 test cases.
- AC-4 (from QA): protect-env-files.sh correctly handles all 10 test cases.
- AC-10 (from QA): `bun test` passes on both macOS and ubuntu-latest.
- AC-11 (from QA): All existing 8 tests continue to pass alongside new tests.
- Test failures include file path and Fix instruction for GitHub Actions inline annotations.
- The escape hatch (`# ci-allow: model-invocation` comment in command body) correctly exempts commands from the disable-model-invocation check.
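
The annotation line from task 2 could be built with a small helper like the one below. This is a sketch, not the repository's code: `formatErrorAnnotation` and the two escape helpers are hypothetical names. The escaping follows GitHub's workflow-command conventions (message data escapes `%`, CR, and LF; property values additionally escape `:` and `,`), which matters once a message carries a multi-line Fix instruction.

```typescript
// Escape message data for a GitHub Actions workflow command.
function escapeData(value: string): string {
  return value.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values (such as `file=`) also need ':' and ',' escaped.
function escapeProperty(value: string): string {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Builds the `::error file=<relative-path>,line=1::<message>` line from task 2.
function formatErrorAnnotation(relativePath: string, message: string): string {
  return `::error file=${escapeProperty(relativePath)},line=1::${escapeData(message)}`;
}
```

A failing check would then call something like `console.error(formatErrorAnnotation(path, "Missing name\nFix: add a name field"))`.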

## Files to Create/Modify

### New Files (2)

| File | Purpose |
|------|---------|
| `tests/command-validation.test.ts` | CI test: validates all 24 command files for frontmatter completeness, required fields, and removed tool references |
| `tests/hook-scripts.test.ts` | CI test: validates both hook scripts with 24 test cases covering all decision paths (allow, ask, deny) |
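
To assert on the three decision paths, the `{ exitCode, stdout, stderr }` result from `runHook()` has to be mapped to allow, ask, or deny. The sketch below shows one way to do that. It assumes the hooks print a JSON object with `hookSpecificOutput.permissionDecision` and `permissionDecisionReason`, as in Claude Code's PreToolUse hook output; this spec does not show the JSON shape, so treat those field names and the `interpretHookResult` helper as assumptions.

```typescript
type HookResult = { exitCode: number; stdout: string; stderr: string };
type HookDecision = { decision: "allow" | "ask" | "deny"; reason: string };

// Assumed output shape; the spec only says "exit 0, no JSON" means allow.
type HookOutput = {
  hookSpecificOutput?: {
    permissionDecision?: string;
    permissionDecisionReason?: string;
  };
};

function interpretHookResult(result: HookResult): HookDecision {
  if (result.exitCode !== 0) {
    throw new Error(`hook exited with ${result.exitCode}: ${result.stderr}`);
  }
  const output = result.stdout.trim();
  // Exit 0 with no JSON is the silent-allow path.
  if (output === "") return { decision: "allow", reason: "" };

  const parsed = JSON.parse(output) as HookOutput;
  const decision = parsed.hookSpecificOutput?.permissionDecision;
  const reason = parsed.hookSpecificOutput?.permissionDecisionReason ?? "";
  if (decision === "allow" || decision === "ask" || decision === "deny") {
    return { decision, reason };
  }
  throw new Error(`unrecognized hook decision: ${String(decision)}`);
}
```

Each of the 24 cases would then reduce to one call to `runHook()` followed by an assertion on `decision` and, where the case specifies it, a substring check on `reason`.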
63 changes: 63 additions & 0 deletions ai/tasks/frontmatter-audit/BREAKDOWN.md
@@ -0,0 +1,63 @@
---
id: frontmatter-audit.BREAKDOWN
module: frontmatter-audit
priority: 0
status: pending
version: 1
origin: spec-workflow
dependsOn: []
tags: [smart-ralph, compound-engineering]
---
# Frontmatter Audit

## Context

14 of 24 commands have `disable-model-invocation: true`, but 10 do not -- including all 5 workflow commands and 4 utility commands. This directly contributes to the 316% context budget bloat issue by allowing the model to auto-load command instructions unnecessarily. Additionally, 1 command (`deploy-docs.md`) is missing `argument-hint`. This module performs the lowest-effort, highest-impact change: the 10 single-line YAML frontmatter edits listed below (9 `disable-model-invocation` additions and 1 `argument-hint` addition), followed by a verification scan.

## Tasks

1. **Add `disable-model-invocation: true` to `commands/deepen-plan.md`** -- Insert `disable-model-invocation: true` into the YAML frontmatter block (between the `---` delimiters). Value must be boolean `true`, not string `"true"`.

2. **Add `disable-model-invocation: true` to `commands/feature-video.md`** -- Same single-line frontmatter addition.

3. **Add `disable-model-invocation: true` to `commands/resolve_todo_parallel.md`** -- Same single-line frontmatter addition.

4. **Add `disable-model-invocation: true` to `commands/test-browser.md`** -- Same single-line frontmatter addition.

5. **Add `disable-model-invocation: true` to `commands/workflows/brainstorm.md`** -- Same single-line frontmatter addition.

6. **Add `disable-model-invocation: true` to `commands/workflows/compound.md`** -- Same single-line frontmatter addition.

7. **Add `disable-model-invocation: true` to `commands/workflows/plan.md`** -- Same single-line frontmatter addition.

8. **Add `disable-model-invocation: true` to `commands/workflows/review.md`** -- Same single-line frontmatter addition.

9. **Add `disable-model-invocation: true` to `commands/workflows/work.md`** -- Same single-line frontmatter addition.

10. **Add `argument-hint` to `commands/deploy-docs.md`** -- Insert `argument-hint: "[optional: --dry-run to preview changes]"` into the YAML frontmatter.

11. **Verify all 24 commands now have both fields** -- Run a quick scan to confirm 24/24 commands have `argument-hint` and 24/24 have `disable-model-invocation: true`.

## Acceptance Criteria

- AC-1 (from QA): All 24 commands have valid YAML frontmatter with name, description, argument-hint, and disable-model-invocation. Verified by `bun test` (command-validation.test.ts) once CI module is complete.
- All frontmatter values use boolean `true`, not string `"true"`. The existing CLI parser checks `data["disable-model-invocation"] === true`.
- The `disable-model-invocation` flag only prevents model-initiated invocation, not explicit `/slash-command` invocation. The lfg/slfg chains use explicit slash syntax and must continue to work.
- Context budget does not regress (AC-7 from QA, verified during integration-testing module).

## Files to Create/Modify

### Modified Files (9 `disable-model-invocation` additions + 1 `argument-hint` addition)

| File | Change |
|------|--------|
| `plugins/compound-engineering/commands/deepen-plan.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/feature-video.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/resolve_todo_parallel.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/test-browser.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/deploy-docs.md` | Add `argument-hint: "[optional: --dry-run to preview changes]"` |
| `plugins/compound-engineering/commands/workflows/brainstorm.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/workflows/compound.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/workflows/plan.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/workflows/review.md` | Add `disable-model-invocation: true` |
| `plugins/compound-engineering/commands/workflows/work.md` | Add `disable-model-invocation: true` |
70 changes: 70 additions & 0 deletions ai/tasks/hooks/BREAKDOWN.md
@@ -0,0 +1,70 @@
---
id: hooks.BREAKDOWN
module: hooks
priority: 3
status: pending
version: 1
origin: spec-workflow
dependsOn: []
tags: [smart-ralph, compound-engineering]
---
# Safety Hooks

## Context

The compound-engineering plugin has zero hooks -- no PreToolUse or PostToolUse safety guardrails exist. Commands like `lfg` and `slfg` chain destructive operations (git push, rm -rf) without confirmation gates. This module creates the plugin's first hooks directory with two PreToolUse bash scripts that provide safety guardrails for destructive bash commands and sensitive file edits, using the "ask" decision mode for all operations except catastrophic deletes.

## Tasks

1. **Create `hooks/` directory structure** -- Create `plugins/compound-engineering/hooks/` and `plugins/compound-engineering/hooks/scripts/` directories.

2. **Create `hooks/hooks.json`** -- Write the hook configuration file with:
   - A `description` field: "Safety guardrails for compound-engineering plugin. Prompts for confirmation before destructive operations."
   - PreToolUse matcher for `Bash` tool -> runs `validate-bash.sh`
   - PreToolUse matcher for `Write|Edit` tools -> runs `protect-env-files.sh`
   - Use `${CLAUDE_PLUGIN_ROOT}/hooks/scripts/` for script paths
   - Timeout: 10s for bash validation, 5s for env file protection

3. **Create `hooks/scripts/validate-bash.sh`** -- Write the PreToolUse hook script that:
   - Reads JSON from stdin, extracts `tool_input.command` via `jq`
   - Pattern 1: Detects `git push --force` / `git push -f` -> returns "ask" with branch context
   - Pattern 2: Detects `git reset --hard` -> returns "ask" with warning about uncommitted changes
   - Pattern 3: Detects `rm -rf` / `rm -fr` (only these variants, not `rm -r`) -> three-tier logic:
     - Hard deny catastrophic targets: `/`, `~`, `$HOME`, `$CLAUDE_PROJECT_DIR`, `.`
     - Silent allow safe targets: `*/node_modules`, `*/.cache`, `*/tmp`, `*/__pycache__`, `*/.next`
     - Ask for everything else with target path in reason
   - All other commands: exit 0 (allow, no JSON output)
   - Uses `set -euo pipefail` for consistent error handling

4. **Create `hooks/scripts/protect-env-files.sh`** -- Write the PreToolUse hook script that:
   - Reads JSON from stdin, extracts `tool_input.file_path` via `jq`
   - Detects files matching the curated secrets pattern: `\.env($|\.)`, `\.pem$`, `\.key$`, `credentials`, `secret.*\.(json|yml|yaml)`
   - Returns "ask" with reason mentioning secrets for matching files
   - All other files: exit 0 (allow, no JSON output)
   - Empty file_path: exit 0

5. **Make hook scripts executable** -- Run `chmod +x` on both bash scripts.

6. **Verify hooks.json is discoverable** -- Confirm the file is at the path expected by the CLI's `loadHooks()` function (`hooks/hooks.json` relative to plugin root). The existing parser at `src/parsers/claude.ts:124` looks for this exact path.

7. **Test hooks locally** -- Verify hooks appear in Claude Code's `/hooks` menu as `[Plugin]` entries and fire correctly for test commands.
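
The decision logic from task 3 can be modeled as a small function. The actual hook is a bash script; this TypeScript version is only a sketch of the decision table, useful for reasoning about the 14 test cases. `classifyBashCommand`, the regexes, and the exact reason wording are assumptions, and safe targets are matched here by final path segment so that a bare `node_modules` is allowed as the test cases expect.

```typescript
type Decision = { decision: "allow" | "ask" | "deny"; reason?: string };

const CATASTROPHIC_TARGETS = new Set(["/", "~", "$HOME", "$CLAUDE_PROJECT_DIR", "."]);
const SAFE_DIR_NAMES = new Set(["node_modules", ".cache", "tmp", "__pycache__", ".next"]);

// Strip surrounding quotes and trailing slashes; a bare "/" stays "/".
function normalizeTarget(raw: string): string {
  return raw.replace(/^["']|["']$/g, "").replace(/\/+$/, "") || "/";
}

function classifyBashCommand(command: string): Decision {
  // Pattern 3: only the -rf / -fr flag spellings are considered, not plain -r.
  const rm = /\brm\s+-(?:rf|fr)\s+(\S+)/.exec(command);
  if (rm) {
    const target = normalizeTarget(rm[1] ?? "");
    if (CATASTROPHIC_TARGETS.has(target)) {
      return { decision: "deny", reason: `Catastrophic delete blocked: ${target}` };
    }
    const lastSegment = target.split("/").pop() ?? "";
    if (!SAFE_DIR_NAMES.has(lastSegment)) {
      return { decision: "ask", reason: `Recursive delete of ${target}` };
    }
    // Safe target: fall through so other patterns in the same command are still checked.
  }
  // Pattern 1: force push
  if (/\bgit\s+push\b.*\s(?:--force|-f)\b/.test(command)) {
    return { decision: "ask", reason: "Force push rewrites remote history" };
  }
  // Pattern 2: hard reset
  if (/\bgit\s+reset\b.*\s--hard\b/.test(command)) {
    return { decision: "ask", reason: "Hard reset discards uncommitted changes" };
  }
  return { decision: "allow" };
}
```

Checking the catastrophic tier first means a chained command such as `git push -f && rm -rf /` is denied rather than merely prompting.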

## Acceptance Criteria

- AC-3 (from QA): validate-bash.sh correctly handles all 14 test cases (normal commands allow, force push asks, hard reset asks, rm -rf meaningful path asks, rm -rf / denies, safe targets allow, etc.).
- AC-4 (from QA): protect-env-files.sh correctly handles all 10 test cases (.env asks, .env.local asks, .pem asks, .key asks, credentials asks, normal files allow, etc.).
- AC-12 (from QA): Hook scripts have executable permissions.
- Hooks use "ask" mode for all destructive operations (per PM Q2), never "deny" except for `rm -rf` / `rm -fr` on the catastrophic targets listed in task 3 (`/`, `~`, `$HOME`, `$CLAUDE_PROJECT_DIR`, `.`).
- Hooks fire for all tool calls including those from subagents (per UX Q2).
- Hook scripts use only `jq` and standard bash tools -- no new dependencies.
- Hook scripts complete within their configured timeouts (10s for validate-bash.sh, 5s for protect-env-files.sh).
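
The curated secrets pattern from task 4 can be checked against the 10 test cases with a single regular expression. The hook itself is a bash script, so this is a sketch of the matching rule only; `SECRET_FILE_PATTERN` and `classifyFilePath` are hypothetical names.

```typescript
// The five alternatives from the spec, combined into one unanchored pattern.
const SECRET_FILE_PATTERN =
  /\.env($|\.)|\.pem$|\.key$|credentials|secret.*\.(json|yml|yaml)/;

function classifyFilePath(filePath: string): "allow" | "ask" {
  // An empty file_path is allowed without prompting.
  if (filePath === "") return "allow";
  return SECRET_FILE_PATTERN.test(filePath) ? "ask" : "allow";
}

// Cases 2.15 through 2.24 from the CI validation breakdown.
const expectedDecisions: Array<[string, "allow" | "ask"]> = [
  [".env", "ask"],
  [".env.local", "ask"],
  [".env.production", "ask"],
  ["src/index.ts", "allow"],
  ["cert.pem", "ask"],
  ["private.key", "ask"],
  ["credentials.json", "ask"],
  ["secret.yml", "ask"],
  ["src/env-utils.ts", "allow"],
  ["", "allow"],
];
```

Because `credentials` and `secret.*` are unanchored, the pattern also prompts for paths that merely contain those words, such as `docs/credentials-guide.md`. With "ask" as the decision mode, the cost of such a false positive is one extra confirmation.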

## Files to Create/Modify

### New Files (3)

| File | Purpose |
|------|---------|
| `plugins/compound-engineering/hooks/hooks.json` | Hook configuration -- defines PreToolUse matchers for Bash and Write/Edit tools |
| `plugins/compound-engineering/hooks/scripts/validate-bash.sh` | PreToolUse hook: validates destructive bash commands (force push, hard reset, rm -rf) |
| `plugins/compound-engineering/hooks/scripts/protect-env-files.sh` | PreToolUse hook: protects .env, .pem, .key, credentials, and secret files from unintended edits |
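
The configuration described in task 2 might look like the object below, shown as a TypeScript constant that serializes to `hooks.json`. The `description` text, matchers, script paths, and timeouts come from the task list. The surrounding layout (`hooks` -> `PreToolUse` -> `matcher` / `hooks` / `type` / `command` / `timeout` in seconds) is an assumption based on Claude Code's hook configuration format and is not confirmed by this spec.

```typescript
// Assumed layout; verify against the hooks reference before relying on it.
const hooksConfig = {
  description:
    "Safety guardrails for compound-engineering plugin. Prompts for confirmation before destructive operations.",
  hooks: {
    PreToolUse: [
      {
        matcher: "Bash",
        hooks: [
          {
            type: "command",
            // Plain string, not a template literal: the variable is expanded by the hook runner.
            command: "${CLAUDE_PLUGIN_ROOT}/hooks/scripts/validate-bash.sh",
            timeout: 10,
          },
        ],
      },
      {
        matcher: "Write|Edit",
        hooks: [
          {
            type: "command",
            command: "${CLAUDE_PLUGIN_ROOT}/hooks/scripts/protect-env-files.sh",
            timeout: 5,
          },
        ],
      },
    ],
  },
};

// Serialized form that would be written to hooks/hooks.json.
const hooksJson = JSON.stringify(hooksConfig, null, 2);
```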
47 changes: 47 additions & 0 deletions ai/tasks/input-validation/BREAKDOWN.md
@@ -0,0 +1,47 @@
---
id: input-validation.BREAKDOWN
module: input-validation
priority: 2
status: pending
version: 1
origin: spec-workflow
dependsOn: [reproduce-bug-fix]
tags: [smart-ralph, compound-engineering]
---
# Input Validation

## Context

Commands accept `$ARGUMENTS` but never validate them before executing multi-step workflows. A missing PR number or invalid file path silently cascades through git and gh CLI calls, producing cryptic errors deep in workflows instead of clear early messages. This module adds instructional validation sections to the three highest-value commands, using the three-part What/Why/Fix error message format.

## Tasks

1. **Add Input Validation section to `workflows/work.md`** -- Insert an `## Input Validation` section before Phase 1. When `$ARGUMENTS` is provided, validate the plan file path: check file exists, ends in `.md`, is in `docs/plans/`. On failure, show What/Why/Fix error with `ls -1t docs/plans/*.md | head -5` to list available plans.

2. **Add Input Validation section to `workflows/review.md`** -- Insert an `## Input Validation` section before Main Tasks. Parse the argument as PR number (numeric), GitHub URL (extract PR number), branch name (check `git rev-parse --verify`), or keyword ("latest", "current"). On unrecognizable input, show What/Why/Fix error listing valid formats.

3. **Add Input Validation section to `reproduce-bug.md`** -- Insert an `## Input Validation` section early in the command. Validate that `$ARGUMENTS` is a numeric GitHub issue number. On non-numeric input, show What/Why/Fix error with correct usage example (`/reproduce-bug 42`). Optionally verify issue exists with `gh issue view`.

4. **Ensure validation is permissive** -- Each validation section must attempt to infer the argument type from its format before rejecting. For example, `/workflows:review` should accept `892`, `https://github.com/org/repo/pull/892`, `feat/user-auth`, and `current` -- only failing if no reasonable interpretation exists.

5. **Ensure all error messages include three parts** -- Every validation error must include: (a) What happened (clear statement), (b) Why (context about what the command expected), (c) Fix (actionable next step with usage example). This is critical for agent self-correction -- Claude reads the Why to understand and fix the issue.

6. **Wrap validation in `<input_validation>` tags** -- Use the tag pattern from TECH spec to clearly delineate the validation section in the command markdown. Include "If validation passes: Proceed to Phase 1" at the end.
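
The permissive parsing order from tasks 2 and 4 can be illustrated in code, even though the real validation lives as instructions in the command markdown rather than as a program. In this sketch, `ReviewTarget` and `parseReviewTarget` are hypothetical names, and a branch name is returned only as a candidate because confirming it requires `git rev-parse --verify`.

```typescript
type ReviewTarget =
  | { kind: "pr"; number: number }
  | { kind: "keyword"; value: "latest" | "current" }
  | { kind: "branch-candidate"; name: string }
  | { kind: "invalid" };

function parseReviewTarget(raw: string): ReviewTarget {
  const arg = raw.trim();

  // Plain PR number, e.g. "892"
  if (/^\d+$/.test(arg)) return { kind: "pr", number: Number(arg) };

  // GitHub pull request URL: extract the PR number
  const url = /^https:\/\/github\.com\/[^/\s]+\/[^/\s]+\/pull\/(\d+)/.exec(arg);
  if (url) return { kind: "pr", number: Number(url[1]) };

  // Keywords
  if (arg === "latest" || arg === "current") return { kind: "keyword", value: arg };

  // Anything that looks like a ref name is passed on for verification with git
  if (/^[\w.\/-]+$/.test(arg)) return { kind: "branch-candidate", name: arg };

  // No reasonable interpretation: the caller shows the What/Why/Fix error
  return { kind: "invalid" };
}
```

Under this ordering, `/workflows:review abc` is treated as a branch candidate first and is rejected only after the branch lookup fails.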

## Acceptance Criteria

- Manual test 3.3 (from QA): Running `/workflows:work nonexistent.md` produces a What/Why/Fix error with file-not-found message, explanation of plan file expectations, and path suggestion.
- Manual test 3.6 (from QA): Running `/workflows:review abc` produces a What/Why/Fix error for invalid PR number with correct format examples.
- Manual test 3.12 (from QA): Running `/reproduce-bug notanumber` produces a What/Why/Fix error for invalid issue number.
- NG-1 (from QA): Error messages follow What/Why/Fix format for all validation failures.
- Validation does not interfere with the happy path -- valid arguments proceed immediately with no additional prompts or delays.
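
The three-part error format could be represented as below. This is an illustration only: the `What:` / `Why:` / `Fix:` labels, the `ValidationError` type, and the example wording are assumptions, since the spec names the three parts but does not fix their exact layout.

```typescript
type ValidationError = { what: string; why: string; fix: string };

// Renders the three parts on separate lines so an agent can locate each one.
function formatValidationError(error: ValidationError): string {
  return [`What: ${error.what}`, `Why: ${error.why}`, `Fix: ${error.fix}`].join("\n");
}

// Hypothetical message for manual test 3.12 (`/reproduce-bug notanumber`).
const invalidIssueNumber = formatValidationError({
  what: 'The argument "notanumber" is not a valid issue number.',
  why: "/reproduce-bug expects a numeric GitHub issue number.",
  fix: "Run the command with a number, for example: /reproduce-bug 42",
});
```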

## Files to Create/Modify

### Modified Files (3)

| File | Change |
|------|--------|
| `plugins/compound-engineering/commands/workflows/work.md` | Add `## Input Validation` section with plan file path validation before Phase 1 |
| `plugins/compound-engineering/commands/workflows/review.md` | Add `## Input Validation` section with PR number/branch/URL validation before Main Tasks |
| `plugins/compound-engineering/commands/reproduce-bug.md` | Add `## Input Validation` section with issue number validation |