diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 259d15d..a2f1467 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "github", - "version": "1.4.1", + "version": "1.5.0", "description": "GitHub CI/CD automation plugin with autonomous fix loops, PR workflows, and code review", "author": { "name": "Ladislav Martincik", diff --git a/CHANGELOG.md b/CHANGELOG.md index 31609d7..51131eb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [1.5.0] - 2026-02-09 + +### Added + +- **`--swarm` flag for `/review-pr`** — opt-in agent team-based parallel PR review + - Spawns 4 specialized reviewer teammates: Security, Performance, Test Coverage, Architecture + - Cross-concern findings shared between reviewers via SendMessage + - Consolidated review with attribution markers (`[Security]`, `[Performance]`, etc.) + - Feature flag validation with graceful fallback to sequential mode + - Requires `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` + +- **`--swarm` flag for `/fix-ci`** (ci-fix-loop skill) — parallel CI error fixing + - Partitions errors by file path after centralized analysis + - Spawns one teammate per file partition (max 4) + - Lead collects all changes into single commit + push + - Falls through to sequential mode for single-file errors + - Team created/torn down within single fix iteration + ## [1.4.1] - 2026-02-07 ### Fixed diff --git a/commands/review-pr.md b/commands/review-pr.md index b66e46a..9241c2a 100644 --- a/commands/review-pr.md +++ b/commands/review-pr.md @@ -8,7 +8,318 @@ You are a comprehensive PR reviewer conducting a thorough analysis of a GitHub p User provided: `$ARGUMENTS` -## Workflow +## Argument Parsing + +The `$ARGUMENTS` input may contain both the PR number/URL and optional mode flags. Extract the intent: +- Identify if `--swarm` flag is present (indicates user wants parallel team review) +- Separate the flag from the PR identifier itself +- The PR identifier is the remaining text after flag extraction (PR number, URL, or owner/repo#number format) + +## Mode Selection + +**If user requested swarm mode** (via `--swarm` flag): Execute the **Swarm Workflow** below. +**Otherwise**: Execute the **Standard Workflow** below. + +--- + +## Swarm Workflow + +An alternative approach using agent teams for PR review that benefits from parallel specialized analysis. This works well for comprehensive reviews where different aspects (security, performance, testing, architecture) can be evaluated independently. + +### Team Prerequisites and Fallback + +Attempt to create the agent team using `TeamCreate` with a unique timestamped name: `review-pr-{number}-{YYYYMMDD-HHMMSS}` and description: "PR Review: {title}". + +If team creation fails (tool unavailable or experimental features disabled), inform the user that swarm mode requires agent teams to be enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` in settings.json), then fall back to executing the Standard Workflow instead. The PR identifier is already parsed and ready to use. + +### Phase 1: PR Information Gathering + +Execute the same PR information gathering steps as Standard Workflow Phase 1 (extract PR identifier, fetch metadata, checkout branch). This establishes the shared context that all teammates will need. + +After gathering PR information and checking out the branch, proceed to spawn teammates for parallel review. + +### Shared Task List + +Create tasks via `TaskCreate` that represent the specialized review areas: +1. **Security Review** — Identify security vulnerabilities and risks +2. **Performance Review** — Evaluate performance implications +3. **Test Coverage Review** — Assess test adequacy and quality +4. **Architecture Review** — Examine design patterns and code organization + +These tasks provide structure for parallel specialized reviews. + +### Teammate Roles and Spawn Protocol + +After completing Phase 1, spawn 4 specialized reviewer teammates via the `Task` tool with `team_name` parameter and `subagent_type: "general-purpose"`. Each teammate prompt MUST include as string literals (NOT references to conversation): + +**Required Context in Each Spawn Prompt**: +- PR title +- PR author +- Base branch name +- Head branch name +- Diff summary from `git diff $BASE_BRANCH...HEAD --stat` +- List of changed files +- For PRs with ≤50 changed files: Include the changed file contents (read them before spawning) +- For PRs with >50 changed files: Include file list only, instruct teammates to Read specific files they need to examine +- PR description/body + +**Teammate 1: Security Reviewer** + +``` +You are conducting a security-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for security concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for security vulnerabilities and risks. Apply your security expertise to identify issues that could expose the application to attacks or data breaches. + +Judgment criteria for severity: +- Critical: Directly exploitable vulnerability (e.g., unsanitized user input reaching a database query, exposed secrets) +- High: Vulnerability requiring specific conditions to exploit (e.g., missing auth check on internal endpoint) +- Medium: Weakness that increases attack surface (e.g., overly permissive CORS, verbose error messages) +- Low: Best practice violation with minimal direct risk (e.g., missing security headers) + +Common areas to examine: authentication flows, input handling, dependency changes, secrets in code, and authorization boundaries. Use your expertise to identify issues beyond this list. + +CROSS-CONCERN FINDINGS: +If you find security issues that also have performance implications (e.g., DoS potential), share via SendMessage: +"Security: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. + +COMPLETION: +When you've completed your security review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, severity level, vulnerability description, and remediation recommendation. Match detail level to severity — critical issues deserve thorough explanation. +``` + +**Teammate 2: Performance Reviewer** + +``` +You are conducting a performance-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for performance concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for performance implications. Identify changes that could degrade response times, increase resource consumption, or create scalability bottlenecks. + +Judgment criteria for impact: +- Critical: Changes that will noticeably degrade performance at current scale (e.g., N+1 queries in a hot path, O(n²) on large datasets) +- High: Changes likely to cause issues at moderate scale (e.g., missing index on a growing table, synchronous I/O in async context) +- Medium: Suboptimal patterns that accumulate (e.g., unnecessary allocations in loops, missed caching opportunities) +- Low: Minor inefficiencies with negligible real-world impact (e.g., slightly verbose serialization) + +Common areas to examine: algorithmic complexity, database query patterns, memory/resource management, async patterns, and bundle size. Use your expertise to identify issues beyond this list. + +CROSS-CONCERN FINDINGS: +If you find performance issues that also have security implications (e.g., DoS potential), share via SendMessage: +"Performance: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. + +COMPLETION: +When you've completed your performance review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, impact level, concern description, and optimization recommendation. Match detail level to impact — critical issues deserve thorough explanation. +``` + +**Teammate 3: Test Coverage Reviewer** + +``` +You are conducting a test coverage review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for test coverage. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for test adequacy and quality. Assess whether the test suite adequately covers the changes and would catch regressions. + +Judgment criteria for gap severity: +- Critical: Core functionality completely untested (e.g., new API endpoint with no tests, auth logic without coverage) +- High: Important edge cases missing (e.g., error handling paths, boundary conditions on critical logic) +- Medium: Test exists but is shallow or brittle (e.g., only tests happy path, uses implementation details) +- Low: Nice-to-have coverage improvements (e.g., additional assertion specificity, minor edge cases) + +Key questions to answer: Are tests meaningful (not just "it doesn't crash")? Do they cover error states and edge cases? Would they catch regressions if someone modifies this code later? Are tests maintainable and isolated? + +CROSS-CONCERN FINDINGS: +If you find test gaps that expose security or performance risks, share via SendMessage: +"Test Coverage: Missing tests at {file}:{line} also creates {other concern} risk — {explanation}" + +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. + +COMPLETION: +When you've completed your test coverage review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each gap, include file path with line numbers, severity, what test scenario is missing, and recommendation. Match detail level to severity — critical gaps deserve thorough explanation. +``` + +**Teammate 4: Architecture Reviewer** + +``` +You are conducting an architecture-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for architecture concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for architectural quality and design patterns. Evaluate whether the changes follow established codebase conventions and maintain a sustainable design. + +Judgment criteria for impact: +- Critical: Breaking changes to public APIs, circular dependencies introduced, or fundamental design violations that would be costly to fix later +- High: Patterns that deviate significantly from codebase conventions, tight coupling that limits extensibility, or poor separation of concerns +- Medium: Suboptimal design choices that work but create maintenance burden (e.g., logic in wrong layer, inconsistent abstractions) +- Low: Style-level architectural preferences with minimal real impact + +Key questions to answer: Does this follow the existing codebase's patterns? Are responsibilities clearly separated? Would a new developer understand the design intent? Are there breaking changes that affect consumers? + +CROSS-CONCERN FINDINGS: +If you find architectural issues that affect security, performance, or testability, share via SendMessage: +"Architecture: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. + +COMPLETION: +When you've completed your architecture review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, impact level, architectural concern, and improvement recommendation. Match detail level to impact — critical issues deserve thorough explanation. +``` + +### Completion Protocol + +Wait for all 4 teammates to signal completion by sending "REVIEW COMPLETE" messages. Timeout: 10 minutes from teammate spawn time. + +**If timeout occurs**: Proceed with available findings and note which teammates timed out in the consolidated review. + +**Fallback behavior**: If a teammate fails or gets stuck (repeated similar messages, no progress), you have three options: +1. Note the failure and proceed with other teammates' findings +2. Spawn a replacement teammate with clearer scoped instructions +3. Handle that review aspect yourself + +Choose based on how critical that specialized review is to the overall PR assessment. + +### Consolidation + +As team lead, integrate teammate findings into a comprehensive PR review. Your job is to synthesize specialized findings into the same output structure used in Standard Workflow, not mechanically merge outputs. + +**Reviewer Attribution**: Mark which teammate(s) found each issue: `[Security]`, `[Performance]`, `[Test Coverage]`, `[Architecture]`. Mark independently confirmed findings from multiple reviewers `[Consensus]`. + +**Output Format**: Use the same review structure as Standard Workflow Phase 3: +- Executive Summary (synthesize risk level based on all reviewer findings) +- Code Quality Analysis (incorporate findings from all specialized reviews) +- Detailed File-by-File Review (merge findings by file, preserve all reviewer attributions) +- Discussion & CI Review (same as standard workflow) +- Testing Verification (same as standard workflow) +- Recommendations & Action Items (merge and prioritize findings from all reviewers) + +**Cross-cutting Findings**: When findings from multiple reviewers relate to the same code location, highlight this: +"[Consensus] The dependency update at package.json:42 raises both security concerns (known CVE) and performance concerns (increased bundle size)" + +### Resource Cleanup + +After completing consolidation (whether successful or failed), always clean up team resources. + +Send shutdown requests to all teammates via `SendMessage` with `type: "shutdown_request"`, wait briefly for confirmations, then call `TeamDelete` to remove the team and its task list. + +If cleanup itself fails, inform the user: "Team cleanup incomplete. You may need to check for lingering team resources." + +Execute cleanup regardless of consolidation outcome—even if earlier steps errored or teammates timed out, cleanup must run before ending. + +--- + +## Standard Workflow + +The default PR review approach for comprehensive single-agent analysis. ### Phase 1: PR Information Gathering diff --git a/package.json b/package.json index a768de2..36a9e98 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "github-plugin", - "version": "1.4.1", + "version": "1.5.0", "description": "GitHub CI/CD automation plugin for Claude Code", "private": true, "type": "module", diff --git a/skills/ci-fix-loop/SKILL.md b/skills/ci-fix-loop/SKILL.md index b349ac8..3047424 100644 --- a/skills/ci-fix-loop/SKILL.md +++ b/skills/ci-fix-loop/SKILL.md @@ -11,9 +11,17 @@ Orchestrates autonomous CI repair: analyze → fix → commit → push → monit This skill is invoked when: - User runs `/fix-ci --loop` or `/fix-ci --auto` +- User runs `/fix-ci --swarm` - Multiple CI fix iterations are needed - User wants hands-off CI repair +## Argument Parsing + +The command arguments may contain optional flags that modify the fix loop behavior: +- Extract `--swarm` flag if present (indicates user wants parallel team-based fixing) +- Extract `--loop` or `--auto` flags (standard autonomous mode) +- The flags can be combined with other arguments + ## Configuration | Setting | Value | Description | @@ -127,19 +135,127 @@ if current_errors is empty: #### Step 2.5: Apply Fixes -Invoke the `ci-error-fixer` agent with error list: +**Condition Check**: Determine fix strategy based on error distribution and `--swarm` flag: + +``` +file_count = count of distinct files with errors +use_swarm = (file_count >= 2) AND (--swarm flag is set) +``` + +**If use_swarm is FALSE** (errors in single file OR --swarm not requested): +- Invoke the `ci-error-fixer` agent with error list - Applies targeted fixes based on error type - Shows diffs for each change - Reports fixed vs flagged-for-manual-review counts +- Track results: + ``` + errors_fixed = count of successfully fixed errors + errors_flagged = count of errors needing manual review + ``` + +**If use_swarm is TRUE** (2+ files with errors AND --swarm flag set): + +**Team Prerequisites and Fallback**: + +Attempt to create the agent team using `TeamCreate` with a unique timestamped name: `fix-ci-{YYYYMMDD-HHMMSS}` and description: "CI Fix Attempt {attempt}". + +If team creation fails (tool unavailable or experimental features disabled), inform the user that swarm mode requires agent teams to be enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` in settings.json), then fall back to the standard (non-swarm) fix approach using the existing `ci-error-fixer` agent as above. + +**Partitioning Strategy** (performed by lead after centralized analysis in Step 2.3): + +1. Group errors by file path (deterministic: sort errors by file path first) +2. If >4 distinct files, group by directory proximity to create max 4 partitions +3. Non-file-specific errors (dependency resolution, config issues, flaky tests without clear file association) remain with lead for sequential handling +4. Each partition contains: error list for those files, file paths, error types and messages + +**Teammate Spawn Protocol** (max 4 teammates, one per file partition): + +Spawn each teammate via the `Task` tool with `team_name` parameter and `subagent_type: "general-purpose"`. Each teammate prompt MUST include as string literals: + +``` +You are fixing CI errors for a specific set of files as part of a parallel fix team. + +YOUR FILE PARTITION: +{list of file paths assigned to this teammate} + +ERRORS TO FIX: +{error list for your files - type, file, line, message} + +FILE CONTENTS: +{Read and include relevant file contents} + +YOUR TASK: +Fix all errors in your assigned files. Apply targeted fixes based on error type: +- Lint errors: auto-fix with formatter/linter where possible +- Type errors: add type annotations, fix type mismatches +- Test failures: fix test logic or implementation bugs +- Build errors: fix import paths, missing dependencies + +WORKING CONSTRAINTS: +You're one of several agents editing files in parallel. This means: +- Edit only your assigned files — the lead handles git staging and commits to avoid conflicts between teammates. +- Don't run tests or builds — they'd interfere with other teammates' file changes happening concurrently. +- Communicate only via SendMessage (no user interaction available in team context). +- Make minimal, targeted fixes — over-editing risks introducing new errors and makes the lead's review harder. +- Read files completely without limit/offset so you don't miss relevant context. + +COMPLETION SIGNAL: +When you've fixed all errors in your files: +1. Send "FIX COMPLETE" via SendMessage +2. Wait for shutdown_request + +ERROR TYPES AND HANDLING: +{specific guidance based on error types in this partition} +``` + +**Completion Protocol**: + +Wait for all teammates to signal completion by sending "FIX COMPLETE" messages. Timeout: 10 minutes from teammate spawn time. -Track results: +If timeout occurs, proceed with available fixes and note which teammates timed out. + +**Lead Collects Changes**: + +After all teammates complete (or timeout), the lead stages ALL changes in a single commit and single push: + +```bash +git add . + +git commit -m "fix(ci): automated swarm fix attempt ${attempt} + +Errors addressed across ${file_count} files: +- ${error_summary_list} + +Attempt ${attempt} of ${max_attempts} (ci-fix-loop with swarm)" ``` -errors_fixed = count of successfully fixed errors + +This replaces the per-error serial commits from the non-swarm approach. + +**Resource Cleanup**: + +Always execute cleanup regardless of success or failure: +1. Send shutdown requests to all teammates via `SendMessage` with `type: "shutdown_request"` +2. Wait briefly for confirmations +3. Call `TeamDelete` to remove the team + +If cleanup fails, log warning: "Team cleanup incomplete. You may need to check for lingering team resources." + +Execute cleanup before proceeding to Step 2.6. + +**Track Results**: +``` +errors_fixed = count of successfully fixed errors across all teammates errors_flagged = count of errors needing manual review ``` +**Note**: The team is created and torn down within this single fix iteration (Step 2.5). The outer loop (Phase 2) continues as normal after this step completes. + #### Step 2.6: Commit & Push +**Note**: If swarm mode was used in Step 2.5, the commit and push were already performed by the lead after collecting teammate changes. Skip this step and proceed to Step 2.7. + +**Otherwise** (standard non-swarm mode): + Stage and commit changes: ```bash git add . @@ -315,6 +431,7 @@ echo "Pull and retry: git pull --rebase && /fix-ci --loop" 4. **Progress detection**: Abort if same errors repeat twice 5. **Timeout limits**: 30 min max CI wait per attempt 6. **Commit tracking**: Report all commits for easy revert +7. **Swarm mode constraints**: When using `--swarm`, teammates are restricted to file editing only - they cannot run git commands (commit/push/add), test commands, build commands, or use AskUserQuestion. This prevents coordination issues and ensures the lead maintains control of the git workflow. ## Token Efficiency