diff --git a/commands/address-pr-comments.md b/commands/address-pr-comments.md index 2419b71..0cc7e24 100644 --- a/commands/address-pr-comments.md +++ b/commands/address-pr-comments.md @@ -1,197 +1,239 @@ --- -description: Interactive PR comment resolution workflow +description: Interactive or autonomous PR comment resolution with mandatory replies --- # Address PR Comments Command -Interactive PR comment resolution workflow for addressing reviewer feedback efficiently. +Comprehensive PR comment resolution workflow that addresses reviewer feedback and posts replies to GitHub for every comment. User provided: `$ARGUMENTS` -## Workflow +## Workflow Overview -### 1. Fetch PR Data +This command fetches all PR comments, processes them according to mode (interactive/autonomous), applies fixes where possible, and **posts a reply to GitHub for every comment** — ensuring no feedback is left unacknowledged. -Extract PR number from arguments (could be number like "18" or URL like "https://github.com/owner/repo/pull/123"). +## Phase 1: Preflight Checks and Comment Fetching + +### 1.1 Preflight Checks + +Before any processing, verify the environment is ready. These checks prevent partial failures mid-processing (e.g., committing without git identity, posting replies without auth, or accidentally staging uncommitted work alongside PR fixes). + +1. **`gh auth status`** — The command posts replies to GitHub. Without authentication, it would process comments and apply fixes but fail silently when posting. Abort early with "Not authenticated. Run: gh auth login" +2. **`git status --porcelain`** — The command creates per-fix commits. If the working tree is dirty, those unrelated changes could get accidentally staged alongside PR fixes. Abort with "Working tree has uncommitted changes. Stash or commit changes first." +3. **`git config user.name` and `git config user.email`** — Required for creating commits. Without these, git will either fail or create commits with no author. Abort with instructions to configure. + +Fail fast on any of these — proceeding would cause confusing partial failures later. + +### 1.2 Extract PR Information + +Parse `$ARGUMENTS` to extract PR number and determine repository context: + +- If argument is a number (e.g., "123"): Use as PR number +- If argument is a GitHub URL (e.g., "https://github.com/owner/repo/pull/123"): Extract PR number +- If no argument: Abort with error "PR number or URL required" + +Determine repository owner and name from current directory or URL. + +### 1.3 Fetch All Comment Types + +Fetch all three types of PR comments using GitHub API with pagination: ```bash -# Get PR details and comments -gh pr view {PR_NUMBER} --json title,body,state,author,headRefName,baseRefName,url,reviews +# Get current authenticated user (for reply deduplication) +CURRENT_USER=$(gh api user --jq '.login') -# Get PR review comments (file/line comments) -gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/comments +# 1. Review comments (file/line-specific comments) +gh api --paginate repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/comments > review_comments.json -# Get issue comments (general PR comments) -gh api repos/{OWNER}/{REPO}/issues/{PR_NUMBER}/comments +# 2. Review summaries (top-level review comments) +gh api --paginate repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/reviews > review_summaries.json -# Checkout PR branch -gh pr checkout {PR_NUMBER} +# 3. Issue comments (general PR discussion) +gh api --paginate repos/{OWNER}/{REPO}/issues/{PR_NUMBER}/comments > issue_comments.json ``` -### 2. Analyze Comments +### 1.4 Normalize Comment Structure -Parse fetched comments and: -- Filter actionable comments (exclude author's own comments, focus on file/line refs) -- Read affected files for context -- Categorize by type: - - **CODE**: Logic changes, bug fixes, refactoring suggestions - - **STYLE**: Formatting, naming conventions, code style - - **DOCS**: Documentation improvements, comment clarity - - **TEST**: Test coverage, test improvements - - **QUESTIONS**: Clarifications needed +Parse and normalize all three types into a unified structure. Each comment should have: + +```javascript +{ + id: number, // Comment ID for reply targeting + type: string, // "review" | "review-summary" | "issue" + path: string | null, // File path (review comments only) + line: number | null, // Line number (review comments only) + diff_hunk: string | null, // Diff context (review comments only) + body: string, // Comment text + user: string, // Username of commenter + created_at: string, // Timestamp + in_reply_to_id: number | null, // Parent comment ID (null if top-level) + review_state: string | null // Review state for summaries (APPROVED, CHANGES_REQUESTED, etc.) +} +``` -### 2.5. Score Comments for Autonomous Mode +**Type-specific normalization:** -For each actionable comment, calculate confidence score (0-100) based on multiple factors: +1. **Review comments** (`type: "review"`): + - Extract from `review_comments.json` + - Has `path`, `line`, `diff_hunk` + - May have `in_reply_to_id` -**Scoring Factors:** +2. **Review summaries** (`type: "review-summary"`): + - Extract from `review_summaries.json` + - No file/line context + - Has `state` field (APPROVED, CHANGES_REQUESTED, COMMENTED, etc.) + - `body` may be empty/null -1. **Specificity** (0-30 points): - - Has file path AND line number: +20 points - - Has concrete code suggestion or example: +10 points - - Vague or general comment: +0 points +3. **Issue comments** (`type: "issue"`): + - Extract from `issue_comments.json` + - No file/line context + - General discussion -2. **Language Clarity** (0-25 points): - - Directive language ("Please change", "Must fix", "Should use"): +25 points - - Suggestion language ("Consider", "Maybe", "Could"): +15 points - - Question only ("Why did you...?"): +5 points +### 1.5 Filter Comments -3. **Category Confidence** (0-25 points): - - **STYLE** (formatting, naming): +25 points (highly automatable) - - **DOCS** (documentation, comments): +20 points - - **CODE** (logic, refactoring): +15 points (needs more care) - - **TEST** (test coverage): +10 points - - **QUESTIONS** (clarifications): +5 points +Apply filtering rules to determine which comments require processing: -4. **Reviewer Authority** (0-10 points): - - Maintainer or repository owner: +10 points - - Core contributor: +7 points - - Other contributors: +3 points +**EXCLUDE:** +- Comments where `in_reply_to_id != null` (already a reply in a thread) +- Comments where `user` matches PR author's username +- Comments from bots (user contains "bot" or "[bot]") +- Comments with empty/whitespace-only `body` -5. **Discussion Status** (0-10 points): - - Marked as "CHANGES_REQUESTED" or blocking: +10 points - - Unresolved discussion: +7 points - - Already resolved: +0 points (skip) +**INCLUDE:** +- All other comments from review comments, review summaries, and issue comments +- Questions, suggestions, praise, requests — everything gets processed -**Confidence Thresholds:** +**DO NOT filter out:** +- Questions (they need answers) +- Praise (they deserve acknowledgment) +- Out-of-scope comments (they need deferral responses) -- **High (80-100)**: Auto-address in autonomous mode - - Clear, specific, safe changes with concrete suggestions - - Example: "Use const instead of let on line 42" +### 1.6 Check for Existing Replies (Idempotency) -- **Medium (60-79)**: Present to user in interactive mode - - Reasonable suggestions but need human judgment - - Example: "Consider extracting this logic into a separate function" +For each comment that passed filtering, check if the current user has already replied: -- **Low (0-59)**: Skip or flag for manual review - - Vague, ambiguous, or complex changes - - Example: "This approach might have performance issues" +```bash +# For review comments (type: "review") +# Check if any reply in the comment thread is from CURRENT_USER +gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/comments/{COMMENT_ID} --jq '.user.login' +# Also check the thread by looking at all comments and filtering by in_reply_to_id -**Scoring Examples:** +# For issue comments (type: "issue") +# Check if any subsequent issue comment quotes/references this one from CURRENT_USER +# For review summaries (type: "review-summary") +# Check if any issue comment from CURRENT_USER references the review ID ``` -Comment: "Please use const instead of let for maxRetries on line 42" -- Specificity: 30 (file+line+suggestion) -- Language: 25 (directive) -- Category: 25 (STYLE) -- Authority: 10 (maintainer) -- Status: 10 (CHANGES_REQUESTED) -= TOTAL: 100 → High confidence, auto-address -Comment: "Consider refactoring this method for better readability" -- Specificity: 0 (no location, vague) -- Language: 15 (suggestion) -- Category: 15 (CODE) -- Authority: 3 (contributor) -- Status: 7 (unresolved) -= TOTAL: 40 → Low confidence, skip +**Mark as `already_replied: true`** if an existing reply from current user is found. These comments will be skipped from reply posting but shown in the summary. -Comment: "Add null check before accessing user.email on line 120" -- Specificity: 30 (file+line+concrete) -- Language: 25 (directive) -- Category: 15 (CODE) -- Authority: 10 (maintainer) -- Status: 10 (blocking) -= TOTAL: 90 → High confidence, auto-address -``` +### 1.7 Confidence Scoring + +Score each comment 0-100 based on how confidently you can apply the right fix without human judgment. This score drives autonomous mode filtering and prioritization in interactive mode. + +**Judgment dimensions:** + +- **Specificity**: Does the comment point to exact code and suggest a concrete change? A comment saying "use const instead of let on line 42" is near-certain; "consider refactoring" is not. Comments with `path`, `line`, and code examples score highest. +- **Clarity**: Is the intent unambiguous? Directive language ("please change", "must fix") is clearer than hedged language ("maybe consider", "could perhaps"). Questions without suggested changes score lowest. +- **Risk**: Is the fix low-risk (naming, formatting, documentation) or could it change behavior (logic, API contracts, test assertions)? Low-risk changes can be applied with higher confidence. +- **Authority**: Is the reviewer a maintainer with project context (`OWNER`, `COLLABORATOR`), or an outside contributor who may misunderstand the design? Maintainer feedback carries more weight. +- **Urgency**: Is this from a `CHANGES_REQUESTED` review (blocking merge) or a casual `COMMENTED` review? Blocking reviews deserve higher priority. + +**Thresholds:** + +- **80+**: Safe to auto-fix in autonomous mode — clear, specific, low-risk +- **60-79**: Present to user in interactive mode — reasonable but needs human judgment +- **Below 60**: Reply without fixing — too ambiguous to act on confidently + +Use your judgment. These dimensions are guidelines for holistic assessment, not arithmetic. A comment from a maintainer saying "please rename this variable" might score 95; a vague comment from a contributor saying "this could be better" might score 30. + +Store the score with each comment for filtering and reporting. -### 2.9. Detect Execution Mode +## Phase 2: Per-Comment Processing with Mandatory Replies -**Auto-detect if running in non-interactive environment:** +### 2.1 Detect Execution Mode -The command automatically detects whether to run in interactive or autonomous mode: +Auto-detect whether to run in interactive or autonomous mode: ```bash -# Check if stdin is a terminal (TTY) -if [ -t 0 ]; then +# Check for explicit mode override in arguments +if [[ "$ARGUMENTS" =~ (--autonomous|--auto|auto) ]]; then + MODE="autonomous" +elif [[ "$ARGUMENTS" =~ (--interactive|interactive) ]]; then MODE="interactive" -else +# Check for CI/CD environment +elif [[ -n "$CI" || -n "$GITHUB_ACTIONS" || -n "$JENKINS_URL" || -n "$GITLAB_CI" || -n "$CIRCLECI" || -n "$TRAVIS" ]]; then MODE="autonomous" -fi - -# Check for CI/CD environment variables -if [ -n "$CI" ] || [ -n "$GITHUB_ACTIONS" ] || [ -n "$JENKINS_URL" ] || [ -n "$GITLAB_CI" ]; then +# Check if stdin is a TTY +elif [[ ! -t 0 ]]; then MODE="autonomous" +else + MODE="interactive" fi ``` -**User can override detection with arguments:** -- Arguments contain "auto", "--autonomous", or "--auto" → Force autonomous mode -- Arguments contain "interactive" or "--interactive" → Force interactive mode +### 2.2 Create Safety Checkpoint (Autonomous Mode Only) + +**Before making any changes in autonomous mode**, create a rollback point: -**Mode Behaviors:** +```bash +TIMESTAMP=$(date +%Y%m%d_%H%M%S) +STASH_NAME="pre-autonomous-pr-${PR_NUMBER}-${TIMESTAMP}" -**Interactive Mode:** -- Use AskUserQuestion tool to present all comments -- User selects which items to address -- Show all comments regardless of confidence score -- Verbose explanations and confirmations +# Stash any uncommitted changes +if [ -n "$(git status --porcelain)" ]; then + git stash push -u -m "$STASH_NAME" + echo "✓ Uncommitted changes stashed as: $STASH_NAME" +fi -**Autonomous Mode:** -- Filter comments by confidence threshold (≥80) -- Auto-address high-confidence items without prompting -- Skip low/medium confidence items -- Generate detailed report of actions taken -- Create rollback checkpoint before changes +# Record current commit for reference +ROLLBACK_SHA=$(git rev-parse HEAD) +echo "✓ Safety checkpoint created at commit: $ROLLBACK_SHA" +``` -### 3. Present Options (Interactive Mode) OR Auto-Filter (Autonomous Mode) +### 2.3 Present Comments (Interactive Mode) **IF MODE = "interactive":** -Display actionable comments in organized format: +Display all comments (regardless of confidence score) in organized format: ``` -Found {N} comments to address on PR #{NUMBER}: {TITLE} +Found {N} comments on PR #{NUMBER}: {TITLE} -Actionable Items: +Comments to Review: ───────────────────────────────────────────────────────── -1. [CODE] {file_path}:{line_number} (Score: 85) - Reviewer: @{username} - Comment: "{comment text}" - Suggested change: {describe what needs to be done} +1. [CODE] {file_path}:{line} (Score: 85) — @{reviewer} + "{comment body excerpt}" + Suggested action: {what needs to be done} + +2. [STYLE] {file_path}:{line} (Score: 95) — @{reviewer} + "{comment body excerpt}" + Suggested action: {what needs to be done} -2. [STYLE] {file_path}:{line_number} (Score: 95) - Reviewer: @{username} - Comment: "{comment text}" - Suggested change: {describe what needs to be done} +3. [QUESTION] {file_path}:{line} (Score: 40) — @{reviewer} + "{comment body excerpt}" + Suggested action: Provide explanation -3. [DOCS] {file_path}:{line_number} (Score: 50) - Reviewer: @{username} - Comment: "{comment text}" - Suggested change: {describe what needs to be done} +4. [DOCS] No file specified (Score: 60) — @{reviewer} + "{comment body excerpt}" + Suggested action: Update documentation ───────────────────────────────────────────────────────── Which items would you like me to address? -Options: "1,3,4" | "1-5" | "all" | "1" +Options: "1,3,4" | "1-5" | "all" | "none" ``` -**Use AskUserQuestion tool** to present the selection options to the user. +**Use AskUserQuestion tool** to get user selection. + +Parse user response and mark selected comments for processing. + +### 2.4 Filter Comments (Autonomous Mode) **IF MODE = "autonomous":** -Filter and display only high-confidence comments (score ≥ 80): +Filter comments by confidence threshold (≥80) and display processing plan: ``` 🤖 AUTONOMOUS MODE ACTIVE @@ -199,405 +241,444 @@ Filter and display only high-confidence comments (score ≥ 80): Analyzing {N} total comments on PR #{NUMBER}: {TITLE} Confidence threshold: 80/100 -High-Confidence Items (will be addressed automatically): +High-Confidence Items (will be addressed): ───────────────────────────────────────────────────────── -1. [STYLE] src/utils.ts:42 (Score: 95) - Reviewer: @maintainer +1. [STYLE] src/utils.ts:42 (Score: 95) — @maintainer Comment: "Please use const instead of let for maxRetries" - Action: Change variable declaration + Action: Change variable declaration → will fix + reply -2. [DOCS] README.md:15 (Score: 88) - Reviewer: @maintainer - Comment: "Add installation instructions for Docker setup" - Action: Add documentation section - -3. [CODE] src/auth.ts:120 (Score: 90) - Reviewer: @owner +2. [CODE] src/auth.ts:120 (Score: 90) — @owner Comment: "Add null check before accessing user.email" - Action: Add safety check + Action: Add safety check → will fix + reply -Skipped Items (below confidence threshold): +Below-Threshold Items (will reply without fix): ───────────────────────────────────────────────────────── -4. [CODE] src/api.ts:55 (Score: 65) - Reviewer: @contributor +3. [CODE] src/api.ts:55 (Score: 65) — @contributor Comment: "Consider refactoring this for better performance" - Reason: Vague suggestion - medium confidence, requires human review - -5. [TEST] tests/auth.test.ts (Score: 45) - Reviewer: @contributor - Comment: "Why didn't you add tests for the error case?" - Reason: Question without clear action - low confidence + Action: Reply requesting clarification -6. [QUESTIONS] src/models.ts:78 (Score: 35) - Reviewer: @contributor +4. [QUESTION] src/models.ts:78 (Score: 35) — @contributor Comment: "This approach might have issues" - Reason: Ambiguous feedback - low confidence + Action: Reply requesting specific concerns ───────────────────────────────────────────────────────── -Processing 3 high-confidence items automatically... +Processing {X} high-confidence items + replying to all {N} comments... ``` -Proceed to address only the high-confidence items (score ≥ 80). +### 2.5 Process Each Comment -### 3.5. Create Safety Checkpoint (Autonomous Mode Only) +For each comment (whether selected in interactive mode or high-confidence in autonomous mode), follow this workflow: -**Before making any changes in autonomous mode**, create a rollback point: +#### Category: `actionable/clear` (Score ≥ 80, has file+line) -```bash -# Create timestamp for unique identification -TIMESTAMP=$(date +%Y%m%d_%H%M%S) -STASH_NAME="pre-autonomous-pr-${PR_NUMBER}-${TIMESTAMP}" +**Goal:** Apply the reviewer's suggestion correctly, commit it atomically, and reply with the commit SHA. -# Stash any uncommitted changes (including untracked files) -if [ -n "$(git status --porcelain)" ]; then - git stash push -u -m "$STASH_NAME" - echo "✓ Uncommitted changes stashed" -fi +**Key principles:** -# Record the current commit SHA for reference -ROLLBACK_SHA=$(git rev-parse HEAD) +- **Locate code by context, not line numbers.** The `diff_hunk` from the comment shows surrounding code context. Use this to find the right location because line numbers may have shifted from earlier fixes on the same file. If the context can't be found, treat as `missing-file`. +- **One commit per fix** so each fix is traceable and independently revertable. Stage only the file you changed (`git add {path}` — never `git add .` or `git add -A`, because that risks staging unrelated changes). +- **Verify before committing.** Check `git diff --quiet -- "{path}"` after applying the edit. If the file didn't actually change, the code already matches the suggestion — reply accordingly instead of creating an empty commit. +- **If the file is missing or renamed**, don't guess — categorize as `missing-file` and reply asking the reviewer to confirm relevance. -echo "✓ Safety checkpoint created" -echo " Stash name: $STASH_NAME" -echo " Base commit: $ROLLBACK_SHA" -echo "" +**Commit message format:** `fix: {description}` with body referencing reviewer and comment ID, because this creates an audit trail linking commits to specific review feedback: ``` +fix: {brief description} -**Rollback mechanism:** -- All changes are made on the current working tree -- User can easily undo with: `git stash apply stash^{/$STASH_NAME}` -- If satisfied, user can drop the stash: `git stash drop stash^{/$STASH_NAME}` +Addresses comment from @{reviewer} +Comment ID: {id} +``` -Include rollback instructions in the final report. +Capture the commit SHA via `git rev-parse --short HEAD` and include it in the reply so the reviewer can click through to see exactly what changed. -### 4. Address Changes +Use your judgment on fix complexity. If the suggestion requires changes across multiple files or has behavioral implications you're unsure about, categorize as `actionable/unclear` and ask for clarification instead of guessing. -For each selected item: +#### Category: `actionable/unclear` (Score 60-79, or unclear suggestion) -1. **Show context**: Display the relevant file section and explain the requested change -2. **Read file**: Use Read tool to get current file content -3. **Apply change**: Use Edit tool to make the requested modification -4. **Report**: `✓ Addressed item {N}: {description}` +Reply asking for clarification. Show the reviewer you understood their comment by offering specific interpretations of what they might mean, rather than a generic "please clarify." This narrows the discussion and helps them respond quickly. -Example workflow per item: -``` -Addressing item 2: [STYLE] src/utils.ts:42 -Comment: "Please use const instead of let for immutable variables" +No commit — just reply. -Reading src/utils.ts... -Found line 42: let maxRetries = 3; +#### Category: `not-actionable/question` -Applying change: Converting let to const... -✓ Changed line 42 from 'let' to 'const' +Read the relevant code (if the comment has a `path`, read that file for context) and answer the actual question. Explain *why* the code is written that way based on what you see. If you don't know, say so honestly and suggest who might. -✓ Addressed item 2: Changed variable declaration to const -``` +No commit — just reply. -### 5. Summary +#### Category: `not-actionable/praise` -After addressing all selected items: +Brief acknowledgment. Don't over-elaborate — match the reviewer's energy. -```bash -# Show what changed -git status --short -git diff --stat -``` +No commit — just reply. -**Interactive Mode Report:** +#### Category: `not-actionable/out-of-scope` -``` -📊 Summary - ✅ {N} comments addressed - 📝 {M} files modified +Acknowledge the value of the suggestion and explain why it's being deferred, not dismissed. If you can identify a follow-up context (future PR, separate issue), mention it. -Changes: - • {file1}: {description} - • {file2}: {description} +No commit — just reply. -Remaining: {X} comments still need attention +#### Category: `below-threshold` (Autonomous mode only, Score < 80) -Next steps: - 1. Review changes: git diff - 2. Run tests: {test command if known} - 3. Commit: git add . && git commit -m "fix: address PR review comments" - 4. Push: git push -``` +Be honest about why you didn't auto-fix. Explain the specific reason the confidence was low (e.g., "suggestion lacks concrete implementation details" or "change could affect behavior in ways I can't verify"). The reviewer should understand what to do next. -**Autonomous Mode Report:** +No commit — just reply. -``` -🤖 AUTONOMOUS EXECUTION COMPLETE +#### Category: `missing-file` -📊 Statistics: - ✅ {N} high-confidence comments addressed (score ≥ 80) - ⏭️ {M} comments skipped (below confidence threshold) - 📝 {X} files modified - ⏱️ Execution time: {duration} +The file referenced in the comment doesn't exist. Reply noting it may have been moved or renamed, and ask the reviewer to confirm if the comment is still relevant. -High-Confidence Items Addressed: -────────────────────────────────────────────────────────── +No commit — just reply. -1. ✓ [STYLE] src/utils.ts:42 (Score: 95) - Comment: "Please use const instead of let for maxRetries" - Applied: Changed 'let maxRetries = 3' to 'const maxRetries = 3' +#### Interactive Mode: User Did Not Select -2. ✓ [DOCS] README.md:15 (Score: 88) - Comment: "Add installation instructions for Docker setup" - Applied: Added Docker setup section with installation steps +For comments the user chose not to address, reply noting the comment has been deferred to manual review. -3. ✓ [CODE] src/auth.ts:120 (Score: 90) - Comment: "Add null check before accessing user.email" - Applied: Added 'if (user && user.email)' check +No commit — just reply. -Skipped Items Requiring Human Review: -────────────────────────────────────────────────────────── +### Reply Quality Criteria -4. ⏭️ [CODE] src/api.ts:55 (Score: 65 - Medium confidence) - Comment: "Consider refactoring this for better performance" - Reason: Vague suggestion without concrete implementation details - Action: Recommend manual review by developer +Every reply should be **contextual and useful** — not generic boilerplate. The reviewer should feel their specific feedback was understood, not that a bot processed it. -5. ⏭️ [TEST] tests/auth.test.ts (Score: 45 - Low confidence) - Comment: "Why didn't you add tests for the error case?" - Reason: Question format, no clear action specified - Action: Requires clarification with reviewer +**Judgment criteria:** +- **Actionable fixes**: Reference the specific commit SHA. Briefly describe what changed so the reviewer doesn't have to click through. +- **Clarification requests**: Offer specific interpretations rather than "please clarify." Show you read the comment. +- **Questions**: Answer the actual question with code context. Don't dodge with "great question!" +- **Praise/acknowledgment**: Brief and genuine. One sentence is fine. +- **Deferrals**: Explain *why* it's deferred, not just *that* it's deferred. -6. ⏭️ [QUESTIONS] src/models.ts:78 (Score: 35 - Low confidence) - Comment: "This approach might have issues" - Reason: Ambiguous feedback without specifics - Action: Needs detailed discussion with team +Adapt tone to the reviewer's tone. Match their formality level. A casual "LGTM" deserves a casual acknowledgment, not a formal response. + +### 2.6 Skip Already-Replied Comments + +If `already_replied: true` (detected in Phase 1.6), skip reply posting but include in summary report. + +## Phase 3: Reply Posting Logic + +### 3.1 Safe Reply Body Construction + +**CRITICAL: Prevent shell injection and malformed JSON.** -🔄 Rollback Information: -────────────────────────────────────────────────────────── - Stash: pre-autonomous-pr-${PR_NUMBER}-${TIMESTAMP} - Base commit: ${ROLLBACK_SHA} +Always construct reply bodies using `jq` with `--arg` flag and pipe to `gh api --input -`: - To undo all changes: - git reset --hard ${ROLLBACK_SHA} - git stash apply stash^{/pre-autonomous-pr-${PR_NUMBER}-${TIMESTAMP}} +```bash +# CORRECT METHOD (prevents injection): +jq -n --arg body "$REPLY_TEXT" '{body: $body}' | \ + gh api -X POST "repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/comments/{COMMENT_ID}/replies" --input - + +# NEVER USE THIS (unsafe): +gh api -X POST "..." -f body="$REPLY_TEXT" +``` + +### 3.2 Reply to Review Comments (type: "review") - To keep changes and clean up: - git stash drop stash^{/pre-autonomous-pr-${PR_NUMBER}-${TIMESTAMP}} +For comments with `type: "review"`: -📋 Next Steps: -────────────────────────────────────────────────────────── - 1. Review autonomous changes: git diff - 2. Run tests to verify correctness: {test_command} - 3. Review skipped items manually (shown above) - 4. If satisfied, commit: - git add . && git commit -m "fix: address PR comments (autonomous: ${N} items)" - 5. Push changes: git push - 6. If issues found, rollback using instructions above +```bash +ENDPOINT="repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/comments/${COMMENT_ID}/replies" + +jq -n --arg body "$REPLY" '{body: $body}' | \ + gh api -X POST "$ENDPOINT" --input - ``` -## Rules +### 3.3 Reply to Issue Comments (type: "issue") + +For comments with `type: "issue"`: + +1. **Format reply with quote** (truncate original to 200 chars): + ```bash + QUOTED_ORIGINAL=$(echo "$ORIGINAL_BODY" | head -c 200) + if [ ${#ORIGINAL_BODY} -gt 200 ]; then + QUOTED_ORIGINAL="${QUOTED_ORIGINAL}..." + fi -### Prioritization + FORMATTED_REPLY=$(cat < ${QUOTED_ORIGINAL} -**High Priority** (address first): -- File/line-specific comments with concrete suggestions -- Comments using directive language ("Please change...", "Must fix...") -- Maintainer/owner feedback -- Unresolved discussions marked as blocking + ${REPLY} + EOF + ) + ``` -**Lower Priority**: -- General discussion comments -- Questions without suggested changes -- Already resolved threads -- Praise/approval comments +2. **Post as new issue comment**: + ```bash + ENDPOINT="repos/${OWNER}/${REPO}/issues/${PR_NUMBER}/comments" -### Skip These + jq -n --arg body "$FORMATTED_REPLY" '{body: $body}' | \ + gh api -X POST "$ENDPOINT" --input - + ``` -Don't process: -- Info-only comments without action items -- Questions that just need answers (not code changes) -- Comments marked as resolved or outdated -- Praise/acknowledgment comments -- Comments from the PR author themselves +### 3.4 Reply to Review Summaries (type: "review-summary") -### Safety +For comments with `type: "review-summary"`: -- **No automatic git operations**: Never auto-commit or push -- **Show before doing**: Display planned changes before applying -- **Preserve context**: Don't remove important code/comments -- **Ask when unclear**: Use AskUserQuestion if comment is ambiguous +1. **Format reply referencing the review**: + ```bash + # Handle empty/null body + if [ -z "$ORIGINAL_BODY" ] || [ "$ORIGINAL_BODY" = "null" ]; then + QUOTED_TEXT="No summary text provided — addressing based on review state (${REVIEW_STATE})." + else + QUOTED_ORIGINAL=$(echo "$ORIGINAL_BODY" | head -c 200) + if [ ${#ORIGINAL_BODY} -gt 200 ]; then + QUOTED_ORIGINAL="${QUOTED_ORIGINAL}..." + fi + QUOTED_TEXT="$QUOTED_ORIGINAL" + fi -## Usage + FORMATTED_REPLY=$(cat < From @${REVIEWER}'s review (${REVIEW_STATE}): + > ${QUOTED_TEXT} -**Interactive Mode (default when run in terminal):** + ${REPLY} + EOF + ) + ``` + +2. **Post as issue comment** (review summaries don't have reply threads): + ```bash + ENDPOINT="repos/${OWNER}/${REPO}/issues/${PR_NUMBER}/comments" + + jq -n --arg body "$FORMATTED_REPLY" '{body: $body}' | \ + gh api -X POST "$ENDPOINT" --input - + ``` + +### 3.5 Error Handling for Reply Posting + +Handle API errors gracefully with retries: ```bash -/address-pr-comments 18 # PR number -/address-pr-comments https://github.com/owner/repo/pull/123 # PR URL +# Retry logic with exponential backoff +MAX_RETRIES=3 +RETRY_COUNT=0 +BACKOFF=1 + +while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do + HTTP_CODE=$(jq -n --arg body "$REPLY" '{body: $body}' | \ + gh api -X POST "$ENDPOINT" --input - -i 2>&1 | grep "HTTP" | awk '{print $2}') + + if [ "$HTTP_CODE" = "201" ] || [ "$HTTP_CODE" = "200" ]; then + echo "✓ Reply posted successfully" + break + elif [ "$HTTP_CODE" = "403" ] || [ "$HTTP_CODE" = "429" ]; then + # Rate limited or forbidden - check Retry-After header + RETRY_AFTER=$(gh api -i ... | grep -i "retry-after" | awk '{print $2}') + WAIT_TIME=${RETRY_AFTER:-$BACKOFF} + echo "⏳ Rate limited, waiting ${WAIT_TIME}s before retry..." + sleep $WAIT_TIME + BACKOFF=$((BACKOFF * 2)) + RETRY_COUNT=$((RETRY_COUNT + 1)) + else + echo "❌ Failed to post reply (HTTP $HTTP_CODE), skipping..." + break + fi +done + +if [ $RETRY_COUNT -eq $MAX_RETRIES ]; then + echo "❌ Max retries reached, reply not posted" +fi ``` -**Autonomous Mode (auto-detected in CI/CD environments):** +## Phase 4: Summary Report -The command automatically switches to autonomous mode when: -- Running in CI/CD environment (CI=true, GITHUB_ACTIONS=true, JENKINS_URL, GITLAB_CI, etc.) -- stdin is not a TTY (piped or redirected input) +**Goal:** Give the user a clear picture of what happened — what was fixed, what was replied to, and what needs their attention. -```bash -# Runs autonomously if detected in CI environment -/address-pr-comments 18 +### What to Include -# In CI/CD workflow: -- name: Address PR Comments - run: claude /address-pr-comments ${{ github.event.pull_request.number }} - # Automatically runs in autonomous mode -``` +- **Statistics**: Fixes applied, replies posted, items skipped (already replied) +- **Per-comment detail**: For each comment, show the action taken and reply status +- **Next steps**: What the user should do now (review changes, run tests, push) +- **Rollback instructions** (autonomous mode): How to undo commits, with a note that GitHub replies cannot be rolled back -**Force Specific Mode:** +### How to Structure -```bash -# Force autonomous mode (even in interactive terminal) -/address-pr-comments 18 --autonomous -/address-pr-comments 18 --auto -/address-pr-comments 18 auto +Separate comments into clear groups so the user can scan quickly: +1. **Comments Fixed + Replied** — code was changed, committed, and reply posted with SHA +2. **Comments Replied Without Fix** — questions answered, praise acknowledged, clarifications requested, or items deferred +3. **Already Replied (skipped)** — idempotency caught these; no action taken -# Force interactive mode (even in CI/CD) -/address-pr-comments 18 --interactive -/address-pr-comments 18 interactive -``` +For autonomous mode, include the rollback stash name and base commit SHA. + +### Formatting Guidance -**Environment Variables for Mode Detection:** +Adapt detail level to the volume of comments: +- **Few comments (< 10)**: Show each comment inline with its action and reply +- **Many comments (10+)**: Summarize by category with counts, list only the fixed items individually -The command checks these environment variables to detect CI/CD: -- `CI=true` (generic CI indicator) -- `GITHUB_ACTIONS=true` (GitHub Actions) -- `JENKINS_URL` (Jenkins) -- `GITLAB_CI` (GitLab CI) -- `CIRCLECI=true` (CircleCI) -- `TRAVIS=true` (Travis CI) +Next steps should omit "post replies manually" (replies are already posted). Focus on: review the diff, run tests, push. + +Note clearly that commits can be reverted but replies posted to GitHub are permanent. ## Error Handling -### PR doesn't exist +### PR Not Found ``` -❌ Error: PR #{NUMBER} not found +❌ Error: PR #{NUMBER} not found in {OWNER}/{REPO} Verify the PR number/URL and try again. Tip: Use `gh pr list` to see available PRs ``` -### No actionable comments +### Not Authenticated +``` +❌ Error: Not authenticated with GitHub CLI + Run: gh auth login +``` + +### Working Tree Dirty ``` -✅ No actionable comments found on PR #{NUMBER} - All comments are either: - - Already addressed - - Questions/discussions without code change requests - - From the PR author +❌ Error: Working tree has uncommitted changes + Stash them first: git stash + Or commit them before addressing PR comments ``` -### Not authenticated with GitHub +### Git User Not Configured ``` -❌ Error: Not authenticated with GitHub CLI - Run: gh auth login +❌ Error: Git user not configured + Run: + git config user.name 'Your Name' + git config user.email 'your@email.com' ``` -### Uncommitted changes +### No Comments Found +``` +✅ No comments found on PR #{NUMBER} + All reviewers are satisfied or comments have been addressed. ``` -⚠️ WARNING: You have uncommitted changes - Stash them first: git stash - Or commit them before addressing PR comments - Continue anyway? (not recommended) +### All Comments Already Replied +``` +✅ All comments on PR #{NUMBER} have already been replied to + Nothing to process. ``` -### File not found +### File Not Found ``` -⚠️ Comment references {file_path} which doesn't exist +⚠️ Warning: Comment references {file_path} which doesn't exist This comment may be outdated or refers to a renamed file. - Skipping... + Reply posted: "This file appears to have been moved or renamed..." ``` -### Ambiguous comment +### API Rate Limit +``` +⏳ Rate limited by GitHub API + Waiting {N} seconds before retry (attempt {X}/{MAX})... ``` -❓ Comment unclear: "{comment text}" - What change would you like me to make? - [Use AskUserQuestion to clarify with user] +### Reply Posting Failed +``` +❌ Failed to post reply to comment {COMMENT_ID} (HTTP {CODE}) + Comment will be shown in summary but reply not posted. + You may need to reply manually. ``` ## Implementation Notes -### General -- Use `gh api` for detailed comment data with file paths and line numbers -- Parse JSON responses to extract actionable items -- Use Read tool to get file context before changes -- Use Edit tool for precise line modifications -- Track which comments are addressed for final report -- Consider comment timestamps to prioritize recent feedback - -### Confidence Scoring Algorithm - -**Step 1: Parse comment structure** -```javascript -{ - path: "src/utils.ts", // File path (if available) - line: 42, // Line number (if available) - body: "Please use const...", // Comment text - user: { login: "reviewer" }, // Commenter info - author_association: "OWNER", // Reviewer role - state: "unresolved" // Discussion status -} +### Comment Fetching +- Use `gh api --paginate` for all comment endpoints to handle repos with many comments +- Normalize all three comment types into unified structure for consistent processing +- Track `in_reply_to_id` to avoid processing existing replies + +### Line Drift Safety +- **Always use `diff_hunk` context** for locating code, not just API `line` number +- API line numbers can become stale if files change after review +- Extract surrounding code from `diff_hunk` and use as anchor text for matching +- If context not found, treat as missing/renamed file + +### Reply Deduplication (Idempotency) +- Before processing, check if current user already replied to each comment +- For review comments: check reply threads via `in_reply_to_id` +- For issue/review-summary comments: scan issue comment history +- Mark as `already_replied: true` and skip posting (but show in summary) + +### Commit Strategy +- **Per-fix commits** (not batched): Each actionable comment gets its own commit +- Commit message format: `fix: {description}\n\nAddresses comment from @{reviewer}\nComment ID: {id}` +- Stage **only the specific file** changed: `git add {path}` (never `git add .`) +- Capture commit SHA for reply: `git rev-parse --short HEAD` + +### Reply Safety +- **Always use `jq -n --arg body "$REPLY" '{body: $body}' | gh api --input -`** +- This prevents shell injection and handles special characters correctly +- Never use `-f body="..."` with complex/untrusted text + +### Confidence Scoring +- Used for autonomous mode filtering (≥80 = auto-address) and interactive mode prioritization +- Holistic assessment across specificity, clarity, risk, authority, and urgency +- Store with each comment for reporting; scoring is heuristic — use judgment for edge cases + +### Mode Detection +- Explicit flags (`--autonomous`, `--interactive`) override auto-detection +- CI/CD environments auto-select autonomous mode +- Non-TTY stdin auto-selects autonomous mode +- Default to interactive if no signals detected + +### Safety in Autonomous Mode +- Create git stash before any changes for easy rollback +- Only auto-fix high-confidence items (score ≥80) +- **Never auto-commit in bulk or auto-push** — leave that to user +- Provide clear rollback instructions in summary +- Note that replies posted to GitHub cannot be rolled back + +## Usage Examples + +**Interactive mode (default in terminal):** +```bash +/address-pr-comments 123 +/address-pr-comments https://github.com/owner/repo/pull/456 ``` -**Step 2: Calculate scores** -- Specificity: Check for `path` AND `line` presence, analyze `body` for code examples -- Language: Regex patterns for directive words ("please", "must", "should" vs "consider", "maybe") -- Category: NLP-based classification or keyword matching -- Authority: Map `author_association` (OWNER→10, COLLABORATOR→7, CONTRIBUTOR→3) -- Status: Check review state and `state` field +**Autonomous mode (auto-detected in CI or forced):** +```bash +/address-pr-comments 123 --autonomous +/address-pr-comments 123 auto -**Step 3: Apply thresholds** -``` -if (score >= 80) { - autonomous_mode: address_automatically -} else if (score >= 60) { - interactive_mode: present_to_user -} else { - skip_or_flag_for_manual_review -} +# In GitHub Actions workflow: +- name: Address PR Comments + run: claude /address-pr-comments ${{ github.event.pull_request.number }} + # Automatically runs in autonomous mode due to CI environment ``` -### Mode Detection Implementation - +**Force interactive mode:** ```bash -# Pseudo-code for mode detection -function detect_mode() { - # Check explicit override first - if args contains "--autonomous" or "--auto" or "auto": - return "autonomous" - if args contains "--interactive" or "interactive": - return "interactive" - - # Check CI environment variables - if $CI or $GITHUB_ACTIONS or $JENKINS_URL or $GITLAB_CI or $CIRCLECI or $TRAVIS: - return "autonomous" - - # Check if stdin is a terminal - if not is_tty(stdin): - return "autonomous" - - # Default to interactive - return "interactive" -} +/address-pr-comments 123 --interactive ``` -### Safety Mechanisms +## Design Philosophy + +### Every Comment Gets a Reply + +This command ensures **no reviewer feedback is left unacknowledged**. Every comment receives a reply posted to GitHub, whether it's: +- A fix with commit SHA +- A request for clarification +- An acknowledgment of praise +- A deferral to manual review + +This creates clear communication and prevents reviewers from wondering if their feedback was seen. + +### Line-Drift Resilience + +By using `diff_hunk` context for code location instead of relying solely on API `line` numbers, this command handles cases where files have changed since the review was posted. + +### Safe Autonomous Operation + +Autonomous mode is designed for CI/CD environments where no human is present to approve changes. It: +- Only auto-fixes high-confidence, low-risk items (score ≥80) +- Creates rollback checkpoints before changes +- Posts replies even for items it doesn't fix +- Provides detailed audit trail in summary report + +### Per-Fix Commits + +Each code fix gets its own commit with a descriptive message referencing the reviewer and comment ID. This makes it easy to: +- Review changes individually +- Cherry-pick or revert specific fixes +- Track which commit addressed which comment +- Generate meaningful git history -**Autonomous mode only:** -1. Create git stash before any modifications -2. Filter to high-confidence items only (score ≥ 80) -3. Track all changes for detailed reporting -4. Provide clear rollback instructions -5. Never auto-commit or auto-push +### Idempotency -**Both modes:** -1. Read file context before applying changes -2. Show diffs for user review -3. Skip ambiguous or unclear comments -4. Preserve code context and intent +Running the command multiple times is safe: +- Already-replied comments are detected and skipped +- No duplicate replies are posted +- Existing commits are not re-created +- Summary report clearly shows what was skipped vs. processed diff --git a/plans/exhaustive-pr-comment-resolution.md b/plans/exhaustive-pr-comment-resolution.md new file mode 100644 index 0000000..755563e --- /dev/null +++ b/plans/exhaustive-pr-comment-resolution.md @@ -0,0 +1,570 @@ +--- +title: "Exhaustive PR Comment Resolution with Mandatory Replies" +type: Enhancement +issue: 13 +research: ["research/research-address-pr-comments-improvement.md"] +status: Ready for Implementation +reviewed: true +reviewers: ["codex", "gemini"] +created: 2026-02-09 +--- + +# PRD: Exhaustive PR Comment Resolution with Mandatory Replies + +## Metadata +- **Type**: Enhancement +- **Priority**: High +- **Severity**: N/A +- **Estimated Complexity**: 5 +- **Created**: 2026-02-09 +- **Status**: Ready for Implementation + +## Overview + +### Problem Statement + +The `/github:address-pr-comments` command currently has three critical gaps: + +1. **No GitHub replies posted.** The command applies code fixes locally but never posts sub-comments on the PR. The reviewer has no way to know which comments were addressed or why something was skipped — they must manually diff the code and guess. + +2. **Silent comment skipping.** Entire categories of comments (questions, praise, resolved threads, low-confidence items) are filtered out before processing. No trace is left on GitHub explaining what happened to these comments. + +3. **No auto-commit.** Fixes are applied to the working tree but never committed, so there's no SHA to reference in replies and no atomic traceability between a review comment and its fix. + +The result: after running `/address-pr-comments`, the reviewer still has to manually walk through every comment to verify what was addressed. This defeats the purpose of the command. + +### Goals & Objectives + +1. Every comment fetched from the PR must receive a sub-comment reply explaining the action taken (fix, explanation, acknowledgment) or why no action was taken +2. Code fixes must be committed immediately so replies can reference specific commit SHAs +3. All three comment types must be fetched (review comments, review summaries, issue comments) — no blind spots +4. The command must produce a clear audit trail: for any PR comment, you can see the reply explaining what happened + +### Success Metrics + +- **Primary Metric**: 100% reply coverage — every non-author, non-bot, top-level comment on the PR receives a sub-comment reply after the command runs +- **Secondary Metrics**: + - Every code fix has a dedicated commit with a descriptive message + - Every "Fixed" reply references a specific commit SHA + - Zero comments silently filtered without trace +- **Quality Gates**: + - Manual test: run on a PR with mixed comment types (code fix, question, praise, unclear suggestion), verify all get replies + +## User Stories + +### Story 1: Know what happened to every comment +- **As a**: PR author using `/address-pr-comments` +- **I want**: every reviewer comment to receive a reply explaining the action taken +- **So that**: I can see at a glance which comments are fixed, which need discussion, and which are acknowledged — without manually comparing code diffs to comment threads +- **Acceptance Criteria**: + - [ ] Every top-level reviewer comment has a sub-comment reply after the command runs + - [ ] Fixes reference specific commit SHAs + - [ ] Explanations provide meaningful context, not generic acknowledgments + - [ ] Questions get thoughtful answers based on code context + +### Story 2: Trace fixes to comments +- **As a**: reviewer checking if my feedback was addressed +- **I want**: each code fix to be in its own commit, referenced in the reply to my comment +- **So that**: I can click the SHA in the reply and see exactly what changed for my specific comment +- **Acceptance Criteria**: + - [ ] Each fix is its own commit (not batched) + - [ ] Commit message references the reviewer and comment + - [ ] Reply contains "Fixed in {sha}" linking to the commit + +### Story 3: Understand skipped items +- **As a**: PR author running the command autonomously +- **I want**: below-threshold or unclear comments to still get a reply explaining why they weren't auto-fixed +- **So that**: I know the command saw them and can decide whether to handle them manually +- **Acceptance Criteria**: + - [ ] Low-confidence comments get a reply explaining they need manual review + - [ ] Unclear comments get a reply asking for clarification + - [ ] Praise/LGTM comments get an acknowledgment reply + +## Requirements + +### Functional Requirements + +1. **FR-1**: Fetch all three comment types from GitHub API with pagination + - Details: Review comments (`/pulls/{pr}/comments`), review summaries (`/pulls/{pr}/reviews`), and issue comments (`/issues/{pr}/comments`). All fetches MUST use `gh api --paginate` to handle PRs with >30 comments (GitHub's default page size). + - Priority: Must Have + +2. **FR-2**: Process every non-author, non-bot, top-level comment — no silent filtering + - Details: PR author's own comments and bot comments are excluded (author = PR author, not command invoker — if a maintainer runs the command, the PR author's comments are still excluded). Replies (`in_reply_to_id != null`) are excluded. Everything else MUST be processed and replied to. + - Priority: Must Have + +3. **FR-3**: Post a sub-comment reply for every processed comment + - Details: Review comments use `/pulls/{pr}/comments/{id}/replies` for threaded replies. Issue comments and review summaries use a new top-level PR comment with `> quoted original` text for clear linking. + - Priority: Must Have + +4. **FR-4**: Auto-commit each code fix individually + - Details: After applying a fix via Edit tool, immediately `git add {file} && git commit -m "fix: {description}"`. Capture SHA via `git rev-parse --short HEAD` for use in the reply. + - Priority: Must Have + +5. **FR-5**: Categorize every comment and select the appropriate reply template + - Details: Categories: `actionable/clear` (fix + "Fixed in {sha}"), `actionable/unclear` (reply asking clarification), `not-actionable/question` (reply with explanation), `not-actionable/praise` (reply with acknowledgment), `not-actionable/out-of-scope` (reply with deferral), `below-threshold` (reply explaining needs manual review in autonomous mode). + - Priority: Must Have + +6. **FR-6**: Maintain confidence scoring for autonomous mode + - Details: Keep the existing 0-100 scoring system. In autonomous mode, high-confidence items get auto-fixed. Below-threshold items still get a REPLY (not silently skipped) explaining they need manual review. + - Priority: Must Have + +7. **FR-7**: Interactive mode must process ALL comments + - Details: In interactive mode, user selects which comments to fix. Comments the user does NOT select still get a reply: "Noted — deferring to manual review" or similar. No comment is left without a reply. + - Priority: Must Have + +8. **FR-8**: Idempotent re-runs — skip comments that already have a reply from the command user + - Details: Before posting a reply, check if a reply from the current `gh` user already exists on that comment (by fetching existing replies and checking `user.login`). If a reply already exists, skip posting and note "Already replied" in the report. This prevents duplicate replies on re-runs. + - Priority: Must Have + +9. **FR-9**: Safe reply body construction — no shell injection + - Details: Reply bodies MUST NOT be passed via shell string interpolation. Use `gh api --input -` with JSON piped from `jq` to safely handle code blocks, quotes, backticks, `$variables`, and multiline content in comment bodies. + - Priority: Must Have + +10. **FR-10**: Line-drift-safe code location for multi-comment files + - Details: When multiple comments reference the same file, earlier fixes may shift line numbers. The command MUST NOT rely solely on the API's `line` field. Instead, use the `diff_hunk` from the comment payload to identify the unique code context (anchor text) and locate it dynamically in the current file state before applying each fix. + - Priority: Must Have + +### Non-Functional Requirements + +1. **NFR-1**: GitHub API rate limiting with retry/backoff + - Requirement: Handle rate limits gracefully when posting many replies. On 403/429 responses, read `Retry-After` or `X-RateLimit-Reset` headers and wait accordingly. Use exponential backoff (1s, 2s, 4s) with max 3 retries per request. + - Target: Support PRs with up to 50 comments without unrecoverable rate limit errors + - Measurement: No permanent 403/429 failures; transient ones retried successfully + +2. **NFR-2**: Reply quality + - Requirement: Replies must be contextual and useful, not generic boilerplate + - Target: Fixed replies reference SHAs, explanations reference actual code, acknowledgments are brief + - Measurement: Manual review of reply quality + +3. **NFR-3**: Preflight validation + - Requirement: Before processing, verify: (a) `gh auth status` succeeds with write permissions, (b) working tree is clean (`git status --porcelain` is empty — abort with warning if dirty), (c) git identity is configured (`git config user.name` and `user.email` are set) + - Target: Fail fast with clear error messages before any commits or replies are posted + - Measurement: Preflight catches all known setup issues + +### Technical Requirements + +- **Stack**: Markdown command file (no TypeScript needed — this is a prompt-based command) +- **Dependencies**: `gh` CLI (already required), git (already required) +- **Architecture**: Single command file `address-pr-comments.md` rewrite. No new agents, skills, or plugins. +- **API Contracts**: Uses existing GitHub REST API endpoints via `gh api` + +## Scope + +### In Scope + +- Rewrite `github-plugin/commands/address-pr-comments.md` to implement exhaustive comment resolution with mandatory replies +- Add review summary fetching (3rd comment type) +- Add per-fix auto-commit behavior +- Add reply posting via `gh api` for all comment types +- Add reply templates for all categories (fix, explanation, clarification, acknowledgment, deferral, manual-review-needed) +- Add quote-based reply format for issue comments (no threading API available) +- Retain interactive/autonomous mode detection +- Retain confidence scoring system +- Update summary/report format to reflect new reply-posting behavior + +### Out of Scope + +- Workflows plugin changes (separate concern, different architecture) +- Swarm/team-based parallel processing (that's `/workflows:resolve-comments --swarm`) +- Iterative loop for new comments during processing (that's the workflows-plugin's responsibility) +- Auto-push (user must still push manually) +- GitHub GraphQL API (stick with REST via `gh api`) + +### Future Considerations + +- Integration between `/github:address-pr-comments` and `/workflows:resolve-comments` to avoid duplication +- Auto-push option for CI/CD environments +- Support for GitHub's "resolve conversation" API if it becomes available via REST + +## Impact Analysis + +### Affected Areas + +- `github-plugin/commands/address-pr-comments.md` — complete rewrite +- GitHub PR comment threads — the command will now post replies (visible to reviewers) + +### Users Affected + +- PR authors who use `/address-pr-comments` — now see replies posted automatically +- PR reviewers — now receive sub-comment replies for each of their comments + +### System Impact + +- **Performance**: More GitHub API calls (one POST per comment for replies). Mitigated by sequential processing with brief pauses if needed. +- **Security**: No new credentials needed — uses existing `gh` auth. +- **Data Integrity**: Commits are created automatically. Each commit is atomic (one fix per commit). No destructive operations. + +### Dependencies + +- **Upstream**: GitHub API availability, `gh` CLI authentication +- **Downstream**: None — this is a leaf command +- **External**: GitHub REST API (`/pulls/{pr}/comments/{id}/replies`, `/issues/{pr}/comments`) + +### Breaking Changes + +- [x] **Behavioral change**: Command now auto-commits fixes (previously: no commits). Users who relied on "apply edits without committing" behavior will see commits created. +- [x] **Behavioral change**: Command now posts GitHub replies (previously: local-only output). This is visible to reviewers. +- [x] **"Skip These" section removed**: Questions, praise, and info-only comments are no longer skipped — they all get replies. + +## Solution Design + +### Approach + +Rewrite `address-pr-comments.md` as a single comprehensive command that processes every comment in a linear pass: + +1. **Fetch** all three comment types (review comments, review summaries, issue comments) +2. **Filter** only author's own comments, bot comments, and reply comments (`in_reply_to_id != null`) +3. **For each remaining comment**, categorize and act: + - `actionable/clear` → Read file → Edit → Commit → Reply "Fixed in {sha}" + - `actionable/unclear` → Reply asking for clarification + - `not-actionable/question` → Read relevant code → Reply with explanation + - `not-actionable/praise` → Reply "Acknowledged, thank you!" + - `not-actionable/out-of-scope` → Reply "Out of scope for this PR, will address in follow-up" + - `below-threshold` (autonomous only) → Reply "This comment needs manual review: {reason}" +4. **Reply posting** uses the correct API endpoint per comment type: + - Review comments: `gh api -X POST repos/{o}/{r}/pulls/{pr}/comments/{id}/replies -f body="..."` + - Issue comments: `gh api -X POST repos/{o}/{r}/issues/{pr}/comments -f body="> {quoted original}\n\n{reply}"` + - Review summaries: `gh api -X POST repos/{o}/{r}/issues/{pr}/comments -f body="> From @{reviewer}'s review:\n> {quoted body}\n\n{reply}"` +5. **Summary** includes all comments with their actions and reply status + +### Alternatives Considered + +1. **Delegate to workflows-plugin's comment-resolver agent** + - Pros: Reuse existing categorization and reply logic + - Cons: Creates cross-plugin dependency; workflows-plugin has its own skip behavior that would need fixing; different architectural model (skill + agent vs single command) + - Why rejected: The github-plugin should remain standalone. The user chose github-plugin-only scope. + +2. **Add reply-posting as optional flag (--reply)** + - Pros: Backwards compatible, opt-in + - Cons: Defeats the purpose — the whole point is that EVERY run posts replies. Making it optional means the default behavior remains broken. + - Why rejected: The core problem IS the default behavior. Making replies mandatory is the fix. + +3. **Post replies in batch after all fixes applied** + - Pros: Fewer API round-trips + - Cons: If the command crashes mid-way, no replies posted. Per-comment posting is more resilient. + - Why rejected: Per-comment is more reliable and gives immediate feedback. + +### Data Model Changes + +None — this is a prompt-based command, no data models. + +### API Changes + +No new APIs created. Uses existing GitHub REST API endpoints: +- `GET /repos/{o}/{r}/pulls/{pr}/comments` (already used) +- `GET /repos/{o}/{r}/issues/{pr}/comments` (already used) +- `GET /repos/{o}/{r}/pulls/{pr}/reviews` (NEW — was missing) +- `POST /repos/{o}/{r}/pulls/{pr}/comments/{id}/replies` (NEW — for review comment replies) +- `POST /repos/{o}/{r}/issues/{pr}/comments` (NEW — for issue comment and review summary replies) + +### UI/UX Changes + +Terminal output changes: +- Each processed comment now shows: category, action taken, reply posted status +- Summary includes reply count alongside fix count +- No more "Skipped Items" section without replies — all items show their reply + +## Implementation Plan + +### Phase 1: Preflight Checks and Comment Fetching +**Complexity**: 3 | **Priority**: High + + + +- [ ] Add preflight checks before any processing: + - Verify `gh auth status` succeeds (abort with "Run: gh auth login" if not) + - Verify working tree is clean via `git status --porcelain` (abort with "Stash or commit changes first" if dirty — do not proceed with uncommitted changes to avoid accidental staging) + - Verify `git config user.name` and `git config user.email` are set (abort with config instructions if missing) +- [ ] Add review summary fetching (`gh api --paginate repos/{o}/{r}/pulls/{pr}/reviews`) +- [ ] Add `--paginate` flag to ALL three `gh api` fetch calls to handle PRs with >30 comments +- [ ] Unify all three comment types into a single list with normalized structure (`id`, `type`, `path`, `line`, `diff_hunk`, `body`, `user`, `created_at`, `in_reply_to_id`). Include `diff_hunk` for review comments — it's needed for line-drift-safe code location. +- [ ] Update filtering: only exclude PR author's own comments, bot comments, and replies (`in_reply_to_id != null`). Remove the "Skip These" section that filters out questions, praise, resolved, and info-only comments. Filter out comments with empty/whitespace-only bodies (review summaries can have null body). +- [ ] Keep confidence scoring system intact (still useful for autonomous mode prioritization) +- [ ] Add idempotency check: for each comment, fetch existing replies and check if one from the current `gh` user already exists. Mark these as "already replied" and skip them in processing. + +### Phase 2: Add Per-Comment Processing with Mandatory Replies +**Complexity**: 5 | **Priority**: High + + + +- [ ] Define the complete category set: `actionable/clear`, `actionable/unclear`, `not-actionable/question`, `not-actionable/praise`, `not-actionable/out-of-scope`, `below-threshold` +- [ ] For `actionable/clear`: implement line-drift-safe fix application: + 1. Read the file via Read tool + 2. Locate the code using `diff_hunk` context (anchor text matching), NOT the API `line` number alone — this handles line drift from earlier fixes on the same file + 3. Apply fix via Edit tool + 4. Check if file actually changed: `git diff --quiet {file}`. If no change (edit was no-op), skip commit and reply "No change needed — code already matches the suggestion" + 5. Stage ONLY the specific file(s) touched: `git add {file}` (never `git add .` or `git add -A`) + 6. Commit: `git commit -m "fix: {desc}\n\nAddresses comment from @{reviewer}\nComment ID: {id}"` + 7. Capture SHA: `git rev-parse --short HEAD` + 8. Reply "Fixed in {sha}" +- [ ] Handle missing/renamed files: if the file referenced in the comment doesn't exist, reply "This file appears to have been moved or renamed since this review. Could you check if this comment is still relevant?" — do not attempt to edit. +- [ ] For `actionable/unclear`: Reply "Could you clarify what specific change you're looking for? For example:\n- Should I {option A}?\n- Or {option B}?" +- [ ] For `not-actionable/question`: Read relevant code context → Reply with explanation of why the code is written that way, offer to adjust +- [ ] For `not-actionable/praise`: Reply "Acknowledged, thank you!" +- [ ] For `not-actionable/out-of-scope`: Reply "Out of scope for this PR — will address in follow-up" +- [ ] For `below-threshold` (autonomous mode only): Reply "This comment requires manual review. Reason: {scoring-reason}. The automated system wasn't confident enough to apply a fix autonomously." +- [ ] In interactive mode: user-selected comments get fixed + replied. Unselected comments get reply "Noted — deferring to manual review" + +### Phase 3: Add Reply Posting Logic +**Complexity**: 3 | **Priority**: High + + + +- [ ] **CRITICAL: Safe body construction.** All reply bodies MUST be passed via `gh api --input -`, NOT via `-f body="..."` shell interpolation. Construct JSON payload using `jq`: `jq -n --arg b "$REPLY_BODY" '{body:$b}' | gh api -X POST "..." --input -`. This prevents shell injection from code blocks, backticks, `$variables`, and special chars in comment bodies. +- [ ] For review comments (type: "review"): `jq -n --arg b "$REPLY" '{body:$b}' | gh api -X POST "repos/{o}/{r}/pulls/{pr}/comments/{id}/replies" --input -` +- [ ] For issue comments (type: "issue"): Construct body as `> {quoted_original_body_truncated_200_chars}\n\n{reply}`, then pipe via jq as above +- [ ] For review summaries (type: "review-summary"): Construct body as `> From @{reviewer}'s review ({state}):\n> {quoted_body_truncated_200_chars}\n\n{reply}`. Handle empty/null review bodies: if body is empty, use "No summary text provided — addressing based on review state ({state})." +- [ ] Add error handling with retry: if reply POST returns 403/429, read `Retry-After` header and wait (exponential backoff: 1s, 2s, 4s, max 3 retries). On other errors, log and continue processing. +- [ ] Skip posting if idempotency check (Phase 1) found existing reply from current user + +### Phase 4: Update Summary and Report Format +**Complexity**: 2 | **Priority**: Medium + +- [ ] Update interactive mode summary to show reply status for each comment +- [ ] Update autonomous mode summary to show: comments fixed + replied, comments replied-only, total replies posted +- [ ] Remove "Skipped Items Requiring Human Review" framing — replace with "Comments Replied Without Fix" showing the reply that was posted +- [ ] Update "Next Steps" to remove manual reply instructions (replies already posted) — keep manual push instruction +- [ ] Keep rollback section for autonomous mode (rollback only reverts commits, replies remain on GitHub — note this in rollback section) + +### Phase 5: Validation +**Complexity**: 2 | **Priority**: High + +- [ ] Plugin validation: `cd github-plugin && bun run validate` +- [ ] Manual test: run on a real PR with mixed comment types +- [ ] Verify all comments received replies on GitHub +- [ ] Verify each fix has its own commit with descriptive message +- [ ] Verify autonomous mode posts replies for below-threshold items instead of silently skipping +- [ ] Verify interactive mode posts replies for unselected items + +## Relevant Files + +### Existing Files + +- `github-plugin/commands/address-pr-comments.md` — THE file being rewritten. Contains the complete current command definition (604 lines). +- `github-plugin/.claude-plugin/plugin.json` — Plugin manifest. May need description update if command behavior changes significantly. +- `github-plugin/README.md` — Plugin documentation. Needs update to reflect new reply-posting and auto-commit behavior. + +### New Files + +None — this is a rewrite of an existing file. + +### Test Files + +None — this is a markdown command file, not code. Validation is manual testing against real PRs. + +## Testing Strategy + +### Unit Tests + +N/A — markdown command files don't have unit tests. + +### Integration Tests + +N/A — testing requires a real GitHub PR with comments. + +### E2E Tests + +N/A — would require a real PR environment. + +### Manual Test Cases + +1. **Test Case: Mixed comment types** + - Steps: + 1. Create a PR with: 1 clear code suggestion, 1 vague suggestion, 1 question, 1 "LGTM" praise, 1 general discussion + 2. Run `/address-pr-comments {PR_NUMBER}` + 3. Check GitHub PR page + - Expected: All 5 comments have sub-comment replies. The code suggestion has a fix commit + "Fixed in {sha}" reply. + +2. **Test Case: Autonomous mode below-threshold** + - Steps: + 1. Run `/address-pr-comments {PR_NUMBER} --auto` on a PR with low-confidence comments + 2. Check GitHub PR page + - Expected: Low-confidence comments have replies explaining they need manual review (not silently skipped). + +3. **Test Case: Issue comment reply format** + - Steps: + 1. Create a PR with a general (non-file-specific) comment + 2. Run `/address-pr-comments {PR_NUMBER}` + 3. Check GitHub PR page + - Expected: A new comment appears with `> quoted original` text and the reply below. + +4. **Test Case: Review summary** + - Steps: + 1. Submit a review with "Request Changes" and a summary comment + 2. Run `/address-pr-comments {PR_NUMBER}` + 3. Check GitHub PR page + - Expected: A new PR comment appears addressing the review summary with quote. + +5. **Test Case: Idempotent re-run** + - Steps: + 1. Run `/address-pr-comments {PR_NUMBER}` on a PR (first run) + 2. Run `/address-pr-comments {PR_NUMBER}` again (second run) + 3. Check GitHub PR page + - Expected: No duplicate replies. Second run reports "Already replied" for all comments. + +6. **Test Case: Multiple comments on same file (line drift)** + - Steps: + 1. Create a PR where 2+ review comments reference different lines of the same file + 2. Run `/address-pr-comments {PR_NUMBER}` + - Expected: Both fixes are applied correctly even though the first fix may have shifted line numbers. + +7. **Test Case: Comment body with special characters** + - Steps: + 1. Create a PR comment containing: code blocks with backticks, `$variables`, double quotes, newlines + 2. Run `/address-pr-comments {PR_NUMBER}` + - Expected: Reply is posted successfully with properly quoted original text. No shell errors. + +8. **Test Case: Dirty working tree** + - Steps: + 1. Make uncommitted changes in the repo + 2. Run `/address-pr-comments {PR_NUMBER}` + - Expected: Command aborts with clear message to stash or commit first. + +## Risk Assessment + +### Technical Risks + +| Risk | Probability | Impact | Mitigation | +|------|-------------|--------|------------| +| GitHub API rate limiting with many replies | Medium | Medium | Exponential backoff with retry on 403/429. Read `Retry-After` header. Max 3 retries per request. | +| Reply POST fails mid-processing | Low | Medium | Continue processing remaining comments. Log failed replies in summary. Retry with backoff first. | +| Auto-commit creates messy git history | Low | Low | Each commit has descriptive message with reviewer attribution. User can squash before push if desired. | +| Review summary quotes are too long | Low | Low | Truncate quoted body to first 200 chars with "..." if longer. | +| Line number drift on same-file edits | Medium | High | Use `diff_hunk` anchor text for code location, not API `line` number alone. Re-read file before each fix. | +| Shell injection from comment content | Medium | Critical | All reply bodies piped via `jq` + `--input -`. No shell string interpolation of user content. | +| Duplicate replies on re-run | Medium | Medium | Idempotency check: skip comments that already have a reply from the current user. | +| Dirty working tree causes accidental staging | Low | High | Preflight check: abort if `git status --porcelain` is non-empty. | + +### Business Risks + +| Risk | Probability | Impact | Mitigation | +|------|-------------|--------|------------| +| Reviewers annoyed by automated replies | Low | Medium | Replies are contextual and useful, not spam. Each reply explains what was done. | +| Posting wrong reply to wrong comment | Very Low | High | Each reply is posted immediately after processing its specific comment — no batching confusion. | + +### Mitigation Strategy + +Sequential processing (one comment at a time) with immediate reply posting ensures correctness. If any step fails, the error is logged and processing continues for remaining comments. The summary report at the end shows which replies succeeded and which failed. + +## Rollback Strategy + +### Rollback Steps + +1. **Git rollback**: `git reset --soft HEAD~{N}` to undo the N fix commits (preserves changes in working tree) +2. **Full revert**: `git reset --hard {ROLLBACK_SHA}` to revert to pre-command state +3. **GitHub replies cannot be automatically deleted** — note this clearly in the command output. Replies posted to GitHub are permanent unless manually deleted. + +### Rollback Conditions + +- Incorrect fixes were applied (wrong code changes) +- Replies were posted to wrong comments +- Rate limit errors caused partial processing + +## Validation Commands + +```bash +# Validate plugin manifest +cd /Users/iamladi/Projects/claude-code-plugins/github-plugin && bun run validate + +# Verify the command file exists and has expected structure +wc -l /Users/iamladi/Projects/claude-code-plugins/github-plugin/commands/address-pr-comments.md + +# Manual validation: run on a test PR +# /address-pr-comments {test-pr-number} +# Then check GitHub PR page for replies +``` + +## Acceptance Criteria + +- [ ] All three comment types fetched (review comments, review summaries, issue comments) +- [ ] Every non-author, non-bot, top-level comment receives a sub-comment reply +- [ ] Code fixes auto-committed with descriptive messages referencing reviewer and comment ID +- [ ] "Fixed in {sha}" replies reference actual commit SHAs +- [ ] Review comment replies are threaded (appear under the original comment) +- [ ] Issue comment replies use quote format (`> original\n\nreply`) +- [ ] Review summary replies use quote format with reviewer attribution +- [ ] Autonomous mode: below-threshold comments get replies (not silently skipped) +- [ ] Interactive mode: unselected comments get "deferring to manual review" replies +- [ ] Praise/LGTM comments get acknowledgment replies +- [ ] Questions get contextual explanation replies +- [ ] Summary report shows all comments with their actions and reply status +- [ ] Plugin validation passes (`bun run validate`) +- [ ] No silent filtering — the "Skip These" section is removed + +## Dependencies + +### New Dependencies + +None — uses existing `gh` CLI and git. + +### Dependency Updates + +None. + +## Notes & Context + +### Additional Context + +- The workflows-plugin has a parallel implementation (`/workflows:resolve-comments`) that is more sophisticated (iterative loop, swarm mode, dedicated agent). This plan intentionally does NOT modify the workflows-plugin — it focuses on making the github-plugin's command self-sufficient. +- The workflows-plugin's `comment-resolver` agent has useful reply templates (lines 121-157 of `workflows-plugin/agents/comment-resolver.md`) that inform the template design in this plan, but the implementation is independent. + +### Assumptions + +- `gh` CLI is authenticated and has write access to post comments +- The PR exists and is not merged +- The user is running from within the git repository +- GitHub API reply endpoints work as documented (POST to `/pulls/{pr}/comments/{id}/replies` creates threaded reply) + +### Constraints + +- GitHub REST API does not support threaded replies on issue comments — must use quote format +- GitHub REST API does not have a direct reply endpoint for review summaries — must post as new issue comment +- Reply text passed via `gh api -f body="..."` must handle special characters in quoted text + +### Related Tasks/Issues + +- Research document: `research/research-address-pr-comments-improvement.md` +- Workflows plugin comment resolution: `workflows-plugin/skills/comment-resolution/SKILL.md` + +### References + +- GitHub REST API: Pull Request Review Comments — https://docs.github.com/en/rest/pulls/comments +- GitHub REST API: Issue Comments — https://docs.github.com/en/rest/issues/comments +- GitHub REST API: Pull Request Reviews — https://docs.github.com/en/rest/pulls/reviews + +### Open Questions + +- [x] Commit strategy? → Auto-commit per fix (user decision) +- [x] Praise handling? → Reply to everything (user decision) +- [x] Issue comment threading? → Quote + new comment (user decision) +- [x] Scope? → GitHub plugin only (user decision) + +## Blindspot Review + +**Reviewers**: GPT-5.2-Codex (xhigh), Gemini 3 Pro +**Date**: 2026-02-09 +**Plan Readiness**: Ready + +### Addressed Concerns + +- [Consensus, Critical] Shell injection / text escaping in reply bodies → Added FR-9 (safe body construction via jq + --input -), updated Phase 3 with explicit safe payload method +- [Gemini, Critical] Line number drift when multiple comments on same file → Added FR-10 (diff_hunk anchor text matching), updated Phase 2 with line-drift-safe fix application +- [Codex, High] Pagination missing on API fetches → Updated FR-1 to require `--paginate` on all fetches, updated Phase 1 +- [Codex, High] No idempotency / duplicate reply handling → Added FR-8 (check existing replies before posting), updated Phase 1 and Phase 3 +- [Codex, High] Dirty working tree risk → Added to NFR-3 preflight checks (abort if dirty), updated Phase 1 +- [Codex, Medium] Multi-file and no-change commit gaps → Updated Phase 2 to check `git diff --quiet` before committing, stage only specific files +- [Codex, Medium] Outdated file / missing file handling → Added to Phase 2 (reply with "file appears moved/renamed") +- [Codex, Medium] Rate limit retry/backoff underspecified → Updated NFR-1 with exponential backoff and Retry-After header handling +- [Codex, Medium] Auth and permission failure path missing → Added to NFR-3 preflight checks (`gh auth status`) +- [Gemini, Medium] Git identity configuration → Added to NFR-3 preflight checks (`git config user.name/email`) +- [Codex, Low] Author definition ambiguity → Clarified in FR-2 (author = PR author, not command invoker) +- [Codex, Low] Review summary empty body edge case → Updated Phase 3 to handle empty/null review bodies +- [Codex, Medium] Testing gaps for idempotency and special chars → Added manual test cases 5-8 + +### Acknowledged but Deferred + +- [Gemini, Medium] Broken intermediate commits — individual commits may not pass CI if a rename spans multiple files referenced in separate comments. Deferred because: (a) this is inherent to per-fix commits and the user explicitly chose this strategy, (b) user can squash before push, (c) the alternative (batching) breaks the traceability that is the primary goal. + +### Dismissed + +- [Gemini, Low, 0.7] Pending/draft review comments — `gh api` for review comments (`/pulls/{pr}/comments`) only returns comments from submitted reviews by default. Pending/draft review comments are not visible via the REST API to other users. No action needed. diff --git a/reviews/constitution-compliance-address-pr-comments.md b/reviews/constitution-compliance-address-pr-comments.md new file mode 100644 index 0000000..4134a12 --- /dev/null +++ b/reviews/constitution-compliance-address-pr-comments.md @@ -0,0 +1,29 @@ +# Constitution Compliance Review: address-pr-comments.md + +**Date**: 2026-02-09 +**Alignment Score**: 4.4/10 +**Classification**: Mixed (leans rule-based) + +## Dimension Scores + +| Dimension | Score | Notes | +|---|---|---| +| Constraint Style | 5/10 | Some reasoning present, many bare rules | +| Workflow Style | 3/10 | Heavily procedural, numbered scripts | +| Format Style | 3/10 | Rigid fill-in-the-blank templates | +| Trust Level | 5/10 | Trusts categorization, micromanages everything else | +| Edge Case Handling | 6/10 | Good coverage of known edges, no principles for novel ones | + +## Priority Transformations Applied + +1. **Scoring system** — Replaced point tables with judgment dimensions +2. **Reply templates** — Replaced hardcoded strings with quality criteria +3. **Fix workflow** — Reframed as principles with reasoning +4. **Summary templates** — Replaced fill-in-the-blank with format criteria + +## Kept Prescriptive (Appropriately) + +- Safe reply body construction via jq (security — error cost is severe) +- API endpoint references (reference material, not procedure) +- Preflight checks (hard safety constraints) +- `diff_hunk` anchor text requirement (correctness — line drift is a real failure mode)