Skip to content

ci: factor Claude review extraction into a locally-testable script#147

Merged
loothero merged 1 commit into
mainfrom
ci/extract-claude-review-script
May 4, 2026
Merged

ci: factor Claude review extraction into a locally-testable script#147
loothero merged 1 commit into
mainfrom
ci/extract-claude-review-script

Conversation

@loothero
Copy link
Copy Markdown
Member

@loothero loothero commented May 3, 2026

Summary

Replace the slow PR → merge → rebase-test-PR → wait iteration loop with a local fixture-based test runner. The extraction logic that has been the source of every recent regression (#145, #146) lives in a single script that's exercised end-to-end in ~1 second locally.

What changed

  • .github/scripts/extract-claude-review.sh — the Extract review output shell, lifted verbatim out of pr-ci.yml. Env-driven (EXECUTION_FILE, HEADING, REVIEW_OUT_FILE).
  • .github/scripts/test-extract-claude-review.sh — local runner that asserts exit codes and output content against every regression we've hit.
  • .github/fixtures/claude-execution-*.json — five sample execution files matching the actual JSON.stringify(SDKMessage[]) shape from base-action/src/run-claude-sdk.ts.
  • pr-ci.yml — each of the four claude-review-* jobs replaces 22 lines of inline shell with run: bash .github/scripts/extract-claude-review.sh. Net workflow diff: -68 lines.
  • review_automation_changed filter picks up .github/scripts/** so the security gate that disables AI reviews on self-modifying PRs covers the extracted script too.

Local usage

bash .github/scripts/test-extract-claude-review.sh
Running extract-claude-review.sh fixture tests

  [PASS] success: multi-line .result preserves heading + body + summary
  [PASS] success-with-HIGH: bracketed [HIGH] marker survives extraction
  [PASS] error: SDKResultError falls back to assistant text concat
  [PASS] empty array: exits 1 with heading + fallback
  [PASS] malformed JSON: exits 1 with heading + fallback
  [PASS] missing file: exits 1 with heading + fallback

Result: 6 passed, 0 failed

The six cases collectively cover every regression we've shipped so far:

Test plan

  • bash .github/scripts/test-extract-claude-review.sh → 6 / 6 passing
  • CI green (this PR modifies pr-ci.yml, so it triggers review_automation_modified and AI reviews are gated off — only the deterministic jobs run)
  • After merge: future iteration on the extraction script no longer requires a PR cycle. Run the local test instead.

🤖 Generated with Claude Code

Move the per-job Extract-review-output shell from pr-ci.yml into
.github/scripts/extract-claude-review.sh so the extraction logic can be
verified locally against fixture files before pushing — replacing the
slow PR -> merge -> rebase-test-PR -> wait cycle that has been
producing follow-up patches.

New files:
- .github/scripts/extract-claude-review.sh  — extraction logic, env-driven
  (EXECUTION_FILE, HEADING, REVIEW_OUT_FILE), exits 1 with heading-prefixed
  fallback on any failure
- .github/scripts/test-extract-claude-review.sh  — local test runner
- .github/fixtures/claude-execution-{success,success-with-high,error,empty,bad}.json
  — fixture inputs covering the regressions we hit on PRs #145/#146

Test cases (all asserted by the runner):
1. Multi-line .result success — full body preserved (catches the tail -1
   regression that would silently drop bracketed [HIGH] markers)
2. .result containing [HIGH] — survives extraction so the downstream
   blocking-check still fires
3. SDKResultError (no .result, has assistant text) — falls back to
   assistant text concatenation
4. Empty JSON array — exit 1 with heading-prefixed fallback
5. Malformed JSON — exit 1 with heading-prefixed fallback (catches jq
   exit 5 propagating through bash -eo pipefail)
6. Missing file — exit 1 with heading-prefixed fallback

pr-ci.yml changes:
- Each of the four claude-review-* jobs replaces 22 lines of inline
  shell with `run: bash .github/scripts/extract-claude-review.sh`
- Add `.github/scripts/**` to review_automation_changed so the security
  gate that disables AI reviews on workflow-modifying PRs also covers
  the extracted script

Local usage:
  bash .github/scripts/test-extract-claude-review.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 3, 2026 20:49
@vercel
Copy link
Copy Markdown

vercel Bot commented May 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
summit Ready Ready Preview, Comment May 3, 2026 8:49pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 3, 2026

Warning

Rate limit exceeded

@loothero has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 43 minutes and 7 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: feda73b4-643f-48f4-92fd-1e3be613b6f1

📥 Commits

Reviewing files that changed from the base of the PR and between 5f0ae9f and 3a3e47a.

📒 Files selected for processing (8)
  • .github/fixtures/claude-execution-bad.json
  • .github/fixtures/claude-execution-empty.json
  • .github/fixtures/claude-execution-error.json
  • .github/fixtures/claude-execution-success-with-high.json
  • .github/fixtures/claude-execution-success.json
  • .github/scripts/extract-claude-review.sh
  • .github/scripts/test-extract-claude-review.sh
  • .github/workflows/pr-ci.yml
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci/extract-claude-review-script

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 43 minutes and 7 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@railway-app
Copy link
Copy Markdown

railway-app Bot commented May 3, 2026

🚅 Deployed to the summit-pr-147 environment in Summit

3 services not affected by this PR
  • summit-db
  • summit-indexer
  • summit-api

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a bash script and a corresponding test suite to extract review content from Claude execution logs. The extraction logic handles both successful results and partial assistant responses in case of errors, using a set of JSON fixtures for validation. Feedback focuses on improving the robustness of the JSON parsing with jq by adding type checks and enhancing the test runner's debuggability by capturing error output during execution.

Comment on lines +49 to +50
REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \
"$EXECUTION_FILE" 2>/dev/null || true)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The jq filter for assistant text concatenation could fail if the .text field is not a string (e.g., if it's null or an object), as join requires an array of strings. While the || true and 2>/dev/null prevent the script from exiting, it would result in an empty REVIEW and trigger the generic fallback message even if some text was extractable. Adding a type check makes the extraction more robust.

Suggested change
REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \
"$EXECUTION_FILE" 2>/dev/null || true)
REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text" and (.text | type == "string")) | .text] | join("\n\n")' \
"$EXECUTION_FILE" 2>/dev/null || true)

Comment on lines +51 to +79
local exit_code=0
EXECUTION_FILE="$execution_file" \
HEADING="## Claude Review - Test Scope" \
REVIEW_OUT_FILE="$out" \
bash "$EXTRACT" >/dev/null 2>&1 || exit_code=$?

local result="PASS" reason=""
if [ "$exit_code" != "$expected_exit" ]; then
result="FAIL"
reason="exit=$exit_code expected=$expected_exit"
else
for pattern in "${patterns[@]}"; do
if ! grep -qE "$pattern" "$out"; then
result="FAIL"
reason="output missing pattern '$pattern'"
break
fi
done
fi

if [ "$result" = "PASS" ]; then
PASS=$((PASS + 1))
printf ' [PASS] %s\n' "$label"
else
FAIL=$((FAIL + 1))
printf ' [FAIL] %s — %s\n' "$label" "$reason"
printf ' output (first 5 lines):\n'
head -n 5 "$out" 2>/dev/null | sed 's/^/ | /'
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test runner currently suppresses all output from the extraction script (>/dev/null 2>&1). If a test fails due to a system error (e.g., jq missing or a syntax error in the script), the reason is hidden. Capturing stderr to a temporary file and displaying it on failure would significantly improve debuggability.

  local err_out
  err_out="$TMPDIR_ROOT/$(printf '%s' "$label" | tr ' /[]' '____').err"

  local exit_code=0
  EXECUTION_FILE="$execution_file" \
  HEADING="## Claude Review - Test Scope" \
  REVIEW_OUT_FILE="$out" \
    bash "$EXTRACT" >/dev/null 2>"$err_out" || exit_code=$?

  local result="PASS" reason=""
  if [ "$exit_code" != "$expected_exit" ]; then
    result="FAIL"
    reason="exit=$exit_code expected=$expected_exit"
  else
    for pattern in "${patterns[@]}"; do
      if ! grep -qE "$pattern" "$out"; then
        result="FAIL"
        reason="output missing pattern '$pattern'"
        break
      fi
    done
  fi

  if [ "$result" = "PASS" ]; then
    PASS=$((PASS + 1))
    printf '  [PASS] %s\n' "$label"
  else
    FAIL=$((FAIL + 1))
    printf '  [FAIL] %s — %s\n' "$label" "$reason"
    if [ -s "$err_out" ]; then
      printf '         stderr:\n'
      sed 's/^/         | /' "$err_out"
    fi
    printf '         output (first 5 lines):\n'
    head -n 5 "$out" 2>/dev/null | sed 's/^/         | /'
  fi

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors Claude review extraction in the PR CI workflow by moving the inline jq/fallback logic into a dedicated script and adding a fast local fixture-based test runner to prevent regressions in review output parsing.

Changes:

  • Extracted Claude execution_file parsing into .github/scripts/extract-claude-review.sh and updated all Claude review jobs to call it.
  • Added .github/scripts/test-extract-claude-review.sh plus JSON fixtures under .github/fixtures/ to locally validate extraction behavior across known regression cases.
  • Expanded the review_automation_changed filter to include .github/scripts/** so AI review gating applies to changes in the extracted script.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
.github/workflows/pr-ci.yml Replaces repeated inline extraction logic with the shared extraction script; expands gating filter to include scripts.
.github/scripts/extract-claude-review.sh New env-driven extraction script with fallback behavior on missing/unparseable output.
.github/scripts/test-extract-claude-review.sh New local test runner validating exit codes and extracted output against fixtures.
.github/fixtures/claude-execution-success.json Fixture for successful multi-line .result extraction.
.github/fixtures/claude-execution-success-with-high.json Fixture ensuring [HIGH] markers survive extraction.
.github/fixtures/claude-execution-error.json Fixture for error case where output must fall back to assistant text.
.github/fixtures/claude-execution-empty.json Fixture for empty message list failure behavior.
.github/fixtures/claude-execution-bad.json Fixture for malformed JSON failure behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 64 to +68
review_automation_changed:
- ".github/workflows/pr-ci.yml"
- ".github/prompts/**"
- ".github/actions/**"
- ".github/scripts/**"
Comment on lines +30 to +31
if [ ! -x "$EXTRACT" ]; then
echo "extract-claude-review.sh not found or not executable at $EXTRACT" >&2
@loothero loothero merged commit 90301c5 into main May 4, 2026
24 checks passed
@loothero loothero deleted the ci/extract-claude-review-script branch May 4, 2026 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants