ci: factor Claude review extraction into a locally-testable script#147
Conversation
Move the per-job Extract-review-output shell from pr-ci.yml into
.github/scripts/extract-claude-review.sh so the extraction logic can be
verified locally against fixture files before pushing — replacing the
slow PR -> merge -> rebase-test-PR -> wait cycle that has been
producing follow-up patches.
New files:
- .github/scripts/extract-claude-review.sh — extraction logic, env-driven
(EXECUTION_FILE, HEADING, REVIEW_OUT_FILE), exits 1 with heading-prefixed
fallback on any failure
- .github/scripts/test-extract-claude-review.sh — local test runner
- .github/fixtures/claude-execution-{success,success-with-high,error,empty,bad}.json
— fixture inputs covering the regressions we hit on PRs #145/#146
Test cases (all asserted by the runner):
1. Multi-line .result success — full body preserved (catches the tail -1
regression that would silently drop bracketed [HIGH] markers)
2. .result containing [HIGH] — survives extraction so the downstream
blocking-check still fires
3. SDKResultError (no .result, has assistant text) — falls back to
assistant text concatenation
4. Empty JSON array — exit 1 with heading-prefixed fallback
5. Malformed JSON — exit 1 with heading-prefixed fallback (catches jq
exit 5 propagating through bash -eo pipefail)
6. Missing file — exit 1 with heading-prefixed fallback
pr-ci.yml changes:
- Each of the four claude-review-* jobs replaces 22 lines of inline
shell with `run: bash .github/scripts/extract-claude-review.sh`
- Add `.github/scripts/**` to review_automation_changed so the security
gate that disables AI reviews on workflow-modifying PRs also covers
the extracted script
Local usage:
bash .github/scripts/test-extract-claude-review.sh
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 43 minutes and 7 seconds.Comment |
|
🚅 Deployed to the summit-pr-147 environment in Summit 3 services not affected by this PR
|
There was a problem hiding this comment.
Code Review
This pull request introduces a bash script and a corresponding test suite to extract review content from Claude execution logs. The extraction logic handles both successful results and partial assistant responses in case of errors, using a set of JSON fixtures for validation. Feedback focuses on improving the robustness of the JSON parsing with jq by adding type checks and enhancing the test runner's debuggability by capturing error output during execution.
| REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \ | ||
| "$EXECUTION_FILE" 2>/dev/null || true) |
There was a problem hiding this comment.
The jq filter for assistant text concatenation could fail if the .text field is not a string (e.g., if it's null or an object), as join requires an array of strings. While the || true and 2>/dev/null prevent the script from exiting, it would result in an empty REVIEW and trigger the generic fallback message even if some text was extractable. Adding a type check makes the extraction more robust.
| REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \ | |
| "$EXECUTION_FILE" 2>/dev/null || true) | |
| REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text" and (.text | type == "string")) | .text] | join("\n\n")' \ | |
| "$EXECUTION_FILE" 2>/dev/null || true) |
| local exit_code=0 | ||
| EXECUTION_FILE="$execution_file" \ | ||
| HEADING="## Claude Review - Test Scope" \ | ||
| REVIEW_OUT_FILE="$out" \ | ||
| bash "$EXTRACT" >/dev/null 2>&1 || exit_code=$? | ||
|
|
||
| local result="PASS" reason="" | ||
| if [ "$exit_code" != "$expected_exit" ]; then | ||
| result="FAIL" | ||
| reason="exit=$exit_code expected=$expected_exit" | ||
| else | ||
| for pattern in "${patterns[@]}"; do | ||
| if ! grep -qE "$pattern" "$out"; then | ||
| result="FAIL" | ||
| reason="output missing pattern '$pattern'" | ||
| break | ||
| fi | ||
| done | ||
| fi | ||
|
|
||
| if [ "$result" = "PASS" ]; then | ||
| PASS=$((PASS + 1)) | ||
| printf ' [PASS] %s\n' "$label" | ||
| else | ||
| FAIL=$((FAIL + 1)) | ||
| printf ' [FAIL] %s — %s\n' "$label" "$reason" | ||
| printf ' output (first 5 lines):\n' | ||
| head -n 5 "$out" 2>/dev/null | sed 's/^/ | /' | ||
| fi |
There was a problem hiding this comment.
The test runner currently suppresses all output from the extraction script (>/dev/null 2>&1). If a test fails due to a system error (e.g., jq missing or a syntax error in the script), the reason is hidden. Capturing stderr to a temporary file and displaying it on failure would significantly improve debuggability.
local err_out
err_out="$TMPDIR_ROOT/$(printf '%s' "$label" | tr ' /[]' '____').err"
local exit_code=0
EXECUTION_FILE="$execution_file" \
HEADING="## Claude Review - Test Scope" \
REVIEW_OUT_FILE="$out" \
bash "$EXTRACT" >/dev/null 2>"$err_out" || exit_code=$?
local result="PASS" reason=""
if [ "$exit_code" != "$expected_exit" ]; then
result="FAIL"
reason="exit=$exit_code expected=$expected_exit"
else
for pattern in "${patterns[@]}"; do
if ! grep -qE "$pattern" "$out"; then
result="FAIL"
reason="output missing pattern '$pattern'"
break
fi
done
fi
if [ "$result" = "PASS" ]; then
PASS=$((PASS + 1))
printf ' [PASS] %s\n' "$label"
else
FAIL=$((FAIL + 1))
printf ' [FAIL] %s — %s\n' "$label" "$reason"
if [ -s "$err_out" ]; then
printf ' stderr:\n'
sed 's/^/ | /' "$err_out"
fi
printf ' output (first 5 lines):\n'
head -n 5 "$out" 2>/dev/null | sed 's/^/ | /'
fiThere was a problem hiding this comment.
Pull request overview
Refactors Claude review extraction in the PR CI workflow by moving the inline jq/fallback logic into a dedicated script and adding a fast local fixture-based test runner to prevent regressions in review output parsing.
Changes:
- Extracted Claude
execution_fileparsing into.github/scripts/extract-claude-review.shand updated all Claude review jobs to call it. - Added
.github/scripts/test-extract-claude-review.shplus JSON fixtures under.github/fixtures/to locally validate extraction behavior across known regression cases. - Expanded the
review_automation_changedfilter to include.github/scripts/**so AI review gating applies to changes in the extracted script.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/pr-ci.yml | Replaces repeated inline extraction logic with the shared extraction script; expands gating filter to include scripts. |
| .github/scripts/extract-claude-review.sh | New env-driven extraction script with fallback behavior on missing/unparseable output. |
| .github/scripts/test-extract-claude-review.sh | New local test runner validating exit codes and extracted output against fixtures. |
| .github/fixtures/claude-execution-success.json | Fixture for successful multi-line .result extraction. |
| .github/fixtures/claude-execution-success-with-high.json | Fixture ensuring [HIGH] markers survive extraction. |
| .github/fixtures/claude-execution-error.json | Fixture for error case where output must fall back to assistant text. |
| .github/fixtures/claude-execution-empty.json | Fixture for empty message list failure behavior. |
| .github/fixtures/claude-execution-bad.json | Fixture for malformed JSON failure behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| review_automation_changed: | ||
| - ".github/workflows/pr-ci.yml" | ||
| - ".github/prompts/**" | ||
| - ".github/actions/**" | ||
| - ".github/scripts/**" |
| if [ ! -x "$EXTRACT" ]; then | ||
| echo "extract-claude-review.sh not found or not executable at $EXTRACT" >&2 |
Summary
Replace the slow PR → merge → rebase-test-PR → wait iteration loop with a local fixture-based test runner. The extraction logic that has been the source of every recent regression (#145, #146) lives in a single script that's exercised end-to-end in ~1 second locally.
What changed
.github/scripts/extract-claude-review.sh— theExtract review outputshell, lifted verbatim out ofpr-ci.yml. Env-driven (EXECUTION_FILE,HEADING,REVIEW_OUT_FILE)..github/scripts/test-extract-claude-review.sh— local runner that asserts exit codes and output content against every regression we've hit..github/fixtures/claude-execution-*.json— five sample execution files matching the actualJSON.stringify(SDKMessage[])shape frombase-action/src/run-claude-sdk.ts.pr-ci.yml— each of the fourclaude-review-*jobs replaces 22 lines of inline shell withrun: bash .github/scripts/extract-claude-review.sh. Net workflow diff: -68 lines.review_automation_changedfilter picks up.github/scripts/**so the security gate that disables AI reviews on self-modifying PRs covers the extracted script too.Local usage
The six cases collectively cover every regression we've shipped so far:
fc83c6c'stail -1bug would have silently chopped a multi-line.resultto its last line, dropping bracketed[HIGH]markers from/tmp/review.txtand bypassing the severity gate. Caught by case 1 + 2.4b7c45a's pre-fix state would have crashed on malformed JSON viajqexit 5 propagating throughbash -eo pipefail. Caught by case 5.execution_filehandling) — caught by case 6.Test plan
bash .github/scripts/test-extract-claude-review.sh→ 6 / 6 passingpr-ci.yml, so it triggersreview_automation_modifiedand AI reviews are gated off — only the deterministic jobs run)🤖 Generated with Claude Code