ci: factor Claude review extraction into a locally-testable script by loothero · Pull Request #147 · Provable-Games/summit

loothero · 2026-05-03T20:49:25Z

Summary

Replace the slow PR → merge → rebase-test-PR → wait iteration loop with a local fixture-based test runner. The extraction logic that has been the source of every recent regression (#145, #146) lives in a single script that's exercised end-to-end in ~1 second locally.

What changed

.github/scripts/extract-claude-review.sh — the Extract review output shell, lifted verbatim out of pr-ci.yml. Env-driven (EXECUTION_FILE, HEADING, REVIEW_OUT_FILE).
.github/scripts/test-extract-claude-review.sh — local runner that asserts exit codes and output content against every regression we've hit.
.github/fixtures/claude-execution-*.json — five sample execution files matching the actual JSON.stringify(SDKMessage[]) shape from base-action/src/run-claude-sdk.ts.
pr-ci.yml — each of the four claude-review-* jobs replaces 22 lines of inline shell with run: bash .github/scripts/extract-claude-review.sh. Net workflow diff: -68 lines.
review_automation_changed filter picks up .github/scripts/** so the security gate that disables AI reviews on self-modifying PRs covers the extracted script too.
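
The env-driven contract described in this list (EXECUTION_FILE, HEADING, REVIEW_OUT_FILE, with a heading-prefixed fallback and exit 1 on failure) might be sketched roughly as below. This is not the script from this PR. The function name, the default values, the fallback wording, and the assumption that result messages carry `.type == "result"` with a string `.result` are all illustrative guesses.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an env-driven extractor; NOT the script from this PR.
# Assumed: result messages have .type == "result" and a string .result.
# Assumed: default heading, default output path, and fallback wording.

extract_review() {
  local file="${EXECUTION_FILE:-}"
  local heading="${HEADING:-## Claude Review}"
  local out="${REVIEW_OUT_FILE:-/tmp/review.txt}"
  local review=""

  if [ -n "$file" ] && [ -f "$file" ]; then
    # Preferred source: the last result message's .result string, kept whole.
    # "|| true" stops a jq failure (bad JSON, jq missing) from aborting under -e.
    review=$(jq -r '[.[] | select(.type == "result") | .result | strings] | last // empty' \
      "$file" 2>/dev/null || true)

    if [ -z "$review" ]; then
      # Fallback: concatenate assistant text blocks.
      review=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \
        "$file" 2>/dev/null || true)
    fi
  fi

  if [ -z "$review" ]; then
    # Nothing extractable: write the heading plus a generic message and fail.
    printf '%s\n\n%s\n' "$heading" "Review output could not be extracted." > "$out"
    return 1
  fi

  printf '%s\n' "$review" > "$out"
}
```

A caller would set the three variables for a single invocation, for example `EXECUTION_FILE=run.json HEADING="## Scope" REVIEW_OUT_FILE=out.txt extract_review`, where the file names are placeholders.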

Local usage

bash .github/scripts/test-extract-claude-review.sh

Running extract-claude-review.sh fixture tests

  [PASS] success: multi-line .result preserves heading + body + summary
  [PASS] success-with-HIGH: bracketed [HIGH] marker survives extraction
  [PASS] error: SDKResultError falls back to assistant text concat
  [PASS] empty array: exits 1 with heading + fallback
  [PASS] malformed JSON: exits 1 with heading + fallback
  [PASS] missing file: exits 1 with heading + fallback

Result: 6 passed, 0 failed

The six cases collectively cover every regression we've shipped so far:

#146 (ci: iterate JSON array root when extracting Claude review text), commit fc83c6c: the tail -1 bug would have silently chopped a multi-line .result to its last line, dropping bracketed [HIGH] markers from /tmp/review.txt and bypassing the severity gate. Caught by cases 1 and 2.
#146, commit 4b7c45a: the pre-fix state would have crashed on malformed JSON via jq exit 5 propagating through bash -eo pipefail. Caught by case 5.
#144 (PR test: trigger Claude reviews across api/client/contracts scopes): the first failed run (missing execution_file handling). Caught by case 6.
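
The tail -1 regression can be reproduced in isolation. The sketch below uses an invented review string and a stand-in gate function (`has_high_marker` is a hypothetical name, and the real severity gate may check differently); it only shows that keeping the last line of a multi-line result discards a `[HIGH]` marker that appears earlier.

```shell
#!/usr/bin/env bash
# Invented review text, for illustration only.
review=$'## Claude Review\n\n[HIGH] Unvalidated input reaches the query builder\n\nSummary: one blocking finding.'

# Stand-in for the severity gate: succeeds when a bracketed [HIGH] marker is present.
has_high_marker() {
  printf '%s\n' "$1" | grep -qF '[HIGH]'
}

# Buggy shape: tail -1 keeps only the final line of the multi-line result.
truncated=$(printf '%s\n' "$review" | tail -1)

# Fixed shape: keep the whole string.
full=$review

if has_high_marker "$truncated"; then
  echo "truncated: gate fires"
else
  echo "truncated: gate bypassed"
fi

if has_high_marker "$full"; then
  echo "full: gate fires"
else
  echo "full: gate bypassed"
fi
```

With the truncated text the gate is bypassed; with the full text it fires.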

Test plan

bash .github/scripts/test-extract-claude-review.sh → 6 / 6 passing
CI green (this PR modifies pr-ci.yml, so it triggers review_automation_modified and AI reviews are gated off — only the deterministic jobs run)
After merge: future iteration on the extraction script no longer requires a PR cycle. Run the local test instead.

🤖 Generated with Claude Code

Move the per-job Extract-review-output shell from pr-ci.yml into .github/scripts/extract-claude-review.sh so the extraction logic can be verified locally against fixture files before pushing, replacing the slow PR -> merge -> rebase-test-PR -> wait cycle that has been producing follow-up patches.

New files:
- .github/scripts/extract-claude-review.sh: extraction logic, env-driven (EXECUTION_FILE, HEADING, REVIEW_OUT_FILE), exits 1 with heading-prefixed fallback on any failure
- .github/scripts/test-extract-claude-review.sh: local test runner
- .github/fixtures/claude-execution-{success,success-with-high,error,empty,bad}.json: fixture inputs covering the regressions we hit on PRs #145/#146

Test cases (all asserted by the runner):
1. Multi-line .result success: full body preserved (catches the tail -1 regression that would silently drop bracketed [HIGH] markers)
2. .result containing [HIGH]: survives extraction so the downstream blocking-check still fires
3. SDKResultError (no .result, has assistant text): falls back to assistant text concatenation
4. Empty JSON array: exit 1 with heading-prefixed fallback
5. Malformed JSON: exit 1 with heading-prefixed fallback (catches jq exit 5 propagating through bash -eo pipefail)
6. Missing file: exit 1 with heading-prefixed fallback

pr-ci.yml changes:
- Each of the four claude-review-* jobs replaces 22 lines of inline shell with `run: bash .github/scripts/extract-claude-review.sh`
- Add `.github/scripts/**` to review_automation_changed so the security gate that disables AI reviews on workflow-modifying PRs also covers the extracted script

Local usage:
  bash .github/scripts/test-extract-claude-review.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
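
How a non-zero exit inside a pipeline aborts a script under bash -eo pipefail, as in test case 5 of the commit message, can be shown without jq. In this sketch `fail5` is a hypothetical stand-in for a jq call that exits with status 5; the two child shells differ only in whether the pipeline is guarded with `|| true`.

```shell
#!/usr/bin/env bash
# fail5 is a hypothetical stand-in for a jq call that exits with status 5.

# Unguarded: with pipefail the pipeline reports fail5's status, the assignment
# inherits it, and -e aborts the child shell before the echo is reached.
unguarded_status=0
bash -eo pipefail -c 'fail5() { return 5; }; out=$(fail5 | cat); echo "reached fallback"' \
  || unguarded_status=$?
echo "unguarded exit status: $unguarded_status"

# Guarded: "|| true" absorbs the failure, so the child shell continues and can
# write its own fallback output.
guarded_output=$(bash -eo pipefail -c 'fail5() { return 5; }; out=$(fail5 | cat || true); echo "reached fallback"')
echo "guarded output: $guarded_output"
```

The guarded form is the shape visible in the reviewed script line `"$EXECUTION_FILE" 2>/dev/null || true)`, which lets the script reach its heading-prefixed fallback instead of dying mid-assignment.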

vercel · 2026-05-03T20:49:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
summit	Ready	Preview, Comment	May 3, 2026 8:49pm

coderabbitai · 2026-05-03T20:49:32Z

Warning

Rate limit exceeded

@loothero has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 43 minutes and 7 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: feda73b4-643f-48f4-92fd-1e3be613b6f1

📥 Commits

Reviewing files that changed from the base of the PR and between 5f0ae9f and 3a3e47a.

📒 Files selected for processing (8)

.github/fixtures/claude-execution-bad.json
.github/fixtures/claude-execution-empty.json
.github/fixtures/claude-execution-error.json
.github/fixtures/claude-execution-success-with-high.json
.github/fixtures/claude-execution-success.json
.github/scripts/extract-claude-review.sh
.github/scripts/test-extract-claude-review.sh
.github/workflows/pr-ci.yml

railway-app · 2026-05-03T20:49:33Z

🚅 Deployed to the summit-pr-147 environment in Summit

3 services not affected by this PR

summit-db
summit-indexer
summit-api

gemini-code-assist

Code Review

This pull request introduces a bash script and a corresponding test suite to extract review content from Claude execution logs. The extraction logic handles both successful results and partial assistant responses in case of errors, using a set of JSON fixtures for validation. Feedback focuses on improving the robustness of the JSON parsing with jq by adding type checks and enhancing the test runner's debuggability by capturing error output during execution.

gemini-code-assist · 2026-05-03T20:50:54Z

+  REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \
+    "$EXECUTION_FILE" 2>/dev/null || true)


The jq filter for assistant text concatenation could fail if the .text field is not a string (e.g., if it's null or an object), as join requires an array of strings. While the || true and 2>/dev/null prevent the script from exiting, it would result in an empty REVIEW and trigger the generic fallback message even if some text was extractable. Adding a type check makes the extraction more robust.

Suggested change

-  REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text] | join("\n\n")' \
-    "$EXECUTION_FILE" 2>/dev/null || true)
+  REVIEW=$(jq -r '[.[] | select(.type == "assistant") | .message.content[]? | select(.type == "text" and (.text | type == "string")) | .text] | join("\n\n")' \
+    "$EXECUTION_FILE" 2>/dev/null || true)
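
To see what the suggested type check does, the filter can be run against a small hand-written input. The sketch below assumes jq is installed; the sample messages and the `assistant_text` function name are invented for illustration and are not one of the PR's fixtures. One text block deliberately carries an object in `.text` so that the check has something to skip.

```shell
#!/usr/bin/env bash
# Requires jq. The sample messages below are invented for illustration.

# Print assistant text blocks joined by blank lines, skipping any block
# whose .text is not a string.
assistant_text() {
  jq -r '[.[]
          | select(.type == "assistant")
          | .message.content[]?
          | select(.type == "text" and (.text | type == "string"))
          | .text]
         | join("\n\n")' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'JSON'
[
  {"type": "assistant", "message": {"content": [
    {"type": "text", "text": "First finding."},
    {"type": "text", "text": {"unexpected": "object"}},
    {"type": "tool_use", "id": "hypothetical"}
  ]}},
  {"type": "assistant", "message": {"content": [
    {"type": "text", "text": "Second finding."}
  ]}}
]
JSON

if command -v jq >/dev/null 2>&1; then
  assistant_text "$sample"
else
  echo "jq not installed; skipping demo" >&2
fi
```

With the check in place, the two string blocks are printed separated by a blank line, and the object-valued block and the tool_use block are ignored.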

gemini-code-assist · 2026-05-03T20:50:54Z

+  local exit_code=0
+  EXECUTION_FILE="$execution_file" \
+  HEADING="## Claude Review - Test Scope" \
+  REVIEW_OUT_FILE="$out" \
+    bash "$EXTRACT" >/dev/null 2>&1 || exit_code=$?
+
+  local result="PASS" reason=""
+  if [ "$exit_code" != "$expected_exit" ]; then
+    result="FAIL"
+    reason="exit=$exit_code expected=$expected_exit"
+  else
+    for pattern in "${patterns[@]}"; do
+      if ! grep -qE "$pattern" "$out"; then
+        result="FAIL"
+        reason="output missing pattern '$pattern'"
+        break
+      fi
+    done
+  fi
+
+  if [ "$result" = "PASS" ]; then
+    PASS=$((PASS + 1))
+    printf '  [PASS] %s\n' "$label"
+  else
+    FAIL=$((FAIL + 1))
+    printf '  [FAIL] %s — %s\n' "$label" "$reason"
+    printf '         output (first 5 lines):\n'
+    head -n 5 "$out" 2>/dev/null | sed 's/^/         | /'
+  fi


The test runner currently suppresses all output from the extraction script (>/dev/null 2>&1). If a test fails due to a system error (e.g., jq missing or a syntax error in the script), the reason is hidden. Capturing stderr to a temporary file and displaying it on failure would significantly improve debuggability.

Suggested change

  local err_out
  err_out="$TMPDIR_ROOT/$(printf '%s' "$label" | tr ' /[]' '____').err"
  local exit_code=0
  EXECUTION_FILE="$execution_file" \
  HEADING="## Claude Review - Test Scope" \
  REVIEW_OUT_FILE="$out" \
    bash "$EXTRACT" >/dev/null 2>"$err_out" || exit_code=$?

  local result="PASS" reason=""
  if [ "$exit_code" != "$expected_exit" ]; then
    result="FAIL"
    reason="exit=$exit_code expected=$expected_exit"
  else
    for pattern in "${patterns[@]}"; do
      if ! grep -qE "$pattern" "$out"; then
        result="FAIL"
        reason="output missing pattern '$pattern'"
        break
      fi
    done
  fi

  if [ "$result" = "PASS" ]; then
    PASS=$((PASS + 1))
    printf '  [PASS] %s\n' "$label"
  else
    FAIL=$((FAIL + 1))
    printf '  [FAIL] %s — %s\n' "$label" "$reason"
    if [ -s "$err_out" ]; then
      printf '         stderr:\n'
      sed 's/^/         | /' "$err_out"
    fi
    printf '         output (first 5 lines):\n'
    head -n 5 "$out" 2>/dev/null | sed 's/^/         | /'
  fi
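
Reduced to its core, the capture-and-report pattern in this suggestion might look like the sketch below. The helper names `run_capturing_stderr` and `report_failure` are hypothetical and do not appear in the PR, and the failing command is a simulated one rather than the real extraction script.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; a reduced form of the stderr-capture pattern.

# Run a command with stdout discarded and stderr saved to a file,
# returning the command's own exit status.
run_capturing_stderr() {
  local err_file="$1"
  shift
  local status=0
  "$@" >/dev/null 2>"$err_file" || status=$?
  return "$status"
}

# Print a failure line, followed by the captured stderr when there is any.
report_failure() {
  local label="$1" err_file="$2"
  printf '  [FAIL] %s\n' "$label"
  if [ -s "$err_file" ]; then
    printf '         stderr:\n'
    sed 's/^/         | /' "$err_file"
  fi
}

err_file=$(mktemp)
# Simulated system error: a child shell that complains on stderr and exits 127.
if ! run_capturing_stderr "$err_file" bash -c 'echo "jq: command not found" >&2; exit 127'; then
  report_failure "simulated missing jq" "$err_file"
fi
```

Keeping stderr in a per-test file rather than discarding it means a passing run stays quiet while a failing run shows the underlying error next to the test label.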

Copilot

Pull request overview

Refactors Claude review extraction in the PR CI workflow by moving the inline jq/fallback logic into a dedicated script and adding a fast local fixture-based test runner to prevent regressions in review output parsing.

Changes:

Extracted Claude execution_file parsing into .github/scripts/extract-claude-review.sh and updated all Claude review jobs to call it.
Added .github/scripts/test-extract-claude-review.sh plus JSON fixtures under .github/fixtures/ to locally validate extraction behavior across known regression cases.
Expanded the review_automation_changed filter to include .github/scripts/** so AI review gating applies to changes in the extracted script.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

File	Description
.github/workflows/pr-ci.yml	Replaces repeated inline extraction logic with the shared extraction script; expands gating filter to include scripts.
.github/scripts/extract-claude-review.sh	New env-driven extraction script with fallback behavior on missing/unparseable output.
.github/scripts/test-extract-claude-review.sh	New local test runner validating exit codes and extracted output against fixtures.
.github/fixtures/claude-execution-success.json	Fixture for successful multi-line `.result` extraction.
.github/fixtures/claude-execution-success-with-high.json	Fixture ensuring `[HIGH]` markers survive extraction.
.github/fixtures/claude-execution-error.json	Fixture for error case where output must fall back to assistant text.
.github/fixtures/claude-execution-empty.json	Fixture for empty message list failure behavior.
.github/fixtures/claude-execution-bad.json	Fixture for malformed JSON failure behavior.

            review_automation_changed:
              - ".github/workflows/pr-ci.yml"
              - ".github/prompts/**"
              - ".github/actions/**"
+              - ".github/scripts/**"


+if [ ! -x "$EXTRACT" ]; then
+  echo "extract-claude-review.sh not found or not executable at $EXTRACT" >&2


Copilot AI review requested due to automatic review settings May 3, 2026 20:49

vercel Bot deployed to Preview May 3, 2026 20:49

Copilot started reviewing on behalf of loothero May 3, 2026 20:49

gemini-code-assist Bot reviewed May 3, 2026

Copilot AI reviewed May 3, 2026

loothero merged commit 90301c5 into main May 4, 2026
24 checks passed

loothero deleted the ci/extract-claude-review-script branch May 4, 2026 00:46
