docs: add atomic chaining follow-up implementation plan and tests#61
docs: add atomic chaining follow-up implementation plan and tests#61
Conversation
Items from 2026-02-20-atomic-chaining-followup.md: **Item 1 — SKILL.md:** Update capability count 69→70, remove check_run domain, add Chain section covering ghx chain syntax, output shape, chainable capabilities, and a close+comment example. **Item 2 — Benchmark:** Add chaining scenario set with two new scenarios: - issue-triage-atomic-wf-001 (labels + assignee + milestone in one chain) - pr-review-submit-atomic-wf-001 (PR comment review via chain) Add chaining set to scenario-sets.json; include both in default and all. **Item 7 — Document registry tests:** Expand document-registry.test.ts from 5 tests to exhaustive coverage of all 21 registered mutations and all 9 registered lookups. **Item 8 — pr.reviews.submit resolution validation:** Add two integration tests for executeTasks exercising the Phase 1→Phase 2 resolution path: - Verifies pullRequestId is correctly injected from repository.pullRequest.id - Verifies partial status when lookup returns null pullRequest Also add executeTasks no-resolution batch tests (Phase 2 only). https://claude.ai/code/session_014xrxL9Cssn7begmw8Jekub
|
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds documentation for atomic chaining follow-up, two benchmark chaining scenarios, updates scenario-set manifests and SKILL.md with chain usage, and expands integration and unit tests to cover batched mutation paths and pr.reviews.submit resolution across Phase 1 (ID lookup) and Phase 2 (batched mutation). Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Engine as Engine
participant GH as GitHubAPI
Client->>Engine: executeTasks(chain request)
Engine->>GH: Phase 1 - lookup IDs (query)
GH-->>Engine: resolved IDs (e.g., pullRequestId)
Engine->>GH: Phase 2 - batched mutation (queryRaw with injected IDs)
GH-->>Engine: mutation results
Engine-->>Client: aggregated results (status, per-step outcomes)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
- Move inline mock assertions to post-call in no-resolution integration test - Remove redundant capturedQueryRawVars array; use vi.fn mock.calls instead - Prefix all document-registry test names with "returns document for " - Rename checkpoint issue_state_updated → issue_view_non_empty (accuracy) - Add chaining scenarios to full-seeded scenario set - Update impl-plan status table: items 1,2,7,8 → Done (this PR) - Add chain limitation note to SKILL.md (no cross-step data passing) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tests assumed default/workflows/all were always equal. Now that chaining scenarios are included in default and all, update assertions to match the differentiated set structure (workflows ⊂ default = all = full-seeded). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/core/test/unit/document-registry.test.ts (1)
89-91: Strengthen the negative-path assertions to match the expected error message.Both
.toThrow()calls pass for any thrown value, not just the expected registry error. SincegetMutationDocumentandgetLookupDocumentthrow a known message ("No mutation document registered for operation: ..."/"No lookup document registered for operation: ..."), pinning the assertion to that string makes the test fail if the error source changes unexpectedly.♻️ Proposed refinement
- it("throws on unknown mutation", () => { - expect(() => getMutationDocument("NonExistentMutation")).toThrow() - }) + it("throws on unknown mutation", () => { + expect(() => getMutationDocument("NonExistentMutation")).toThrow( + "No mutation document registered for operation: NonExistentMutation", + ) + })- it("throws on unknown lookup", () => { - expect(() => getLookupDocument("NonExistentLookup")).toThrow() - }) + it("throws on unknown lookup", () => { + expect(() => getLookupDocument("NonExistentLookup")).toThrow( + "No lookup document registered for operation: NonExistentLookup", + ) + })Also applies to: 135-137
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/test/unit/document-registry.test.ts` around lines 89 - 91, Update the negative-path assertions to assert the exact error message thrown by getMutationDocument and getLookupDocument instead of using a generic .toThrow(): replace expect(() => getMutationDocument("NonExistentMutation")).toThrow() with an assertion that matches the specific message "No mutation document registered for operation: NonExistentMutation" (and similarly for getLookupDocument use "No lookup document registered for operation: ..."), ensuring the tests will fail if a different error or message is thrown.packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json (1)
20-26: Checkpoint doesn't verify the actual triage outcome — only that the issue exists.
condition: "non_empty"onissue.viewis satisfied by any readable issue, including one where the chain failed silently. It doesn't confirm the label'bug', assignee'aryeko', or milestone1were applied. A false-positive pass is possible if the issue already exists before the chain runs.If the assertion runner supports field-level conditions (e.g.,
contains_labelor similar), consider using a more specific check. Alternatively, split into three checkpoints — one per atomic mutation — to isolate which step's post-condition failed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json` around lines 20 - 26, The checkpoint "issue_view_non_empty" currently uses verification_task "issue.view" with condition "non_empty", which only checks existence and can false-positive; update this checkpoint to verify the actual triage outcomes by either (a) using field-level conditions on the same verification_task (e.g., conditions that assert contains_label:"bug", assignee:"aryeko", and milestone:1) or (b) replace the single checkpoint with three atomic checkpoints (e.g., "issue_has_label_bug", "issue_assigned_to_aryeko", "issue_milestone_1") each calling "issue.view" and asserting the specific field-level post-condition so you can isolate which mutation failed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/skills/using-ghx/SKILL.md`:
- Line 38: The sentence claiming a chain call is "one GraphQL round-trip" is
inaccurate for resolution-backed capabilities; update the SKILL.md text that
follows "Use `ghx chain`..." to clarify that while `ghx chain` batches mutations
to avoid partial state, some capabilities (e.g., `pr.reviews.submit`) require a
Phase 1 lookup before the Phase 2 mutation and therefore incur two network
round-trips per chain call; mirror the behavior asserted in the
`engine-execute-tasks-pr-review-submit.integration.test.ts` by replacing the
absolute "one GraphQL round-trip" claim with wording that notes
resolution-backed capabilities may need an extra lookup round-trip.
---
Nitpick comments:
In `@packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json`:
- Around line 20-26: The checkpoint "issue_view_non_empty" currently uses
verification_task "issue.view" with condition "non_empty", which only checks
existence and can false-positive; update this checkpoint to verify the actual
triage outcomes by either (a) using field-level conditions on the same
verification_task (e.g., conditions that assert contains_label:"bug",
assignee:"aryeko", and milestone:1) or (b) replace the single checkpoint with
three atomic checkpoints (e.g., "issue_has_label_bug",
"issue_assigned_to_aryeko", "issue_milestone_1") each calling "issue.view" and
asserting the specific field-level post-condition so you can isolate which
mutation failed.
In `@packages/core/test/unit/document-registry.test.ts`:
- Around line 89-91: Update the negative-path assertions to assert the exact
error message thrown by getMutationDocument and getLookupDocument instead of
using a generic .toThrow(): replace expect(() =>
getMutationDocument("NonExistentMutation")).toThrow() with an assertion that
matches the specific message "No mutation document registered for operation:
NonExistentMutation" (and similarly for getLookupDocument use "No lookup
document registered for operation: ..."), ensuring the tests will fail if a
different error or message is thrown.
…bilities "One GraphQL round-trip" is only true for no-resolution capabilities. Capabilities like pr.reviews.submit require a Phase 1 node-ID lookup plus Phase 2 mutation — two round-trips. Clarify to "as few as possible (typically one; node-ID lookups add a preflight query)". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/skills/using-ghx/SKILL.md`:
- Around line 38-61: The intro overstates guarantees: replace phrasing like
"must succeed together" and "atomically" for the ghx chain command with clear
language that it batches steps into fewer GraphQL round-trips but does not
provide transactional rollback; explicitly state that the returned { status,
results[], meta } can be "success", "partial" or "failed" and that callers must
check status and handle partial results (results[i] containing { task, ok, data
| error }). Update mentions of "atomically" and "must succeed together" to
clarify "batched into a single request but not all-or-nothing."
---
Duplicate comments:
In `@packages/core/skills/using-ghx/SKILL.md`:
- Line 38: This review is a duplicate noting that the "one round-trip" wording
already accounts for resolution-backed preflight queries; no change to the
SKILL.md text is required—mark the duplicate review comment resolved/removed and
ensure the existing sentence mentioning "Use `ghx chain`" (and the example
`pr.reviews.submit`) remains as-is.
"Must succeed together" and "atomically" imply ACID rollback that ghx chain doesn't provide — a partial result is possible. Replace with accurate batching language and add an explicit note that steps are not transactional. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Implements the atomic chaining follow-up plan from the design document. Adds comprehensive test coverage for the document registry and
executeTasksbatch mutation paths, updates SKILL.md documentation with chaining examples, and introduces two new benchmark scenarios for atomic operations.Changes
Documentation
check_runfrom domains list, and added new "Chain" section with syntax, examples, and guidance on when to useghx chainvsghx rundocs/plans/impl-plan-atomic-chaining-followup.mddocumenting remaining work items, status tracking, and execution orderTest Coverage
issue.close,issue.comments.create)pr.reviews.submit, including successful resolution and error handling when lookup returns nullBenchmark Scenarios
chainingscenario set and included both new scenarios indefaultandallsetsRelated Issue
Derived from
docs/plans/2026-02-20-atomic-chaining-followup.md. Items 3, 5, and 6 were already completed in PR #60.Change Type
Validation
pnpm run cipassesRelease Notes
Docs
https://claude.ai/code/session_014xrxL9Cssn7begmw8Jekub
Summary by CodeRabbit
New Features
Documentation
Tests