From 4a86cda7a6587e675e8d5c690bf2d8949ddd8ab1 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 21 Feb 2026 18:15:18 +0000 Subject: [PATCH 1/5] =?UTF-8?q?feat:=20atomic=20chaining=20follow-up=20?= =?UTF-8?q?=E2=80=94=20tests,=20SKILL.md,=20benchmark=20scenarios?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Items from 2026-02-20-atomic-chaining-followup.md: **Item 1 — SKILL.md:** Update capability count 69→70, remove check_run domain, add Chain section covering ghx chain syntax, output shape, chainable capabilities, and a close+comment example. **Item 2 — Benchmark:** Add chaining scenario set with two new scenarios: - issue-triage-atomic-wf-001 (labels + assignee + milestone in one chain) - pr-review-submit-atomic-wf-001 (PR comment review via chain) Add chaining set to scenario-sets.json; include both in default and all. **Item 7 — Document registry tests:** Expand document-registry.test.ts from 5 tests to exhaustive coverage of all 21 registered mutations and all 9 registered lookups. **Item 8 — pr.reviews.submit resolution validation:** Add two integration tests for executeTasks exercising the Phase 1→Phase 2 resolution path: - Verifies pullRequestId is correctly injected from repository.pullRequest.id - Verifies partial status when lookup returns null pullRequest Also add executeTasks no-resolution batch tests (Phase 2 only). 
https://claude.ai/code/session_014xrxL9Cssn7begmw8Jekub --- .../impl-plan-atomic-chaining-followup.md | 338 ++++++++++++++++++ packages/benchmark/scenario-sets.json | 9 +- .../chaining/issue-triage-atomic-wf-001.json | 30 ++ .../pr-review-submit-atomic-wf-001.json | 30 ++ packages/core/skills/using-ghx/SKILL.md | 38 +- ...te-tasks-no-resolution.integration.test.ts | 92 +++++ ...tasks-pr-review-submit.integration.test.ts | 161 +++++++++ .../core/test/unit/document-registry.test.ts | 139 ++++++- 8 files changed, 819 insertions(+), 18 deletions(-) create mode 100644 docs/plans/impl-plan-atomic-chaining-followup.md create mode 100644 packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json create mode 100644 packages/benchmark/scenarios/chaining/pr-review-submit-atomic-wf-001.json create mode 100644 packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts create mode 100644 packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts diff --git a/docs/plans/impl-plan-atomic-chaining-followup.md b/docs/plans/impl-plan-atomic-chaining-followup.md new file mode 100644 index 00000000..e18a89d5 --- /dev/null +++ b/docs/plans/impl-plan-atomic-chaining-followup.md @@ -0,0 +1,338 @@ +# Implementation Plan: Atomic Chaining Follow-up + +> Derived from `docs/plans/2026-02-20-atomic-chaining-followup.md`. +> Items 3 (GQL partial error handling) and 6 (resolution cache) are already done per PR #60. +> Items 5a (`issue.labels.remove`) and 5b (`issue.milestone.clear`) are already done — +> both have GraphQL handlers registered in `document-registry.ts` and integration tests. +> Item 4 (cross-step data passing) is deferred — see §4 below. 
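The deferred Item 4 design mentioned above amounts to interpolating `{{steps.N.data.path}}` placeholders in later steps' inputs from earlier steps' results before the batch mutation runs. A minimal sketch under that assumption (all names here are hypothetical illustrations, not the ghx API):

```typescript
// Hypothetical sketch of cross-step data passing: resolve
// "{{steps.N.data.path}}" placeholders against prior step results.
type StepResult = { ok: boolean; data?: unknown }

function interpolate(value: unknown, results: StepResult[]): unknown {
  if (typeof value === "string") {
    const m = /^\{\{steps\.(\d+)\.data\.([\w.]+)\}\}$/.exec(value)
    if (!m) return value
    // Walk the dotted path into the referenced step's data
    let cur: unknown = results[Number(m[1])]?.data
    for (const key of m[2].split(".")) {
      cur = (cur as Record<string, unknown> | undefined)?.[key]
    }
    return cur
  }
  if (Array.isArray(value)) return value.map((v) => interpolate(v, results))
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, interpolate(v, results)]),
    )
  }
  return value
}
```

A pre-processing pass in `executeTasks` would map `interpolate` over each pending step's input; the sketch ignores error handling for out-of-range step indices.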
+ +--- + +## Status + +| # | Item | Status | +|---|------|--------| +| 1 | SKILL.md update | **Remaining** | +| 2 | Benchmark chaining scenarios | **Remaining** | +| 3 | GQL partial error handling | ✅ Done (PR #60) | +| 4 | Cross-step data passing | Deferred (design only) | +| 5 | Expand chainable coverage | ✅ Done (PR #60) | +| 6 | Resolution cache | ✅ Done (PR #60) | +| 7 | Document registry test coverage | **Remaining** | +| 8 | `pr.reviews.submit` resolution validation | **Remaining** | + +--- + +## 1. SKILL.md Update + +**File:** `packages/core/skills/using-ghx/SKILL.md` + +### 1a. Fix stale metadata + +- Update frontmatter description: `"69 capabilities"` → `"70 capabilities"` (70 `.yaml` files + in `packages/core/src/core/registry/cards/`). +- Remove `check_run` from the Domains line. New list: + `repo`, `issue`, `pr`, `release`, `workflow`, `project_v2`. + +### 1b. Add "Chain" section + +Insert after the "Execute" section. Cover: + +1. **When to prefer `ghx chain`** — use when ≥2 mutations must succeed together (labelling + + assigning + milestoning an issue; submitting a review). Avoids partial state from + sequential `ghx run` calls. +2. **Syntax:** + ```bash + ghx chain --steps '[ + {"task":"issue.close","input":{"issueId":"I_abc"}}, + {"task":"issue.comments.create","input":{"owner":"o","name":"r","issueNumber":1,"body":"Closed."}} + ]' + ``` +3. **stdin variant:** + ```bash + ghx chain --steps - <<'EOF' + [{"task":"...","input":{...}}] + EOF + ``` +4. **Output shape:** `{ status, results[], meta }`. Check `status` first + (`"success"` | `"partial"` | `"failed"`). Each `results[i]` has `{ task, ok, data | error }`. +5. **Chainable capabilities** — any capability whose card has a `graphql:` block. These + are all capabilities that take node IDs or perform lookups internally. Capabilities with + only a `cli:` route cannot be chained. +6. **Example:** close an issue and leave a closing comment atomically (as above). + +--- + +## 2. 
Benchmark Chaining Scenarios + +### 2a. New scenario files + +Create directory `packages/benchmark/scenarios/chaining/` and add two scenario JSON files. + +**`packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json`** + +```json +{ + "type": "workflow", + "id": "issue-triage-atomic-wf-001", + "name": "Atomically triage issue: set labels, assignee, and milestone in one chain", + "prompt": "Issue #{{issueNumber}} in {{owner}}/{{name}} needs atomic triage. In a single ghx chain call, set the label 'bug', assign it to the first available assignee using issue.assignees.set, and set milestone number 1 using issue.milestone.set. Do not use separate ghx run calls.", + "expected_capabilities": ["issue.labels.set", "issue.assignees.set", "issue.milestone.set"], + "timeout_ms": 180000, + "allowed_retries": 1, + "fixture": { + "repo": "aryeko/ghx-bench-fixtures", + "bindings": { + "input.owner": "repo.owner", + "input.name": "repo.name", + "input.issueNumber": "resources.issue.number" + }, + "requires": ["issue"] + }, + "assertions": { + "expected_outcome": "success", + "checkpoints": [ + { + "name": "issue_has_bug_label", + "verification_task": "issue.view", + "verification_input": {}, + "condition": "non_empty" + } + ] + }, + "tags": ["workflow", "chaining", "issue", "atomic"] +} +``` + +**`packages/benchmark/scenarios/chaining/pr-review-submit-atomic-wf-001.json`** + +```json +{ + "type": "workflow", + "id": "pr-review-submit-atomic-wf-001", + "name": "Atomically submit a PR review with inline comment via chain", + "prompt": "PR #{{prNumber}} in {{owner}}/{{name}} needs a review. Use ghx chain with pr.reviews.submit to submit a COMMENT review with body 'Reviewed atomically via chain.' 
Do not use separate ghx run calls.", + "expected_capabilities": ["pr.reviews.submit"], + "timeout_ms": 180000, + "allowed_retries": 1, + "fixture": { + "repo": "aryeko/ghx-bench-fixtures", + "bindings": { + "input.owner": "repo.owner", + "input.name": "repo.name", + "input.prNumber": "resources.pr.number" + }, + "requires": ["pr"] + }, + "assertions": { + "expected_outcome": "success", + "checkpoints": [ + { + "name": "review_submitted", + "verification_task": "pr.reviews.list", + "verification_input": {}, + "condition": "non_empty" + } + ] + }, + "tags": ["workflow", "chaining", "pr", "atomic"] +} +``` + +### 2b. Update `scenario-sets.json` + +Add a `chaining` set and add the two new scenario IDs to `default` and `all`: + +```json +{ + "default": [ + "pr-fix-review-comments-wf-001", + "issue-triage-comment-wf-001", + "pr-review-comment-wf-001", + "ci-diagnose-run-wf-001", + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" + ], + "workflows": [ + "pr-fix-review-comments-wf-001", + "issue-triage-comment-wf-001", + "pr-review-comment-wf-001", + "ci-diagnose-run-wf-001" + ], + "chaining": [ + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" + ], + "all": [ + "pr-fix-review-comments-wf-001", + "issue-triage-comment-wf-001", + "pr-review-comment-wf-001", + "ci-diagnose-run-wf-001", + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" + ], + "full-seeded": [ + "pr-fix-review-comments-wf-001", + "issue-triage-comment-wf-001", + "pr-review-comment-wf-001", + "ci-diagnose-run-wf-001" + ] +} +``` + +### 2c. Assertion format note + +The benchmark runner evaluates assertions by running verification tasks via `executeTask` +(not by parsing `ChainResultEnvelope` directly). Checkpoints inspect final GitHub state, +so no changes to the assertion runner are needed. + +--- + +## 4. Cross-step data passing (Deferred) + +The design is described in the follow-up doc (Item 4). 
It requires template interpolation +(`{{steps.0.data.id}}`) and a pre-processing pass in `executeTasks`. Given complexity and +unclear demand, defer to a dedicated planning session when a concrete use case arises. + +--- + +## 7. Document Registry Test Coverage + +### 7a. Expand `document-registry.test.ts` + +**File:** `packages/core/test/unit/document-registry.test.ts` + +Replace the current 5-test file with exhaustive coverage. Add one `it` per registered +operation (rather than a data-driven loop) so individual failures name the broken operation. + +**Mutations to cover** (21 total, from `MUTATION_DOCUMENTS`): +`IssueAssigneesAdd`, `IssueAssigneesRemove`, `IssueAssigneesUpdate`, +`IssueBlockedByAdd`, `IssueBlockedByRemove`, +`IssueClose`, `IssueCommentCreate`, `IssueCreate`, `IssueDelete`, +`IssueLabelsAdd`, `IssueLabelsRemove`, `IssueLabelsUpdate`, +`IssueMilestoneSet`, `IssueParentRemove`, `IssueParentSet`, +`IssueReopen`, `IssueUpdate`, +`PrCommentReply`, `PrCommentResolve`, `PrCommentUnresolve`, +`PrReviewSubmit` + +**Lookups to cover** (9 total, from `LOOKUP_DOCUMENTS`): +`IssueAssigneesLookup`, `IssueAssigneesLookupByNumber`, +`IssueCreateRepositoryId`, +`IssueLabelsLookup`, `IssueLabelsLookupByNumber`, +`IssueMilestoneLookup`, `IssueNodeIdLookup`, +`IssueParentLookup`, `PrNodeId` + +Pattern for each assertion: +```ts +it("getMutationDocument returns IssueClose", () => { + expect(getMutationDocument("IssueClose")).toContain("mutation IssueClose") +}) +``` + +Keep existing `throws on unknown` tests. + +### 7b. New `executeTasks` integration test — no-resolution mutation + +**File:** `packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts` + +Test `executeTasks` with `issue.close` (no `resolution` config — skips Phase 1, goes +directly to Phase 2 batch mutation). 
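For orientation, the two-phase flow that these tests exercise (and that the no-resolution case short-circuits) can be sketched as follows. This is an illustrative model only, with hypothetical names, not the real `executeTasks` internals:

```typescript
// Hypothetical model: Phase 1 resolves lookups and injects node IDs,
// Phase 2 runs one batched mutation over all steps.
type Step = {
  task: string
  input: Record<string, unknown>
  // Resolution config, present only for capabilities needing a Phase 1 lookup
  lookup?: { injectAs: string; resolve: (input: Record<string, unknown>) => Promise<unknown> }
}

async function runChainSketch(
  steps: Step[],
  mutate: (steps: Step[]) => Promise<boolean[]>, // one batched mutation call
) {
  // Phase 1: run lookups, inject resolved IDs into step inputs
  for (const step of steps) {
    if (step.lookup) step.input[step.lookup.injectAs] = await step.lookup.resolve(step.input)
  }
  // Phase 2: single batched mutation; per-step success drives the envelope
  const oks = await mutate(steps)
  const succeeded = oks.filter(Boolean).length
  return {
    status: succeeded === steps.length ? "success" : succeeded > 0 ? "partial" : "failed",
    results: steps.map((s, i) => ({ task: s.task, ok: oks[i] ?? false })),
    meta: { total: steps.length, succeeded, failed: steps.length - succeeded },
  }
}
```

A step with no `lookup` skips Phase 1 entirely, which is exactly the path the no-resolution tests isolate.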
+
+Key mock: `githubClient.queryRaw` returns
+```ts
+{
+  data: {
+    step0: {
+      closeIssue: { issue: { id: "I_abc", number: 10, state: "CLOSED", closed: true } }
+    }
+  }
+}
+```
+
+Assertions:
+- `status === "success"`
+- `results[0].ok === true`
+- `results[0].data` contains `{ state: "CLOSED", closed: true }`
+- `meta.succeeded === 1`, `meta.failed === 0`
+
+Also add a two-step no-resolution test (`issue.close` + `issue.reopen` is not useful,
+but `issue.close` + `issue.comments.create` with `issueId` can demonstrate batching):
+- Mock `queryRaw` returns `{ data: { step0: {...closeIssue...}, step1: {...createIssueComment...} } }`
+- Verify `status === "success"`, `results.length === 2`, both `ok === true`
+
+---
+
+## 8. `pr.reviews.submit` Resolution Validation
+
+**File:** `packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts`
+
+Test `executeTasks` with `pr.reviews.submit`, which has a Phase 1 resolution:
+- **Lookup:** `PrNodeId` query → extracts `repository.pullRequest.id`
+- **Inject:** `pullRequestId` ← scalar from `repository.pullRequest.id`
+- **Mutation:** `PrReviewSubmit`
+
+### Mocks required
+
+Phase 1 uses `githubClient.query<TData>(batchQueryDocument, variables)`:
+```ts
+async query<TData>(query: string): Promise<TData> {
+  if (query.includes("PrNodeId")) {
+    return {
+      step0: { repository: { pullRequest: { id: "PR_xyz789" } } }
+    } as TData
+  }
+  throw new Error("unexpected query")
+}
+```
+
+Phase 2 uses `githubClient.queryRaw<TData>(document, variables)`:
+```ts
+async queryRaw<TData>(query: string): Promise<{ data?: TData; errors?: ... }> {
+  if (query.includes("PrReviewSubmit")) {
+    return {
+      data: {
+        step0: {
+          addPullRequestReview: {
+            pullRequestReview: {
+              id: "review-id-1",
+              state: "COMMENTED",
+              url: "https://github.com/.../pull/5#pullrequestreview-1",
+              body: "Reviewed atomically.",
+            }
+          }
+        }
+      }
+    } as { data: TData }
+  }
+  throw new Error("unexpected queryRaw")
+}
+```
+
+### Assertions
+
+- `status === "success"`
+- `results[0].ok === true`
+- `results[0].data` contains `{ state: "COMMENTED" }` — confirming the Phase 2 mutation
+  result was correctly mapped
+- Verify `query` was called once (Phase 1 lookup) and `queryRaw` was called once (Phase 2)
+  — use `vi.fn()` spies on the client methods
+
+---
+
+## Execution Order
+
+1. **Item 7a** — expand `document-registry.test.ts` (purely additive, no risk)
+2. **Item 1** — update `SKILL.md` (documentation only)
+3. **Item 7b** — add `executeTasks` no-resolution integration test
+4. **Item 8** — add `executeTasks` pr.reviews.submit integration test
+5. **Item 2** — add benchmark scenarios and update `scenario-sets.json`
+6.
Run `pnpm run ci --outputStyle=static` to verify all pass + +--- + +## Pre-PR Checklist + +- [ ] `pnpm run ci --outputStyle=static` passes +- [ ] No GraphQL operation files changed (no codegen needed) +- [ ] Coverage for touched files ≥90% +- [ ] No public API changes (no changeset needed) diff --git a/packages/benchmark/scenario-sets.json b/packages/benchmark/scenario-sets.json index 03b107cd..72660ed1 100644 --- a/packages/benchmark/scenario-sets.json +++ b/packages/benchmark/scenario-sets.json @@ -3,7 +3,9 @@ "pr-fix-review-comments-wf-001", "issue-triage-comment-wf-001", "pr-review-comment-wf-001", - "ci-diagnose-run-wf-001" + "ci-diagnose-run-wf-001", + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" ], "workflows": [ "pr-fix-review-comments-wf-001", @@ -11,11 +13,14 @@ "pr-review-comment-wf-001", "ci-diagnose-run-wf-001" ], + "chaining": ["issue-triage-atomic-wf-001", "pr-review-submit-atomic-wf-001"], "all": [ "pr-fix-review-comments-wf-001", "issue-triage-comment-wf-001", "pr-review-comment-wf-001", - "ci-diagnose-run-wf-001" + "ci-diagnose-run-wf-001", + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" ], "full-seeded": [ "pr-fix-review-comments-wf-001", diff --git a/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json b/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json new file mode 100644 index 00000000..b4d985c7 --- /dev/null +++ b/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json @@ -0,0 +1,30 @@ +{ + "type": "workflow", + "id": "issue-triage-atomic-wf-001", + "name": "Atomically triage issue: set labels, assignee, and milestone in one chain", + "prompt": "Issue #{{issueNumber}} in {{owner}}/{{name}} needs atomic triage. In a single ghx chain call, set the label 'bug' using issue.labels.set, set the assignee to 'aryeko' using issue.assignees.set, and set milestone number 1 using issue.milestone.set. 
All three must happen in one chain — do not use separate ghx run calls.", + "expected_capabilities": ["issue.labels.set", "issue.assignees.set", "issue.milestone.set"], + "timeout_ms": 180000, + "allowed_retries": 1, + "fixture": { + "repo": "aryeko/ghx-bench-fixtures", + "bindings": { + "input.owner": "repo.owner", + "input.name": "repo.name", + "input.issueNumber": "resources.issue.number" + }, + "requires": ["issue"] + }, + "assertions": { + "expected_outcome": "success", + "checkpoints": [ + { + "name": "issue_state_updated", + "verification_task": "issue.view", + "verification_input": {}, + "condition": "non_empty" + } + ] + }, + "tags": ["workflow", "chaining", "issue", "atomic"] +} diff --git a/packages/benchmark/scenarios/chaining/pr-review-submit-atomic-wf-001.json b/packages/benchmark/scenarios/chaining/pr-review-submit-atomic-wf-001.json new file mode 100644 index 00000000..b3b6305e --- /dev/null +++ b/packages/benchmark/scenarios/chaining/pr-review-submit-atomic-wf-001.json @@ -0,0 +1,30 @@ +{ + "type": "workflow", + "id": "pr-review-submit-atomic-wf-001", + "name": "Atomically submit a PR comment review via chain", + "prompt": "PR #{{prNumber}} in {{owner}}/{{name}} needs a review. Use a single ghx chain call with pr.reviews.submit to submit a COMMENT review with body 'Reviewed atomically via ghx chain.' 
Do not use ghx run — use ghx chain so the pullRequestId is resolved in one round-trip.", + "expected_capabilities": ["pr.reviews.submit"], + "timeout_ms": 180000, + "allowed_retries": 1, + "fixture": { + "repo": "aryeko/ghx-bench-fixtures", + "bindings": { + "input.owner": "repo.owner", + "input.name": "repo.name", + "input.prNumber": "resources.pr.number" + }, + "requires": ["pr"] + }, + "assertions": { + "expected_outcome": "success", + "checkpoints": [ + { + "name": "review_submitted", + "verification_task": "pr.reviews.list", + "verification_input": {}, + "condition": "non_empty" + } + ] + }, + "tags": ["workflow", "chaining", "pr", "atomic"] +} diff --git a/packages/core/skills/using-ghx/SKILL.md b/packages/core/skills/using-ghx/SKILL.md index 51974584..2a911e23 100644 --- a/packages/core/skills/using-ghx/SKILL.md +++ b/packages/core/skills/using-ghx/SKILL.md @@ -1,5 +1,5 @@ --- -description: Execute GitHub operations via ghx — deterministic routing, normalized output, 69 capabilities +description: Execute GitHub operations via ghx — deterministic routing, normalized output, 70 capabilities --- # ghx CLI Skill @@ -14,7 +14,7 @@ If you don't know the capability ID or required inputs, list by domain first: ghx capabilities list --domain pr ``` -Domains: `repo`, `issue`, `pr`, `release`, `workflow`, `project_v2`, `check_run`. +Domains: `repo`, `issue`, `pr`, `release`, `workflow`, `project_v2`. Required inputs shown in brackets (e.g. `[owner, name, prNumber]`). Only if you need the full input/output schema for a specific capability: @@ -33,6 +33,40 @@ EOF Result: `{ ok, data, error, meta }`. Check `ok` first. If `ok=false` and `error.retryable=true`, retry once. +## Chain + +Use `ghx chain` when two or more mutations must succeed together. A single chain call is one GraphQL round-trip — it avoids partial state from sequential `ghx run` calls. 
+ +```bash +ghx chain --steps '[ + {"task":"issue.close","input":{"issueId":"I_abc"}}, + {"task":"issue.comments.create","input":{"owner":"o","name":"r","issueNumber":1,"body":"Closed."}} +]' +``` + +**stdin variant:** + +```bash +ghx chain --steps - <<'EOF' +[{"task":"...","input":{...}},{"task":"...","input":{...}}] +EOF +``` + +**Result:** `{ status, results[], meta }`. Check `status` first (`"success"` | `"partial"` | `"failed"`). Each `results[i]` has `{ task, ok, data | error }`. + +**Chainable capabilities:** any capability with a `graphql:` route (i.e. those that accept node IDs or perform internal lookups — labels, assignees, milestones, relations, reviews, comments). Capabilities with only a `cli:` route cannot be chained. + +**Example — close an issue and leave a closing comment atomically:** + +```bash +ghx chain --steps - <<'EOF' +[ + {"task":"issue.close","input":{"issueId":"I_abc123"}}, + {"task":"issue.comments.create","input":{"owner":"octocat","name":"hello-world","issueNumber":42,"body":"Closed — see linked PR."}} +] +EOF +``` + ## Examples ```bash diff --git a/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts b/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts new file mode 100644 index 00000000..d45f03a1 --- /dev/null +++ b/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts @@ -0,0 +1,92 @@ +/** + * Integration tests for executeTasks with no-resolution mutations. + * + * Note: executeTasks with a single request delegates to executeTask (uses typed client + * methods). These tests use ≥2 requests to exercise the Phase 2 batch mutation path + * where queryRaw is called directly. 
+ */
+import { executeTasks } from "@core/core/routing/engine.js"
+import type { GithubClient } from "@core/gql/github-client.js"
+import { describe, expect, it } from "vitest"
+
+describe("executeTasks – no-resolution mutations (batch Phase 2 path)", () => {
+  it("succeeds for two no-resolution steps batched in a single mutation", async () => {
+    const githubClient = {
+      query: async () => {
+        throw new Error("Phase 1 query not expected: neither step has resolution config")
+      },
+      queryRaw: async <TData>(doc: unknown) => {
+        const query = String(doc)
+        // Both mutations are batched into one queryRaw call
+        expect(query).toContain("closeIssue")
+        expect(query).toContain("addComment")
+        return {
+          data: {
+            step0: {
+              closeIssue: { issue: { id: "I_abc123", number: 10, state: "CLOSED" } },
+            },
+            step1: {
+              addComment: {
+                commentEdge: {
+                  node: {
+                    id: "IC_comment1",
+                    body: "Closing comment.",
+                    url: "https://github.com/o/r/issues/10#issuecomment-1",
+                  },
+                },
+              },
+            },
+          } as TData,
+          errors: undefined,
+        }
+      },
+    } as unknown as GithubClient
+
+    const result = await executeTasks(
+      [
+        { task: "issue.close", input: { issueId: "I_abc123" } },
+        { task: "issue.comments.create", input: { issueId: "I_abc123", body: "Closing comment."
} }, + ], + { githubClient, githubToken: "test-token" }, + ) + + expect(result.status).toBe("success") + expect(result.meta.total).toBe(2) + expect(result.meta.succeeded).toBe(2) + expect(result.meta.failed).toBe(0) + expect(result.results).toHaveLength(2) + expect(result.results[0]).toMatchObject({ + task: "issue.close", + ok: true, + data: { closeIssue: { issue: { state: "CLOSED" } } }, + }) + expect(result.results[1]).toMatchObject({ + task: "issue.comments.create", + ok: true, + }) + }) + + it("returns failed status when queryRaw throws a transport error for all steps", async () => { + const githubClient = { + query: async () => { + throw new Error("Phase 1 query not expected") + }, + queryRaw: async () => { + throw new Error("network failure") + }, + } as unknown as GithubClient + + const result = await executeTasks( + [ + { task: "issue.close", input: { issueId: "I_abc123" } }, + { task: "issue.comments.create", input: { issueId: "I_abc123", body: "Closing comment." } }, + ], + { githubClient, githubToken: "test-token" }, + ) + + expect(result.status).toBe("failed") + expect(result.results).toHaveLength(2) + expect(result.results[0]?.ok).toBe(false) + expect(result.results[1]?.ok).toBe(false) + }) +}) diff --git a/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts b/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts new file mode 100644 index 00000000..8fd367b8 --- /dev/null +++ b/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts @@ -0,0 +1,161 @@ +/** + * Integration tests for executeTasks with pr.reviews.submit. 
+ *
+ * Validates the Phase 1 → Phase 2 resolution path for pr.reviews.submit:
+ * Phase 1: PrNodeId lookup → extracts repository.pullRequest.id
+ * Inject: pullRequestId ← scalar from repository.pullRequest.id
+ * Phase 2: PrReviewSubmit mutation with injected pullRequestId
+ *
+ * Note: executeTasks with a single request delegates to executeTask (typed client
+ * methods, no batch resolution). Tests use ≥2 requests to exercise the batch path.
+ */
+import { executeTasks } from "@core/core/routing/engine.js"
+import type { GithubClient } from "@core/gql/github-client.js"
+import { describe, expect, it, vi } from "vitest"
+
+describe("executeTasks – pr.reviews.submit resolution (Phase 1 → Phase 2)", () => {
+  it("resolves pullRequestId from PrNodeId lookup and injects into PrReviewSubmit mutation", async () => {
+    const capturedQueryRawVars: Record<string, unknown>[] = []
+
+    // Phase 1: batch query returning PrNodeId lookup result aliased as step0
+    const queryFn = vi.fn(async <TData>(_doc: unknown, _vars: unknown) => {
+      return {
+        step0: { repository: { pullRequest: { id: "PR_xyz789" } } },
+      } as TData
+    })
+
+    // Phase 2: capture variables to verify pullRequestId was resolved and injected
+    const queryRawFn = vi.fn(async <TData>(_doc: unknown, vars: unknown) => {
+      capturedQueryRawVars.push(vars as Record<string, unknown>)
+      return {
+        data: {
+          step0: {
+            addPullRequestReview: {
+              pullRequestReview: {
+                id: "review-id-1",
+                state: "COMMENTED",
+                url: "https://github.com/o/r/pull/5#pullrequestreview-1",
+                body: "Reviewed atomically.",
+              },
+            },
+          },
+          step1: {
+            closeIssue: { issue: { id: "I_abc", number: 10, state: "CLOSED" } },
+          },
+        } as TData,
+        errors: undefined,
+      }
+    })
+
+    const githubClient = {
+      query: queryFn,
+      queryRaw: queryRawFn,
+    } as unknown as GithubClient
+
+    const result = await executeTasks(
+      [
+        {
+          task: "pr.reviews.submit",
+          input: {
+            owner: "octocat",
+            name: "hello-world",
+            prNumber: 5,
+            event: "COMMENT",
+            body: "Reviewed atomically.",
+          },
+        },
+        // Second step forces the batch path (≥2 requests)
+        { task: "issue.close", input: { issueId: "I_abc" } },
+      ],
+      { githubClient, githubToken: "test-token" },
+    )
+
+    // Overall result
+    expect(result.status).toBe("success")
+    expect(result.meta.total).toBe(2)
+    expect(result.meta.succeeded).toBe(2)
+    expect(result.meta.failed).toBe(0)
+    expect(result.results).toHaveLength(2)
+
+    // PR review step
+    expect(result.results[0]).toMatchObject({
+      task: "pr.reviews.submit",
+      ok: true,
+      data: {
+        addPullRequestReview: {
+          pullRequestReview: { state: "COMMENTED" },
+        },
+      },
+    })
+
+    // Issue close step
+    expect(result.results[1]).toMatchObject({
+      task: "issue.close",
+      ok: true,
+    })
+
+    // Phase 1 ran exactly once (only pr.reviews.submit has a resolution lookup)
+    expect(queryFn).toHaveBeenCalledOnce()
+
+    // Phase 2 ran exactly once (batch mutation for both steps)
+    expect(queryRawFn).toHaveBeenCalledOnce()
+
+    // Verify the resolved pullRequestId was injected into the batch mutation variables.
+    // buildBatchMutation prefixes each step's variables with its alias (step0_*, step1_*, ...)
+    const mutVars = capturedQueryRawVars[0]
+    expect(mutVars).toBeDefined()
+    expect(mutVars?.["step0_pullRequestId"]).toBe("PR_xyz789")
+  })
+
+  it("returns partial status when PrNodeId lookup returns null pullRequest (bad inject path)", async () => {
+    // Phase 1: returns null pullRequest → applyInject throws for pullRequestId
+    const queryFn = vi.fn(async <TData>() => {
+      return {
+        step0: { repository: { pullRequest: null } },
+      } as TData
+    })
+
+    // Phase 2: only step1 (issue.close) reaches the batch; step0 failed in Phase 2 prep
+    const queryRawFn = vi.fn(async <TData>() => {
+      return {
+        data: {
+          step1: {
+            closeIssue: { issue: { id: "I_abc", number: 10, state: "CLOSED" } },
+          },
+        } as TData,
+        errors: undefined,
+      }
+    })
+
+    const githubClient = {
+      query: queryFn,
+      queryRaw: queryRawFn,
+    } as unknown as GithubClient
+
+    const result = await executeTasks(
+      [
+        {
+          task: "pr.reviews.submit",
+          input: {
+            owner: "octocat",
+            name: "hello-world",
+            prNumber: 5,
+            event: "COMMENT",
+            body: "Test.",
+          },
+        },
+        { task: "issue.close", input: { issueId: "I_abc" } },
+      ],
+      { githubClient, githubToken: "test-token" },
+    )
+
+    // step0 (pr.reviews.submit) fails due to resolution error; step1 (issue.close) succeeds
+    expect(result.status).toBe("partial")
+    expect(result.results[0]?.ok).toBe(false)
+    expect(result.results[0]).toMatchObject({
+      ok: false,
+      error: expect.objectContaining({ message: expect.stringContaining("pullRequestId") }),
+    })
+    expect(result.results[1]?.ok).toBe(true)
+  })
+})
diff --git a/packages/core/test/unit/document-registry.test.ts b/packages/core/test/unit/document-registry.test.ts
index 887badb1..ec95d86d 100644
--- a/packages/core/test/unit/document-registry.test.ts
+++ b/packages/core/test/unit/document-registry.test.ts
@@ -1,27 +1,138 @@
 import { getLookupDocument, getMutationDocument } from "@core/gql/document-registry.js"
 import { describe, expect, it } from "vitest"
 
-describe("document-registry", () => {
-  it("getLookupDocument returns
document for IssueLabelsLookup", () => { - const doc = getLookupDocument("IssueLabelsLookup") - expect(doc).toContain("query IssueLabelsLookup") +describe("document-registry – mutations", () => { + it("IssueAssigneesAdd", () => { + expect(getMutationDocument("IssueAssigneesAdd")).toContain("mutation IssueAssigneesAdd") }) - it("getLookupDocument returns document for IssueMilestoneLookup", () => { - const doc = getLookupDocument("IssueMilestoneLookup") - expect(doc).toContain("milestoneNumber") + it("IssueAssigneesRemove", () => { + expect(getMutationDocument("IssueAssigneesRemove")).toContain("mutation IssueAssigneesRemove") }) - it("getMutationDocument returns document for IssueLabelsUpdate", () => { - const doc = getMutationDocument("IssueLabelsUpdate") - expect(doc).toContain("mutation IssueLabelsUpdate") + it("IssueAssigneesUpdate", () => { + expect(getMutationDocument("IssueAssigneesUpdate")).toContain("mutation IssueAssigneesUpdate") }) - it("getLookupDocument throws on unknown operation", () => { - expect(() => getLookupDocument("UnknownOp")).toThrow() + it("IssueBlockedByAdd", () => { + expect(getMutationDocument("IssueBlockedByAdd")).toContain("mutation IssueBlockedByAdd") }) - it("getMutationDocument throws on unknown operation", () => { - expect(() => getMutationDocument("UnknownOp")).toThrow() + it("IssueBlockedByRemove", () => { + expect(getMutationDocument("IssueBlockedByRemove")).toContain("mutation IssueBlockedByRemove") + }) + + it("IssueClose", () => { + expect(getMutationDocument("IssueClose")).toContain("mutation IssueClose") + }) + + it("IssueCommentCreate", () => { + expect(getMutationDocument("IssueCommentCreate")).toContain("mutation IssueCommentCreate") + }) + + it("IssueCreate", () => { + expect(getMutationDocument("IssueCreate")).toContain("mutation IssueCreate") + }) + + it("IssueDelete", () => { + expect(getMutationDocument("IssueDelete")).toContain("mutation IssueDelete") + }) + + it("IssueLabelsAdd", () => { + 
expect(getMutationDocument("IssueLabelsAdd")).toContain("mutation IssueLabelsAdd") + }) + + it("IssueLabelsRemove", () => { + expect(getMutationDocument("IssueLabelsRemove")).toContain("mutation IssueLabelsRemove") + }) + + it("IssueLabelsUpdate", () => { + expect(getMutationDocument("IssueLabelsUpdate")).toContain("mutation IssueLabelsUpdate") + }) + + it("IssueMilestoneSet", () => { + expect(getMutationDocument("IssueMilestoneSet")).toContain("mutation IssueMilestoneSet") + }) + + it("IssueParentRemove", () => { + expect(getMutationDocument("IssueParentRemove")).toContain("mutation IssueParentRemove") + }) + + it("IssueParentSet", () => { + expect(getMutationDocument("IssueParentSet")).toContain("mutation IssueParentSet") + }) + + it("IssueReopen", () => { + expect(getMutationDocument("IssueReopen")).toContain("mutation IssueReopen") + }) + + it("IssueUpdate", () => { + expect(getMutationDocument("IssueUpdate")).toContain("mutation IssueUpdate") + }) + + it("PrCommentReply", () => { + expect(getMutationDocument("PrCommentReply")).toContain("mutation PrCommentReply") + }) + + it("PrCommentResolve", () => { + expect(getMutationDocument("PrCommentResolve")).toContain("mutation PrCommentResolve") + }) + + it("PrCommentUnresolve", () => { + expect(getMutationDocument("PrCommentUnresolve")).toContain("mutation PrCommentUnresolve") + }) + + it("PrReviewSubmit", () => { + expect(getMutationDocument("PrReviewSubmit")).toContain("mutation PrReviewSubmit") + }) + + it("throws on unknown mutation", () => { + expect(() => getMutationDocument("NonExistentMutation")).toThrow() + }) +}) + +describe("document-registry – lookups", () => { + it("IssueAssigneesLookup", () => { + expect(getLookupDocument("IssueAssigneesLookup")).toContain("query IssueAssigneesLookup") + }) + + it("IssueAssigneesLookupByNumber", () => { + expect(getLookupDocument("IssueAssigneesLookupByNumber")).toContain( + "query IssueAssigneesLookupByNumber", + ) + }) + + it("IssueCreateRepositoryId", () => { + 
expect(getLookupDocument("IssueCreateRepositoryId")).toContain("query IssueCreateRepositoryId") + }) + + it("IssueLabelsLookup", () => { + expect(getLookupDocument("IssueLabelsLookup")).toContain("query IssueLabelsLookup") + }) + + it("IssueLabelsLookupByNumber", () => { + expect(getLookupDocument("IssueLabelsLookupByNumber")).toContain( + "query IssueLabelsLookupByNumber", + ) + }) + + it("IssueMilestoneLookup", () => { + expect(getLookupDocument("IssueMilestoneLookup")).toContain("query IssueMilestoneLookup") + }) + + it("IssueNodeIdLookup", () => { + expect(getLookupDocument("IssueNodeIdLookup")).toContain("query IssueNodeIdLookup") + }) + + it("IssueParentLookup", () => { + expect(getLookupDocument("IssueParentLookup")).toContain("query IssueParentLookup") + }) + + it("PrNodeId", () => { + expect(getLookupDocument("PrNodeId")).toContain("query PrNodeId") + }) + + it("throws on unknown lookup", () => { + expect(() => getLookupDocument("NonExistentLookup")).toThrow() }) }) From 3b089ffab5796952860caca90a9d91f83a9022a7 Mon Sep 17 00:00:00 2001 From: Arye Kogan Date: Sat, 21 Feb 2026 20:26:58 +0200 Subject: [PATCH 2/5] =?UTF-8?q?fix:=20address=20PR=20#61=20review=20findin?= =?UTF-8?q?gs=20=E2=80=94=20test=20hygiene,=20docs,=20scenario=20fixes?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Move inline mock assertions to post-call in no-resolution integration test - Remove redundant capturedQueryRawVars array; use vi.fn mock.calls instead - Prefix all document-registry test names with "returns document for " - Rename checkpoint issue_state_updated → issue_view_non_empty (accuracy) - Add chaining scenarios to full-seeded scenario set - Update impl-plan status table: items 1,2,7,8 → Done (this PR) - Add chain limitation note to SKILL.md (no cross-step data passing) Co-Authored-By: Claude Sonnet 4.6 --- .../impl-plan-atomic-chaining-followup.md | 16 ++--- packages/benchmark/scenario-sets.json | 4 +- 
.../chaining/issue-triage-atomic-wf-001.json | 2 +- packages/core/skills/using-ghx/SKILL.md | 2 + ...te-tasks-no-resolution.integration.test.ts | 55 +++++++++-------- ...tasks-pr-review-submit.integration.test.ts | 7 +-- .../core/test/unit/document-registry.test.ts | 60 +++++++++---------- 7 files changed, 75 insertions(+), 71 deletions(-) diff --git a/docs/plans/impl-plan-atomic-chaining-followup.md b/docs/plans/impl-plan-atomic-chaining-followup.md index e18a89d5..9ffeb2f6 100644 --- a/docs/plans/impl-plan-atomic-chaining-followup.md +++ b/docs/plans/impl-plan-atomic-chaining-followup.md @@ -12,14 +12,14 @@ | # | Item | Status | |---|------|--------| -| 1 | SKILL.md update | **Remaining** | -| 2 | Benchmark chaining scenarios | **Remaining** | +| 1 | SKILL.md update | ✅ Done (this PR) | +| 2 | Benchmark chaining scenarios | ✅ Done (this PR) | | 3 | GQL partial error handling | ✅ Done (PR #60) | | 4 | Cross-step data passing | Deferred (design only) | | 5 | Expand chainable coverage | ✅ Done (PR #60) | | 6 | Resolution cache | ✅ Done (PR #60) | -| 7 | Document registry test coverage | **Remaining** | -| 8 | `pr.reviews.submit` resolution validation | **Remaining** | +| 7 | Document registry test coverage | ✅ Done (this PR) | +| 8 | `pr.reviews.submit` resolution validation | ✅ Done (this PR) | --- @@ -332,7 +332,7 @@ async queryRaw<TData>(query: string): Promise<{ data?: TData; errors?: ...
}> { ## Pre-PR Checklist -- [ ] `pnpm run ci --outputStyle=static` passes -- [ ] No GraphQL operation files changed (no codegen needed) -- [ ] Coverage for touched files ≥90% -- [ ] No public API changes (no changeset needed) +- [x] `pnpm run ci --outputStyle=static` passes +- [x] No GraphQL operation files changed (no codegen needed) +- [x] Coverage for touched files ≥90% +- [x] No public API changes (no changeset needed) diff --git a/packages/benchmark/scenario-sets.json b/packages/benchmark/scenario-sets.json index 72660ed1..3dc17dc6 100644 --- a/packages/benchmark/scenario-sets.json +++ b/packages/benchmark/scenario-sets.json @@ -26,6 +26,8 @@ "pr-fix-review-comments-wf-001", "issue-triage-comment-wf-001", "pr-review-comment-wf-001", - "ci-diagnose-run-wf-001" + "ci-diagnose-run-wf-001", + "issue-triage-atomic-wf-001", + "pr-review-submit-atomic-wf-001" ] } diff --git a/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json b/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json index b4d985c7..dd9b1215 100644 --- a/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json +++ b/packages/benchmark/scenarios/chaining/issue-triage-atomic-wf-001.json @@ -19,7 +19,7 @@ "expected_outcome": "success", "checkpoints": [ { - "name": "issue_state_updated", + "name": "issue_view_non_empty", "verification_task": "issue.view", "verification_input": {}, "condition": "non_empty" diff --git a/packages/core/skills/using-ghx/SKILL.md b/packages/core/skills/using-ghx/SKILL.md index 2a911e23..1a13902e 100644 --- a/packages/core/skills/using-ghx/SKILL.md +++ b/packages/core/skills/using-ghx/SKILL.md @@ -56,6 +56,8 @@ EOF **Chainable capabilities:** any capability with a `graphql:` route (i.e. those that accept node IDs or perform internal lookups — labels, assignees, milestones, relations, reviews, comments). Capabilities with only a `cli:` route cannot be chained. 
+**Limitation:** Steps run independently — outputs from one step cannot be referenced as inputs to another step in the same chain. + **Example — close an issue and leave a closing comment atomically:** ```bash diff --git a/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts b/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts index d45f03a1..6fabd99d 100644 --- a/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts +++ b/packages/core/test/integration/engine-execute-tasks-no-resolution.integration.test.ts @@ -7,39 +7,37 @@ */ import { executeTasks } from "@core/core/routing/engine.js" import type { GithubClient } from "@core/gql/github-client.js" -import { describe, expect, it } from "vitest" +import { describe, expect, it, vi } from "vitest" describe("executeTasks – no-resolution mutations (batch Phase 2 path)", () => { it("succeeds for two no-resolution steps batched in a single mutation", async () => { - const githubClient = { - query: async () => { - throw new Error("Phase 1 query not expected: neither step has resolution config") - }, - queryRaw: async <TData>(doc: unknown) => { - const query = String(doc) - // Both mutations are batched into one queryRaw call - expect(query).toContain("closeIssue") - expect(query).toContain("addComment") - return { - data: { - step0: { - closeIssue: { issue: { id: "I_abc123", number: 10, state: "CLOSED" } }, - }, - step1: { - addComment: { - commentEdge: { - node: { - id: "IC_comment1", - body: "Closing comment.", - url: "https://github.com/o/r/issues/10#issuecomment-1", - }, + const queryRawFn = vi.fn(async <TData>(_doc: unknown) => { + return { + data: { + step0: { + closeIssue: { issue: { id: "I_abc123", number: 10, state: "CLOSED" } }, + }, + step1: { + addComment: { + commentEdge: { + node: { + id: "IC_comment1", + body: "Closing comment.", + url: "https://github.com/o/r/issues/10#issuecomment-1", }, }, }, - } as TData, - errors:
undefined, - } + }, + } as TData, + errors: undefined, + } + }) + + const githubClient = { + query: async () => { + throw new Error("Phase 1 query not expected: neither step has resolution config") }, + queryRaw: queryRawFn, } as unknown as GithubClient const result = await executeTasks( @@ -64,6 +62,11 @@ describe("executeTasks – no-resolution mutations (batch Phase 2 path)", () => task: "issue.comments.create", ok: true, }) + + const calledDoc = String(queryRawFn.mock.calls[0]?.[0]) + expect(calledDoc).toContain("closeIssue") + expect(calledDoc).toContain("addComment") + expect(queryRawFn).toHaveBeenCalledOnce() }) it("returns failed status when queryRaw throws a transport error for all steps", async () => { diff --git a/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts b/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts index 8fd367b8..ebbb2b7b 100644 --- a/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts +++ b/packages/core/test/integration/engine-execute-tasks-pr-review-submit.integration.test.ts @@ -15,8 +15,6 @@ import { describe, expect, it, vi } from "vitest" describe("executeTasks – pr.reviews.submit resolution (Phase 1 → Phase 2)", () => { it("resolves pullRequestId from PrNodeId lookup and injects into PrReviewSubmit mutation", async () => { - const capturedQueryRawVars: Record<string, unknown>[] = [] // Phase 1: batch query returning PrNodeId lookup result aliased as step0 const queryFn = vi.fn(async (_doc: unknown, _vars: unknown) => { return { @@ -25,8 +23,7 @@ describe("executeTasks – pr.reviews.submit resolution (Phase 1 → Phase 2)", }) // Phase 2: capture variables to verify pullRequestId was resolved and injected - const queryRawFn = vi.fn(async (_doc: unknown, vars: unknown) => { - capturedQueryRawVars.push(vars as Record<string, unknown>) + const queryRawFn = vi.fn(async (_doc: unknown, _vars: unknown) => { return { data: { step0: { @@ -102,7 +99,7 @@
describe("executeTasks – pr.reviews.submit resolution (Phase 1 → Phase 2)", // Verify the resolved pullRequestId was injected into the batch mutation variables. // buildBatchMutation prefixes each step's variables with its alias (step0_*, step1_*, ...) - const mutVars = capturedQueryRawVars[0] + const mutVars = queryRawFn.mock.calls[0]?.[1] as Record<string, unknown> expect(mutVars).toBeDefined() expect(mutVars?.["step0_pullRequestId"]).toBe("PR_xyz789") }) diff --git a/packages/core/test/unit/document-registry.test.ts b/packages/core/test/unit/document-registry.test.ts index ec95d86d..12de78c8 100644 --- a/packages/core/test/unit/document-registry.test.ts +++ b/packages/core/test/unit/document-registry.test.ts @@ -2,87 +2,87 @@ import { getLookupDocument, getMutationDocument } from "@core/gql/document-regis import { describe, expect, it } from "vitest" describe("document-registry – mutations", () => { - it("IssueAssigneesAdd", () => { + it("returns document for IssueAssigneesAdd", () => { expect(getMutationDocument("IssueAssigneesAdd")).toContain("mutation IssueAssigneesAdd") }) - it("IssueAssigneesRemove", () => { + it("returns document for IssueAssigneesRemove", () => { expect(getMutationDocument("IssueAssigneesRemove")).toContain("mutation IssueAssigneesRemove") }) - it("IssueAssigneesUpdate", () => { + it("returns document for IssueAssigneesUpdate", () => { expect(getMutationDocument("IssueAssigneesUpdate")).toContain("mutation IssueAssigneesUpdate") }) - it("IssueBlockedByAdd", () => { + it("returns document for IssueBlockedByAdd", () => { expect(getMutationDocument("IssueBlockedByAdd")).toContain("mutation IssueBlockedByAdd") }) - it("IssueBlockedByRemove", () => { + it("returns document for IssueBlockedByRemove", () => { expect(getMutationDocument("IssueBlockedByRemove")).toContain("mutation IssueBlockedByRemove") }) - it("IssueClose", () => { + it("returns document for IssueClose", () => { expect(getMutationDocument("IssueClose")).toContain("mutation IssueClose") }) -
it("IssueCommentCreate", () => { + it("returns document for IssueCommentCreate", () => { expect(getMutationDocument("IssueCommentCreate")).toContain("mutation IssueCommentCreate") }) - it("IssueCreate", () => { + it("returns document for IssueCreate", () => { expect(getMutationDocument("IssueCreate")).toContain("mutation IssueCreate") }) - it("IssueDelete", () => { + it("returns document for IssueDelete", () => { expect(getMutationDocument("IssueDelete")).toContain("mutation IssueDelete") }) - it("IssueLabelsAdd", () => { + it("returns document for IssueLabelsAdd", () => { expect(getMutationDocument("IssueLabelsAdd")).toContain("mutation IssueLabelsAdd") }) - it("IssueLabelsRemove", () => { + it("returns document for IssueLabelsRemove", () => { expect(getMutationDocument("IssueLabelsRemove")).toContain("mutation IssueLabelsRemove") }) - it("IssueLabelsUpdate", () => { + it("returns document for IssueLabelsUpdate", () => { expect(getMutationDocument("IssueLabelsUpdate")).toContain("mutation IssueLabelsUpdate") }) - it("IssueMilestoneSet", () => { + it("returns document for IssueMilestoneSet", () => { expect(getMutationDocument("IssueMilestoneSet")).toContain("mutation IssueMilestoneSet") }) - it("IssueParentRemove", () => { + it("returns document for IssueParentRemove", () => { expect(getMutationDocument("IssueParentRemove")).toContain("mutation IssueParentRemove") }) - it("IssueParentSet", () => { + it("returns document for IssueParentSet", () => { expect(getMutationDocument("IssueParentSet")).toContain("mutation IssueParentSet") }) - it("IssueReopen", () => { + it("returns document for IssueReopen", () => { expect(getMutationDocument("IssueReopen")).toContain("mutation IssueReopen") }) - it("IssueUpdate", () => { + it("returns document for IssueUpdate", () => { expect(getMutationDocument("IssueUpdate")).toContain("mutation IssueUpdate") }) - it("PrCommentReply", () => { + it("returns document for PrCommentReply", () => { 
expect(getMutationDocument("PrCommentReply")).toContain("mutation PrCommentReply") }) - it("PrCommentResolve", () => { + it("returns document for PrCommentResolve", () => { expect(getMutationDocument("PrCommentResolve")).toContain("mutation PrCommentResolve") }) - it("PrCommentUnresolve", () => { + it("returns document for PrCommentUnresolve", () => { expect(getMutationDocument("PrCommentUnresolve")).toContain("mutation PrCommentUnresolve") }) - it("PrReviewSubmit", () => { + it("returns document for PrReviewSubmit", () => { expect(getMutationDocument("PrReviewSubmit")).toContain("mutation PrReviewSubmit") }) @@ -92,43 +92,43 @@ describe("document-registry – mutations", () => { }) describe("document-registry – lookups", () => { - it("IssueAssigneesLookup", () => { + it("returns document for IssueAssigneesLookup", () => { expect(getLookupDocument("IssueAssigneesLookup")).toContain("query IssueAssigneesLookup") }) - it("IssueAssigneesLookupByNumber", () => { + it("returns document for IssueAssigneesLookupByNumber", () => { expect(getLookupDocument("IssueAssigneesLookupByNumber")).toContain( "query IssueAssigneesLookupByNumber", ) }) - it("IssueCreateRepositoryId", () => { + it("returns document for IssueCreateRepositoryId", () => { expect(getLookupDocument("IssueCreateRepositoryId")).toContain("query IssueCreateRepositoryId") }) - it("IssueLabelsLookup", () => { + it("returns document for IssueLabelsLookup", () => { expect(getLookupDocument("IssueLabelsLookup")).toContain("query IssueLabelsLookup") }) - it("IssueLabelsLookupByNumber", () => { + it("returns document for IssueLabelsLookupByNumber", () => { expect(getLookupDocument("IssueLabelsLookupByNumber")).toContain( "query IssueLabelsLookupByNumber", ) }) - it("IssueMilestoneLookup", () => { + it("returns document for IssueMilestoneLookup", () => { expect(getLookupDocument("IssueMilestoneLookup")).toContain("query IssueMilestoneLookup") }) - it("IssueNodeIdLookup", () => { + it("returns document for 
IssueNodeIdLookup", () => { expect(getLookupDocument("IssueNodeIdLookup")).toContain("query IssueNodeIdLookup") }) - it("IssueParentLookup", () => { + it("returns document for IssueParentLookup", () => { expect(getLookupDocument("IssueParentLookup")).toContain("query IssueParentLookup") }) - it("PrNodeId", () => { + it("returns document for PrNodeId", () => { expect(getLookupDocument("PrNodeId")).toContain("query PrNodeId") }) From d886cdbbc5aefa7cd7167ab14b098ccdb59d66dd Mon Sep 17 00:00:00 2001 From: Arye Kogan Date: Sat, 21 Feb 2026 20:30:28 +0200 Subject: [PATCH 3/5] fix(benchmark): update scenario-sets-manifest tests for chaining set MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Tests assumed default/workflows/all were always equal. Now that chaining scenarios are included in default and all, update assertions to match the differentiated set structure (workflows ⊂ default = all = full-seeded). Co-Authored-By: Claude Sonnet 4.6 --- .../test/unit/scenario-sets-manifest.test.ts | 22 ++++++++++--------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/packages/benchmark/test/unit/scenario-sets-manifest.test.ts b/packages/benchmark/test/unit/scenario-sets-manifest.test.ts index 06848741..e6b54520 100644 --- a/packages/benchmark/test/unit/scenario-sets-manifest.test.ts +++ b/packages/benchmark/test/unit/scenario-sets-manifest.test.ts @@ -9,10 +9,18 @@ function loadScenarioSets(): Record<string, string[]> { } describe("scenario-sets manifest", () => { - it("keeps default set equal to all workflow scenarios", () => { + it("keeps default set equal to workflows plus chaining scenarios", () => { const scenarioSets = loadScenarioSets() - expect(scenarioSets.default).toEqual([ + expect(new Set(scenarioSets.default)).toEqual( + new Set([...(scenarioSets.workflows ?? []), ...(scenarioSets.chaining ??
[])]), + ) + }) + + it("defines workflows set as the workflow-only subset", () => { + const scenarioSets = loadScenarioSets() + + expect(scenarioSets.workflows).toEqual([ "pr-fix-review-comments-wf-001", "issue-triage-comment-wf-001", "pr-review-comment-wf-001", @@ -20,16 +28,10 @@ describe("scenario-sets manifest", () => { ]) }) - it("defines workflows set matching default", () => { - const scenarioSets = loadScenarioSets() - - expect(scenarioSets.workflows).toEqual(scenarioSets.default) - }) - - it("defines all as equal to workflows", () => { + it("defines all as equal to default", () => { const scenarioSets = loadScenarioSets() - expect(new Set(scenarioSets.all ?? [])).toEqual(new Set(scenarioSets.workflows ?? [])) + expect(new Set(scenarioSets.all ?? [])).toEqual(new Set(scenarioSets.default ?? [])) }) it("defines full-seeded as equal to all", () => { From 7686f5432a69a9cd3245cf3b3f7cf809e63fce15 Mon Sep 17 00:00:00 2001 From: Arye Kogan Date: Sat, 21 Feb 2026 20:41:54 +0200 Subject: [PATCH 4/5] fix(skill): correct round-trip claim for resolution-backed chain capabilities MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit "One GraphQL round-trip" is only true for no-resolution capabilities. Capabilities like pr.reviews.submit require a Phase 1 node-ID lookup plus Phase 2 mutation — two round-trips. Clarify to "as few as possible (typically one; node-ID lookups add a preflight query)". Co-Authored-By: Claude Sonnet 4.6 --- packages/core/skills/using-ghx/SKILL.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/packages/core/skills/using-ghx/SKILL.md b/packages/core/skills/using-ghx/SKILL.md index 1a13902e..c2b7e1b3 100644 --- a/packages/core/skills/using-ghx/SKILL.md +++ b/packages/core/skills/using-ghx/SKILL.md @@ -35,7 +35,7 @@ Result: `{ ok, data, error, meta }`. Check `ok` first. If `ok=false` and `error. ## Chain -Use `ghx chain` when two or more mutations must succeed together. 
A single chain call is one GraphQL round-trip — it avoids partial state from sequential `ghx run` calls. +Use `ghx chain` when two or more mutations must succeed together. It batches steps into as few GraphQL round-trips as possible (typically one; capabilities that require a node-ID lookup add a single preflight query) — avoiding partial state from sequential `ghx run` calls. ```bash ghx chain --steps '[ From 2c502d64e1776a1a5147b2829efd9fdf2a131c6d Mon Sep 17 00:00:00 2001 From: Arye Kogan Date: Sun, 22 Feb 2026 11:02:06 +0200 Subject: [PATCH 5/5] fix(skill): remove transactional guarantee language from Chain section MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit "Must succeed together" and "atomically" imply ACID rollback that ghx chain doesn't provide — a partial result is possible. Replace with accurate batching language and add an explicit note that steps are not transactional. Co-Authored-By: Claude Sonnet 4.6 --- packages/core/skills/using-ghx/SKILL.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/packages/core/skills/using-ghx/SKILL.md b/packages/core/skills/using-ghx/SKILL.md index c2b7e1b3..9682677b 100644 --- a/packages/core/skills/using-ghx/SKILL.md +++ b/packages/core/skills/using-ghx/SKILL.md @@ -35,7 +35,7 @@ Result: `{ ok, data, error, meta }`. Check `ok` first. If `ok=false` and `error. ## Chain -Use `ghx chain` when two or more mutations must succeed together. It batches steps into as few GraphQL round-trips as possible (typically one; capabilities that require a node-ID lookup add a single preflight query) — avoiding partial state from sequential `ghx run` calls. +Use `ghx chain` when two or more mutations should be submitted together in a single batched call. 
It batches steps into as few GraphQL round-trips as possible (typically one; capabilities that require a node-ID lookup add a single preflight query) — reducing latency and avoiding mid-sequence failures that would result from sequential `ghx run` calls. Note: steps are not transactional; a `"partial"` result is possible if one step fails after another has already succeeded. ```bash ghx chain --steps '[ @@ -58,7 +58,7 @@ EOF **Limitation:** Steps run independently — outputs from one step cannot be referenced as inputs to another step in the same chain. -**Example — close an issue and leave a closing comment atomically:** +**Example — close an issue and leave a closing comment in a single batched call:** ```bash ghx chain --steps - <<'EOF'