Skip to content

Comments

refactor(core): split CLI capability adapter into domain modules#57

Merged
aryeko merged 18 commits intomainfrom
refactor/cli-domain-modules
Feb 19, 2026
Merged

refactor(core): split CLI capability adapter into domain modules#57
aryeko merged 18 commits intomainfrom
refactor/cli-domain-modules

Conversation

@aryeko
Copy link
Owner

@aryeko aryeko commented Feb 19, 2026

Summary

  • Extracts the ~2000-line cli-capability-adapter.ts monolith into six focused domain modules (repo, issue, pr, workflow, project-v2, release) under adapters/cli/domains/
  • Adds a capability-registry.ts that maps CliCapabilityId → handler function, decoupling dispatch from implementation
  • Pulls common arg-building patterns into adapters/cli/helpers.ts
  • Slims cli-capability-adapter.ts to a 76-line dispatcher that looks up and delegates to registered handlers
  • Adds unit tests per domain module and for the registry itself
  • Fixes e2e test linting: replaces maybeIt / try/finally patterns with describe.skipIf and beforeEach/afterEach, removes vitest/no-standalone-expect + vitest/no-conditional-expect ESLint rule suppressions
  • Updates docs/architecture/repository-structure.md and docs/architecture/adapters.md to reflect the new module layout

Test plan

  • pnpm run ci --outputStyle=static passes (728 tests, 92.4% statement coverage, 85.6% branch coverage)
  • All domain module handlers covered by unit tests
  • CLI capability registry covered by unit tests
  • E2e tests pass ESLint without rule suppressions
  • Merge conflict with main (1075a6e composite capabilities integration) resolved — pr.review.submit CLI handler was already absent from our domain modules

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor

    • Reorganized CLI adapter into a domain-based registry with per-domain handlers (repo, issue, PR, workflow, project_v2, release) and shared CLI helpers.
  • New Features

    • Expanded CLI capabilities across repositories, issues, PRs, workflows, projects, and releases with richer commands and consistent result normalization.
  • Tests

    • Added comprehensive unit tests for all CLI domains and helpers; streamlined e2e setup/teardown.
  • Docs

    • Updated architecture and repository-structure docs and added a changelog entry.

aryeko and others added 13 commits February 19, 2026 06:35
Move all shared utilities, constants, and normalization helpers out of the monolith into a dedicated helpers module. The helpers module exports symbols for use by domain modules, including parsing utilities, sensitive text detection, check bucket normalization, and data normalization functions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves all 17 PR + check_run handlers into a self-contained
domains/pr.ts module with a handlers Record export.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves 11 workflow handlers (list, get, run view, run cancel,
run rerun-all, dispatch, job logs) into a self-contained
domains/workflow.ts module with a handlers Record export.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves 6 GitHub Projects v2 handlers (org.get, user.get,
fields.list, items.list, item.add_issue, item.field.update)
into a self-contained domains/project-v2.ts module.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves 5 release handlers (list, get, create_draft, update,
publish_draft) into a self-contained domains/release.ts module.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merges all domain handler maps into a single registry with
getCliHandler() and listCliCapabilities() exports, mirroring
the pattern established by the GQL capability registry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace 2120-line monolithic handler with a 76-line dispatcher that
delegates to domain modules via getCliHandler(). Fix behavioral
parity issues found during integration:

- pr: rerun_failed args order (--failed after --repo), update title
  always returns "", diff.files returns raw data (not {files} wrapper)
- workflow: dispatch.run validates inputs (throws on non-object/empty
  key/non-primitive); run fallback id: 0 (not null) for runs/jobs/artifacts
- project-v2: combined validation error messages for fields.list,
  items.list, item.add_issue, item.field.update
- release: publish_draft missing-param error has no "invalid" keyword
  so mapErrorToCode returns UNKNOWN (not VALIDATION)

All 670 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ycle hooks

- Move XDG_CONFIG_HOME save/restore to beforeEach/afterEach in both e2e tests
- Replace maybeIt variable with describe.skipIf (recognized by vitest linter)
- Move opencode server teardown and process.chdir reset to afterEach
- Make conditional expect unconditional in opencode e2e test
- Remove vitest/no-standalone-expect and vitest/no-conditional-expect overrides
  from eslint.config.mjs — no longer needed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep registry-based dispatcher; main's removal of pr.review.submit CLI
handler is already handled (capability was never in our domain modules).
The CliCapabilityId type union is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@changeset-bot
Copy link

changeset-bot bot commented Feb 19, 2026

🦋 Changeset detected

Latest commit: e1ca924

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@ghx-dev/core Patch
@ghx-dev/benchmark Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

Warning

Rate limit exceeded

@aryeko has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 21 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Replaces a monolithic in-file CLI capability switch with a registry-based, domain-oriented CLI adapter. Adds per-domain handler modules, a capability registry, and shared CLI helpers; runCliCapability now resolves capabilities via the registry and delegates execution to domain handlers.

Changes

Cohort / File(s) Summary
Docs
docs/architecture/adapters.md, docs/architecture/repository-structure.md
Documentation updated to describe domain-based CLI adapters, the capability registry, and shared helper locations.
Core CLI wiring
packages/core/src/core/execution/adapters/cli-capability-adapter.ts, packages/core/src/core/execution/adapters/cli/capability-registry.ts, packages/core/src/core/execution/adapters/cli/helpers.ts
runCliCapability simplified to registry lookup + delegation; added CliCapabilityId values, getCliHandler/listCliCapabilities, and a shared helpers module (types, parsing, validation, normalization, constants).
Domain handlers
packages/core/src/core/execution/adapters/cli/domains/*
packages/core/src/core/execution/adapters/cli/domains/repo.ts, .../issue.ts, .../pr.ts, .../workflow.ts, .../project-v2.ts, .../release.ts
Introduced domain-specific handler modules exporting handler maps for repo, issue, pr, workflow, project-v2, and release. Handlers validate inputs, build gh CLI args, run the CLI, and normalize responses/errors.
Tests — unit
packages/core/test/unit/cli-capability-registry.test.ts, packages/core/test/unit/cli-domains-*.test.ts, packages/core/test/unit/cli-helpers.test.ts, packages/core/test/unit/cli-capability-adapter.test.ts
Added comprehensive unit tests for the registry, helpers, and each domain handler covering success, validation errors, CLI errors, and parsing edge cases.
Tests — e2e
packages/core/test/e2e/setup-install-verify.e2e.test.ts, packages/core/test/e2e/setup-opencode-skill.e2e.test.ts
Refactored e2e environment setup/teardown into beforeEach/afterEach hooks; improved isolation and test gating.
Config
eslint.config.mjs
Removed ESLint overrides for e2e tests (vitest plugin rule block removed).
Changelog
.changeset/fix-pr-review-followups.md
New changelog entry noting fixes related to CLI domain refactor, tests, and docs.

Sequence Diagram(s)

sequenceDiagram
    participant Request as Request Handler
    participant Registry as Capability Registry
    participant Domain as Domain Handler
    participant CLI as CLI Runner
    participant Normalizer as Result Normalizer

    Request->>Registry: getCliHandler(capabilityId)
    alt handler found
        Registry-->>Request: CliHandler
        Request->>Domain: handler(runner, params, card)
        Domain->>Domain: validate inputs
        Domain->>Domain: build CLI args
        Domain->>CLI: run gh command (stdout/stderr)
        CLI-->>Domain: stdout / stderr / exitCode
        Domain->>Normalizer: parse & normalize output or map error
        Normalizer-->>Domain: { data } / { error }
        Domain-->>Request: { ok: true, data, meta } or { ok: false, error, meta }
    else handler missing
        Registry-->>Request: undefined
        Request-->>Request: return NotFound error (meta.route_used)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐇 In a burrow of code I hopped with glee,
Handlers lined up like carrots on a tree,
Registry scouts the card and points the way,
Helpers tidy args so gh can play,
A rabbit cheers: tidy, test, and free! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'refactor(core): split CLI capability adapter into domain modules' is concise and clearly summarizes the main change: splitting a monolithic adapter file into domain-specific modules.
Description check ✅ Passed The PR description is comprehensive and addresses all major template sections. It includes a clear summary, change type (Refactor/maintenance), validation details (CI passes, test coverage included), changelog entry (changeset added), and documentation updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/cli-domain-modules

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Feb 19, 2026

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Nitpick comments (7)
packages/core/test/e2e/setup-install-verify.e2e.test.ts (1)

83-86: Optional: eliminate repeated as string casts with a single assertion

tarballName as string appears three times after expect(tarballName).toBeDefined(). A one-liner assignment cleans this up.

♻️ Proposed refactor
   expect(tarballName).toBeDefined()
-  const tarballPath = (tarballName as string).startsWith("/")
-    ? (tarballName as string)
-    : join(packDir, tarballName as string)
+  const tgz = tarballName!
+  const tarballPath = tgz.startsWith("/") ? tgz : join(packDir, tgz)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/e2e/setup-install-verify.e2e.test.ts` around lines 83 -
86, After the expect(tarballName).toBeDefined() assertion, cast once to a string
into a local const (e.g., const tarballNameStr = tarballName as string) and then
use tarballNameStr in the tarballPath computation instead of repeating
tarballName as string; update the tarballPath expression that references packDir
and tarballName to use this single asserted variable to eliminate repeated
casts.
packages/core/test/e2e/setup-opencode-skill.e2e.test.ts (1)

119-158: getSessionApi is duplicated from packages/benchmark/src/runner/suite-runner.ts.

The local implementation (without abort) diverges from the shared one already in the codebase. If the SDK's session API shape changes, both copies need updating. Consider importing/re-exporting the shared helper, or extracting it to a common test/helpers module.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts` around lines 119 -
158, getSessionApi is duplicated and diverges from the canonical implementation
in packages/benchmark/src/runner/suite-runner.ts; replace this local copy by
importing the shared helper (or move the canonical function to a common
test/helpers module) so tests use one implementation that includes the full
session API surface (including the abort method) and stays in sync with SDK
changes; update any call sites in this test to use the imported helper (or the
new shared module) instead of the inline getSessionApi function.
packages/core/test/unit/cli-helpers.test.ts (1)

14-19: Consider adding a negative-integer case.

parseStrictPositiveInt rejects zero and floats but a negative integer case (e.g., -5) would close the obvious gap in boundary testing. Minor since domain-level tests exercise these helpers indirectly.

Proposed addition
   it("rejects strings", () => expect(parseStrictPositiveInt("5")).toBeNull())
+  it("rejects negative integers", () => expect(parseStrictPositiveInt(-5)).toBeNull())
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-helpers.test.ts` around lines 14 - 19, Add a test
case to the parseStrictPositiveInt suite to assert that negative integers are
rejected (e.g., expect(parseStrictPositiveInt(-5)).toBeNull()); update the
describe("parseStrictPositiveInt") block in the existing unit test so it
includes this negative-integer case alongside the existing zero/float/string
tests to close the boundary-gap.
packages/core/test/unit/cli-capability-registry.test.ts (1)

71-74: Hardcoded count 45 in test description will become stale.

The assertion correctly uses ALL_CLI_CAPABILITY_IDS.length dynamically, but the description string "lists all 45 registered capability IDs" will silently lie once capabilities are added or removed.

♻️ Use a dynamic description
-  it("lists all 45 registered capability IDs", () => {
+  it(`lists all ${ALL_CLI_CAPABILITY_IDS.length} registered capability IDs`, () => {
     expect(listCliCapabilities()).toHaveLength(ALL_CLI_CAPABILITY_IDS.length)
   })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-capability-registry.test.ts` around lines 71 -
74, The test description hardcodes "45" and will become incorrect; update the
describe/it text for the listCliCapabilities test to avoid a hardcoded
count—either make it generic (e.g., "lists all registered capability IDs") or
interpolate the dynamic value using ALL_CLI_CAPABILITY_IDS.length (e.g., using a
template literal) so the description always matches the assertion that calls
listCliCapabilities() and ALL_CLI_CAPABILITY_IDS.length.
packages/core/src/core/execution/adapters/cli/capability-registry.ts (1)

9-16: allHandlers should be ALL_HANDLERS per the UPPER_SNAKE_CASE constant naming convention.

As per coding guidelines, "Use UPPER_SNAKE_CASE for true constants"; allHandlers is a module-level const that is never reassigned and qualifies as a true constant.

♻️ Rename to follow convention
-const allHandlers: Record<string, CliHandler> = {
+const ALL_HANDLERS: Record<string, CliHandler> = {
   ...repoHandlers,
   ...issueHandlers,
   ...prHandlers,
   ...workflowHandlers,
   ...projectV2Handlers,
   ...releaseHandlers,
 }

 export function getCliHandler(capabilityId: string): CliHandler | undefined {
-  return allHandlers[capabilityId]
+  return ALL_HANDLERS[capabilityId]
 }

 export function listCliCapabilities(): string[] {
-  return Object.keys(allHandlers)
+  return Object.keys(ALL_HANDLERS)
 }

As per coding guidelines: "Use UPPER_SNAKE_CASE for true constants."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/capability-registry.ts` around
lines 9 - 16, Rename the module-level constant allHandlers to ALL_HANDLERS to
follow UPPER_SNAKE_CASE for true constants; update every reference to
allHandlers in this file to ALL_HANDLERS and ensure the exported symbol (if any)
matches the new name; verify the type annotation Record<string, CliHandler> and
the spread sources (...repoHandlers, ...issueHandlers, ...prHandlers,
...workflowHandlers, ...projectV2Handlers, ...releaseHandlers) remain intact and
compile without changes.
packages/core/src/core/execution/adapters/cli/domains/project-v2.ts (1)

16-130: handleProjectV2OrgGet and handleProjectV2UserGet are near-identical.

Both handlers differ only in the parameter key (params.org vs params.user) and the capability ID string; the gh project view ... --owner ... --format json invocation and all normalization logic are duplicated verbatim.

♻️ Suggested extraction
+async function handleProjectV2GetByOwner(
+  runner: CliCommandRunner,
+  ownerValue: string,
+  projectNumber: number,
+  capabilityId: "project_v2.org.get" | "project_v2.user.get",
+): Promise<ResultEnvelope> {
+  const args = ["project", "view", String(projectNumber), "--owner", ownerValue, "--format", "json"]
+  const result = await runner.run("gh", args, DEFAULT_TIMEOUT_MS)
+  if (result.exitCode !== 0) {
+    const code = mapErrorToCode(result.stderr)
+    return normalizeError(
+      { code, message: sanitizeCliErrorMessage(result.stderr, result.exitCode), retryable: isRetryableErrorCode(code), details: { capabilityId, exitCode: result.exitCode } },
+      "cli", { capabilityId, reason: "CARD_FALLBACK" },
+    )
+  }
+  const data = parseCliData(result.stdout)
+  return normalizeResult(normalizeProjectV2Summary(data), "cli", { capabilityId, reason: "CARD_FALLBACK" })
+}
+
 const handleProjectV2OrgGet: CliHandler = async (runner, params, _card) => {
   try {
     const owner = parseNonEmptyString(params.org)
     if (owner === null) throw new Error("Missing or invalid org for project_v2.org.get")
     const projectNumber = parseStrictPositiveInt(params.projectNumber)
     if (projectNumber === null) throw new Error("Missing or invalid projectNumber for project_v2.org.get")
-    const args = ["project", "view", String(projectNumber), "--owner", owner, "--format", "json"]
-    const result = await runner.run("gh", args, DEFAULT_TIMEOUT_MS)
-    // ... (full duplicated block)
+    return await handleProjectV2GetByOwner(runner, owner, projectNumber, "project_v2.org.get")
   } catch (error: unknown) { /* ... */ }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts` around
lines 16 - 130, These two handlers duplicate logic; extract the shared behavior
into a single helper (e.g., projectV2GetHandler or runProjectV2View) that
accepts the owner value (or owner param key) and capabilityId, move parsing of
the owner (use parseNonEmptyString), projectNumber (parseStrictPositiveInt),
building args (["project","view", String(projectNumber),"--owner",
owner,"--format","json"]), running runner.run, and all
normalizeError/normalizeResult calls into that helper, and then implement
handleProjectV2OrgGet and handleProjectV2UserGet as thin wrappers that call the
helper with params.org / params.user and capabilityId "project_v2.org.get" or
"project_v2.user.get" respectively so each normalized error/result uses the
correct capabilityId.
packages/core/src/core/execution/adapters/cli/domains/workflow.ts (1)

20-133: Highly repetitive error-handling boilerplate across all 11 handlers.

Every handler repeats the same ~15-line catch block (SyntaxError branch → normalizeError with errorCodes.Server; generic branch → mapErrorToCode + normalizeError), differing only in the capabilityId string. This is a significant source of duplication across ~900 lines. A small wrapper like withCliErrorHandling(capabilityId, fn) could eliminate the repetition and reduce the chance of copy-paste drift.

Not urgent — can be deferred since each handler is self-contained and correct.

Also applies to: 276-347

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts` around
lines 20 - 133, Create a small wrapper function (e.g., withCliErrorHandling)
that accepts a capabilityId string and an async handler function and centralizes
the current try/catch logic (the SyntaxError branch that returns normalizeError
with errorCodes.Server and the generic branch that calls mapErrorToCode,
isRetryableErrorCode and normalizeError), then refactor handleWorkflowRunsList
to move only its main body (everything inside try) into a passed async function
and return withCliErrorHandling("workflow.runs.list", async () => { ...main
logic... }); ensure the wrapper uses the same normalizeError call sites and
metadata shape ({ capabilityId, reason: "CARD_FALLBACK" }) so all referenced
symbols (handleWorkflowRunsList, withCliErrorHandling, normalizeError,
mapErrorToCode, isRetryableErrorCode, errorCodes.Server) remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture/adapters.md`:
- Around line 228-231: The docs currently instruct using a non-existent
registerCliHandler(); instead update the "Adding CLI Route" steps to: (1) add
your handler into the domain module's exported handlers: Record<string,
CliHandler> map (the same place modules export their handlers), using shared
helpers from cli/helpers.ts for arg building; (2) rely on the registry which
imports and spreads all domain handlers into allHandlers (no explicit
registration call required) and thus your handler will be discoverable by
getCliHandler and listCliCapabilities; (3) update the CliCapabilityId union so
the new capability ID is included (so runCliCapability retains proper typing).

In `@docs/architecture/repository-structure.md`:
- Line 123: Update the table row so it no longer claims
`core/execution/adapters/cli/domains/workflow.ts` handles `check_run.*`; instead
indicate that `check_run.*` (specifically `check_run.annotations.list`) is
implemented in the PR domain module (`pr.ts`) and belongs to `pr.*` handlers;
edit the caption/capabilities column for the workflow row to remove
`check_run.*` and ensure the PR row (or the line mentioning `pr.*`) notes
`check_run.annotations.list` is registered in `pr.ts` (see
`check_run.annotations.list` and `pr.*` for locating the code and `workflow.ts`
for the corrected entry).
- Line 118: Update the documentation entry for capability-registry.ts so it
matches the actual exported API: replace the documented `registerCliHandler()`
with `listCliCapabilities()` in the table cell describing exported functions
(referencing `getCliHandler()` and `listCliCapabilities()`), ensuring the row
reads: `getCliHandler()`, `listCliCapabilities()`.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts`:
- Around line 644-656: In the fallback error branch that constructs the
normalized error, replace references to the initial run's exit code with the
fallback run's exit code: use rerunAllResult.exitCode when calling
sanitizeCliErrorMessage and when setting details.exitCode (instead of
result.exitCode); keep failureStderr sourced from rerunAllResult.stderr/stdout,
and leave mapErrorToCode, normalizeError, and isRetryableErrorCode usage
unchanged so the error message and details reflect the actual rerun-all result.

In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts`:
- Around line 453-460: The catch block in handleProjectV2ItemFieldUpdate
contains an unreachable "if (error instanceof SyntaxError)" branch because no
JSON.parse/parseCliData is called in the try path; remove the
SyntaxError-specific branch and collapse handling into the existing generic
error path (i.e., delete the instanceof SyntaxError block and ensure the
remaining catch returns normalizeError(...) for other errors), keeping the
normalizeError call with the same "cli" context and capabilityId
"project_v2.item.field.update".

In `@packages/core/src/core/execution/adapters/cli/domains/release.ts`:
- Around line 365-375: In handleReleasePublishDraft, the error message thrown
after parseStrictPositiveInt(params.releaseId) is misleading because owner/name
are already validated by requireRepo; update the thrown Error to reference only
the missing/invalid releaseId (e.g., "Missing or invalid releaseId for
release.publish_draft") so it matches the pattern used in handleReleaseUpdate
and accurately reflects the failing validation.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts`:
- Around line 706-725: The normalized artifact mapping currently assigns id:
input.id without narrowing and uses a numeric fallback id: 0; change this to
consistently narrow id like the other fields and use null as the fallback.
Inside the artifactsArray.map callback (where the variable input is created),
replace id: input.id with a narrowed expression such as id: typeof input.id ===
"string" || typeof input.id === "number" ? input.id : null, and also change the
non-object fallback (currently id: 0) to id: null so both branches produce the
same nullable type.
- Around line 236-255: The loop that builds errorLines and warningLines
currently uses the same capped arrays to report counts, causing
summary.errorCount and summary.warningCount to max at 10; add separate counters
(e.g., totalErrorCount, totalWarningCount) and increment those for every
matching line while still only pushing into errorLines/warningLines when their
length < 10, then set summary.errorCount = totalErrorCount and
summary.warningCount = totalWarningCount while keeping topErrorLines =
errorLines.slice(0,10) (and similarly for warnings).

In `@packages/core/test/e2e/setup-install-verify.e2e.test.ts`:
- Around line 46-57: The test currently sets process.env.XDG_CONFIG_HOME to a
mkdtempSync(...) path in beforeEach but does not keep the temp directory path or
remove it, so persistently created dirs remain; modify beforeEach to assign the
mkdtempSync result to a local variable (e.g., tempXdgDir) in the test scope and
use that variable in afterEach to remove the directory (use
fs.rmSync(tempXdgDir, { recursive: true, force: true }) or equivalent) before
restoring process.env.XDG_CONFIG_HOME; also update imports used by the file to
include the removal function (e.g., rmSync or fs.rm) if not already imported.

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts`:
- Line 245: The test hardcodes port 3000 in the createOpencode config which can
cause CI flakiness if that port is taken; update the config passed to
createOpencode (the config object in the createOpencode call) to omit the port
or set port: 0 so the OS assigns an ephemeral port, and adjust any test helpers
that expect a fixed port to read the actual bound port from the returned server
object instead.

In `@packages/core/test/unit/cli-domains-workflow.test.ts`:
- Around line 213-228: The test "includes up to 10 error lines in topErrorLines"
currently asserts summary.errorCount is 10 even though mockRunner returned 15
error lines; if you changed the implementation of handleWorkflowJobLogsGet (see
function handleWorkflowJobLogsGet and the workflow log parsing logic) to count
total error lines rather than capping at 10, update this test to expect the full
count (expect(summary.errorCount).toBe(15)) while still asserting topErrorLines
length remains capped at 10; adjust the assertion for summary.errorCount to
match the new total-count behavior so the test reflects the implementation
change.

---

Nitpick comments:
In `@packages/core/src/core/execution/adapters/cli/capability-registry.ts`:
- Around line 9-16: Rename the module-level constant allHandlers to ALL_HANDLERS
to follow UPPER_SNAKE_CASE for true constants; update every reference to
allHandlers in this file to ALL_HANDLERS and ensure the exported symbol (if any)
matches the new name; verify the type annotation Record<string, CliHandler> and
the spread sources (...repoHandlers, ...issueHandlers, ...prHandlers,
...workflowHandlers, ...projectV2Handlers, ...releaseHandlers) remain intact and
compile without changes.

In `@packages/core/src/core/execution/adapters/cli/domains/project-v2.ts`:
- Around line 16-130: These two handlers duplicate logic; extract the shared
behavior into a single helper (e.g., projectV2GetHandler or runProjectV2View)
that accepts the owner value (or owner param key) and capabilityId, move parsing
of the owner (use parseNonEmptyString), projectNumber (parseStrictPositiveInt),
building args (["project","view", String(projectNumber),"--owner",
owner,"--format","json"]), running runner.run, and all
normalizeError/normalizeResult calls into that helper, and then implement
handleProjectV2OrgGet and handleProjectV2UserGet as thin wrappers that call the
helper with params.org / params.user and capabilityId "project_v2.org.get" or
"project_v2.user.get" respectively so each normalized error/result uses the
correct capabilityId.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts`:
- Around line 20-133: Create a small wrapper function (e.g.,
withCliErrorHandling) that accepts a capabilityId string and an async handler
function and centralizes the current try/catch logic (the SyntaxError branch
that returns normalizeError with errorCodes.Server and the generic branch that
calls mapErrorToCode, isRetryableErrorCode and normalizeError), then refactor
handleWorkflowRunsList to move only its main body (everything inside try) into a
passed async function and return withCliErrorHandling("workflow.runs.list",
async () => { ...main logic... }); ensure the wrapper uses the same
normalizeError call sites and metadata shape ({ capabilityId, reason:
"CARD_FALLBACK" }) so all referenced symbols (handleWorkflowRunsList,
withCliErrorHandling, normalizeError, mapErrorToCode, isRetryableErrorCode,
errorCodes.Server) remain consistent.

In `@packages/core/test/e2e/setup-install-verify.e2e.test.ts`:
- Around line 83-86: After the expect(tarballName).toBeDefined() assertion, cast
once to a string into a local const (e.g., const tarballNameStr = tarballName as
string) and then use tarballNameStr in the tarballPath computation instead of
repeating tarballName as string; update the tarballPath expression that
references packDir and tarballName to use this single asserted variable to
eliminate repeated casts.

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts`:
- Around line 119-158: getSessionApi is duplicated and diverges from the
canonical implementation in packages/benchmark/src/runner/suite-runner.ts;
replace this local copy by importing the shared helper (or move the canonical
function to a common test/helpers module) so tests use one implementation that
includes the full session API surface (including the abort method) and stays in
sync with SDK changes; update any call sites in this test to use the imported
helper (or the new shared module) instead of the inline getSessionApi function.

In `@packages/core/test/unit/cli-capability-registry.test.ts`:
- Around line 71-74: The test description hardcodes "45" and will become
incorrect; update the describe/it text for the listCliCapabilities test to avoid
a hardcoded count—either make it generic (e.g., "lists all registered capability
IDs") or interpolate the dynamic value using ALL_CLI_CAPABILITY_IDS.length
(e.g., using a template literal) so the description always matches the assertion
that calls listCliCapabilities() and ALL_CLI_CAPABILITY_IDS.length.

In `@packages/core/test/unit/cli-helpers.test.ts`:
- Around line 14-19: Add a test case to the parseStrictPositiveInt suite to
assert that negative integers are rejected (e.g.,
expect(parseStrictPositiveInt(-5)).toBeNull()); update the
describe("parseStrictPositiveInt") block in the existing unit test so it
includes this negative-integer case alongside the existing zero/float/string
tests to close the boundary-gap.

Adds targeted unit tests across all modified CLI domain modules to achieve
≥90% branch coverage (reaching 90.8–100% per file). Key patterns covered:
- `params.owner ?? ""` and `params.name ?? ""` null-coalescing branches
- Edge cases: malformed JSON, non-array/non-object payloads, zero/invalid IDs
- SyntaxError catch paths in all domain handlers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
packages/core/test/unit/cli-domains-repo.test.ts (1)

220-368: Consider consolidating the null owner/name branch-coverage tests with it.each.

The three null-branch tests (lines 343–367) and the similar additional-coverage blocks follow a repetitive pattern. Using it.each or a shared loop would reduce boilerplate without sacrificing clarity. This is optional since the current form is readable and CI passes.

♻️ Example consolidation with it.each
-describe("repo domain handlers – null owner/name ?? branch coverage", () => {
-  const nr = () => mockRunner(1, "", "err")
-
-  it("handleRepoView covers owner/name null branches", async () => {
-    const result = await handleRepoView(nr(), { owner: null, name: null }, undefined)
-    expect(result.ok).toBe(false)
-  })
-
-  it("handleRepoLabelsList covers owner/name null branches", async () => {
-    const result = await handleRepoLabelsList(
-      nr(),
-      { owner: null, name: null, first: 30 },
-      undefined,
-    )
-    expect(result.ok).toBe(false)
-  })
-
-  it("handleRepoIssueTypesList covers owner/name null branches", async () => {
-    const result = await handleRepoIssueTypesList(
-      nr(),
-      { owner: null, name: null, first: 10 },
-      undefined,
-    )
-    expect(result.ok).toBe(false)
-  })
-})
+describe("repo domain handlers – null owner/name ?? branch coverage", () => {
+  const nr = () => mockRunner(1, "", "err")
+
+  it.each([
+    ["handleRepoView", handleRepoView, { owner: null, name: null }],
+    ["handleRepoLabelsList", handleRepoLabelsList, { owner: null, name: null, first: 30 }],
+    ["handleRepoIssueTypesList", handleRepoIssueTypesList, { owner: null, name: null, first: 10 }],
+  ] as const)("%s covers owner/name null branches", async (_label, handler, params) => {
+    const result = await handler(nr(), params, undefined)
+    expect(result.ok).toBe(false)
+  })
+})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-repo.test.ts` around lines 220 - 368,
Consolidate the three repetitive null-owner/name tests into a single
parameterized test using it.each: replace the separate it blocks that call
handleRepoView, handleRepoLabelsList, and handleRepoIssueTypesList with one
it.each table that iterates over tuples of [handlerFunction, args] (referencing
handleRepoView, handleRepoLabelsList, handleRepoIssueTypesList and the nr()
mockRunner), and assert result.ok is false for each; ensure you pass the same
nr() runner and the appropriate param shape for each handler (e.g., { owner:
null, name: null, first: X } where required) so behavior and coverage remain
identical.
packages/core/test/unit/cli-domains-pr.test.ts (1)

6-12: Consider extracting mockRunner and collapsing the null-branch suite with it.each.

  1. The mockRunner factory is duplicated identically across all four domain test files. Extracting it to a shared test helper (e.g., test/helpers/cli-test-utils.ts) would reduce duplication and make changes to the mock shape easier.

  2. The 17 null-branch tests (lines 1541–1680) follow an identical structure. it.each over the handler IDs and their minimal params would cut ~130 lines while retaining the same coverage.

Both are optional given CI passes and the files are self-contained.

♻️ Example for null-branch consolidation
describe("pr domain handlers – null owner/name ?? branch coverage", () => {
  const nr = () => mockRunner(1, "", "err")
  const nullCases: [string, Record<string, unknown>][] = [
    ["pr.view", { owner: null, name: null, prNumber: 1 }],
    ["pr.list", { owner: null, name: null, first: 30 }],
    ["pr.create", { owner: null, name: null, title: "T", head: "b" }],
    ["pr.update", { owner: null, name: null, prNumber: 1 }],
    ["pr.checks.list", { owner: null, name: null, prNumber: 1 }],
    ["pr.checks.failed", { owner: null, name: null, prNumber: 1 }],
    ["pr.merge.status", { owner: null, name: null, prNumber: 1 }],
    ["pr.review.submit", { owner: null, name: null, prNumber: 1, event: "APPROVE" }],
    ["pr.merge", { owner: null, name: null, prNumber: 1 }],
    ["pr.checks.rerun_failed", { owner: null, name: null, prNumber: 1 }],
    ["pr.checks.rerun_all", { owner: null, name: null, prNumber: 1 }],
    ["pr.review.request", { owner: null, name: null, prNumber: 1, reviewers: ["alice"] }],
    ["pr.assignees.update", { owner: null, name: null, prNumber: 1, add: ["alice"], remove: [] }],
    ["pr.branch.update", { owner: null, name: null, prNumber: 1 }],
    ["pr.diff.view", { owner: null, name: null, prNumber: 1 }],
    ["pr.diff.files", { owner: null, name: null, prNumber: 1 }],
    ["check_run.annotations.list", { owner: null, name: null, checkRunId: 1 }],
  ]

  it.each(nullCases)("%s covers owner/name null branches", async (id, params) => {
    const result = await h(id)(nr(), params, undefined)
    expect(result.ok).toBe(false)
  })
})

Also applies to: 1541-1680

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-pr.test.ts` around lines 6 - 12, Extract
the duplicated mockRunner factory into a shared test helper (e.g., export const
mockRunner from test/helpers/cli-test-utils.ts) and import it into this test
file to remove duplication; then collapse the 17 nearly-identical null-branch
tests into a single parameterized test using it.each (build an array of
[handlerId, params] tuples) and iterate calling h(handlerId)(nr(), params,
undefined) where nr = () => mockRunner(1, "", "err"), asserting result.ok is
false; reference the existing mockRunner factory and the h() handler resolver to
locate the code to change.
packages/core/test/unit/cli-domains-release.test.ts (2)

580-612: Test title claims prerelease coverage but the assertion is missing.

The test passes prerelease: false and targetCommitish: "dev" but only asserts tag_name and body in the spy call. The prerelease=false and target_commitish=dev arguments are never verified, leaving a gap between the test name and what it actually exercises.

♻️ Proposed addition
      expect(runSpy).toHaveBeenCalledWith(
        "gh",
        expect.arrayContaining(["body=Updated notes"]),
        expect.any(Number),
      )
+     expect(runSpy).toHaveBeenCalledWith(
+       "gh",
+       expect.arrayContaining(["prerelease=false"]),
+       expect.any(Number),
+     )
+     expect(runSpy).toHaveBeenCalledWith(
+       "gh",
+       expect.arrayContaining(["target_commitish=dev"]),
+       expect.any(Number),
+     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-release.test.ts` around lines 580 - 612,
The test for handleReleaseUpdate omits assertions for the prerelease and
targetCommitish arguments; update the spec to verify the runner was called with
"prerelease=false" and "target_commitish=dev" by adding
expect(runSpy).toHaveBeenCalledWith("gh",
expect.arrayContaining(["prerelease=false"]), expect.any(Number)) and a similar
expect.arrayContaining(["target_commitish=dev"]) check (use the existing
runner/runSpy and CliCommandRunner test setup to locate where to add these
assertions).

631-643: Prefer mockResolvedValueOnce chaining over a manual callCount counter.

Lines 260–291 already demonstrate the idiomatic Vitest pattern; the hand-rolled counter here is inconsistent and harder to read.

♻️ Proposed refactor
-     let callCount = 0
-     const runSpy = vi.fn().mockImplementation(async () => {
-       callCount++
-       if (callCount === 1) {
-         return { exitCode: 0, stdout: JSON.stringify({ draft: true }), stderr: "" }
-       }
-       return {
-         exitCode: 0,
-         stdout: JSON.stringify({ tag_name: "v1.0.0", draft: false }),
-         stderr: "",
-       }
-     })
+     const runSpy = vi
+       .fn()
+       .mockResolvedValueOnce({ exitCode: 0, stdout: JSON.stringify({ draft: true }), stderr: "" })
+       .mockResolvedValueOnce({
+         exitCode: 0,
+         stdout: JSON.stringify({ tag_name: "v1.0.0", draft: false }),
+         stderr: "",
+       })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-release.test.ts` around lines 631 - 643,
Replace the manual callCount counter in the test by chaining Vitest's
mockResolvedValueOnce on the runSpy: change the runSpy mockImplementation that
increments callCount to use vi.fn().mockResolvedValueOnce({ exitCode: 0, stdout:
JSON.stringify({ draft: true }), stderr: "" }).mockResolvedValueOnce({ exitCode:
0, stdout: JSON.stringify({ tag_name: "v1.0.0", draft: false }), stderr: "" })
and keep runner typed as CliCommandRunner; ensure behavior stays identical
(first call returns draft response, second returns tag_name response).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/test/unit/cli-domains-release.test.ts`:
- Around line 670-685: The null-branch tests pass incorrect property names so
they don't exercise the handler input contract: update the test cases calling
handleReleaseGet and handleReleaseCreateDraft to use the expected "tagName"
property (e.g., { owner: null, name: null, tagName: "v1.0.0" }) instead of
"releaseId" or "tag"; keep the rest of the inputs and assertions the same so the
tests still hit the owner/name null branches while accurately reflecting the
handlers' expected parameters (handleReleaseGet and handleReleaseCreateDraft).

---

Nitpick comments:
In `@packages/core/test/unit/cli-domains-pr.test.ts`:
- Around line 6-12: Extract the duplicated mockRunner factory into a shared test
helper (e.g., export const mockRunner from test/helpers/cli-test-utils.ts) and
import it into this test file to remove duplication; then collapse the 17
nearly-identical null-branch tests into a single parameterized test using
it.each (build an array of [handlerId, params] tuples) and iterate calling
h(handlerId)(nr(), params, undefined) where nr = () => mockRunner(1, "", "err"),
asserting result.ok is false; reference the existing mockRunner factory and the
h() handler resolver to locate the code to change.

In `@packages/core/test/unit/cli-domains-release.test.ts`:
- Around line 580-612: The test for handleReleaseUpdate omits assertions for the
prerelease and targetCommitish arguments; update the spec to verify the runner
was called with "prerelease=false" and "target_commitish=dev" by adding
expect(runSpy).toHaveBeenCalledWith("gh",
expect.arrayContaining(["prerelease=false"]), expect.any(Number)) and a similar
expect.arrayContaining(["target_commitish=dev"]) check (use the existing
runner/runSpy and CliCommandRunner test setup to locate where to add these
assertions).
- Around line 631-643: Replace the manual callCount counter in the test by
chaining Vitest's mockResolvedValueOnce on the runSpy: change the runSpy
mockImplementation that increments callCount to use
vi.fn().mockResolvedValueOnce({ exitCode: 0, stdout: JSON.stringify({ draft:
true }), stderr: "" }).mockResolvedValueOnce({ exitCode: 0, stdout:
JSON.stringify({ tag_name: "v1.0.0", draft: false }), stderr: "" }) and keep
runner typed as CliCommandRunner; ensure behavior stays identical (first call
returns draft response, second returns tag_name response).

In `@packages/core/test/unit/cli-domains-repo.test.ts`:
- Around line 220-368: Consolidate the three repetitive null-owner/name tests
into a single parameterized test using it.each: replace the separate it blocks
that call handleRepoView, handleRepoLabelsList, and handleRepoIssueTypesList
with one it.each table that iterates over tuples of [handlerFunction, args]
(referencing handleRepoView, handleRepoLabelsList, handleRepoIssueTypesList and
the nr() mockRunner), and assert result.ok is false for each; ensure you pass
the same nr() runner and the appropriate param shape for each handler (e.g., {
owner: null, name: null, first: X } where required) so behavior and coverage
remain identical.

@aryeko
Copy link
Owner Author

aryeko commented Feb 19, 2026

@codex fix pr comments and add changesets

@chatgpt-codex-connector
Copy link

Summary

Committed as 9a7b2ae and created the follow-up PR via make_pr.

Testing

  • pnpm run format
  • pnpm --filter @ghx-dev/core exec vitest run test/unit/cli-domains-workflow.test.ts test/unit/cli-domains-release.test.ts test/unit/cli-capability-registry.test.ts test/unit/cli-helpers.test.ts
  • pnpm --filter @ghx-dev/core run typecheck
  • git commit -m "fix(core): address PR review feedback and add changeset" (pre-commit hooks passed format, lint, and typecheck)

View task →

aryeko and others added 3 commits February 19, 2026 09:28
- pr.ts: use rerunAllResult.exitCode (not result.exitCode) in fallback
  error path of handlePrChecksRerunFailed
- project-v2.ts: remove unreachable SyntaxError catch branch in
  handleProjectV2ItemFieldUpdate (no JSON parsing in success path)
- release.ts: fix misleading error message — owner/name are already
  validated by requireRepo; only releaseId can be invalid here
- workflow.ts: track total error/warning counts independently of the
  10-line collection cap in handleWorkflowJobLogsGet; type-narrow
  artifact id field consistently with other normalized fields
- docs: fix check_run.* module attribution (pr.ts, not workflow.ts);
  correct registerCliHandler() reference in adapters.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
packages/core/test/e2e/setup-opencode-skill.e2e.test.ts (1)

205-211: Temp directories are never cleaned up.

tempRoot and isolatedXdgConfig (and their children packDir/projectDir) are created per test run but never removed. On long-lived CI runners or dev machines these accumulate indefinitely. Consider registering cleanup in afterEach.

🧹 Proposed cleanup in afterEach
+ import { rmSync } from "node:fs"

  let originalXdg: string | undefined
  let originalCwd: string
  let opencode: Awaited<ReturnType<typeof createOpencode>> | null = null
+ let tempRootToClean: string | undefined

  beforeEach(() => {
    originalXdg = process.env.XDG_CONFIG_HOME
    originalCwd = process.cwd()
+   tempRootToClean = undefined
  })

  afterEach(() => {
    opencode?.server.close()
    opencode = null
    process.chdir(originalCwd)
    if (originalXdg === undefined) {
      delete process.env.XDG_CONFIG_HOME
    } else {
      process.env.XDG_CONFIG_HOME = originalXdg
    }
+   if (tempRootToClean) {
+     rmSync(tempRootToClean, { recursive: true, force: true })
+   }
  })

Then inside the test, assign tempRootToClean = tempRoot right after the mkdtempSync call, and do the same for isolatedXdgConfig (or fold it under tempRoot).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts` around lines 205 -
211, The test "uses SDK agent session in isolated config and runs ghx
capabilities list" creates temp directories (tempRoot, packDir, projectDir,
isolatedXdgConfig) that are never removed; fix by declaring mutable holders
(e.g., tempRootToClean, isolatedXdgConfigToClean) at the top of the test file,
assign them immediately after each mkdtempSync call inside the it(...) block,
and add an afterEach hook that checks those holders and removes their
directories recursively (use fs.rmSync or fs.rm with { recursive: true, force:
true } or equivalent) to ensure tempRoot and isolatedXdgConfig and their
children are deleted after each test.
packages/core/src/core/execution/adapters/cli/domains/workflow.ts (1)

135-282: handleWorkflowJobLogsRaw and handleWorkflowJobLogsGet duplicate ~60 lines of gh run view --job --log invocation.

Everything from owner/name/repo setup through rawLog/truncated/logContent derivation (lines 137–171 / 199–233) is identical between the two handlers. Only the result shape differs: one returns the raw log, the other returns the analyzed summary.

♻️ Proposed refactor — extract shared log-fetch helper
+type RawLogResult = { jobId: number; logContent: string; truncated: boolean }
+
+async function fetchJobLog(
+  runner: Parameters<CliHandler>[0],
+  params: Record<string, unknown>,
+  card: Parameters<CliHandler>[2],
+  capabilityId: string,
+): Promise<RawLogResult | ResultEnvelope> {
+  const owner = String(params.owner ?? "")
+  const name = String(params.name ?? "")
+  const repo = owner && name ? `${owner}/${name}` : ""
+  const jobId = parseStrictPositiveInt(params.jobId)
+  if (jobId === null) throw new Error(`Missing or invalid jobId for ${capabilityId}`)
+
+  const args = commandTokens(card, "run view")
+  args.push("--job", String(jobId), "--log")
+  if (repo) args.push("--repo", repo)
+
+  const result = await runner.run("gh", args, DEFAULT_TIMEOUT_MS)
+  if (result.exitCode !== 0) {
+    const code = mapErrorToCode(result.stderr)
+    return normalizeError(
+      { code, message: sanitizeCliErrorMessage(result.stderr, result.exitCode), retryable: isRetryableErrorCode(code), details: { capabilityId, exitCode: result.exitCode } },
+      "cli",
+      { capabilityId, reason: "CARD_FALLBACK" },
+    )
+  }
+
+  const rawLog = result.stdout
+  const truncated = rawLog.length > MAX_WORKFLOW_JOB_LOG_CHARS
+  return { jobId, logContent: rawLog.slice(0, MAX_WORKFLOW_JOB_LOG_CHARS), truncated }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts` around
lines 135 - 282, Extract the duplicated logic that builds repo, validates jobId,
constructs args with commandTokens(card, "run view"), runs runner.run("gh",
args, DEFAULT_TIMEOUT_MS), handles non-zero exit (using mapErrorToCode,
sanitizeCliErrorMessage, isRetryableErrorCode and normalizeError), and returns
rawLog/truncated/logContent into a new helper (e.g., fetchWorkflowJobLog) and
call it from both handleWorkflowJobLogsRaw and handleWorkflowJobLogsGet; ensure
the helper returns { jobId, rawLog, truncated, logContent } or throws normalized
errors so the two handlers only produce their distinct normalized results (raw
vs. summary).
packages/core/test/unit/cli-domains-workflow.test.ts (1)

1072-1169: Null-owner branch coverage tests work correctly but test names are slightly imprecise.

For handlers where owner/name are optional (e.g., handleWorkflowJobLogsRaw), passing null coerces to "" and the command runs without --repo; result.ok ends up false because nr() returns exitCode: 1, not because of a null-input guard. The behavior tested is "error propagation still works when owner/name are absent", which is useful coverage. The test for handleWorkflowDispatchRun (line 1152) exercises the missing-ref validation path before requireRepo is reached.

No functional issue — the tests do protect against regressions — but the description "covers owner/name null branches" slightly overstates what is exercised.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/test/unit/cli-domains-workflow.test.ts` around lines 1072 -
1169, Rename the test descriptions to accurately reflect what they're asserting:
they verify that error propagation works when owner/name are absent (not that a
null-input guard is exercised). For each spec using nr() and calling
handleWorkflowRunsList, handleWorkflowJobLogsRaw, handleWorkflowJobLogsGet,
handleWorkflowList, handleWorkflowGet, handleWorkflowRunView,
handleWorkflowRunRerunAll, handleWorkflowRunCancel,
handleWorkflowRunArtifactsList, handleWorkflowRunRerunFailed, change the
`"covers owner/name null branches"` text to something like `"returns error when
owner/name absent (error propagation)"`; for handleWorkflowDispatchRun update
its title to indicate it exercises the missing-ref validation (e.g., `"validates
missing ref before requireRepo when owner/name absent"`). Keep the same
assertions and inputs (nr(), null owner/name) — only update the test names.
packages/core/src/core/execution/adapters/cli/domains/pr.ts (1)

295-423: handlePrChecksList and handlePrChecksFailed duplicate the entire gh pr checks fetch path.

Both handlers construct identical args, run the same command, handle errors identically, and apply the same normalization. The only difference is items: checks (list) vs items: failed (filtered). Extracting the shared fetch-and-parse logic into a private helper would halve ~130 lines and keep future changes (e.g., adding a new JSON field) in one place.

♻️ Proposed refactor — extract shared fetch logic
+type PrChecksResult = {
+  checks: ReturnType<typeof normalizeCheckItem>[]
+  failed: ReturnType<typeof normalizeCheckItem>[]
+  pending: ReturnType<typeof normalizeCheckItem>[]
+  passed: ReturnType<typeof normalizeCheckItem>[]
+}
+
+async function fetchPrChecks(
+  runner: Parameters<CliHandler>[0],
+  params: Record<string, unknown>,
+  card: Parameters<CliHandler>[2],
+): Promise<PrChecksResult | ResultEnvelope> {
+  const owner = String(params.owner ?? "")
+  const name = String(params.name ?? "")
+  const repo = owner && name ? `${owner}/${name}` : ""
+  const prNumber = parseStrictPositiveInt(params.prNumber)
+  if (prNumber === null) throw new Error("Missing or invalid prNumber for pr.checks")
+
+  const args = [...commandTokens(card, "pr checks"), String(prNumber)]
+  if (repo) args.push("--repo", repo)
+  args.push("--json", jsonFieldsFromCard(card, "name,state,bucket,workflow,link"))
+
+  const result = await runner.run("gh", args, DEFAULT_TIMEOUT_MS)
+  if (result.exitCode !== 0) {
+    const code = mapErrorToCode(result.stderr)
+    return normalizeError(
+      { code, message: sanitizeCliErrorMessage(result.stderr, result.exitCode), retryable: isRetryableErrorCode(code), details: { exitCode: result.exitCode } },
+      "cli",
+      { reason: "CARD_FALLBACK" },
+    )
+  }
+
+  const data = parseCliData(result.stdout)
+  const checks = Array.isArray(data) ? data.map((e) => normalizeCheckItem(e)) : []
+  return {
+    checks,
+    failed: checks.filter((e) => isCheckFailureBucket(e.bucket)),
+    pending: checks.filter((e) => isCheckPendingBucket(e.bucket)),
+    passed: checks.filter((e) => isCheckPassBucket(e.bucket)),
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts` around lines 295
- 423, Both handlers (handlePrChecksList and handlePrChecksFailed) duplicate the
GH CLI fetch/parse/error-handling path; extract that shared logic into a private
helper (e.g., fetchPrChecks) that builds args (using commandTokens and
jsonFieldsFromCard), runs runner.run("gh", ...), normalizes CLI errors, parses
JSON via parseCliData and maps entries with normalizeCheckItem, and returns the
checks array plus any metadata; then have handlePrChecksList call fetchPrChecks
and return items: checks, while handlePrChecksFailed calls fetchPrChecks and
returns items: checks.filter(...) (using isCheckFailureBucket), preserving
existing error mapping (mapErrorToCode), sanitizeCliErrorMessage usage, and the
normalizeResult/normalizeError calls and capabilityId strings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/architecture/adapters.md`:
- Around line 228-232: Update the domain to export the new CLI handler by adding
an entry to the domain's exported handlers: Record<string, CliHandler> map (do
not call registerCliHandler); implement the handler using the shared helper
functions from cli/helpers.ts for arg-building and option parsing, and ensure
the handler function signature matches CliHandler. Then add the new capability
ID to the CliCapabilityId union in cli-capability-adapter.ts so the
capability-registry.ts (which imports and spreads domain handlers) will pick up
the handler automatically.

In `@docs/architecture/repository-structure.md`:
- Around line 117-125: Update the documentation to reflect the corrected domain
mapping: ensure core/execution/adapters/cli/capability-registry.ts lists
getCliHandler() and listCliCapabilities(), and ensure check_run.annotations.list
is attributed to core/execution/adapters/cli/domains/pr.ts (not workflow.ts);
also remove the duplicate reviewer tag ([duplicate_comment]) from the doc
comment so there’s no duplicated approval metadata.

In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts`:
- Around line 628-656: The fallback path after calling runner.run("gh",
rerunAllArgs, ...) incorrectly used the original result.exitCode; update the
normalizeError call so the error message and details use rerunAllResult's exit
code and stderr/output: inside the shouldFallbackRerunFailedToAll block (around
runner.run and the subsequent failure handling), compute failureStderr from
rerunAllResult and pass rerunAllResult.exitCode into sanitizeCliErrorMessage and
into details.exitCode when calling normalizeError (functions referenced:
shouldFallbackRerunFailedToAll, runner.run, normalizeError,
sanitizeCliErrorMessage, mapErrorToCode, isRetryableErrorCode).

In `@packages/core/src/core/execution/adapters/cli/domains/release.ts`:
- Around line 365-375: Update handleReleasePublishDraft so the thrown error when
parseStrictPositiveInt(params.releaseId) returns null uses a specific message
referencing release.publish_draft; locate the logic in function
handleReleasePublishDraft after requireRepo(owner, name,
"release.publish_draft") and replace the generic message with "Missing or
invalid releaseId for release.publish_draft" (matching the pattern used in
handleReleaseUpdate) to avoid misleading owner/name validation implications and
keep consistency with requireRepo and parseStrictPositiveInt usage.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts`:
- Around line 236-263: The summary currently includes errorCount, warningCount
and topErrorLines but omits the capped warning lines; update the
normalized.summary object (near jobId/truncated/summary) to include
topWarningLines using the existing warningLines array (e.g., topWarningLines:
warningLines.slice(0, 10)) so the UI receives both topErrorLines and
topWarningLines while totalErrorCount and totalWarningCount remain accurate.
- Around line 714-733: Update the artifact normalization inside the normalized
constant's items mapping to consistently narrow id to string | number by keeping
the check in the items map: in the mapping function for artifactsArray (inside
the normalized object) ensure the returned id uses `typeof input.id === "string"
|| typeof input.id === "number" ? input.id : null`, and keep name, sizeInBytes,
and archiveDownloadUrl narrowing as currently written so the normalized object
matches the nullable fallback convention used elsewhere (refer to the mapping
callback that casts artifact to Record<string, unknown> and returns the shape
with id, name, sizeInBytes, archiveDownloadUrl).

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts`:
- Around line 243-244: The hardcoded-port concern is resolved by passing an
ephemeral port to createOpencode; update the test setup so the opencode instance
is created via createOpencode with the port option set to 0 (e.g., ensure the
call to createOpencode includes port: 0) so the OS assigns a free port and
avoids CI port contention, leaving the rest of the opencode initialization
unchanged.

In `@packages/core/test/unit/cli-domains-workflow.test.ts`:
- Around line 213-228: Update the unit test to assert the corrected total
error-count behavior: in the "includes up to 10 error lines in topErrorLines"
test, change the expectation for summary.errorCount to 15 (the total number of
lines matching /\berror\b/i) while keeping the topErrorLines length assertion at
10; this targets the test invoking handleWorkflowJobLogsGet and checks
summary.errorCount and summary.topErrorLines to match the new workflow.ts
behavior that tracks totalErrorCount separately from the capped collection.

---

Nitpick comments:
In `@packages/core/src/core/execution/adapters/cli/domains/pr.ts`:
- Around line 295-423: Both handlers (handlePrChecksList and
handlePrChecksFailed) duplicate the GH CLI fetch/parse/error-handling path;
extract that shared logic into a private helper (e.g., fetchPrChecks) that
builds args (using commandTokens and jsonFieldsFromCard), runs runner.run("gh",
...), normalizes CLI errors, parses JSON via parseCliData and maps entries with
normalizeCheckItem, and returns the checks array plus any metadata; then have
handlePrChecksList call fetchPrChecks and return items: checks, while
handlePrChecksFailed calls fetchPrChecks and returns items: checks.filter(...)
(using isCheckFailureBucket), preserving existing error mapping
(mapErrorToCode), sanitizeCliErrorMessage usage, and the
normalizeResult/normalizeError calls and capabilityId strings.

In `@packages/core/src/core/execution/adapters/cli/domains/workflow.ts`:
- Around line 135-282: Extract the duplicated logic that builds repo, validates
jobId, constructs args with commandTokens(card, "run view"), runs
runner.run("gh", args, DEFAULT_TIMEOUT_MS), handles non-zero exit (using
mapErrorToCode, sanitizeCliErrorMessage, isRetryableErrorCode and
normalizeError), and returns rawLog/truncated/logContent into a new helper
(e.g., fetchWorkflowJobLog) and call it from both handleWorkflowJobLogsRaw and
handleWorkflowJobLogsGet; ensure the helper returns { jobId, rawLog, truncated,
logContent } or throws normalized errors so the two handlers only produce their
distinct normalized results (raw vs. summary).

In `@packages/core/test/e2e/setup-opencode-skill.e2e.test.ts`:
- Around line 205-211: The test "uses SDK agent session in isolated config and
runs ghx capabilities list" creates temp directories (tempRoot, packDir,
projectDir, isolatedXdgConfig) that are never removed; fix by declaring mutable
holders (e.g., tempRootToClean, isolatedXdgConfigToClean) at the top of the test
file, assign them immediately after each mkdtempSync call inside the it(...)
block, and add an afterEach hook that checks those holders and removes their
directories recursively (use fs.rmSync or fs.rm with { recursive: true, force:
true } or equivalent) to ensure tempRoot and isolatedXdgConfig and their
children are deleted after each test.

In `@packages/core/test/unit/cli-domains-workflow.test.ts`:
- Around line 1072-1169: Rename the test descriptions to accurately reflect what
they're asserting: they verify that error propagation works when owner/name are
absent (not that a null-input guard is exercised). For each spec using nr() and
calling handleWorkflowRunsList, handleWorkflowJobLogsRaw,
handleWorkflowJobLogsGet, handleWorkflowList, handleWorkflowGet,
handleWorkflowRunView, handleWorkflowRunRerunAll, handleWorkflowRunCancel,
handleWorkflowRunArtifactsList, handleWorkflowRunRerunFailed, change the
`"covers owner/name null branches"` text to something like `"returns error when
owner/name absent (error propagation)"`; for handleWorkflowDispatchRun update
its title to indicate it exercises the missing-ref validation (e.g., `"validates
missing ref before requireRepo when owner/name absent"`). Keep the same
assertions and inputs (nr(), null owner/name) — only update the test names.

…hanges

- release.publish_draft: fix error code expectation (UNKNOWN → VALIDATION)
  because the new "Missing or invalid releaseId" message matches the
  validation pattern in mapErrorToCode
- workflow.run.artifacts.list: fix artifact id fallback expectation (0 → null)
  because the null-artifact guard now returns id: null instead of id: 0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@aryeko aryeko merged commit fb05c12 into main Feb 19, 2026
7 checks passed
@aryeko aryeko deleted the refactor/cli-domain-modules branch February 19, 2026 07:39
aryeko added a commit that referenced this pull request Feb 19, 2026
…gnosis

Expand prompt to require fetching job logs and add workflow.job.logs.get
to expected_capabilities. Reflects workflow domain module extraction in #57.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aryeko added a commit that referenced this pull request Feb 20, 2026
* docs: add benchmark scenario composite update design

Documents the planned adjustments to workflow benchmark scenarios
following today's composite capabilities merge (#55):
- pr.threads.composite replaces atomic thread atomics in pr-fix-review-comments
- issue.triage.composite replaces atomic label+comment in issue-triage-comment
- ci-diagnose-run expanded to require workflow.job.logs.get

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add benchmark scenario composite update implementation plan

4-task plan covering JSON edits to 3 workflow scenarios:
- pr-fix-review-comments: atomic threads → pr.threads.composite
- issue-triage-comment: atomic label+comment → issue.triage.composite
- ci-diagnose-run: expanded prompt + workflow.job.logs.get

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(benchmark): use pr.threads.composite in pr-fix-review-comments-wf-001

Reflects #55 composite capabilities merge — pr.threads.composite
replaces separate pr.thread.reply + pr.thread.resolve atomic calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(benchmark): use issue.triage.composite in issue-triage-comment-wf-001

Reflects #55 composite capabilities merge — issue.triage.composite
replaces separate issue.labels.update + issue.comments.create atomic calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(benchmark): expand ci-diagnose-run-wf-001 to require job log diagnosis

Expand prompt to require fetching job logs and add workflow.job.logs.get
to expected_capabilities. Reflects workflow domain module extraction in #57.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(benchmark): tighten pr-fix prompt + improve SKILL.md composite guidance

- Reword pr-fix-review-comments-wf-001 prompt to focus on unresolved
  threads and combined reply+resolve action
- Restructure SKILL.md: move Discovery before Execute, add Composite
  subsection with CRITICAL enforcement referencing discovered caps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(core): capability namespace standardization

Rename all capability IDs to consistent conventions, remove composite
infrastructure, and add new atomic capabilities.

Naming principles applied:
- view = single-item read, list = collection (retire get)
- set = replace-all, add/remove = additive mutations
- Sub-domains use dots only, never underscores (rerun.failed not rerun_failed)
- Sub-domain names reflect feature area in plural (threads, reviews, checks, relations)

Renames:
- pr.thread.* → pr.threads.*, pr.review.* → pr.reviews.*
- pr.checks.rerun_all/failed → pr.checks.rerun.all/failed
- workflow.get → workflow.view, workflow.dispatch.run → workflow.dispatch
- workflow.run.rerun_all/failed → workflow.run.rerun.all/failed
- workflow.job.logs.get → workflow.job.logs.view
- project_v2.org/user.get → project_v2.org/user.view
- project_v2.item.* → project_v2.items.*
- release.get → release.view, release.create_draft → release.create, release.publish_draft → release.publish
- issue.relations.get → issue.relations.view, issue.linked_prs.list → issue.relations.prs.list
- issue.parent.* → issue.relations.parent.*, issue.blocked_by.* → issue.relations.blocked_by.*
- issue.labels/assignees.update → issue.labels/assignees.set

New capabilities:
- issue.labels.remove, issue.assignees.add/remove, issue.milestone.clear
- pr.assignees.add/remove (replaces retired pr.assignees.update)
- project_v2.items.issue.remove

Retired: pr.checks.failed (merged into pr.checks.list), check_run.annotations.list,
pr.assignees.update, pr.threads.composite, issue.triage.composite, issue.update.composite

Schema normalizations:
- All rerun ops: { runId: string, queued: boolean }
- pr.threads.reply: adds commentId + commentUrl to output
- issue.relations.parent.set: adds updated: boolean
- issue.relations.blocked_by.add: adds added: boolean
- issue.milestone split into set (integer only) + clear (separate capability)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add changeset for capability namespace standardization

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(core): follow-up cleanups after namespace standardization

- Remove dead NON_JSON_STDOUT_CAPABILITIES export from helpers.ts (stale
  IDs, never imported anywhere)
- Fix runId type mismatch: all 4 rerun YAML cards now have integer output
  (was inconsistently string); handlers return integer directly without
  String() coercion; update test assertions accordingly
- Add integration tests for 7 new capabilities: issue.labels.remove,
  issue.assignees.add, issue.assignees.remove, issue.milestone.clear,
  pr.assignees.add, pr.assignees.remove, project_v2.items.issue.remove

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(coverage): add unit tests for new capability handlers to reach ≥90% patch coverage

- cli-domains-issue.test.ts: add describe blocks for handleIssueLabelsRemove,
  handleIssueAssigneesAdd, handleIssueAssigneesRemove, handleIssueMilestoneClear
  (success, invalid params, empty arrays, non-zero exit, runner throws paths)
- cli-domains-project-v2.test.ts: add describe block for project_v2.items.issue.remove
  (success, correct args, missing projectId/itemId, non-zero exit, runner throws)
- cli-domains-pr.test.ts: add runner-throws path for pr.assignees.add and pr.assignees.remove
- gql-builders.test.ts: cover replyBuilder.mapResponse() commentUrl truthy branch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: align GQL return types and YAML input schema with output contracts

Four P1 issues identified by Codex review:

1. pr.threads.reply: runReplyToReviewThread now returns commentId/commentUrl
   alongside id/isResolved, matching the output_schema. Adds
   ReplyToReviewThreadData type extending ReviewThreadMutationData.

2. issue.relations.parent.set: runIssueParentSet now returns updated: true,
   satisfying the required output field. IssueParentSetData type updated.

3. issue.relations.blocked_by.add: runIssueBlockedByAdd now returns added: true,
   satisfying the required output field. IssueBlockedByData type updated.

4. project_v2.items.issue.remove: YAML input schema corrected to require
   projectId (node ID string) instead of projectNumber, matching the CLI
   handler which calls gh project item-delete --project-id. Integration
   tests updated accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(review): address CodeRabbit P1 issues and schema inconsistencies

- Fix project_v2.items.issue.remove: rewrite handler to use correct
  gh project item-delete syntax (positional projectNumber + --owner/--id)
  and update YAML schema to match (owner, projectNumber, itemId)
- Fix pr.checks.list: strip internal `bucket` field from output and
  include `annotations: []` so output validates against schema
  (additionalProperties: false was failing at runtime)
- Add isResolved: false to builders.ts replyBuilder.mapResponse for
  consistency with live runReplyToReviewThread path
- Add minLength/minimum constraints to YAML output schemas for
  issue.assignees.remove, pr.checks.rerun.all, workflow.run.rerun.*
- Fix workflow.view description (Get → View)
- Remove retired Composite Capabilities section from SKILL.md
- Update benchmark scenarios: fix stale capability IDs in
  ci-diagnose-run-wf-001 and issue-triage-comment-wf-001
- Fix docs/capabilities/pull-requests.md output field names for
  pr.assignees.add/remove
- Fix changeset: runId type integer (not string) in rerun outputs
- Rename integration test files to match new capability IDs
- Update tests for new project_v2.items.issue.remove param shape
  and remove bucket from pr.checks.list assertions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(review): tighten IssueMilestoneSetInput and add remove integration test

- Remove `| null` from IssueMilestoneSetInput.milestoneNumber in types.ts
  to match the YAML schema (integer minimum:1, no null); issue.milestone.clear
  owns the clearing concern
- Remove dead null branch from runIssueMilestoneSet and tighten
  assertIssueMilestoneSetInput to reject null unconditionally
- Add ADAPTER_UNSUPPORTED integration test for project_v2.items.issue.remove
  to match the standard two-test pattern used by other CLI-only capabilities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant