Refine Copilot SDK-mode tool permission scoping from engine config#36538

Merged
pelikhan merged 19 commits into main from copilot/implement-refine-copilot-config
Jun 3, 2026

Contributor

Copilot AI commented Jun 2, 2026

Copilot SDK mode was effectively approving every permission request, regardless of workflow-level tool constraints. This update aligns SDK session permissions with the engine-generated tool configuration, removes the implicit “allow everything” posture when explicit rules exist, and improves visibility into denied permission requests.

  • Permission model alignment (SDK ↔ engine config)

    • Emit a normalized permissionConfig in the engine-generated stdin JSON payload for SDK mode.
    • Pass permission/tool settings from engine → harness → SDK driver without parsing CLI flags in JavaScript.
    • Remove JS-side parsing of serverArgs (--allow-tool, --allow-all-tools) for permission derivation.
  • Scoped permission handler in SDK driver

    • Replace unconditional onPermissionRequest: approveAll with a rule-aware handler.
    • Preserve SDK default behavior when no toolsets/permission rules are specified (matching command-line mode semantics), and only wire onPermissionRequest when explicit permission config exists.
    • Keep explicit allowAllTools behavior via approveAll when the engine config requests full tool access.
    • Enforce allowlist decisions across:
      • shell commands (shell, shell(cmd), shell(prefix:*))
      • MCP server/tool permissions (server, server(tool))
      • web_fetch URL permissions
      • write/custom-tool requests
    • Reject non-allowed requests with explicit feedback.
  • Permission-denied observability

    • Add SDK-side denied-request summaries for permission checks.
    • Emit permission-denied diagnostics through GitHub Actions-style core logging (core.info + core.warning) via shim-backed global.core in the harness, with harness-log fallback when core logging is unavailable.
  • Behavioral consistency improvements

    • Ensure deterministic tool-rule normalization and ordering for derived permission config.
    • Keep SDK-mode semantics consistent with existing Copilot CLI tool flag behavior.
    • Add/update tests for stdin SDK options payload handling and SDK execution-step emission of permissionConfig.
    • Add regression coverage to verify SDK-mode sessions do not inject an approve-all permission handler when no permission config is provided.
    • Add test coverage for permission conversion/parsing edge cases (malformed entries, dedupe/sort, allow-all handling) and SDK permission-denied logging behavior.
// Before
const sessionConfig = {
  model,
  onPermissionRequest: approveAll,
  provider,
};

// After
const onPermissionRequest = buildCopilotSDKPermissionHandler(permissionConfig, approveAll);

const sessionConfig = {
  model,
  provider,
};

if (onPermissionRequest) {
  sessionConfig.onPermissionRequest = onPermissionRequest;
}
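
The handler construction described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `isAllowed` helper, the `"mcp"` request kind, the `serverName` field, and the `{ kind: "reject", feedback }` result shape are assumptions made for the example, and shell and `web_fetch` matching are left out for brevity. Only `buildCopilotSDKPermissionHandler`, `permissionConfig`, `allowAllTools`, `allowedTools`, `approveAll`, and the `approve-once` result appear in the conversation itself.

```javascript
// Hypothetical helper: decides whether a single request matches the allowlist.
// The "mcp" kind and the serverName field are assumed names for illustration.
function isAllowed(allowed, request) {
  switch (request.kind) {
    case "read":
      // Reads are treated as always allowed, as discussed in the review.
      return true;
    case "write":
      return allowed.has("write");
    case "custom-tool":
      return allowed.has(request.toolName);
    case "mcp":
      return (
        allowed.has(request.serverName) ||
        allowed.has(`${request.serverName}(${request.toolName})`)
      );
    default:
      // Unknown kinds are denied.
      return false;
  }
}

function buildCopilotSDKPermissionHandler(permissionConfig, approveAll) {
  const allowedTools = Array.isArray(permissionConfig?.allowedTools)
    ? permissionConfig.allowedTools.map(entry => String(entry).trim()).filter(Boolean)
    : [];

  // Explicit allow-all wins over any individual entries.
  if (permissionConfig?.allowAllTools === true) {
    return approveAll;
  }
  // No explicit rules: return undefined so the caller keeps SDK defaults.
  if (allowedTools.length === 0) {
    return undefined;
  }

  const allowed = new Set(allowedTools);
  return request =>
    isAllowed(allowed, request)
      ? { kind: "approve-once" }
      : // Assumed rejection shape; the real SDK result type may differ.
        { kind: "reject", feedback: "denied by workflow tool permissions" };
}
```

Returning `undefined` rather than a deny-all handler is what lets the caller skip `onPermissionRequest` entirely when no rules are configured.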

Copilot AI and others added 4 commits June 2, 2026 21:43
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "Refine Copilot SDK-mode engine permission scoping" to "Refine Copilot SDK-mode tool permission scoping from engine config" Jun 2, 2026
Copilot AI requested a review from pelikhan June 2, 2026 21:51
Collaborator

pelikhan commented Jun 2, 2026

@copilot store the permissions configuration, toolsets, and related settings in the JSON config passed via stdin so you don’t need a parser on the JavaScript side to interpret these options.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 2, 2026

@copilot store the permissions configuration, toolsets, and related settings in the JSON config passed via stdin so you don’t need a parser on the JavaScript side to interpret these options.

Implemented in ec5cde4. The engine now emits permissionConfig in the stdin JSON payload, and the harness consumes that config directly. JS-side parsing of serverArgs for --allow-tool/--allow-all-tools has been removed.
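
To illustrate the idea of consuming `permissionConfig` straight from the stdin JSON, here is a small sketch. The helper names `parseStdinPayloadText` and `readPermissionConfig` and the example path are hypothetical; only the `promptFile`, `permissionConfig`, `allowAllTools`, and `allowedTools` field names come from this conversation, and the harness's real parsing may differ.

```javascript
// Hypothetical helper: parse the raw stdin text into a payload object.
// Returns null for empty input, invalid JSON, or non-object JSON.
function parseStdinPayloadText(text) {
  const trimmed = typeof text === "string" ? text.trim() : "";
  if (trimmed.length === 0) {
    return null;
  }
  let payload;
  try {
    payload = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (payload === null || typeof payload !== "object" || Array.isArray(payload)) {
    return null;
  }
  return payload;
}

// Hypothetical helper: extract a well-formed permission config, or undefined
// when the payload carries none (so SDK defaults stay in effect).
function readPermissionConfig(payload) {
  const raw = payload?.permissionConfig;
  if (raw === null || typeof raw !== "object" || Array.isArray(raw)) {
    return undefined;
  }
  const allowedTools = Array.isArray(raw.allowedTools)
    ? raw.allowedTools.filter(entry => typeof entry === "string")
    : [];
  return { allowAllTools: raw.allowAllTools === true, allowedTools };
}

// Example payload; the prompt path is made up for illustration.
const examplePayload = parseStdinPayloadText(`
  {
    "promptFile": "/tmp/prompt.txt",
    "permissionConfig": { "allowAllTools": false, "allowedTools": ["shell(git:*)", "write"] }
  }
`);
console.log(readPermissionConfig(examplePayload));
```

Because the engine has already normalized the rules, the JavaScript side only validates shape and never needs to interpret CLI flags.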

Collaborator

pelikhan commented Jun 2, 2026

@copilot remove dead code introduced in this change. Review the behavior when no toolsets are specified, as it should match the existing command-line version behavior.

Copilot AI and others added 3 commits June 2, 2026 22:24
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 2, 2026

@copilot remove dead code introduced in this change. Review the behavior when no toolsets are specified, as it should match the existing command-line version behavior.

Addressed in 34254d4 and follow-up 9b5ccbc. SDK mode no longer injects an approve-all permission handler when no toolsets are configured, so it now defers to SDK defaults to match CLI no-toolset behavior, and I removed the redundant fallback path by only wiring onPermissionRequest when explicit permission rules are present.

Collaborator

pelikhan commented Jun 2, 2026

@copilot add core info logging in the logic to allow tools to record permission-denied events, and log them as core warnings.
Add more tests to cover permission conversion and parsing logic.

Copilot AI and others added 3 commits June 2, 2026 23:01
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 2, 2026

@copilot add core info logging in the logic to allow tools to record permission-denied events, and log them as core warnings.
Add more tests to cover permission conversion and parsing logic.

Implemented in 2be2a58.

  • Added SDK permission-denied event logging that records request summaries and emits GitHub Actions core warnings/info when permission checks reject a request.
  • Wired a best-effort @actions/core logger from harness → SDK driver so denied tool requests are visible as core warnings.
  • Added test coverage for:
    • core warning/info emission on denied SDK permission requests
    • stdin SDK payload parsing for allow-all permission config
    • Go-side permission conversion/parsing (buildCopilotSDKPermissionConfig) including malformed entries, dedupe/sort, and allow-all behavior.

Collaborator

pelikhan commented Jun 3, 2026

@copilot use shim.cjs to provide core logging

Contributor

github-actions Bot commented Jun 3, 2026

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (135 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/36538-derive-copilot-sdk-permission-model-in-engine.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff (the move of SDK permission derivation into the Go engine, and the shift from unconditional approveAll to a rule-aware, default-deny handler).
  2. Complete the missing sections — add context the AI couldn't infer, refine the decision rationale, and confirm the alternatives reflect what you actually considered.
  3. Commit the finalized ADR to docs/adr/ on your branch.
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-36538: Derive Copilot SDK Permission Model in the Engine

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say 'I'll deal with this later.' Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 36538-derive-copilot-sdk-permission-model-in-engine.md for PR #36538).

🔒 Blocking: link an ADR in the PR body to clear this gate.

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · opus48 730.5K ·

Contributor

github-actions Bot left a comment

REQUEST_CHANGES — 3 medium issues and 1 test gap across the new permission-logging layer.

Blocking themes

Double logging in logPermissionDenied

Every permission denial emits both core.info and core.warning to GitHub Actions, producing two annotations with contradictory severity for the same event. The info call is redundant and should be removed.

All-or-nothing buildCoreLogger validation

The REQUIRED_CORE_LOGGER_METHODS check discards a partial core adapter (e.g. one that only implements warning) entirely, even though both call sites guard with optional chaining and would work fine with a partial object. The validation is stricter than necessary.

summarizePermissionRequest default case

Returns raw request.kind, which is undefined for any SDK kind not yet in the switch. The resulting log message reads permission denied ... undefined and is not actionable.

Missing standalone --allow-all-tools test

The critical allowAllTools=true && no allowedTools path is only covered in combination with --allow-tool write. The standalone case is not tested.

🔎 Code quality review by PR Code Quality Reviewer · sonnet46 184K

coreLogger.info(`Copilot SDK permission denied: ${requestSummary}`);
}
if (coreLogger?.warning) {
coreLogger.warning(`Copilot SDK permission denied by workflow tool permissions: ${requestSummary}`);
Contributor

Redundant dual-level logging for every permission denial: logPermissionDenied emits both core.info and core.warning for the same event, producing two GitHub Actions annotations with contradictory severity for each denied request.

💡 Suggested fix

Keep only the warning call — it is the correct severity for a denial event:

function logPermissionDenied(coreLogger, logger, request) {
  const requestSummary = summarizePermissionRequest(request);
  logger(`permission denied by workflow tool permissions: ${requestSummary}`);
  if (coreLogger?.warning) {
    coreLogger.warning(`Copilot SDK permission denied by workflow tool permissions: ${requestSummary}`);
  }
}

The info entry adds no value once the warning is present. Operators will see the same event twice at different severity levels, creating noise in the GHA annotations panel and potentially confusing automated alerting.

case "custom-tool":
return `custom-tool(${request.toolName || "unknown"})`;
default:
return request.kind;
Contributor

summarizePermissionRequest default returns request.kind directly, which can be undefined: If the SDK introduces a new kind value not in the switch, the log message reads permission denied ... undefined, masking the actual operation.

💡 Suggested fix
default:
  return String(request.kind ?? "unknown");

This ensures the log message is always a printable string even for future SDK additions, and makes the "unknown" case explicit rather than relying on implicit JS coercion.
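
For context, a complete summarizer with the suggested default might look like the sketch below. Only the `custom-tool` case and the `default` branch are taken from the review; the `shell` case and its use of `fullCommandText` are assumptions added to make the example self-contained.

```javascript
// Sketch of a request summarizer that always returns a printable string.
function summarizePermissionRequest(request) {
  switch (request?.kind) {
    case "shell":
      // Assumed formatting for shell requests; not shown in the review excerpt.
      return `shell(${request.fullCommandText || "unknown"})`;
    case "custom-tool":
      return `custom-tool(${request.toolName || "unknown"})`;
    default:
      // Covers both missing kinds and kinds added by future SDK versions.
      return String(request?.kind ?? "unknown");
  }
}

console.log(summarizePermissionRequest({ kind: "custom-tool", toolName: "deploy" }));
console.log(summarizePermissionRequest({}));
```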

Comment thread actions/setup/js/copilot_harness.cjs Outdated
}
const missingMethods = REQUIRED_CORE_LOGGER_METHODS.filter(method => typeof core[method] !== "function");
if (missingMethods.length > 0) {
logCoreLoggerFallback(harnessLogger, `missing method(s): ${missingMethods.join(", ")}`);
Contributor

All-or-nothing validation silently discards partial core logger implementations: If global.core implements only warning (e.g. a minimal test stub or a future SDK revision), buildCoreLogger returns undefined and drops all structured logging instead of degrading gracefully.

💡 Suggested fix
return {
  info: typeof core.info === "function" ? message => core.info(message) : undefined,
  warning: typeof core.warning === "function" ? message => core.warning(message) : undefined,
};

This removes REQUIRED_CORE_LOGGER_METHODS and the all-or-nothing guard. logPermissionDenied already guards each method with optional chaining (coreLogger?.info, coreLogger?.warning), so a partial adapter works correctly with the existing call sites.

}
})

t.Run("preserves allow-all when present", func(t *testing.T) {
Contributor

Missing test for standalone --allow-all-tools with no --allow-tool flags: The critical branch allowAllTools=true && len(allowedToolsSet)==0 is only tested in combination with --allow-tool write. A regression that moves the early-nil-return check above the allowAllTools check would cause standalone --allow-all-tools to silently return nil (no-handler, SDK defaults) instead of approveAll, blocking all tools in production with no test failure.

💡 Suggested addition
t.Run("allow-all-tools without explicit tool entries", func(t *testing.T) {
    config := buildCopilotSDKPermissionConfig([]string{"--allow-all-tools"})
    if config == nil {
        t.Fatal("Expected non-nil permission config for --allow-all-tools")
    }
    if !config.AllowAllTools {
        t.Fatal("Expected AllowAllTools=true")
    }
    if len(config.AllowedTools) != 0 {
        t.Fatalf("Expected empty AllowedTools, got %v", config.AllowedTools)
    }
})

Contributor

github-actions Bot commented Jun 3, 2026

🧪 Test Quality Sentinel Report

⚠️ Test Quality Score: 66/100 — Acceptable, with suggestions

Analyzed 16 tests (4 Go sub-cases, 12 JavaScript): 15 design tests, 1 borderline, 0 guideline violations. Test inflation detected in copilot_harness.test.cjs (147 additions vs 50 in production file, ratio ≈ 2.9:1).

📊 Metrics & Test Classification (16 tests analyzed)
| Metric | Value |
| --- | --- |
| New/modified tests analyzed | 16 |
| ✅ Design tests (behavioral contracts) | 15 (93.8%) |
| ⚠️ Implementation tests (low value) | 1 (6.3%) |
| Tests with error/edge cases | 11 (68.8%) |
| Duplicate test clusters | 0 |
| Test inflation detected | ✅ YES: copilot_harness.test.cjs (2.9:1) |
| 🚨 Coding-guideline violations | 0 |

Test Classification Details

| Test | File | Classification | Notes |
| --- | --- | --- | --- |
| TestCopilotEngineExecutionStepsWithCopilotSDKPermissionConfig | pkg/workflow/copilot_engine_test.go:321 | ✅ Design | Verifies generated step JSON contains permissionConfig; no error path |
| TestBuildCopilotSDKPermissionConfig / returns nil when no flags | copilot_engine_test.go:349 | ✅ Design | Edge case: nil result when no --allow-* flags |
| TestBuildCopilotSDKPermissionConfig / converts and normalizes | copilot_engine_test.go:356 | ✅ Design | Edge cases: malformed entries ignored, duplicates deduped |
| TestBuildCopilotSDKPermissionConfig / preserves allow-all | copilot_engine_test.go:380 | ✅ Design | Boundary: allow-all flag dominates individual allow-tool |
| uses scoped permission handler from SDK permission config | copilot_harness.test.cjs:368 | ✅ Design | Tests approve vs. reject per tool kind (behavioral contract) |
| logs permission-denied SDK requests as core warnings | copilot_harness.test.cjs:410 | ✅ Design | Verifies observable side-effect (warning emission) |
| uses SDK default permission behavior when no permissionConfig | copilot_harness.test.cjs:455 | ✅ Design | Edge case: absent config → no onPermissionRequest injected |
| returns null when stdin is a TTY | copilot_harness.test.cjs:1778 | ✅ Design | Edge case: TTY guard |
| parses valid JSON payload with promptFile | copilot_harness.test.cjs:1783 | ✅ Design | Happy-path parse |
| parses full payload with permissionConfig | copilot_harness.test.cjs:1788 | ✅ Design | Full field coverage including permissionConfig |
| parses payload with serverArgs but no addWorkspaceDir | copilot_harness.test.cjs:1810 | ✅ Design | Edge case: optional field absent |
| parses payload with allow-all permission config | copilot_harness.test.cjs:1823 | ✅ Design | Variant: allowAllTools=true |
| returns null on empty stdin | copilot_harness.test.cjs:1839 | ✅ Design | Edge case: empty input |
| returns null on invalid JSON | copilot_harness.test.cjs:1844 | ✅ Design | Error path: malformed input |
| handles extra whitespace around JSON | copilot_harness.test.cjs:1849 | ✅ Design | Edge case: whitespace trimming |
| saves enriched reflect data when api-proxy returns null models | copilot_harness.test.cjs:1860 | ⚠️ Borderline | Primarily a happy-path integration; no failure path for fetch errors |

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 4 test cases (unit, //go:build !integration present ✅)
  • 🟨 JavaScript (*.test.cjs): 12 tests (vitest)
⚠️ Flagged Tests — Requires Review (2 issues)

⚠️ Test inflation — copilot_harness.test.cjs

Issue: 147 lines added to test file vs 50 lines added to copilot_harness.cjs production file (ratio ≈ 2.9:1, threshold 2:1).
Assessment: The ratio is elevated but the test growth reflects well-scoped coverage of several new parseStdinPayload cases and three permission-handling scenarios. The inflation is driven by comprehensive edge-case coverage rather than low-value filler. No action strictly required, but consider whether any of the nine parseStdinPayload tests could be collapsed into a single table-driven block to reduce verbosity.
Suggested improvement: Combine the parseStdinPayload it() tests into a describe with a shared test.each / it.each table (vitest supports it.each). This preserves all edge cases while reducing line count.

⚠️ saves enriched reflect data when api-proxy returns null models (copilot_harness.test.cjs:1860)

Classification: Borderline — happy-path only
Issue: Tests the successful enrichment path but does not cover the fetch-failure case documented in fetchAWFReflect (the "leaves models null when models_url fetch fails" test nearby covers a different function). If the enrichment save logic breaks on fetch error, this test would not catch it.
Suggested improvement: Add one error-path variant: stub fetch to reject or return ok: false, then assert that outputPath is still written with models: null (or that the function does not throw).

Verdict

Check passed. 6.3% of new tests are implementation tests (threshold: 30%). Score: 66/100 — acceptable quality with minor improvements suggested (see flagged tests above).

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · sonnet46 2.2M ·

Contributor

github-actions Bot left a comment

Test Quality Sentinel: 66/100. Test quality is acceptable — 6.3% of new tests are implementation tests (threshold: 30%). Minor suggestion: consolidate the parseStdinPayload it() tests into an it.each table and add an error-path case for the reflect-enrichment test.

Contributor

github-actions Bot left a comment

Skills-Based Review 🧠

Applied /tdd, /zoom-out, and /grill-with-docs — overall solid security improvement with a few actionable gaps.

📋 Key Themes & Highlights

Key Themes

  • Test coverage gaps (3 items): allowAllTools=true ignoring allowedTools, read always allowed, and shell prefix matching with an empty commands array all lack regression tests.
  • Redundant logging: logPermissionDenied emits both core.info and core.warning for every denial — info is noise on top of the warning.
  • Secondary derivation path: buildCopilotSDKPermissionConfig re-parses CLI flags that were just synthesised by computeCopilotToolArguments. If the flag names change the SDK layer silently gets an empty config.
  • Missing typedef documentation: CopilotSDKPermissionConfig.allowedTools entry format conventions are undocumented, making the type hard to use correctly without reading the implementation.

Positive Highlights

  • ✅ Replacing unconditional approveAll with a rule-aware handler is the right security primitive — well-structured and easy to reason about.
  • ✅ Deterministic normalization (trim, dedup, sort) in Go before JSON serialization prevents drift between runs.
  • ✅ Returning an undefined handler when no rules are configured correctly preserves SDK-default behavior rather than injecting a restrictive policy where none was requested.
  • ✅ Clear separation between the Go config-building layer and the JS enforcement layer makes the pipeline auditable.
  • ✅ Comprehensive happy-path test coverage for the new JS permission handler.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 3M

coreLogger.info(`Copilot SDK permission denied: ${requestSummary}`);
}
if (coreLogger?.warning) {
coreLogger.warning(`Copilot SDK permission denied by workflow tool permissions: ${requestSummary}`);
Contributor

[/tdd] logPermissionDenied emits both core.info and core.warning for every denial — the info call is redundant noise and dilutes the signal in the Actions log.

💡 Suggestion

Drop the core.info call (lines 80–82) entirely. core.warning already surfaces the denial prominently in the GitHub Actions UI, and the internal logger line above it captures the same message for the harness log. Two separate core.* calls with slightly different wording ("Copilot SDK permission denied" vs "...by workflow tool permissions") will confuse readers trying to triage denied-tool warnings.

function logPermissionDenied(coreLogger, logger, request) {
  const requestSummary = summarizePermissionRequest(request);
  logger(`permission denied by workflow tool permissions: ${requestSummary}`);
  coreLogger?.warning?.(`Copilot SDK permission denied by workflow tool permissions: ${requestSummary}`);
}

return shellRules.some(rule => {
if (rule.endsWith(":*")) {
const prefix = rule.slice(0, -2).trim();
return prefix.length > 0 && commandIdentifiers.includes(prefix);
Contributor

[/tdd] Prefix rules like shell(git:*) rely on request.commands being a populated array. When commands is absent or empty (e.g., for script invocations), commandIdentifiers will be [] and the prefix rule silently misses — the request is denied even though the intent is to allow it.

💡 Suggestion and missing test

Add a fallback: if commandIdentifiers is empty and fullCommandText is present, extract the first word as a fallback identifier:

const commandIdentifiers = Array.isArray(request.commands)
  ? request.commands.map(cmd => cmd?.identifier).filter(Boolean)
  : [];

// Fallback: derive identifier from first token of fullCommandText when commands is empty.
const effectiveIdentifiers = commandIdentifiers.length > 0
  ? commandIdentifiers
  : [fullCommand.split(/\s+/)[0]].filter(Boolean);

Then use effectiveIdentifiers in the shellRules.some(...) check. Add a test:

it("matches shell prefix rule when commands array is absent", () => {
  const handler = buildCopilotSDKPermissionHandler(
    { allowedTools: ["shell(git:*)"] }, approveAll
  );
  // No 'commands' field — SDK sends only fullCommandText
  expect(
    handler({ kind: "shell", commands: [], fullCommandText: "git status" })
  ).toEqual({ kind: "approve-once" });
});
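
Putting the suggested fallback together with the rule check, a standalone matcher could look like the sketch below. The function name `isShellAllowed` and the exact-match rule for non-prefix entries are assumptions for illustration; the `commands`, `identifier`, and `fullCommandText` fields and the `:*` prefix convention are taken from the review comments above.

```javascript
// Hypothetical standalone matcher for shell requests against allowedTools.
function isShellAllowed(allowedTools, request) {
  // A bare "shell" entry allows every shell command.
  if (allowedTools.includes("shell")) {
    return true;
  }

  // Collect the inner text of every "shell(...)" entry.
  const shellRules = allowedTools
    .filter(entry => entry.startsWith("shell(") && entry.endsWith(")"))
    .map(entry => entry.slice("shell(".length, -1).trim());

  const fullCommand =
    typeof request.fullCommandText === "string" ? request.fullCommandText.trim() : "";
  const commandIdentifiers = Array.isArray(request.commands)
    ? request.commands.map(cmd => cmd?.identifier).filter(Boolean)
    : [];

  // Fallback: use the first token of fullCommandText when commands is empty.
  const effectiveIdentifiers =
    commandIdentifiers.length > 0
      ? commandIdentifiers
      : [fullCommand.split(/\s+/)[0]].filter(Boolean);

  return shellRules.some(rule => {
    if (rule.endsWith(":*")) {
      const prefix = rule.slice(0, -2).trim();
      return prefix.length > 0 && effectiveIdentifiers.includes(prefix);
    }
    // Assumed semantics: a non-prefix rule must equal the full command text.
    return rule.length > 0 && rule === fullCommand;
  });
}
```

One caveat with the first-token fallback: a compound command such as `git status; rm -rf x` would also match `shell(git:*)`, so a real implementation would likely need stricter parsing before trusting the fallback.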

chmod 700 %s`, customEngineCommandScriptPath, heredocDelimiter, scriptContent, heredocDelimiter, customEngineCommandScriptPath)
}

func buildCopilotSDKPermissionConfig(toolArgs []string) *copilotSDKPermissionConfig {
Contributor

[/zoom-out] buildCopilotSDKPermissionConfig re-parses the serialized CLI flags (--allow-tool, --allow-all-tools) that were just built by computeCopilotToolArguments. This creates a secondary derivation path that will silently break if the CLI flag names ever change.

💡 Architectural concern

The permission config and the CLI flags are both derived from workflowData.Tools (via computeCopilotToolArguments). Instead of re-parsing the CLI args, consider passing the structured tool config directly:

PermissionConfig: buildCopilotSDKPermissionConfigFromTools(workflowData),

where buildCopilotSDKPermissionConfigFromTools reads workflowData.ParsedTools or the same inputs that computeCopilotToolArguments uses. This removes the --allow-tool string as an intermediate representation and keeps one source of truth.

If keeping the current approach, at minimum add a comment documenting the coupling: toolArgs must contain the flags produced by computeCopilotToolArguments for this function to produce a correct result.

const allowedToolEntries = new Set(normalizedAllowedTools);

// Keep explicit allow-all behavior when requested by the engine config.
if (allowAll) {
Contributor

[/tdd] When allowAllTools: true, approveAll is returned immediately and allowedTools is silently ignored. This is the correct policy, but there's no test that verifies buildCopilotSDKPermissionHandler({ allowAllTools: true, allowedTools: ["shell(rm:*)"] }, approveAll) returns approveAll exactly (not a scoped handler that would block unlisted tools).

💡 Suggested test
it("returns approveAll unchanged when allowAllTools is true, ignoring allowedTools", () => {
  const fakeApproveAll = () => ({ kind: "approve-once" });
  const handler = buildCopilotSDKPermissionHandler(
    { allowAllTools: true, allowedTools: ["shell(rm:*)"] },
    fakeApproveAll
  );
  expect(handler).toBe(fakeApproveAll); // must be the same reference
});

Without this, a future refactor that accidentally builds a scoped handler for the allowAllTools case would block legitimately-allowed tools with no obvious failure signal.

return allowedToolEntries.has("write");
case "read":
// Read permissions are low-risk and are broadly expected by the agent flow.
return true;
Contributor

[/tdd] case "read": return true is an implicit security policy (reads are always permitted regardless of the allowlist). This invariant has zero test coverage — if it's accidentally removed or the fallthrough to default: return false is refactored to cover read, the agent silently loses read access.

💡 Suggested test
it("always allows read requests regardless of allowedTools", () => {
  // An empty allowedTools list yields no handler at all (SDK defaults apply),
  // so use a non-empty allowlist that does not mention "read".
  const handler = buildCopilotSDKPermissionHandler(
    { allowedTools: ["shell(git:*)"] },
    approveAll
  );
  expect(handler({ kind: "read", fileName: "README.md" }))
    .toEqual({ kind: "approve-once" });
});

* @typedef {{
* allowAllTools?: boolean,
* allowedTools?: string[],
* }} CopilotSDKPermissionConfig
Contributor

[/grill-with-docs] The CopilotSDKPermissionConfig typedef documents allowAllTools and allowedTools but says nothing about the entry format convention that isAllowed expects. Without reading the implementation, a caller can't know whether to write "github", "github(get_file_contents)", "shell(git:*)", or "web_fetch" for a given permission.

💡 Suggested JSDoc addition
/**
 * @typedef {{
 *   allowAllTools?: boolean,
 *   allowedTools?: string[],
 * }} CopilotSDKPermissionConfig
 *
 * allowedTools entry format (mirrors --allow-tool CLI flag conventions):
 *   "shell"              — allow all shell commands
 *   "shell(cmd:*)"       — allow any command whose identifier matches `cmd`
 *   "shell(git status)"  — allow the exact full command text
 *   "write"              — allow all file writes
 *   "web_fetch"          — allow all URL fetch requests
 *   "server"             — allow all tools from MCP server `server`
 *   "server(tool)"       — allow only `tool` from MCP server `server`
 *   "toolName"           — allow a custom tool by name
 */
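
The entry conventions listed above could be decoded with a small parser along these lines. The function name `parseAllowedToolEntry` and the `{ name, argument }` result shape are hypothetical; the PR's handler may match entries as plain strings instead.

```javascript
// Hypothetical parser for one allowedTools entry such as "shell(git:*)".
// Returns { name, argument } or null when the entry is malformed.
function parseAllowedToolEntry(entry) {
  const text = String(entry ?? "").trim();
  // name, optionally followed by "(argument)"; name may not contain
  // parentheses or whitespace.
  const match = /^([^()\s]+)(?:\((.*)\))?$/.exec(text);
  if (!match) {
    return null;
  }
  const [, name, argument] = match;
  if (argument === undefined) {
    // Bare entries like "write" or "github" carry no argument.
    return { name, argument: undefined };
  }
  const trimmed = argument.trim();
  // Empty parentheses such as "shell()" are treated as malformed.
  return trimmed.length > 0 ? { name, argument: trimmed } : null;
}

console.log(parseAllowedToolEntry("github(get_file_contents)"));
console.log(parseAllowedToolEntry("shell(git status)"));
```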

Contributor

github-actions Bot commented Jun 3, 2026

@copilot review all comments and address unresolved review feedback. Please refresh the branch and rerun checks after applying fixes.

Generated by 👨‍🍳 PR Sous Chef · gpt54mini 6.3M ·

Collaborator

pelikhan commented Jun 3, 2026

@copilot review all comments and reviews
Apply copilot-review skill

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 3, 2026

@copilot review all comments and reviews
Apply copilot-review skill

Addressed in 7b71e6c. I reviewed the remaining comments/reviews and fixed the outstanding parser edge case in pkg/workflow/copilot_engine_execution.go so malformed --allow-tool values are consumed instead of being reinterpreted as flags, and added a regression test in pkg/workflow/copilot_engine_test.go. Targeted go test ./pkg/workflow -run 'TestBuildCopilotSDKPermissionConfig|TestGenerateCopilotExecutionStep|TestGenerateCopilotExecutionStep_StdinJSONIncludesSDKOptions' passes, and final validation completed with no actionable follow-up.
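
The edge case can be illustrated with a JavaScript rendering of the flag-to-config conversion. The real logic lives in Go (`buildCopilotSDKPermissionConfig`), so this is only a sketch: the function name `buildPermissionConfigFromFlags` is hypothetical, and treating empty values or values starting with `--` as malformed is an assumption about what "malformed" means here.

```javascript
// Hypothetical JavaScript illustration of converting CLI-style tool flags
// into a normalized permission config (the PR implements this in Go).
function buildPermissionConfigFromFlags(toolArgs) {
  let allowAllTools = false;
  const allowed = new Set();

  for (let i = 0; i < toolArgs.length; i++) {
    const arg = toolArgs[i];
    if (arg === "--allow-all-tools") {
      allowAllTools = true;
      continue;
    }
    if (arg === "--allow-tool") {
      // Always consume the next token as this flag's value, even when it is
      // malformed, so it is never re-read as a flag on the next iteration.
      i++;
      const value = i < toolArgs.length ? String(toolArgs[i]).trim() : "";
      // Assumption: empty values and flag-like values count as malformed.
      if (value.length > 0 && !value.startsWith("--")) {
        allowed.add(value);
      }
    }
  }

  // No permission flags at all: no config, so SDK defaults apply.
  if (!allowAllTools && allowed.size === 0) {
    return null;
  }
  // Set membership dedupes; sorting keeps the output deterministic.
  return { allowAllTools, allowedTools: [...allowed].sort() };
}
```

With this shape, `["--allow-tool", "--allow-all-tools"]` yields `null` rather than silently enabling full tool access, which is the behavior the regression test is meant to pin down under the assumptions above.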

Contributor

github-actions Bot commented Jun 3, 2026

@copilot review all comments and address the unresolved review feedback on permissions and logging.

Generated by 👨‍🍳 PR Sous Chef · gpt54mini 2.1M ·

Contributor

github-actions Bot commented Jun 3, 2026

@copilot please summarize the remaining blockers and refresh checks if anything changed.

Generated by 👨‍🍳 PR Sous Chef · gpt54mini 2.1M ·

Comment thread actions/setup/js/copilot_harness.cjs Outdated
* @param {(message: string) => void} harnessLogger
* @returns {{info?: (message: string) => void, warning?: (message: string) => void} | undefined}
*/
function buildCoreLogger(harnessLogger) {
Collaborator

@copilot require shim.cjs instead

Contributor Author

Addressed in 381aa5c. actions/setup/js/copilot_harness.cjs now requires ./shim.cjs at module scope and buildCoreLogger() reads global.core from that initialization path.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan June 3, 2026 04:42
Collaborator

pelikhan commented Jun 3, 2026

@copilot remove build core logging, just call core.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 3, 2026

@copilot remove build core logging, just call core.

Addressed in b6bbcae. actions/setup/js/copilot_harness.cjs now passes the shim-provided global.core directly to the SDK driver and removes the redundant buildCoreLogger helper/fallback logic.

Collaborator

pelikhan commented Jun 3, 2026

@copilot merge main and recompile

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 3, 2026

@copilot merge main and recompile

Addressed in f6ecb89. I merged origin/main into the branch and ran the repository recompile flow (make build && make recompile && make fmt).

pelikhan merged commit 9f4c8cd into main Jun 3, 2026
pelikhan deleted the copilot/implement-refine-copilot-config branch June 3, 2026 05:50
3 participants