Skip to content

Fix loom on Claude Code 2.1.x: /loom:<role> namespace + BypassPermissions auto-accept#3348

Merged
rjwalters merged 2 commits into
rjwalters:mainfrom
jperla:fix/claude-code-2.1-compat
May 28, 2026
Merged

Fix loom on Claude Code 2.1.x: /loom:<role> namespace + BypassPermissions auto-accept#3348
rjwalters merged 2 commits into
rjwalters:mainfrom
jperla:fix/claude-code-2.1-compat

Conversation

@jperla
Copy link
Copy Markdown
Contributor

@jperla jperla commented May 28, 2026

Two changes to make loom spawn working agents on Claude Code 2.1.133. Tested live against a real project (QuillUI) for both shepherd and support-role spawns.

1. Use the /loom:<role> namespaced slash-command form (fixes #3345)

After #3176 moved commands to .claude/commands/loom/<role>.md, Claude Code 2.1+ only auto-discovers them via the /loom:<role> namespace prefix. The bare /<role> form fails with Unknown command: /<role>, so every freshly spawned agent stalls at startup.

  • agent_spawn.pyrole_cmd construction now emits /loom:{role}
  • agent_monitor.py — stuck-prompt recovery uses the same namespaced form

This is the clean, uncontroversial half — it's just matching the resolver behavior introduced alongside #3176. See #3345 for the full repro.

2. Auto-accept the BypassPermissions warning on spawn

On this Claude Code build the BypassPermissions acknowledgement is not persisted to CLAUDE_CONFIG_DIR, so every fresh process shows the modal. The default selection is "1. No, exit", which means an unattended Enter exits the spawn immediately. We schedule a delayed Down+Enter at +8s via a backgrounded subprocess so the agent silently selects "2. Yes, I accept" and proceeds.

This half is a pragmatic workaround, not a polished fix — the +8s timing is a heuristic and the keystroke-injection is ugly. I'm opening it for visibility; happy to drop it from this PR if you'd prefer to solve the persisted-acknowledgement problem a different way, and keep this PR to the namespace fix alone.

🤖 Generated with Claude Code

@rjwalters
Copy link
Copy Markdown
Owner

Thanks for putting this together, @jperla — and for the careful write-up in #3345 with the verified reproduction. Sorry about the timing collision here.

Status: partially superseded, partially still wanted

Half 1: /loom:<role> namespace fix — fully landed via #3352

A near-identical fix landed today as #3352 (merged commit 9f756938). It patches the same two sites you patched here (agent_spawn.py:720, agent_monitor.py:503) with identical code, plus 8 additional sites you flagged as possibly missed in #3345 (daemon.py, daemon_v2/actions/shepherds.py, defaults/scripts/agent-wait-bg.sh, the skill-routing files, user-facing CLI help/examples), plus 3 detection-side regex updates and a regression test. Your grep suggestion in #3345 was exactly right — that's how the additional coverage was found.

So the namespace half of this PR is a strict subset of what's now on main, and the conflict you're seeing is because both PRs touch the same lines.

Half 2: BypassPermissions auto-accept — still genuinely wanted

This one is not a duplicate, and it's actually filling a regression gap most people wouldn't have noticed:

So your patch is restoring functionality that just silently disappeared. Worth landing.

Implementation feedback for the BypassPermissions portion

If you're willing to rebase and reduce the PR to just this half, a few things to consider:

  1. Polling instead of a hardcoded sleep 8: an 8s wait is brittle (too long on fast systems, too short on slow ones). A poll on capture_pane for the literal string "Bypass Permissions" (or "Yes, I accept") with a 10s timeout would be more robust. PR Fix agent launcher silent failure after bypass prompt timeout #375 hit this exact "silent failure after bypass prompt timeout" issue and added defensive error handling around it — worth a look.

  2. Send via _tmux, not orphan bash subprocess: agent_spawn.py already imports subprocess at module scope and has the _tmux() helper. A threading.Timer or simply a time.sleep(N); _tmux("send-keys", ...) after spawn-verify is cleaner than Popen(["bash", "-c", "sleep 8 && tmux send-keys ..."], start_new_session=True). The current approach spawns an orphan background process that survives Loom shutdown.

  3. Drop the import subprocess as _subprocess linesubprocess is already imported at the top of the file.

  4. Down vs "2": PR Automatically accept Claude Code bypass permissions prompt #112's approach was sending the literal character "2" then Enter, which is more explicit than relying on cursor position. Both work; explicit is easier to debug.

  5. Add a small test if practical (e.g., assert the scheduled-input call is fired when spawn succeeds).

  6. Consider gating behind a env-var or feature flag for now, so users on Claude Code builds where --dangerously-skip-permissions already suppresses the modal don't get an extraneous keystroke injected. Something like LOOM_AUTO_ACCEPT_BYPASS=1 (default off, or auto-detect Claude Code version). The risk of injecting Down+Enter after the agent is already running is small in practice (role_cmd was already submitted as first input) but a flag makes the workaround explicit.

Suggested path forward

Three options, your call:

  • (a) Rebase, drop the namespace hunks, refine the BypassPermissions half along the lines above — that's the path I'd recommend, and I'd be happy to review a v2.
  • (b) Close this PR and leave the BypassPermissions work for a maintainer follow-up, with attribution to your investigation here. I'll file an issue referencing this PR if you go this route.
  • (c) Defer the decision to @rjwalters — totally fine, no rush.

Labeling loom:changes-requested so it doesn't sit indefinitely in the review queue. Not closing — that's your call (or the maintainer's), not mine. Thank you again for the careful work and the detailed reproduction; the QuillUI fleet trace in #3345 was what made #3352 easy to scope correctly.

@rjwalters rjwalters added the loom:changes-requested PR requires changes before re-review (Judge requested modifications) label May 28, 2026
Joseph Perla and others added 2 commits May 28, 2026 10:23
Two changes needed to make loom work against Claude Code 2.1.133:

1. Use /loom:<role> namespaced slash command form. Claude Code 2.1+
   only auto-discovers nested .claude/commands/loom/<role>.md via the
   /loom: namespace prefix; the bare /<role> form fails with
   "Unknown command". Applies to agent_spawn.py role_cmd construction
   and agent_monitor.py stuck-prompt recovery.

2. Auto-accept the BypassPermissions warning on every spawn. On this
   Claude Code build the acknowledgement is not persisted to
   CLAUDE_CONFIG_DIR — every fresh process shows the modal, default
   selection is "1. No, exit", so an unattended Enter would EXIT
   the spawn immediately. We schedule a delayed Down+Enter at +8s via
   a backgrounded subprocess so the spawned agent silently accepts
   "2. Yes, I accept" and proceeds.

Tested live against QuillUI for both shepherd and support-role spawns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses Judge feedback on PR rjwalters#3348.  The original auto-accept logic
used `subprocess.Popen(["bash", "-c", "sleep 8 && tmux send-keys ..."])`
which:

  - Raced a fixed 8s timer against modal render — too short on slow
    boxes (misses the prompt entirely), wasteful on fast boxes.
  - Orphaned a detached bash process per spawn.
  - Reimported subprocess as `_subprocess` despite the module-level
    import.

Replace with a poll-loop helper that uses the module's existing
`_tmux()` wrapper:

  - `_bypass_prompt_visible(session_name)` — runs `tmux capture-pane`
    and matches against BYPASS_PROMPT_MARKERS ("Bypass Permissions mode"
    plus the `--dangerously-skip-permissions` flag text as a fallback
    marker, since future Claude Code builds may rename the warning).
  - `_auto_accept_bypass_prompt(session_name, ...)` — polls up to
    DEFAULT_BYPASS_POLL_TIMEOUT (15s) with DEFAULT_BYPASS_POLL_INTERVAL
    (1s).  On detection, sends `Down Enter` via `_tmux("send-keys", ...)`
    to select "2. Yes, I accept".  Accepts an injectable `sleep_fn` for
    deterministic testing.
  - Gated behind `LOOM_AUTO_ACCEPT_BYPASS`, default "1" (enabled).  Set
    to "0" to disable on builds where `--dangerously-skip-permissions`
    already suppresses the modal.

The post-spawn call in `spawn_agent` now reads as a synchronous helper
invocation rather than the brittle subprocess.Popen pattern.

Tests (`TestAutoAcceptBypassPrompt`, 6 cases):
  - sends Down+Enter when modal is detected on first poll
  - polls until modal appears across multiple iterations
  - honours the timeout budget when modal never appears
  - LOOM_AUTO_ACCEPT_BYPASS=0 disables the path entirely
  - alternate marker (`--dangerously-skip-permissions`) also detected
  - non-zero `capture-pane` returncode treated as "not yet"

All 6 new tests pass; the 69 pre-existing `test_agent_spawn.py` tests
remain at their main-branch baseline (one pre-existing unrelated
TestValidateRole failure was present before this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rjwalters rjwalters force-pushed the fix/claude-code-2.1-compat branch from b6f2b21 to f7de604 Compare May 28, 2026 17:31
@rjwalters rjwalters added loom:review-requested PR ready for Judge to review and removed loom:changes-requested PR requires changes before re-review (Judge requested modifications) labels May 28, 2026
@rjwalters
Copy link
Copy Markdown
Owner

Thanks for the patch @jperla — sorry for the timing collision. Here's what happened and what's now ready for re-review:

Rebase

The branch was rebased onto current main. Your namespace fix half (/loom:<role> in agent_spawn.py and agent_monitor.py) was independently merged as #3352 a few hours after you filed this PR — that PR patched the same 2 sites plus 8 additional spawn paths and 3 detection-side regexes. During the rebase, both your hunks conflicted with #3352 and were resolved by taking main's version (semantically identical, slightly more thorough comments). Result: zero behavioral diff vs. your original on the namespace side — it's fully covered by #3352.

Your BypassPermissions auto-accept half survived the rebase intact as commit 4ede216c (was b6f2b21) — attribution preserved.

Doctor fixes (commit f7de604)

The Judge raised 6 concerns on the auto-accept logic. Addressed in a single fixup commit on top of yours:

  1. Hardcoded sleep 8 → poll loop. Replaced with _auto_accept_bypass_prompt() that polls tmux capture-pane every 1s up to a 15s budget, looking for \"Bypass Permissions mode\" or \"--dangerously-skip-permissions\" in the visible buffer. Detects the modal as soon as it's rendered (no race against slow boxes; no wasted time on fast boxes).

  2. Orphaned subprocess.Popen([\"bash\", \"-c\", ...]) → synchronous helper using _tmux(). No more detached bash process per spawn. Send-keys goes through the module's existing _tmux() wrapper, consistent with the rest of agent_spawn.py.

  3. Redundant import subprocess as _subprocess dropped. subprocess is already imported at module top.

  4. Keystrokes stay as Down Enter (no change — your choice matches PR Automatically accept Claude Code bypass permissions prompt #112's pattern; sending the literal \"2\" would be brittle if the menu reorders).

  5. Tests addedTestAutoAcceptBypassPrompt in tests/test_agent_spawn.py (6 cases): immediate detection, multi-iteration polling, timeout honored when modal never appears, env-var disable, alternate marker, and capture-pane error handling. All 6 pass; the 69 pre-existing test_agent_spawn.py tests are unaffected (one pre-existing TestValidateRole::test_role_in_claude_commands failure is unrelated to this PR).

  6. Gating via LOOM_AUTO_ACCEPT_BYPASS (default \"1\" = enabled). Set to \"0\" to disable on builds where --dangerously-skip-permissions already suppresses the modal. Documented in the helper's docstring.

Re-review

Labels updated to loom:review-requested. Merge state is now CLEAN. Ready for Judge.

(doctor pass by @rjwalters per maintainer takeover)

@rjwalters
Copy link
Copy Markdown
Owner

LGTM after Doctor takeover. Re-reviewed final state of the PR (commits 4ede216c + f7de6041).

Scope verification

  • Half 1 (namespace fix) cleanly dropped on rebase — gh pr diff no longer touches agent_monitor.py and the only agent_spawn.py hunks are the Bypass auto-accept block. Confirmed via diff scan: zero /loom: namespace lines, zero role_cmd changes.
  • Attribution to jperla preserved: 4ede216 shows Author: Joseph Perla <patches@jperla.com>.

The 6 Judge concerns, verified against the implementation

  1. Hardcoded sleep 8 → resolved. _auto_accept_bypass_prompt uses a bounded while elapsed < timeout loop (DEFAULT_BYPASS_POLL_TIMEOUT=15s, DEFAULT_BYPASS_POLL_INTERVAL=1s), polling _bypass_prompt_visible() which runs tmux capture-pane -S -200 and matches against BYPASS_PROMPT_MARKERS.
  2. Orphan subprocess.Popen with detached bash → resolved. The implementation uses the existing _tmux() helper synchronously. No Popen, no bash -c, no thread.
  3. Redundant import subprocess as _subprocess → resolved. Only one import subprocess (line 30); no shadow re-import.
  4. Literal "2" keystroke → resolved. _tmux("send-keys", "-t", session_name, "Down", "Enter") — arrow-navigates from default "1. No, exit" to "2. Yes, I accept", matching PR Automatically accept Claude Code bypass permissions prompt #112's pattern.
  5. No tests → resolved. TestAutoAcceptBypassPrompt adds 6 cases covering: immediate detection, multi-iteration polling (asserts exact sleep count), timeout honored (no send-keys emitted), env-var disable (asserts mock_tmux.call_count == 0), alternate --dangerously-skip-permissions marker, and returncode != 0 capture-pane handled gracefully. All 6 pass locally (uv run pytest tests/test_agent_spawn.py::TestAutoAcceptBypassPrompt -v → 6 passed).
  6. Gating → resolved. LOOM_AUTO_ACCEPT_BYPASS env var with default "1" (enabled); "0" short-circuits before any tmux interaction. test_disabled_by_env_var proves this is a true no-op (zero tmux calls, zero sleeps).

Additional checks

  • Polling is bounded — no infinite-wait potential; cap is 15s wall-clock by default.
  • Disabled path is a true no-op — env-var test asserts mock_tmux.call_count == 0.
  • Outer try/except Exception around the call in spawn_agent ensures a tmux quirk never breaks the spawn path itself.
  • Full test_agent_spawn.py run: 69 passed, 1 pre-existing failure (TestValidateRole::test_role_in_claude_commands) that reproduces on main — not introduced by this PR.
  • mergeStateStatus: CLEAN.

CI
Only Detect Changes ran (SUCCESS); the Rust/MCP/installer workflows correctly skipped because the change is Python-only. No regression — this is a known gap (Loom lacks a Python CI workflow), not a problem with this PR.

Approving.

@rjwalters rjwalters added loom:pr PR approved by Judge, ready for human to merge and removed loom:review-requested PR ready for Judge to review labels May 28, 2026
@rjwalters rjwalters merged commit c0d8838 into rjwalters:main May 28, 2026
5 checks passed
rjwalters pushed a commit that referenced this pull request May 28, 2026
Captures the 8 PRs that landed since v0.8.1: Tauri removal (#3353),
Claude Code 2.1+ compatibility hardening (#3352, #3356, #3348),
post-build quality gate (#3355), host-sleep readiness check (#3357),
and complementary builder/judge perf-guidance docs (#3351, #3354).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rjwalters added a commit that referenced this pull request May 29, 2026
When /loom:doctor (or any subagent that mutates a PR branch) ran on an
external-fork PR — i.e., a PR whose branch did not match
`feature/issue-<N>` — it had no dedicated worktree path to use and fell
through to `gh pr checkout <N>` in the orchestrator's main worktree. That
switched the orchestrator's HEAD off main and could leave behind
untracked files from the PR (concrete incident: v0.9.0 session,
jperla's #3348, src-tauri/ orphans).

This change:
- Adds `.loom/scripts/pr-worktree.sh` — creates an isolated worktree at
  `.loom/worktrees/pr-<PR_NUMBER>/` with a `.loom-managed` sentinel, then
  runs `gh pr checkout` from INSIDE the worktree so the PR branch never
  touches the orchestrator's HEAD.
- Updates `defaults/.claude/commands/loom/doctor.md` to document the
  branch-name heuristic (`^feature/issue-([0-9]+)$` -> issue worktree;
  anything else -> pr-worktree) and updates every `gh pr checkout` site
  to route through the right helper before any mutation.
- Tightens the branch-to-issue-number regex in `merge-pr.sh` from the
  loose trailing-digit heuristic to the strict
  `^feature/issue-([0-9]+)$` pattern, so branches like `release-1` or
  `fix-bug-42` are correctly classified as PR-style (not issue-style).
- Extends `merge-pr.sh` cleanup to recognize and remove
  `.loom/worktrees/pr-<N>/` in addition to `.loom/worktrees/issue-<N>/`.
- Adds regression test `test-pr-worktree-isolation.sh` covering branch
  classification, sentinel placement, and the invariant that the
  orchestrator's HEAD is unchanged after a doctor pass on `fix/foo-bar`.

`.claude/commands/` is a symlink to `defaults/.claude/commands/` in this
repo so the doctor.md update propagates automatically.

Closes #3358

Co-authored-by: Loom Worker <loom-reviewer@users.noreply.github.comecho>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

loom:pr PR approved by Judge, ready for human to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Spawn paths still emit bare /<role> after #3176; breaks on Claude Code 2.1+

2 participants