Harden host + agent-runner from health audit findings #2732

Open: caburi00 wants to merge 1 commit into nanocoai:main from caburi00:audit-fixes-2026-06

caburi00 commented Jun 11, 2026

Fixes from a multi-agent health audit (adversarially verified). Scoped to upstream core; WhatsApp adapter fixes (skill-managed in core) and approval-click authorization (already implemented upstream) are intentionally excluded. Rebased onto latest main — 1 commit, 19 files, typecheck + tests green.

Container lifecycle

  • realpath-resolve bind-mount sources so the groups/data ext4 symlinks are followed and drvfs never enters the mount path (fixes Docker Desktop stale-staging crash-loops, exit 127)
  • crash-on-spawn circuit breaker (decideCrashExit) — a broken image backs off and pauses instead of respawning every 60s forever
  • enforce MAX_CONCURRENT_CONTAINERS in wakeContainer
  • killContainer falls back to daemon-level docker kill before the CLI client
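
For reviewers, a minimal sketch of the crash-breaker idea, not the code in this PR. Only the name decideCrashExit comes from the change; the signature, the state shape, the option names, and the thresholds below are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the PR names decideCrashExit but its real signature is not shown here.
interface CrashState {
  consecutiveCrashes: number; // crash-on-spawn exits seen in a row
}

type CrashDecision =
  | { action: "respawn"; delayMs: number; state: CrashState }
  | { action: "pause"; state: CrashState };

interface BreakerOptions {
  crashWindowMs: number; // a non-zero exit sooner than this after spawn counts as crash-on-spawn
  baseDelayMs: number; // first backoff delay
  maxDelayMs: number; // backoff ceiling
  maxCrashes: number; // pause once this many consecutive crashes are seen
}

function decideCrashExit(
  state: CrashState,
  uptimeMs: number,
  exitCode: number,
  opts: BreakerOptions,
): CrashDecision {
  const crashedOnSpawn = exitCode !== 0 && uptimeMs < opts.crashWindowMs;
  if (!crashedOnSpawn) {
    // A clean exit or a long-lived run resets the breaker.
    return { action: "respawn", delayMs: 0, state: { consecutiveCrashes: 0 } };
  }
  const consecutiveCrashes = state.consecutiveCrashes + 1;
  if (consecutiveCrashes >= opts.maxCrashes) {
    // Stop respawning until something external resumes the group.
    return { action: "pause", state: { consecutiveCrashes } };
  }
  // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
  const delayMs = Math.min(
    opts.baseDelayMs * 2 ** (consecutiveCrashes - 1),
    opts.maxDelayMs,
  );
  return { action: "respawn", delayMs, state: { consecutiveCrashes } };
}
```

Keeping the decision a pure function of (state, uptime, exit code) is what makes it easy to unit test without spawning containers.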

Agent-runner

  • follow-up poller claims only the messages it will push (no orphaned 'processing' rows)
  • apply the accumulate (trigger=1) gate to follow-ups
  • thread message origin (fromMe) through edit/reaction content
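
To illustrate the first bullet, a sketch of "decide first, then claim exactly that set". The row shape, the status values other than 'processing', and the willPush predicate are hypothetical stand-ins; the real poller works against the database rather than an in-memory array.

```typescript
// Hypothetical in-memory stand-in for message rows; the real schema is not shown in this PR.
type Status = "pending" | "processing";

interface Row {
  id: number;
  status: Status;
}

// Marks as 'processing' only the rows that the caller will actually push.
// Rows rejected by willPush stay 'pending', so nothing is left orphaned.
function claimForPush(rows: Row[], willPush: (row: Row) => boolean): Row[] {
  const claimed: Row[] = [];
  for (const row of rows) {
    if (row.status !== "pending") continue; // already claimed elsewhere
    if (!willPush(row)) continue; // leave it for a later poll
    row.status = "processing";
    claimed.push(row);
  }
  return claimed;
}
```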

Delivery + DB

  • order outbound by (timestamp, seq) so same-second multi-part replies stay ordered (host + container)
  • add idx_messages_in_due for the hot countDueMessages poll
  • guard migration013 ALTERs (idempotent)
  • transactional FK-safe cascade deletes for agent/messaging groups
  • correct misleading delivery-retry comment
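
The (timestamp, seq) ordering can be sketched as a two-key comparator. This is an illustration only: the Outbound shape is hypothetical, and it assumes timestamps have one-second resolution and that seq increases in insertion order.

```typescript
// Hypothetical outbound row; field names beyond timestamp/seq are illustrative.
interface Outbound {
  timestamp: number; // assumed: whole seconds, so multi-part replies can tie
  seq: number; // assumed: monotonically increasing tie-breaker
  text: string;
}

// Sort by timestamp first; fall back to seq only when timestamps are equal.
function byTimestampThenSeq(a: Outbound, b: Outbound): number {
  return a.timestamp - b.timestamp || a.seq - b.seq;
}
```

Sorting on timestamp alone leaves the relative order of same-second rows up to the storage layer, which is how parts of one reply can swap places.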

Router

  • cache compiled engage_pattern + cap input length (ReDoS guard)
  • invalid pattern fails closed with a one-shot warn (was fail-open)
  • run the command gate only when engaging (accumulate context stays silent)
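
A rough sketch of the first two router bullets together, under stated assumptions: the function name, the cap value, and truncating (rather than rejecting) over-long input are all illustrative choices, and the warn callback stands in for whatever logger the router uses.

```typescript
const MAX_INPUT_LENGTH = 2000; // hypothetical cap; the PR's actual limit is not stated here

// null marks a pattern that failed to compile (and has already been warned about).
const patternCache = new Map<string, RegExp | null>();

function matchesEngagePattern(
  pattern: string,
  input: string,
  warn: (message: string) => void,
): boolean {
  let compiled = patternCache.get(pattern);
  if (compiled === undefined) {
    let fresh: RegExp | null = null;
    try {
      fresh = new RegExp(pattern);
    } catch {
      // Warn once: later calls hit the cached null and skip this branch.
      warn(`invalid engage_pattern, treating as no match: ${pattern}`);
    }
    patternCache.set(pattern, fresh);
    compiled = fresh;
  }
  if (compiled === null) return false; // fail closed
  // Bounding the input bounds worst-case backtracking work.
  return compiled.test(input.slice(0, MAX_INPUT_LENGTH));
}
```

The length cap limits how bad a pathological pattern can get; it does not make an arbitrary pattern safe on its own.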

Scheduling

  • recurring series survives a failed occurrence instead of dying silently
  • anchor next run on scheduled fire time to prevent drift
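
The drift fix can be sketched as follows, assuming a fixed-interval series. The function name and parameters are hypothetical, and skipping occurrences that are already in the past is an assumption of this sketch, not something the PR states.

```typescript
// Computes the next fire time from the *scheduled* time of the occurrence that
// just ran, not from the moment it finished, so lateness does not accumulate.
function nextRunAt(scheduledAtMs: number, intervalMs: number, nowMs: number): number {
  let next = scheduledAtMs + intervalMs;
  if (next <= nowMs) {
    // Assumption: occurrences already in the past are skipped, staying on the original grid.
    const missed = Math.floor((nowMs - next) / intervalMs) + 1;
    next += missed * intervalMs;
  }
  return next;
}
```

For example, an hourly job scheduled for 09:00 that actually runs at 09:00:02 is next due at 10:00:00, whereas anchoring on the run time would give 10:00:02 and drift a little more each cycle. Calling this regardless of whether the occurrence succeeded is one way to keep a series alive after a failure.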

Ops

  • only colorize logs on a TTY so the service log file is greppable
  • non-destructive startup reconciliation of orphan session folders
  • correct stale schema.ts header to point at migrations
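
The TTY check amounts to something like the sketch below. The helper name is hypothetical; in Node a caller would typically pass `Boolean(process.stdout.isTTY)` as the last argument.

```typescript
// Wraps text in an ANSI SGR color code only when writing to a terminal.
// When output is redirected to a file, the text is returned unchanged so
// escape sequences never end up in the service log.
function colorize(text: string, ansiCode: number, isTTY: boolean): string {
  return isTTY ? `\u001b[${ansiCode}m${text}\u001b[0m` : text;
}
```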

Adds unit tests for the crash breaker.

🤖 Generated with Claude Code

Fixes from a multi-agent health audit (adversarially verified). Scoped to
upstream core; WhatsApp adapter fixes and the approval-click authorization
(already implemented upstream) are intentionally excluded.

Container lifecycle (container-runner.ts, container-runtime.ts):
- realpath-resolve bind-mount sources so the groups/data ext4 symlinks are
  followed and drvfs never enters the mount path (fixes Docker Desktop
  stale-staging crash-loops, exit 127)
- crash-on-spawn circuit breaker (decideCrashExit) so a broken image backs off
  and pauses instead of respawning every 60s forever
- enforce MAX_CONCURRENT_CONTAINERS in wakeContainer
- killContainer falls back to daemon-level `docker kill` before the CLI client

Agent-runner (poll-loop.ts, db/messages-out.ts, mcp-tools/core.ts):
- follow-up poller claims only messages it will push (no orphaned 'processing')
- apply the accumulate (trigger=1) gate to follow-ups
- thread message origin (fromMe) through edit/reaction content

Delivery + DB:
- order outbound by (timestamp, seq) so same-second multi-part replies stay
  ordered (host + container)
- add idx_messages_in_due for the hot countDueMessages poll
- guard migration013 ALTERs (idempotent)
- delete FK dependents in a transaction for agent/messaging group deletes
- correct misleading delivery-retry comment

Router:
- cache compiled engage_pattern + cap input length (ReDoS guard)
- invalid pattern fails closed with a one-shot warn (was fail-open)
- run the command gate only when engaging (accumulate context stays silent)

Scheduling:
- recurring series survives a failed occurrence instead of dying silently
- anchor next run on scheduled fire time to prevent drift

Ops:
- only colorize logs on a TTY so the service log file is greppable
- non-destructive startup reconciliation of orphan session folders
- correct stale schema.ts header to point at migrations

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>