feat(sdk/plugin-claude): session-aware messaging (format + registry + daemon)#209
Conversation
…se A) Wrap outbound DMs in a SessionEnvelopeV1 superset (schema tinyplace.harness.session.v1) with a namespaced `tp` extension block, so plugin DMs interoperate with the harness-wrapper format while routing/correlation/auto metadata rides inside the encrypted body. No backend or sendMessage changes. - Add mcp/format.mjs: pure encode/decode. encodeEnvelope builds the superset; decodeBody tries the envelope path first, then falls back to the legacy AUTO_SENTINEL/re: sentinel header (preserved byte-for-byte, incl. SOH delimiters) and finally plain text. - send/auto_reply gain optional to_session + role; the receiver surfaces fromSession/role across inbox, check_reply, await_reply, send_and_wait, the channel push, and the durable queue. - Auto-responder replies back to the sender's session (to_session) when known. - Add envelope-test.mjs (round-trip incl. tp block, legacy fallback, plain text, wrapper interop) and a `test` script; copy the design doc into docs/. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hase B) Let one agent run multiple addressable Claude sessions. - Add mcp/registry.mjs: presence files at sessions/<agent>/<label>.json with a 10s heartbeat and liveness = fresh heartbeat AND pid alive. Label allocation prefers an explicit request, then the same harnessSessionId's prior label (stable across restarts), then the lowest-free claude:<n>. Stale files GC'd. - server.mjs: adopt() allocates a label, writes+heartbeats presence, and frees it on switch/teardown; use gains a `label` option; whoami shows label + live sessions; new `sessions` tool lists live sessions. Send-only responders stay out of the registry. - Add /tinyplace:sessions; update use/whoami command docs and README. - Tests: registry-test.mjs (liveness fresh/stale/pid-dead, allocation, reuse-by-harnessSessionId, GC) and sessions-test.mjs (two servers → claude:1 / claude:2, label reuse across restart) wired into `npm test`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e C) Give each agent a single process that owns the relay drain + Signal ratchet, so N sessions never race the mailbox or corrupt the shared ratchet. - hooks/agent-daemon.mjs: single owner via a CAS lock (mcp/daemon-lock.mjs) with takeover on death and idle-exit when the agent has no live sessions. It is the sole decryptor — drains the mailbox once, routes each message to the target session's inbox/ queue (to_session live → that session; dead → _unrouted/ held until it appears; none → primary/broadcast/drop per TINYPLACE_UNROUTED_POLICY), sends outbound jobs from _outbox/ (sole ratchet writer), and drives the auto-responder. - mcp/routing.mjs (routeTarget/enqueueRouted/drainInbox/redeliverUnrouted) and mcp/outbox.mjs (writeOutboxJob/claimOutboxJobs) — atomic-rename file queues. - server.mjs thin-client mode: on use, register presence + ensure a daemon; if one is live, stop touching the relay — send/auto_reply write outbox jobs, inbox/check_reply/await_reply/send_and_wait read the session's inbox/ queue (correlating on the in-body envelope message.id). Falls back to the original self-drain if the daemon is disabled (TINYPLACE_SESSION_DAEMON=off) or can't start — no regression for single-session use. whoami reports mode + daemon. - format.mjs: buildEnvelope returns the message id; decode surfaces messageId. - Tests: routing-test.mjs, lock-test.mjs (incl. a 3-way process race), and daemon-test.mjs (daemon boot + idle-exit + thin-client send/inbox/check_reply) wired into npm test. Existing adopt tests pin TINYPLACE_SESSION_DAEMON=off. Follow-up: contact-request surfacing (drainContacts / /tinyplace:contacts) is not in this base (PR tinyhumansai#207), so moving contact-polling into the daemon is deferred. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@sanil-23 is attempting to deploy a commit to the Vezures Team on Vercel. A member of the Team first needs to authorize it. |
|
Important Review skippedNo new commits to review since the last review. ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR adds session-aware messaging to the Claude plugin: session envelopes, per-agent daemon ownership and routing, filesystem-backed session presence and inbox/outbox state, plus updated MCP tools, tests, and docs. ChangesSession-Aware Messaging
Estimated code review effort: 5 (Critical) | ~120 minutes Sequence Diagram(s)sequenceDiagram
participant Client
participant MCPServer
participant Registry
participant AgentDaemon
participant Outbox
Client->>MCPServer: use(name, label)
MCPServer->>Registry: claimLabel + writePresence
Client->>MCPServer: send(to, body, to_session, role)
alt daemon mode
MCPServer->>Outbox: writeOutboxJob(envelope)
AgentDaemon->>Outbox: claimOutboxJobs()
else self mode
MCPServer->>MCPServer: sendMessage(envelope)
end
AgentDaemon-->>MCPServer: routed inbox / reply delivery
MCPServer-->>Client: inbox / check_reply / whoami
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 251e184fcb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const { auto, inReplyTo, text, fromSession, role, toSession } = decodeBody(rawMsg.text); | ||
| // A re-handshake ping only exists to re-run X3DH (done on decrypt); consume | ||
| // it silently so it never surfaces as a message. | ||
| if (text === RESET_SENTINEL) continue; | ||
| const msg = { ...rawMsg, text, inReplyTo }; | ||
| const waiterIndex = session.waiters.findIndex((w) => w.match(msg)); | ||
| if (waiterIndex !== -1) { | ||
| // Consumed by a synchronous waiter — don't also push. | ||
| const [waiter] = session.waiters.splice(waiterIndex, 1); | ||
| clearTimeout(waiter.timer); | ||
| waiter.resolve({ ...msg, _delivered: "waiter" }); | ||
| } else { | ||
| // Unsolicited inbound: buffer (poll fallback + check_reply source), push | ||
| // as a channel event so Claude reacts in real time, and — unless it is | ||
| // itself an auto-reply — enqueue it for the Stop-hook auto-responder. | ||
| session.buffer.push(msg); | ||
| void pushToChannel(msg); | ||
| if (!auto) { | ||
| enqueueInbound(session.address, msg); | ||
| enqueuedAny = true; | ||
| } | ||
| } | ||
| const msg = { ...rawMsg, text, inReplyTo, fromSession, role, toSession }; |
There was a problem hiding this comment.
Preserve envelope IDs when self-draining
When this fallback/self-drain path receives a SessionEnvelope, decodeBody exposes the envelope messageId, but the queued/buffered message keeps rawMsg.id instead. If a daemon-mode peer sends and waits, its returned correlation id is the envelope message.id; a self-mode recipient (daemon disabled or failed) will auto-reply with in_reply_to set to this relay id, so check_reply/send_and_wait on the daemon-mode sender never matches the reply even though it arrived.
Useful? React with 👍 / 👎.
| const target = payload.toSession; | ||
| if (target && live.has(target)) { |
There was a problem hiding this comment.
Redeliver held untargeted mail to a live primary
If the daemon receives an untargeted DM while no sessions are live (for example during the daemon's idle grace after the last Claude session exits), enqueueRouted stores it in _unrouted with toSession: null. When a session later comes online, this check only moves files with a truthy explicit target, so those default-primary messages stay in _unrouted forever and never reach the next primary session.
Useful? React with 👍 / 👎.
| if (e?.status === 403 && !contactRequested.has(job.to)) { | ||
| contactRequested.add(job.to); | ||
| try { await client.contacts.request(job.to); } catch { /* best-effort */ } |
There was a problem hiding this comment.
Resolve daemon contact requests to crypto IDs
In daemon mode this is the only 403 not_a_contact recovery path, but it POSTs the raw job.to value. For normal send inputs like @handle or a base64 messaging key, sendMessage first resolves that to the relay key while the contacts endpoint is keyed by the peer's cryptoId (the foreground path uses toCryptoId), so the contact request fails silently and the job is requeued/retried indefinitely instead of ever creating the pending contact.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 11
🧹 Nitpick comments (3)
sdk/plugin-claude/daemon-test.mjs (1)
68-82: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDaemon process not force-killed if the idle-exit assertion fails.
If
exitednever becomestrue(Line 81 assertion fails), the spawneddaemonprocess (Line 68-71) is left running with no cleanup, since the script only relies on the process self-exiting. On CI, a failing assertion here would leave a lingering process consuming the fakeTINYPLACE_CLAUDE_HOME/lock resources for the rest of the test run.🔧 Suggested fix: force-kill on failure
for (let i = 0; i < 30 && !exited; i++) await sleep(100); expect("daemon idle-exits when no live sessions", exited); +if (!exited) daemon.kill("SIGKILL"); expect("daemon releases its lock on exit", !lock.daemonLive(AGENT));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/plugin-claude/daemon-test.mjs` around lines 68 - 82, The daemon test currently assumes the spawned process will always self-exit, so a failed idle-exit assertion can leave a lingering child process behind. Update the daemon lifecycle in daemon-test.mjs around the spawn/exit handling to ensure the process is force-killed during cleanup if exited never becomes true, and make the cleanup robust even when the final expectations fail. Use the existing daemon and exited variables plus the lock.daemonLive(AGENT) checks to locate the spot.sdk/plugin-claude/mcp/server.mjs (2)
933-943: 🎯 Functional Correctness | 🔵 TrivialSelf-mode reply matching for
send_and_waitdoesn't checkfromSession.
matchFnat Line 942 matches any replyfrom === recipientKey, regardless of which session of a multi-session peer sent it. If the peer runs multiple sessions under the same wallet and more than one replies, an uncorrelated match (inReplyTo == null) from the wrong session could be picked up. This is inherent to self mode having no daemon-side routing, but worth a short note since session-awareness is the point of this PR.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/plugin-claude/mcp/server.mjs` around lines 933 - 943, The self-mode reply matcher in send_and_wait is only checking from and inReplyTo, so it can accept an uncorrelated reply from the wrong session when the same peer has multiple sessions. Update the matchFn branch in server.mjs to include fromSession-based correlation alongside recipientKey, so only replies from the intended session can match when inReplyTo is null. Keep the daemon branch unchanged.
933-948: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win
send_and_waitshould acceptto_session/roleand return the same send metadata assend.dispatchSend({ to, text: body })drops session/role targeting, so this path can’t address a specific peer session; thesentpayload also omitsvia/fromSession/toSession.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/plugin-claude/mcp/server.mjs` around lines 933 - 948, The send_and_wait flow in server.mjs is dropping targeting and response metadata compared to send. Update the send_and_wait handler around dispatchSend, waitFor, and the final ok payload to accept and forward to_session and role into dispatchSend, and make the returned sent object include the same metadata as send, including via, fromSession, and toSession. Ensure the reply matching logic still uses the correct recipient/session context when to_session or role is provided.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@sdk/plugin-claude/docs/session-aware-messaging.md`:
- Around line 60-83: The session envelope example is putting the Claude label
into scope.wrapper_session_id, but that field should remain the wrapper session
identifier. Update the session-aware messaging example to keep
wrapper_session_id aligned with the core SessionEnvelope contract in harness.ts
and harness-wrapper.ts, and move any Claude-specific label/session routing into
the namespaced tp extension block instead. Make sure the fields shown in the
example still reflect the decodeBody structured path and the standard envelope
semantics.
In `@sdk/plugin-claude/hooks/agent-daemon.mjs`:
- Around line 94-101: The auto-response queue payload is dropping the target
session information, so `enqueueForAutoResponse` should carry `toSession`
through when writing the queued message and `drainInbound` should continue
decoding/reading it consistently. Update the JSON object built in
`enqueueForAutoResponse` to include `toSession` alongside the existing fields,
and make sure any enqueue/dequeue handling in `agent-daemon.mjs` preserves that
property so targeted inbound messages remain correlated to the addressed
session.
In `@sdk/plugin-claude/hooks/respond-batch.mjs`:
- Around line 38-51: The prompt in respond-batch.mjs is directly interpolating
attacker-controlled msg.fromSession into the auto_reply tool call, so sanitize
or validate it before use. Update the message template around the
toSessionArg/fromSession handling to only include a value that matches the
expected session-id shape, or omit it entirely when invalid, and keep the
auto_reply invocation free of raw unescaped free text.
In `@sdk/plugin-claude/mcp/daemon-lock.mjs`:
- Around line 85-92: The lock-stealing path in the daemon lock code can delete a
lock that another daemon just created, and the release path can also remove a
lock without proving ownership. Update the acquire/steal flow around readLock(),
isDaemonLive(), and the rmSync(path) branch to use an atomic claim step (for
example, rename-to-claim) and only proceed after validating the claimed lock
contents. In releaseLock(), verify the lock still belongs to the current PID
before removing it, and avoid deleting partial or ambiguous locks that were not
proven to be owned by this process.
In `@sdk/plugin-claude/mcp/format.mjs`:
- Around line 101-116: decodeEnvelope currently passes attacker-controlled
scope.wrapper_session_id and tp.to_session through as fromSession/toSession
without validation. Update decodeEnvelope in format.mjs to validate and
normalize these fields at decode time, using a strict safe-label pattern (or
nulling/rejecting anything outside it) so only short token-like values are
returned. Keep the fix localized to decodeEnvelope and ensure downstream
consumers of fromSession/toSession can rely on safe-by-construction values.
In `@sdk/plugin-claude/mcp/outbox.mjs`:
- Around line 31-58: The outbox claim flow in claimOutboxJobs currently skips
hidden .sending-* files, so crashed or exited daemons can leave jobs permanently
stuck. Add stale-claim recovery in claimOutboxJobs before the readdirSync
filter/listing step: scan for .sending- files in the agent outbox, compare their
age against a safe timeout, and rename stale claims back to their original .json
filenames so they can be reclaimed on the next pass. Keep the atomic renameSync
claim behavior and only requeue files that are clearly older than an in-flight
send attempt.
In `@sdk/plugin-claude/mcp/registry.mjs`:
- Around line 114-150: `allocateLabel` and `writePresence` are vulnerable to a
concurrent-start race because label selection happens before the presence file
is created, so two sessions can claim the same `claude:<n>` or reused label. Fix
this by making the label claim and presence creation atomic in `registry.mjs`,
using a lock/claim-and-write approach similar to `daemon-lock.mjs`, and have
`server.mjs` retry when a collision is detected instead of assuming the chosen
label is still free.
In `@sdk/plugin-claude/mcp/routing.mjs`:
- Around line 84-110: redeliverUnrouted currently only moves held messages when
payload.toSession is set, so untargeted entries from enqueueRouted() never leave
_unrouted after a session becomes live. Update redeliverUnrouted(agentAddress)
in routing.mjs to also recognize no-target payloads and re-evaluate them against
the current routing policy (using the same liveSessions/inboxDir path logic) so
they can be delivered once a matching live session exists; keep the existing
target-specific flow for payload.toSession and use the redelivery path to handle
both cases.
In `@sdk/plugin-claude/mcp/server.mjs`:
- Around line 604-624: The presence lifecycle in registerPresence can desync
because presenceWritten is only set on the initial write, while later heartbeat
writes may succeed after a transient failure. Update registerPresence and the
heartbeat callback so any successful writePresence call also marks
session.presenceWritten true (and preserve it on subsequent successes), and
ensure teardownListener/removePresence can clean up once the file has actually
been written. Keep the logic centered around registerPresence,
session.presenceWritten, and the heartbeat setInterval write path.
- Around line 429-459: The daemon respawn path in drainDaemonInbox currently
calls ensureDaemonSpawn whenever daemonLive is false, which can cause repeated
spawn attempts on every POLL_INTERVAL_MS tick. Add a cooldown/backoff guard
before respawning, using the same defensive pattern as maybeRecoverSession with
RECOVER_COOLDOWN_MS or an equivalent timestamp check, so repeated failures do
not trigger a spawn storm. Keep the change localized to drainDaemonInbox and the
related daemon-spawn helper/state used to track the last respawn attempt.
In `@sdk/plugin-claude/README.md`:
- Around line 68-74: The README tool descriptions are still phrased around a
session starting its own listener, but the daemon now owns relay draining and
routing. Update the `use` and `whoami` entries in the tool table to describe
`use` as registering/adopting the session with the daemon and `whoami` as
reporting daemon state, while keeping the references to the existing tool names
so the docs match the new daemon-first flow.
---
Nitpick comments:
In `@sdk/plugin-claude/daemon-test.mjs`:
- Around line 68-82: The daemon test currently assumes the spawned process will
always self-exit, so a failed idle-exit assertion can leave a lingering child
process behind. Update the daemon lifecycle in daemon-test.mjs around the
spawn/exit handling to ensure the process is force-killed during cleanup if
exited never becomes true, and make the cleanup robust even when the final
expectations fail. Use the existing daemon and exited variables plus the
lock.daemonLive(AGENT) checks to locate the spot.
In `@sdk/plugin-claude/mcp/server.mjs`:
- Around line 933-943: The self-mode reply matcher in send_and_wait is only
checking from and inReplyTo, so it can accept an uncorrelated reply from the
wrong session when the same peer has multiple sessions. Update the matchFn
branch in server.mjs to include fromSession-based correlation alongside
recipientKey, so only replies from the intended session can match when inReplyTo
is null. Keep the daemon branch unchanged.
- Around line 933-948: The send_and_wait flow in server.mjs is dropping
targeting and response metadata compared to send. Update the send_and_wait
handler around dispatchSend, waitFor, and the final ok payload to accept and
forward to_session and role into dispatchSend, and make the returned sent object
include the same metadata as send, including via, fromSession, and toSession.
Ensure the reply matching logic still uses the correct recipient/session context
when to_session or role is provided.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 17ed3ce8-fcee-42dc-9419-125527cc9179
📒 Files selected for processing (22)
sdk/plugin-claude/README.mdsdk/plugin-claude/assignment-test.mjssdk/plugin-claude/commands/sessions.mdsdk/plugin-claude/commands/use.mdsdk/plugin-claude/commands/whoami.mdsdk/plugin-claude/daemon-test.mjssdk/plugin-claude/docs/session-aware-messaging.mdsdk/plugin-claude/env-adopt-test.mjssdk/plugin-claude/envelope-test.mjssdk/plugin-claude/hooks/agent-daemon.mjssdk/plugin-claude/hooks/respond-batch.mjssdk/plugin-claude/lock-test.mjssdk/plugin-claude/mcp/daemon-lock.mjssdk/plugin-claude/mcp/format.mjssdk/plugin-claude/mcp/outbox.mjssdk/plugin-claude/mcp/registry.mjssdk/plugin-claude/mcp/routing.mjssdk/plugin-claude/mcp/server.mjssdk/plugin-claude/package.jsonsdk/plugin-claude/registry-test.mjssdk/plugin-claude/routing-test.mjssdk/plugin-claude/sessions-test.mjs
… races, correctness)
Security:
- Validate session labels to a safe token shape (^[\w:-]{1,32}$) at decode
(format.mjs) and before interpolating fromSession into the auto-responder LLM
prompt (respond-batch.mjs) — prevents argument-injection via the attacker-
controlled envelope label.
- Carry the routing label in tp.from_session and keep scope.wrapper_session_id
as a unique wrapper-session id (aligned with the shared SessionEnvelope
contract); decode falls back to wrapper_session_id for older bodies.
Concurrency / data integrity:
- daemon-lock: steal a stale lock via rename-to-claim + validate-claimed-
contents so a racing stealer can't clobber a freshly-created lock; releaseLock
only removes a lock provably owned by this pid.
- registry: new atomic claimLabel (exclusive-create CAS + mid-write re-read) so
concurrent session starts can't grab the same label; adopt() uses it.
- outbox: requeue stale `.sending-*` claims orphaned by a crashed daemon.
Correctness / stability:
- Unify reply correlation on the in-body envelope message.id in BOTH modes, so a
daemon-mode sender and a self-mode receiver still correlate (self-drain now
preserves the envelope id).
- routing.redeliverUnrouted re-evaluates held mail against current policy, so
untargeted messages held while no session was live reach the primary once one
appears.
- Daemon respawn from a session poll is cooldown-guarded (no spawn storm);
presence heartbeat marks presenceWritten so teardown always cleans up.
- Daemon resolves @handle/base64-key to cryptoId before a contact request
(shared mcp/address.mjs); carries toSession in the auto-response queue.
- send_and_wait accepts to_session/role, returns via/fromSession/toSession, and
only matches a reply from the targeted peer session.
- README/docs describe the daemon-first flow; daemon-test force-kills on a failed
idle-exit assertion.
Tests: +label validation, untargeted-unrouted redelivery, concurrent stale-lock
steal race, concurrent claimLabel (no collision), stale-outbox recovery. Full
offline suite green (130 checks); live staging smoke unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thanks @coderabbitai and Codex — addressed all findings in d34cf0c. Summary: Security
Concurrency / data integrity
Correctness / stability
Offline suite is green (130 checks, incl. the new race/redelivery/validation tests); live staging smoke unchanged (two sessions → |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
set -euo pipefail
git log --oneline -5
echo "---format.mjs label validation---"
rg -n "from_session|to_session|wrapper_session_id|\\^\\[\\\\w" sdk/plugin-claude/mcp/format.mjs || true
echo "---daemon-lock.mjs steal logic---"
sed -n '1,140p' sdk/plugin-claude/mcp/daemon-lock.mjs
echo "---registry.mjs claimLabel---"
rg -n "claimLabel|allocateLabel|writePresence" sdk/plugin-claude/mcp/registry.mjs
echo "---outbox.mjs stale claims---"
rg -n "sending-|claimOutboxJobs" sdk/plugin-claude/mcp/outbox.mjs
echo "---respond-batch.mjs validation---"
rg -n "fromSession|\\^\\[\\\\w" sdk/plugin-claude/hooks/respond-batch.mjsLength of output: 8596
This all looks solid — nice work tightening the concurrency edges and the injection surface. Since the offline suite is green and staging smoke confirms the label allocation ( [resolve] ᕦ(ò_óˇ)ᕤ |
…o session-aware-messaging # Conflicts: # sdk/plugin-claude/mcp/server.mjs
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
sdk/plugin-claude/mcp/registry.mjs (1)
192-193: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winDo not fall back to non-atomic presence writes.
If the CAS retry budget is exhausted,
writePresencecan still overwrite a live session file and orphan another session. Fail fast and let startup retry instead.Proposed fix
- // Extremely unlikely: fall back to a plain (non-atomic) write. - return writePresence(agentAddress, { label: allocateLabel(agentAddress, { requested: req, harnessSessionId }), harnessSessionId, cwd, startedAt }); + throw new Error(`Unable to atomically claim a session label for ${agentAddress}; retry session start`);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/plugin-claude/mcp/registry.mjs` around lines 192 - 193, The fallback in registry.mjs after the CAS retry budget is exhausted should not call writePresence, because it can overwrite an active session; update the presence-write path around writePresence/allocateLabel to fail fast instead of doing a non-atomic write, and let startup retry handle recovery. Keep the atomic CAS logic in place and return or throw an error when retries are exhausted so no live session file is clobbered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@sdk/plugin-claude/registry-test.mjs`:
- Around line 107-112: The helper that runs the spawned process is resolving too
early by listening to the child process exit event, which can fire before all
stdout data has been fully read. Update the Promise logic in the spawn-based
helper to resolve on the child’s close event instead of exit, and keep the
existing stdout accumulation so the final out.trim() includes all buffered
output before resolve runs.
---
Duplicate comments:
In `@sdk/plugin-claude/mcp/registry.mjs`:
- Around line 192-193: The fallback in registry.mjs after the CAS retry budget
is exhausted should not call writePresence, because it can overwrite an active
session; update the presence-write path around writePresence/allocateLabel to
fail fast instead of doing a non-atomic write, and let startup retry handle
recovery. Keep the atomic CAS logic in place and return or throw an error when
retries are exhausted so no live session file is clobbered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cad39283-cd24-4e0c-b3d0-071e151985e8
📒 Files selected for processing (16)
sdk/plugin-claude/README.mdsdk/plugin-claude/daemon-test.mjssdk/plugin-claude/docs/session-aware-messaging.mdsdk/plugin-claude/envelope-test.mjssdk/plugin-claude/hooks/agent-daemon.mjssdk/plugin-claude/hooks/respond-batch.mjssdk/plugin-claude/lock-test.mjssdk/plugin-claude/mcp/address.mjssdk/plugin-claude/mcp/daemon-lock.mjssdk/plugin-claude/mcp/format.mjssdk/plugin-claude/mcp/outbox.mjssdk/plugin-claude/mcp/registry.mjssdk/plugin-claude/mcp/routing.mjssdk/plugin-claude/mcp/server.mjssdk/plugin-claude/registry-test.mjssdk/plugin-claude/routing-test.mjs
✅ Files skipped from review due to trivial changes (1)
- sdk/plugin-claude/docs/session-aware-messaging.md
🚧 Files skipped from review as they are similar to previous changes (11)
- sdk/plugin-claude/routing-test.mjs
- sdk/plugin-claude/hooks/respond-batch.mjs
- sdk/plugin-claude/mcp/outbox.mjs
- sdk/plugin-claude/mcp/routing.mjs
- sdk/plugin-claude/daemon-test.mjs
- sdk/plugin-claude/mcp/format.mjs
- sdk/plugin-claude/envelope-test.mjs
- sdk/plugin-claude/README.md
- sdk/plugin-claude/mcp/daemon-lock.mjs
- sdk/plugin-claude/hooks/agent-daemon.mjs
- sdk/plugin-claude/mcp/server.mjs
| return new Promise((resolve) => { | ||
| const c = spawn(process.execPath, ["--input-type=module", "-e", src], { stdio: ["ignore", "pipe", "ignore"] }); | ||
| let out = ""; | ||
| c.stdout.on("data", (d) => (out += d.toString())); | ||
| c.on("exit", () => resolve(out.trim())); | ||
| }); |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== registry-test.mjs excerpt ==\n'
sed -n '1,170p' sdk/plugin-claude/registry-test.mjs | cat -n
printf '\n== search for similar child-process patterns ==\n'
rg -n 'on\("(exit|close)"|spawn\(process\.execPath' sdk/plugin-claude -g '!**/node_modules/**'
printf '\n== minimal Node event-order probe ==\n'
node - <<'JS'
const { spawn } = require('child_process');
const c = spawn(process.execPath, ['-e', `
process.stdout.write('x');
`], { stdio: ['ignore', 'pipe', 'ignore'] });
let out = '';
c.stdout.on('data', d => {
out += d.toString();
console.log('data', JSON.stringify(out));
});
c.on('exit', (code, sig) => {
console.log('exit', { code, sig, out });
});
c.on('close', (code, sig) => {
console.log('close', { code, sig, out });
});
JSRepository: tinyhumansai/tiny.place
Length of output: 10339
🌐 Web query:
Node.js child_process exit close event ordering stdout drained docs
💡 Result:
In Node.js, the close event is emitted only after the child process has ended and all of its stdio streams have been closed [1][2]. This is a key distinction from the exit event, which is emitted as soon as the child process ends, even if stdio streams (like stdout or stderr) are still open [2][3]. Because the close event waits for the stdio streams to be closed, it is the appropriate event to listen for if you need to ensure that all data has been received and processed from a child process's stdout or stderr [4]. While the exit event signifies that the process itself has finished, there is no guarantee that all data currently in the pipe has been flushed or read at that moment [3]. The close event effectively ensures the drainage of these streams [4]. It is important to note: 1. Ordering: The exit event always fires before the close event (if both are emitted) [2][3]. 2. Use Case: If your application relies on reading the entirety of the child process's output, you should rely on the close event rather than the exit event to confirm that all output streams have been fully drained and closed [4]. Top results: [1][4][2][3]
Citations:
- 1: https://nodejs.org/api/child_process.html
- 2: https://nodejs.org/docs/latest-v26.x/api/child_process.html
- 3: https://github.com/nodejs/node/blob/main/doc/api/child_process.md
- 4: https://stackoverflow.com/questions/37522010/difference-between-childprocess-close-exit-events
Resolve the child output on close. Use close instead of exit so stdout is fully drained before resolve(out.trim()) runs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@sdk/plugin-claude/registry-test.mjs` around lines 107 - 112, The helper that
runs the spawned process is resolving too early by listening to the child
process exit event, which can fire before all stdout data has been fully read.
Update the Promise logic in the spawn-based helper to resolve on the child’s
close event instead of exit, and keep the existing stdout accumulation so the
final out.trim() includes all buffered output before resolve runs.
Address CodeRabbit follow-up review: - registry.claimLabel throws when the CAS retry budget is exhausted instead of falling back to a non-atomic writePresence that could clobber a live session's file — the caller retries session start. - Spawn-based test helpers (registry-test, lock-test) resolve on the child's 'close' event (stdout fully flushed) rather than 'exit', which can fire before all output is read. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed the follow-up review in 5c5dd1a:
Offline suite green (130 checks). Also note: the merge from |
|
@coderabbitai review |
✅ Action performedReview finished.
|
Summary
Implements session-aware messaging for the tiny.place Claude plugin (
sdk/plugin-claude) per the design doc (added atdocs/session-aware-messaging.md). Lets N Claude sessions run under one agent, each individually addressable by a session label (claude:1,claude:2, …), withSessionEnvelope-compatible message bodies and a per-agent daemon that routes messages to the right session.No backend changes;
sendMessage/MessageEnvelopeuntouched — all metadata rides inside the encrypted body.Built in three independently-tested phases (design §16):
Phase A — message format (
mcp/format.mjs)SessionEnvelopeV1superset (schematinyplace.harness.session.v1) plus a namespacedtpblock, so they interoperate with the harness-wrapper format.decodeBodytries the envelope path first, then falls back to the legacyAUTO_SENTINEL/re:sentinel (preserved byte-for-byte, including the SOH delimiters) and finally plain text.send/auto_replygain optionalto_session+role; the receiver surfacesfromSession/role.Phase B — session registry (
mcp/registry.mjs)sessions/<agent>/<label>.jsonwith a 10s heartbeat; liveness = fresh heartbeat and pid alive; stale files GC'd.harness_session_idacross restarts → lowest-freeclaude:<n>.sessionstool +/tinyplace:sessions;whoamishows label + live sessions.Phase C — per-agent daemon (
hooks/agent-daemon.mjs)mcp/daemon-lock.mjs), with takeover-on-death and idle-exit.inbox/queue (to_sessionlive → that session; dead →_unrouted/; none → primary/broadcast/drop). Sole ratchet writer: sends_outbox/jobs. Drives the auto-responder.message.id.TINYPLACE_SESSION_DAEMON=off) or can't start, a session reverts to the original per-session self-drain — no regression for single-session use.§15 open-question resolutions (per design defaults)
harness_session_id=CLAUDE_CODE_SESSION_ID; unlabeled inbound → primary session; daemon↔session transport = file queues; brief drain gap on handoff accepted.Testing
Offline suite (
npm test, 111 checks) — all green:envelope-test(round-trip incl. tp block, legacy fallback, plain text, wrapper interop)registry-test(liveness fresh/stale/pid-dead, allocation, reuse-by-harnessSessionId, GC)sessions-test(two servers → claude:1/claude:2, label reuse across restart)routing-test(to_session live/dead/unset → correct queue; broadcast/drop; unrouted redelivery)lock-test(CAS incl. a real 3-way process race → exactly one winner)daemon-test(daemon boot + idle-exit + thin-client send/inbox/check_reply)smoke/assignment/env-adoptregressions.Live staging smoke: two sessions of one agent register as
claude:1/claude:2, both indaemonmode withkeysPublished:true— the daemon boots, acquires the lock, and publishes keys against the real backend.Follow-up
Contact-request surfacing (
drainContacts//tinyplace:contacts) is not in this base (PR #207), so moving contact-polling into the daemon is deferred to a follow-up. A full peer-to-peer liveto_sessiondelivery test (two agents + contact acceptance) is left for a manual/E2E pass; the daemon 403 path already auto-requests contacts.🤖 Generated with Claude Code
Summary by CodeRabbit
sessionstool for listing active sessions.send/auto_reply/send_and_waitwithto_sessiontargeting androle-aware reply routing.use,whoami, and session command docs; added a session-aware messaging guide.