
claude code bug report #55

Description

@jonalm

I don't know if this is helpful, but below is claude code's own report of some buggy behaviour I encountered.

Kaimon MCP — bug report

Date: 2026-06-23
Context: Driving an iterative Julia loop via the kaimon MCP from an automated coding agent — "edit a source file → re-evaluate in the session → inspect the result", with many small edits to a single package source file and a re-evaluation after each batch.

Identifiers below are placeholders: MyPkg = the package under development (a path-dev-ed Julia package with its own Project.toml), at <project-root>. Session keys and PIDs are reproduced as observed (they are random/local and not sensitive).

Environment

  • OS: macOS (Darwin 23.5), Apple Silicon (arm64-darwin).
  • Julia: 1.11, juliaup-managed.
  • Kaimon package: ~/.julia/packages/Kaimon/6N7iB/ (version hash 6N7iB). Stacktraces point into ~/.julia/packages/Kaimon/6N7iB/src/gate.jl.
  • Project under test: a path-dev-ed Julia package (MyPkg) with a few registered deps; nothing exotic.
  • Concurrent unrelated session (stable throughout): key 0f5b58b7, agent-spawned for a different project, PID 45258, up ~7.5h. It never misbehaved — only the MyPkg session(s) did. Useful as a control.

TL;DR — five distinct symptoms

  1. Wrong working directory in a fresh session's REPL: pwd() was an unrelated local project directory, not the project_path passed to start_session.
  2. Revise never hot-reloaded edits to package source in the persistent session — even after an explicit Revise.revise(); newly-added top-level definitions stayed undefined. Only a full process restart picked changes up.
  3. manage_repl("restart") timed out and was a no-op: afterwards the process kept its old PID and accumulating uptime, and ran stale code.
  4. Session registry race: the set of live sessions and the project's session key flipped non-deterministically between consecutive MCP calls (two keys appeared/disappeared; a shut-down session reappeared).
  5. start_session phantom "already running": it returned a session key that ping and ex reported as nonexistent in the surrounding calls. Because it believed a session existed, it declined to create a new one, leaving the project with no usable REPL.

Net effect: the persistent-REPL workflow became unusable. Fell back to spawning a one-shot julia --project=<project-root> -e '…' from a plain shell, which worked 100% reliably thereafter.

What worked (to isolate the fault)

  • start_session — the first call returned a usable key.
  • run_tests always worked and always reflected current on-disk code. (It spawns a fresh subprocess, sidestepping both the stale-Revise and registry issues.)
  • ex with absolute paths — worked whenever the session was actually live.
  • manage_repl("restart") — worked the first time; hung the second time.
  • One-shot julia --project=<project-root> -e '…' from a plain shell — worked every time. This confirms the on-disk edits were valid and complete; the fault is in the persistent session / Revise / registry layer, not the code.
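
For reference, the one-shot fallback can also be driven from Julia itself. This is a minimal sketch, not part of Kaimon: `eval_fresh` is a hypothetical helper name, and `project_root` stands in for <project-root>. It only relies on `Base.julia_cmd()` and `read(cmd, String)` from Base.

```julia
# Evaluate `code` in a brand-new Julia process so the result reflects
# whatever is on disk, independent of any long-lived session.
# `eval_fresh` and `project_root` are illustrative names (assumptions).
function eval_fresh(project_root::AbstractString, code::AbstractString)
    # Base.julia_cmd() reproduces the currently running julia binary and its core flags.
    cmd = `$(Base.julia_cmd()) --startup-file=no --project=$project_root -e $code`
    # read(cmd, String) captures stdout and throws if the child exits nonzero.
    return read(cmd, String)
end
```

A call such as `eval_fresh("<project-root>", "using MyPkg; print(isdefined(MyPkg, :newfn))")` would then report what a fresh process sees, at the cost of paying package load time on every call.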

Issue 1 — Fresh session's REPL working directory is not the project directory

Repro:

start_session(project_path="<project-root>")
  → "Session started. Session key: c97c8a8e"

ex(ses=c97c8a8e, e='include(<a project-relative path>)')

Result (verbatim, path redacted):

ERROR: SystemError: opening file
"<unrelated-project-dir>/<the relative path>": No such file or directory
Stacktrace:
 [1] include(mapexpr::Function, mod::Module, _path::String) @ Base ./Base.jl:307
 ...
 [4] _eval_with_capture(expr::Expr) @ Kaimon.Gate ~/.julia/packages/Kaimon/6N7iB/src/gate.jl:1211
 [5] gate_eval(code::String; _mod::Module, display_code::String) @ Kaimon.Gate .../gate.jl:920
 [6] (::Kaimon.Gate.var"#_do_async_eval#handle_message##0")() @ .../gate.jl:1571
 [7] (::Kaimon.Gate.var"#handle_message##5#handle_message##6"{...})() @ .../gate.jl:1610

Observation: the new session resolved the project-relative path against an unrelated local project directory, so pwd() was that directory rather than the project_path passed to start_session. The REPL apparently inherited a cwd from the MCP server process (or a global/previous default) instead of cd-ing into the activated project.

Expected: a session activated for project_path = X should have pwd() == X (or the cwd behaviour should be documented), so project-relative paths work.

Workaround: used an absolute project root constant and joinpath everywhere.

Severity: low (easy workaround) but surprising; a footgun for any relative-path code.
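
A sketch of the absolute-path workaround that does not depend on pwd() at all. It assumes the session has the project environment activated, so that `Base.active_project()` points at the project's Project.toml. `project_root` and `in_project` are hypothetical helper names, not Kaimon or MyPkg API.

```julia
# Derive the project root from the active environment instead of the cwd.
# Assumption: the session was activated for the project under test.
function project_root()
    proj = Base.active_project()   # path of the active Project.toml, or nothing
    proj === nothing && error("no active project; cannot derive a project root")
    return dirname(proj)
end

# Hypothetical helper: turn project-relative parts into an absolute path.
in_project(parts...) = joinpath(project_root(), parts...)
```

With this, `include(in_project("src", "some_file.jl"))` resolves the same way regardless of which directory the REPL happened to inherit.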


Issue 2 — Revise does not hot-reload edits to package source in the persistent session

Repro: Edited one package source file repeatedly — changed function bodies, and added three brand-new top-level definitions (one function + two consts; call them newfn, NEW_A, NEW_B). Re-ran the workflow via ex.

Result: the produced output reflected the old code. Confirmed at the artifact level — the generated output still contained the pre-edit form of a changed function, after that code had been replaced on disk.

Direct confirmation (verbatim):

ex(ses=…, e='using Revise; Revise.revise(); using MyPkg;
   (isdefined(MyPkg,:newfn), isdefined(MyPkg,:NEW_A), isdefined(MyPkg,:NEW_B))')
  → (false, false, false)

Even after an explicit Revise.revise(), none of the newly-added top-level symbols existed in the module. The session banner reported "Revise active."

Cross-check: run_tests (fresh subprocess) and the one-shot julia --project did see all the new code — so the edits were on disk and valid; only the long-lived session's module was stale, and Revise's incremental update never fired.

Hypotheses for maintainer:

  • The package may be loaded such that Revise isn't tracking it (loaded from a precompiled image / not dev-ed into the active environment / file watcher not started). Revise.revise() being a no-op suggests Revise has no tracked files for this package.
  • Worth checking at session start: is the package Pkg.develop-ed in the active env? Is Revise's FS watcher attached to its src/? Do Revise.pkgdatas / Revise.watched_files contain it? Any queued revision errors (Revise.queue_errors)?

Severity: high for an interactive dev loop — it silently produces stale results (no error), the worst failure mode.
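
The tracking hypothesis above could be checked with a small diagnostic like the following. It is a sketch under assumptions: `tracking_report` is a hypothetical name, and `Revise.pkgdatas` is an internal Revise table (named in the hypotheses above) whose layout may differ between Revise versions, so the lookup is guarded.

```julia
# Report where a module was loaded from and, if Revise is loaded,
# whether Revise's internal package table has an entry for it.
function tracking_report(mod::Module)
    id = Base.PkgId(mod)
    report = Dict{Symbol,Any}(
        :name   => id.name,
        :source => pathof(mod),   # nothing if the module has no recorded source path
    )
    # Look Revise up among loaded modules so this sketch does not depend on it.
    loaded = Base.loaded_modules_array()
    idx = findfirst(m -> nameof(m) === :Revise, loaded)
    report[:revise_loaded] = idx !== nothing
    if idx !== nothing
        R = loaded[idx]
        # Internal API (assumption): pkgdatas maps PkgId to Revise's per-package data.
        report[:tracked] = isdefined(R, :pkgdatas) && haskey(R.pkgdatas, id)
    end
    return report
end
```

If `:source` points into ~/.julia/packages/ rather than the dev-ed checkout, or `:tracked` is false, that would be consistent with Revise.revise() being a no-op.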


Issue 3 — manage_repl("restart") timed out and did not actually restart

First restart (worked):

manage_repl(command="restart", session=c97c8a8e)
  → "Session c97c8a8e restarted. Fresh Julia state. Revise active."

After this, fresh code was loaded — a working restart was the only thing that picked up edits.

Second restart (hung):

manage_repl(command="restart", session=c97c8a8e)
  → "The operation timed out."

Then ping showed (verbatim, names redacted):

● 438477fc <MyPkg>        (connected, up 1h 37m, PID 63715, free)
● 0f5b58b7 <other-proj>   (connected, up 7h 33m, PID 45258, agent-spawned)

Observation: the MyPkg process showed uptime 1h 37m and an unchanged PID (63715) — i.e. it was not actually restarted (a real restart would show ~0m uptime / a new PID). Yet the call returned a timeout, not success or failure. Note also the key visible to ping (438477fc) differed from the key I had restarted (c97c8a8e) — see Issue 4.

Severity: high — restart was the only reliable way to pick up edits (given Issue 2), and it became unavailable.
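
The "did it really restart?" check above can be made mechanical. The sketch below parses a ping line in the format quoted above and compares PID and uptime before and after. The line format is inferred from the two observed lines only, and `PingEntry`, `parse_ping_line`, and `looks_restarted` are hypothetical names.

```julia
# One parsed row of ping output.
struct PingEntry
    key::String
    pid::Int
    uptime_min::Int
end

# Parse a line such as
#   "● 438477fc <MyPkg>  (connected, up 1h 37m, PID 63715, free)"
# Returns nothing when the line does not match the assumed format.
function parse_ping_line(line::AbstractString)
    m = match(r"([0-9a-f]{8})\s.*\bup (?:(\d+)h )?(\d+)m, PID (\d+)", line)
    m === nothing && return nothing
    hours = m.captures[2] === nothing ? 0 : parse(Int, m.captures[2])
    minutes = parse(Int, m.captures[3])
    return PingEntry(m.captures[1], parse(Int, m.captures[4]), 60 * hours + minutes)
end

# A real restart should produce a new PID or an uptime that went backwards.
looks_restarted(before::PingEntry, after::PingEntry) =
    after.pid != before.pid || after.uptime_min < before.uptime_min
```

Applied to the observation above, an entry that keeps PID 63715 and an uptime that only grows would be classified as not restarted.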


Issue 4 — Session registry race / non-deterministic key reassignment

The set of "available sessions" and the project's session key changed between consecutive calls, with no session activity from me in between. Verbatim sequence (names redacted, keys/PIDs as observed):

1) ping
   → ● 438477fc <MyPkg> (up 1h 37m, PID 63715)   ● 0f5b58b7 <other-proj>

2) manage_repl(shutdown, session=438477fc)
   → "Session 438477fc shut down."

3) start_session(project_path="<project-root>")
   → "Session already running for this project. Session key: c97c8a8e"     ← c97c8a8e?!

4) manage_repl(shutdown, session=c97c8a8e)
   → ERROR: "No session matched 'c97c8a8e'.
            Available: 0f5b58b7 (<other-proj>), 438477fc (<MyPkg>)"          ← 438477fc is BACK

5) ping
   → "Sessions: 1 connected / 1 total"
     ● 0f5b58b7 <other-proj> only                              ← both MyPkg sessions gone now

6) start_session(project_path="<project-root>")
   → "Session already running for this project. Session key: c97c8a8e"     ← claims c97c8a8e again

7) ex(ses=c97c8a8e, …)
   → ERROR: "No session matched 'c97c8a8e'. Available: 0f5b58b7 (<other-proj>)"

8) ping
   → ● 0f5b58b7 <other-proj> only

Two keys (c97c8a8e, 438477fc) appear and disappear inconsistently; 438477fc reappeared after being shut down (step 2 → step 4); ping, start_session, and the ex/shutdown "Available:" listings disagreed with each other within seconds.

Hypothesis: the key↔process registry has a race / stale-entry bug — possibly restart registers a new key/process without atomically retiring the old one, and timeouts (Issue 3) leave half-committed registry state. Different endpoints (ping vs start_session vs the ex/shutdown "Available:" list) appear to read different or differently-cached views of the registry.

Severity: high — makes the session unaddressable; you can't reliably target a key.
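
To illustrate what "atomically retiring the old one" could mean, here is a minimal sketch of a registry where the key swap and every read go through a single lock. This is a hypothetical model, not Kaimon's actual registry: `Registry`, `swap_session!`, and `snapshot` are invented names, and the registry is reduced to a key => PID table.

```julia
# Hypothetical registry: session key => PID, guarded by one lock.
struct Registry
    lk::ReentrantLock
    sessions::Dict{String,Int}
end
Registry() = Registry(ReentrantLock(), Dict{String,Int}())

# Retire the old key and register the new one in a single critical section,
# so no reader can observe both keys, or neither, mid-restart.
function swap_session!(r::Registry, old_key::String, new_key::String, new_pid::Int)
    lock(r.lk) do
        delete!(r.sessions, old_key)
        r.sessions[new_key] = new_pid
    end
    return r
end

# Every endpoint reads through the same function and gets a consistent copy.
snapshot(r::Registry) = lock(() -> copy(r.sessions), r.lk)
```

If ping, start_session, and the "Available:" listing all read through one such snapshot, they could not disagree about which keys exist within the same instant.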


Issue 5 — start_session phantom "already running" (won't create a usable session)

In steps 3 and 6 above, start_session returned:

"Session already running for this project. Session key: c97c8a8e"

…but the calls immediately around them disagreed: ping reported no MyPkg session (steps 5 and 8), and ex(ses=c97c8a8e) failed with "No session matched 'c97c8a8e'" (step 7). So:

  • start_session's "already running" check consulted a stale registry entry and returned a key that is not actually live/connected.
  • Because it believed a session existed, it did not spin up a new one — leaving the project with no usable REPL at all. The only escape was abandoning the MCP for the rest of the session.

Expected: start_session should verify the candidate is actually alive (ping the gate / check the PID) before reporting "already running"; if dead/unreachable, reap the stale entry and start a fresh process.

Severity: high — this is the terminal failure that forced the fallback.
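
The expected behaviour can be sketched as "reap first, then look up". The model below is hypothetical and not Kaimon's implementation: `SessionEntry`, `reap_dead!`, and `find_live_session` are invented names, and liveness is passed in as a predicate because the real check (pinging the gate, probing the PID) is unknown here.

```julia
# Hypothetical registry entry.
struct SessionEntry
    pid::Int
    project::String
end

# Drop every entry whose process fails the liveness predicate; return the reaped keys.
function reap_dead!(registry::Dict{String,SessionEntry}, isalive)
    dead = [key for (key, entry) in registry if !isalive(entry.pid)]
    foreach(key -> delete!(registry, key), dead)
    return sort!(dead)
end

# Return the key of a live session for `project`, or nothing if a new one must be started.
function find_live_session(registry::Dict{String,SessionEntry}, project::AbstractString, isalive)
    reap_dead!(registry, isalive)   # never report a stale entry as "already running"
    for (key, entry) in registry
        entry.project == project && return key
    end
    return nothing
end
```

Under this scheme, a registry that still lists c97c8a8e for MyPkg after its process is gone would return nothing, and start_session would go on to spawn a fresh process instead of handing back a dead key.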


Consolidated hypotheses & suggested diagnostics

In rough priority:

  1. Registry liveness reconciliation (Issues 3–5). Reconcile the session registry against live PIDs on every ping / start_session / ex; reap dead/phantom entries; make restart atomic (retire the old key/PID together with registering the new); on a restart timeout, return the actual resulting state and the new key rather than a bare timeout. Have start_session health-check the "already running" candidate before returning it.
  2. Revise tracking (Issue 2). Verify Revise is actually attached to the dev package's src/ at session start. Useful introspection to capture: Revise.pkgdatas, Revise.watched_files, length(Revise.revision_queue), Revise.queue_errors. Check whether the package is dev-ed vs loaded from a precompiled image, and whether the FS watcher thread is alive. A no-op Revise.revise() strongly suggests nothing is being tracked.
  3. Working directory (Issue 1). cd the gate process into project_path (or expose/document the cwd) at session creation.

Artifacts that would help (could not be retrieved reliably once the session died):

  • The kaimon server log around the timeout (there is a server_log MCP tool).
  • An internal session-registry dump at the moments of disagreement.
  • kaimon/gate version: package hash 6N7iB; relevant frames src/gate.jl:1211 (_eval_with_capture), :920 (gate_eval), :1571, :1610.

Reproduction recipe (suspected):

  1. start_session for a path-dev-ed Julia package; check pwd() in the session (Issue 1).
  2. Edit a src/ file (add a new top-level symbol); ex Revise.revise(); isdefined(Mod, :newsym) → expect false (Issue 2).
  3. manage_repl("restart") repeatedly / under load until one times out; ping and compare PID / uptime / key (Issues 3–4).
  4. After a timed-out restart, start_session again and observe the phantom "already running" plus a failing ex against that key (Issue 5).
