WebSocket reconnection fails during approval blocks #119

@wu-changxing

Bug

When a client refreshes the page while the agent is blocked waiting for approval (e.g., for a bash tool call), reconnection fails with a "No active connection" error.

Reproduction

  1. Start a hosted agent with tool approval enabled
  2. Send a prompt that triggers a bash tool
  3. Agent sends approval_needed, blocks on io.receive()
  4. Refresh the browser page
  5. Frontend shows "No active connection" error
  6. Agent logs: ✗ bash - connection closed

Root Cause

Three bugs compound:

1. run_agent() has no error handling

If the agent crashes (e.g., on the io_closed sentinel), agent_finished.set() is never called, so _pipe_ws_io waits forever.

File: connectonion/network/asgi/websocket.py, run_agent() function
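
A minimal sketch of the failure and the try/finally remedy, not the actual connectonion code. The names agent_finished and error_holder come from this issue; run_agent_safely and crashing_agent are hypothetical, and the sketch assumes a threading.Event (the issue does not say whether the real event is a threading or asyncio one).

```python
import threading


def run_agent_safely(agent_fn, agent_finished, error_holder):
    """Run agent_fn and always signal completion, even if it raises."""
    try:
        agent_fn()
    except Exception as exc:
        # Keep the failure so whoever waits on the event can report it.
        error_holder.append(exc)
    finally:
        # Without this, a crash leaves every waiter blocked forever.
        agent_finished.set()


def crashing_agent():
    # Hypothetical stand-in for the agent hitting the io_closed sentinel.
    raise ConnectionError("connection closed")


agent_finished = threading.Event()
error_holder = []
worker = threading.Thread(
    target=run_agent_safely,
    args=(crashing_agent, agent_finished, error_holder),
)
worker.start()
worker.join()

# The waiter is released even though the agent crashed.
finished_in_time = agent_finished.wait(timeout=1.0)
```

With the finally clause in place, the waiting side can inspect error_holder after the event fires and decide how to report the crash to the client.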

2. Reattach uses closed IO

On reconnect, the server reattaches to the old io object, which still has _closed = True (set by io.close() on disconnect). io.send() then silently drops every event, so the agent can't communicate with the new client.

File: connectonion/network/io/websocket.py, WebSocketIO._closed flag
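
The effect of a stale closed flag can be shown with a toy model. This is a sketch under assumptions: ToyIO, outbox, and reopen() are hypothetical, and only the _closed flag and the silent drop in send() are taken from this issue. The real WebSocketIO may differ.

```python
class ToyIO:
    """Hypothetical stand-in for WebSocketIO; only _closed mirrors the issue."""

    def __init__(self):
        self._closed = False
        self.outbox = []  # events that would reach the client

    def send(self, event):
        if self._closed:
            return  # silently dropped: the failure mode described above
        self.outbox.append(event)

    def close(self):
        self._closed = True

    def reopen(self):
        # Hypothetical helper; the fix plan only says to reset _closed.
        self._closed = False


toy_io = ToyIO()
toy_io.send({"type": "approval_needed"})
toy_io.close()                        # client refreshes, server closes the IO
toy_io.send({"type": "tool_result"})  # lost: the flag is still set
toy_io.reopen()                       # reattach resets the flag
toy_io.send({"type": "tool_result"})  # delivered to the new client
```

Only the first and last events reach outbox; the one sent while the flag was stale disappears without any error, which is why the agent side sees nothing wrong.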

3. Two _pipe_ws_io loops compete

The old loop (stuck waiting for agent_finished) and the new loop (from reattach) both reference the same agent_finished event, causing race conditions on completion.

File: connectonion/network/asgi/websocket.py, _pipe_ws_io()

Fix Plan

  1. run_agent(): wrap in try/finally — always set agent_finished, capture error in error_holder
  2. Reattach: reopen IO — reset io._closed = False so agent can send events through new WebSocket
  3. Old _pipe_ws_io: detect superseded — when new connection reattaches, old pipe should exit cleanly
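
One possible way to implement step 3 is a generation counter that each new connection bumps. This is a sketch, not the project's design: Session, attach(), pipe_loop, and the polling interval are all hypothetical, and a threading.Event is assumed for agent_finished.

```python
import threading


class Session:
    """Hypothetical per-agent state shared by every pipe loop."""

    def __init__(self):
        self.agent_finished = threading.Event()
        self.generation = 0
        self._lock = threading.Lock()

    def attach(self):
        # Each new connection gets a fresh generation number.
        with self._lock:
            self.generation += 1
            return self.generation


def pipe_loop(session, my_generation, poll_interval=0.01):
    """Return 'superseded' if a newer connection attached, else 'finished'."""
    while not session.agent_finished.wait(timeout=poll_interval):
        if session.generation != my_generation:
            return "superseded"
    # Re-check so an old loop never claims the completion of a newer one.
    if session.generation != my_generation:
        return "superseded"
    return "finished"


session = Session()
results = {}


def run_pipe(name, generation):
    results[name] = pipe_loop(session, generation)


old_gen = session.attach()
old = threading.Thread(target=run_pipe, args=("old", old_gen))
old.start()

new_gen = session.attach()  # browser refresh: a new connection takes over
old.join(timeout=1.0)       # old loop exits without the agent finishing

new = threading.Thread(target=run_pipe, args=("new", new_gen))
new.start()
session.agent_finished.set()
new.join(timeout=1.0)
```

Because the old loop leaves as soon as it sees a newer generation, only one loop is left to act on agent_finished, which removes the competition described in bug 3.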
