Streaming responses consistently interrupted mid-transmission - connection closes without message_stop event #842

@gustavosizilio

Description

Issue Description:

I'm experiencing a consistent issue where streaming responses from the Anthropic API are being prematurely terminated. The stream ends abruptly without sending a message_stop event, leaving the response incomplete. This happens specifically when using tool_use with large JSON payloads.

Environment:

  • SDK: @anthropic-ai/sdk v0.68.0
  • Runtime: Node.js (NestJS application)
  • Model: claude-3-5-haiku-latest (also tested with other models)
  • API: anthropicClient.beta.messages.create with stream: true

Observed Behavior:

The stream consistently ends mid-transmission after receiving multiple content_block_delta events:

[ClaudeService] Stream END event. Total bytes: 277346
[ClaudeService] Last chunk (last 500 chars): event: content_block_delta
data: {"type":"content_block_delta","index":2,"delta":{"type":"input_json_delta","partial_json":"fa"} }

Failed parsing Anthropic stream Error: Stream ended without message_stop event

The connection closes without:

  • A content_block_stop event for the current block
  • A message_delta event with stop reason
  • A message_stop event

The incomplete JSON in the last chunk ("partial_json":"fa") indicates the stream is being cut off mid-transmission rather than completing gracefully.
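The truncation can be detected mechanically from the captured SSE text. A minimal sketch (`checkStreamComplete` is an illustrative name of mine, not an SDK function):

```typescript
// Scan raw SSE text for the terminal event; a stream that closed gracefully
// should end with a message_stop event.
function checkStreamComplete(raw: string): { complete: boolean; lastEvent?: string } {
  const events = raw
    .split("\n")
    .filter((line) => line.startsWith("event: "))
    .map((line) => line.slice("event: ".length).trim());
  const lastEvent = events.length > 0 ? events[events.length - 1] : undefined;
  return { complete: lastEvent === "message_stop", lastEvent };
}

// The truncated tail from my logs above: the stream ends on a delta.
const truncated = [
  "event: content_block_delta",
  'data: {"type":"content_block_delta","index":2,"delta":{"type":"input_json_delta","partial_json":"fa"}}',
  "",
].join("\n");

console.log(checkStreamComplete(truncated)); // { complete: false, lastEvent: 'content_block_delta' }
```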

What I've tried:

I've spent significant time trying different approaches to isolate the issue:

  1. Multiple custom fetch implementations:
    - Native Node.js fetch with custom HTTP agents
    - Axios with streaming response handling
    - Custom fetch wrapper using axios as transport
    - Different combinations of HTTP/HTTPS agent configurations
  2. Various timeout and connection configurations:
    - Disabled all timeouts (timeout: 0)
    - Configured HTTP agents with persistent keep-alive
    - Set maxContentLength: Infinity and maxBodyLength: Infinity
    - Implemented aggressive keep-alive settings (1-second intervals)
    - Added socket-level timeout prevention
  3. Different maxTokens values:
    - Tested with 2048, 4096, 8192, and higher values
    - The interruption occurs regardless of the token limit setting
  4. Stream handling variations:
    - Direct stream passthrough
    - ReadableStream conversion with proper backpressure handling
    - Added extensive error handling and logging
    - Monitored socket events (timeout, close, end, abort)
    - Pause/resume patterns on the underlying Node.js stream

What I cannot determine:

Despite all these attempts, I cannot determine if this is:

  • A backpressure/flow control issue between the client and server
  • A timing issue (some timeout I haven't been able to configure)
  • A size limit (the stream consistently ends around 270-280KB)
  • A network intermediary issue (a Cloudflare proxy limitation)
  • Something else entirely

In my testing, artificially slowing down stream consumption avoids the interruption, which suggests the SDK or the underlying connection isn't properly handling backpressure signals.
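The slow-consumption workaround looks roughly like this, sketched here against a stand-in async generator rather than the real SDK stream (`consumeSlowly` and `fakeStream` are illustrative names of mine):

```typescript
// Stand-in for the SDK's event stream; the real source is the async iterable
// returned by anthropicClient.beta.messages.create({ ..., stream: true }).
async function* fakeStream(): AsyncGenerator<{ type: string }> {
  yield { type: "content_block_delta" };
  yield { type: "content_block_stop" };
  yield { type: "message_stop" };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// The workaround: insert a small delay between events so the consumer never
// outruns the socket, yielding back to the event loop between chunks.
async function consumeSlowly(
  stream: AsyncIterable<{ type: string }>,
  delayMs = 5,
): Promise<string[]> {
  const seen: string[] = [];
  for await (const event of stream) {
    seen.push(event.type);
    await sleep(delayMs);
  }
  return seen;
}

consumeSlowly(fakeStream()).then((seen) => console.log(seen));
```

That this pacing changes whether `message_stop` ever arrives is exactly the behavior I'd like explained.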

Use Case:

My tool_use calls need the model to generate large JSON structures (arrays with many objects) for data organization tasks. This is a legitimate use case where the response needs to be comprehensive.

Expected Behavior:

The stream should continue until:

  1. All tool_use input JSON is fully transmitted
  2. A content_block_stop event is sent
  3. A message_delta event with stop reason is sent
  4. A message_stop event is sent

The speed at which the consumer reads the stream should not affect whether the complete response is delivered.
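A checker for that terminal sequence might look like this (an illustrative sketch, not SDK code):

```typescript
// The final three events of a graceful ending should be, in order:
// content_block_stop, message_delta, message_stop.
function endsGracefully(eventTypes: string[]): boolean {
  const tail = eventTypes.slice(-3);
  return (
    tail[0] === "content_block_stop" &&
    tail[1] === "message_delta" &&
    tail[2] === "message_stop"
  );
}

console.log(
  endsGracefully(["content_block_delta", "content_block_stop", "message_delta", "message_stop"]),
); // → true
console.log(endsGracefully(["content_block_delta"])); // → false (what I actually observe)
```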

Questions:

  1. Is there a known backpressure or flow control issue with the SDK's streaming implementation?
  2. Are there any undocumented limits (size, time, or otherwise) on streaming responses?
  3. Is there a recommended maximum size for tool_use JSON payloads?
  4. Could this be related to Cloudflare (I see cf-ray headers) buffering or timeout settings?
  5. Are there SDK configuration options for controlling stream consumption rate or buffering?

Request:

Could you please help identify:

  • Why rapid stream consumption causes premature termination
  • Whether this is a known issue with the SDK's async iterator implementation
  • How to properly handle backpressure for large tool_use responses
  • Any configuration or approach I should try

Thank you!
