Issue Description:
I'm experiencing a consistent issue where streaming responses from the Anthropic API are being prematurely terminated. The stream ends abruptly without sending a message_stop event, leaving the response incomplete. This happens specifically when using tool_use with large JSON payloads.
Environment:
- SDK: @anthropic-ai/sdk v0.68.0
- Runtime: Node.js (NestJS application)
- Model: claude-3-5-haiku-latest (also tested with other models)
- API: anthropicClient.beta.messages.create with stream: true
Observed Behavior:
The stream consistently ends mid-transmission after receiving multiple content_block_delta events:
[ClaudeService] Stream END event. Total bytes: 277346
[ClaudeService] Last chunk (last 500 chars): event: content_block_delta
data: {"type":"content_block_delta","index":2,"delta":{"type":"input_json_delta","partial_json":"fa"} }
Failed parsing Anthropic stream Error: Stream ended without message_stop event
The connection closes without:
- A content_block_stop event for the current block
- A message_delta event with stop reason
- A message_stop event
The incomplete JSON in the last chunk ("partial_json":"fa") indicates the stream is being cut off mid-transmission rather than completing gracefully.
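To illustrate how the truncation shows up, here is a minimal sketch of accumulating `input_json_delta` fragments per content block and checking whether the concatenated string is complete JSON. The event shape below is a simplified stand-in for the SDK's stream event types, not the SDK's actual type definitions:

```typescript
// Simplified stand-in for the SDK's input_json_delta stream events.
interface InputJsonDeltaEvent {
  type: "content_block_delta";
  index: number;
  delta: { type: "input_json_delta"; partial_json: string };
}

// Concatenate partial_json fragments per content-block index.
function accumulateToolInput(events: InputJsonDeltaEvent[]): Map<number, string> {
  const buffers = new Map<number, string>();
  for (const ev of events) {
    buffers.set(ev.index, (buffers.get(ev.index) ?? "") + ev.delta.partial_json);
  }
  return buffers;
}

// A truncated accumulation will fail to parse as JSON.
function isCompleteJson(s: string): boolean {
  try {
    JSON.parse(s);
    return true;
  } catch {
    return false;
  }
}

// Demo: the last fragment "fa" was presumably meant to become "false".
const buffers = accumulateToolInput([
  { type: "content_block_delta", index: 2, delta: { type: "input_json_delta", partial_json: '{"done":' } },
  { type: "content_block_delta", index: 2, delta: { type: "input_json_delta", partial_json: "fa" } },
]);
console.log(isCompleteJson(buffers.get(2)!)); // false: JSON cut off mid-token
```

Running a check like this against the captured events confirms the stream stops mid-token rather than at any structural boundary.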
What I've tried:
I've spent significant time trying different approaches to isolate the issue:
- Multiple custom fetch implementations:
- Native Node.js fetch with custom HTTP agents
- Axios with streaming response handling
- Custom fetch wrapper using axios as transport
- Different combinations of HTTP/HTTPS agent configurations
- Various timeout and connection configurations:
- Disabled all timeouts (timeout: 0)
- Configured HTTP agents with persistent keep-alive
- Set maxContentLength: Infinity and maxBodyLength: Infinity
- Implemented aggressive keep-alive settings (1-second intervals)
- Added socket-level timeout prevention
- Different maxTokens values:
- Tested with 2048, 4096, 8192, and higher values
- The interruption occurs regardless of the token limit setting
- Stream handling variations:
- Direct stream passthrough
- ReadableStream conversion with proper backpressure handling
- Added extensive error handling and logging
- Monitored socket events (timeout, close, end, abort)
- Pause/resume patterns on the underlying Node.js stream
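For reference, the keep-alive agent configuration attempted was along these lines (a representative sketch using Node's standard `https.Agent`; values like the 1-second probe interval match the attempts described above, and how the agent is wired into the SDK's transport is omitted):

```typescript
import { Agent } from "node:https";

// Persistent keep-alive agent with aggressive 1-second probes
// and the idle socket timeout disabled.
const keepAliveAgent = new Agent({
  keepAlive: true,
  keepAliveMsecs: 1000, // TCP keep-alive probe interval
  timeout: 0,           // disable idle socket timeout
});

console.log(keepAliveAgent.options.keepAlive); // true
```

None of these variations changed where the stream was cut off.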
What I cannot determine:
Despite all these attempts, I cannot determine if this is:
- A backpressure/flow control issue between the client and server
- A timing issue (some timeout I haven't been able to configure)
- A size limit (the stream consistently ends around 270-280KB)
- A network intermediary issue (a Cloudflare proxy limitation)
- Something else entirely
Notably, slowing down consumption of the stream avoids the issue, which suggests the SDK or the underlying connection isn't properly propagating backpressure signals.
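The workaround that avoids the truncation looks roughly like this: throttling the read loop by awaiting a short delay between chunks, so the socket is drained more slowly. `stream` here stands in for the async iterable returned by the SDK's streaming call; the 5 ms pause is an illustrative value, not a recommendation:

```typescript
const delay = (ms: number) => new Promise<void>((res) => setTimeout(res, ms));

// Consume an async-iterable stream with a pause between chunks,
// returning how many events were processed.
async function consumeThrottled<T>(
  stream: AsyncIterable<T>,
  onEvent: (ev: T) => void,
  pauseMs = 5,
): Promise<number> {
  let count = 0;
  for await (const ev of stream) {
    onEvent(ev);
    count++;
    await delay(pauseMs); // slow the read loop so TCP flow control can catch up
  }
  return count;
}

// Demo with a mock stream of three events.
async function* mockStream() {
  yield "a"; yield "b"; yield "c";
}
consumeThrottled(mockStream(), () => {}, 1).then((n) => console.log(n)); // 3
```

Needing an artificial delay like this to get a complete response is what makes me suspect a flow-control bug rather than a server-side limit.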
Use Case:
My tool_use calls require the model to generate large JSON structures (arrays with many objects) for data organization tasks. This is a legitimate use case where the response needs to be comprehensive.
Expected Behavior:
The stream should continue until:
- All tool_use input JSON is fully transmitted
- A content_block_stop event is sent
- A message_delta event with stop reason is sent
- A message_stop event is sent
The stream consumer speed should not affect whether the complete response is delivered.
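The completeness check my consumer applies can be sketched as follows: collect events and treat the stream as successful only if a message_stop event arrived before the iterator ended. The event type is again a simplified stand-in for the SDK's types:

```typescript
// Simplified stand-in for SDK stream events.
type StreamEvent = { type: string };

// Collect all events; reject if the stream ends without message_stop.
async function collectUntilStop(stream: AsyncIterable<StreamEvent>): Promise<StreamEvent[]> {
  const events: StreamEvent[] = [];
  let sawStop = false;
  for await (const ev of stream) {
    events.push(ev);
    if (ev.type === "message_stop") sawStop = true;
  }
  if (!sawStop) {
    throw new Error("Stream ended without message_stop event");
  }
  return events;
}
```

This is the check that produces the "Stream ended without message_stop event" error shown in the logs above.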
Questions:
- Is there a known backpressure or flow control issue with the SDK's streaming implementation?
- Are there any undocumented limits (size, time, or otherwise) on streaming responses?
- Is there a recommended maximum size for tool_use JSON payloads?
- Could this be related to Cloudflare (I see cf-ray headers) buffering or timeout settings?
- Are there SDK configuration options for controlling stream consumption rate or buffering?
Request:
Could you please help identify:
- Why rapid stream consumption causes premature termination
- Whether this is a known issue with the SDK's async iterator implementation
- How to properly handle backpressure for large tool_use responses
- Any configuration or approach I should try
Thank you!