fix(providers): detect truncated Anthropic and OpenAI Responses streams #33
Merged
The Anthropic Messages streaming protocol guarantees message_stop as the final SSE event of every successful stream. Today the adapter treats any clean EOF (Stream.Err() == nil or io.EOF) as a successful Finish, even when the upstream body was cut off mid-response. This silently truncates the assistant's reply and commits the partial text as if it were the model's complete answer.
Track whether message_stop was observed during the SSE loop. On clean EOF without it, yield StreamPartTypeError wrapping io.EOF so the failure surfaces as a retryable transport error rather than a phantom success. Existing transport errors continue to flow through the unchanged else branch; the event: error path keeps yielding via Stream.Err().
Tests cover happy path, EOF before message_stop, empty stream, and malformed stream (existing error path preserved).
Also picks up a one-line gofmt fix in TestComputerUseToolJSON; the test file was not gofmt-clean at HEAD without it.
The OpenAI Responses API emits terminal lifecycle events when a streamed response reaches its final state. The adapter currently yields Finish on any clean EOF, even if the stream ended before response.completed or response.incomplete. That has the same silent-truncation shape as the Anthropic message_stop bug in this PR.
Track response.completed and response.incomplete before yielding Finish from both Stream and StreamObject. If the transport closes cleanly first, yield a StreamPartTypeError/ObjectStreamPartTypeError wrapping io.EOF so callers can retry instead of committing partial output. Also surface response.failed as an error event instead of falling through to Finish.
Tests cover completed and incomplete terminal events, EOF before terminal event, empty streams, response.failed, malformed streams, and JSON-mode StreamObject truncation.
Also fixes a pre-existing OpenAI test compile issue where one toResponsesPrompt call still expected two return values.
@codex review
Codex Review: Didn't find any major issues. Already looking forward to the next diff.
ethanndickson added a commit to coder/coder that referenced this pull request on May 8, 2026
coder/fantasy now fails closed when Anthropic or OpenAI Responses streams close before their provider terminal events instead of yielding a successful finish. This bumps the fantasy replacement to coder/fantasy#33 and teaches chat error classification to treat those failures as retryable timeout errors with explicit stream-closed messages. <img width="875" height="311" alt="image" src="https://github.com/user-attachments/assets/69c6f7b5-c885-46d2-a88b-b7a2b111bd55" />
Background
This started from a Coder agent observability report: a workspace agent's chat appeared to hang and then surfaced as context canceled in coderd logs, with no upstream error visible to the user. Root cause: a mitmproxy sitting between Coder and Anthropic was occasionally closing the upstream SSE response cleanly, mid-stream, before the response was complete. The fantasy Anthropic adapter treated the clean EOF as a successful Finish, committing the partial assistant text as the final answer. Because no error surfaces, chatretry.Retry never engages, the chatd retry budget is wasted, and the failure is invisible to operators.
This PR now applies the same terminal-event invariant to the OpenAI Responses API path too. Responses streaming has semantic lifecycle events; OpenAI documents response.completed as "emitted when the model response is complete," response.incomplete as "emitted when a response finishes as incomplete," and response.failed as the failed terminal event. The current Responses adapter had the same shape as the Anthropic bug: after stream.Next() stops, stream.Err() == nil meant Finish, even if no terminal lifecycle event arrived.
Closes CODAGT-325
Fix
Anthropic Messages streaming
The Anthropic Messages streaming protocol explicitly guarantees message_stop as the final SSE event of every successful stream (in two places on the same docs page: the "Event types" flow list and the "Full HTTP Stream response" section, each ending with "A final message_stop event"). Tool use, extended thinking, web search, and server tools all preserve this invariant; only the documented event: error failure mode replaces it.
Today, this adapter has an empty case "message_stop": arm and treats Stream.Err() == nil || errors.Is(err, io.EOF) as a successful Finish unconditionally. The Anthropic Go SDK does not enforce the terminal event either: in packages/ssestream/ssestream.go, Stream[T].Next returns false on a clean EOF and leaves Stream.Err() == nil. Application code must add the gate.
This PR tracks whether message_stop arrived during the SSE loop. On clean EOF without it, yield a StreamPartTypeError wrapping io.EOF. The existing transport-error path (else branch) is unchanged, and the event: error path keeps surfacing through Stream.Err(). Wrapping with %w preserves the underlying io.EOF so downstream classifiers (e.g. Coder's chaterror.Classify already matches eof in its timeoutPatterns) treat it as retryable without any extra plumbing.
OpenAI Responses streaming
For Responses API streaming, this PR tracks terminal lifecycle events before yielding Finish from both Stream and JSON-mode StreamObject:
- response.completed marks a normal terminal response and can yield Finish.
- response.incomplete marks a terminal response with an incomplete finish reason (e.g. max output tokens) and can yield Finish with the mapped finish reason.
- response.failed now yields an error immediately instead of falling through to a synthetic Finish.
- A clean EOF before any terminal event yields StreamPartTypeError/ObjectStreamPartTypeError wrapping io.EOF with an "openai responses stream closed before terminal event" message.
This deliberately does not change the legacy OpenAI Chat Completions adapter. That API has a different streaming shape ([DONE], chunk finish reasons, and SDK-level behavior), and the codex prior art is specifically for Responses-style terminal lifecycle events.
Coverage
Anthropic coverage is table-driven across complete stream, EOF before message_stop, empty stream, and malformed stream (existing error path preserved).
OpenAI Responses coverage is table-driven across response.completed, response.incomplete, EOF before terminal event, empty stream, response.failed, malformed stream, and JSON-mode StreamObject truncation.
Prior art
This is the same defense Anthropic itself shipped in their newest official SDK: MessageAccumulator in anthropic-sdk-java (PR anthropics/anthropic-sdk-java#178, merged 2025-03-21) raises IllegalStateException("'message_stop' event not yet received.") on MessageAccumulator.message() and has dedicated unit tests for both messageNotStarted and messageNotStopped.
OpenAI's openai/codex implements the analogous gate for the OpenAI Responses API: in codex-rs/codex-api/src/sse/responses.rs it emits ApiError::Stream("stream closed before response.completed") on early EOF, with a regression test (stream_no_completed.rs::retries_on_early_close) that feeds an incomplete_sse.json fixture and asserts the retry path fires. This is the direct precedent for the Responses API portion of this PR.
A community contributor independently arrived at the same conclusion for LiteLLM in BerriAI/litellm#20361 (open, "changes requested" as of Feb 2026), filed against issue BerriAI/litellm#20347 ("Anthropic streaming silently completes with empty content"). The proposed AnthropicStreamValidator synthesizes an incomplete_stream_error event on missing message_stop.
The bug class has multiple open user-facing reports:
- anthropic-sdk-typescript#842: "Streaming responses consistently interrupted mid-transmission - connection closes without message_stop event"
- anthropic-sdk-python#1470: "Streaming /v1/messages drops mid-stream with RemoteProtocolError on long code_execution + skills runs"
- Roo-Code#12079: "write_to_file called without required content parameter" (textbook truncated-tool-call signature)
Adjacent (not overlapping) prior work in fantasy
charmbracelet/fantasy#198 wraps io.ErrUnexpectedEOF as ProviderError. That PR explicitly chose not to touch the errors.Is(err, io.EOF) branch ("the stream handler treats io.EOF as a clean terminator"), which is exactly the branch this PR adds the Anthropic gate to. Complementary, not overlapping. Worth backporting charmbracelet/fantasy#198 ("fix(anthropic/openai/google): wrap io.ErrUnexpectedEOF as ProviderError") separately.
No other upstream fantasy PR or issue addresses the Anthropic message_stop gate (verified: 17 issue searches + 12 PR searches + reading every open PR and every Anthropic-titled PR in the repo + 6 months of providers/anthropic/anthropic.go commits + code search across charmbracelet/* for message_stop / sawMessageStop / "stream closed before").
Rollout
Once merged on coder_2_33, the consuming change in coder/coder is a pseudo-version bump in go.mod. No code changes needed in coder/coder for the Anthropic path: the wrapped io.EOF already classifies as retryable via the existing chaterror.Classify timeoutPatterns, so chatretry.Retry's 25-attempt budget engages automatically. The OpenAI Responses path now exposes the same class of retryable transport-shaped error for consumers that classify EOF.
Drive-by
The diff includes two small pre-existing cleanup items in tests:
- A one-line gofmt fix in TestComputerUseToolJSON (a malformed require.Contains(...)}) line). The Anthropic test file was not gofmt-clean at coder_2_33 HEAD without it.
- An OpenAI test compile fix: toResponsesPrompt had been updated to return three values, but the test still expected two.
Drafting because we'd like to land this on the existing coder_2_33 line and then bump the pin in coder/coder once the SHA is settled.