fix: classify no-usage-stats log by stream completion state#519
fix: classify no-usage-stats log by stream completion state#519Evrard-Nil merged 2 commits intomainfrom
Conversation
Client disconnects and provider errors are expected to lack usage stats since external providers (OpenAI, Anthropic) only send usage in the final SSE chunk. These now log at WARN level instead of ERROR. Only truly completed streams that lack usage stats remain ERROR, which indicates an actual provider bug worth investigating. Reduces ~12k false-positive ERROR logs/day in prod.
ReviewThe fix is correct and well-scoped. The One minor observation: The (Some(usage), None) => {
if !self.stream_completed {
tracing::warn!(..., "Stream interrupted after receiving usage stats but before chat_id");
} else {
tracing::error!(..., "Stream completed but no chat_id available");
}
return;
}This is minor and may be less common in practice, but worth considering for full consistency. Otherwise the change is clean — good use of structured logging with ✅ Approved (minor suggestion above is non-blocking) |
There was a problem hiding this comment.
Code Review
This pull request updates the .gitignore file to exclude reproduction scripts and refines the logging logic within the completions service. The changes distinguish between expected stream interruptions, such as client disconnects or provider errors, and cases where a stream completes normally but lacks required usage statistics or chat IDs. Interruptions are now logged as warnings instead of errors to reduce noise. I have no feedback to provide as there were no review comments.
There was a problem hiding this comment.
Pull request overview
Adjusts streaming drop logging to reduce false-positive ERROR volume by classifying “no usage stats” cases based on whether the stream completed normally, aligning operational alerting with expected client disconnect/provider interruption behavior.
Changes:
- Downgrade “no usage stats” logs to WARN when
stream_completed = false; keep ERROR only when the stream completed but usage is still missing. - Add additional structured field (
stream_error) to WARN logs to distinguish provider error vs. client disconnect. - Ignore local reproduction shell scripts via
.gitignore.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
crates/services/src/completions/mod.rs |
Classifies missing-usage logs by stream_completed and logs WARN vs. ERROR accordingly. |
.gitignore |
Adds an ignore pattern for repro shell scripts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
stream_completed = true) but still lack usage — indicating a real provider bugRoot cause
External providers (OpenAI, Anthropic) only send usage stats in the final SSE chunk. When a client disconnects mid-stream (closes browser, network timeout), the
InterceptStreamis dropped before receiving the final chunk. The existing code logged ERROR for all cases without checkingstream_completed.Reproduction steps
See
repro_no_usage_stats.sh(gitignored) for the full reproduction script.Test plan
cargo checkcompiles cleanlycargo test --lib --bins)🤖 Generated with Claude Code