The 0.15.0 streaming pipeline passed [:async, :once, ...] as separate
atoms to :hackney.request/5. Erlang's proplists resolves bare :async as
{:async, true}, putting hackney into push mode; the bare :once is
ignored. This silently forfeited the M-12 backpressure architecture —
:hackney.stream_next/1 is a no-op in push mode, so the receive loop
appeared to work in many cases but the pacing came from the producer.
Cold/slow SSE backends additionally surfaced as :timeout stream errors.
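The proplist pitfall is reproducible in two lines (a standalone sketch; `broken`/`fixed` are illustrative names, not code from this PR):

```elixir
# Erlang's proplists treats a bare atom K as shorthand for {K, true}.
# So the bare :async in [:async, :once, ...] resolves to {:async, true}
# (push mode), and the intended once-at-a-time pull mode is never seen.
broken = [:async, :once]
fixed = [{:async, :once}]

IO.inspect(:proplists.get_value(:async, broken)) # => true  (push mode)
IO.inspect(:proplists.get_value(:async, fixed))  # => :once (pull mode)
```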
Mirroring the 0.15.1 non-streaming pluggable backend pattern:
- New Nous.HTTP.StreamBackend behaviour with the same per-call → env →
app config → default resolution chain (NOUS_HTTP_STREAM_BACKEND env,
:http_stream_backend app config, :stream_backend per-call opt).
- Nous.HTTP.StreamBackend.Req (default) — Req.post/1 with :into callback.
- Nous.HTTP.StreamBackend.Hackney (opt-in) — strict pull-based mode via
the [{:async, :once}] tuple form per deps/hackney/NEWS.md:269-272.
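The resolution chain reads as a chain of fallbacks; a minimal sketch under the option names listed above (the module and `resolve/1` are illustrative, not the actual resolver in `Nous.Providers.HTTP`):

```elixir
# Hypothetical sketch of per-call -> env -> app config -> default resolution.
defmodule StreamBackendResolution do
  @default Nous.HTTP.StreamBackend.Req

  def resolve(opts) do
    # First match wins: per-call opt, then env var, then app config, then default.
    opts[:stream_backend] ||
      from_env(System.get_env("NOUS_HTTP_STREAM_BACKEND")) ||
      Application.get_env(:nous, :http_stream_backend) ||
      @default
  end

  defp from_env(nil), do: nil
  defp from_env("req"), do: Nous.HTTP.StreamBackend.Req
  defp from_env("hackney"), do: Nous.HTTP.StreamBackend.Hackney
  # Any other value is treated as a custom backend module name.
  defp from_env(mod), do: Module.concat([mod])
end
```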
Verified end-to-end against running LMStudio: both backends produce
deltas through Nous.AgentRunner.run(_, _, stream: true). Hackney's
mailbox stays bounded at 2 messages after 2s idle wait — pull mode is
finally working as documented. Req TTFB is marginally faster (130ms vs
133ms mean) in localhost benchmarks; total stream time within 5%.
Provider stream normalizers consume normalized events and need no
changes. SSE parser helpers (parse_stream_buffer/2, flush_stream_buffer/2,
max_buffer_size/0, ensure_streaming_headers/1) promoted to public
@doc false for backend reuse.
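At their core, helpers of this shape split a byte buffer on the SSE `"\n\n"` event delimiter and carry the trailing (possibly incomplete) fragment forward. A minimal standalone sketch — names and return shapes are illustrative, not the actual Nous API:

```elixir
defmodule SSESketch do
  # Split a buffer on the "\n\n" event delimiter; the last fragment may be
  # an incomplete event and is returned as the new carry-over buffer.
  def parse_buffer(buffer) do
    parts = String.split(buffer, "\n\n")
    {events, [rest]} = Enum.split(parts, -1)

    data =
      for event <- events,
          "data: " <> payload <- String.split(event, "\n"),
          payload != "[DONE]",
          do: payload

    {data, rest}
  end
end

IO.inspect(SSESketch.parse_buffer("data: a\n\ndata: b\n\ndata: c"))
# => {["a", "b"], "data: c"}
```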
Pull request overview
Adds a pluggable streaming HTTP backend layer to Nous.Providers.HTTP.stream/4, restoring correct pull-based hackney streaming behavior and making streaming backend selection configurable (mirroring the existing non-streaming backend model).
Changes:
- Introduces `Nous.HTTP.StreamBackend` behaviour with `Req` as default and `Hackney` as opt-in.
- Refactors `Nous.Providers.HTTP.stream/4` to dispatch to the configured stream backend and promotes shared SSE helper functions for backend reuse.
- Adds/updates docs, changelog, and tests covering backend resolution and both streaming implementations.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/nous/http/stream_backend_resolution_test.exs | Tests stream-backend precedence (per-call/env/config/default) and env-var parsing/fallback. |
| test/nous/http/stream_backend/req_test.exs | Exercises Req streaming backend (SSE parsing, errors, early halt). |
| test/nous/http/stream_backend/hackney_test.exs | Exercises Hackney streaming backend and includes a regression test intended to guard pull-mode behavior. |
| mix.exs | Bumps version to 0.15.4. |
| lib/nous/providers/http.ex | Switches stream/4 to stream backend dispatch; adds stream-backend resolver; exposes SSE helpers for reuse. |
| lib/nous/http/stream_backend/req.ex | Implements streaming via Req :into callback + Task forwarding to the consumer. |
| lib/nous/http/stream_backend/hackney.ex | Implements streaming via hackney pull mode using [{:async, :once}]. |
| lib/nous/http/stream_backend.ex | Defines the Nous.HTTP.StreamBackend behaviour and selection guidance. |
| README.md | Documents separate non-streaming vs streaming backend configuration and trade-offs. |
| CHANGELOG.md | Documents 0.15.4 additions/fixes and migration guidance. |
Comment on lines +109 to +113:

```elixir
defp next_chunk(state) do
  receive do
    {ref, :done} when ref == state.ref ->
      {events, _} = HTTP.flush_stream_buffer(state.buffer, state.stream_parser)
```
Comment on lines +334 to +336:

```elixir
case parse_sse_buffer(buffer <> "\n\n") do
  {:error, :buffer_overflow} -> {[{:stream_error, %{reason: :buffer_overflow}}], ""}
  result -> result
```
Comment on lines +77 to +83:

```elixir
# Regression net for the [{:async, :once}] tuple fix. If a future hackney
# bump silently changes the option shape, this test should fail loudly:
# in push mode the receive loop would still get messages but the
# backpressure property is lost. Here we verify the *messages* shape by
# asserting the stream completes against a Bypass server that delivers
# all data in one chunk — works in both push and pull.
test "request actually goes through hackney pull mode", %{bypass: bypass, url: url} do
```
Three Copilot review nits, all valid:
- Req backend: when the spawned Task crashes (Req exception, callback
  raise), the parent never receives a completion message and waits the
  full receive timeout before emitting a misleading :timeout error.
  Match `{:DOWN, task_ref, :process, _, reason}` and surface a
  `:task_died` stream_error immediately. Track `task.ref` (the monitor
  ref Task.async sets up) on state.
- flush_stream_buffer/2: appending the synthetic "\n\n" delimiter for
end-of-stream parsing could push a buffer at exactly @max_buffer_size
two bytes over and trip a false-positive :buffer_overflow even though
received data was within limits. Check input size first and bypass
the public size-checked parser via the same-module private
do_parse_sse_buffer/1.
- Hackney pull-mode regression test was too weak — it asserted
user-agent and event delivery, both of which pass in push mode too
(push and pull deliver the same `{:hackney_response, conn, _}` shape;
the difference is who drives chunk delivery). Replace with a unit
test on the actual regression surface: extract `request_opts/3` and
assert `{:async, :once}` is present as a tuple and bare `:async` /
`:once` atoms are not. This is the only sound guardrail for the
proplist-shape bug — a future hackney bump that changes the option
form will fail this test loudly.
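The first fix relies on standard `Task.async/1` semantics: the caller is linked to and monitors the task, so a crash delivers a `:DOWN` message that can be matched instead of waiting out the timeout. A minimal sketch (the `{ref, :done}` protocol and `:task_died` shape follow the description above; the real backend's loop differs):

```elixir
# Trap exits: Task.async links the caller, so an untrapped crash would
# take the parent down before the :DOWN message could be handled.
Process.flag(:trap_exit, true)

task = Task.async(fn -> raise "boom" end)
task_ref = task.ref

result =
  receive do
    # Normal completion signal from the forwarding task.
    {^task_ref, :done} ->
      :done

    # Monitor message from a crashed task: surface the failure immediately.
    {:DOWN, ^task_ref, :process, _pid, reason} ->
      {:stream_error, %{reason: :task_died, detail: reason}}
  after
    5_000 ->
      # Without the :DOWN clause, every task crash would end up here.
      {:stream_error, %{reason: :timeout}}
  end

IO.inspect(result)
```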
Summary
- Fixes broken hackney pull-mode streaming, where cold/slow SSE backends surfaced as `:timeout` stream errors.
- Adds `Nous.HTTP.StreamBackend` mirroring the non-streaming pluggable backend from 0.15.1, with Req as the default (symmetry across streaming + non-streaming) and Hackney as opt-in (with the bug fixed) for callers who need strict pull-based pacing.
Root cause
`lib/nous/providers/http.ex` (0.15.0–0.15.3) passed `[:async, :once, ...]` as separate atoms to `:hackney.request/5`. Erlang's proplists resolves bare `:async` as `{:async, true}` (push mode); the bare `:once` is ignored. `:hackney.stream_next/1` is a no-op in push mode, so the receive loop appeared to work in many cases — chunks arrived in the same `{:hackney_response, conn, _}` shape — but the pacing came from the producer. The hackney docs (deps/hackney/NEWS.md:269-272) document the pull form as `[{async, once}]` — a tuple.

Changes
- `Nous.HTTP.StreamBackend` behaviour — same per-call → env → app config → default resolution chain as `Nous.HTTP.Backend`.
- `Nous.HTTP.StreamBackend.Req` (default) — `Req.post/1` with `:into` callback.
- `Nous.HTTP.StreamBackend.Hackney` (opt-in) — strict pull-based via the `[{:async, :once}]` tuple.
- `NOUS_HTTP_STREAM_BACKEND` env var (`req` | `hackney` | `MyApp.MyBackend`).
- `config :nous, :http_stream_backend, ...` app config.
- Provider stream normalizers (`Nous.StreamNormalizer.*`) untouched — they consume normalized events.
- SSE parser helpers promoted to `@doc false` public for backend reuse.

Empirical evidence (localhost LMStudio, `qwen3.5-0.8b-mlx@4bit`)

| Option form | Mode | `stream_next/1` | Mailbox after 2s idle |
|---|---|---|---|
| `[:async, :once, ...]` (current/broken) | push | no-op | producer-paced |
| `[{:async, :once}, ...]` (fix) | pull | drives next chunk | bounded at 2 messages |

End-to-end through `Nous.AgentRunner.run(_, _, stream: true)`: both backends produce deltas ("Hello! How can I help?").

TTFB benchmark (8 iters + warmup): Req 130ms vs hackney 133ms mean; total stream time within 5%.

All within ~5% — Req is marginally faster on TTFB, not slower.
Test plan
- `mix compile --warnings-as-errors` — clean
- `mix test` — 1640 tests, 0 failures
- `test/nous/http/stream_backend*` (resolution chain + bypass-driven Req + bypass-driven Hackney)
- `Stream.take/2` early-exit cleanup on both backends
- `NOUS_HTTP_STREAM_BACKEND` env var dispatch verified for both `req` and `hackney`

Migration
No code changes required for callers — default behaviour is restored to "streaming works against any healthy SSE backend." Apps that depend on strict pull-based backpressure should set:
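Using the app-config key described above, the opt-in looks like this (a sketch; adjust to your config layout):

```elixir
# config/config.exs — opt in to strict pull-based streaming via hackney
config :nous, :http_stream_backend, Nous.HTTP.StreamBackend.Hackney
```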