…0.15.5)
Two small changes in one commit:
1. Both Req-based HTTP backends (non-streaming and streaming) now actually use the supervised Nous.Finch pool (started by Nous.Application with size: 10, count: 1). Previously they ignored the :finch_name opt that Nous.Provider built and let Req spin up its own default Finch instance, so the named pool sat idle. Both backends now read :finch_name from per-call opts, falling back to Application.get_env(:nous, :finch, Nous.Finch). Side note: Req disallows passing both :finch and :connect_options, so connect timeouts are now pool-level (configure on the Finch pool itself if a non-default is needed; receive timeouts still apply per-call).
2. Default timeouts bumped from 60s to 180s (3 min) across the HTTP layer, model defaults, and provider backstops. The 60s default was tripping on reasoning models and longer completions. Local provider defaults stay at 120s (lmstudio/ollama/vllm/sglang), llamacpp moves to 5 min for cold-start weights, and streaming timeouts move to 5 min. Per-call :timeout and :receive_timeout continue to override.
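The :finch_name resolution from point 1 reduces to a keyword lookup with an app-env fallback. A minimal sketch of the shape (module and function names here are hypothetical, not the actual Nous source; the option keys, the app-env fallback chain, and the pool settings come from the message above):

```elixir
# Hypothetical backend helper — not the actual Nous implementation.
defmodule MyApp.HTTPSketch do
  # Per-call :finch_name wins; otherwise fall back to the app-configured
  # pool, defaulting to the supervised Nous.Finch.
  def request(url, opts \\ []) do
    finch_name =
      Keyword.get(opts, :finch_name, Application.get_env(:nous, :finch, Nous.Finch))

    # Req disallows combining :finch with :connect_options, so connect
    # timeouts live on the pool; :receive_timeout stays per-call.
    Req.post(url,
      finch: finch_name,
      receive_timeout: Keyword.get(opts, :receive_timeout, 180_000)
    )
  end
end

# Supervision-tree shape per the commit message (the conn_opts connect
# timeout is a hypothetical illustration of pool-level configuration):
# {Finch,
#  name: Nous.Finch,
#  pools: %{default: [size: 10, count: 1, conn_opts: [transport_opts: [timeout: 30_000]]]}}
```

Given the Application.get_env fallback, pointing the app at a different pool should only need `config :nous, :finch, MyApp.Finch` in config.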
Both changes verified end-to-end against running LMStudio: 13/13 pass on a smoke matrix covering both backends in non-streaming and streaming modes (env var dispatch, per-call opt, default), agent-level streaming (Nous.AgentRunner.run/3 with stream: true), Hackney backpressure (mailbox stays at 2 msgs after 2s no-drain), and Stream.take/2 early-exit cleanup. Full mix test: 1640/1640 green.
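The timeout precedence in point 2 is a plain keyword fallback; a minimal sketch (module and function names are hypothetical — only the option key and the default values come from the commit message):

```elixir
# Hypothetical helper showing the precedence: a per-call :receive_timeout
# wins, otherwise the provider default applies (180_000 ms general,
# 120_000 ms for lmstudio/ollama/vllm/sglang, 300_000 ms for llamacpp
# and for streaming, per the commit message).
defmodule MyApp.TimeoutSketch do
  @default_receive_timeout 180_000

  def receive_timeout(opts, provider_default \\ @default_receive_timeout) do
    Keyword.get(opts, :receive_timeout, provider_default)
  end
end

MyApp.TimeoutSketch.receive_timeout([])                         # 180_000
MyApp.TimeoutSketch.receive_timeout(receive_timeout: 300_000)   # 300_000
```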