## Problem
ReqLLM streaming operations using `Finch.stream/5` fail when connections are closed by the server during idle periods. This is particularly problematic for LLM workflows that have long thinking times between streaming chunks.
## Current Behavior
When a streaming connection is closed (e.g., due to server-side idle timeout), the error is immediately propagated to the caller without retry:
```elixir
{:error, %Mint.TransportError{reason: :closed}}
{:error, %Mint.TransportError{reason: :timeout}}
{:error, %Mint.TransportError{reason: :econnrefused}}
```
This occurs in `ReqLLM.Streaming.FinchClient.start_streaming_task/6`, where `Finch.stream/5` is called directly without retry logic.
## Expected Behavior
Streaming operations should automatically retry on transient connection errors, similar to:

- How Req handles retries with the `retry: :transient` option (a minimal example follows this list)
- How `ReqLLM.Step.Retry` already handles non-streaming requests
- The pattern used in LangChain: "Add `retry: transient` to Req for Anthropic models in stream mode" (brainlid/langchain#329)
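
For reference, this is roughly how Req's transient retry is enabled on a non-streaming request (the URL is a placeholder, not a real ReqLLM endpoint):

```elixir
# With retry: :transient, Req retries transient transport errors such as
# :closed, :timeout, and :econnrefused, plus retryable HTTP statuses.
Req.get!("https://api.example.com/v1/models",
  retry: :transient,
  max_retries: 3
)
```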
## Proposed Solution
Implement automatic retry logic at the Finch streaming layer:

- **Create a retry module** (`ReqLLM.Streaming.Retry`) that wraps `Finch.stream/5` calls
- **Detect retryable errors:** `:closed`, `:timeout`, `:econnrefused`
- **Configurable retries:** default of 3 max attempts (consistent with `ReqLLM.Step.Retry`)
- **Immediate retry:** 0 ms delay (backoff can be added later)
- **Configuration options:** `streaming_max_retries` (default: 3) and `streaming_retry_delay` (default: 0)

A sketch of the retry module follows.
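
A minimal sketch of the proposed module: the name `ReqLLM.Streaming.Retry`, the function `stream_with_retry`, and the retryable error list come from this issue, while the closure-based signature and the rest of the implementation are assumptions:

```elixir
defmodule ReqLLM.Streaming.Retry do
  @moduledoc "Retries transient transport errors around Finch.stream/5 calls."

  # Transport error reasons this issue proposes treating as retryable.
  @retryable_reasons [:closed, :timeout, :econnrefused]

  @doc """
  Invokes `stream_fun` (a zero-arity closure around `Finch.stream/5`) and
  retries on transient transport errors, up to `max_retries` extra attempts
  with `delay_ms` milliseconds between them.
  """
  def stream_with_retry(stream_fun, max_retries \\ 3, delay_ms \\ 0, attempt \\ 1) do
    case stream_fun.() do
      {:error, %Mint.TransportError{reason: reason}} = error
      when reason in @retryable_reasons ->
        if attempt <= max_retries do
          Process.sleep(delay_ms)
          stream_with_retry(stream_fun, max_retries, delay_ms, attempt + 1)
        else
          error
        end

      result ->
        # Success, or a non-retryable error: pass through unchanged.
        result
    end
  end
end
```

Note that a retry restarts the stream from the beginning; deduplicating chunks already delivered before the disconnect would be the caller's concern and is out of scope for this sketch.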
## Implementation Details
### Files to Modify
**New file: `lib/req_llm/streaming/retry.ex`**

- Implement a `stream_with_retry/4` function
- Detect retryable vs. non-retryable errors
- Handle the retry loop with a configurable maximum number of attempts
**Modify: `lib/req_llm/streaming/finch_client.ex`**

- Wrap the `Finch.stream/5` call with retry logic (lines 152-226); a sketch of the call site follows this list
- Pass retry configuration from options or application config
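
As a hedged illustration of the call site (the variable names `request`, `finch_name`, `acc`, `handle_chunk`, and `opts` are placeholders, since this issue does not show the body of `start_streaming_task/6`):

```elixir
# Hypothetical wiring inside start_streaming_task/6.
max_retries = Keyword.get(opts, :streaming_max_retries, 3)
delay_ms = Keyword.get(opts, :streaming_retry_delay, 0)

ReqLLM.Streaming.Retry.stream_with_retry(
  fn -> Finch.stream(request, finch_name, acc, handle_chunk) end,
  max_retries,
  delay_ms
)
```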
**Modify: `lib/req_llm/provider/options.ex`**

- Add `streaming_max_retries` and `streaming_retry_delay` to the schema
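
Assuming the schema is a NimbleOptions-style keyword schema (the existing file is not shown in this issue), the new entries might look like:

```elixir
# Hypothetical schema entries; the keys and defaults come from this issue,
# the :type and :doc values are assumptions.
streaming_max_retries: [
  type: :non_neg_integer,
  default: 3,
  doc: "Maximum retry attempts for transient streaming connection errors."
],
streaming_retry_delay: [
  type: :non_neg_integer,
  default: 0,
  doc: "Delay in milliseconds between streaming retry attempts."
]
```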
## Example Usage
```elixir
# Application config
config :req_llm,
  streaming_max_retries: 3,
  streaming_retry_delay: 0
```

```elixir
# Per-request config
{:ok, response} =
  ReqLLM.stream_text(
    model,
    messages,
    streaming_max_retries: 5,
    streaming_retry_delay: 100
  )
```

## References
- LangChain PR with a similar fix: "Add `retry: transient` to Req for Anthropic models in stream mode" (brainlid/langchain#329)
- Existing non-streaming retry: the `ReqLLM.Step.Retry` module
- Req retry documentation: https://hexdocs.pm/req/Req.Steps.html#retry/1
## Environment
- ReqLLM version: 1.2.0
- Issue occurs specifically during streaming operations (not non-streaming)
- Affects all providers using Finch for streaming (Anthropic, OpenAI, etc.)
## Additional Context
This issue becomes more critical with extended thinking models (e.g., Claude Sonnet 4.5 with thinking enabled), where the model may spend 30+ seconds in the thinking phase without sending chunks, making idle connection timeouts more likely.