feat(embedder): configurable request timeout and max retries#257

Open
kryptt wants to merge 1 commit into yoanbernabeu:main from kryptt:pr/embedder-timeout
kryptt commented May 28, 2026

Problem

The HTTP client timeout (60s) and retry max-attempts (5) used by the embedders are compiled-in constants. They're well-tuned for OpenAI's hosted API but break down against slow self-hosted endpoints — a shared `ollama-cuda` instance handling other inference jobs can easily take > 60s to return a single batch's embeddings, especially under model warm-up or memory pressure.

Symptom I hit while building a large workspace through a self-hosted Ollama:

```
Warning: failed to initialize runtime for emacs-source: initial indexing failed:
failed to embed batches: failed to send request to OpenAI:
Post "https://ollama.hr-home.xyz/v1/embeddings":
context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```

The watcher abandons the whole project on the first batch that exceeds 60s, even though the underlying queue is still making forward progress on the server side. Larger projects (in my case ~47k chunks across emacs-source) become un-indexable end-to-end.

Solution

Two new optional fields on `EmbedderConfig`:

```yaml
embedder:
  provider: openai
  request_timeout_seconds: 600  # default: 0 = preserve historical 60s
  max_retries: 8                # default: 0 = preserve historical 5
```

  • `request_timeout_seconds` becomes the `http.Client.Timeout` for the chosen embedder. Each of the five embedders (openai / ollama / lmstudio / openrouter / synthetic) gets a `With…Timeout` option; the factory threads the value through. Synthetic's existing 90s default is preserved when the field is unset.
  • `max_retries` is honored by providers that already implement retry today — only `openai` does, so the field's docstring calls this out. Other providers silently ignore it (no behavior change). For openai, the value caps `RetryPolicy.MaxAttempts`.

Both fields are zero-valued by default and *explicitly preserve existing behavior* in that case. A bare `embedder:` block produces the same embedder as before this PR.

Test plan

Two new tests in `embedder/factory_test.go`:

| Test | Asserts |
| --- | --- |
| `TestNewFromConfig_RequestTimeoutAndMaxRetries` | `request_timeout_seconds: 300` results in a 5-minute `client.Timeout`; `max_retries: 9` results in `retryPolicy.MaxAttempts == 9` |
| `TestNewFromConfig_TimeoutDefaultsPreserved` | With both fields unset, defaults stay at 60s / 5 attempts |

All existing tests continue to pass (`go test ./...`).

Two new optional EmbedderConfig fields:

  request_timeout_seconds: HTTP client timeout per embedding request.
    Defaults preserved (60s, except synthetic which keeps its 90s).
    Useful when running against slow self-hosted endpoints where a
    full batch can take longer than a minute, leading to spurious
    "context deadline exceeded" errors mid-scan.

  max_retries: caps RetryPolicy.MaxAttempts for transient (429/5xx)
    failures. Default preserved at 5. Only honored by providers that
    implement retry today (openai); other providers silently ignore.

Adds With…Timeout option to all five embedders and WithOpenAIMaxRetries
to the openai embedder. Factory threads both through. Tests cover the
override path and the default-preservation path.