Custom embeddings endpoint with no /embeddings API floods Sentry (unclassified 4xx) + no save-time validation #3625

Description

@oxoxDev

Source

Sentry (self-hosted): TAURI-RUST-5JR — `Embedding API error (404 Not Found):`
Events: 2685 · Users: 9 · First seen: 2026-05-28 · Last seen: 2026-06-12
Release: openhuman@0.57.18 · OS: Windows

Sibling: TAURI-RUST-4SA — `Embedding API error (400 Bad Request): "maximum input length is 8192 tokens"` (2761 events / 3 users, text-embedding-3-large). Same call site + flood mechanism; its root cause is already fixed on main by #3598 (`cap_embed_text` → 7500-token cap before `embed_batch`). 4SA needs only resolve-in-next-release + the shared classifier arm below as defense-in-depth.

Symptom

A user pointed the Custom (OpenAI-compatible) embeddings provider at a base URL whose host has no embeddings API (here DeepSeek, `https://api.deepseek.com/v1`, which is chat-only). Stored as `embedding_provider = "custom:https://api.deepseek.com/v1"`, `model = "DeepSeek"`. Every memory re-embed POSTs `/v1/embeddings` → permanent 404, re-emitted on every sync → 2685 Sentry events.
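To make the failure path concrete, here is a minimal sketch of how a stored `custom:` provider string could turn into the URL that 404s. The helper name `embeddings_url` and the join logic are assumptions for illustration; the actual code in `embeddings/openai.rs` may build the URL differently.

```rust
/// Hypothetical helper (not taken from the codebase): derive the embeddings
/// URL from a stored `embedding_provider` value.
/// Returns `None` when the value is not a `custom:` provider or has no base URL.
fn embeddings_url(provider: &str) -> Option<String> {
    // Assumption: custom providers are stored as "custom:<base-url>".
    let base = provider.strip_prefix("custom:")?;
    // Tolerate a trailing slash so we do not produce "//embeddings".
    let base = base.trim_end_matches('/');
    if base.is_empty() {
        return None;
    }
    // OpenAI-compatible servers expose embeddings under "<base>/embeddings".
    Some(format!("{base}/embeddings"))
}
```

With the value from this report, `embeddings_url("custom:https://api.deepseek.com/v1")` yields `https://api.deepseek.com/v1/embeddings`, the route that a chat-only host answers with 404.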

Root cause

  1. No save-time validation — `embeddings::rpc::update_settings` persists any provider/endpoint string. `test_connection` (a real embed probe) exists but the save path never calls it, so a no-embeddings endpoint is accepted silently.
  2. Unclassified flood — `embeddings/openai.rs` routes the non-2xx through `report_error_or_expected("embeddings","openai_embed")`. The `config_rejection.rs` classifier has 404 arms for chat/inference (`OpenHuman API error`, `custom_openai API error`, `model_not_found`) but none match `Embedding API error (4xx)`, so it fires a Sentry error on every retry instead of demoting + surfacing.
  3. No backpressure — a deterministic, never-self-healing config 404 keeps getting retried/re-emitted every sync with no circuit-break and no user-facing actionable state.
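The missing classifier arm from item 2 can be sketched roughly as follows. This is a standalone illustration, not the real `config_rejection.rs` API: the `Classification` enum and `classify_embedding_error` function are hypothetical names, and only the message prefix `Embedding API error (` comes from the Sentry titles above.

```rust
/// Hypothetical classification result (the real classifier's types may differ).
#[derive(Debug, PartialEq, Eq)]
enum Classification {
    /// Deterministic client-side/config problem: demote and surface to the user.
    ConfigRejection { status: u16 },
    /// Anything else keeps being reported as an error.
    Unexpected,
}

/// Match messages shaped like "Embedding API error (404 Not Found): ...".
fn classify_embedding_error(message: &str) -> Classification {
    const PREFIX: &str = "Embedding API error (";
    let Some(rest) = message.strip_prefix(PREFIX) else {
        return Classification::Unexpected;
    };
    // The status code is the run of digits right after the opening parenthesis.
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    match digits.parse::<u16>() {
        Ok(status) if (400..500).contains(&status) => {
            Classification::ConfigRejection { status }
        }
        _ => Classification::Unexpected,
    }
}
```

This sketch treats every 4xx as a config rejection, as the issue proposes. Whether transient codes such as 408 or 429 should be excluded from the arm is a design choice it leaves open.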

Stack (top in-app frame)

`src/openhuman/embeddings/openai.rs:210` — non-2xx → `report_error_or_expected(..., "embeddings", "openai_embed", ...)`

Reproduces on

  • Branch: upstream/main @ `a93fc6730` (confirmed — path + classifier + missing-validation all unchanged from the 0.57.18 release build)
  • Steps: Settings → Memory → embeddings provider = Custom (OpenAI-compatible), endpoint = `https://api.deepseek.com/v1` (any chat-only OpenAI-compatible base), save; trigger memory sync → 404 flood.

Proposed fix (prevent + handle — mirrors #3360)

  • Prevent (root cause): gate `update_settings` save on a connectivity probe (reuse `test_connection`) for custom endpoints; reject/warn so a no-embeddings endpoint never persists.
  • Handle (defense-in-depth): add an `Embedding API error (4xx)` config-rejection arm to the classifier (demote error→info) + rewrite the opaque body into an actionable "this endpoint has no embeddings API — pick an embeddings-capable provider in Settings → Memory". Optional circuit-break to stop re-embed hammering a known-bad config.
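A minimal sketch of the "prevent" half, assuming the probe can be abstracted behind a trait. `EmbedProbe`, `SaveError`, `validate_before_save`, and `FixedProbe` are hypothetical names introduced here; in the real code the probe would be whatever `test_connection` already does, and the gate would sit inside `update_settings`.

```rust
/// Hypothetical abstraction over the existing embed probe (`test_connection`).
trait EmbedProbe {
    /// Ok(()) if the endpoint answered an embed request; Err(reason) otherwise.
    fn probe(&self, endpoint: &str) -> Result<(), String>;
}

/// Hypothetical error returned to the settings UI instead of persisting.
#[derive(Debug, PartialEq)]
enum SaveError {
    ProbeFailed { endpoint: String, reason: String },
}

/// Gate the save: only custom endpoints are probed; other providers pass through.
fn validate_before_save(provider: &str, probe: &dyn EmbedProbe) -> Result<(), SaveError> {
    // Assumption: custom providers are stored as "custom:<base-url>".
    let Some(endpoint) = provider.strip_prefix("custom:") else {
        return Ok(());
    };
    probe.probe(endpoint).map_err(|reason| SaveError::ProbeFailed {
        endpoint: endpoint.to_string(),
        reason,
    })
}

/// Test double that always returns a fixed probe result.
struct FixedProbe(Result<(), String>);

impl EmbedProbe for FixedProbe {
    fn probe(&self, _endpoint: &str) -> Result<(), String> {
        self.0.clone()
    }
}
```

Whether a failed probe should hard-reject the save or only warn (the issue says "reject/warn") is left to the caller; this sketch just returns the failure so the UI can decide.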

This handling layer also closes the TAURI-RUST-4SA residual.
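The optional circuit-break could be as small as remembering which configs have already been rejected. The sketch below is an assumption about shape only: `EmbedCircuit` and its methods are hypothetical, and keying by the provider string is one possible choice.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory breaker keyed by provider/endpoint config string.
#[derive(Default)]
struct EmbedCircuit {
    tripped: HashSet<String>,
}

impl EmbedCircuit {
    /// Skip re-embed attempts for configs already known to be rejected.
    fn should_attempt(&self, config_key: &str) -> bool {
        !self.tripped.contains(config_key)
    }

    /// Record a config rejection. Returns true only the first time, so the
    /// caller can report/surface the problem once instead of on every sync.
    fn record_config_rejection(&mut self, config_key: &str) -> bool {
        self.tripped.insert(config_key.to_string())
    }

    /// Clear the breaker, e.g. after the user saves new embedding settings.
    fn reset(&mut self, config_key: &str) {
        self.tripped.remove(config_key);
    }
}
```

A sync loop would call `should_attempt` before embedding, `record_config_rejection` when the classifier flags a config 4xx, and `reset` when settings change, so a fixed configuration is retried immediately.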

Bug shape

Preventable user-state (capability/endpoint mismatch) — same family as #3360 (Ollama embedding-model-as-chat) and TAURI-RUST-35.

Sentry-Issue: TAURI-RUST-5JR (self-hosted id 6334)
