Skip to content

fix: Vertex/Gemini whitespace crash + surface retry-after hints (0.15.8)#55

Merged
nyo16 merged 3 commits intomasterfrom
fix/vertex-whitespace-and-retry-info
May 6, 2026
Merged

fix: Vertex/Gemini whitespace crash + surface retry-after hints (0.15.8)#55
nyo16 merged 3 commits intomasterfrom
fix/vertex-whitespace-and-retry-info

Conversation

@nyo16
Copy link
Copy Markdown
Owner

@nyo16 nyo16 commented May 6, 2026

Summary

Fixes a production crash in Vertex AI / Gemini response parsing and adds first-class support for server-suggested retry-after hints across all providers.

The crash

Vertex/Gemini occasionally return text parts whose content is only whitespace (e.g. "\n\n\n") — typically between tool calls or as filler when the model is partially blocked. Ecto's default :empty_values for cast/3 treats
whitespace-only strings as empty, so Nous.Message.ContentPart's changeset dropped the content field entirely and then raised:

** (Ecto.InvalidChangesetError) could not perform  because changeset is invalid.

Errors

    %{content: [{"content is required", []}]}

Params

    %{"content" => "\n\n\n", "type" => :text}

…taking down the whole Nous.LLM.run_with_tools/6 call from inside background jobs.

What this PR does

Fixes

  • Nous.Message.ContentPart overrides :empty_values to [""] so legitimate whitespace content is preserved.
  • Nous.Messages.Gemini.parse_content/1 defensively skips whitespace-only text parts (matching the existing guard in Nous.StreamNormalizer.Gemini).
  • Nous.Messages.Gemini.parse_content/1 no longer silently drops nullary functionCall parts (those without args); pattern now requires only name and falls back to %{}.

Adds — retry-after surfacing

  • New Nous.Errors.RetryInfo parses google.rpc.RetryInfo from Vertex/Gemini error bodies and the standard Retry-After header into milliseconds. Returns nil when the server gives no hint — itself meaningful for Google APIs,
    since long-term/daily quota exhaustion deliberately omits RetryInfo to discourage retry loops.

  • Nous.Errors.ProviderError gains :retry_after_ms alongside the existing :status_code. Both fields are populated automatically by Nous.Provider.request/3 and request_stream/3 when the HTTP layer returns an error tuple.
    Callers can now branch on rate limits without parsing provider-specific bodies:

    case Nous.LLM.run_with_tools(...) do
      {:error, %Nous.Errors.ProviderError{retry_after_ms: ms}} when is_integer(ms) ->
        {:snooze, ms}                     # server-suggested delay
      {:error, %Nous.Errors.ProviderError{status_code: 429}} ->
        {:snooze, exp_backoff(attempt)}   # rate-limited, no hint
      ...
    end

Adds — Gemini diagnostics

  • Nous.Messages.Gemini.from_response/1 captures finishReason and promptFeedback into message.metadata and emits a Logger.warning when content is empty for non-STOP reasons (SAFETY, RECITATION, MAX_TOKENS, etc.) or a
    prompt block. Previously these signals were discarded, so blocked generations manifested as silent empty messages.

Changes — HTTP error shape

  • Nous.HTTP.Backend.Req, Nous.HTTP.Backend.Hackney, and Nous.HTTP.StreamBackend.Req now return {:error, %{status, body, headers}} instead of %{status, body}, surfacing response headers so Retry-After can be parsed.
    headers is a list of {name, value} tuples. Existing pattern matches on %{status: _, body: _} continue to work — map matching is non-exhaustive.

Changes — tool-call IDs

  • Unified Gemini tool-call ID generation. The previous parse_content/1 used "gemini_#{:rand.uniform(10_000)}" (~50% birthday-paradox collision at ~118 calls) while parse_parts/1 used "call_#{:rand.uniform(1_000_000)}". Both
    now share a generate_tool_call_id/0 helper using 64 bits of :crypto.strong_rand_bytes/1, base64url-encoded.

Test plan

  • mix test — 1674 tests, 0 failures (24 new tests).
  • New Nous.Errors.RetryInfoTest — 16 tests covering body parsing (google.rpc.RetryInfo, fractional durations, missing/malformed entries), header parsing (Retry-After integer, case-insensitive, non-integer/HTTP-date
    rejection, negative values), precedence (body wins over header), and edge cases (non-map input, empty maps).
  • Extended Nous.HTTP.BackendTest against both Req and Hackney via Bypass: verifies Retry-After header round-trips through RetryInfo, and that a Vertex-shaped 429 body with RetryInfo is parsed end-to-end.
  • Extended Nous.ProviderTest: request/3 correctly populates :status_code from HTTP error tuples, :retry_after_ms from both body and header sources, and leaves both nil for transport errors.
  • Regression tests in Nous.MessagesGeminiTest: whitespace-only text parts are skipped (the original Vertex "\n\n\n" payload), finishReason and promptFeedback land in metadata, functionCall without args is parsed.
  • Regression test in Nous.Message.ContentPartTest: ContentPart.text("\n\n\n") no longer raises; nil and "" content are still rejected.
  • mix compile --warnings-as-errors --force clean.

Migration notes

  • HTTP error tuple shape change is additive. Code matching %{status: _, body: _} is unchanged. Code matching %{status: _, body: _} = err and reaching for Map.keys(err) will see an extra :headers key.
  • ProviderError.retry_after_ms is new; existing pattern matches on the struct continue to compile because all fields default to nil.

nyo16 added 2 commits May 6, 2026 11:38
- ContentPart.new/1: Ecto's default :empty_values trimmed whitespace,
  which crashed ContentPart.text/1 on legitimate Gemini text parts
  containing only newlines. Override empty_values to [""] so "\n\n\n"
  is preserved.
- Messages.Gemini.parse_content/1: skip whitespace-only text parts
  defensively, capture finishReason and promptFeedback in metadata,
  log a warning when content is empty for non-STOP reasons (SAFETY,
  RECITATION, MAX_TOKENS) so blocked generations stop being silent.
  Also handle functionCall without args, and unify tool-call ID
  generation to a single 64-bit-entropy helper.
- Add Nous.Errors.RetryInfo: parses google.rpc.RetryInfo from error
  body and Retry-After header into milliseconds.
- ProviderError gains :retry_after_ms; Provider.request/3 now populates
  :status_code and :retry_after_ms from HTTP error tuples.
- HTTP backends (Req, Hackney, stream Req) surface response headers in
  error tuples so RetryInfo can extract Retry-After.
@nyo16 nyo16 changed the title fix: Vertex/Gemini whitespace crash + surface retry-after hints fix: Vertex/Gemini whitespace crash + surface retry-after hints (0.15.8) May 6, 2026
Req returns headers as %{binary() => [binary()]} unconditionally and
hackney returns them as a list — the is_list / catch-all fallbacks I
added were flagged as pattern_match_cov.
@nyo16 nyo16 merged commit f97aab2 into master May 6, 2026
6 checks passed
@nyo16 nyo16 deleted the fix/vertex-whitespace-and-retry-info branch May 6, 2026 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant