fix(provider): tolerate duplicate reasoning_content key from NVIDIA compat endpoint (TAURI-RUST-85R)#4227

Closed
rainbowpuffpuff wants to merge 2 commits into
tinyhumansai:mainfrom
rainbowpuffpuff:fix/4204-nvidia-duplicate-reasoning-content

@rainbowpuffpuff rainbowpuffpuff commented Jun 27, 2026

Contributor

Summary

NVIDIA's integrate.api.nvidia.com OpenAI-compatible endpoint returns the
`reasoning_content` key twice in a single `message` object for some thinking
models (e.g. stepfun-ai/step-3.7-flash). ResponseMessage and StreamDelta
deserialize through an inner Shadow struct using serde's derived deserializer,
which rejects a repeated key with `duplicate field reasoning_content` and
drops the entire completion (Sentry TAURI-RUST-85R: 2,037 events / 5 users, still
firing on 0.57.18). #[serde(default)] does not relax duplicate-key rejection, and
#3547 only fixed the distinct-name collision (`reasoning` + `reasoning_content`),
not two identical keys.
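
The rejection described above can be modelled without serde. The sketch below is an assumption-laden simplification, not the project's code: `StrictMessage` and `parse_strict` are hypothetical names, and a slice of key/value pairs stands in for the JSON object. It mirrors the per-field check a derived struct deserializer performs, where a second occurrence of a known key is an error no matter what default the field has.

```rust
/// Hypothetical stand-in for the inner Shadow struct (fields are illustrative).
#[derive(Debug, PartialEq)]
pub struct StrictMessage {
    pub content: Option<String>,
    pub reasoning_content: Option<String>,
}

/// Models the strict behaviour: the first occurrence of a key fills its slot,
/// a second occurrence of the same key aborts the whole parse.
pub fn parse_strict(entries: &[(&str, &str)]) -> Result<StrictMessage, String> {
    let mut content: Option<String> = None;
    let mut reasoning_content: Option<String> = None;

    for &(key, value) in entries {
        // Select the slot for this key; unknown keys are skipped.
        let slot = match key {
            "content" => &mut content,
            "reasoning_content" => &mut reasoning_content,
            _ => continue,
        };
        if slot.is_some() {
            // One repeated key is enough to lose the entire message.
            return Err(format!("duplicate field `{}`", key));
        }
        *slot = Some(value.to_string());
    }

    Ok(StrictMessage { content, reasoning_content })
}
```

Feeding it `content` followed by two `reasoning_content` entries returns the error instead of a message, which is the failure mode reported in Sentry.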

Fix

Replace the Shadow structs in both ResponseMessage::deserialize and
StreamDelta::deserialize with hand-written visit_map visitors that consume map
entries manually. Repeated keys no longer error: the last value wins, which is
how most JSON parsers resolve duplicate names (RFC 8259 leaves the behavior
unspecified). The canonical `reasoning_content` still wins over the `reasoning`
alias (#3547 preserved). Applies to both the buffered and SSE paths.
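
A minimal sketch of the folding rule, assuming a plain slice of key/value pairs in place of serde's `MapAccess`. `FoldedMessage` and `fold_lenient` are hypothetical names chosen for illustration; the real visitors live in compatible_types.rs and are not reproduced here.

```rust
/// Hypothetical result type; the real ResponseMessage has more fields.
#[derive(Debug, Default, PartialEq)]
pub struct FoldedMessage {
    pub content: Option<String>,
    pub reasoning_content: Option<String>,
}

/// Folds entries in arrival order. A repeated key simply overwrites the
/// earlier value, and the alias is consulted only when the canonical key
/// never appeared.
pub fn fold_lenient(entries: &[(&str, &str)]) -> FoldedMessage {
    let mut content = None;
    let mut canonical = None; // value seen under "reasoning_content"
    let mut alias = None; // value seen under "reasoning"

    for &(key, value) in entries {
        match key {
            "content" => content = Some(value.to_string()),
            "reasoning_content" => canonical = Some(value.to_string()),
            "reasoning" => alias = Some(value.to_string()),
            _ => {} // unknown fields are ignored
        }
    }

    FoldedMessage {
        content,
        // Canonical beats alias regardless of the order the keys arrived in.
        reasoning_content: canonical.or(alias),
    }
}
```

Keeping the canonical and alias values in separate slots until the end is what makes the precedence independent of key order.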

Tests

  • duplicate_reasoning_content_in_response_message_does_not_error
  • duplicate_reasoning_content_in_stream_delta_does_not_error
  • duplicate_reasoning_content_still_beats_reasoning_alias

Existing #3547 tests continue to pass.

Fixes #4204

Summary by CodeRabbit

  • Bug Fixes
    • Improved compatibility with OpenAI-compatible responses that may include repeated JSON fields.
    • Message and streaming parsing now tolerates duplicated reasoning_content, using the last occurrence.
    • Preserved fallback behavior so reasoning is used when reasoning_content is absent.
  • Tests
    • Added regression coverage for duplicated reasoning_content in both non-streaming and streaming responses, including precedence when reasoning and duplicated reasoning_content are both present.

…ompat endpoint (TAURI-RUST-85R)

NVIDIA's integrate.api.nvidia.com OpenAI-compatible endpoint returns the
`reasoning_content` key twice in a single `message` object for some thinking
models (e.g. stepfun-ai/step-3.7-flash). The derived `Shadow` deserializer used
by ResponseMessage and StreamDelta strict-rejected the repeated key with
`duplicate field reasoning_content`, dropping the entire completion.

Replace both Shadow structs with hand-folded `visit_map` visitors that consume
map entries manually, so a repeated key no longer errors (last value wins,
standard JSON object semantics). The tinyhumansai#3547 behaviour is preserved: the canonical
`reasoning_content` still wins over the `reasoning` alias. Applies to both the
buffered and SSE streaming paths.

Adds regression tests covering a doubled `reasoning_content` on the buffered and
streaming paths, plus the doubled-canonical-beats-alias case.

Fixes tinyhumansai#4204
@rainbowpuffpuff rainbowpuffpuff requested a review from a team June 27, 2026 04:54
@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 840663f4-ab1e-4639-9808-8de21b2571c0

📥 Commits

Reviewing files that changed from the base of the PR and between b44779f and 11e50a1.

📒 Files selected for processing (2)
  • src/openhuman/inference/provider/compatible_tests.rs
  • src/openhuman/inference/provider/compatible_types.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/inference/provider/compatible_tests.rs
  • src/openhuman/inference/provider/compatible_types.rs

Walkthrough

This PR updates OpenAI-compatible deserialization for ResponseMessage and StreamDelta to accept duplicated reasoning_content keys. It preserves canonical reasoning_content over reasoning and adds regression tests for message and streaming delta parsing.

Changes

Duplicate reasoning_content handling

| Layer / File(s) | Summary |
|---|---|
| ResponseMessage manual deserializer (src/openhuman/inference/provider/compatible_types.rs) | ResponseMessage now deserializes map entries with a manual visitor that accepts repeated keys, ignores unknown fields, and keeps reasoning_content ahead of reasoning. |
| StreamDelta manual deserializer (src/openhuman/inference/provider/compatible_types.rs) | StreamDelta uses the same manual visitor pattern for repeated keys, unknown fields, and reasoning_content versus reasoning folding. |
| Duplicate-key regressions (src/openhuman/inference/provider/compatible_tests.rs) | Tests cover duplicated reasoning_content in message and delta payloads and the precedence case where canonical reasoning_content overrides the alias. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • tinyhumansai/openhuman#3552: Updates the same ResponseMessage and StreamDelta deserialization path for reasoning_content and reasoning folding.

Poem

🐇 I sniffed out a key tucked twice in the clay,
Now message and delta both hop through just fine today.
The last leaf of reasoning_content wins the race,
While aliases stay snug in their proper place.
Hoppy JSON parsing! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: tolerating duplicate reasoning_content keys in NVIDIA compat parsing. |
| Linked Issues check | ✅ Passed | The PR matches issue #4204 by updating both ResponseMessage and StreamDelta to tolerate duplicate reasoning_content and adding regression tests. |
| Out of Scope Changes check | ✅ Passed | The changes stay within the parsing fix and related regression tests; no unrelated scope is evident. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/openhuman/inference/provider/compatible_types.rs (1)

547-606: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoff

Module exceeds the ~500-line size guideline.

compatible_types.rs now runs past 600 lines, and the two near-identical hand-folded visitors (ResponseMessageVisitor / StreamDeltaVisitor) add to it while duplicating the last-wins + reasoning_content-over-reasoning precedence logic that must stay in sync. Consider splitting this file along the canonical module shape and/or factoring the shared map-folding precedence logic into a small helper or macro_rules! to keep both deserializers from drifting.

As per coding guidelines: "Rust modules must be ≤ ~500 lines in size".
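
One way the suggested shared helper could look, sketched under assumptions: `ReasoningFold`, `Message`, `Delta`, `fold_message` and `fold_delta` are hypothetical names, the struct fields are illustrative, and key/value slices stand in for serde's map access. The point is only that both deserializers delegate the precedence rule to one place.

```rust
/// Hypothetical shared accumulator for the two reasoning keys.
#[derive(Default)]
pub struct ReasoningFold {
    canonical: Option<String>, // from "reasoning_content"
    alias: Option<String>,     // from "reasoning"
}

impl ReasoningFold {
    /// Records the value if `key` is a reasoning key (last value wins).
    /// Returns true when the entry was consumed.
    pub fn observe(&mut self, key: &str, value: &str) -> bool {
        match key {
            "reasoning_content" => {
                self.canonical = Some(value.to_string());
                true
            }
            "reasoning" => {
                self.alias = Some(value.to_string());
                true
            }
            _ => false,
        }
    }

    /// Canonical key takes precedence; the alias is only a fallback.
    pub fn finish(self) -> Option<String> {
        self.canonical.or(self.alias)
    }
}

pub struct Message {
    pub role: Option<String>,
    pub reasoning_content: Option<String>,
}

pub struct Delta {
    pub content: Option<String>,
    pub reasoning_content: Option<String>,
}

pub fn fold_message(entries: &[(&str, &str)]) -> Message {
    let mut role = None;
    let mut reasoning = ReasoningFold::default();
    for &(key, value) in entries {
        if reasoning.observe(key, value) {
            continue;
        }
        if key == "role" {
            role = Some(value.to_string());
        }
    }
    Message { role, reasoning_content: reasoning.finish() }
}

pub fn fold_delta(entries: &[(&str, &str)]) -> Delta {
    let mut content = None;
    let mut reasoning = ReasoningFold::default();
    for &(key, value) in entries {
        if reasoning.observe(key, value) {
            continue;
        }
        if key == "content" {
            content = Some(value.to_string());
        }
    }
    Delta { content, reasoning_content: reasoning.finish() }
}
```

With this shape each visitor only handles its own fields, so a change to the precedence rule cannot be applied to one deserializer and missed in the other.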

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/compatible_types.rs` around lines 547 - 606,
The issue is that compatible_types.rs has grown beyond the module size guideline
and duplicates the same hand-folded deserialization precedence logic in
ResponseMessageVisitor and StreamDeltaVisitor. Split the module into smaller
canonical pieces and factor the shared “last value wins” plus
reasoning_content-over-reasoning merge behavior into a common helper or small
macro, then use that shared logic from both deserializers so they stay
consistent and the file size drops below the limit.

Source: Coding guidelines

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7978e785-6979-4343-83ea-d38fede6dd56

📥 Commits

Reviewing files that changed from the base of the PR and between 5a41a4f and b44779f.

📒 Files selected for processing (2)
  • src/openhuman/inference/provider/compatible_tests.rs
  • src/openhuman/inference/provider/compatible_types.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 27, 2026
…hanges

Ran `cargo fmt --all` over the two files touched by this PR: collapse the short
match arms in the ResponseMessage visitor to single lines and rewrap the test's
serde_json::from_str(...).expect(...). Pure formatting, no logic change.

Refs tinyhumansai#4204
@rainbowpuffpuff

Contributor Author

Closing as a duplicate of #4207, which fixes #4204 (TAURI-RUST-85R) with the same map-fold Visitor approach and predates this PR — and additionally handles the last-non-null-wins edge case. Thanks @M3gA-Mind!
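
For readers unfamiliar with the distinction, the sketch below contrasts plain last-wins with one plausible reading of "last-non-null-wins". It is not taken from #4207, whose code is not shown in this thread; the function names are hypothetical, and `None` stands in for a JSON `null` value.

```rust
/// Plain last-wins: a trailing null overwrites an earlier value.
pub fn fold_last_wins(entries: &[(&str, Option<&str>)]) -> Option<String> {
    let mut reasoning_content = None;
    for &(key, value) in entries {
        if key == "reasoning_content" {
            reasoning_content = value.map(str::to_string);
        }
    }
    reasoning_content
}

/// Last-non-null-wins: a null occurrence never erases an earlier value.
pub fn fold_last_non_null(entries: &[(&str, Option<&str>)]) -> Option<String> {
    let mut reasoning_content: Option<String> = None;
    for &(key, value) in entries {
        if key == "reasoning_content" {
            if let Some(text) = value {
                reasoning_content = Some(text.to_string());
            }
        }
    }
    reasoning_content
}
```

The two functions differ only when a duplicated key carries a value first and `null` second: plain last-wins would discard the reasoning text, while the non-null variant keeps it.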

Development

Successfully merging this pull request may close these issues.

nvidia parse: tolerate duplicate reasoning_content key (TAURI-RUST-85R)

1 participant