OpenAI backend: drop unsupported params (top_p/top_k) and add backend docs #138
## Summary

This PR improves the OpenAI backend in UltraRAG by:

- Making it work with newer models (e.g. `o3-mini`) by dropping parameters that those models do not accept.
- Adding a documentation page for configuring the OpenAI backend.

## Motivation
When running `examples/rag_full.yaml` with the OpenAI backend and newer models, I hit repeated 400 errors from the OpenAI Chat Completions API, such as:

- `Unknown parameter: 'chat_template_kwargs'.`
- `Unsupported parameter: 'top_p' is not supported with this model.`

These errors occurred during the `generation.generate` step when calling `client.chat.completions.create`. They are not specific to my environment; they arise from the OpenAI API rejecting parameters that are either internal to local backends or not supported by certain models (notably reasoning models).

The goal of this PR is to make the OpenAI backend "just work" in these cases, while keeping behavior simple and predictable.
## Changes

**File: `servers/generation/src/generation.py`**

For `backend == "openai"`:

- Do not send `chat_template_kwargs` to OpenAI.
- Drop `top_p` and `top_k` from the sampling parameters for OpenAI backends, to avoid `unsupported_parameter` errors from models like `o3-mini`.
- Map `max_tokens` → `max_completion_tokens` for backwards compatibility.

The behavior for vLLM and HF backends is unchanged.
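The filtering described above can be sketched as follows. This is an illustrative sketch, not the actual code in `generation.py`; the function name and the shape of the params dict are assumptions:

```python
def prepare_openai_params(params: dict) -> dict:
    """Return a copy of the sampling params that is safe to send to the
    OpenAI Chat Completions API (illustrative sketch only)."""
    params = dict(params)  # work on a copy; don't mutate the caller's dict

    # Internal to local backends (vLLM/HF); OpenAI rejects it with a 400
    # "Unknown parameter" error.
    params.pop("chat_template_kwargs", None)

    # Not accepted by some models (e.g. o3-mini), so drop unconditionally
    # to avoid "unsupported_parameter" errors.
    params.pop("top_p", None)
    params.pop("top_k", None)

    # Newer models expect max_completion_tokens instead of max_tokens.
    if "max_tokens" in params:
        params["max_completion_tokens"] = params.pop("max_tokens")

    return params
```

The request then passes only the filtered dict, e.g. `client.chat.completions.create(model=model, messages=messages, **prepare_openai_params(params))`, so vLLM/HF code paths are untouched.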
**File: `docs/openai-backend.md`**

Adds a new documentation page that covers:

- Environment variables (`LLM_API_KEY`, `RETRIEVER_API_KEY`).
- Configuring `servers/generation/parameter.yaml` for `backend: openai`.
- A note that some models (e.g. `o3-mini`) do not support `top_p`, and that UltraRAG drops `chat_template_kwargs`, `top_p`, and `top_k` for OpenAI backends.
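For reference, a configuration along these lines is what the docs page describes. The key names in this excerpt are assumptions for illustration, not copied from the repository's actual `parameter.yaml` schema:

```yaml
# Illustrative excerpt of servers/generation/parameter.yaml -- key names
# here are assumptions, not the file's actual schema.
backend: openai
model_name: o3-mini
sampling_params:
  temperature: 1.0
  max_tokens: 1024   # mapped to max_completion_tokens for OpenAI
  # top_p / top_k may still appear here; they are dropped for OpenAI backends
```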
## Impact and compatibility

- Existing configurations keep working even if `top_p`/`top_k` are configured in YAML; those values are simply not sent for OpenAI backends.
- For models that do support `top_p`/`top_k`, the current behavior is conservative: those params are not sent for OpenAI backends. If there is interest in exposing them conditionally per model, I'm happy to adjust based on maintainer feedback.
## Testing

- Ran `ultrarag build` and `ultrarag run` with `examples/rag_full.yaml` using the OpenAI backend.
- Verified that the previous 400 errors (`chat_template_kwargs`, `top_p`) no longer occur and the pipeline runs to completion.

If you'd like additional tests or want the docs integrated into a specific docs navigation system, I'm happy to follow your conventions.