
fix(gateway): propagate generation params to inference request #1342

Open

Bhanudahiyaa wants to merge 1 commit into mofa-org:main from Bhanudahiyaa:fix/openai-generation-param-propagation

Conversation


@Bhanudahiyaa Bhanudahiyaa commented Mar 17, 2026

Summary

This PR ensures OpenAI-compatible generation controls (max_tokens, temperature) are carried through gateway translation into internal InferenceRequest instead of being silently
discarded.

Motivation

The API contract already exposes these parameters, but runtime translation dropped them. That mismatch causes surprising behavior and makes the gateway less trustworthy for
OpenAI-compatible clients.

Fixes: #1341

What changed

  • Added optional generation fields to internal inference contract:
    • InferenceRequest.max_tokens: Option
    • InferenceRequest.temperature: Option
  • Added builder helpers:
    • with_max_tokens(...)
    • with_temperature(...)
  • Added conversion helper on OpenAI request type:
    • ChatCompletionRequest::to_inference_request(required_memory_mb)
    • Copies model/prompt/priority and forwards max_tokens/temperature.
  • Updated gateway request paths to use this conversion:
    • HTTP OpenAI handler
    • WebSocket streaming handler
  • Updated inference bridge path to forward these same generation params.
  • Added tests to prevent regressions (including backward-compatible deserialization behavior for older request payloads without the new fields).
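The changes above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: field types (u32 for max_tokens, f32 for temperature), the shape of both structs, and the required_memory_mb type are assumptions; only the names come from the PR description.

```rust
// Sketch of the propagation fix: optional generation fields on the internal
// request, builder helpers, and a centralized conversion from the
// OpenAI-compatible request type. Types are assumptions for illustration.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct InferenceRequest {
    pub model: String,
    pub prompt: String,
    pub required_memory_mb: u64,
    pub max_tokens: Option<u32>,  // new optional generation field
    pub temperature: Option<f32>, // new optional generation field
}

impl InferenceRequest {
    pub fn with_max_tokens(mut self, n: u32) -> Self {
        self.max_tokens = Some(n);
        self
    }
    pub fn with_temperature(mut self, t: f32) -> Self {
        self.temperature = Some(t);
        self
    }
}

pub struct ChatCompletionRequest {
    pub model: String,
    pub prompt: String,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
}

impl ChatCompletionRequest {
    // Centralized conversion: every gateway entry point (HTTP handler,
    // WebSocket streaming) funnels through here, so generation params can no
    // longer be dropped in one path but not another.
    pub fn to_inference_request(&self, required_memory_mb: u64) -> InferenceRequest {
        InferenceRequest {
            model: self.model.clone(),
            prompt: self.prompt.clone(),
            required_memory_mb,
            max_tokens: self.max_tokens,   // forwarded instead of discarded
            temperature: self.temperature, // forwarded instead of discarded
        }
    }
}
```

Keeping the conversion in one method is what the "reduces future drift" design note refers to: a new entry point only needs to call `to_inference_request` rather than re-copy each field.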

Files changed

  • crates/mofa-foundation/src/inference/types.rs
  • crates/mofa-gateway/src/openai_compat/types.rs
  • crates/mofa-gateway/src/openai_compat/handler.rs
  • crates/mofa-gateway/src/streaming/ws.rs
  • crates/mofa-gateway/src/inference_bridge.rs

Design notes / tradeoffs

  • This PR focuses on propagation correctness, not full provider-side enforcement.
  • Keeping fields optional preserves backward compatibility.
  • Centralized conversion (to_inference_request) reduces future drift across multiple gateway entry points.

Tests added/updated

  • mofa-foundation inference type tests:
    • request builder includes new fields
    • serde roundtrip includes new fields
    • deserializing payloads without new fields defaults to None
  • mofa-gateway OpenAI type test:
    • to_inference_request propagates max_tokens and temperature
  • Existing openai handler tests still pass with updated request translation path.
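The backward-compatibility contract the tests pin down is: a payload from an older client that omits the new fields must parse with both set to None, not fail. The real code presumably relies on serde's handling of Option fields; this std-only stand-in (hypothetical names, no serde dependency) just illustrates that contract.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the deserialization regression test: fields
// absent from an older payload map to None rather than causing an error.
#[derive(Debug, PartialEq)]
struct GenerationParams {
    max_tokens: Option<u32>,
    temperature: Option<f32>,
}

fn parse_generation_params(raw: &HashMap<&str, &str>) -> GenerationParams {
    GenerationParams {
        // Missing or unparsable keys degrade to None, mirroring how
        // Option fields default when absent from a serialized payload.
        max_tokens: raw.get("max_tokens").and_then(|v| v.parse().ok()),
        temperature: raw.get("temperature").and_then(|v| v.parse().ok()),
    }
}
```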

Validation run

  • cargo test -p mofa-foundation --lib inference::types::tests
  • cargo test -p mofa-gateway --features openai-compat --lib openai_compat::types::tests
  • cargo test -p mofa-gateway --features openai-compat --lib openai_compat::handler::tests

Checklist

  • Focused fix for one problem
  • Propagation added across relevant gateway paths
  • Regression tests added
  • No unrelated functional changes
  • Full workspace fmt/clippy/test gates (can be run in CI / maintainers’ environment as needed)

———

@Bhanudahiyaa (Contributor, Author)

@lijingrs @BH3GEI @yangrudan

This change closes a contract gap between the OpenAI-compatible API layer and internal inference orchestration.

Previously, request-level generation controls (max_tokens, temperature) were accepted by the schema but dropped during translation, which produced silent behavior drift. This PR introduces explicit propagation via a centralized conversion helper and extends InferenceRequest with optional fields to maintain backward compatibility.

Question for maintainers

Should we follow up with explicit validation ranges at the gateway boundary (e.g., temperature bounds), or leave normalization/validation to downstream provider adapters?
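For concreteness, the first option in the question above could look something like this. Everything here is an assumption for discussion, not code from this PR: the function name is hypothetical, and the 0.0..=2.0 temperature range follows the OpenAI API convention rather than anything mofa currently enforces.

```rust
// Hypothetical gateway-boundary validation: reject out-of-range generation
// params before they reach provider adapters. Range choices are assumptions
// borrowed from the OpenAI API convention.
fn validate_generation_params(
    max_tokens: Option<u32>,
    temperature: Option<f32>,
) -> Result<(), String> {
    if let Some(t) = temperature {
        // NaN also fails this check, since contains() returns false for it.
        if !(0.0..=2.0).contains(&t) {
            return Err(format!("temperature {t} outside [0.0, 2.0]"));
        }
    }
    if let Some(n) = max_tokens {
        if n == 0 {
            return Err("max_tokens must be positive".to_string());
        }
    }
    Ok(())
}
```

The tradeoff is where errors surface: validating here gives clients a uniform 400-style error across providers, while deferring to adapters lets each provider apply its own limits at the cost of inconsistent failure modes.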



Linked issue

OpenAI gateway silently drops max_tokens and temperature before inference routing/execution
