
Conversation

@dragon1086
Contributor

@dragon1086 dragon1086 commented Nov 20, 2025

Summary

Add per-request reasoning_effort parameter override to support mixing reasoning and non-reasoning modes in multi-agent applications with GPT-5.1.

Problem

With GPT-5.1's release, OpenAI added reasoning_effort='none' for non-reasoning mode. However, mcp-agent only supported one global reasoning_effort setting per application.

Real-world issue:

  • My multi-agent project uses GPT-5 with reasoning for analysis agents
  • Simple utility agents used GPT-4.1 for fast responses
  • When upgrading utility agents to GPT-5.1, the global reasoning_effort setting created conflicts
  • Couldn't use GPT-5.1 with reasoning_effort='none' for some agents and reasoning_effort='high' for others

Solution

Allow reasoning_effort to be set per-request via RequestParams, with fallback to the config default.

Usage:

# Complex reasoning
await llm.generate_str(
    message="Analyze...",
    request_params=RequestParams(
        model="gpt-5.1",
        reasoning_effort="high"
    )
)

# Fast, no reasoning
await llm.generate_str(
    message="Format...",
    request_params=RequestParams(
        model="gpt-5.1",
        reasoning_effort="none"
    )
)
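
When reasoning_effort is omitted, the request falls back to the configured default ('medium' unless the config overrides it), so existing calls keep their current behavior:

# Default: omit reasoning_effort to use the config value
await llm.generate_str(
    message="Summarize...",
    request_params=RequestParams(model="gpt-5.1")
)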

Changes

  • Add reasoning_effort field to RequestParams with Literal["none", "low", "medium", "high"]
  • Support 'none' value in OpenAISettings config
  • Implement fallback: request param → config → default ('medium'); see the sketch after this list
  • Add 5 unit tests (145+ lines)
  • Update example demonstrating GPT-5.1 with reasoning_effort='none'
  • Update JSON schema
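
A minimal sketch of the fallback resolution, under the assumption that the config value is already resolved to a string (illustrative, not the actual method in augmented_llm_openai.py):

def resolve_reasoning_effort(request_effort, config_effort="medium"):
    # 1. per-request override, 2. OpenAISettings config value, 3. default 'medium'
    return request_effort if request_effort is not None else config_effort

resolve_reasoning_effort("none")  # -> 'none' (per-request override wins)
resolve_reasoning_effort(None)    # -> 'medium' (falls back to config/default)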

Testing

  • make format - passed
  • make lint - passed
  • make schema - updated
  • Added 5 comprehensive test cases
  • All existing tests pass

Backward Compatibility

✅ Fully backward compatible - existing applications work without changes.

Summary by CodeRabbit

  • New Features

    • Added "none" as a new reasoning effort option (in addition to "low", "medium", "high").
    • Enabled per-request override of reasoning effort settings, allowing individual requests to specify their own reasoning effort independent of the default configuration.
  • Tests

    • Added comprehensive test coverage for reasoning effort behavior across different model types and configuration scenarios.


Add ability to dynamically control reasoning_effort parameter on a
per-request basis for OpenAI reasoning models (o1/o3/o4/gpt-5 series).

Changes:
- Add reasoning_effort field to RequestParams with Literal type hints
- Support 'none', 'low', 'medium', 'high' values in config and runtime
- Implement fallback logic: request param -> config -> default ('medium')
- Add comprehensive unit tests for reasoning_effort behavior
- Update example to demonstrate gpt-5.1 with reasoning_effort override
- Update JSON schema to include 'none' value

The reasoning_effort parameter is OpenAI-specific and only applies to
reasoning models. Non-reasoning models ignore this parameter.

Tests: Added 5 new test cases covering all reasoning_effort scenarios
@coderabbitai

coderabbitai bot commented Nov 20, 2025

Walkthrough

The changes introduce per-request specification of OpenAI reasoning effort, adding a new optional reasoning_effort parameter to RequestParams that overrides the default configuration setting. The reasoning effort enum is expanded to include "none" across configuration, request parameters, and test coverage.

Changes

  • Configuration Schema Updates (schema/mcp-agent.config.schema.json, src/mcp_agent/config.py): Expanded the OpenAISettings.reasoning_effort enum to include "none" alongside the existing ["low", "medium", "high"] values in both the JSON schema and the Python config class.
  • Request Parameters & LLM Core (src/mcp_agent/workflows/llm/augmented_llm.py, src/mcp_agent/workflows/llm/augmented_llm_openai.py): Added an optional reasoning_effort field to RequestParams with type Literal["none", "low", "medium", "high"]; modified the OpenAI generate methods to use request-level reasoning_effort with fallback to the configured default.
  • Example Usage (examples/basic/functions/main.py): Demonstrates per-request reasoning effort override by passing RequestParams(model="gpt-5.1", reasoning_effort="none") to generate_str.
  • Test Coverage (tests/workflows/llm/test_augmented_llm_openai.py): Added comprehensive parameterized unit tests validating reasoning_effort payload inclusion, model-specific behavior (reasoning vs. non-reasoning models), fallback defaults, and enum value propagation.

Sequence Diagram

sequenceDiagram
    participant User
    participant augmented_llm as AugmentedLLM
    participant augmented_llm_openai as OpenAIAugmentedLLM
    participant OpenAI as OpenAI API

    User->>augmented_llm: generate_str(request_params=RequestParams(reasoning_effort="none"))
    augmented_llm->>augmented_llm_openai: Forward request_params
    augmented_llm_openai->>augmented_llm_openai: Check if params.reasoning_effort provided
    alt Request-level reasoning_effort exists
        augmented_llm_openai->>augmented_llm_openai: Use params.reasoning_effort
    else Fallback to default
        augmented_llm_openai->>augmented_llm_openai: Use self._reasoning_effort ("medium")
    end
    augmented_llm_openai->>OpenAI: POST with reasoning_effort in payload
    OpenAI-->>augmented_llm_openai: Response
    augmented_llm_openai-->>augmented_llm: Result
    augmented_llm-->>User: Generated output

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Logic modification in augmented_llm_openai.py: Verify the fallback logic correctly prioritizes request-level reasoning_effort over configuration defaults in both generate and generate_structured methods.
  • Parameter propagation chain: Ensure reasoning_effort flows correctly from RequestParams through the OpenAI payload construction without unintended side effects.
  • Test coverage completeness: Review parameterized tests for edge cases; verify that the distinction between reasoning models (o1/o3/o4/gpt-5 series) and non-reasoning models (e.g., gpt-4) is properly enforced and that max_completion_tokens is correctly used for reasoning models. A sketch of this branching follows the list.
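
A sketch of the branching described above; is_reasoning_model and build_completion_args are hypothetical stand-ins for logic that actually lives in augmented_llm_openai.py:

def is_reasoning_model(model: str) -> bool:
    # Stand-in prefix check for the o1/o3/o4/gpt-5 series
    return model.startswith(("o1", "o3", "o4", "gpt-5"))

def build_completion_args(model: str, reasoning_effort, max_tokens: int) -> dict:
    if is_reasoning_model(model):
        # Reasoning models take reasoning_effort and max_completion_tokens
        return {
            "reasoning_effort": reasoning_effort or "medium",
            "max_completion_tokens": max_tokens,
        }
    # Non-reasoning models use max_tokens and ignore reasoning_effort
    return {"max_tokens": max_tokens}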


Suggested reviewers

  • saqadri
  • jtcorbett

Poem

🐰 A reasoning effort hops through the chain,
From request params, no more config pain!
With "none," "low," "medium," "high" to choose,
The model decides—you can't lose!
Per-request control, oh what a gain! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: Passed (check skipped; CodeRabbit’s high-level summary is enabled)
  • Title Check: Passed (the title accurately describes the main change: adding a per-request reasoning_effort parameter for GPT-5.1, the core feature implemented across multiple files)
  • Docstring Coverage: Passed (docstring coverage is 90.00%, above the required 80.00% threshold)

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c9b364 and e2a23e1.

📒 Files selected for processing (6)
  • examples/basic/functions/main.py (2 hunks)
  • schema/mcp-agent.config.schema.json (1 hunks)
  • src/mcp_agent/config.py (1 hunks)
  • src/mcp_agent/workflows/llm/augmented_llm.py (2 hunks)
  • src/mcp_agent/workflows/llm/augmented_llm_openai.py (2 hunks)
  • tests/workflows/llm/test_augmented_llm_openai.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-22T18:59:49.368Z
Learnt from: CR
Repo: lastmile-ai/mcp-agent PR: 0
File: examples/usecases/reliable_conversation/CLAUDE.md:0-0
Timestamp: 2025-07-22T18:59:49.368Z
Learning: Applies to examples/usecases/reliable_conversation/examples/reliable_conversation/test_basic.py : Automated tests must cover multi-turn state persistence, requirement tracking, quality control pipeline (LLM and fallback), context consolidation, and research metrics collection.

Applied to files:

  • tests/workflows/llm/test_augmented_llm_openai.py
🧬 Code graph analysis (2)
tests/workflows/llm/test_augmented_llm_openai.py (2)
src/mcp_agent/workflows/llm/augmented_llm.py (3)
  • generate (210-215)
  • generate (341-346)
  • RequestParams (127-204)
src/mcp_agent/workflows/llm/augmented_llm_openai.py (1)
  • generate (186-423)
examples/basic/functions/main.py (1)
src/mcp_agent/workflows/llm/augmented_llm.py (1)
  • RequestParams (127-204)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks / test
🔇 Additional comments (13)
src/mcp_agent/workflows/llm/augmented_llm.py (2)

15-15: LGTM!

The Literal import is correctly added to support the new reasoning_effort type annotation.


199-204: LGTM!

The reasoning_effort parameter is well-designed:

  • Properly typed with all valid values including the new "none" option
  • Clear documentation indicating OpenAI-only applicability
  • Default value of None enables the fallback behavior (request → config → default)
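
A sketch of what the field plausibly looks like, assuming RequestParams is a Pydantic model (the description string is paraphrased, not the actual docstring):

from typing import Literal, Optional

from pydantic import BaseModel, Field

class RequestParams(BaseModel):
    reasoning_effort: Optional[Literal["none", "low", "medium", "high"]] = Field(
        default=None,
        description="OpenAI reasoning models only; None falls back to the configured default",
    )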
examples/basic/functions/main.py (2)

9-9: LGTM!

The import of RequestParams is correctly added to support the new per-request parameter override.


48-48: LGTM!

This is an excellent example demonstrating the new feature:

  • Shows explicit model selection (gpt-5.1)
  • Demonstrates the new reasoning_effort="none" option
  • Illustrates per-request override of reasoning behavior
src/mcp_agent/workflows/llm/augmented_llm_openai.py (2)

280-281: LGTM!

The fallback logic correctly implements the documented behavior: uses params.reasoning_effort when provided, otherwise falls back to the config default (self._reasoning_effort). This is only applied for reasoning models as expected.


562-564: LGTM!

The fallback logic is consistently applied in generate_structured as well, matching the implementation in generate. This ensures uniform behavior across both generation methods.

schema/mcp-agent.config.schema.json (1)

1334-1334: LGTM!

The JSON schema correctly reflects the addition of "none" as a valid value for reasoning_effort in OpenAISettings, maintaining consistency with the code changes in config.py.

src/mcp_agent/config.py (1)

421-421: LGTM!

The reasoning_effort type has been correctly expanded to include "none" while maintaining backward compatibility with the default value of "medium". This aligns with both the schema update and the new RequestParams field.

tests/workflows/llm/test_augmented_llm_openai.py (5)

694-719: LGTM!

Excellent test coverage for the basic reasoning_effort payload inclusion:

  • Correctly mocks select_model to return a reasoning model
  • Verifies the custom reasoning_effort value is passed to the API
  • Validates that max_completion_tokens is used (not max_tokens) for reasoning models

721-741: LGTM!

Good test for the fallback behavior, confirming that when reasoning_effort is not specified in the request, it correctly defaults to the config value ("medium").


743-769: LGTM!

Thorough parameterized testing of all valid reasoning_effort values ("none", "low", "medium", "high"), ensuring each value is correctly propagated to the API payload.


772-800: LGTM!

Critical test ensuring that reasoning_effort is not incorrectly applied to non-reasoning models:

  • Verifies reasoning_effort is absent from payload for non-reasoning models (gpt-4.1)
  • Confirms max_tokens is used instead of max_completion_tokens
  • Validates proper model type detection

803-838: LGTM!

Comprehensive validation of reasoning model detection across various model prefixes (o1, o3, o4, gpt-5 series). This ensures the feature works correctly for all supported reasoning models.


Member

@rholinshead rholinshead left a comment


Great, thanks @dragon1086 for the contribution and the attention to detail in following the contributing guidelines :)

@rholinshead rholinshead merged commit c67eea5 into lastmile-ai:main Dec 1, 2025
7 checks passed