
Conversation

@dragon1086
Contributor

@dragon1086 dragon1086 commented Nov 20, 2025

Summary

Add per-request reasoning_effort parameter override to support mixing reasoning and non-reasoning modes in multi-agent applications with GPT-5.1.

Problem

With GPT-5.1's release, OpenAI added reasoning_effort='none' for non-reasoning mode. However, mcp-agent only supported one global reasoning_effort setting per application.

Real-world issue:

  • My multi-agent project uses GPT-5 with reasoning for analysis agents
  • Simple utility agents used GPT-4.1 for fast responses
  • When upgrading utility agents to GPT-5.1, the global reasoning_effort setting created conflicts
  • Couldn't use GPT-5.1 with reasoning_effort='none' for some agents and reasoning_effort='high' for others

Solution

Allow reasoning_effort to be set per-request via RequestParams, with fallback to the config default.

Usage:

# Complex reasoning
await llm.generate_str(
    message="Analyze...",
    request_params=RequestParams(
        model="gpt-5.1",
        reasoning_effort="high"
    )
)

# Fast, no reasoning
await llm.generate_str(
    message="Format...",
    request_params=RequestParams(
        model="gpt-5.1",
        reasoning_effort="none"
    )
)
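
When reasoning_effort is omitted, the request falls back to the configured default ('medium' unless the config overrides it), so existing calls keep their current behavior:

# Default: omit reasoning_effort to use the config value
await llm.generate_str(
    message="Summarize...",
    request_params=RequestParams(model="gpt-5.1")
)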

Changes

  • Add reasoning_effort field to RequestParams with Literal["none", "low", "medium", "high"]
  • Support 'none' value in OpenAISettings config
  • Implement fallback: request param → config → default ('medium'); see the sketch after this list
  • Add 5 unit tests (145+ lines)
  • Update example demonstrating GPT-5.1 with reasoning_effort='none'
  • Update JSON schema
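
A minimal sketch of the fallback resolution, under the assumption that the config value is already resolved to a string (illustrative, not the actual method in augmented_llm_openai.py):

def resolve_reasoning_effort(request_effort, config_effort="medium"):
    # 1. per-request override, 2. OpenAISettings config value, 3. default 'medium'
    return request_effort if request_effort is not None else config_effort

resolve_reasoning_effort("none")  # -> 'none' (per-request override wins)
resolve_reasoning_effort(None)    # -> 'medium' (falls back to config/default)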

Testing

  • make format - passed
  • make lint - passed
  • make schema - updated
  • Added 5 comprehensive test cases
  • All existing tests pass

Backward Compatibility

✅ Fully backward compatible - existing applications work without changes.

Summary by CodeRabbit

  • New Features

    • Added "none" as a new reasoning effort option (in addition to "low", "medium", "high").
    • Enabled per-request override of reasoning effort settings, allowing individual requests to specify their own reasoning effort independent of the default configuration.
  • Tests

    • Added comprehensive test coverage for reasoning effort behavior across different model types and configuration scenarios.


Add ability to dynamically control reasoning_effort parameter on a
per-request basis for OpenAI reasoning models (o1/o3/o4/gpt-5 series).

Changes:
- Add reasoning_effort field to RequestParams with Literal type hints
- Support 'none', 'low', 'medium', 'high' values in config and runtime
- Implement fallback logic: request param -> config -> default ('medium')
- Add comprehensive unit tests for reasoning_effort behavior
- Update example to demonstrate gpt-5.1 with reasoning_effort override
- Update JSON schema to include 'none' value

The reasoning_effort parameter is OpenAI-specific and only applies to
reasoning models. Non-reasoning models ignore this parameter.

Tests: Added 5 new test cases covering all reasoning_effort scenarios
@coderabbitai

coderabbitai bot commented Nov 20, 2025

Walkthrough

The changes introduce per-request specification of OpenAI reasoning effort, adding a new optional reasoning_effort parameter to RequestParams that overrides the default configuration setting. The reasoning effort enum is expanded to include "none" across configuration, request parameters, and test coverage.

Changes

  • Configuration Schema Updates (schema/mcp-agent.config.schema.json, src/mcp_agent/config.py): Expanded the OpenAISettings.reasoning_effort enum to include "none" alongside the existing ["low", "medium", "high"] values in both the JSON schema and the Python config class.
  • Request Parameters & LLM Core (src/mcp_agent/workflows/llm/augmented_llm.py, src/mcp_agent/workflows/llm/augmented_llm_openai.py): Added an optional reasoning_effort field to RequestParams with type Literal["none", "low", "medium", "high"]; modified the OpenAI generate methods to use request-level reasoning_effort with fallback to the configured default.
  • Example Usage (examples/basic/functions/main.py): Demonstrates per-request reasoning effort override by passing RequestParams(model="gpt-5.1", reasoning_effort="none") to generate_str.
  • Test Coverage (tests/workflows/llm/test_augmented_llm_openai.py): Added comprehensive parameterized unit tests validating reasoning_effort payload inclusion, model-specific behavior (reasoning vs. non-reasoning models), fallback defaults, and enum value propagation.

Sequence Diagram

sequenceDiagram
    participant User
    participant augmented_llm as AugmentedLLM
    participant augmented_llm_openai as OpenAIAugmentedLLM
    participant OpenAI as OpenAI API

    User->>augmented_llm: generate_str(request_params=RequestParams(reasoning_effort="none"))
    augmented_llm->>augmented_llm_openai: Forward request_params
    augmented_llm_openai->>augmented_llm_openai: Check if params.reasoning_effort provided
    alt Request-level reasoning_effort exists
        augmented_llm_openai->>augmented_llm_openai: Use params.reasoning_effort
    else Fallback to default
        augmented_llm_openai->>augmented_llm_openai: Use self._reasoning_effort ("medium")
    end
    augmented_llm_openai->>OpenAI: POST with reasoning_effort in payload
    OpenAI-->>augmented_llm_openai: Response
    augmented_llm_openai-->>augmented_llm: Result
    augmented_llm-->>User: Generated output

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Logic modification in augmented_llm_openai.py: Verify the fallback logic correctly prioritizes request-level reasoning_effort over configuration defaults in both generate and generate_structured methods.
  • Parameter propagation chain: Ensure reasoning_effort flows correctly from RequestParams through the OpenAI payload construction without unintended side effects.
  • Test coverage completeness: Review parameterized tests for edge cases; verify that the distinction between reasoning models (o1/o3/o4/gpt-5 series) and non-reasoning models (e.g., gpt-4) is properly enforced and that max_completion_tokens is correctly used for reasoning models. A sketch of this branching follows the list.
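
A sketch of the branching described above; is_reasoning_model and build_completion_args are hypothetical stand-ins for logic that actually lives in augmented_llm_openai.py:

def is_reasoning_model(model: str) -> bool:
    # Stand-in prefix check for the o1/o3/o4/gpt-5 series
    return model.startswith(("o1", "o3", "o4", "gpt-5"))

def build_completion_args(model: str, reasoning_effort, max_tokens: int) -> dict:
    if is_reasoning_model(model):
        # Reasoning models take reasoning_effort and max_completion_tokens
        return {
            "reasoning_effort": reasoning_effort or "medium",
            "max_completion_tokens": max_tokens,
        }
    # Non-reasoning models use max_tokens and ignore reasoning_effort
    return {"max_tokens": max_tokens}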


Suggested reviewers

  • saqadri
  • jtcorbett

Poem

🐰 A reasoning effort hops through the chain,
From request params, no more config pain!
With "none," "low," "medium," "high" to choose,
The model decides—you can't lose!
Per-request control, oh what a gain! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: Passed (check skipped; CodeRabbit’s high-level summary is enabled)
  • Title Check: Passed (the title accurately describes the main change: adding a per-request reasoning_effort parameter for GPT-5.1, the core feature implemented across multiple files)
  • Docstring Coverage: Passed (docstring coverage is 90.00%, above the required 80.00% threshold)

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c9b364 and e2a23e1.

📒 Files selected for processing (6)
  • examples/basic/functions/main.py (2 hunks)
  • schema/mcp-agent.config.schema.json (1 hunks)
  • src/mcp_agent/config.py (1 hunks)
  • src/mcp_agent/workflows/llm/augmented_llm.py (2 hunks)
  • src/mcp_agent/workflows/llm/augmented_llm_openai.py (2 hunks)
  • tests/workflows/llm/test_augmented_llm_openai.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-22T18:59:49.368Z
Learnt from: CR
Repo: lastmile-ai/mcp-agent PR: 0
File: examples/usecases/reliable_conversation/CLAUDE.md:0-0
Timestamp: 2025-07-22T18:59:49.368Z
Learning: Applies to examples/usecases/reliable_conversation/examples/reliable_conversation/test_basic.py : Automated tests must cover multi-turn state persistence, requirement tracking, quality control pipeline (LLM and fallback), context consolidation, and research metrics collection.

Applied to files:

  • tests/workflows/llm/test_augmented_llm_openai.py
🧬 Code graph analysis (2)
tests/workflows/llm/test_augmented_llm_openai.py (2)
src/mcp_agent/workflows/llm/augmented_llm.py (3)
  • generate (210-215)
  • generate (341-346)
  • RequestParams (127-204)
src/mcp_agent/workflows/llm/augmented_llm_openai.py (1)
  • generate (186-423)
examples/basic/functions/main.py (1)
src/mcp_agent/workflows/llm/augmented_llm.py (1)
  • RequestParams (127-204)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks / test
🔇 Additional comments (13)
src/mcp_agent/workflows/llm/augmented_llm.py (2)

15-15: LGTM!

The Literal import is correctly added to support the new reasoning_effort type annotation.


199-204: LGTM!

The reasoning_effort parameter is well-designed:

  • Properly typed with all valid values including the new "none" option
  • Clear documentation indicating OpenAI-only applicability
  • Default value of None enables the fallback behavior (request → config → default)
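
A sketch of what the field plausibly looks like, assuming RequestParams is a Pydantic model (the description string is paraphrased, not the actual docstring):

from typing import Literal, Optional

from pydantic import BaseModel, Field

class RequestParams(BaseModel):
    reasoning_effort: Optional[Literal["none", "low", "medium", "high"]] = Field(
        default=None,
        description="OpenAI reasoning models only; None falls back to the configured default",
    )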
examples/basic/functions/main.py (2)

9-9: LGTM!

The import of RequestParams is correctly added to support the new per-request parameter override.


48-48: LGTM!

This is an excellent example demonstrating the new feature:

  • Shows explicit model selection (gpt-5.1)
  • Demonstrates the new reasoning_effort="none" option
  • Illustrates per-request override of reasoning behavior
src/mcp_agent/workflows/llm/augmented_llm_openai.py (2)

280-281: LGTM!

The fallback logic correctly implements the documented behavior: uses params.reasoning_effort when provided, otherwise falls back to the config default (self._reasoning_effort). This is only applied for reasoning models as expected.


562-564: LGTM!

The fallback logic is consistently applied in generate_structured as well, matching the implementation in generate. This ensures uniform behavior across both generation methods.

schema/mcp-agent.config.schema.json (1)

1334-1334: LGTM!

The JSON schema correctly reflects the addition of "none" as a valid value for reasoning_effort in OpenAISettings, maintaining consistency with the code changes in config.py.

src/mcp_agent/config.py (1)

421-421: LGTM!

The reasoning_effort type has been correctly expanded to include "none" while maintaining backward compatibility with the default value of "medium". This aligns with both the schema update and the new RequestParams field.

tests/workflows/llm/test_augmented_llm_openai.py (5)

694-719: LGTM!

Excellent test coverage for the basic reasoning_effort payload inclusion:

  • Correctly mocks select_model to return a reasoning model
  • Verifies the custom reasoning_effort value is passed to the API
  • Validates that max_completion_tokens is used (not max_tokens) for reasoning models

721-741: LGTM!

Good test for the fallback behavior, confirming that when reasoning_effort is not specified in the request, it correctly defaults to the config value ("medium").


743-769: LGTM!

Thorough parameterized testing of all valid reasoning_effort values ("none", "low", "medium", "high"), ensuring each value is correctly propagated to the API payload.


772-800: LGTM!

Critical test ensuring that reasoning_effort is not incorrectly applied to non-reasoning models:

  • Verifies reasoning_effort is absent from payload for non-reasoning models (gpt-4.1)
  • Confirms max_tokens is used instead of max_completion_tokens
  • Validates proper model type detection

803-838: LGTM!

Comprehensive validation of reasoning model detection across various model prefixes (o1, o3, o4, gpt-5 series). This ensures the feature works correctly for all supported reasoning models.


Member

@rholinshead rholinshead left a comment


Great, thanks @dragon1086 for the contribution and the attention to detail in following the contributing guidelines :)

@rholinshead rholinshead merged commit c67eea5 into lastmile-ai:main Dec 1, 2025
7 checks passed