Add reasoning_style override for inline-tag thinking blocks on OpenAI chat-completions (MiniMax M3, Qwen, GLM) #3222

@buko

Description

Executive Summary

  • The parsing of reasoning content from MiniMax M3 is broken in CodeWhale when the MiniMax Token Plan is used as an 'OpenAI-compatible' provider.
  • An existing issue, "v0.8.60: Add first-party MiniMax provider route", adds MiniMax Token Plan as a supported provider. This is a step in the right direction, but it does not address the more fundamental problem: developers using CodeWhale with OpenAI-compatible providers need to be able to override how the TUI parses reasoning content. This is especially true for developers running private models that would never be added to the well-known Providers enum.
  • This patch introduces an "OpenAI Reasoning Style Override": a configuration option that developers can use to change how CodeWhale parses reasoning content from any OpenAI-compatible provider.

Problem

When MiniMax M3 is reached through CodeWhale's generic OpenAI-compatible provider, reasoning is streamed as inline <think>…</think> blocks inside delta.content instead of a separate delta.reasoning_content field. CodeWhale's parser has no inline-tag handling, so the literal tags appear in the visible chat. The same problem affects Qwen thinking models on raw vLLM/Ollama and GLM models on aggregators that don't have a GLM-specific parser. The format depends on the serving gateway, not the model identity.
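
To make the gap concrete, here is a minimal Python sketch of the kind of inline-tag handling the parser lacks. It is not CodeWhale's code: the class name InlineTagSplitter and its methods are hypothetical, and it assumes the tags are exactly <think> and </think>. The point it illustrates is that a tag can be split across two stream chunks, so the splitter has to hold back any suffix that might be the start of a tag.

```python
from dataclasses import dataclass

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


@dataclass
class InlineTagSplitter:
    """Hypothetical helper: splits streamed delta.content into reasoning and visible text."""

    in_think: bool = False
    buffer: str = ""  # text held back because it may be the start of a tag

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        """Return (kind, text) pairs, where kind is "reasoning" or "content"."""
        self.buffer += chunk
        out: list[tuple[str, str]] = []
        while self.buffer:
            tag = CLOSE_TAG if self.in_think else OPEN_TAG
            kind = "reasoning" if self.in_think else "content"
            idx = self.buffer.find(tag)
            if idx != -1:
                # A complete tag: emit what precedes it, then flip state.
                if idx:
                    out.append((kind, self.buffer[:idx]))
                self.buffer = self.buffer[idx + len(tag):]
                self.in_think = not self.in_think
                continue
            # No complete tag: keep the longest suffix that could begin one.
            keep = 0
            for n in range(min(len(tag) - 1, len(self.buffer)), 0, -1):
                if tag.startswith(self.buffer[-n:]):
                    keep = n
                    break
            cut = len(self.buffer) - keep
            if cut:
                out.append((kind, self.buffer[:cut]))
            self.buffer = self.buffer[cut:]
            break
        return out

    def flush(self) -> list[tuple[str, str]]:
        """Emit whatever is still buffered when the stream ends."""
        kind = "reasoning" if self.in_think else "content"
        rest, self.buffer = self.buffer, ""
        return [(kind, rest)] if rest else []
```

Feeding the chunks "<thi", "nk>plan", " it</th", "ink>Hello", "!" through one splitter yields the reasoning text "plan it" and the visible text "Hello!", with no tag fragments leaking into either.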

Proposed solution

A new [overrides] config bucket lets the user declare which reasoning format a given protocol emits. The change to ~/.codewhale/config.toml looks like this:

[overrides.openai.protocol]
reasoning_style = "https://codewhale.dev/configuration/reasoning_style#inline_tags"

Built-in fragments (all under https://codewhale.dev/configuration/reasoning_style):

  • #separate_field (default) — reasoning arrives in delta.reasoning_content or delta.reasoning. Current behavior.
  • #inline_tags — reasoning arrives as <think>…</think> blocks inside delta.content. Used by M3, Qwen thinking family, GLM, and similar.
  • #none — no reasoning support. Treats the content stream as plain text.

The fragment is the strategy selector; the base URL is the namespace. New strategies ship as new fragments, not new keys or new code paths. Plugins can register custom strategies by pointing the URL elsewhere.
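
One way to read "the fragment is the strategy selector; the base URL is the namespace" is as a registry keyed by (namespace, fragment). The Python sketch below is only an illustration of that idea: REGISTRY, register and handler_for are hypothetical names, the https://example.com/... URL in the test stands in for a plugin namespace, and the stateful #inline_tags handler is left out for brevity.

```python
from typing import Callable
from urllib.parse import urldefrag

BASE = "https://codewhale.dev/configuration/reasoning_style"

# A handler takes one chat-completions stream delta and returns split text.
Handler = Callable[[dict], dict]


def separate_field(delta: dict) -> dict:
    """Reasoning arrives in its own field; content is already clean."""
    reasoning = delta.get("reasoning_content") or delta.get("reasoning") or ""
    return {"reasoning": reasoning, "content": delta.get("content") or ""}


def no_reasoning(delta: dict) -> dict:
    """Treat the content stream as plain visible text."""
    return {"reasoning": "", "content": delta.get("content") or ""}


# Hypothetical registry: the key is (namespace URL, fragment).
REGISTRY: dict[tuple[str, str], Handler] = {
    (BASE, "separate_field"): separate_field,
    (BASE, "none"): no_reasoning,
}


def register(uri: str, handler: Handler) -> None:
    """Let a plugin add a strategy under its own namespace URL."""
    namespace, fragment = urldefrag(uri)
    REGISTRY[(namespace, fragment)] = handler


def handler_for(uri: str) -> Handler:
    """Look up the strategy selected by a reasoning_style URI."""
    namespace, fragment = urldefrag(uri)
    return REGISTRY[(namespace, fragment)]
```

Because the key includes the namespace, a plugin strategy with the same fragment name as a built-in one would not collide with it.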

Resolution order, from highest to lowest priority (the first source that provides a value wins): CLI flag → environment variable → [overrides] block → per-provider config → per-model config → hardcoded heuristic.
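
The resolution order can be sketched as a first-match walk over the sources. This is an illustrative Python sketch, not the patch itself: the source names in PRECEDENCE and the function resolve_reasoning_style are hypothetical labels for the six sources listed above.

```python
# Highest priority first, mirroring the order stated above.
# The names are hypothetical labels, not real config keys.
PRECEDENCE = (
    "cli_flag",
    "env_var",
    "overrides_block",
    "provider_config",
    "model_config",
    "heuristic",
)


def resolve_reasoning_style(sources: dict) -> tuple[str, str]:
    """Return (source_name, value) for the first source that is set."""
    for name in PRECEDENCE:
        value = sources.get(name)
        if value:  # missing, None, or empty means "not set here"
            return name, value
    raise LookupError("no source provided a reasoning_style")
```

Returning the winning source name alongside the value makes it easy to tell the user where the effective setting came from.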

A typo in the URI or an unknown key is a startup error with a clear message. No silent fallbacks.
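
A fail-fast check along these lines might look like the Python sketch below. It assumes the [overrides.openai.protocol] table has already been parsed into a dict. ConfigError, validate_protocol_overrides and the "did you mean" hint are hypothetical details added for illustration.

```python
import difflib

BASE = "https://codewhale.dev/configuration/reasoning_style"
KNOWN_STYLES = sorted(f"{BASE}#{frag}" for frag in ("separate_field", "inline_tags", "none"))
KNOWN_KEYS = ["reasoning_style"]


class ConfigError(ValueError):
    """Raised at startup so a bad override never silently falls back."""


def _hint(value: str, candidates: list) -> str:
    """Suggest the closest known value, if there is a plausible one."""
    close = difflib.get_close_matches(value, candidates, n=1)
    return f" Did you mean '{close[0]}'?" if close else ""


def validate_protocol_overrides(table: dict) -> str:
    """Check a parsed override table and return the selected style URI."""
    for key in table:
        if key not in KNOWN_KEYS:
            raise ConfigError(f"unknown override key '{key}'." + _hint(key, KNOWN_KEYS))
    # Absent key: fall back to the documented default, #separate_field.
    style = table.get("reasoning_style", f"{BASE}#separate_field")
    if style not in KNOWN_STYLES:
        raise ConfigError(f"unknown reasoning_style '{style}'." + _hint(style, KNOWN_STYLES))
    return style
```

This sketch only knows the three built-in fragments. A version that supports plugin strategies would check against whatever registry those strategies are added to.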

Use case

The format depends on the gateway, not the model. A user running Qwen via raw Ollama needs #inline_tags; the same model via Alibaba Cloud DashScope needs #separate_field. With this change, the user picks the right strategy per gateway with a single config line — no code change, no model-name matching, no per-model registry update.

Alternatives considered

  • Per-model hardcoded matching (current approach). Doesn't scale to gateway-dependent formats; requires a code change per new model.
  • Strip-only post-processing. An existing strip_thinking_tags already strips the tags, but only for the saved-session history list, not the live stream — and it silently eats the reasoning rather than routing it to a thinking cell.
  • New ApiProvider variant per model family. Issue #1310 ("v0.8.60: Add first-party MiniMax provider route", closed) added the first-party MiniMax route for v0.8.60, but that is orthogonal: it covers the provider route, not the inline-tag parser gap.
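
The difference between the strip-only alternative and routing can be shown on a complete (non-streamed) message. The Python sketch below is an illustration only: strip_only is a stand-in written for this example and is not the source of CodeWhale's strip_thinking_tags, and split_reasoning is a hypothetical name.

```python
import re

# Non-greedy match so several think blocks in one message are handled separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def strip_only(text: str) -> str:
    """Stand-in for strip-only post-processing: the reasoning is discarded."""
    return THINK_RE.sub("", text).strip()


def split_reasoning(text: str) -> tuple[str, str]:
    """Keep both parts so the reasoning can be routed to a thinking cell."""
    reasoning = "\n".join(block.strip() for block in THINK_RE.findall(text))
    visible = THINK_RE.sub("", text).strip()
    return reasoning, visible
```

For the message "<think>User greets me.</think>\n\nHello!", strip_only returns only "Hello!", while split_reasoning returns both "User greets me." and "Hello!".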

Impact

Additional context

Reproduction

export OPENAI_BASE_URL="https://api.minimax.io/v1"
export OPENAI_API_KEY="<token-plan-key>"
export DEEPSEEK_PROVIDER=openai
codewhale --provider openai --model MiniMax-M3
# send: "hello"

The visible response begins with a literal <think>…</think> block. Expected: a reasoning cell plus the actual reply.

Related CodeWhale issues

External references

Metadata

Assignees

No one assigned

    Labels

    bug, documentation, enhancement

    Projects

    Status
    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
