The parsing of reasoning content from MiniMax M3 is broken in CodeWhale when the MiniMax Token Plan is used as an 'OpenAI-compatible' provider.
An existing issue, "v0.8.60: Add first-party MiniMax provider route", adds the MiniMax Token Plan as a supported provider. That is a step in the right direction, but it does not address the more fundamental problem: developers using CodeWhale with OpenAI-compatible providers need to be able to override how the TUI parses reasoning content. This is especially true for developers using CodeWhale with private models that would never be added to the well-known Providers enum.
This patch introduces an "OpenAI Reasoning Style Override": a configuration option that developers can use to change how CodeWhale parses reasoning content from any OpenAI-compatible provider.
Problem
When MiniMax M3 is reached through CodeWhale's generic OpenAI-compatible provider, reasoning is streamed as inline <think>…</think> blocks inside delta.content instead of a separate delta.reasoning_content field. CodeWhale's parser has no inline-tag handling, so the literal tags appear in the visible chat. The same problem affects Qwen thinking models on raw vLLM/Ollama and GLM models on aggregators that don't have a GLM-specific parser. The format depends on the serving gateway, not the model identity.
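To illustrate what inline-tag handling involves, here is a minimal Python sketch of a streaming splitter. It is not CodeWhale's parser; the class name `InlineThinkSplitter` and the event labels are assumptions made for this example. The tricky part it shows is that a tag can be split across two streamed chunks, so a possible partial tag has to be held back until the next chunk arrives.

```python
from dataclasses import dataclass

OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


@dataclass
class InlineThinkSplitter:
    """Hypothetical helper: splits streamed delta.content text into
    ("reasoning", text) and ("content", text) events."""

    in_think: bool = False
    buf: str = ""

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        self.buf += chunk
        events: list[tuple[str, str]] = []
        while self.buf:
            tag = CLOSE_TAG if self.in_think else OPEN_TAG
            kind = "reasoning" if self.in_think else "content"
            idx = self.buf.find(tag)
            if idx != -1:
                # A complete tag is present: emit the text before it and toggle state.
                if idx:
                    events.append((kind, self.buf[:idx]))
                self.buf = self.buf[idx + len(tag):]
                self.in_think = not self.in_think
                continue
            # No complete tag: hold back the longest suffix that could
            # still turn out to be the start of the tag we are waiting for.
            keep = 0
            for n in range(min(len(tag) - 1, len(self.buf)), 0, -1):
                if tag.startswith(self.buf[-n:]):
                    keep = n
                    break
            cut = len(self.buf) - keep
            if cut:
                events.append((kind, self.buf[:cut]))
            self.buf = self.buf[cut:]
            break
        return events

    def flush(self) -> list[tuple[str, str]]:
        """Call at end of stream: emit whatever was held back."""
        events: list[tuple[str, str]] = []
        if self.buf:
            kind = "reasoning" if self.in_think else "content"
            events.append((kind, self.buf))
            self.buf = ""
        return events
```

A caller would feed each `delta.content` string to `feed()`, route "reasoning" events to the thinking cell and "content" events to the visible reply, and call `flush()` when the stream ends.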
Proposed solution
A new [overrides] config bucket lets the user declare which reasoning format a given protocol emits. The change to ~/.codewhale/config.toml looks like this:
Built-in fragments (all under https://codewhale.dev/configuration/reasoning_style):
#separate_field (default) — reasoning arrives in delta.reasoning_content or delta.reasoning. Current behavior.
#inline_tags — reasoning arrives as <think>…</think> blocks inside delta.content. Used by M3, Qwen thinking family, GLM, and similar.
#none — no reasoning support. Treats the content stream as plain text.
The fragment is the strategy selector; the base URL is the namespace. New strategies ship as new fragments, not new keys or new code paths. Plugins can register custom strategies by pointing the URL elsewhere.
A typo in the URI or an unknown key is a startup error with a clear message. No silent fallbacks.
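The selector-plus-strict-validation idea can be sketched in Python as follows. This is an illustration under assumptions, not CodeWhale's implementation: `ReasoningStyle`, `ConfigError`, and `parse_reasoning_style` are hypothetical names, and plugin-registered namespaces are left out. Only the base URL and the three fragment names come from the proposal above.

```python
from enum import Enum
from urllib.parse import urldefrag

# Namespace taken from the proposal; everything else here is illustrative.
BASE_URL = "https://codewhale.dev/configuration/reasoning_style"


class ReasoningStyle(Enum):
    SEPARATE_FIELD = "separate_field"
    INLINE_TAGS = "inline_tags"
    NONE = "none"


class ConfigError(ValueError):
    """Raised at startup for an invalid reasoning-style URI (hypothetical)."""


def parse_reasoning_style(uri: str) -> ReasoningStyle:
    base, fragment = urldefrag(uri)
    if base != BASE_URL:
        raise ConfigError(f"unknown reasoning-style namespace: {base!r}")
    try:
        return ReasoningStyle(fragment)
    except ValueError:
        valid = ", ".join("#" + s.value for s in ReasoningStyle)
        # Fail loudly instead of falling back to a default strategy.
        raise ConfigError(
            f"unknown reasoning style {'#' + fragment!r}; expected one of: {valid}"
        ) from None
```

With this shape, a misspelled fragment such as `#inline_tag` raises an error that names the valid fragments rather than silently selecting the default.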
Use case
The format depends on the gateway, not the model. A user running Qwen via raw Ollama needs #inline_tags; the same model via Alibaba Cloud DashScope needs #separate_field. With this change, the user picks the right strategy per gateway with a single config line — no code change, no model-name matching, no per-model registry update.
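To make the gateway dependence concrete, the Python sketch below applies a separate-field extractor to two delta shapes. The payloads are illustrative examples written for this sketch, not captured traffic from any gateway, and `extract_separate_field` is a hypothetical name.

```python
def extract_separate_field(delta: dict) -> tuple[str, str]:
    """Return (reasoning, content) assuming the separate-field format."""
    reasoning = delta.get("reasoning_content") or delta.get("reasoning") or ""
    content = delta.get("content") or ""
    return reasoning, content


# Illustrative delta from a gateway that uses a separate reasoning field.
separate_field_delta = {"reasoning_content": "plan", "content": ""}

# Illustrative delta from a gateway that inlines reasoning into the content.
inline_tags_delta = {"content": "<think>plan</think>"}

for delta in (separate_field_delta, inline_tags_delta):
    reasoning, content = extract_separate_field(delta)
    print(f"reasoning={reasoning!r} content={content!r}")
```

For the second payload the extractor finds no reasoning and passes the literal tags through as content, which is the symptom described in the Problem section.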
Alternatives considered
Per-model hardcoded matching (current approach). Doesn't scale to gateway-dependent formats; requires a code change per new model.
Strip-only post-processing. An existing strip_thinking_tags already strips the tags, but only for the saved-session history list, not the live stream — and it silently eats the reasoning rather than routing it to a thinking cell.
New ApiProvider variant per model family. v0.8.60: Add first-party MiniMax provider route #1310 (closed) added the first-party MiniMax route for v0.8.60, but that is orthogonal: it covers the provider route, not the inline-tag parser gap.
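The difference between stripping and routing can be shown on a complete (non-streamed) message. In this Python sketch, `strip_only` is a hypothetical stand-in for the behavior described above, not the actual `strip_thinking_tags`, and `route` is likewise an assumed name.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def strip_only(text: str) -> str:
    """Stand-in for strip-only post-processing: the reasoning is discarded."""
    return THINK_BLOCK.sub("", text)


def route(text: str) -> tuple[str, str]:
    """Keep the reasoning and return it alongside the visible reply."""
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(text))
    visible = THINK_BLOCK.sub("", text).strip()
    return reasoning, visible
```

Both functions produce the same visible text, but only `route` preserves the reasoning so it can be shown in a thinking cell.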
Impact
Removes the blocker preventing M3 (and Qwen/GLM) from being usable through CodeWhale.
Single config edit covers the whole class of inline-tag models. Future models that adopt inline tags need a config line, not a PR.
Resolution order
Highest priority wins: CLI flag → environment variable → [overrides] block → per-provider config → per-model config → hardcoded heuristic.
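The resolution order (CLI flag first, hardcoded heuristic last) amounts to "first source that supplies a value wins". A minimal Python sketch, in which the source keys and the `resolve` function are names assumed for illustration:

```python
from typing import Optional

# Highest priority first, mirroring the order stated in the proposal.
RESOLUTION_ORDER = (
    "cli_flag",
    "env_var",
    "overrides_block",
    "provider_config",
    "model_config",
    "heuristic",
)


def resolve(sources: dict[str, Optional[str]]) -> tuple[str, str]:
    """Return (source_name, value) for the highest-priority source that is set."""
    for name in RESOLUTION_ORDER:
        value = sources.get(name)
        if value is not None:
            return name, value
    raise LookupError("no reasoning style configured by any source")
```

Returning the winning source name alongside the value makes it easy to report where a setting came from when diagnosing a misconfiguration.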
Additional context
Reproduction
The visible response begins with a literal <think>…</think> block. Expected: a reasoning cell plus the actual reply.
Related CodeWhale issues
External references
<think> tag behavior across model sizes
delta.reasoning_content as a separate field
reasoning_content depending on request config