Description
When using Claude models through the Kilo Gateway provider, thinking/reasoning blocks often remain visible in the UI after the response has completed. The thinking content appears inline between text sections in the final rendered output, even though it should be hidden once the model finishes generating.
This does not happen (or happens far less frequently) when using Claude directly via @ai-sdk/anthropic.
Steps to reproduce
- Configure Kilo CLI with Kilo Gateway as the provider
- Use a Claude model with reasoning enabled (e.g. `anthropic/claude-opus-4.6` at the `max` variant)
- Send a prompt that triggers tool use and multiple rounds of thinking (e.g. ask a complex coding question that requires analysis, tool calls, then further reasoning)
- Wait for the response to complete
- Observe that `_Thinking:_` blocks remain visible in the expanded steps view
Expected behavior
After the response completes, all reasoning/thinking parts should be hidden. The UI filters parts with type === "reasoning" via hideReasoning={!working()} in session-turn.tsx:636, and this works correctly when parts are properly typed.
Actual behavior
Thinking content remains visible after completion, appearing between text sections. The _Thinking:_ prefix with italic formatting is visible inline.
Root cause analysis
The rendering logic in the UI is correct:
- `session-turn.tsx:636` passes `hideReasoning={!working()}` to `AssistantMessageItem`
- `session-turn.tsx:103-105` filters out parts where `part.type === "reasoning"`
- `message-part.tsx:712-724` renders reasoning parts via `ReasoningPartDisplay`
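The filtering described above can be sketched as follows. This is a minimal illustration under assumed, simplified part shapes: `Part` and `visibleParts` are hypothetical names, not the actual code in `session-turn.tsx`.

```typescript
// Hypothetical, simplified part shapes (the real types live in the codebase).
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }

// Drops reasoning parts when hideReasoning is true, otherwise returns all parts.
function visibleParts(parts: Part[], hideReasoning: boolean): Part[] {
  return hideReasoning ? parts.filter((part) => part.type !== "reasoning") : parts
}

const turn: Part[] = [
  { type: "reasoning", text: "Check the failing test first." },
  { type: "text", text: "The test fails because of a missing import." },
]

const whileWorking = visibleParts(turn, false) // reasoning still shown
const afterCompletion = visibleParts(turn, true) // reasoning filtered out
```

The filter only looks at `part.type`, so it can hide a thinking block only if that block was stored with the `reasoning` type in the first place.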
The issue is upstream in the streaming pipeline. The Kilo Gateway provider (packages/kilo-gateway/src/provider.ts) wraps @openrouter/ai-sdk-provider (v1.5.2), which parses OpenRouter's SSE stream into AI SDK events. The hypothesis is:
OpenRouter (or the AI SDK OpenRouter provider) sometimes delivers Claude's thinking content as text-delta events rather than reasoning-start/reasoning-delta/reasoning-end events. When this happens, the thinking content ends up stored as a text part (with _Thinking:_ prefix) rather than a reasoning part, so the hideReasoning filter has no effect.
This likely happens specifically during mid-loop thinking — thinking blocks that occur between tool call rounds, not at the start of a response.
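The hypothesis can be illustrated with a small sketch of how stream events might be folded into stored parts. The event and part shapes are simplified assumptions, and `collectParts` is a hypothetical helper, not the logic in `processor.ts`. The sample events model a mid-loop thinking block that arrives as `text-delta` events.

```typescript
// Simplified, assumed event shapes; real AI SDK events carry more fields.
type StreamEvent =
  | { type: "reasoning-start" }
  | { type: "reasoning-delta"; text: string }
  | { type: "reasoning-end" }
  | { type: "text-start" }
  | { type: "text-delta"; text: string }
  | { type: "text-end" }

type StoredPart = { type: "text" | "reasoning"; text: string }

// Folds events into parts; the part type is decided by the *-start event.
function collectParts(events: StreamEvent[]): StoredPart[] {
  const parts: StoredPart[] = []
  let current: StoredPart | undefined
  for (const ev of events) {
    switch (ev.type) {
      case "reasoning-start":
        current = { type: "reasoning", text: "" }
        parts.push(current)
        break
      case "text-start":
        current = { type: "text", text: "" }
        parts.push(current)
        break
      case "reasoning-delta":
      case "text-delta":
        if (current) current.text += ev.text
        break
      case "reasoning-end":
      case "text-end":
        current = undefined
        break
    }
  }
  return parts
}

const events: StreamEvent[] = [
  { type: "reasoning-start" },
  { type: "reasoning-delta", text: "Plan the first tool call." },
  { type: "reasoning-end" },
  // A tool call round would happen here; the next thinking block is mistyped.
  { type: "text-start" },
  { type: "text-delta", text: "_Thinking:_ The tool output " },
  { type: "text-delta", text: "needs another look." },
  { type: "text-end" },
]

const stored = collectParts(events)
```

Here the second thinking block ends up as a `text` part, which a filter keyed on `type === "reasoning"` would leave visible.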
Investigation steps
- Log the raw stream events: Add temporary logging in `packages/opencode/src/session/processor.ts:55` to dump `value.type` for each event during a Kilo Gateway response. Check whether thinking content arrives as `text-delta` vs `reasoning-start`/`reasoning-delta`/`reasoning-end`.
- Compare with direct Anthropic: Run the same prompt using `@ai-sdk/anthropic` directly to confirm thinking blocks are properly typed as `reasoning` events when not going through OpenRouter.
- Check `@openrouter/ai-sdk-provider` parsing: If OpenRouter's SSE stream does include proper `thinking` content blocks but the SDK maps them to text events, the fix belongs in the OpenRouter SDK or in our stream processor as a workaround.
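For the first step, a per-type tally may be easier to compare across providers than a raw dump. The sketch below assumes only that each event exposes a string `type` field, as the `value.type` reference suggests. `recordEvent` and `formatTally` are hypothetical helpers that would be called from inside the existing event loop.

```typescript
type EventTally = Record<string, number>

// Increments the counter for this event's type.
function recordEvent(tally: EventTally, value: { type: string }): void {
  tally[value.type] = (tally[value.type] ?? 0) + 1
}

// Renders the tally with keys sorted, so two runs can be compared directly.
function formatTally(tally: EventTally): string {
  return Object.keys(tally)
    .sort()
    .map((type) => `${type}=${tally[type]}`)
    .join(" ")
}

// Stand-in for the events seen during one response.
const sampleEvents = [
  { type: "reasoning-delta" },
  { type: "text-delta" },
  { type: "text-delta" },
]

const sampleTally: EventTally = {}
for (const value of sampleEvents) recordEvent(sampleTally, value)
console.log(formatTally(sampleTally)) // prints "reasoning-delta=1 text-delta=2"
```

If a Kilo Gateway run shows few or no `reasoning-*` events where the direct Anthropic run shows many for the same prompt, that would support the hypothesis.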
Relevant code
| File | Lines | Purpose |
|---|---|---|
| `packages/opencode/src/session/processor.ts` | 55-101 | Stream event processing, handles `reasoning-start`/`delta`/`end` |
| `packages/ui/src/components/session-turn.tsx` | 100-105, 636 | `hideReasoning` filter and prop |
| `packages/ui/src/components/message-part.tsx` | 712-724 | `ReasoningPartDisplay` component |
| `packages/kilo-gateway/src/provider.ts` | 24-76 | Kilo Gateway wrapping OpenRouter SDK |
| `packages/opencode/src/provider/transform.ts` | 261-331 | Strips thinking blocks when sending messages back to OpenRouter |
Possible fixes
- Upstream fix in `@openrouter/ai-sdk-provider`: Ensure all thinking content blocks from OpenRouter's SSE stream are mapped to `reasoning-*` events.
- Workaround in stream processor: Detect text deltas that contain thinking-like patterns (e.g. starting with `_Thinking:_` or matching the thinking block format) and reclassify them as reasoning parts in `processor.ts`.
- UI-level fallback: When `hideReasoning` is true, also filter text parts that match thinking block patterns. This is the least clean option but would fix the symptom immediately.
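The pattern-based options could look roughly like the sketch below. It assumes a mistyped thinking block is a text part whose content begins with the `_Thinking:_` prefix. The exact block format is not confirmed here, and `reclassifyThinking` is a hypothetical helper, not existing code.

```typescript
type Part = { type: "text" | "reasoning"; text: string }

// Assumed marker for mistyped thinking content.
const THINKING_PREFIX = "_Thinking:_"

// Retypes text parts that start with the prefix as reasoning parts,
// removing the prefix and the whitespace that follows it.
function reclassifyThinking(parts: Part[]): Part[] {
  return parts.map((part): Part => {
    const text = part.text.replace(/^\s+/, "")
    if (part.type === "text" && text.startsWith(THINKING_PREFIX)) {
      return {
        type: "reasoning",
        text: text.slice(THINKING_PREFIX.length).replace(/^\s+/, ""),
      }
    }
    return part
  })
}

const fixed = reclassifyThinking([
  { type: "text", text: "_Thinking:_ plan the next tool call" },
  { type: "text", text: "Here is the fix." },
])
```

This is a heuristic: legitimate text that happens to start with the prefix would be hidden too. Matching on the accumulated part rather than on individual deltas avoids missing a prefix that is split across chunks.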