Thinking blocks remain visible after response completes when using Kilo Gateway provider #236

@marius-kilocode

Description

When using Claude models through the Kilo Gateway provider, thinking/reasoning blocks often remain visible in the UI after the response has completed. The thinking content appears inline between text sections in the final rendered output, even though it should be hidden once the model finishes generating.

This does not happen (or happens far less frequently) when using Claude directly via @ai-sdk/anthropic.

Steps to reproduce

  1. Configure Kilo CLI with Kilo Gateway as the provider
  2. Use a Claude model with reasoning enabled (e.g. anthropic/claude-opus-4.6 at max variant)
  3. Send a prompt that triggers tool use and multiple rounds of thinking (e.g. ask a complex coding question that requires analysis, tool calls, then further reasoning)
  4. Wait for the response to complete
  5. Observe that _Thinking:_ blocks remain visible in the expanded steps view

Expected behavior

After the response completes, all reasoning/thinking parts should be hidden. The UI filters parts with type === "reasoning" via hideReasoning={!working()} in session-turn.tsx:636, and this works correctly when parts are properly typed.

Actual behavior

Thinking content remains visible after completion, appearing between text sections. The _Thinking:_ prefix with italic formatting is visible inline.

Root cause analysis

The rendering logic in the UI is correct:

  • session-turn.tsx:636 passes hideReasoning={!working()} to AssistantMessageItem
  • session-turn.tsx:103-105 filters out parts where part.type === "reasoning"
  • message-part.tsx:712-724 renders reasoning parts via ReasoningPartDisplay

The issue is upstream in the streaming pipeline. The Kilo Gateway provider (packages/kilo-gateway/src/provider.ts) wraps @openrouter/ai-sdk-provider (v1.5.2), which parses OpenRouter's SSE stream into AI SDK events. The hypothesis is:

OpenRouter (or the AI SDK OpenRouter provider) sometimes delivers Claude's thinking content as text-delta events rather than reasoning-start/reasoning-delta/reasoning-end events. When this happens, the thinking content ends up stored as a text part (with _Thinking:_ prefix) rather than a reasoning part, so the hideReasoning filter has no effect.

This likely happens specifically during mid-loop thinking — thinking blocks that occur between tool call rounds, not at the start of a response.
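To illustrate the hypothesis, here is a minimal sketch of why a type-based filter cannot hide mistyped thinking content. The `MessagePart` type and `visibleParts` function are hypothetical stand-ins, not the actual code in session-turn.tsx; only the `type === "reasoning"` check and the `hideReasoning` flag come from the analysis above.

```typescript
// Hypothetical part shape; the real part types in the codebase carry more fields.
type MessagePart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string };

// Assumed equivalent of the UI filter: drop reasoning parts once hideReasoning is set.
function visibleParts(parts: MessagePart[], hideReasoning: boolean): MessagePart[] {
  return hideReasoning ? parts.filter((part) => part.type !== "reasoning") : parts;
}

// Thinking delivered as reasoning-* events ends up as a reasoning part.
const wellTyped: MessagePart[] = [
  { type: "reasoning", text: "Let me check the file." },
  { type: "text", text: "Here is the fix." },
];

// Thinking delivered as text-delta events ends up as a text part with the prefix.
const misTyped: MessagePart[] = [
  { type: "text", text: "_Thinking:_ Let me check the file." },
  { type: "text", text: "Here is the fix." },
];

console.log(visibleParts(wellTyped, true).length); // reasoning part filtered out
console.log(visibleParts(misTyped, true).length); // thinking text survives the filter
```

If the hypothesis holds, the second case is what the UI receives for mid-loop thinking, which would explain why the rendering logic looks correct but the content stays visible.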

Investigation steps

  1. Log the raw stream events — Add temporary logging in packages/opencode/src/session/processor.ts:55 to dump value.type for each event during a Kilo Gateway response. Check whether thinking content arrives as text-delta vs reasoning-start/reasoning-delta/reasoning-end.

  2. Compare with direct Anthropic — Run the same prompt using @ai-sdk/anthropic directly to confirm thinking blocks are properly typed as reasoning events when not going through OpenRouter.

  3. Check @openrouter/ai-sdk-provider parsing — If OpenRouter's SSE stream does include proper thinking content blocks but the SDK maps them to text events, the fix belongs in the OpenRouter SDK or in our stream processor as a workaround.
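For step 1, a small tally helper may be easier to read than a raw dump of every event. This is only a sketch: `EventTypeTally` is a hypothetical helper, and where exactly `record` would be called inside processor.ts is an assumption. The event names in the usage example are the ones mentioned in this issue.

```typescript
// Hypothetical debugging helper: counts how often each stream event type occurs.
class EventTypeTally {
  private readonly counts: Record<string, number> = {};

  // Call once per stream event, e.g. with value.type in the processor loop.
  record(type: string): void {
    this.counts[type] = (this.counts[type] ?? 0) + 1;
  }

  // Returns a copy so callers cannot mutate the internal counters.
  snapshot(): Record<string, number> {
    return { ...this.counts };
  }
}

// Usage with a made-up event sequence, for illustration only.
const tally = new EventTypeTally();
const sampleTypes = [
  "reasoning-start",
  "reasoning-delta",
  "reasoning-end",
  "text-delta",
  "text-delta",
];
for (const type of sampleTypes) tally.record(type);
console.log(tally.snapshot());
```

Comparing the snapshot from a Kilo Gateway run against one from @ai-sdk/anthropic for the same prompt should show whether the gateway run has fewer reasoning-* events and more text-delta events than expected.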

Relevant code

| File | Lines | Purpose |
| --- | --- | --- |
| packages/opencode/src/session/processor.ts | 55-101 | Stream event processing: handles reasoning-start/delta/end |
| packages/ui/src/components/session-turn.tsx | 100-105, 636 | hideReasoning filter and prop |
| packages/ui/src/components/message-part.tsx | 712-724 | ReasoningPartDisplay component |
| packages/kilo-gateway/src/provider.ts | 24-76 | Kilo Gateway wrapping OpenRouter SDK |
| packages/opencode/src/provider/transform.ts | 261-331 | Strips thinking blocks when sending messages back to OpenRouter |

Possible fixes

  1. Upstream fix in @openrouter/ai-sdk-provider — Ensure all thinking content blocks from OpenRouter's SSE stream are mapped to reasoning-* events.

  2. Workaround in stream processor — Detect text deltas that contain thinking-like patterns (e.g. starting with _Thinking:_ or matching the thinking block format) and reclassify them as reasoning parts in processor.ts.

  3. UI-level fallback — When hideReasoning is true, also filter text parts that match thinking block patterns. This is the least clean option but would fix the symptom immediately.
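A rough sketch of what option 2 could look like, assuming the mistyped content always starts with the `_Thinking:_` prefix. The `StoredPart` type and `reclassifyThinking` function are hypothetical and do not reflect the current shape of processor.ts.

```typescript
// Hypothetical stored part shape; the real one has more fields (ids, timestamps, etc.).
type StoredPart = { type: "text" | "reasoning"; text: string };

// Assumed marker for mistyped thinking content, taken from the observed output.
const THINKING_PREFIX = /^\s*_Thinking:_\s*/;

// Reclassify a text part as reasoning when it starts with the thinking prefix.
// All other parts are returned unchanged.
function reclassifyThinking(part: StoredPart): StoredPart {
  if (part.type !== "text" || !THINKING_PREFIX.test(part.text)) return part;
  return { type: "reasoning", text: part.text.replace(THINKING_PREFIX, "") };
}

const fixed = reclassifyThinking({ type: "text", text: "_Thinking:_ plan the edit" });
console.log(fixed.type, fixed.text);
```

Two caveats apply. The prefix can be split across several text-delta chunks, so the check would need to run on the accumulated text of a part rather than on individual deltas. It would also hide any genuine answer text that happens to start with `_Thinking:_`, which is one reason the upstream fix in option 1 is preferable.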
