ADK LiteLlm adapter drops LiteLLM reasoning content #3694

@mikkokirvesoja

Description

Summary

When using ADK with LiteLLM-backed models that emit separate reasoning / "thinking" content (for example via reasoning_content or reasoning fields on messages and deltas), the current LiteLlm adapter in ADK silently discards that reasoning.

As a result, ADK callers cannot access or surface model reasoning even though it is available in the underlying LiteLLM responses.

This behavior has been observed with:

  • ADK (google-adk) 1.19.0
  • LiteLLM 1.79.3 (local models served by LM Studio, plus another model reached via an OpenAI-compatible endpoint)

Current Behavior

The ADK side behaves roughly as follows:

  • LiteLlm:
    • Aggregates assistant content and tool call information from LiteLLM responses.
    • Does not read or propagate any reasoning_content / reasoning fields from LiteLLM.
  • LlmResponse:
    • Does not define a field for reasoning, so there is nowhere to store it.
  • Event:
    • Inherits from LlmResponse and is what flows emit to callers.
    • Since LlmResponse has no reasoning field and LiteLlm does not populate one, Event never exposes reasoning either.

In practice, this means:

  • Downstream consumers using ADK see only the final assistant content and tool calls.
  • Any reasoning content provided by LiteLLM-backed models is silently dropped and unavailable.
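
For illustration, a minimal sketch of what LiteLLM itself already returns (the model name and endpoint are placeholders; reasoning_content appears only for models that actually emit reasoning):

    import litellm

    # Placeholder model and endpoint: any reasoning-capable model behind an
    # OpenAI-compatible server shows the same response shape.
    response = litellm.completion(
        model="openai/local-reasoning-model",
        api_base="http://localhost:1234/v1",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )

    message = response.choices[0].message
    print(message.content)                              # final answer: ADK keeps this
    print(getattr(message, "reasoning_content", None))  # reasoning: ADK drops this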

Expected Behavior

For LiteLLM-backed models that emit reasoning fields (e.g. reasoning_content / reasoning):

  • ADK's LiteLlm adapter should:
    • Map those reasoning fields into an explicit reasoning field on LlmResponse.
    • Do this consistently for both non-streaming and streaming responses.
  • LlmResponse / Event should:
    • Expose this reasoning field so that callers can choose to display or process it.

For models/providers that do not emit any reasoning fields:

  • Behavior should remain unchanged.
  • The reasoning field should simply be absent / None and not appear in serialized output when using exclude_none=True.
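
A quick sketch of the intended serialization behavior, using a stand-in Pydantic model rather than the real LlmResponse:

    from typing import Optional
    from pydantic import BaseModel

    class Sketch(BaseModel):            # stand-in for LlmResponse
        content: Optional[str] = None
        reasoning: Optional[str] = None

    print(Sketch(content="answer").model_dump(exclude_none=True))
    # {'content': 'answer'}  -- no reasoning key when the provider emitted none

    print(Sketch(content="answer", reasoning="step 1 ...").model_dump(exclude_none=True))
    # {'content': 'answer', 'reasoning': 'step 1 ...'}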

Impact

  • ADK currently cannot surface model reasoning even when it is available through LiteLLM.
  • This limits the ability of downstream applications to provide transparency/debugging views or UX patterns that distinguish between "thinking" and final answers.
  • The problem is purely on the ADK integration side: LiteLLM already exposes the necessary fields.

Proposed Directions

  1. Extend LlmResponse with an optional reasoning field

    • Add reasoning: Optional[str] = None to google.adk.models.llm_response.LlmResponse.
    • This becomes the canonical place to store reasoning/thinking content within ADK.
    • Because Event inherits from LlmResponse, Event will also gain this field automatically.
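
    A minimal sketch of the proposed addition (the real class has many more fields, all unchanged; only the new field is shown):

      from typing import Optional
      from pydantic import BaseModel

      class LlmResponse(BaseModel):
          # ... existing fields unchanged ...

          # Reasoning / "thinking" content from the model, populated only when
          # the underlying provider emits it; None otherwise, so it disappears
          # from serialized output under exclude_none=True.
          reasoning: Optional[str] = None
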
  2. Map LiteLLM reasoning into LlmResponse.reasoning in the LiteLlm adapter

    In google.adk.models.lite_llm.LiteLlm:

    • Non-streaming path (_model_response_to_generate_content_response):

      • After building LlmResponse from the main assistant message, inspect choices[0].message from the LiteLLM ModelResponse.
      • If the message (whether dict-style or Pydantic-style) contains reasoning_content or reasoning, copy that string into llm_response.reasoning.
    • Streaming path (generate_content_async(..., stream=True)):

      • Maintain a reasoning_text accumulator alongside the existing text buffer.
      • For each streaming chunk, inspect choices[0].delta from the LiteLLM ModelResponseStream.
      • If the delta (dict or object) contains reasoning_content or reasoning, append it to reasoning_text.
      • When yielding partial/final LlmResponse instances, pass reasoning=reasoning_text or None.
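
    A hedged sketch of the extraction logic both paths could share; _read_reasoning is a hypothetical helper name for this proposal, not an existing ADK function:

      from typing import Any, Optional

      def _read_reasoning(message_or_delta: Any) -> Optional[str]:
          # Accept both dict-style and Pydantic/object-style messages and
          # deltas, preferring reasoning_content over reasoning.
          for key in ("reasoning_content", "reasoning"):
              if isinstance(message_or_delta, dict):
                  value = message_or_delta.get(key)
              else:
                  value = getattr(message_or_delta, key, None)
              if isinstance(value, str) and value:
                  return value
          return None

      # Non-streaming: llm_response.reasoning = _read_reasoning(choices[0].message)
      # Streaming:     reasoning_text += _read_reasoning(chunk.choices[0].delta) or ""
      #                ...then yield LlmResponse(..., reasoning=reasoning_text or None)
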
  3. Rely on existing flow behavior to propagate reasoning to events

    • BaseLlmFlow._finalize_model_response_event already merges LlmResponse into Event via something like:

      model_response_event = Event.model_validate({
          **model_response_event.model_dump(exclude_none=True),
          **llm_response.model_dump(exclude_none=True),
      })
    • Once LlmResponse.reasoning is populated, this merge will automatically carry it into Event.reasoning.

    • No changes should be required in flows or runners.
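
To illustrate why the existing merge is sufficient, a self-contained sketch with stand-in models (not the real ADK classes, which carry many more fields):

    from typing import Optional
    from pydantic import BaseModel

    class LlmResponse(BaseModel):       # stand-in
        content: Optional[str] = None
        reasoning: Optional[str] = None

    class Event(LlmResponse):           # Event inherits from LlmResponse in ADK too
        author: Optional[str] = None

    llm_response = LlmResponse(content="final answer", reasoning="step 1 ...")
    model_response_event = Event(author="model")

    # The merge pattern quoted above: any populated LlmResponse field,
    # including the new reasoning field, lands on the resulting Event.
    merged = Event.model_validate({
        **model_response_event.model_dump(exclude_none=True),
        **llm_response.model_dump(exclude_none=True),
    })
    print(merged.reasoning)  # 'step 1 ...'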

Local patch & verification

I have tested this approach locally by patching ADK 1.19.0 as follows:

  • google.adk.models.llm_response.LlmResponse

    • Added a reasoning: Optional[str] = None field to hold model reasoning / "thinking" content.
  • google.adk.models.lite_llm.LiteLlm

    • Non-streaming path: in _model_response_to_generate_content_response, read reasoning_content / reasoning from choices[0].message (supporting both dict- and object-style messages) and assign it to llm_response.reasoning when present.
    • Streaming path: in generate_content_async(..., stream=True), accumulate reasoning_content / reasoning from choices[0].delta across chunks into a reasoning_text buffer and pass it as reasoning when yielding partial and final LlmResponse objects.

With this local patch applied, events emitted by ADK now expose a reasoning field whenever the underlying LiteLLM response includes reasoning, and downstream consumers are able to display the model's reasoning traces as expected.

This was verified both in an application consuming ADK and directly via adk web, where the patched ADK shows a reasoning field in the event payload, while the unpatched 1.19.0 build does not.
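
For completeness, a hedged sketch of how a downstream consumer could surface the field once it exists (runner construction and message building are elided; the getattr guard keeps this safe on unpatched builds, where events have no reasoning attribute):

    from google.adk.runners import Runner

    async def print_reasoning(runner: Runner, user_id: str, session_id: str, new_message) -> None:
        async for event in runner.run_async(
            user_id=user_id, session_id=session_id, new_message=new_message
        ):
            # Guarded access: absent on unpatched ADK, None when the model
            # emitted no reasoning.
            reasoning = getattr(event, "reasoning", None)
            if reasoning:
                print("[thinking]", reasoning)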
