
Token usage metrics in InstrumentModel do not work with Ollama #1968

@georgiedekker

Description

Claude:
I've examined your implementation, and I can see why your usage metrics are returning as null when using the Ollama model. The issue is related to how token usage is being extracted from the Ollama model response.

Looking at your PydanticAIAgent.run method, you're trying to extract usage metrics this way:

```python
if hasattr(self, 'model') and self.model and hasattr(self.model, 'usage') and self.model.usage:
    usage = self.model.usage
    prompt_tokens = getattr(usage, 'prompt_tokens', None)
    completion_tokens = getattr(usage, 'completion_tokens', None)
    total_tokens = getattr(usage, 'total_tokens', None)
```
The problem is that Ollama models don't expose token usage the way OpenAI models do. The Ollama API doesn't return token usage by default, or structures it differently from the OpenAI shape your code expects.
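To see what Ollama actually returns, you can query its native API directly. A minimal sketch, assuming a local Ollama server on its default port (11434), the `requests` package, and the `devstral` model from this report:

```python
# Minimal sketch: inspect Ollama's native (non-OpenAI-compatible) response.
# Assumes Ollama is running locally on its default port and that the
# "devstral" model from this report has been pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "devstral", "prompt": "Say hi", "stream": False},
    timeout=120,
)
data = resp.json()

# Token counts live under Ollama-specific keys, not under an OpenAI-style
# `usage` object with prompt_tokens/completion_tokens:
print(data.get("prompt_eval_count"))  # prompt tokens
print(data.get("eval_count"))         # completion tokens
```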

Based on the pydantic-ai PR you linked (#1898), this is a new feature, and it looks like it's not yet fully implemented for all model types.

How to fix it:

1. Check how you're initializing the Ollama model. Looking at the code, you're initializing Ollama through the OpenAI provider:

   ```python
   logger.info(f"Initializing Ollama model '{config.model}' via OpenAIModel and OpenAIProvider.")
   ```

2. Implement a wrapper or adapter for Ollama responses that explicitly captures and formats token usage (a sketch follows the `CompletionResponse` struct below). You may need to modify your Ollama model initialization to capture usage stats.

3. Check for Ollama-specific usage stats: the Ollama client might return usage stats in a different form, for example under a different property name or structure.

For reference, Ollama's native completion response is defined in its server code:
https://github.com/ollama/ollama/blob/45f56355d557b7130c7c07bbd6e1b634a758d946/llm/server.go#L724
```go
type CompletionResponse struct {
    Content            string        `json:"content"`
    DoneReason         DoneReason    `json:"done_reason"`
    Done               bool          `json:"done"`
    PromptEvalCount    int           `json:"prompt_eval_count"`
    PromptEvalDuration time.Duration `json:"prompt_eval_duration"`
    EvalCount          int           `json:"eval_count"`
    EvalDuration       time.Duration `json:"eval_duration"`
}
```
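A minimal sketch of the adapter idea from step 2, mapping the `prompt_eval_count`/`eval_count` fields above onto the OpenAI-style attribute names the extraction code expects. `NormalizedUsage` and `normalize_ollama_usage` are hypothetical names for illustration, not part of pydantic-ai or the Ollama client:

```python
# Hypothetical adapter: translate Ollama's native completion fields into
# the OpenAI-style usage shape that the getattr() calls above look for.
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NormalizedUsage:
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    total_tokens: Optional[int] = None


def normalize_ollama_usage(raw: dict[str, Any]) -> NormalizedUsage:
    """Map prompt_eval_count/eval_count onto OpenAI-style names."""
    prompt = raw.get("prompt_eval_count")
    completion = raw.get("eval_count")
    total = None
    if prompt is not None and completion is not None:
        total = prompt + completion
    return NormalizedUsage(prompt, completion, total)
```

With something like this attached to the model or response object, the `getattr(usage, 'prompt_tokens', None)` calls in `PydanticAIAgent.run` would find the attributes they look for instead of falling back to None.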

Example Code

Python, Pydantic AI & LLM client version

python:3.12-slim
pydantic-ai:0.2.17
ollama: 0.9.0
devstral:latest
