Initial Checks
- [x] I confirm that I'm using the latest version of Pydantic AI
- [x] I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
Description
Claude:
I've examined your implementation, and I can see why your usage metrics are returning as null when using the Ollama model. The issue is related to how token usage is being extracted from the Ollama model response.
Looking at your `PydanticAIAgent.run` method, you're trying to extract usage metrics this way:

```python
if hasattr(self, 'model') and self.model and hasattr(self.model, 'usage') and self.model.usage:
    usage = self.model.usage
    prompt_tokens = getattr(usage, 'prompt_tokens', None)
    completion_tokens = getattr(usage, 'completion_tokens', None)
    total_tokens = getattr(usage, 'total_tokens', None)
```
The problem is that Ollama models don't expose token usage in the same way as OpenAI models. The Ollama API doesn't return token usage by default, or might structure it differently than the OpenAI API that your code seems to be expecting.
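For what it's worth, pydantic-ai normally surfaces token counts on the run result rather than on the model object; a minimal sketch, assuming the 0.2.x `usage()` API (the model string is a placeholder):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')  # placeholder; substitute your Ollama-backed model

result = agent.run_sync('Hello')
usage = result.usage()  # pydantic_ai.usage.Usage
print(usage.request_tokens, usage.response_tokens, usage.total_tokens)
```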
Based on the pydantic-ai PR you linked (#1898), this is a new feature, and it looks like it's not fully implemented for all model types.
How to fix it:
1. Check how you're initializing the Ollama model. Looking at the code, you're initializing Ollama through the OpenAI provider:

```python
logger.info(f"Initializing Ollama model '{config.model}' via OpenAIModel and OpenAIProvider.")
```
2. Implement a wrapper or adapter for Ollama responses that explicitly captures and formats token usage (see the adapter sketch after the struct below). You may need to modify your Ollama model initialization to capture usage stats.
3. Check for Ollama-specific usage stats: the Ollama client might return usage stats in a different form. For example, they could be accessed through a different property name or structure; see the `CompletionResponse` struct in the Ollama source:
https://github.com/ollama/ollama/blob/45f56355d557b7130c7c07bbd6e1b634a758d946/llm/server.go#L724
```go
type CompletionResponse struct {
	Content            string        `json:"content"`
	DoneReason         DoneReason    `json:"done_reason"`
	Done               bool          `json:"done"`
	PromptEvalCount    int           `json:"prompt_eval_count"`
	PromptEvalDuration time.Duration `json:"prompt_eval_duration"`
	EvalCount          int           `json:"eval_count"`
	EvalDuration       time.Duration `json:"eval_duration"`
}
```
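If you end up reading the native Ollama response yourself, a hypothetical adapter along these lines could map those fields onto the OpenAI-style names your code expects (`extract_ollama_usage` is illustrative, not part of either library):

```python
from typing import Optional

# Hypothetical helper: map Ollama's native completion fields
# (prompt_eval_count / eval_count, per the struct above) onto
# OpenAI-style usage names.
def extract_ollama_usage(response: dict) -> dict[str, Optional[int]]:
    prompt_tokens = response.get('prompt_eval_count')
    completion_tokens = response.get('eval_count')
    total_tokens = (
        prompt_tokens + completion_tokens
        if prompt_tokens is not None and completion_tokens is not None
        else None
    )
    return {
        'prompt_tokens': prompt_tokens,
        'completion_tokens': completion_tokens,
        'total_tokens': total_tokens,
    }
```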
Example Code
Python, Pydantic AI & LLM client version
```
python:3.12-slim
pydantic-ai:0.2.17
ollama: 0.9.0
devstral:latest
```