Description
After further reflection on the library's approach to model pricing and total usage cost tracking, I've realized we need to be much clearer about exactly what the library promises in this area.
The solution space can be divided into four levels, ranging from least to most commitment:
- No guarantees, no assistance: Tracking usage and costs is entirely the user's responsibility.
- Some assistance, no guarantees: The library extracts useful information (e.g., input/output/cached token counts) and pricing data (e.g., by integrating and merging sources like models.dev, OpenRouter, and local patches), then performs basic cost calculations for the user, but all on a best-effort basis only (see the sketch after this list).
- Full assistance, limited guarantees: The library aims to be as comprehensive as possible with cost and usage information. This includes not only token counts reported by the model but also provider-specific charges (e.g., search costs or other API usage billed as "$x per 1,000 searches"), even when those charges are not explicitly included in response metadata. However, there is still no guarantee that the calculated cost will exactly match the provider's billing.
- Full assistance, full guarantees: Any discrepancy between the library's cost calculations and a provider's actual billing (at least for major providers) is treated as a high-priority bug. Regression tests are in place to detect incorrect pricing metadata for all key models.
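To make the difference between levels 2 and 3 concrete, here is a minimal sketch of the two calculations. It is written in Python for illustration only; it is not this library's API, and every pricing value, field name, and surcharge in it is a hypothetical placeholder.

```python
from dataclasses import dataclass

@dataclass
class Pricing:
    input_per_mtok: float    # USD per 1M input tokens (hypothetical field names)
    output_per_mtok: float   # USD per 1M output tokens
    cached_per_mtok: float   # USD per 1M cached input tokens

def merge_pricing(*sources: dict) -> dict:
    """Level 2 ingredient: merge pricing metadata from several sources
    (e.g., models.dev, OpenRouter, local patches); later sources win."""
    merged: dict = {}
    for source in sources:
        merged.update(source)
    return merged

def level2_cost(p: Pricing, input_tok: int, output_tok: int, cached_tok: int = 0) -> float:
    """Best-effort cost from the token counts reported by the model."""
    return (
        (input_tok - cached_tok) * p.input_per_mtok
        + cached_tok * p.cached_per_mtok
        + output_tok * p.output_per_mtok
    ) / 1_000_000

def level3_cost(p: Pricing, usage: dict, surcharges_per_1k: dict) -> float:
    """Level 3 adds provider-specific charges (e.g., "$x per 1,000 searches")
    that usually do not appear in the response token metadata."""
    cost = level2_cost(
        p,
        usage["input_tokens"],
        usage["output_tokens"],
        usage.get("cached_tokens", 0),
    )
    for item, count in usage.get("billable_items", {}).items():
        cost += count * surcharges_per_1k.get(item, 0.0) / 1_000
    return cost

# Hypothetical numbers, for illustration only:
pricing = Pricing(input_per_mtok=3.00, output_per_mtok=15.00, cached_per_mtok=0.30)
usage = {"input_tokens": 12_000, "output_tokens": 800,
         "cached_tokens": 4_000, "billable_items": {"web_search": 3}}
print(level2_cost(pricing, 12_000, 800, 4_000))           # tokens only
print(level3_cost(pricing, usage, {"web_search": 10.0}))  # tokens + searches
```

The key point is that level 3 needs a surcharge table and per-request billable-item counts that token metadata alone does not provide, which is exactly what issues like #265 are about.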
Currently, we're operating at level 2. Several open issues (#81, #198, #265) relate, in one way or another, to moving toward level 3. Historically, there have also been various bugs around cost calculation (e.g., #31, agentjido/llm_db#7, agentjido/llm_db#53), which illustrate how challenging even level 3 can be. We are, of course, very far from level 4.
I believe publicly declaring the specific level we're targeting would benefit both users and maintainers in the long term by setting clear expectations and guiding future development.
Thoughts?
Cost-related issues
- Add support for reasoning token cost calculation #198
- [Feature]: Account for the cost related to tool usage (such as web search) #265
- [Feature]: Google cost calculation should use correct price when token count is > 200k #369 (see the tiered-pricing sketch below)
- [Feature]: Make sure that new pricing structure for Google/OpenAI/xAI/Anthropic works with 3rd party providers #370
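Issue #369 concerns tiered pricing, where the per-token rate changes once a prompt crosses a size threshold. A minimal sketch of what selecting the correct price looks like in that case (the threshold and rates are hypothetical placeholders, and this is not the library's API):

```python
# Hypothetical threshold and rates, for illustration only -- not real Google prices.
def tiered_input_rate(prompt_tokens: int,
                      base_rate_per_mtok: float = 1.00,
                      high_rate_per_mtok: float = 2.00,
                      threshold: int = 200_000) -> float:
    """Some providers charge a higher per-token rate once the prompt exceeds
    a threshold (e.g., 200k tokens), so the rate has to be chosen per request
    instead of being read once from a flat pricing table."""
    return high_rate_per_mtok if prompt_tokens > threshold else base_rate_per_mtok

def input_cost(prompt_tokens: int) -> float:
    return prompt_tokens * tiered_input_rate(prompt_tokens) / 1_000_000

print(input_cost(150_000))  # below the threshold: base rate applies
print(input_cost(250_000))  # above the threshold: higher rate applies in this sketch
```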
Known limitations (need to fix eventually)
- Cost of video generation (e.g., OpenAI's Sora)
- Cost of realtime audio/text
Known limitations (won't fix category)
- We do not support legacy `web_search_preview` pricing for OpenAI