Description
After further reflection on the library's approach to model pricing and total usage cost tracking, I've realized we need to be much clearer about exactly what the library promises in this area.
The solution space can be divided into four levels, ranging from least to most commitment:
- No guarantees, no assistance: Tracking usage and costs is entirely the user's responsibility.
- Some assistance, no guarantees: The library extracts useful information (e.g., input/output/cached token counts) and pricing data (e.g., by integrating and merging sources like models.dev, OpenRouter, and local patches), then performs basic cost calculations for the user, but all on a best-effort basis only (see the sketch after this list).
- Full assistance, limited guarantees: The library aims to be as comprehensive as possible with cost and usage information. This includes not only token counts reported by the model but also provider-specific charges (e.g., search costs or other API usage billed as "$x per 1,000 searches"), even when those charges are not explicitly included in response metadata. However, there is still no guarantee that the calculated cost will exactly match the provider's billing.
- Full assistance, full guarantees: Any discrepancy between the library's cost calculations and a provider's actual billing (at least for major providers) is treated as a high-priority bug. Regression tests are in place to detect incorrect pricing metadata for all key models.
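To make the difference between levels 2 and 3 concrete, here is a minimal sketch of the two calculations. It is written in Python for illustration only; it is not this library's API, and every pricing value, field name, and surcharge in it is a hypothetical placeholder.

```python
from dataclasses import dataclass

@dataclass
class Pricing:
    input_per_mtok: float    # USD per 1M input tokens (hypothetical field names)
    output_per_mtok: float   # USD per 1M output tokens
    cached_per_mtok: float   # USD per 1M cached input tokens

def merge_pricing(*sources: dict) -> dict:
    """Level 2 ingredient: merge pricing metadata from several sources
    (e.g., models.dev, OpenRouter, local patches); later sources win."""
    merged: dict = {}
    for source in sources:
        merged.update(source)
    return merged

def level2_cost(p: Pricing, input_tok: int, output_tok: int, cached_tok: int = 0) -> float:
    """Best-effort cost from the token counts reported by the model."""
    return (
        (input_tok - cached_tok) * p.input_per_mtok
        + cached_tok * p.cached_per_mtok
        + output_tok * p.output_per_mtok
    ) / 1_000_000

def level3_cost(p: Pricing, usage: dict, surcharges_per_1k: dict) -> float:
    """Level 3 adds provider-specific charges (e.g., "$x per 1,000 searches")
    that usually do not appear in the response token metadata."""
    cost = level2_cost(
        p,
        usage["input_tokens"],
        usage["output_tokens"],
        usage.get("cached_tokens", 0),
    )
    for item, count in usage.get("billable_items", {}).items():
        cost += count * surcharges_per_1k.get(item, 0.0) / 1_000
    return cost

# Hypothetical numbers, for illustration only:
pricing = Pricing(input_per_mtok=3.00, output_per_mtok=15.00, cached_per_mtok=0.30)
usage = {"input_tokens": 12_000, "output_tokens": 800,
         "cached_tokens": 4_000, "billable_items": {"web_search": 3}}
print(level2_cost(pricing, 12_000, 800, 4_000))           # tokens only
print(level3_cost(pricing, usage, {"web_search": 10.0}))  # tokens + searches
```

The key point is that level 3 needs a surcharge table and per-request billable-item counts that token metadata alone does not provide, which is exactly what issues like #265 are about.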
Currently, we're operating at level 2. Several open issues (#81, #198, #265) relate, in one way or another, to moving toward level 3. Historically, there have also been various bugs around cost calculation (e.g., #31, agentjido/llm_db#7, agentjido/llm_db#53), which illustrate how challenging even level 3 can be. We are, of course, very far from level 4.
I believe publicly declaring the specific level we're targeting would benefit both users and maintainers in the long term by setting clear expectations and guiding future development.
Thoughts?
Cost-related issues
- Add support for reasoning token cost calculation #198
- [Feature]: Account for the cost related to tool usage (such as web search) #265
- [Feature]: Google cost calculation should use correct price when token count is > 200k #369 (see the tiered-pricing sketch below)
- [Feature]: Make sure that new pricing structure for Google/OpenAI/xAI/Anthropic works with 3rd party providers #370
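Issue #369 concerns tiered pricing, where the per-token rate changes once a prompt crosses a size threshold. A minimal sketch of what selecting the correct price looks like in that case (the threshold and rates are hypothetical placeholders, and this is not the library's API):

```python
# Hypothetical threshold and rates, for illustration only -- not real Google prices.
def tiered_input_rate(prompt_tokens: int,
                      base_rate_per_mtok: float = 1.00,
                      high_rate_per_mtok: float = 2.00,
                      threshold: int = 200_000) -> float:
    """Some providers charge a higher per-token rate once the prompt exceeds
    a threshold (e.g., 200k tokens), so the rate has to be chosen per request
    instead of being read once from a flat pricing table."""
    return high_rate_per_mtok if prompt_tokens > threshold else base_rate_per_mtok

def input_cost(prompt_tokens: int) -> float:
    return prompt_tokens * tiered_input_rate(prompt_tokens) / 1_000_000

print(input_cost(150_000))  # below the threshold: base rate applies
print(input_cost(250_000))  # above the threshold: higher rate applies in this sketch
```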
Known limitations (need to fix eventually)
- Cost of video generation (e.g., OpenAI's Sora)
- Cost of realtime audio/text
Known limitations (won't fix category)
- We do not support legacy `web_search_preview` pricing for OpenAI