introduce cached tokens to usage #1133

agoddijn-fern · 2025-03-15T14:53:03Z

It's useful to understand whether (and how) we are leveraging prompt caching in different models.
So far I've looked at OpenAI, Anthropic, and Gemini

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#how-prompt-caching-works
https://platform.openai.com/docs/guides/prompt-caching
https://ai.google.dev/gemini-api/docs/caching?lang=python

alexmojaki · 2025-03-15T15:08:42Z

Thanks, this looks good, especially since you seem to have fixed how we measured input tokens in Anthropic. If you run pytest --inline-snapshot=fix that should update the tests so that they pass.

agoddijn-fern · 2025-03-15T15:39:41Z

Note for this, a potential next step would be to implement some of the different caching strategies mentioned in

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#optimizing-for-different-use-cases

I'm not sure what the most ergonomic way of doing this would be

DouweM · 2025-04-30T00:06:14Z

This is being implemented in #1549!

introduce cached tokens to usage

64da76b

attempt to fix test

3684887

DouweM closed this Apr 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

introduce cached tokens to usage #1133

introduce cached tokens to usage #1133

agoddijn-fern commented Mar 15, 2025

alexmojaki commented Mar 15, 2025

agoddijn-fern commented Mar 15, 2025

DouweM commented Apr 30, 2025

introduce cached tokens to usage #1133

introduce cached tokens to usage #1133

Conversation

agoddijn-fern commented Mar 15, 2025

alexmojaki commented Mar 15, 2025

agoddijn-fern commented Mar 15, 2025

DouweM commented Apr 30, 2025