Add Anthropic prompt-cache hint and cache-hit metrics #1403
Changes from all commits
fce4d3d
7350782
78a690b
125f4d0
10e0030
6973cdf
66ef2ce
b713c88
3637077
b0f02de
7829691
badc2c5
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| from verifiers.types import ClientConfig | ||
| from verifiers.utils.prompt_cache_utils import apply_prompt_cache_to_kwargs | ||
|
|
||
|
|
||
| def test_anthropic_cache_control_hint_is_default_only(): | ||
| extra_kwargs = apply_prompt_cache_to_kwargs( | ||
| config=ClientConfig( | ||
| client_type="anthropic_messages", | ||
| api_base_url="https://api.anthropic.com/v1", | ||
| ), | ||
| sampling_args={"max_tokens": 16}, | ||
| extra_kwargs={}, | ||
| ) | ||
|
|
||
| assert extra_kwargs == {"cache_control": {"type": "ephemeral"}} | ||
|
|
||
| extra_kwargs = apply_prompt_cache_to_kwargs( | ||
| config=ClientConfig( | ||
| client_type="anthropic_messages", | ||
| api_base_url="https://api.anthropic.com/v1", | ||
| ), | ||
| sampling_args={"cache_control": {"type": "custom"}}, | ||
| extra_kwargs={}, | ||
| ) | ||
|
|
||
| assert extra_kwargs == {} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -423,13 +423,29 @@ def parse_usage(response: OpenAIChatResponse) -> Usage | None: | |
| completion_tokens, int | ||
| ): | ||
| return None | ||
| prompt_details = get_usage_field(usage, "prompt_tokens_details") | ||
| if prompt_details is None: | ||
| prompt_details = get_usage_field(usage, "input_tokens_details") | ||
| cached_tokens = None | ||
| if prompt_details is not None: | ||
| reported_cached_tokens = get_usage_field( | ||
| prompt_details, "cached_tokens" | ||
| ) | ||
| if isinstance(reported_cached_tokens, int) and not isinstance( | ||
| reported_cached_tokens, bool | ||
| ): | ||
| cached_tokens = reported_cached_tokens | ||
| prompt_tokens = max(0, prompt_tokens - cached_tokens) | ||
|
OpenAI cached tokens excluded from cost calculation (Low Severity). For OpenAI-compatible clients, … Reviewed by Cursor Bugbot for commit badc2c5. |
||
| if not isinstance(total_tokens, int): | ||
| total_tokens = prompt_tokens + completion_tokens | ||
| elif cached_tokens is not None: | ||
| total_tokens = max(0, total_tokens - cached_tokens) | ||
| return Usage( | ||
| prompt_tokens=prompt_tokens, | ||
| reasoning_tokens=0, | ||
| completion_tokens=completion_tokens, | ||
| total_tokens=total_tokens, | ||
| cached_input_tokens=cached_tokens, | ||
| ) | ||
|
|
||
| def parse_is_truncated(response: OpenAIChatResponse) -> bool: | ||
|
|
||
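The accounting in the `parse_usage` hunk above can be sketched in isolation. This is a simplified stand-in, not the PR's code: `UsageSketch` and `split_cached_tokens` are hypothetical names, and the sketch assumes the subtraction on `prompt_tokens` only runs when a cached count was actually reported (the page has lost the diff's indentation, so that guard is inferred).

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UsageSketch:
    """Hypothetical stand-in for the Usage type used in the diff."""

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cached_input_tokens: Optional[int]


def split_cached_tokens(
    prompt_tokens: int,
    completion_tokens: int,
    total_tokens: Any,
    reported_cached_tokens: Any,
) -> UsageSketch:
    cached_tokens = None
    # bool is a subclass of int, so it is rejected explicitly, as in the diff.
    if isinstance(reported_cached_tokens, int) and not isinstance(
        reported_cached_tokens, bool
    ):
        cached_tokens = reported_cached_tokens
        # Assumption: cached tokens are removed from the uncached prompt count.
        prompt_tokens = max(0, prompt_tokens - cached_tokens)
    if not isinstance(total_tokens, int):
        total_tokens = prompt_tokens + completion_tokens
    elif cached_tokens is not None:
        total_tokens = max(0, total_tokens - cached_tokens)
    return UsageSketch(prompt_tokens, completion_tokens, total_tokens, cached_tokens)


usage = split_cached_tokens(100, 10, 110, 80)
print(usage)
# UsageSketch(prompt_tokens=20, completion_tokens=10, total_tokens=30, cached_input_tokens=80)
```

With these numbers, a cost calculation that reads only `prompt_tokens` would price 20 input tokens, so the 80 cached tokens would need to be priced separately from `cached_input_tokens`. That appears to be the concern the review comment raises.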
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| from collections.abc import Mapping | ||
| from typing import Any | ||
| from urllib.parse import urlsplit | ||
|
|
||
| from verifiers.types import ClientConfig | ||
|
|
||
| ANTHROPIC_ORIGINS = frozenset({"https://api.anthropic.com"}) | ||
|
|
||
|
|
||
| def endpoint_origin(api_base_url: str) -> str | None: | ||
|
Member
this could be a substring match on api.anthropic.com tbf |
||
| parsed = urlsplit(api_base_url) | ||
| if not parsed.scheme or not parsed.hostname: | ||
| return None | ||
| scheme = parsed.scheme.lower() | ||
| host = parsed.hostname.lower() | ||
| port = parsed.port | ||
| netloc = host | ||
| if ":" in host: | ||
| netloc = f"[{host}]" | ||
| if port is not None and not ( | ||
| (scheme == "https" and port == 443) or (scheme == "http" and port == 80) | ||
| ): | ||
| netloc = f"{netloc}:{port}" | ||
| return f"{scheme}://{netloc}" | ||
|
macroscopeapp[bot] marked this conversation as resolved.
|
||
|
|
||
|
|
||
| def uses_official_anthropic_messages(config: ClientConfig | None) -> bool: | ||
| return ( | ||
| config is not None | ||
| and config.client_type == "anthropic_messages" | ||
| and endpoint_origin(config.api_base_url) in ANTHROPIC_ORIGINS | ||
| ) | ||
|
|
||
|
|
||
| def _cache_control_payload() -> dict[str, str]: | ||
|
Member
remove this func |
||
| return {"type": "ephemeral"} | ||
|
|
||
|
|
||
| def apply_prompt_cache_to_kwargs( | ||
| *, | ||
| config: ClientConfig | None, | ||
| sampling_args: Mapping[str, Any], | ||
| extra_kwargs: Mapping[str, Any], | ||
| ) -> dict[str, Any]: | ||
| updated_extra_kwargs = dict(extra_kwargs) | ||
| if ( | ||
| uses_official_anthropic_messages(config) | ||
| and "cache_control" not in sampling_args | ||
| ): | ||
| updated_extra_kwargs.setdefault("cache_control", _cache_control_payload()) | ||
|
Collaborator
I think this might break when the user has already set a custom Anthropic cache control setting in the sampling args
cursor[bot] marked this conversation as resolved.
|
||
| return updated_extra_kwargs | ||
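To make the trade-off in the `endpoint_origin` thread concrete, here is a minimal sketch comparing the exact-origin check from the diff with the substring match suggested in review. `endpoint_origin` is restated from the diff in condensed form. `is_official_by_origin`, `is_official_by_substring`, and the example URLs (including the `proxy.example` host) are hypothetical and only serve the illustration.

```python
from typing import Optional
from urllib.parse import urlsplit

ANTHROPIC_ORIGINS = frozenset({"https://api.anthropic.com"})


def endpoint_origin(api_base_url: str) -> Optional[str]:
    """Normalize a base URL to scheme://host[:port], dropping default ports."""
    parsed = urlsplit(api_base_url)
    if not parsed.scheme or not parsed.hostname:
        return None
    scheme = parsed.scheme.lower()
    host = parsed.hostname.lower()
    port = parsed.port
    # IPv6 literals need brackets when rebuilt into a netloc.
    netloc = f"[{host}]" if ":" in host else host
    is_default_port = (scheme == "https" and port == 443) or (
        scheme == "http" and port == 80
    )
    if port is not None and not is_default_port:
        netloc = f"{netloc}:{port}"
    return f"{scheme}://{netloc}"


def is_official_by_origin(url: str) -> bool:
    # Hypothetical helper: the exact-origin check used in the diff.
    return endpoint_origin(url) in ANTHROPIC_ORIGINS


def is_official_by_substring(url: str) -> bool:
    # Hypothetical helper: the simpler check suggested in review.
    return "api.anthropic.com" in url


for url in (
    "https://api.anthropic.com/v1",
    "https://API.Anthropic.com:443/v1",
    "https://api.anthropic.com.proxy.example/v1",
):
    print(url, is_official_by_origin(url), is_official_by_substring(url))
```

The two checks agree on the canonical URL and differ on the other two: the origin check accepts a differently cased host with an explicit default port and rejects a look-alike host, while the plain substring check does the opposite. Whether that extra strictness is worth the longer helper is the judgment call under discussion in the thread.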


Missing `importorskip` guard breaks test without `anthropic` (Medium Severity)
The new `test_anthropic_from_native_response_extracts_cache_usage` test imports `AnthropicMessagesClient` without first calling `pytest.importorskip("anthropic")`. Every other Anthropic test in this file (lines 57, 103, 126, 156, 213, 235) uses this guard. Since `anthropic_messages_client.py` unconditionally imports from `anthropic` at the top level, this test will crash with an `ImportError` in environments where the `anthropic` package is not installed, instead of being gracefully skipped.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit badc2c5.
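A guarded version of the test might look like the sketch below. The package path `verifiers.clients.anthropic_messages_client` is an assumption (only the file name `anthropic_messages_client.py` appears in the comment), and the final assertion is a placeholder for the test's real body, which is not shown on this page.

```python
import pytest


def test_anthropic_from_native_response_extracts_cache_usage():
    # Skip this test, rather than fail with ImportError, when the optional
    # `anthropic` dependency is not installed.
    pytest.importorskip("anthropic")

    # Import only after the guard has passed. The module path is assumed.
    from verifiers.clients.anthropic_messages_client import AnthropicMessagesClient

    # Placeholder for the real cache-usage assertions.
    assert AnthropicMessagesClient is not None
```

Keeping the client import inside the test body, after the guard, matters here: a module-level import would still raise at collection time before `importorskip` has a chance to run.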