Return Guardrail token usage #62
Conversation
Pull request overview
This PR implements token usage tracking for LLM-based guardrails, addressing issue #41. The implementation provides per-guardrail token statistics and aggregated totals across all guardrail calls, working seamlessly with all client surfaces (OpenAI clients, Agents SDK, streaming and non-streaming).
Key Changes:
- Introduced a TokenUsage dataclass and helper functions (extract_token_usage, token_usage_to_dict, aggregate_token_usage_from_infos) to capture and aggregate token consumption data (a rough sketch of the dataclass follows this list)
- Updated all LLM-based guardrails (Jailbreak, Custom Prompt Check, Prompt Injection Detection, Hallucination Detection) to return token usage alongside their analysis results
- Added a unified total_guardrail_token_usage() helper function that works across all guardrail surfaces for easy token tracking
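The PR's source is not shown in this conversation, so the following is only a rough sketch of what the TokenUsage dataclass and its dict serializer might look like, assuming the three standard OpenAI usage fields; the defaults and exact signatures are guesses, not the PR's code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Per-call token counts; a field is None when the provider does not report it."""

    prompt_tokens: int | None = None
    completion_tokens: int | None = None
    total_tokens: int | None = None


def token_usage_to_dict(usage: TokenUsage) -> dict[str, int | None]:
    """Serialize a TokenUsage so it can be embedded in a guardrail's info dict."""
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }
```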
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated no comments.
Summary per file:
| File | Description |
|---|---|
| src/guardrails/types.py | Adds TokenUsage dataclass, extraction utilities, and total_guardrail_token_usage() unified interface |
| src/guardrails/_base_client.py | Adds total_token_usage property to GuardrailResults for token aggregation |
| src/guardrails/checks/text/llm_base.py | Updates run_llm() to return token usage tuple; modifies create_llm_check_fn() to include token usage in result info |
| src/guardrails/checks/text/jailbreak.py | Updates jailbreak guardrail to capture and include token usage in results |
| src/guardrails/checks/text/prompt_injection_detection.py | Updates prompt injection detection to capture and include token usage in results |
| src/guardrails/checks/text/hallucination_detection.py | Updates hallucination detection to capture and include token usage in results |
| src/guardrails/agents.py | Updates agent guardrail wrappers to propagate token usage in output_info even for successful checks |
| src/guardrails/__init__.py | Exports the total_guardrail_token_usage helper function for the public API |
| tests/unit/test_types.py | Comprehensive tests for TokenUsage, extraction, aggregation, and unified helper |
| tests/unit/test_base_client.py | Tests for token aggregation in GuardrailResults |
| tests/unit/checks/test_llm_base.py | Updates tests to verify token usage is returned from LLM calls |
| tests/unit/checks/test_jailbreak.py | Updates tests to mock token usage in return values |
| tests/unit/checks/test_prompt_injection_detection.py | Updates tests to mock token usage in return values |
| tests/unit/test_agents.py | Adds test verifying successful agent guardrails return info with token usage |
| docs/quickstart.md | Documents token usage tracking with examples for all client surfaces |
| docs/agents_sdk_integration.md | Documents token usage tracking for Agents SDK with per-stage examples |
| examples/basic/hello_world.py | Demonstrates token usage tracking in basic example |
| examples/basic/multi_bundle.py | Demonstrates token usage tracking in streaming example |
| examples/basic/local_model.py | Demonstrates token usage tracking with local models |
| src/guardrails/utils/anonymizer.py | Code formatting cleanup (unrelated to token usage) |
| src/guardrails/checks/text/pii.py | Code formatting cleanup (unrelated to token usage) |
| src/guardrails/checks/text/urls.py | Code formatting cleanup (unrelated to token usage) |
| src/guardrails/client.py | Code formatting cleanup (unrelated to token usage) |
| src/guardrails/evals/core/async_engine.py | Code formatting cleanup (unrelated to token usage) |
| tests/unit/evals/test_guardrail_evals.py | Code formatting cleanup (unrelated to token usage) |
| tests/unit/evals/test_async_engine.py | Code formatting cleanup (unrelated to token usage) |
| tests/unit/checks/test_anonymizer_baseline.py | Code formatting cleanup (unrelated to token usage) |
…o dev/steven/token_count
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
return {
    "prompt_tokens": total_prompt if has_any_data else None,
    "completion_tokens": total_completion if has_any_data else None,
    "total_tokens": total if has_any_data else None,
Avoid reporting missing token fields as zero
aggregate_token_usage_from_infos uses a single has_any_data flag while totals are initialised to zero, so if any guardrail contributes only one of the token fields (e.g., a provider exposes total_tokens but not prompt_tokens/completion_tokens, or a run only records prompt tokens), the function returns 0 for the missing fields instead of None. That misstates usage and under-reports costs whenever some components are unavailable, even though no data exists for those fields. Consider tracking availability per field or leaving fields as None unless a value was actually aggregated.
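As a rough illustration of the suggested fix (not the PR's actual code), a per-field variant could keep each total as None until a value is actually seen; the info-dict shape with a "usage" sub-dict is an assumption here.

```python
def aggregate_token_usage_from_infos(infos: list[dict]) -> dict[str, int | None]:
    """Sum each token field independently; a field stays None unless at least
    one info dict actually reported a value for it."""
    totals: dict[str, int | None] = {
        "prompt_tokens": None,
        "completion_tokens": None,
        "total_tokens": None,
    }
    for info in infos:
        usage = info.get("usage") or {}  # assumed location of per-check usage data
        for field in totals:
            value = usage.get(field)
            if value is None:
                continue
            totals[field] = value if totals[field] is None else totals[field] + value
    return totals
```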
I am okay with this implementation. OpenAI clients and popular third-party providers were tested, and all return the same three token fields. A hypothetical edge-case client might return different fields, but in that case we would simply report that there is no token data.
gabor-openai left a comment:
LGTM TY
Implements an update for issue #41 - thank you @thuanng-a11y for reporting!
Token usage is returned in each guardrail's info_dict and can be accessed with the total_guardrail_token_usage helper: total_guardrail_token_usage(response). This works for all clients (GuardrailAgent, GuardrailAsyncOpenAI, ...) and with both streaming and non-streaming responses.
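A minimal usage sketch, assuming GuardrailAsyncOpenAI is importable from the package root alongside the helper and accepts a config file path; the model name and call shape below are placeholders rather than code from this PR (see examples/basic/hello_world.py in the table above for the real example).

```python
import asyncio

from guardrails import GuardrailAsyncOpenAI, total_guardrail_token_usage


async def main() -> None:
    # Constructor arguments and the config path are placeholders; the
    # chat.completions call mirrors the standard OpenAI client surface.
    client = GuardrailAsyncOpenAI(config="guardrails_config.json")
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    # Aggregated guardrail token usage across all checks that ran on this call.
    print(total_guardrail_token_usage(response))


asyncio.run(main())
```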