
Conversation

@steven10a (Collaborator) commented Dec 1, 2025

Implements an update for issue #41 - thank you @thuanng-a11y for reporting!

  • For guardrails that use LLMs, return token usage data
  • Per-guardrail token usage is returned via each result's info dict and can be accessed with:
for gr in response.guardrail_results.all_results:
    usage = gr.info.get("token_usage")
    if usage:
        print(f"{gr.info['guardrail_name']}: {usage['total_tokens']} tokens")
  • Additionally, the total tokens used across a response can be retrieved with the total_guardrail_token_usage(response) helper. This works for all clients (GuardrailAgent, GuardrailAsyncOpenAI, ...), with both streaming and non-streaming, as shown in the sketch below
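A minimal sketch of the helper in use (assuming it returns a dict with the same prompt_tokens / completion_tokens / total_tokens keys as the per-guardrail token_usage):
from guardrails import total_guardrail_token_usage

usage = total_guardrail_token_usage(response)  # aggregated across all guardrail calls
if usage and usage.get("total_tokens") is not None:
    print(f"guardrail total: {usage['total_tokens']} tokens")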
  • Updated the documentation to reflect this new functionality
  • Updated example scripts to show implementation
  • Added tests

Copilot AI review requested due to automatic review settings December 1, 2025 23:20
Copilot AI left a comment

Pull request overview

This PR implements token usage tracking for LLM-based guardrails, addressing issue #41. The implementation provides per-guardrail token statistics and aggregated totals across all guardrail calls, working seamlessly with all client surfaces (OpenAI clients, Agents SDK, streaming and non-streaming).

Key Changes:

  • Introduced TokenUsage dataclass and helper functions (extract_token_usage, token_usage_to_dict, aggregate_token_usage_from_infos) to capture and aggregate token consumption data
  • Updated all LLM-based guardrails (Jailbreak, Custom Prompt Check, Prompt Injection Detection, Hallucination Detection) to return token usage alongside their analysis results
  • Added a unified total_guardrail_token_usage() helper function that works across all guardrail surfaces for easy token tracking
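
A rough sketch of the new dataclass shape (field names inferred from the aggregated dict quoted in the review below; the decorator and defaults are assumptions):
from dataclasses import dataclass

@dataclass(frozen=True)
class TokenUsage:
    # None means the provider did not report that component.
    prompt_tokens: int | None = None
    completion_tokens: int | None = None
    total_tokens: int | None = None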

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated no comments.

Summary per file:

  • src/guardrails/types.py: Adds the TokenUsage dataclass, extraction utilities, and the total_guardrail_token_usage() unified interface
  • src/guardrails/_base_client.py: Adds a total_token_usage property to GuardrailResults for token aggregation
  • src/guardrails/checks/text/llm_base.py: Updates run_llm() to return a token usage tuple; modifies create_llm_check_fn() to include token usage in result info
  • src/guardrails/checks/text/jailbreak.py: Updates the jailbreak guardrail to capture and include token usage in results
  • src/guardrails/checks/text/prompt_injection_detection.py: Updates prompt injection detection to capture and include token usage in results
  • src/guardrails/checks/text/hallucination_detection.py: Updates hallucination detection to capture and include token usage in results
  • src/guardrails/agents.py: Updates agent guardrail wrappers to propagate token usage in output_info even for successful checks
  • src/guardrails/__init__.py: Exports the total_guardrail_token_usage helper function for the public API
  • tests/unit/test_types.py: Comprehensive tests for TokenUsage, extraction, aggregation, and the unified helper
  • tests/unit/test_base_client.py: Tests for token aggregation in GuardrailResults
  • tests/unit/checks/test_llm_base.py: Updates tests to verify token usage is returned from LLM calls
  • tests/unit/checks/test_jailbreak.py: Updates tests to mock token usage in return values
  • tests/unit/checks/test_prompt_injection_detection.py: Updates tests to mock token usage in return values
  • tests/unit/test_agents.py: Adds a test verifying successful agent guardrails return info with token usage
  • docs/quickstart.md: Documents token usage tracking with examples for all client surfaces
  • docs/agents_sdk_integration.md: Documents token usage tracking for the Agents SDK with per-stage examples
  • examples/basic/hello_world.py: Demonstrates token usage tracking in the basic example
  • examples/basic/multi_bundle.py: Demonstrates token usage tracking in the streaming example
  • examples/basic/local_model.py: Demonstrates token usage tracking with local models
  • src/guardrails/utils/anonymizer.py: Code formatting cleanup (unrelated to token usage)
  • src/guardrails/checks/text/pii.py: Code formatting cleanup (unrelated to token usage)
  • src/guardrails/checks/text/urls.py: Code formatting cleanup (unrelated to token usage)
  • src/guardrails/client.py: Code formatting cleanup (unrelated to token usage)
  • src/guardrails/evals/core/async_engine.py: Code formatting cleanup (unrelated to token usage)
  • tests/unit/evals/test_guardrail_evals.py: Code formatting cleanup (unrelated to token usage)
  • tests/unit/evals/test_async_engine.py: Code formatting cleanup (unrelated to token usage)
  • tests/unit/checks/test_anonymizer_baseline.py: Code formatting cleanup (unrelated to token usage)


@steven10a (Collaborator, Author)

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +236 to +239
return {
    "prompt_tokens": total_prompt if has_any_data else None,
    "completion_tokens": total_completion if has_any_data else None,
    "total_tokens": total if has_any_data else None,


P2: Avoid reporting missing token fields as zero

aggregate_token_usage_from_infos uses a single has_any_data flag while totals are initialised to zero, so if any guardrail contributes only one of the token fields (e.g., a provider exposes total_tokens but not prompt_tokens/completion_tokens, or a run only records prompt tokens), the function returns 0 for the missing fields instead of None. That misstates usage and under-reports costs whenever some components are unavailable, even though no data exists for those fields. Consider tracking availability per field or leaving fields as None unless a value was actually aggregated.
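
A sketch of the per-field tracking Codex suggests (a hypothetical replacement body for aggregate_token_usage_from_infos, not the merged code):
def aggregate_token_usage_from_infos(infos):
    # Track availability per field: a component stays None unless some
    # guardrail actually reported a value for it.
    totals = {"prompt_tokens": None, "completion_tokens": None, "total_tokens": None}
    for info in infos:
        usage = (info or {}).get("token_usage") or {}
        for field in totals:
            value = usage.get(field)
            if value is not None:
                totals[field] = (totals[field] or 0) + value
    return totals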


@steven10a (Collaborator, Author) replied:

I am okay with this implementation. OpenAI clients and popular third-party providers were tested, and all return the same three token fields. A hypothetical edge-case client may return different fields, but in that case we would simply report that no token data is available.

@gabor-openai (Collaborator) left a comment

LGTM TY

@gabor-openai gabor-openai merged commit 2615892 into main Dec 2, 2025
3 checks passed
@gabor-openai gabor-openai deleted the dev/steven/token_count branch December 2, 2025 19:42