Skip to content

Conversation

@KiyotakaMatsushita
Copy link

Fix handle_redact_content to properly handle redactUserContentMessage

Problem

When using AWS Bedrock Guardrails with separate custom messages for input and output blocking, the handle_redact_content function in src/strands/event_loop/streaming.py only processes redactAssistantContentMessage and completely ignores redactUserContentMessage. This causes the output guardrail block message to be displayed even when the input guardrail is triggered.

Related Issue: #1324

Root Cause

The current implementation only checks for redactAssistantContentMessage:

def handle_redact_content(event: RedactContentEvent, state: dict[str, Any]) -> None:
    if event.get("redactAssistantContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactAssistantContentMessage"]}]
    # ❌ redactUserContentMessage is completely ignored

Solution

This PR fixes the issue by checking both message types in the correct order:

def handle_redact_content(event: RedactContentEvent, state: dict[str, Any]) -> None:
    # Check for input redaction first
    if event.get("redactUserContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactUserContentMessage"]}]
    # Check for output redaction
    elif event.get("redactAssistantContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactAssistantContentMessage"]}]

Changes

  1. Updated handle_redact_content function (src/strands/event_loop/streaming.py):

    • Added check for redactUserContentMessage (input guardrail blocking)
    • Prioritizes input redaction over output redaction
    • Falls back to redactAssistantContentMessage if input message is not present
  2. Updated test expectations (tests/strands/event_loop/test_streaming.py):

    • Fixed the expected redaction message in the test case
    • Test now expects redactUserContentMessage value when both fields are present
    • Aligns test behavior with the corrected implementation

Impact

Before

  • ❌ Input guardrail triggers → Shows output block message
  • ✅ Output guardrail triggers → Shows output block message

After

  • ✅ Input guardrail triggers → Shows input block message
  • ✅ Output guardrail triggers → Shows output block message

Testing

The existing test case for redacted messages has been updated to reflect the correct behavior. The test will pass with the new implementation where redactUserContentMessage is properly prioritized.

Additional Context

Consistency with agent.py

Interestingly, src/strands/agent/agent.py already correctly handles redactUserContentMessage:

if (
    isinstance(event, ModelStreamChunkEvent)
    and event.chunk
    and event.chunk.get("redactContent")
    and event.chunk["redactContent"].get("redactUserContentMessage")
):
    self.messages[-1]["content"] = self._redact_user_content(...)

This PR brings streaming.py into alignment with the correct implementation already present in agent.py.

AWS Bedrock Event Structure

When a guardrail is triggered, AWS Bedrock sends both fields in the redactContent event:

{
  "redactContent": {
    "redactUserContentMessage": "Input blocked message",
    "redactAssistantContentMessage": "Output blocked message"
  }
}

The implementation needs to determine which one to use based on which guardrail was actually triggered.

Checklist

  • Code follows the project's coding standards
  • Updated relevant tests
  • Added comments explaining the fix
  • Verified the fix addresses the reported issue
  • CI/CD tests pass (will be verified after PR submission)

Previously, handle_redact_content only checked for redactAssistantContentMessage
and completely ignored redactUserContentMessage, causing incorrect block messages
to be displayed when input guardrails were triggered.

This commit fixes the issue by:
- Checking redactUserContentMessage first (input guardrail)
- Falling back to redactAssistantContentMessage if input message is not present
- Updating test expectations to match the corrected behavior

Fixes the issue where input guardrail blocks incorrectly showed output
guardrail messages, improving user experience and message accuracy.

Related to: strands-agents#1324
Added test cases to cover all scenarios:
1. Both redactUserContentMessage and redactAssistantContentMessage present
   - Verifies that redactUserContentMessage takes priority
2. Only redactUserContentMessage present (input guardrail)
   - Verifies input-only blocking works correctly
3. Only redactAssistantContentMessage present (output guardrail)
   - Verifies output-only blocking works correctly

This ensures the fix properly handles all possible guardrail configurations.
AWS Bedrock sends multiple redactContent events in sequence:
1. First event with redactUserContentMessage
2. Second event with redactAssistantContentMessage

Previous implementation processed both events, causing the second one
to override the first, leading to incorrect messages being displayed.

This fix adds a 'redacted' flag to state to ensure only the first
redactContent event is processed, maintaining the correct priority:
- redactUserContentMessage (input guardrail) takes precedence
- redactAssistantContentMessage (output guardrail) is used if no input message

Related to: strands-agents#1324
AWS Bedrock always sends both redactUserContentMessage and
redactAssistantContentMessage regardless of which guardrail was
triggered. The trace metadata contains the actual trigger information.

Changes:
- Store both redact messages in state instead of choosing immediately
- Add finalize_redact_message() to analyze trace and select correct message
- Add _check_if_blocked() helper to check if any policy was blocked
- Call finalize_redact_message() when metadata event is received

This ensures:
- Input guardrail → shows redactUserContentMessage
- Output guardrail → shows redactAssistantContentMessage
The outputAssessments field is a dict (not a list) with guardrail
IDs as keys and assessment lists as values.

Before: outputAssessments = []
After: outputAssessments = { "guardrail_id": [...] }
AWS Bedrock sends redactContent events with both input and output guardrail
messages regardless of which guardrail was actually triggered. Previously,
these chunks were yielded directly to the stream via ModelStreamChunkEvent,
causing the input guardrail message to appear in the output even when
the output guardrail was triggered.

This fix prevents redactContent chunks from being yielded to the stream.
The messages are still processed by handle_redact_content and stored in
state, then finalize_redact_message uses the trace information to select
the correct message based on which guardrail actually blocked.
Previously _generate_redaction_events() would generate both input and output
redaction messages based only on config flags, regardless of which guardrail
actually triggered the block.

This fix:
1. Adds guardrail_data parameter to _generate_redaction_events()
2. Uses trace data (inputAssessment/outputAssessments) to determine which
   guardrail was actually blocked
3. Generates only the appropriate redaction message:
   - Output guardrail block -> redactAssistantContentMessage
   - Input guardrail block -> redactUserContentMessage
4. Falls back to legacy behavior if guardrail_data is not provided

This ensures that when an output guardrail blocks AI-generated content,
the output guardrail message is displayed instead of the input guardrail message.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant