
Conversation


@af001 af001 commented Dec 7, 2025

Description

This PR fixes the cachePoint formatting issue in BedrockModel that was causing ParamValidationError when using prompt caching with system prompts.

Problem: When using cachePoint in system prompts, Bedrock's API returned a validation error because the system content blocks were not being formatted correctly. Bedrock requires cachePoint to be a separate content block (tagged union), not merged with other fields.

Solution: Added _format_bedrock_system_blocks() method that formats system content blocks using the same logic as message content blocks, ensuring cachePoint blocks remain as separate content blocks.

Verified: Cache metrics now show cacheWriteInputTokens on first request and cacheReadInputTokens on subsequent requests, confirming prompt caching works correctly.
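The shape of the fix can be sketched as follows. This is a minimal, hypothetical illustration only — the real method is `_format_bedrock_system_blocks()` in `BedrockModel`; the name `format_system_blocks` and the input block shapes here are assumptions, not the actual implementation:

```python
# Sketch: Bedrock's Converse API treats each system content block as a
# tagged union (text | guardContent | cachePoint). A cachePoint must be
# emitted as its own block, never merged into a text block.
def format_system_blocks(blocks: list[dict]) -> list[dict]:
    """Split any block that carries a cachePoint into separate blocks."""
    formatted = []
    for block in blocks:
        if "cachePoint" in block:
            # Keep any non-cachePoint fields (e.g. text) as their own block.
            rest = {k: v for k, v in block.items() if k != "cachePoint"}
            if rest:
                formatted.append(rest)
            # cachePoint becomes a standalone content block.
            formatted.append({"cachePoint": block["cachePoint"]})
        else:
            formatted.append(block)
    return formatted
```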

Related Issues

Fixes #1219
Fixes #1015

Documentation PR

N/A - No documentation changes needed

Type of Change

Bug fix

Testing

How have you tested the change?

  • All 92 bedrock unit tests pass (hatch test -- tests/strands/models/test_bedrock.py)

  • All 1608 unit tests pass (hatch test)

  • All 19 bedrock integration tests pass (hatch run test-integ -- tests_integ/models/test_model_bedrock.py)

  • Added 5 new tests specifically for cachePoint formatting

  • Verified cache metrics show actual cache hits with real Bedrock API calls

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Output

================================================================================
CACHE PERFORMANCE SUMMARY
================================================================================

Request 1 (First - cache CREATED):
  - Cache Write Input Tokens: 1761 tokens  <- system prompt written to cache
  - Cache Read Input Tokens:  0 tokens
  - Input Tokens:             14 tokens  <- user query only

Request 2 (Second - cache HIT):
  - Cache Write Input Tokens: 0 tokens
  - Cache Read Input Tokens:  1761 tokens  <- reused from cache!
  - Input Tokens:             197 tokens  <- user query + conversation

Request 3 (Third - cache HIT):
  - Cache Write Input Tokens: 0 tokens
  - Cache Read Input Tokens:  1761 tokens  <- reused from cache!
  - Input Tokens:             442 tokens  <- user query + conversation

================================================================================
SUCCESS: Prompt caching is working!
   Total Cache Write: 1761 tokens (charged at write rate)
   Total Cache Read:  3522 tokens (charged at ~90% discount)
   Total Input:       653 tokens (charged at standard rate)

   Without caching, requests 2 & 3 would have cost 3522 more input tokens!
================================================================================

Notes

NOTE: Bedrock Prompt Caching Minimum Token Requirements (5 minute cache)
--------------------------------------------------------
| Model                  | Min Tokens per Cache Checkpoint |
|------------------------|--------------------------------|
| Claude Sonnet 4        | 1,024 tokens                   |
| Claude 3.7 Sonnet      | 1,024 tokens                   |
| Claude 3.5 Haiku       | 2,048 tokens                   |
| Claude Opus 4.5        | 4,096 tokens                   |
| Amazon Nova (all)      | 1,000 tokens                   |
--------------------------------------------------------
Reference: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
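The minimums above can be checked before inserting a checkpoint. A hedged sketch — the thresholds are copied from the table, but the model keys and the helper itself are made up for illustration, and token counting is left to the caller:

```python
# Minimum tokens per cache checkpoint (5 minute cache), per the table above.
# The string keys are illustrative, not official Bedrock model IDs.
MIN_CACHE_TOKENS = {
    "claude-sonnet-4": 1024,
    "claude-3-7-sonnet": 1024,
    "claude-3-5-haiku": 2048,
    "claude-opus-4-5": 4096,
    "amazon-nova": 1000,
}

def meets_cache_minimum(model_key: str, prompt_tokens: int) -> bool:
    """Return True if the prompt is long enough for this model to cache it.

    Prompts below the minimum are silently not cached, which shows up as
    cacheWriteInputTokens/cacheReadInputTokens staying at 0.
    """
    return prompt_tokens >= MIN_CACHE_TOKENS.get(model_key, 1024)
```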


af001 commented Dec 11, 2025

Here is the error you get:

    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in system[1]: "cachePoint", must be one of: text, guardContent

This is a result of attempting to use both methods for Bedrock System Prompt Caching. For example:

bedrock_model = BedrockModel(
    boto_session=boto_session,
    boto_client_config=bedrock_config,
    model_id=model,
    temperature=self.parameters.temperature,
    streaming=streaming,
    cache_prompt=cache_prompt,  # True
)
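For context, here is a minimal sketch of the malformed versus correct system payload shape. The prompt text is hypothetical; the field names follow Bedrock's Converse API tagged union of `text | guardContent | cachePoint` quoted in the error above:

```python
# Malformed: cachePoint merged into a text block. Bedrock rejects this with
# ParamValidationError, because each system entry must be exactly one
# variant of the tagged union.
bad_system = [
    {"text": "You are a helpful assistant. ..."},
    {"text": "...", "cachePoint": {"type": "default"}},  # rejected
]

# Correct: cachePoint emitted as its own content block.
good_system = [
    {"text": "You are a helpful assistant. ..."},
    {"cachePoint": {"type": "default"}},
]
```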

As a result, no cacheRead/Write tokens are being used and input_tokens remains high. System prompts are static and are the majority of the tokens we are being charged for:

{ # First run
  ...
  "input_tokens": 4489,
  "output_tokens": 39,
  "total_tokens": 4528,
  "latency_ms": 2764,
  "cycle_count": 1,
  "total_duration_s": 3.0020978450775146,
  "cache_read_tokens": 0,
  "cache_write_tokens": 0,
  "time_to_first_byte_ms": null,
  "tool_calls": 1
},
{  # Second Run
  ...
  "input_tokens": 4494,
  "output_tokens": 40,
  "total_tokens": 4534,
  "latency_ms": 3257,
  "cycle_count": 1,
  "total_duration_s": 3.5222342014312744,
  "cache_read_tokens": 0,
  "cache_write_tokens": 0,
  "time_to_first_byte_ms": null,
  "tool_calls": 1
}

The file was last modified in #1112, which introduced the feature but appears to have carried a formatting bug, making this a regression. Verified that the issue affects v1.15 through v1.19. Did not test v1.14 because of the code refactoring that would be required.

@af001 af001 force-pushed the fix/bedrock-cache-point-messages branch from 6eda9cd to 5026f6c Compare December 16, 2025 01:22
@github-actions github-actions bot added size/m and removed size/m labels Dec 16, 2025
