Skip to content

Improve Claude prompt caching usage #279

@0xallam

Description

@0xallam

Right now Strix’s Claude prompt caching is only applied to the system prompt, but most of the repeated token spend happens in other parts of the input context.

We should expand caching to cover more reusable prompt segments so repeated calls within a run can reuse cached tokens and cut cost significantly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    Status

    To-Do

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions