feat(core): add pinFirstUserTurn compaction option for prefix cache stability #33604
azhen073 wants to merge 1 commit into
…tability

When enabled, the first user message is preserved verbatim during compaction. This keeps the prompt prefix byte-identical across turns, maximizing prompt cache hit rates for providers with prefix caching (e.g. DeepSeek).

Configuration:
- V2 (opencode.json): compaction.pinFirstUserTurn = true
- V1 (opencode.json): compaction.pin_first_user_turn = true

Default is false, preserving existing behavior.
This PR doesn't fully meet our contributing guidelines and PR template. What needs to be fixed:
Please edit this PR description to address the above within 2 hours, or it will be automatically closed. If you believe this was flagged incorrectly, please let a maintainer know.
The following comment was made by an LLM, it may be inaccurate: Potential Duplicate/Related PRs Found:
These are worth reviewing to ensure your new …
This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window. Feel free to open a new pull request that follows our guidelines.
Issue for this PR
N/A — incremental improvement, no issue filed.
Type of change
What does this PR do?
Adds a `pinFirstUserTurn` compaction option. When enabled, the first user message is kept verbatim during compaction instead of being summarized. This keeps the prompt prefix stable across turns, which helps providers with prefix caching (DeepSeek, OpenAI, etc.) reuse cached computation.

The change is small (~30 lines, 4 files) and defaults to false, so existing users are unaffected.
Related: #31867 improves DeepSeek cache hits from the system prompt side (date injection); this PR improves them from the compaction side (first user turn protection). The two changes complement each other.
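To make the effect of the option concrete, here is a minimal sketch of a compaction step that pins the first user turn. It is not the diff from this PR: the `ChatMessage` shape, the `tailTurns` parameter, and the `summarize` stand-in are all assumptions made for illustration. Only the option name `pinFirstUserTurn` comes from the description above.

```typescript
// Hypothetical message shape; the real opencode types are not shown in this PR text.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  text: string;
}

interface CompactionOptions {
  pinFirstUserTurn: boolean; // option name taken from this PR
  tailTurns: number; // assumed: number of trailing messages kept verbatim
}

// Stand-in summarizer (assumption): a real one would call a model.
function summarize(messages: ChatMessage[]): ChatMessage {
  return { role: "assistant", text: `[summary of ${messages.length} messages]` };
}

function compact(messages: ChatMessage[], opts: CompactionOptions): ChatMessage[] {
  const tailStart = Math.max(0, messages.length - opts.tailTurns);
  const firstUser = messages.findIndex((m) => m.role === "user");
  const pin = opts.pinFirstUserTurn && firstUser !== -1 && firstUser < tailStart;

  // With pinning, everything up to and including the first user turn stays as-is,
  // so the start of the prompt is unchanged after compaction.
  const keptPrefix = pin ? messages.slice(0, firstUser + 1) : [];
  const toSummarize = messages.slice(keptPrefix.length, tailStart);
  const tail = messages.slice(tailStart);

  const summary = toSummarize.length > 0 ? [summarize(toSummarize)] : [];
  return [...keptPrefix, ...summary, ...tail];
}

const turns: ChatMessage[] = [
  { role: "user", text: "Refactor the parser module." },
  { role: "assistant", text: "Reading files..." },
  { role: "user", text: "Also add tests." },
  { role: "assistant", text: "Done." },
];

const pinned = compact(turns, { pinFirstUserTurn: true, tailTurns: 1 });
const unpinned = compact(turns, { pinFirstUserTurn: false, tailTurns: 1 });
```

In this sketch `pinned` still begins with the original first user message, while `unpinned` begins with a freshly generated summary, which would change the prompt prefix and likely miss a prefix cache.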
How did you verify your code works?
Reviewed the logic manually:
- … `pinFirstUserTurn` is set, ensuring it lands in `recent` instead of `head`. The index calculation properly accounts for compaction messages being filtered out and serialized messages that evaluate to empty being skipped.
- … `tail_turns`.
- … `false` means no change to current behavior.

I haven't been able to run tests: building requires downloading platform-specific Bun dependencies, which timed out in my environment. Happy to run them if a maintainer can suggest a simpler test path.
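The index calculation mentioned in the manual review could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the `RawMessage` shape, the `isCompaction` flag, and the `serialize` helper are hypothetical. The point it shows is that the position of the first user turn is counted in the serialized list, after compaction messages are filtered out and empty serializations are skipped, which can differ from its position in the raw history.

```typescript
// Hypothetical shape: `isCompaction` marks summary messages from an earlier compaction.
interface RawMessage {
  role: "user" | "assistant";
  text: string;
  isCompaction?: boolean;
}

// Assumed serializer: returns "" for messages with nothing to send.
function serialize(m: RawMessage): string {
  return m.text.trim();
}

// Returns the index of the first user turn within the list that is actually
// serialized (compaction messages removed, empty serializations skipped),
// or -1 if there is no such turn.
function firstUserTurnIndex(messages: RawMessage[]): number {
  let serializedIndex = 0;
  for (const m of messages) {
    if (m.isCompaction) continue; // filtered out before serialization
    if (serialize(m) === "") continue; // serializes to empty, so it takes no slot
    if (m.role === "user") return serializedIndex;
    serializedIndex++;
  }
  return -1;
}

const raw: RawMessage[] = [
  { role: "assistant", text: "[earlier summary]", isCompaction: true },
  { role: "assistant", text: "   " },
  { role: "assistant", text: "Hello, how can I help?" },
  { role: "user", text: "Fix the build." },
];

// Raw position is 3, but only one serialized message precedes the user turn.
const idx = firstUserTurnIndex(raw);
```

A small unit test around a helper like this would not need the full build, which may be one way to cover the logic that could only be reviewed manually here.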
Screenshots / recordings
N/A — not a UI change.
Checklist