fix(sessions/claudecode): keep final streaming token count for duplicate entries #358
crhan wants to merge 1 commit into junhoyeo:main
Conversation
…ate entries

Claude Code's streaming API writes the same messageId:requestId multiple times while streaming: the first entry has a partial output_tokens count, and the last entry has the final, complete count. The previous HashSet-based dedup kept the first-seen (partial) entry, causing systematic token undercounts.

Switch processed_hashes from HashSet<String> to HashMap<String, usize>, mapping each dedup key to its index in the messages vec. When a duplicate is found, update the existing entry in-place if the new output_tokens is larger.

Add test_deduplication_keeps_max_output_for_streaming_duplicates to verify that three streaming writes for the same messageId:requestId collapse to one entry retaining the max output_tokens (300, not the partial 31).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@crhan is attempting to deploy a commit to the Inevitable Team on Vercel. A member of the Team first needs to authorize it.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="crates/tokscale-core/src/sessions/claudecode.rs">
<violation number="1" location="crates/tokscale-core/src/sessions/claudecode.rs:112">
P2: Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```diff
- if !processed_hashes.insert(hash.clone()) {
+ if let Some(&existing_idx) = processed_hashes.get(&hash) {
+     let new_output = usage.output_tokens.unwrap_or(0).max(0);
+     if new_output > messages[existing_idx].tokens.output {
```
P2: Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.
<file context>
@@ -91,23 +96,36 @@ pub fn parse_claude_file(path: &Path) -> Vec<UnifiedMessage> {
- if !processed_hashes.insert(hash.clone()) {
+ if let Some(&existing_idx) = processed_hashes.get(&hash) {
+     let new_output = usage.output_tokens.unwrap_or(0).max(0);
+     if new_output > messages[existing_idx].tokens.output {
+         let t = &mut messages[existing_idx].tokens;
+         t.input = usage.input_tokens.unwrap_or(0).max(0);
</file context>
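For illustration of the reviewer's point, here is a minimal sketch of a merge that lets an equal-output duplicate refresh the other counters by using >= instead of >. The TokenCounts struct and its field names are hypothetical stand-ins for illustration, not the crate's actual types:

```rust
// Hypothetical token struct; field names are illustrative stand-ins,
// not the actual tokscale-core definition.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TokenCounts {
    input: i64,
    cache_read: i64,
    output: i64,
}

// Merge a duplicate entry into the stored one. Using >= (instead of >)
// means a duplicate with the same output can still refresh the
// input/cache fields, which is the behavior the review comment asks about.
fn merge_duplicate(existing: &mut TokenCounts, new: TokenCounts) {
    if new.output >= existing.output {
        *existing = new;
    }
}

fn main() {
    let mut stored = TokenCounts { input: 10, cache_read: 0, output: 300 };
    // Same output, but a later write carries an updated cache count.
    merge_duplicate(&mut stored, TokenCounts { input: 10, cache_read: 42, output: 300 });
    assert_eq!(stored.cache_read, 42);
    println!("cache_read={}", stored.cache_read);
}
```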
False positive. CC's streaming writes for the same
Problem

Claude Code's streaming API writes the same messageId:requestId pair to the JSONL log multiple times while a response is being generated:
- the first entry has a partial output_tokens count (e.g. 31)
- the last entry has the final, complete output_tokens count (e.g. 300)

The previous HashSet-based dedup kept the first-seen entry, causing systematic token undercounts for every streaming response.

Solution

Replace processed_hashes: HashSet<String> with a HashMap<String, usize> that maps each dedup key to its index in the messages vec. When a duplicate key is encountered, compare output_tokens and update the existing entry in-place if the new value is larger.

Changes

- sessions/claudecode.rs: HashSet → HashMap<String, usize> for processed_hashes; dedup logic updated to keep the max output_tokens
- Add test_deduplication_keeps_max_output_for_streaming_duplicates: three streaming writes for the same messageId:requestId collapse to one entry with the final token count

Test
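The dedup behavior the new test verifies can be sketched standalone. The Message and Tokens types below are simplified stand-ins for the crate's real structures (assumptions for illustration, not the actual tokscale-core API); three streaming writes for one messageId:requestId collapse to a single entry keeping the final output_tokens:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real message/token types in tokscale-core;
// names here are illustrative, not the crate's API.
#[derive(Debug, Clone, PartialEq)]
struct Tokens {
    input: i64,
    output: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    key: String, // messageId:requestId dedup key
    tokens: Tokens,
}

/// Deduplicate streaming entries, keeping the largest output_tokens
/// seen for each messageId:requestId key.
fn dedup_keep_max(entries: Vec<Message>) -> Vec<Message> {
    let mut processed_hashes: HashMap<String, usize> = HashMap::new();
    let mut messages: Vec<Message> = Vec::new();

    for entry in entries {
        if let Some(&existing_idx) = processed_hashes.get(&entry.key) {
            // Duplicate: update in place only if the new count is larger,
            // so the final streaming write wins over partial ones.
            if entry.tokens.output > messages[existing_idx].tokens.output {
                messages[existing_idx].tokens = entry.tokens;
            }
        } else {
            processed_hashes.insert(entry.key.clone(), messages.len());
            messages.push(entry);
        }
    }
    messages
}

fn main() {
    let key = "msg_01:req_01".to_string();
    // Three streaming writes: partial, partial, final.
    let entries = vec![
        Message { key: key.clone(), tokens: Tokens { input: 100, output: 31 } },
        Message { key: key.clone(), tokens: Tokens { input: 100, output: 150 } },
        Message { key, tokens: Tokens { input: 100, output: 300 } },
    ];
    let deduped = dedup_keep_max(entries);
    assert_eq!(deduped.len(), 1);
    assert_eq!(deduped[0].tokens.output, 300); // final count, not the partial 31
    println!("{} entry, output={}", deduped.len(), deduped[0].tokens.output);
}
```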
🤖 Generated with Claude Code