
fix(sessions/claudecode): keep final streaming token count for duplicate entries#358

Open
crhan wants to merge 1 commit into junhoyeo:main from crhan:fix/streaming-dedup-keep-final-tokens

Conversation

crhan commented Mar 26, 2026

Problem

Claude Code's streaming API writes the same messageId:requestId pair to the JSONL log multiple times while a response is being generated:

  • Early writes contain partial output_tokens (e.g. 31)
  • The final write contains the complete output_tokens count (e.g. 300)

The previous HashSet-based dedup kept the first-seen entry, causing systematic token undercounts for every streaming response.

Solution

Replace processed_hashes: HashSet<String> with a HashMap<String, usize> that maps each dedup key to its index in the messages vec. When a duplicate key is encountered, compare output_tokens and update the existing entry in place if the new value is larger.
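As a standalone illustration of this dedup strategy (the struct and field names below are simplified assumptions for the sketch, not the actual tokscale-core types):

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real message/token types.
#[derive(Debug)]
struct Tokens {
    input: u64,
    output: u64,
}

#[derive(Debug)]
struct Message {
    key: String, // "messageId:requestId"
    tokens: Tokens,
}

// Map each dedup key to its index in `messages`; on a duplicate,
// keep the entry with the larger output_tokens (the final
// streaming write) by updating in place.
fn dedup_keep_max(entries: Vec<Message>) -> Vec<Message> {
    let mut processed: HashMap<String, usize> = HashMap::new();
    let mut messages: Vec<Message> = Vec::new();
    for entry in entries {
        if let Some(&idx) = processed.get(&entry.key) {
            if entry.tokens.output > messages[idx].tokens.output {
                messages[idx].tokens = entry.tokens;
            }
        } else {
            processed.insert(entry.key.clone(), messages.len());
            messages.push(entry);
        }
    }
    messages
}

fn main() {
    // Three streaming writes for the same messageId:requestId:
    // two partial counts followed by the final count.
    let entries = vec![
        Message { key: "msg1:req1".into(), tokens: Tokens { input: 10, output: 31 } },
        Message { key: "msg1:req1".into(), tokens: Tokens { input: 10, output: 120 } },
        Message { key: "msg1:req1".into(), tokens: Tokens { input: 10, output: 300 } },
    ];
    let deduped = dedup_keep_max(entries);
    assert_eq!(deduped.len(), 1);
    assert_eq!(deduped[0].tokens.output, 300);
    println!("kept output_tokens = {}", deduped[0].tokens.output);
}
```

Storing the index rather than the whole message keeps the update O(1) without reordering the messages vec.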

Changes

  • sessions/claudecode.rs: HashSet → HashMap<String, usize> for processed_hashes; dedup logic updated to keep the max output_tokens
  • Added test_deduplication_keeps_max_output_for_streaming_duplicates: three streaming writes for the same messageId:requestId collapse to one entry with the final token count

Test

running 3 tests
test sessions::claudecode::tests::test_deduplication_allows_same_message_different_request ... ok
test sessions::claudecode::tests::test_deduplication_keeps_max_output_for_streaming_duplicates ... ok
test sessions::claudecode::tests::test_deduplication_skips_duplicate_entries ... ok

🤖 Generated with Claude Code

…ate entries

Claude Code's streaming API writes the same messageId:requestId multiple
times while streaming — the first entry has a partial output_tokens count
and the last entry has the final, complete count. The previous HashSet-based
dedup kept the first-seen entry (partial), causing systematic token undercounts.

Switch processed_hashes from HashSet<String> to HashMap<String, usize> mapping
each dedup key to its index in the messages vec. When a duplicate is found,
update the existing entry in-place if the new output_tokens is larger.

Add test_deduplication_keeps_max_output_for_streaming_duplicates to verify
that three streaming writes for the same messageId:requestId collapse to one
entry retaining the max output_tokens (300, not the partial 31).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vercel bot commented Mar 26, 2026

@crhan is attempting to deploy a commit to the Inevitable Team on Vercel.

A member of the Team first needs to authorize it.

cubic-dev-ai bot left a comment


1 issue found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/tokscale-core/src/sessions/claudecode.rs">

<violation number="1" location="crates/tokscale-core/src/sessions/claudecode.rs:112">
P2: Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

-                        if !processed_hashes.insert(hash.clone()) {
+                        if let Some(&existing_idx) = processed_hashes.get(&hash) {
+                            let new_output = usage.output_tokens.unwrap_or(0).max(0);
+                            if new_output > messages[existing_idx].tokens.output {
cubic-dev-ai bot commented Mar 26, 2026


P2: Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/tokscale-core/src/sessions/claudecode.rs, line 112:

<comment>Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.</comment>

<file context>
@@ -91,23 +96,36 @@ pub fn parse_claude_file(path: &Path) -> Vec<UnifiedMessage> {
-                        if !processed_hashes.insert(hash.clone()) {
+                        if let Some(&existing_idx) = processed_hashes.get(&hash) {
+                            let new_output = usage.output_tokens.unwrap_or(0).max(0);
+                            if new_output > messages[existing_idx].tokens.output {
+                                let t = &mut messages[existing_idx].tokens;
+                                t.input = usage.input_tokens.unwrap_or(0).max(0);
</file context>


crhan commented Mar 26, 2026

P2: Duplicate merge only updates tokens when output increases, so equal-output duplicates cannot refresh input/cache token fields.

False positive. Claude Code's streaming writes for the same messageId:requestId always carry strictly increasing output_tokens — the API never emits a later streaming chunk with the same or a lower output count. Two writes with identical output_tokens are therefore de facto identical entries, so not refreshing the other fields is correct behavior.
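Under that monotonicity assumption, the `new_output > existing` guard is all the merge needs; a minimal self-contained sketch of the guard's behavior (the function name here is illustrative, not from the patch):

```rust
// Decide whether a duplicate streaming write should replace the
// stored entry. With strictly increasing output_tokens, the final
// write always wins, and an equal-output duplicate is a no-op
// (treated as an identical entry).
fn should_update(existing_output: u64, new_output: u64) -> bool {
    new_output > existing_output
}

fn main() {
    assert!(should_update(31, 300));   // final streaming write wins
    assert!(!should_update(300, 300)); // equal-output duplicate: no refresh
    assert!(!should_update(300, 31));  // stale partial never overwrites
    println!("guard behaves as described");
}
```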
