fix(core): sort scan paths for deterministic dedup and stable refreshes#357

Open
crhan wants to merge 1 commit into junhoyeo:main from crhan:fix/deterministic-scan-order


@crhan crhan commented Mar 26, 2026

Problem

scan_directory uses par_bridge() to parallelize WalkDir traversal, which is explicitly non-deterministic — items may arrive in any order on each call.

The global seen_keys dedup in parse_all_messages_with_pricing keeps the first occurrence of any messageId:requestId key. When the same message appears in multiple session files (e.g. original session + acompact snapshot), which copy is kept depends on file order. Since that order changes between runs, the deduplicated message set changes too, causing visible data fluctuations every time the user presses r to refresh.
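To make the order dependence concrete, here is a minimal std-only sketch of a first-occurrence dedup. The `Message` struct, its fields, and `dedup_first_wins` are hypothetical stand-ins for illustration, not the actual code in `parse_all_messages_with_pricing`; only the `messageId:requestId` key shape and the `seen_keys` name are taken from the description above.

```rust
use std::collections::HashSet;

/// Hypothetical record: one message as read from one session file.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    source_file: String,
    message_id: String,
    request_id: String,
    tokens: u64,
}

/// Keep the first message seen for each `messageId:requestId` key.
/// Which copy survives depends entirely on the order of `messages`.
fn dedup_first_wins(messages: &[Message]) -> Vec<Message> {
    let mut seen_keys: HashSet<String> = HashSet::new();
    let mut kept = Vec::new();
    for m in messages {
        let key = format!("{}:{}", m.message_id, m.request_id);
        // `insert` returns false when the key was already present,
        // so later duplicates are dropped.
        if seen_keys.insert(key) {
            kept.push(m.clone());
        }
    }
    kept
}
```

If two files carry the same key but differ in content, feeding them in the order (original, snapshot) keeps the original's copy, while (snapshot, original) keeps the snapshot's copy. That is the fluctuation seen on refresh.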

Fix

Sort the collected paths before returning from scan_directory. The parallel scan still runs at full speed; we sort only the result.

before: par_bridge() → non-deterministic order → different dedup winner each refresh
after:  par_bridge() → collect → sort → stable order → same dedup winner always
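The collect-then-sort step can be sketched with the standard library alone. `finalize_scan` is a hypothetical helper name, and the input `Vec` stands in for whatever the parallel walk collected; the real change lives inside `scan_directory` and may differ in detail.

```rust
use std::path::PathBuf;

/// Hypothetical tail end of a scan: the parallel walk has already
/// collected `paths` in arbitrary order; sort them before returning
/// so every caller sees the same sequence on every run.
fn finalize_scan(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    // `PathBuf` implements `Ord` (component-wise comparison), and an
    // unstable sort is sufficient because equal paths are interchangeable.
    paths.sort_unstable();
    paths
}
```

Two scans that yield the same set of paths in different orders now return identical vectors, so a downstream first-wins dedup sees the files in the same sequence each time.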

Testing

Manually verified on a corpus with ~2,000+ session files: pressing r repeatedly in the TUI daily/hourly view now produces stable numbers across refreshes.

🤖 Generated with Claude Code

scan_directory uses par_bridge() which is explicitly non-deterministic —
items may arrive in any order. The global seen_keys dedup in
parse_all_messages_with_pricing keeps the first occurrence of any
messageId:requestId key, so when the same message appears in multiple
session files (e.g. original + acompact session), a different copy is
"kept" each run depending on file order.

Sort the collected paths before returning so that:
  - dedup always picks the same winner across refreshes
  - data shown in the TUI daily/hourly views is stable when pressing 'r'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vercel bot commented Mar 26, 2026

@crhan is attempting to deploy a commit to the Inevitable Team on Vercel.

A member of the Team first needs to authorize it.


@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 1 file
