
cross-language benchmark: elastic hash 2-3x faster on both x86 and Apple M4 #4

Open
joshuaisaact wants to merge 195 commits into `main` from `autoresearch/cross-language-bench`

Conversation

@joshuaisaact

@joshuaisaact joshuaisaact commented Mar 24, 2026

Why

We verified elastic hash beats abseil by 36-97% on string key lookups on x86. But two questions remained: does the advantage hold against Rust's hashbrown and Go's swiss.Map when the comparison is fair? And is the advantage just a cache-density artifact of x86's small L2, or something deeper?

What

  • Fair cross-language benchmarks: Rust std::HashMap with ahash, Go cockroachdb/swiss.Map with pre-allocated strings. Same keys, shuffled access, median of 10.
  • Apple M4 benchmarks to test the cache-density hypothesis. M4 has ~16MB shared L2 vs ~512KB on x86. If both fingerprint arrays fit in L2, the advantage should disappear.
  • Size sweep from 16K to 4M elements on M4.

Results

The cache-density hypothesis was wrong

On M4, the advantage grew. At 50% load, shuffled hit lookup:

| Platform | L2 cache | Elastic | Abseil | Gap |
| --- | --- | --- | --- | --- |
| x86 (Linux) | ~512KB | 11,119us | 19,312us | 1.74x |
| Apple M4 | ~16MB | 9,488us | 22,261us | 2.35x |

The win comes from the number of cache lines touched per probe, not from L2-vs-L3 spill. Separate, dense fingerprint arrays mean fewer cache-line fetches under random access, regardless of the cache hierarchy.

Cross-language at 1M, 50% load, shuffled (M4)

| Implementation | Lookup (us) | vs Elastic |
| --- | --- | --- |
| Elastic Hash (Zig) | 9,488 | -- |
| Rust hashbrown + ahash | 17,182 | 1.81x slower |
| Abseil (C++) | 22,261 | 2.35x slower |
| Go swiss.Map | 28,720 | 3.03x slower |
| Go builtin map | 33,749 | 3.56x slower |

Size sweep at 50% load, shuffled (M4)

| Size | vs Abseil | vs Rust+ahash | vs Go swiss |
| --- | --- | --- | --- |
| 16K | 1.45x | 1.21x | 1.85x |
| 64K | 1.30x | 1.26x | 1.84x |
| 256K | 2.20x | 3.03x | 3.54x |
| 1M | 2.35x | 1.81x | 3.03x |
| 4M | 1.91x | 1.78x | 2.33x |

Elastic hash wins at every table size from 16K to 4M. The x86 finding that small tables favored abseil does not hold on M4.

Where abseil still wins

  • Miss lookups: 2-3x faster (early termination on empty groups)
  • 99% load factor (tier 0 saturates)

Summary by CodeRabbit

  • New Features

    • String-key support and a comprehensive cross-language benchmarking suite (C++, Go, Rust, Zig) with multiple new benchmark programs and runners.
  • Documentation

    • Major README rewrite and many new docs: results, verification plans, insights, cross-language reports, experiment programs, and benchmark result summaries.
  • Refactor

    • Redesigned multi‑tier lookup/entry layout and tightened probe limits to change probing behavior and data layout.
  • Tests

    • New verification and realistic benchmark executables exercising insert/lookup/delete/miss scenarios.
  • Chores

    • Added benchmark scripts, build/run targets, and expanded ignore patterns.

- src/string_hybrid.zig: StringElasticHash with wyhash, same tiered architecture
- src/autobench-strings.zig: Zig benchmark with 16-byte hex keys
- bench-abseil-strings.cpp: abseil benchmark with string_view keys
- bench-strings.sh: comparison runner
- build.zig: added string test and benchmark steps
… access

Shuffled hit lookup gaps at 1M (string_view keys, 16-byte hex):
  10%: 1.97x  25%: 1.86x  50%: 1.74x  75%: 1.61x  90%: 1.50x  99%: 1.36x

The advantage is larger than with u64 keys because:
- Fingerprint filtering saves expensive 16-byte memcmp comparisons
- Both sides pay more for wyhash, but our table structure saves more per probe
…swiss.Map

Elastic hash wins on string key hit lookups against all competitors at every load factor.
Fair comparison (vs abseil): 36-97% faster.
Rust/Go comparisons have fairness caveats (SipHash, allocation overhead).
@coderabbitai

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Added many cross-language benchmarks, runner scripts, and verification/research documents; introduced multiple new Zig benchmark executables and a string-key hybrid map; made substantial behavioral and public-layout changes to the Zig hybrid hash implementation (src/hybrid.zig); updated .gitignore and build wiring.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Repo config & agents**<br>`.gitignore`, `AGENTS.md`, `build.zig` | Updated ignore patterns; added agent operating rules (AGENTS.md); added new Zig build targets/executables and test wiring in build.zig. |
| **Core Zig hash implementation**<br>`src/hybrid.zig` | Major functional/public-layout changes: combined keys/values into entries, reduced MAX_PROBES (100→7), changed hashing/fingerprint derivation, added tier0 mask/shift and get_overflow_fn, and rewrote control flow for get/overflow. |
| **String-key Zig module**<br>`src/string_hybrid.zig` | New StringElasticHash implementation for string keys with SIMD fingerprinting, tiered buckets, and tests. |
| **Zig benchmark executables**<br>`src/autobench.zig`, `src/autobench-verify.zig`, `src/autobench-strings.zig`, `src/autobench-strings-verify.zig`, `src/autobench-realistic.zig`, `src/verify-strings.zig` | Added multiple public Zig entrypoints that run deterministic workloads and emit tab-delimited RESULT lines (insert/lookup/delete/miss) across sizes, loads, and access patterns. |
| **C++ benchmarks & verifiers**<br>`bench-abseil.cpp`, `bench-abseil-strings.cpp`, `bench-realistic.cpp`, `bench-verify.cpp`, `bench-strings-verify.cpp`, `verify-capacity.cpp`, `verify-hash-cost.cpp`, `verify-strings.cpp` | New C++ benchmark and verification programs targeting absl::flat_hash_map (u64 and string keys), realistic workloads, capacity/hash-cost diagnostics, and ordered vs shuffled access. |
| **Go & Rust benchmarks**<br>`bench-go/go.mod`, `bench-go/main.go`, `bench-rust/Cargo.toml`, `bench-rust/src/main.rs`, `bench-rust/...target...` | Added Go benchmark using cockroachdb/swiss; added Rust bench-hashbrown with ahash plus many generated Cargo build artifacts in bench-rust/target. |
| **Runner scripts & automation**<br>`bench.sh`, `bench-v2.sh`, `bench-strings.sh` | New Bash runners compile/execute C++ and Zig benchmarks, capture logs, parse RESULT lines, and compute per-operation gap ratios. |
| **Cross-language docs & experiments**<br>`README.md`, `program.md`, `program-v2.md`, `program-v3.md`, `program-v3-verify.md`, `program-v4-verify-strings.md`, `program-v5-cross-lang.md`, `results.md`, `insights*.md`, `cross-lang-results.md`, `verify-results.md`, `verify-strings-results.md`, `BENCHMARK-M4.md` | Large README rewrite and many new program/spec/verification/insight/result documents and experiment logs describing methodology, findings, and verification plans. |
| **String benchmark helpers**<br>`bench-abseil-strings.cpp`, `bench-strings.sh`, `bench-strings-verify.cpp`, `src/autobench-strings.zig`, `src/autobench-strings-verify.zig` | New tooling and benchmarks to compare string-key performance between Abseil and Zig implementations. |
| **Misc verification tools**<br>`verify-capacity.cpp`, `verify-hash-cost.cpp` | Added capacity and per-hash-cost diagnostic utilities for investigating fairness and allocation behavior. |
| **Generated build artifacts**<br>`bench-rust/target/...` | Many auto-generated Cargo fingerprint/metadata/build output files added (fingerprints, .d/.rlib/.rmeta, timestamps) from Rust release builds. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant Runner as "bench-v2.sh"
  participant BuilderC as "g++ / pkg-config"
  participant BuilderZ as "zig build"
  participant Abseil as "bench-abseil"
  participant Elastic as "autobench"
  participant Parser as "log-parser"

  Runner->>BuilderC: compile bench-abseil.cpp
  Runner->>BuilderZ: build autobench (ReleaseFast)
  BuilderC-->>Abseil: produce Abseil binary
  BuilderZ-->>Elastic: produce Elastic binary
  Runner->>Abseil: run -> write abseil-v2.log
  Runner->>Elastic: run -> write elastic-v2.log
  Runner->>Parser: parse RESULT lines from logs
  Parser-->>Runner: match n/load and compute per-op gaps
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I’m a rabbit with a stopwatch and a plan,
Hopping tiers and fingerprints across the land,
I compile binaries, shuffle keys with delight,
I log RESULTS, compare gaps late at night,
I twitch my nose and measure—cheerful rabbit flight 🐇

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.64%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly identifies the main achievement: elastic hash performing 2-3x faster than competitors in a cross-language benchmark context on both major platforms. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/hybrid.zig (1)

611-628: ⚠️ Potential issue | 🔴 Critical

remove() only searches tier 0, but elements can exist in tier 1+.

The remove() function uses tier0_bucket_mask and tier0_bucket_shift to search only tier 0 buckets. However, insert() can place elements into tier 1+ (via insertIntoTier, insertAnyTier). Elements inserted into higher tiers will never be found by remove(), returning false and leaving the element in the table.

Compare with get() which delegates to get_overflow_fn for tier 1+ lookups. remove() needs equivalent overflow handling.

🐛 Proposed fix sketch
 pub fn remove(self: *Self, key: u64) bool {
     const h = hash(key);
     const fp = fingerprint(h);
     const mask = self.tier0_bucket_mask;
     const bucket_base = h >> self.tier0_bucket_shift;

     var probe: usize = 0;
     while (probe < MAX_PROBES) : (probe += 1) {
         const bucket_idx = (bucket_base +% @as(u64, probe)) & mask;

         if (self.findKeyInBucket(bucket_idx, key, fp)) |slot| {
             self.fingerprints[bucket_idx][slot] = TOMBSTONE;
             self.count -= 1;
             return true;
         }
     }
-    return false;
+    // Search tier 1+ (similar to defaultGetOverflow)
+    return self.removeOverflow(h, key, fp);
 }
+
+fn removeOverflow(self: *Self, h: u64, key: u64, fp: u8) bool {
+    if (self.num_tiers <= 1) return false;
+    const num_buckets = self.tier_bucket_counts[1];
+    const tier_start = self.tier_starts[1];
+    for (0..@min(MAX_PROBES, num_buckets)) |probe| {
+        const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
+        if (self.findKeyInBucket(abs_idx, key, fp)) |slot| {
+            self.fingerprints[abs_idx][slot] = TOMBSTONE;
+            self.tier_slot_counts[1] -= 1;
+            self.count -= 1;
+            return true;
+        }
+    }
+    return false;
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hybrid.zig` around lines 611 - 628, remove() only probes tier0 (uses
tier0_bucket_mask/tier0_bucket_shift) so it never finds entries inserted into
higher tiers by insertIntoTier/insertAnyTier; modify remove (or add a helper) to
mirror get() behavior by delegating to the overflow lookup for tiers >0: for
each tier attempt the same probe loop using that tier's bucket_mask and
bucket_shift (or call get_overflow_fn-style logic) and when
findKeyInBucket(bucket_idx, key, fp) returns a slot, set
fingerprints[bucket_idx][slot] = TOMBSTONE and decrement count and return true;
if no tier finds the key after probing MAX_PROBES per tier return false. Ensure
you reference and use findKeyInBucket, MAX_PROBES, TOMBSTONE, fingerprints,
count, and the per-tier bucket_mask/bucket_shift logic rather than always using
tier0_*.
♻️ Duplicate comments (1)
src/autobench-verify.zig (1)

26-31: ⚠️ Potential issue | 🟡 Minor

Same potential underflow issue as in autobench-strings.zig.

The ns_diff cast to u64 could panic if negative. See the earlier comment on autobench-strings.zig for a fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/autobench-verify.zig` around lines 26 - 31, The usElapsed function can
underflow when ns_diff is negative; change the math to use signed arithmetic and
only cast to u64 after combining seconds and nanoseconds: treat the seconds
difference as a signed value (e.g., s_i64 = @as(i64, end.sec - start.sec)), keep
ns_diff as i64, compute total_ns in a sufficiently-wide signed type (e.g., i128)
as total_ns_signed = s_i64 * 1_000_000_000 + ns_diff, then cast total_ns_signed
to u64 once (or assert/non-negatively check) before dividing by 1_000; update
the usElapsed function in src/autobench-verify.zig accordingly to avoid casting
a negative ns_diff to u64.
🧹 Nitpick comments (20)
bench-rust/src/main.rs (1)

70-77: Remove redundant as &[u8] casts.

The casts are unnecessary since &keys[order[i]] (type &[u8; 16]) automatically coerces to &[u8] when passed to map.get().

♻️ Proposed simplification
         // Shuffled hit lookup
         let start = Instant::now();
         for i in 0..fill {
-            black_box(map.get(&keys[order[i]] as &[u8]));
+            black_box(map.get(&keys[order[i]][..]));
         }
         let lookup_us = start.elapsed().as_micros() as u64;

         // Shuffled miss lookup
         let start = Instant::now();
         for i in 0..fill {
-            black_box(map.get(&miss_keys[order[i]] as &[u8]));
+            black_box(map.get(&miss_keys[order[i]][..]));
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-rust/src/main.rs` around lines 70 - 77, The calls to map.get currently
include redundant casts like `&keys[order[i]] as &[u8]`; remove the unnecessary
`as &[u8]` so you pass `&keys[order[i]]` and `&miss_keys[order[i]]` directly to
map.get. Update the lookup loops that use `map.get(&keys[order[i]] as &[u8])`
and `map.get(&miss_keys[order[i]] as &[u8])` (in the blocks computing
`lookup_us` and the shuffled miss lookup) to call `map.get(&keys[order[i]])` and
`map.get(&miss_keys[order[i]])` respectively.
bench-realistic.cpp (2)

132-134: Iterator dereference assumes key always exists.

it->second is dereferenced without checking it != map.end(). This is safe currently because all lookup_keys are drawn from keys[] which are all inserted, but the code is fragile if the key generation logic changes.

Consider adding an assertion or defensive check for safety:

♻️ Proposed defensive check
         for (size_t i = 0; i < OPS; i++) {
             auto it = map.find(lookup_keys[i]);
-            do_not_optimize(it->second);
+            do_not_optimize(it != map.end() ? it->second : 0);
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-realistic.cpp` around lines 132 - 134, The loop dereferences it->second
after auto it = map.find(lookup_keys[i]) without verifying the iterator is
valid; update the code around map.find/it to either assert(it != map.end()) or
add a defensive check (if (it == map.end()) handle error/continue) before
calling do_not_optimize(it->second) so lookups from lookup_keys are safe if key
generation changes.

34-35: Clarify the workload comment.

The comment says "90% lookup, 5% insert, 5% delete" but doesn't mention the hit/miss split. The actual breakdown is 80% hit lookup + 10% miss lookup + 5% insert + 5% delete. Consider updating the comment to reflect this:

-// Workload 1: Mixed read/write (90% lookup, 5% insert, 5% delete)
+// Workload 1: Mixed read/write (80% hit lookup, 10% miss lookup, 5% insert, 5% delete)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-realistic.cpp` around lines 34 - 35, Update the "Workload 1" comment to
explicitly state the hit/miss split instead of just "90% lookup, 5% insert, 5%
delete" — replace it with a line such as "Workload 1: Mixed read/write (80% hit
lookup, 10% miss lookup, 5% insert, 5% delete) — simulates a cache or index
under steady-state load" so the actual 80/10/5/5 breakdown is clear; locate the
comment near the Workload 1 block in bench-realistic.cpp to make the change.
verify-hash-cost.cpp (1)

64-69: Use portable format specifiers for uint64_t.

The %lu format specifier assumes uint64_t is unsigned long, which isn't portable across platforms (e.g., Windows uses unsigned long long). Use PRIu64 from <cinttypes> for portability.

♻️ Proposed fix
 #include <cstdint>
 #include <cstdio>
+#include <cinttypes>
 #include <chrono>
-    printf("  Elastic (multiply+xor-shift): %lu us (%.1f ns/hash)\n",
+    printf("  Elastic (multiply+xor-shift): %" PRIu64 " us (%.1f ns/hash)\n",
            best_ours, best_ours * 1000.0 / N);
-    printf("  Abseil (absl::Hash<u64>):     %lu us (%.1f ns/hash)\n",
+    printf("  Abseil (absl::Hash<u64>):     %" PRIu64 " us (%.1f ns/hash)\n",
            best_abseil, best_abseil * 1000.0 / N);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@verify-hash-cost.cpp` around lines 64 - 69, The printf calls use %lu for
variables best_ours and best_abseil which is not portable for uint64_t; include
<cinttypes> (or <inttypes.h>) and switch the format specifiers to use PRIu64
(e.g. "%" PRIu64) for best_ours and best_abseil, and ensure any arithmetic
involving N is cast appropriately (or printed with a matching specifier) so the
expressions best_ours * 1000.0 / N and best_abseil * 1000.0 / N still format
correctly; update the three printf lines referencing best_ours, best_abseil and
the ratio accordingly.
src/autobench.zig (1)

84-126: Consider gating diagnostic output behind a compile-time flag.

The brute-force scan and detailed tracing are valuable for debugging but add noise to benchmark output. Consider using a comptime flag to enable/disable this block.

♻️ Optional: Add compile-time toggle
+const ENABLE_DIAGNOSTICS = false;
+
 // One-time tier/probe distribution print (BEFORE delete)
-if (r == 0 and load_pct == 99 and n == 1_048_576) {
+if (ENABLE_DIAGNOSTICS and r == 0 and load_pct == 99 and n == 1_048_576) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/autobench.zig` around lines 84 - 126, Introduce a comptime toggle (e.g.,
comptime const SHOW_DISTRIBUTION = false) and wrap the entire diagnostic block
(the if (r == 0 and load_pct == 99 and n == 1_048_576) { ... } including the
brute-force scan that references map, map.get, map.getWithProbes, map.entries,
map.fingerprints, KEY_SEED, and splitmix64) in if (comptime SHOW_DISTRIBUTION) {
... } so the prints and scan are compiled out by default and can be enabled at
compile time for debugging.
src/hybrid.zig (1)

403-406: Hash function differs from ComptimeHybridElasticHash version.

The runtime hash() uses a simpler single-multiply-XOR scheme, while ComptimeHybridElasticHash.hash() (lines 82-89) uses a more thorough wyhash-style double multiplication. This may be intentional for performance, but could affect distribution quality differently between the two implementations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hybrid.zig` around lines 403 - 406, The runtime hash function inline fn
hash(key: u64) currently uses a single-multiply-XOR scheme that differs from the
ComptimeHybridElasticHash.hash() implementation; make them consistent by
replacing the runtime hash body with the same wyhash-style
double-multiply/xor/shift sequence used in ComptimeHybridElasticHash.hash()
(same constants and operations) so both runtime and comptime hashes produce the
same distribution, or if the simpler runtime variant is intentional, add a
TODO/comment in inline fn hash referencing ComptimeHybridElasticHash.hash and
include a brief note to keep behavior divergence deliberate and benchmarked;
locate the function by name (inline fn hash) and the comptime implementation
(ComptimeHybridElasticHash.hash) to copy or document the logic.
program.md (1)

74-77: Add language specifier to fenced code block.

The code block showing the TSV format lacks a language specifier, which can cause rendering issues in some markdown viewers.

Proposed fix
 9. **Log.** Append a line to `results.tsv`:
-   ```
+   ```tsv
    <commit_hash>\t<lookup_ratio_1M>\t<insert_ratio_1M>\t<delete_ratio_1M>\t<keep|revert|crash>\t<description>
    ```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@program.md` around lines 74 - 77, Update the fenced code block in the "Log.
Append a line to `results.tsv`" section so it includes a language specifier
(e.g., ```tsv) immediately after the opening backticks; locate the block showing
"<commit_hash>\t<lookup_ratio_1M>\t<insert_ratio_1M>\t<delete_ratio_1M>\t<keep|revert|crash>\t<description>"
and change the opening fence to include the specifier to ensure proper markdown
rendering.
bench.sh (1)

14-14: grep -P is not portable to macOS/BSD.

The -P (Perl regex) flag is a GNU grep extension. On macOS (which uses BSD grep by default), this will fail with "grep: invalid option -- P". If cross-platform support is needed, consider using sed or awk instead.

Portable alternative using awk
-grep "RESULT" bench.log | grep "n=1048576" | grep -oP 'lookup_ratio=\K[0-9.]+'
+grep "RESULT" bench.log | grep "n=1048576" | awk -F'lookup_ratio=' '{print $2}' | awk '{print $1}'

Or using sed:

grep "RESULT" bench.log | grep "n=1048576" | sed -n 's/.*lookup_ratio=\([0-9.]*\).*/\1/p'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench.sh` at line 14, The current pipeline using grep -oP
'lookup_ratio=\K[0-9.]+' is not portable because grep -P is a GNU-only (Perl)
extension; replace the trailing grep -oP step with a portable sed or awk
expression to extract the numeric value from lines matched by grep "RESULT" and
grep "n=1048576" (i.e., update the pipeline that currently reads grep "RESULT"
bench.log | grep "n=1048576" | grep -oP 'lookup_ratio=\K[0-9.]+'). Use sed -n
with a capture group or awk to print the captured lookup_ratio value so the
script works on macOS/BSD as well.
bench-go/go.mod (1)

1-5: Remove stale duplicate bench-rust/bench-go/go.mod.

The directory contains an outdated duplicate go.mod file with Go version 1.25 (compared to 1.25.4 in the main bench-go directory) and no dependencies. No Go source files are present in this directory, confirming it's a leftover artifact that should be deleted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-go/go.mod` around lines 1 - 5, Delete the stale duplicate go.mod file
that contains "module bench-go" and "go 1.25" (this leftover has no Go source
files and is inconsistent with the main bench-go go.mod). Remove the file from
the repo (e.g., git rm), commit the deletion, and ensure there are no references
or imports relying on that duplicate; if CI or tooling references it, update
those references to the main bench-go/go.mod.
bench-strings-verify.cpp (1)

61-66: Unchecked iterator dereference assumes all lookups are hits.

Lines 61 and 66 dereference it->second without checking if it != map.end(). This is correct for hit benchmarks where all keys exist, but could segfault if any key is missing due to a bug.

This is acceptable for controlled benchmark code, but adding a debug assertion would catch potential issues:

auto it = map.find(keys[i]);
assert(it != map.end());
do_not_optimize(it->second);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-strings-verify.cpp` around lines 61 - 66, The benchmark currently
dereferences iterators returned by map.find(...) (see use of map.find, keys,
shuffle_order and do_not_optimize) without checking for map.end(), which can
segfault if a key is missing; add a debug-only assertion before the dereference
in both loops (after auto it = map.find(...)) to assert(it != map.end()) so hits
are still assumed in release but missing-key bugs are caught during development.
insights.md (1)

10-27: Add language specifier to pseudocode block.

📝 Proposed fix
-```
+```text
 get(key):
     h = key * 0x517cc1b727220a95; h ^= h >> 32
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@insights.md` around lines 10 - 27, The fenced pseudocode block that shows
get(key) and findKeyInBucket lacks a language specifier, so update the opening
fence to include one (e.g., ```text or ```pseudo) to ensure consistent
rendering/highlighting; leave the content of get(key) and findKeyInBucket
unchanged and keep the closing ``` intact.
program-v3.md (2)

92-99: Add language specifier to fenced code blocks.

The code blocks showing memory layouts lack language specifiers. While these are pseudocode/diagrams, adding a specifier (e.g., text or plaintext) improves markdown rendering consistency.

📝 Proposed fix
-```
+```text
 fingerprints: [tier0 bucket 0][tier0 bucket 1]...[tier1 bucket 0][tier1 bucket 1]...

-```
+```text
 fingerprints: [tier0 bucket 0][tier1 bucket 0][tier0 bucket 1][tier1 bucket 1]...

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@program-v3.md` around lines 92 - 99, The fenced diagram blocks showing the
memory layout (the lines starting with "fingerprints: [tier0 bucket 0]...")
should include a Markdown language specifier for consistent rendering; update
both fenced code blocks that display the fingerprints layout to use a plain text
specifier (e.g., change ``` to ```text) so the blocks that contain
"fingerprints: [tier0 bucket 0][tier0 bucket 1]...[tier1 bucket 0][tier1 bucket
1]..." and the interleaved version "fingerprints: [tier0 bucket 0][tier1 bucket
0][tier0 bucket 1][tier1 bucket 1]..." are fenced as ```text.

162-164: Add language specifier for TSV header block.

📝 Proposed fix
-```
+```tsv
 commit	find_rate	lookup_gap_99	miss_gap_99	insert_gap_99	status	description
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @program-v3.md around lines 162 - 164, The fenced code block containing the
TSV header line
"commit\tfind_rate\tlookup_gap_99\tmiss_gap_99\tinsert_gap_99\tstatus\tdescription"
should include a language specifier for TSV; update the opening fence from "```" to "```tsv" so the header block is marked as TSV (locate the block that contains
the exact header string to apply the change).


src/string_hybrid.zig (1)

39-47: StringEntry stores raw pointer without lifetime guarantees.

`StringEntry` stores `key_ptr: [*]const u8` and `key_len: usize` which point to memory owned by the caller. If the caller frees or modifies the original key buffer, lookups will access invalid memory or return incorrect results.

This is acceptable for benchmark code where key buffers outlive the map, but should be documented for general use.


📝 Suggested documentation

```diff
 const StringEntry = struct {
+    /// IMPORTANT: The map does not own or copy key data.
+    /// Callers must ensure key memory remains valid for the map's lifetime.
     key_ptr: [*]const u8,
     key_len: usize,
     value: u64,
```
🤖 Prompt for AI Agents
```
Verify each finding against the current code and only fix it if needed.

In `@src/string_hybrid.zig` around lines 39 - 47, StringEntry currently stores a
raw pointer (key_ptr) and length (key_len) that reference caller-owned memory,
which can become invalid; update StringEntry to either (A) store an owned slice
or copy of the key (e.g., allocate and store []const u8 or a Zig
ArrayList/Allocator-managed buffer) or (B) explicitly document the lifetime
requirement and provide a safe constructor that takes ownership or clones the
key; adjust the StringEntry.key(self) method accordingly to return the owned
slice, and add a clear comment on the struct (or add a create/copy function)
describing that callers must ensure the backing buffer outlives the entry if you
keep the current raw-pointer approach.
```

src/verify-strings.zig (1)

62-62: Same readability suggestion: expand the single-line shuffle.

This matches the pattern in `autobench-strings-verify.zig`. Consider extracting a shared shuffle helper if this pattern is used across multiple files.

🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.

In `@src/verify-strings.zig` at line 62, The one-line Fisher–Yates shuffle inside
verify-strings.zig is hard to read; expand the single-line loop into a
multi-line, well-indented while loop using the same variables (idx, rng,
splitmix64, shuffle_order, tmp) so each operation is on its own line, and then
optionally move that logic into a shared helper function (e.g., shuffleArray or
fyShuffle) that accepts the slice and rng pointer so
autobench-strings-verify.zig and verify-strings.zig can both call it.
```

bench-strings.sh (2)

5-8: Quote the pkg-config output to prevent word splitting issues.

The shell will split the output on whitespace and glob characters. While typically safe for pkg-config, quoting is more robust.

🔧 Proposed fix

```diff
-g++ -O3 -march=native -DNDEBUG -DABSL_HASHTABLEZ_SAMPLE_PARAMETER=0 \
-    bench-abseil-strings.cpp -o bench-abseil-strings \
-    $(pkg-config --cflags --libs absl_hash absl_raw_hash_set absl_hashtablez_sampler)
+g++ -O3 -march=native -DNDEBUG -DABSL_HASHTABLEZ_SAMPLE_PARAMETER=0 \
+    bench-abseil-strings.cpp -o bench-abseil-strings \
+    "$(pkg-config --cflags --libs absl_hash absl_raw_hash_set absl_hashtablez_sampler)"
```
🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.

In `@bench-strings.sh` around lines 5 - 8, The pkg-config invocation in the g++
command is unquoted which can cause word-splitting/globbing issues; update the
g++ invocation that builds bench-abseil-strings to wrap the pkg-config
substitution in double quotes (i.e., quote the $(pkg-config --cflags --libs
absl_hash absl_raw_hash_set absl_hashtablez_sampler) part) so the flags are
treated as a single expansion and not split by the shell.
```

</details>

---

`40-48`: **Integer comparison may fail on non-numeric values.**

If the awk extraction produces a non-numeric string (e.g., from a malformed log line), the `-gt` comparison will error out. This is unlikely given the controlled benchmark output, but a defensive check would aid debugging.

<details>
<summary>🛡️ Optional: Add numeric validation</summary>

```diff
+    # Validate numeric values
+    is_numeric() { [[ "$1" =~ ^[0-9]+$ ]]; }
+
-    if [ -n "$e_ins" ] && [ -n "$e_get" ] && [ -n "$e_del" ] && [ -n "$e_miss" ] && \
-       [ "$e_ins" -gt 0 ] && [ "$e_get" -gt 0 ] && [ "$e_del" -gt 0 ] && [ "$e_miss" -gt 0 ]; then
+    if is_numeric "$e_ins" && is_numeric "$e_get" && is_numeric "$e_del" && is_numeric "$e_miss" && \
+       [ "$e_ins" -gt 0 ] && [ "$e_get" -gt 0 ] && [ "$e_del" -gt 0 ] && [ "$e_miss" -gt 0 ]; then
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@bench-strings.sh` around lines 40 - 48, The current conditional that checks
e_ins/e_get/e_del/e_miss using -gt can fail if any extracted value is
non-numeric; before performing the integer comparisons in the if block (the line
starting with "if [ -n "$e_ins" ] && ... && [ "$e_miss" -gt 0 ]; then"), add a
simple numeric validation for the four extracted variables (e_ins, e_get, e_del,
e_miss) — e.g., test each against a numeric regex (allow integers or decimals)
or coerce/validate with awk — and only proceed to compute
ins_gap/get_gap/del_gap/mis_gap and printf when all four pass the numeric check
and are >0. Ensure you reference and validate the same variables used later
(a_ins/a_get/a_del/a_miss and e_*) so the awk division and printf are only
executed with safe numeric inputs.
```

</details>

</blockquote></details>
<details>
<summary>README.md (1)</summary><blockquote>

`106-110`: **Specify language for fenced code block.**

Adding a language identifier enables syntax highlighting and satisfies linting rules.

<details>
<summary>🔧 Add language specifier</summary>

````diff
-```
+```bash
 zig build test       # run tests
 zig build bench      # full benchmark
 bash bench-v2.sh     # comparison vs abseil (requires abseil-cpp)
 ```
````
🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 106 - 110, Update the fenced code block that contains
the three commands starting with "zig build test" to include a language
identifier by changing the opening ``` to ```bash so the block reads as a bash
snippet; locate the triple-backtick fenced block which encloses "zig build
test", "zig build bench", and "bash bench-v2.sh" and add the "bash" specifier
for proper syntax highlighting and lint compliance.
```
src/autobench-strings-verify.zig (1)

56-56: Consider expanding the shuffle loop for readability.

The single-line Fisher-Yates implementation is correct but dense. Breaking it across multiple lines would improve maintainability.

♻️ Expanded shuffle loop

```diff
-    while (idx > 1) { idx -= 1; const j = splitmix64(&rng) % (idx + 1); const tmp = shuffle_order[idx]; shuffle_order[idx] = shuffle_order[j]; shuffle_order[j] = tmp; }
+    while (idx > 1) {
+        idx -= 1;
+        const j = splitmix64(&rng) % (idx + 1);
+        const tmp = shuffle_order[idx];
+        shuffle_order[idx] = shuffle_order[j];
+        shuffle_order[j] = tmp;
+    }
```
🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.

In `@src/autobench-strings-verify.zig` at line 56, The one-line Fisher‑Yates
shuffle in the loop is correct but hard to read; refactor the while loop that
uses idx, rng, splitmix64, and shuffle_order into multiple lines: decrement idx
on its own line, compute j = splitmix64(&rng) % (idx + 1) on the next line, then
swap using a temporary variable (tmp) across two lines so each step (decrement,
index calc, swap) is explicit; keep the existing variable names (idx, rng, j,
tmp, shuffle_order) and preserve the same logic and boundaries so behavior
remains unchanged.
```
bench-verify.cpp (1)

80-83: Consider using PRIu64 for portable uint64_t formatting.

%lu assumes uint64_t is unsigned long, which isn't guaranteed on all platforms (e.g., Windows MSVC uses unsigned long long).

🔧 Portable format specifier

```diff
+#include <cinttypes>
 ...
-    printf("ABSEIL\t%s\tn=%zu\tload=%zu\tordered_us=%lu\tshuffled_us=%lu\n",
+    printf("ABSEIL\t%s\tn=%zu\tload=%zu\tordered_us=%" PRIu64 "\tshuffled_us=%" PRIu64 "\n",
            label, n, fill * 100 / n,
            median(lkp_ordered, MEASURED), median(lkp_shuffled, MEASURED));
```
🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed.

In `@bench-verify.cpp` around lines 80 - 83, The printf in bench-verify.cpp that
prints median(lkp_ordered, MEASURED) and median(lkp_shuffled, MEASURED) uses %lu
which is not portable for uint64_t; update the format to use the PRIu64 macro
(e.g., "%" PRIu64) for those two fields and ensure you `#include` <inttypes.h> at
the top of the file so PRIu64 is defined; keep the existing %zu for size_t (n,
load) and adjust nothing else.
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bench-abseil-strings.cpp`:
- Around line 75-81: The benchmark currently performs hit lookups in insertion
order by iterating keys[i] in the loop (using variables keys, map.find, fill,
do_not_optimize), so change it to perform shuffled lookups: create a
vector<size_t> indices of length fill, populate with 0..fill-1, shuffle it with
std::shuffle and a std::mt19937 seeded appropriately, then iterate using
keys[indices[i]] (and call map.find on that) to measure lookup_us; ensure the
shuffle happens once before the timing window so only lookup cost is measured.

In `@bench-abseil.cpp`:
- Around line 62-68: The lookup loop in bench-abseil.cpp uses insertion-order
keys (keys[i]) for hit lookups; change it to perform shuffled random access like
the other benchmarks by indexing keys with the existing shuffle_order
permutation (or by shuffling keys prior). Specifically, in the block that
measures lookup_us, replace the direct keys[i] access with
keys[shuffle_order[i]] (or apply std::shuffle to keys before the loop) so the
map.find calls use the shuffled order; keep other variables (map, fill,
do_not_optimize, lookup_us) unchanged.

In `@bench-go/go.mod`:
- Line 5: The go.mod entry for github.com/cockroachdb/swiss is marked as
indirect but main.go imports it directly; remove the "// indirect" marker or add
the dependency as a direct requirement by running `go get
github.com/cockroachdb/swiss@v0.0.0-20251224182025-b0f6560f979b` (or update the
version) so the require line for github.com/cockroachdb/swiss becomes a direct
dependency; check main.go to confirm the import and then run `go mod tidy` to
ensure module file consistency.

In `@bench-go/main.go`:
- Around line 113-114: The RESULT line in bench-go/main.go currently prints an
extra "impl=go-swiss" field and places miss_us before delete_us, which breaks
bench-strings.sh parsing; update the fmt.Printf call (the RESULT formatter in
main.go) to remove the "impl=..." token and reorder the placeholders so
delete_us is printed before miss_us (i.e., format: RESULT\t
n=...\tload=...\tinsert_us=...\tlookup_us=...\tdelete_us=...\tmiss_us=...),
ensuring the field names and order exactly match what bench-strings.sh expects;
alternatively, if you prefer supporting both formats, add parsing logic to
bench-strings.sh to accept the "impl=" field and both field orders, but prefer
standardizing main.go's PRINTF first.

In `@bench-rust/target/.rustc_info.json`:
- Line 1: The bench build artifacts (the target directory and files like
.rustc_info.json) were committed; update the repo by adding the
bench-rust/target/ entry to .gitignore and stop tracking those files, then
remove the committed artifacts from git (unstage/remove from index) and commit
the change so the target files are no longer tracked; ensure .rustc_info.json
and any other files under the target directory are removed from the repository
history or at least untracked going forward.

In `@bench-rust/target/CACHEDIR.TAG`:
- Around line 1-3: Remove the committed Cargo build output by deleting the
CACHEDIR.TAG and other files under the bench-rust/target directory from version
control and ignore future artifacts: add an entry to .gitignore to exclude the
Rust target directory (e.g., a pattern matching target/ or bench-rust/target/),
then remove the already-committed files from the index (untrack them) and commit
the changes so the build outputs are no longer in the repository.

In
`@bench-rust/target/release/.fingerprint/bench-hashbrown-f2a29ac529dfc934/bin-bench-hashbrown.json`:
- Line 1: Add bench-rust/target/ to .gitignore and remove the already committed
build artifacts: update the repository .gitignore (at repo root or bench-rust/)
to include a line "target/" and then remove the committed files from git
tracking (e.g., git rm --cached) for entries under bench-rust/target/, including
bench-rust/target/release/.fingerprint/.../bin-bench-hashbrown.json so they
remain locally but are no longer tracked or pushed.

In `@bench-verify.cpp`:
- Around line 42-46: The shuffle in bench-verify.cpp currently instantiates
std::mt19937_64 rng(42) and uses it in the Fisher–Yates loop that swaps
lookup_order; replace the RNG with the same splitmix64 implementation and
seeding (seed 42) used by src/autobench-verify.zig so the generated pseudorandom
values match across implementations; keep the existing loop and swap logic (the
for loop touching lookup_order and swapping by index j) but replace the rng
type/constructor and RNG call site to use splitmix64.next() (or the C++
splitmix64 equivalent) so shuffled_us comparisons are valid.

In `@insights-v3.md`:
- Around line 28-30: Update the paragraph that claims Abseil is "thread-safe
with some overhead" to instead state that these benchmarks are single-threaded
and that absl::flat_hash_map (and Abseil containers) follow STL-style semantics
and require external synchronization for concurrent mutation; replace the
phrasing with a clear caveat that concurrent use of absl::flat_hash_map needs
external locking (e.g., mutexes) and is not intrinsically thread-safe while
keeping the rest of the note about single-threaded benchmarking and absl::Hash
vs integer-specialized multiply-hash intact.

In `@program-v2.md`:
- Around line 95-100: Update the bench-abseil.cpp setup to avoid the misleading
reserve(fill) instruction: in the file/benchmark setup where it currently says
to call reserve(fill) before inserts, change it to the corrected reservation
scheme reserve(n) (or clearly mark the reserve(fill) line as historical/legacy)
so the benchmark uses the final, unbiased approach described in insights-v2.md;
locate the text referring to reserve(fill) in the bench-abseil.cpp instructions
and replace it with the reserve(n) wording (or add an explicit “historical”
comment) so readers replaying v2 will run the corrected phase-1.

In `@program-v5-cross-lang.md`:
- Around line 25-28: The benchmark currently labels the target "Rust hashbrown"
but uses std::collections::HashMap (which uses RandomState/SipHash), conflating
hasher cost with table implementation; update the plan to either (A) explicitly
benchmark hashbrown::HashMap with a consistent fast hasher (e.g., use
hashbrown::HashMap with ahash/FxHasher configured) so the comparison isolates
table performance, or (B) rename the target to "Rust std::HashMap" and note that
results include RandomState/SipHash overhead; refer to HashMap,
hashbrown::HashMap, RandomState, and SipHash when making the change so the
intent and measurement are unambiguous.

In `@src/autobench-realistic.zig`:
- Around line 25-29: The usElapsed function computes microseconds but due to
operator precedence only ns_diff is divided by 1_000, making the seconds term
remain in nanoseconds; fix by computing total_ns = s * 1_000_000_000 + `@as`(i64,
`@intCast`(ns_diff)) (or properly cast ns_diff to i64 and add), then return
`@as`(u64, total_ns / 1_000); update the usElapsed implementation to build
total_ns first (using symbols s, ns_diff, and usElapsed) and perform a single
division to produce microseconds.

In `@src/autobench-strings.zig`:
- Around line 40-44: The usElapsed function is unsafe because it casts a
potentially negative ns_diff into u64 which will panic; update usElapsed to
perform the nanosecond arithmetic in a signed wide integer (e.g., i128 or i64)
by combining (end.sec - start.sec) * 1_000_000_000 with ns_diff as a signed
value, then divide by 1_000 and finally cast the non-negative result to u64;
target the function name usElapsed and the local ns_diff/end.sec/start.sec
symbols when making the change and ensure you guard/cast only after the combined
total is computed and verified non-negative.

In `@src/hybrid.zig`:
- Around line 595-609: defaultGetOverflow currently only scans tier 1, causing
misses for elements inserted into tier 2+ by insertAnyTier; change it to mirror
getWithProbes by iterating all tiers from 1 to self.num_tiers-1 and for each
tier use that tier's tier_bucket_counts[i] and tier_starts[i], probe up to
`@min`(MAX_PROBES, num_buckets) using bucketIndex, call findValueInBucket(abs_idx,
key, fp) and return the value if found, and perform the same per-tier early
termination using matchEmpty(&self.fingerprints[abs_idx]) to return null for
that tier; only return null after all tiers/probes are exhausted. Ensure you
reference defaultGetOverflow, getWithProbes, insertAnyTier, num_tiers,
tier_bucket_counts, tier_starts, MAX_PROBES, findValueInBucket, matchEmpty,
fingerprints, and bucketIndex when making the change.

In `@src/string_hybrid.zig`:
- Around line 302-312: defaultGetOverflow currently only inspects tier 1 which
misses items inserted by insertAnyTier into higher tiers; update
defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp: u8) to
iterate tiers 1..self.num_tiers, and for each tier set num_buckets =
self.tier_bucket_counts[tier] and tier_start = self.tier_starts[tier], then
perform the existing probe loop (using bucketIndex, findValueInBucket, and
matchEmpty) for that tier and return the found value or continue to the next
tier; keep the same MAX_PROBES logic per tier and return null if not found in
any overflow tier.
- Around line 314-330: remove() only probes tier 0 and doesn't update
tier_slot_counts, so items placed in higher tiers by insertAnyTier() cannot be
removed and tier_slot_counts[0] stays stale; update remove(self: *Self, key:
[]const u8) to probe across all tiers for the bucket (use the same bucket_idx
calculation but for each probe check each tier’s slot via findKeyInBucket or a
new helper), and when you tombstone an entry set fingerprints[bucket_idx][slot]
= TOMBSTONE, decrement self.count and also decrement
self.tier_slot_counts[tier_index] (use the tier you found the key in); ensure
the code updates the correct tier_slot_counts element and keeps existing
TOMBSTONE, count, and fingerprint semantics so getEmptyFraction() and insertion
logic remain correct (or if you intentionally won't support removal from
overflow tiers, document that limitation in insertAnyTier/remove).

In `@verify-results.md`:
- Around line 60-66: The document contains a contradictory note: the line
stating "clang++ not available for cross-check" is now stale because the later
"Update: clang 22.1.1 now tested." reports clang results; remove the earlier
"clang++ not available for cross-check" sentence (or reword it into a historical
note) so that the section only reflects the current clang testing status and
keeps the abseil vs clang/GCC comparison consistent.

In `@verify-strings.cpp`:
- Around line 41-43: median_val currently returns arr[n/2], which picks the
upper middle for even n and biases medians; after sorting the uint64_t* arr in
median_val(sort(arr, arr+n)), return the true median: if n is odd return
arr[n/2], otherwise compute the average of the two middle values (arr[n/2 - 1]
and arr[n/2]) and return that average as uint64_t to handle even-sized samples
correctly.
- Around line 57-70: The microbenchmark currently uses the minimum runtime
(variable best) across 10 runs for absl::Hash<string_view>, producing an
optimistic outlier; change it to compute the median runtime after the same
warmup behavior used elsewhere: record each run's duration (from start/end
around the inner loop that uses hasher(std::string_view(keys[i & 1023], 16))),
skip the warmup runs as done in the other tests, sort the collected durations
and pick the median value to print (use that median in the printf and
nanoseconds/hash calculation instead of best). Ensure you reference and reuse N,
keys, hasher, and the 10-run loop so the change is localized.

---

Outside diff comments:
In `@src/hybrid.zig`:
- Around line 611-628: remove() only probes tier0 (uses
tier0_bucket_mask/tier0_bucket_shift) so it never finds entries inserted into
higher tiers by insertIntoTier/insertAnyTier; modify remove (or add a helper) to
mirror get() behavior by delegating to the overflow lookup for tiers >0: for
each tier attempt the same probe loop using that tier's bucket_mask and
bucket_shift (or call get_overflow_fn-style logic) and when
findKeyInBucket(bucket_idx, key, fp) returns a slot, set
fingerprints[bucket_idx][slot] = TOMBSTONE and decrement count and return true;
if no tier finds the key after probing MAX_PROBES per tier return false. Ensure
you reference and use findKeyInBucket, MAX_PROBES, TOMBSTONE, fingerprints,
count, and the per-tier bucket_mask/bucket_shift logic rather than always using
tier0_*.

---

Duplicate comments:
In `@src/autobench-verify.zig`:
- Around line 26-31: The usElapsed function can underflow when ns_diff is
negative; change the math to use signed arithmetic and only cast to u64 after
combining seconds and nanoseconds: treat the seconds difference as a signed
value (e.g., s_i64 = `@as`(i64, end.sec - start.sec)), keep ns_diff as i64,
compute total_ns in a sufficiently-wide signed type (e.g., i128) as
total_ns_signed = s_i64 * 1_000_000_000 + ns_diff, then cast total_ns_signed to
u64 once (or assert/non-negatively check) before dividing by 1_000; update the
usElapsed function in src/autobench-verify.zig accordingly to avoid casting a
negative ns_diff to u64.

---

Nitpick comments:
In `@bench-go/go.mod`:
- Around line 1-5: Delete the stale duplicate go.mod file that contains "module
bench-go" and "go 1.25" (this leftover has no Go source files and is
inconsistent with the main bench-go go.mod). Remove the file from the repo
(e.g., git rm), commit the deletion, and ensure there are no references or
imports relying on that duplicate; if CI or tooling references it, update those
references to the main bench-go/go.mod.

In `@bench-realistic.cpp`:
- Around line 132-134: The loop dereferences it->second after auto it =
map.find(lookup_keys[i]) without verifying the iterator is valid; update the
code around map.find/it to either assert(it != map.end()) or add a defensive
check (if (it == map.end()) handle error/continue) before calling
do_not_optimize(it->second) so lookups from lookup_keys are safe if key
generation changes.
- Around line 34-35: Update the "Workload 1" comment to explicitly state the
hit/miss split instead of just "90% lookup, 5% insert, 5% delete" — replace it
with a line such as "Workload 1: Mixed read/write (80% hit lookup, 10% miss
lookup, 5% insert, 5% delete) — simulates a cache or index under steady-state
load" so the actual 80/10/5/5 breakdown is clear; locate the comment near the
Workload 1 block in bench-realistic.cpp to make the change.

In `@bench-rust/src/main.rs`:
- Around line 70-77: The calls to map.get currently include redundant casts like
`&keys[order[i]] as &[u8]`; remove the unnecessary `as &[u8]` so you pass
`&keys[order[i]]` and `&miss_keys[order[i]]` directly to map.get. Update the
lookup loops that use `map.get(&keys[order[i]] as &[u8])` and
`map.get(&miss_keys[order[i]] as &[u8])` (in the blocks computing `lookup_us`
and the shuffled miss lookup) to call `map.get(&keys[order[i]])` and
`map.get(&miss_keys[order[i]])` respectively.

In `@bench-strings-verify.cpp`:
- Around line 61-66: The benchmark currently dereferences iterators returned by
map.find(...) (see use of map.find, keys, shuffle_order and do_not_optimize)
without checking for map.end(), which can segfault if a key is missing; add a
debug-only assertion before the dereference in both loops (after auto it =
map.find(...)) to assert(it != map.end()) so hits are still assumed in release
but missing-key bugs are caught during development.

In `@bench-strings.sh`:
- Around line 5-8: The pkg-config invocation in the g++ command is unquoted
which can cause word-splitting/globbing issues; update the g++ invocation that
builds bench-abseil-strings to wrap the pkg-config substitution in double quotes
(i.e., quote the $(pkg-config --cflags --libs absl_hash absl_raw_hash_set
absl_hashtablez_sampler) part) so the flags are treated as a single expansion
and not split by the shell.
- Around line 40-48: The current conditional that checks
e_ins/e_get/e_del/e_miss using -gt can fail if any extracted value is
non-numeric; before performing the integer comparisons in the if block (the line
starting with "if [ -n "$e_ins" ] && ... && [ "$e_miss" -gt 0 ]; then"), add a
simple numeric validation for the four extracted variables (e_ins, e_get, e_del,
e_miss) — e.g., test each against a numeric regex (allow integers or decimals)
or coerce/validate with awk — and only proceed to compute
ins_gap/get_gap/del_gap/mis_gap and printf when all four pass the numeric check
and are >0. Ensure you reference and validate the same variables used later
(a_ins/a_get/a_del/a_miss and e_*) so the awk division and printf are only
executed with safe numeric inputs.

In `@bench-verify.cpp`:
- Around line 80-83: The printf in bench-verify.cpp that prints
median(lkp_ordered, MEASURED) and median(lkp_shuffled, MEASURED) uses %lu which
is not portable for uint64_t; update the format to use the PRIu64 macro (e.g.,
"%" PRIu64) for those two fields and ensure you `#include` <inttypes.h> at the top
of the file so PRIu64 is defined; keep the existing %zu for size_t (n, load) and
adjust nothing else.

In `@bench.sh`:
- Line 14: The current pipeline using grep -oP 'lookup_ratio=\K[0-9.]+' is not
portable because grep -P is a GNU-only (Perl) extension; replace the trailing
grep -oP step with a portable sed or awk expression to extract the numeric value
from lines matched by grep "RESULT" and grep "n=1048576" (i.e., update the
pipeline that currently reads grep "RESULT" bench.log | grep "n=1048576" | grep
-oP 'lookup_ratio=\K[0-9.]+'). Use sed -n with a capture group or awk to print
the captured lookup_ratio value so the script works on macOS/BSD as well.

In `@insights.md`:
- Around line 10-27: The fenced pseudocode block that shows get(key) and
findKeyInBucket lacks a language specifier, so update the opening fence to
include one (e.g., ```text or ```pseudo) to ensure consistent
rendering/highlighting; leave the content of get(key) and findKeyInBucket
unchanged and keep the closing ``` intact.

In `@program-v3.md`:
- Around line 92-99: The fenced diagram blocks showing the memory layout (the
lines starting with "fingerprints: [tier0 bucket 0]...") should include a
Markdown language specifier for consistent rendering; update both fenced code
blocks that display the fingerprints layout to use a plain text specifier (e.g.,
change ``` to ```text) so the blocks that contain "fingerprints: [tier0 bucket
0][tier0 bucket 1]...[tier1 bucket 0][tier1 bucket 1]..." and the interleaved
version "fingerprints: [tier0 bucket 0][tier1 bucket 0][tier0 bucket 1][tier1
bucket 1]..." are fenced as ```text.
- Around line 162-164: The fenced code block containing the TSV header line
"commit\tfind_rate\tlookup_gap_99\tmiss_gap_99\tinsert_gap_99\tstatus\tdescription"
should include a language specifier for TSV; update the opening fence from "```"
to "```tsv" so the header block is marked as TSV (locate the block that contains
the exact header string to apply the change).

In `@program.md`:
- Around line 74-77: Update the fenced code block in the "Log. Append a line to
`results.tsv`" section so it includes a language specifier (e.g., ```tsv)
immediately after the opening backticks; locate the block showing
"<commit_hash>\t<lookup_ratio_1M>\t<insert_ratio_1M>\t<delete_ratio_1M>\t<keep|revert|crash>\t<description>"
and change the opening fence to include the specifier to ensure proper markdown
rendering.

In `@README.md`:
- Around line 106-110: Update the fenced code block that contains the three
commands starting with "zig build test" to include a language identifier by
changing the opening ``` to ```bash so the block reads as a bash snippet; locate
the triple-backtick fenced block which encloses "zig build test", "zig build
bench", and "bash bench-v2.sh" and add the "bash" specifier for proper syntax
highlighting and lint compliance.

In `@src/autobench-strings-verify.zig`:
- Line 56: The one-line Fisher‑Yates shuffle in the loop is correct but hard to
read; refactor the while loop that uses idx, rng, splitmix64, and shuffle_order
into multiple lines: decrement idx on its own line, compute j = splitmix64(&rng)
% (idx + 1) on the next line, then swap using a temporary variable (tmp) across
two lines so each step (decrement, index calc, swap) is explicit; keep the
existing variable names (idx, rng, j, tmp, shuffle_order) and preserve the same
logic and boundaries so behavior remains unchanged.

In `@src/autobench.zig`:
- Around line 84-126: Introduce a comptime toggle (e.g., comptime const
SHOW_DISTRIBUTION = false) and wrap the entire diagnostic block (the if (r == 0
and load_pct == 99 and n == 1_048_576) { ... } including the brute-force scan
that references map, map.get, map.getWithProbes, map.entries, map.fingerprints,
KEY_SEED, and splitmix64) in if (comptime SHOW_DISTRIBUTION) { ... } so the
prints and scan are compiled out by default and can be enabled at compile time
for debugging.

In `@src/hybrid.zig`:
- Around line 403-406: The runtime hash function inline fn hash(key: u64)
currently uses a single-multiply-XOR scheme that differs from the
ComptimeHybridElasticHash.hash() implementation; make them consistent by
replacing the runtime hash body with the same wyhash-style
double-multiply/xor/shift sequence used in ComptimeHybridElasticHash.hash()
(same constants and operations) so both runtime and comptime hashes produce the
same distribution, or if the simpler runtime variant is intentional, add a
TODO/comment in inline fn hash referencing ComptimeHybridElasticHash.hash and
include a brief note to keep behavior divergence deliberate and benchmarked;
locate the function by name (inline fn hash) and the comptime implementation
(ComptimeHybridElasticHash.hash) to copy or document the logic.

In `@src/string_hybrid.zig`:
- Around line 39-47: StringEntry currently stores a raw pointer (key_ptr) and
length (key_len) that reference caller-owned memory, which can become invalid;
update StringEntry to either (A) store an owned slice or copy of the key (e.g.,
allocate and store []const u8 or a Zig ArrayList/Allocator-managed buffer) or
(B) explicitly document the lifetime requirement and provide a safe constructor
that takes ownership or clones the key; adjust the StringEntry.key(self) method
accordingly to return the owned slice, and add a clear comment on the struct (or
add a create/copy function) describing that callers must ensure the backing
buffer outlives the entry if you keep the current raw-pointer approach.

In `@src/verify-strings.zig`:
- Line 62: The one-line Fisher–Yates shuffle inside verify-strings.zig is hard
to read; expand the single-line loop into a multi-line, well-indented while loop
using the same variables (idx, rng, splitmix64, shuffle_order, tmp) so each
operation is on its own line, and then optionally move that logic into a shared
helper function (e.g., shuffleArray or fyShuffle) that accepts the slice and rng
pointer so autobench-strings-verify.zig and verify-strings.zig can both call it.

In `@verify-hash-cost.cpp`:
- Around line 64-69: The printf calls use %lu for variables best_ours and
best_abseil which is not portable for uint64_t; include <cinttypes> (or
<inttypes.h>) and switch the format specifiers to use PRIu64 (e.g. "%" PRIu64)
for best_ours and best_abseil, and ensure any arithmetic involving N is cast
appropriately (or printed with a matching specifier) so the expressions
best_ours * 1000.0 / N and best_abseil * 1000.0 / N still format correctly;
update the three printf lines referencing best_ours, best_abseil and the ratio
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 81e28ea9-888f-489b-96c2-8adfa31b31e5

📥 Commits

Reviewing files that changed from the base of the PR and between 3e37318 and d788da5.

⛔ Files ignored due to path filters (8)
  • abseil-strings.log is excluded by !**/*.log
  • bench-go/go.sum is excluded by !**/*.sum
  • bench-rust/Cargo.lock is excluded by !**/*.lock
  • bench-v2.log is excluded by !**/*.log
  • elastic-strings.log is excluded by !**/*.log
  • results-v2.tsv is excluded by !**/*.tsv
  • results-v3.tsv is excluded by !**/*.tsv
  • results.tsv is excluded by !**/*.tsv
📒 Files selected for processing (61)
  • .gitignore
  • AGENTS.md
  • README.md
  • bench-abseil-clang
  • bench-abseil-strings
  • bench-abseil-strings.cpp
  • bench-abseil.cpp
  • bench-go/bench-go
  • bench-go/go.mod
  • bench-go/main.go
  • bench-realistic
  • bench-realistic.cpp
  • bench-rust/Cargo.toml
  • bench-rust/bench-go/go.mod
  • bench-rust/src/main.rs
  • bench-rust/target/.rustc_info.json
  • bench-rust/target/CACHEDIR.TAG
  • bench-rust/target/release/.cargo-lock
  • bench-rust/target/release/.fingerprint/bench-hashbrown-f2a29ac529dfc934/bin-bench-hashbrown
  • bench-rust/target/release/.fingerprint/bench-hashbrown-f2a29ac529dfc934/bin-bench-hashbrown.json
  • bench-rust/target/release/.fingerprint/bench-hashbrown-f2a29ac529dfc934/dep-bin-bench-hashbrown
  • bench-rust/target/release/.fingerprint/bench-hashbrown-f2a29ac529dfc934/invoked.timestamp
  • bench-rust/target/release/bench-hashbrown
  • bench-rust/target/release/bench-hashbrown.d
  • bench-rust/target/release/deps/bench_hashbrown-f2a29ac529dfc934
  • bench-rust/target/release/deps/bench_hashbrown-f2a29ac529dfc934.d
  • bench-strings-verify
  • bench-strings-verify.cpp
  • bench-strings.sh
  • bench-v2.sh
  • bench-verify
  • bench-verify.cpp
  • bench.sh
  • build.zig
  • cross-lang-results.md
  • insights-v2.md
  • insights-v3.md
  • insights.md
  • program-v2.md
  • program-v3-verify.md
  • program-v3.md
  • program-v4-verify-strings.md
  • program-v5-cross-lang.md
  • program.md
  • results.md
  • src/autobench-realistic.zig
  • src/autobench-strings-verify.zig
  • src/autobench-strings.zig
  • src/autobench-verify.zig
  • src/autobench.zig
  • src/hybrid.zig
  • src/string_hybrid.zig
  • src/verify-strings.zig
  • verify-capacity
  • verify-capacity.cpp
  • verify-hash-cost
  • verify-hash-cost.cpp
  • verify-results.md
  • verify-strings-abseil
  • verify-strings-results.md
  • verify-strings.cpp

Comment on lines +62 to +68
```cpp
start = std::chrono::steady_clock::now();
for (size_t i = 0; i < fill; i++) {
    auto it = map.find(keys[i]);
    do_not_optimize(it->second);
}
end = std::chrono::steady_clock::now();
uint64_t lookup_us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check if autobench.zig or autobench-strings.zig shuffle hit lookups

echo "=== Checking Zig benchmark lookup patterns ==="

# Look for shuffle usage near lookup timing in Zig files
rg -n -A5 -B5 "shuffle" src/autobench.zig src/autobench-strings.zig 2>/dev/null || echo "Files not found or no shuffle"

# Check if hit lookups use shuffle_order
rg -n "shuffle_order\[" src/autobench.zig src/autobench-strings.zig 2>/dev/null || echo "No shuffle_order usage found"
```

Repository: joshuaisaact/elastic-hash

Length of output: 174


🏁 Script executed:

# First, let's look at bench-abseil.cpp around lines 62-68 to confirm the lookup pattern
head -80 bench-abseil.cpp | tail -25

Repository: joshuaisaact/elastic-hash

Length of output: 1145


🏁 Script executed:

# Search for README and check for "shuffled random access" or similar claims
find . -iname "README*" -type f | head -5

Repository: joshuaisaact/elastic-hash

Length of output: 80


🏁 Script executed:

# Check if there's any shuffling in C++ benchmarks
rg -n "shuffle" bench-*.cpp 2>/dev/null || echo "No shuffle found in C++ benchmarks"

Repository: joshuaisaact/elastic-hash

Length of output: 1771


🏁 Script executed:

cat README.md | head -150

Repository: joshuaisaact/elastic-hash

Length of output: 5535


🏁 Script executed:

# Search README for keywords about shuffled, random, access patterns
rg -n "shuffled|random.access|hit.lookup" README.md

Repository: joshuaisaact/elastic-hash

Length of output: 502


Hit lookups use insertion order, not shuffled access.

The README documents "shuffled random access" for hit lookups (line 13), and the verification benchmarks (bench-verify.cpp, bench-strings-verify.cpp) implement this with shuffle_order. However, bench-abseil.cpp lines 63-66 iterate keys in insertion order (keys[i] in a simple loop) rather than shuffled random access. Align the main benchmark with the documented methodology.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-abseil.cpp` around lines 62 - 68, The lookup loop in bench-abseil.cpp
uses insertion-order keys (keys[i]) for hit lookups; change it to perform
shuffled random access like the other benchmarks by indexing keys with the
existing shuffle_order permutation (or by shuffling keys prior). Specifically,
in the block that measures lookup_us, replace the direct keys[i] access with
keys[shuffle_order[i]] (or apply std::shuffle to keys before the loop) so the
map.find calls use the shuffled order; keep other variables (map, fill,
do_not_optimize, lookup_us) unchanged.


go 1.25.4

require github.com/cockroachdb/swiss v0.0.0-20251224182025-b0f6560f979b // indirect

⚠️ Potential issue | 🟡 Minor

Dependency should be direct, not indirect.

The main.go directly imports github.com/cockroachdb/swiss, so this should be a direct dependency. The // indirect comment is incorrect and typically indicates the dependency is only transitively required.

Proposed fix
-require github.com/cockroachdb/swiss v0.0.0-20251224182025-b0f6560f979b // indirect
+require github.com/cockroachdb/swiss v0.0.0-20251224182025-b0f6560f979b
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
require github.com/cockroachdb/swiss v0.0.0-20251224182025-b0f6560f979b // indirect
require github.com/cockroachdb/swiss v0.0.0-20251224182025-b0f6560f979b
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-go/go.mod` at line 5, The go.mod entry for github.com/cockroachdb/swiss
is marked as indirect but main.go imports it directly; remove the "// indirect"
marker or add the dependency as a direct requirement by running `go get
github.com/cockroachdb/swiss@v0.0.0-20251224182025-b0f6560f979b` (or update the
version) so the require line for github.com/cockroachdb/swiss becomes a direct
dependency; check main.go to confirm the import and then run `go mod tidy` to
ensure module file consistency.

Comment on lines +113 to +114
fmt.Printf("RESULT\timpl=go-swiss\tn=%d\tload=%d\tinsert_us=%d\tlookup_us=%d\tmiss_us=%d\tdelete_us=%d\n",
	n, loadPct, median(ins), median(lkp), median(mis), median(del))

⚠️ Potential issue | 🟡 Minor

Output format differs from Abseil string benchmark.

The Go output includes an impl= field and has miss_us before delete_us, while bench-strings.sh expects no impl= field and delete_us before miss_us. This Go benchmark likely needs its own parser or the output should be standardized.

📋 Format comparison

Go output:

RESULT\timpl=go-swiss\tn=\tload=\tinsert_us=\tlookup_us=\tmiss_us=\tdelete_us=

bench-strings.sh expects:

RESULT\tn=\tload=\tinsert_us=\tlookup_us=\tdelete_us=\tmiss_us=

If you want to use bench-strings.sh with Go output, either add Go support to the parser or standardize the output format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-go/main.go` around lines 113 - 114, The RESULT line in bench-go/main.go
currently prints an extra "impl=go-swiss" field and places miss_us before
delete_us, which breaks bench-strings.sh parsing; update the fmt.Printf call
(the RESULT formatter in main.go) to remove the "impl=..." token and reorder the
placeholders so delete_us is printed before miss_us (i.e., format: RESULT\t
n=...\tload=...\tinsert_us=...\tlookup_us=...\tdelete_us=...\tmiss_us=...),
ensuring the field names and order exactly match what bench-strings.sh expects;
alternatively, if you prefer supporting both formats, add parsing logic to
bench-strings.sh to accept the "impl=" field and both field orders, but prefer
standardizing main.go's PRINTF first.

@@ -0,0 +1 @@
{"rustc_fingerprint":16647195171711883400,"outputs":{"17747080675513052775":{"success":true,"status":"","code":0,"stdout":"rustc 1.94.0 (4a4ef493e 2026-03-02) (Arch Linux rust 1:1.94.0-2)\nbinary: rustc\ncommit-hash: 4a4ef493e3a1488c6e321570238084b38948f6db\ncommit-date: 2026-03-02\nhost: x86_64-unknown-linux-gnu\nrelease: 1.94.0\nLLVM version: 22.1.0\n","stderr":""},"7971740275564407648":{"success":true,"status":"","code":0,"stdout":"___\nlib___.rlib\nlib___.so\nlib___.so\nlib___.a\nlib___.so\n/usr\noff\npacked\nunpacked\n___\ndebug_assertions\npanic=\"unwind\"\nproc_macro\ntarget_abi=\"\"\ntarget_arch=\"x86_64\"\ntarget_endian=\"little\"\ntarget_env=\"gnu\"\ntarget_family=\"unix\"\ntarget_feature=\"fxsr\"\ntarget_feature=\"sse\"\ntarget_feature=\"sse2\"\ntarget_has_atomic=\"16\"\ntarget_has_atomic=\"32\"\ntarget_has_atomic=\"64\"\ntarget_has_atomic=\"8\"\ntarget_has_atomic=\"ptr\"\ntarget_os=\"linux\"\ntarget_pointer_width=\"64\"\ntarget_vendor=\"unknown\"\nunix\n","stderr":""}},"successes":{}} No newline at end of file

⚠️ Potential issue | 🟡 Minor

Build artifact should not be committed.

The bench-rust/target/ directory contains Cargo build artifacts that are machine-specific and auto-generated. Add bench-rust/target/ to .gitignore and remove these files from the repository.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-rust/target/.rustc_info.json` at line 1, The bench build artifacts (the
target directory and files like .rustc_info.json) were committed; update the
repo by adding the bench-rust/target/ entry to .gitignore and stop tracking
those files, then remove the committed artifacts from git (unstage/remove from
index) and commit the change so the target files are no longer tracked; ensure
.rustc_info.json and any other files under the target directory are removed from
the repository history or at least untracked going forward.

Comment on lines +302 to +312
fn defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp: u8) ?u64 {
    if (self.num_tiers <= 1) return null;
    const num_buckets = self.tier_bucket_counts[1];
    const tier_start = self.tier_starts[1];
    for (0..@min(MAX_PROBES, num_buckets)) |probe| {
        const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
        if (self.findValueInBucket(abs_idx, key, fp)) |val| return val;
        if (matchEmpty(&self.fingerprints[abs_idx]) != 0) return null;
    }
    return null;
}

⚠️ Potential issue | 🟠 Major

defaultGetOverflow only searches tier 1, but insertAnyTier can place elements in any tier.

insertAnyTier() (line 270) iterates for (0..self.num_tiers) |tier|, potentially placing elements in tiers 2+. However, defaultGetOverflow() only searches tier 1 (line 304), making elements in higher tiers invisible to get().

This could cause get() to return null for keys that were successfully inserted.

🐛 Proposed fix to search all overflow tiers
     fn defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp: u8) ?u64 {
-        if (self.num_tiers <= 1) return null;
-        const num_buckets = self.tier_bucket_counts[1];
-        const tier_start = self.tier_starts[1];
-        for (0..@min(MAX_PROBES, num_buckets)) |probe| {
-            const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
-            if (self.findValueInBucket(abs_idx, key, fp)) |val| return val;
-            if (matchEmpty(&self.fingerprints[abs_idx]) != 0) return null;
+        for (1..self.num_tiers) |tier| {
+            const num_buckets = self.tier_bucket_counts[tier];
+            const tier_start = self.tier_starts[tier];
+            for (0..@min(MAX_PROBES, num_buckets)) |probe| {
+                const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
+                if (self.findValueInBucket(abs_idx, key, fp)) |val| return val;
+            }
         }
         return null;
     }
📝 Committable suggestion


Suggested change
fn defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp: u8) ?u64 {
    if (self.num_tiers <= 1) return null;
    const num_buckets = self.tier_bucket_counts[1];
    const tier_start = self.tier_starts[1];
    for (0..@min(MAX_PROBES, num_buckets)) |probe| {
        const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
        if (self.findValueInBucket(abs_idx, key, fp)) |val| return val;
        if (matchEmpty(&self.fingerprints[abs_idx]) != 0) return null;
    }
    return null;
}
fn defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp: u8) ?u64 {
    for (1..self.num_tiers) |tier| {
        const num_buckets = self.tier_bucket_counts[tier];
        const tier_start = self.tier_starts[tier];
        for (0..@min(MAX_PROBES, num_buckets)) |probe| {
            const abs_idx = tier_start + bucketIndex(h, probe, num_buckets);
            if (self.findValueInBucket(abs_idx, key, fp)) |val| return val;
        }
    }
    return null;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/string_hybrid.zig` around lines 302 - 312, defaultGetOverflow currently
only inspects tier 1 which misses items inserted by insertAnyTier into higher
tiers; update defaultGetOverflow(self: *const Self, h: u64, key: []const u8, fp:
u8) to iterate tiers 1..self.num_tiers, and for each tier set num_buckets =
self.tier_bucket_counts[tier] and tier_start = self.tier_starts[tier], then
perform the existing probe loop (using bucketIndex, findValueInBucket, and
matchEmpty) for that tier and return the found value or continue to the next
tier; keep the same MAX_PROBES logic per tier and return null if not found in
any overflow tier.

Comment on lines +314 to +330
pub fn remove(self: *Self, key: []const u8) bool {
    const h = hash(key);
    const fp = fingerprint(h);
    const mask = self.tier0_bucket_mask;
    const bucket_base = h >> self.tier0_bucket_shift;

    var probe: usize = 0;
    while (probe < MAX_PROBES) : (probe += 1) {
        const bucket_idx = (bucket_base +% @as(u64, probe)) & mask;
        if (self.findKeyInBucket(bucket_idx, key, fp)) |slot| {
            self.fingerprints[bucket_idx][slot] = TOMBSTONE;
            self.count -= 1;
            return true;
        }
    }
    return false;
}

⚠️ Potential issue | 🟠 Major

remove() only searches tier 0 and doesn't decrement tier_slot_counts.

Two issues:

  1. Elements inserted via insertAnyTier() (lines 267-284) may reside in tiers 1+, but remove() only probes tier 0, leaving those elements un-removable.
  2. When removing from tier 0, tier_slot_counts[0] is not decremented, causing getEmptyFraction() to return incorrect values and potentially affecting insertion decisions.
🐛 Proposed fix to decrement tier_slot_counts
     pub fn remove(self: *Self, key: []const u8) bool {
         const h = hash(key);
         const fp = fingerprint(h);
         const mask = self.tier0_bucket_mask;
         const bucket_base = h >> self.tier0_bucket_shift;

         var probe: usize = 0;
         while (probe < MAX_PROBES) : (probe += 1) {
             const bucket_idx = (bucket_base +% @as(u64, probe)) & mask;
             if (self.findKeyInBucket(bucket_idx, key, fp)) |slot| {
                 self.fingerprints[bucket_idx][slot] = TOMBSTONE;
+                self.tier_slot_counts[0] -= 1;
                 self.count -= 1;
                 return true;
             }
         }
         return false;
     }

For elements in higher tiers, consider adding overflow tier removal or documenting this as a known limitation.

📝 Committable suggestion


Suggested change
pub fn remove(self: *Self, key: []const u8) bool {
    const h = hash(key);
    const fp = fingerprint(h);
    const mask = self.tier0_bucket_mask;
    const bucket_base = h >> self.tier0_bucket_shift;

    var probe: usize = 0;
    while (probe < MAX_PROBES) : (probe += 1) {
        const bucket_idx = (bucket_base +% @as(u64, probe)) & mask;
        if (self.findKeyInBucket(bucket_idx, key, fp)) |slot| {
            self.fingerprints[bucket_idx][slot] = TOMBSTONE;
            self.count -= 1;
            return true;
        }
    }
    return false;
}
pub fn remove(self: *Self, key: []const u8) bool {
    const h = hash(key);
    const fp = fingerprint(h);
    const mask = self.tier0_bucket_mask;
    const bucket_base = h >> self.tier0_bucket_shift;

    var probe: usize = 0;
    while (probe < MAX_PROBES) : (probe += 1) {
        const bucket_idx = (bucket_base +% @as(u64, probe)) & mask;
        if (self.findKeyInBucket(bucket_idx, key, fp)) |slot| {
            self.fingerprints[bucket_idx][slot] = TOMBSTONE;
            self.tier_slot_counts[0] -= 1;
            self.count -= 1;
            return true;
        }
    }
    return false;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/string_hybrid.zig` around lines 314 - 330, remove() only probes tier 0
and doesn't update tier_slot_counts, so items placed in higher tiers by
insertAnyTier() cannot be removed and tier_slot_counts[0] stays stale; update
remove(self: *Self, key: []const u8) to probe across all tiers for the bucket
(use the same bucket_idx calculation but for each probe check each tier’s slot
via findKeyInBucket or a new helper), and when you tombstone an entry set
fingerprints[bucket_idx][slot] = TOMBSTONE, decrement self.count and also
decrement self.tier_slot_counts[tier_index] (use the tier you found the key in);
ensure the code updates the correct tier_slot_counts element and keeps existing
TOMBSTONE, count, and fingerprint semantics so getEmptyFraction() and insertion
logic remain correct (or if you intentionally won't support removal from
overflow tiers, document that limitation in insertAnyTier/remove).

Comment on lines +60 to +66
- g++ (GCC) 15.2.1 for abseil
- Zig 0.16.0-dev (LLVM 19+ backend) for elastic hash
- clang++ not available for cross-check

Both use -O3 equivalent. The different compiler backends (GCC vs LLVM) could generate different code quality. This is a limitation we can't easily resolve without installing clang.

**Update: clang 22.1.1 now tested.** GCC and clang produce nearly identical abseil lookup performance at 10-90% load (within 3%). At 99% load, GCC is actually 10% faster than clang. The compiler difference is NOT giving us an unfair advantage -- GCC-compiled abseil is the harder target.

⚠️ Potential issue | 🟡 Minor

Remove the stale “clang++ not available” note.

This section first says clang was unavailable and then immediately reports clang 22.1.1 results. Leaving both statements in the final document makes the verification narrative self-contradictory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@verify-results.md` around lines 60 - 66, The document contains a
contradictory note: the line stating "clang++ not available for cross-check" is
now stale because the later "Update: clang 22.1.1 now tested." reports clang
results; remove the earlier "clang++ not available for cross-check" sentence (or
reword it into a historical note) so that the section only reflects the current
clang testing status and keeps the abseil vs clang/GCC comparison consistent.

Comment on lines +41 to +43
static uint64_t median_val(uint64_t* arr, int n) {
    std::sort(arr, arr + n);
    return arr[n / 2];

⚠️ Potential issue | 🟠 Major

Return a true median for even-sized samples.

With MEASURED == 10, arr[n / 2] picks the upper middle sample, not the median. That biases every reported “median of 10” in Tests 2-4 and can skew cross-implementation comparisons.

Proposed fix
 static uint64_t median_val(uint64_t* arr, int n) {
     std::sort(arr, arr + n);
-    return arr[n / 2];
+    if ((n & 1) != 0) return arr[n / 2];
+    const uint64_t lo = arr[(n / 2) - 1];
+    const uint64_t hi = arr[n / 2];
+    return lo + (hi - lo) / 2;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@verify-strings.cpp` around lines 41 - 43, median_val currently returns
arr[n/2], which picks the upper middle for even n and biases medians; after
sorting the uint64_t* arr in median_val(sort(arr, arr+n)), return the true
median: if n is odd return arr[n/2], otherwise compute the average of the two
middle values (arr[n/2 - 1] and arr[n/2]) and return that average as uint64_t to
handle even-sized samples correctly.

Comment on lines +57 to +70
absl::Hash<std::string_view> hasher;
uint64_t best = UINT64_MAX;
for (int r = 0; r < 10; r++) {
    auto start = std::chrono::steady_clock::now();
    uint64_t acc = 0;
    for (size_t i = 0; i < N; i++) {
        acc += hasher(std::string_view(keys[i & 1023], 16));
    }
    auto end = std::chrono::steady_clock::now();
    do_not_optimize(acc);
    uint64_t us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    if (us < best) best = us;
}
printf("  absl::Hash<string_view>(16B): %lu us (%.1f ns/hash)\n\n", best, best * 1000.0 / N);

⚠️ Potential issue | 🟡 Minor

Use the same summary statistic for Test 1.

The hash-cost microbenchmark reports the minimum runtime, while the rest of this file reports medians after warmup. That makes Test 1 systematically more optimistic and not directly comparable to the lookup numbers you're using elsewhere.

Proposed fix
     absl::Hash<std::string_view> hasher;
-    uint64_t best = UINT64_MAX;
-    for (int r = 0; r < 10; r++) {
+    uint64_t times[MEASURED];
+    for (int r = 0; r < TOTAL_RUNS; r++) {
         auto start = std::chrono::steady_clock::now();
         uint64_t acc = 0;
         for (size_t i = 0; i < N; i++) {
             acc += hasher(std::string_view(keys[i & 1023], 16));
         }
         auto end = std::chrono::steady_clock::now();
         do_not_optimize(acc);
         uint64_t us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
-        if (us < best) best = us;
+        if (r >= WARMUP) times[r - WARMUP] = us;
     }
-    printf("  absl::Hash<string_view>(16B): %lu us (%.1f ns/hash)\n\n", best, best * 1000.0 / N);
+    const uint64_t median = median_val(times, MEASURED);
+    printf("  absl::Hash<string_view>(16B): %lu us (%.1f ns/hash)\n\n", median, median * 1000.0 / N);
📝 Committable suggestion


Suggested change
absl::Hash<std::string_view> hasher;
uint64_t best = UINT64_MAX;
for (int r = 0; r < 10; r++) {
    auto start = std::chrono::steady_clock::now();
    uint64_t acc = 0;
    for (size_t i = 0; i < N; i++) {
        acc += hasher(std::string_view(keys[i & 1023], 16));
    }
    auto end = std::chrono::steady_clock::now();
    do_not_optimize(acc);
    uint64_t us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    if (us < best) best = us;
}
printf("  absl::Hash<string_view>(16B): %lu us (%.1f ns/hash)\n\n", best, best * 1000.0 / N);
absl::Hash<std::string_view> hasher;
uint64_t times[MEASURED];
for (int r = 0; r < TOTAL_RUNS; r++) {
    auto start = std::chrono::steady_clock::now();
    uint64_t acc = 0;
    for (size_t i = 0; i < N; i++) {
        acc += hasher(std::string_view(keys[i & 1023], 16));
    }
    auto end = std::chrono::steady_clock::now();
    do_not_optimize(acc);
    uint64_t us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    if (r >= WARMUP) times[r - WARMUP] = us;
}
const uint64_t median = median_val(times, MEASURED);
printf("  absl::Hash<string_view>(16B): %lu us (%.1f ns/hash)\n\n", median, median * 1000.0 / N);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@verify-strings.cpp` around lines 57 - 70, The microbenchmark currently uses
the minimum runtime (variable best) across 10 runs for absl::Hash<string_view>,
producing an optimistic outlier; change it to compute the median runtime after
the same warmup behavior used elsewhere: record each run's duration (from
start/end around the inner loop that uses hasher(std::string_view(keys[i &
1023], 16))), skip the warmup runs as done in the other tests, sort the
collected durations and pick the median value to print (use that median in the
printf and nanoseconds/hash calculation instead of best). Ensure you reference
and reuse N, keys, hasher, and the 10-run loop so the change is localized.

…for Go

Rust+ahash is now the closest competitor (1.46x at 50%, 1.09x at 99%).
Go swiss.Map improved from 6.27x to 2.28x with pre-allocated strings.
Elastic hash still fastest at every load factor against every competitor.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
bench-go/main.go (1)

121-122: ⚠️ Potential issue | 🟡 Minor

RESULT schema still differs from parser-expected format.

Line 121 and Line 198 still emit impl=... and place miss_us before delete_us. This matches the previously reported parser-compatibility issue.

Suggested format normalization
-fmt.Printf("RESULT\timpl=go-swiss\tn=%d\tload=%d\tinsert_us=%d\tlookup_us=%d\tmiss_us=%d\tdelete_us=%d\n",
-		n, loadPct, median(ins), median(lkp), median(mis), median(del))
+fmt.Printf("RESULT\tn=%d\tload=%d\tinsert_us=%d\tlookup_us=%d\tdelete_us=%d\tmiss_us=%d\n",
+		n, loadPct, median(ins), median(lkp), median(del), median(mis))

-fmt.Printf("RESULT\timpl=go-builtin\tn=%d\tload=%d\tinsert_us=%d\tlookup_us=%d\tmiss_us=%d\tdelete_us=%d\n",
-		n, loadPct, median(ins), median(lkp), median(mis), median(del))
+fmt.Printf("RESULT\tn=%d\tload=%d\tinsert_us=%d\tlookup_us=%d\tdelete_us=%d\tmiss_us=%d\n",
+		n, loadPct, median(ins), median(lkp), median(del), median(mis))

Also applies to: 198-199

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-go/main.go` around lines 121 - 122, The RESULT output lines currently
print "impl=go-swiss" and place miss_us before delete_us, which breaks the
parser; update both fmt.Printf calls that emit "RESULT\timpl=go-swiss..." (the
two print sites in main that format RESULT) to match the parser-expected schema
by removing the "impl=..." token and reordering fields so delete_us appears
before miss_us (i.e., RESULT\t
n=...\tload=...\tinsert_us=...\tlookup_us=...\tdelete_us=...\tmiss_us=...),
keeping the same numeric values and tab separators.
🧹 Nitpick comments (1)
cross-lang-results.md (1)

67-72: Add concrete environment metadata for reproducibility.

Consider adding CPU model, core pinning policy, OS/kernel, and exact compiler/runtime versions used for each language run. This will make the fairness caveats and rankings much easier to reproduce and defend.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cross-lang-results.md` around lines 67 - 72, Update the "## Remaining
fairness caveats" section to include concrete environment metadata for each
language benchmark run: list CPU model/frequency, core-pin/affinity policy used,
OS and kernel version, exact compiler/toolchain version and flags (e.g.,
rustc/cargo, gcc/clang, Zig/C++ toolchains), and runtime versions (Go, JVM,
etc.); add a short per-language table or bullet block under that header that
records these fields for every measured run so readers can reproduce the results
and defend the comparisons.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bench-go/main.go`:
- Around line 37-42: The median function (median, using arr and local s)
currently returns s[len(s)/2] which biases the result for even-length slices;
modify median to handle even counts by returning the average of the two middle
values: for n := len(s), if n==0 return 0 (or appropriate zero value), if n%2==1
return s[n/2], else return (s[n/2-1]+s[n/2])/2 using uint64 arithmetic to
preserve the function signature and avoid bias for even-sized samples.

---

Duplicate comments:
In `@bench-go/main.go`:
- Around line 121-122: The RESULT output lines currently print "impl=go-swiss"
and place miss_us before delete_us, which breaks the parser; update both
fmt.Printf calls that emit "RESULT\timpl=go-swiss..." (the two print sites in
main that format RESULT) to match the parser-expected schema by removing the
"impl=..." token and reordering fields so delete_us appears before miss_us
(i.e., RESULT\t
n=...\tload=...\tinsert_us=...\tlookup_us=...\tdelete_us=...\tmiss_us=...),
keeping the same numeric values and tab separators.

---

Nitpick comments:
In `@cross-lang-results.md`:
- Around line 67-72: Update the "## Remaining fairness caveats" section to
include concrete environment metadata for each language benchmark run: list CPU
model/frequency, core-pin/affinity policy used, OS and kernel version, exact
compiler/toolchain version and flags (e.g., rustc/cargo, gcc/clang, Zig/C++
toolchains), and runtime versions (Go, JVM, etc.); add a short per-language
table or bullet block under that header that records these fields for every
measured run so readers can reproduce the results and defend the comparisons.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d4b4bce8-2f62-461c-9d99-eb9bd886daf7

📥 Commits

Reviewing files that changed from the base of the PR and between d788da5 and 6aecf81.

⛔ Files ignored due to path filters (1)
  • bench-rust/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (113)
  • bench-go/bench-go
  • bench-go/main.go
  • bench-rust/Cargo.toml
  • bench-rust/src/main.rs
  • bench-rust/target/release/.fingerprint/ahash-28d25007c6697d4d/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/ahash-28d25007c6697d4d/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/dep-lib-ahash
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/invoked.timestamp
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/lib-ahash
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/lib-ahash.json
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/dep-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/invoked.timestamp
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/bin-bench-hashbrown
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/bin-bench-hashbrown.json
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/dep-bin-bench-hashbrown
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/invoked.timestamp
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/dep-lib-cfg_if
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/invoked.timestamp
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/lib-cfg_if
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/lib-cfg_if.json
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/dep-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-7c74517ba22e6184/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/getrandom-7c74517ba22e6184/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/dep-lib-getrandom
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/lib-getrandom
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/lib-getrandom.json
  • bench-rust/target/release/.fingerprint/libc-1bc3c604e29ae713/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/libc-1bc3c604e29ae713/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/dep-lib-libc
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/invoked.timestamp
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/lib-libc
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/lib-libc.json
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/dep-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/invoked.timestamp
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/dep-lib-once_cell
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/invoked.timestamp
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/lib-once_cell
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/lib-once_cell.json
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/dep-lib-version_check
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/invoked.timestamp
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/lib-version_check
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/lib-version_check.json
  • bench-rust/target/release/.fingerprint/zerocopy-2c2a0b3df16c71b3/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/zerocopy-2c2a0b3df16c71b3/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/dep-lib-zerocopy
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/invoked.timestamp
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/lib-zerocopy
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/lib-zerocopy.json
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/dep-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/invoked.timestamp
  • bench-rust/target/release/bench-hashbrown
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/invoked.timestamp
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/output
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/root-output
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/stderr
  • bench-rust/target/release/build/ahash-7d171089a62bd126/build-script-build
  • bench-rust/target/release/build/ahash-7d171089a62bd126/build_script_build-7d171089a62bd126
  • bench-rust/target/release/build/ahash-7d171089a62bd126/build_script_build-7d171089a62bd126.d
  • bench-rust/target/release/build/getrandom-6df4dbbcffb370b7/build-script-build
  • bench-rust/target/release/build/getrandom-6df4dbbcffb370b7/build_script_build-6df4dbbcffb370b7
  • bench-rust/target/release/build/getrandom-6df4dbbcffb370b7/build_script_build-6df4dbbcffb370b7.d
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/invoked.timestamp
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/output
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/root-output
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/stderr
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/invoked.timestamp
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/output
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/root-output
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/stderr
  • bench-rust/target/release/build/libc-6b81b46a947c48f0/build-script-build
  • bench-rust/target/release/build/libc-6b81b46a947c48f0/build_script_build-6b81b46a947c48f0
  • bench-rust/target/release/build/libc-6b81b46a947c48f0/build_script_build-6b81b46a947c48f0.d
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/invoked.timestamp
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/output
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/root-output
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/stderr
  • bench-rust/target/release/build/zerocopy-9010127d9528e97d/build-script-build
  • bench-rust/target/release/build/zerocopy-9010127d9528e97d/build_script_build-9010127d9528e97d
  • bench-rust/target/release/build/zerocopy-9010127d9528e97d/build_script_build-9010127d9528e97d.d
  • bench-rust/target/release/deps/ahash-7cdcab918ea16e7f.d
  • bench-rust/target/release/deps/bench_hashbrown-415e0f0b67b371d2
  • bench-rust/target/release/deps/bench_hashbrown-415e0f0b67b371d2.d
  • bench-rust/target/release/deps/cfg_if-d985813f4553fd9f.d
  • bench-rust/target/release/deps/getrandom-9596904795c59be3.d
  • bench-rust/target/release/deps/libahash-7cdcab918ea16e7f.rlib
  • bench-rust/target/release/deps/libahash-7cdcab918ea16e7f.rmeta
  • bench-rust/target/release/deps/libc-67bdb2dd2ef38a9b.d
  • bench-rust/target/release/deps/libcfg_if-d985813f4553fd9f.rlib
  • bench-rust/target/release/deps/libcfg_if-d985813f4553fd9f.rmeta
  • bench-rust/target/release/deps/libgetrandom-9596904795c59be3.rlib
  • bench-rust/target/release/deps/libgetrandom-9596904795c59be3.rmeta
  • bench-rust/target/release/deps/liblibc-67bdb2dd2ef38a9b.rlib
  • bench-rust/target/release/deps/liblibc-67bdb2dd2ef38a9b.rmeta
  • bench-rust/target/release/deps/libonce_cell-7d17331ba62697a3.rlib
  • bench-rust/target/release/deps/libonce_cell-7d17331ba62697a3.rmeta
  • bench-rust/target/release/deps/libversion_check-88e8b2470aacf655.rlib
  • bench-rust/target/release/deps/libversion_check-88e8b2470aacf655.rmeta
  • bench-rust/target/release/deps/libzerocopy-6dd092b58363eab7.rlib
  • bench-rust/target/release/deps/libzerocopy-6dd092b58363eab7.rmeta
  • bench-rust/target/release/deps/once_cell-7d17331ba62697a3.d
  • bench-rust/target/release/deps/version_check-88e8b2470aacf655.d
  • bench-rust/target/release/deps/zerocopy-6dd092b58363eab7.d
  • cross-lang-results.md
✅ Files skipped from review due to trivial changes (67)
  • bench-rust/target/release/.fingerprint/ahash-28d25007c6697d4d/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/invoked.timestamp
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/lib-libc
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/lib-zerocopy
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/lib-ahash
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/invoked.timestamp
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/invoked.timestamp
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/bin-bench-hashbrown
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/build-script-build-script-build
  • bench-rust/target/release/.fingerprint/getrandom-7c74517ba22e6184/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/invoked.timestamp
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-7c74517ba22e6184/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/zerocopy-2c2a0b3df16c71b3/run-build-script-build-script-build
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/invoked.timestamp
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/invoked.timestamp
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/root-output
  • bench-rust/target/release/.fingerprint/ahash-7d171089a62bd126/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/invoked.timestamp
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/lib-getrandom.json
  • bench-rust/target/release/.fingerprint/ahash-28d25007c6697d4d/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/libc-1bc3c604e29ae713/run-build-script-build-script-build
  • bench-rust/target/release/.fingerprint/zerocopy-2c2a0b3df16c71b3/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/lib-cfg_if.json
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/invoked.timestamp
  • bench-rust/target/release/.fingerprint/ahash-7cdcab918ea16e7f/lib-ahash.json
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/invoked.timestamp
  • bench-rust/target/release/deps/bench_hashbrown-415e0f0b67b371d2.d
  • bench-rust/target/release/.fingerprint/zerocopy-9010127d9528e97d/build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/lib-version_check
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/root-output
  • bench-rust/target/release/deps/cfg_if-d985813f4553fd9f.d
  • bench-rust/target/release/build/zerocopy-9010127d9528e97d/build_script_build-9010127d9528e97d.d
  • bench-rust/target/release/build/ahash-7d171089a62bd126/build_script_build-7d171089a62bd126.d
  • bench-rust/target/release/.fingerprint/libc-1bc3c604e29ae713/run-build-script-build-script-build.json
  • bench-rust/target/release/.fingerprint/version_check-88e8b2470aacf655/lib-version_check.json
  • bench-rust/target/release/.fingerprint/libc-6b81b46a947c48f0/build-script-build-script-build.json
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/output
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/invoked.timestamp
  • bench-rust/target/release/deps/version_check-88e8b2470aacf655.d
  • bench-rust/target/release/deps/once_cell-7d17331ba62697a3.d
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/root-output
  • bench-rust/target/release/.fingerprint/libc-67bdb2dd2ef38a9b/lib-libc.json
  • bench-rust/target/release/build/libc-6b81b46a947c48f0/build_script_build-6b81b46a947c48f0.d
  • bench-rust/target/release/deps/ahash-7cdcab918ea16e7f.d
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/lib-once_cell
  • bench-rust/target/release/deps/libc-67bdb2dd2ef38a9b.d
  • bench-rust/target/release/deps/getrandom-9596904795c59be3.d
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/invoked.timestamp
  • bench-rust/target/release/.fingerprint/getrandom-9596904795c59be3/lib-getrandom
  • bench-rust/target/release/.fingerprint/cfg-if-d985813f4553fd9f/lib-cfg_if
  • bench-rust/target/release/build/libc-1bc3c604e29ae713/output
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/invoked.timestamp
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/invoked.timestamp
  • bench-rust/target/release/build/ahash-28d25007c6697d4d/root-output
  • bench-rust/target/release/.fingerprint/zerocopy-6dd092b58363eab7/lib-zerocopy.json
  • bench-rust/target/release/.fingerprint/once_cell-7d17331ba62697a3/lib-once_cell.json
  • bench-rust/target/release/build/zerocopy-2c2a0b3df16c71b3/output
  • bench-rust/target/release/build/getrandom-6df4dbbcffb370b7/build_script_build-6df4dbbcffb370b7.d
  • bench-rust/target/release/.fingerprint/getrandom-6df4dbbcffb370b7/build-script-build-script-build
  • bench-rust/target/release/build/getrandom-7c74517ba22e6184/output
  • bench-rust/target/release/.fingerprint/bench-hashbrown-415e0f0b67b371d2/bin-bench-hashbrown.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • bench-rust/Cargo.toml
  • bench-rust/src/main.rs

Comment on lines +37 to +42
func median(arr []uint64) uint64 {
	s := make([]uint64, len(arr))
	copy(s, arr)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	return s[len(s)/2]
}

⚠️ Potential issue | 🟡 Minor

Median computation is biased for even-sized samples.

On Line 41, median returns s[len(s)/2]. With 10 measured runs, this picks the upper middle value (6th), not the true median-of-10.

Suggested fix
 func median(arr []uint64) uint64 {
 	s := make([]uint64, len(arr))
 	copy(s, arr)
 	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
-	return s[len(s)/2]
+	mid := len(s) / 2
+	if len(s)%2 == 0 {
+		return (s[mid-1] + s[mid]) / 2
+	}
+	return s[mid]
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func median(arr []uint64) uint64 {
	s := make([]uint64, len(arr))
	copy(s, arr)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	return s[len(s)/2]
}

func median(arr []uint64) uint64 {
	s := make([]uint64, len(arr))
	copy(s, arr)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	mid := len(s) / 2
	if len(s)%2 == 0 {
		return (s[mid-1] + s[mid]) / 2
	}
	return s[mid]
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-go/main.go` around lines 37 - 42, The median function (median, using
arr and local s) currently returns s[len(s)/2] which biases the result for
even-length slices; modify median to handle even counts by returning the average
of the two middle values: for n := len(s), if n==0 return 0 (or appropriate zero
value), if n%2==1 return s[n/2], else return (s[n/2-1]+s[n/2])/2 using uint64
arithmetic to preserve the function signature and avoid bias for even-sized
samples.

joshuaisaact and others added 2 commits March 24, 2026 09:02
Disproves cache-density hypothesis. On M4's 16MB L2 where both
fingerprint arrays fit, elastic hash is *faster* relative to abseil
than on x86. The win is cache lines touched per probe, not L2 vs L3
spill. Fixed linux-only timers in zig benchmarks for macOS/ARM.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
BENCHMARK-M4.md (2)

81-82: Unconventional stderr redirection pattern.

Line 81 redirects stderr to the results log (2> results-m4-elastic.log), which suggests the benchmark output goes to stderr rather than stdout. This is unconventional since stderr is typically reserved for errors and warnings, not results.

Consider updating the Zig benchmark to output results to stdout, then use standard output redirection:

zig build autobench-strings -Doptimize=ReleaseFast > results-m4-elastic.log 2>/dev/null

This pattern (used for Abseil at line 77 and others) is more conventional and makes it clearer where results vs errors are captured.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@BENCHMARK-M4.md` around lines 81 - 82, The benchmark invocation currently
redirects stderr into the results file using the command string "zig build
autobench-strings -Doptimize=ReleaseFast 2> results-m4-elastic.log"; change the
benchmark to emit results to stdout and update the invocation to redirect stdout
into results (and optionally silence stderr) so results-m4-elastic.log contains
only benchmark output; update the documented command in BENCHMARK-M4.md (the
"zig build autobench-strings -Doptimize=ReleaseFast" line) to use stdout
redirection and suppress or separate stderr as appropriate.

37-61: Consider providing automated build scripts.

The manual build commands are detailed but complex, especially the Abseil compilation with pkg-config fallback (lines 40-50). This complexity makes the setup error-prone and difficult to maintain.

💡 Consider creating a build script

Create a build-benchmarks.sh script that:

  • Detects the platform (macOS/Linux)
  • Tries pkg-config and falls back to explicit paths automatically
  • Builds all benchmark executables in sequence
  • Reports any build failures clearly

This would reduce user friction and make the setup more maintainable.

Example structure:

#!/bin/bash
set -e

echo "Building Abseil benchmark..."
if pkg-config --exists absl_hash; then
    g++ -O3 -march=native ... $(pkg-config ...)
else
    echo "pkg-config failed, trying explicit paths..."
    g++ -O3 -march=native ... -I/opt/homebrew/include ...
fi

echo "Building Zig benchmarks..."
zig build autobench-strings -Doptimize=ReleaseFast

# ... etc for Rust and Go
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@BENCHMARK-M4.md` around lines 37 - 61, The benchmark README currently lists
complex manual build commands (the Abseil g++ command with pkg-config fallback
and separate Zig/Rust/Go steps) which is error-prone; create a
build-benchmarks.sh script that detects OS (macOS vs Linux), attempts pkg-config
for Abseil and falls back to explicit include/lib paths, runs the Abseil g++
build (the long g++ line shown), then runs "zig build autobench-strings
-Doptimize=ReleaseFast", "cd bench-rust && cargo build --release", and "cd
bench-go && go build -o bench-go .", failing fast with clear error messages and
exit codes so users can build all benchmarks automatically and the README can
reference this script.
cross-lang-results.md (1)

110-115: Address the logical tension between emphasizing cross-language comparison and the "only apples-to-apples" caveat.

The caveat in lines 114-115 states that "the only truly apples-to-apples comparison is elastic hash (Zig) vs abseil (C++)" - but this somewhat contradicts the document's primary focus on cross-language comparisons. Consider either:

  • Reframing to emphasize the Zig vs C++ comparison as the most rigorous result, with cross-language data providing supplementary context
  • Softening the caveat to note that while cross-language comparisons include runtime/compiler differences, they still offer valuable insights

Additionally, the document would benefit from discussing statistical significance or measurement variance. While "median of 10 runs" is mentioned, there's no analysis of variance or whether observed differences exceed noise levels.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cross-lang-results.md` around lines 110 - 115, Update the "Remaining fairness
caveats" section to resolve the tension: explicitly state that the elastic hash
(Zig) vs abseil (C++) comparison is the most rigorous "apples-to-apples" result
while framing other cross-language results as supplementary context that reflect
runtime/compiler differences; replace the absolute phrasing "the only truly
apples-to-apples" with softer wording that highlights rigor (e.g., "the most
controlled apples-to-apples") and call out that cross-language comparisons are
informative but not strictly equivalent. Also add a short paragraph explaining
the measurement uncertainty for "median of 10 runs": report standard deviation
(or 95% confidence intervals) and whether observed differences exceed
measurement noise, and suggest including a simple statistical indicator (stddev,
CI, or boxplot) alongside medians to show variance. Ensure changes reference the
existing section title "Remaining fairness caveats", the specific comparison
"elastic hash (Zig) vs abseil (C++)", and the current practice "median of 10
runs".
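For illustration, the variance reporting this comment asks for could look like the Go sketch below. The helper name `medianAndStddev` and the sample timings are made up for the example; this is not code from the repo, just one way to report a spread alongside the median-of-10:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// medianAndStddev returns the median and sample standard deviation of a
// set of timing measurements, so readers can judge whether observed gaps
// between implementations exceed run-to-run noise.
func medianAndStddev(runsUs []float64) (median, stddev float64) {
	s := append([]float64(nil), runsUs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 0 {
		median = (s[mid-1] + s[mid]) / 2 // average the two middle values
	} else {
		median = s[mid]
	}
	var mean, sumSq float64
	for _, v := range s {
		mean += v
	}
	mean /= float64(len(s))
	for _, v := range s {
		d := v - mean
		sumSq += d * d
	}
	stddev = math.Sqrt(sumSq / float64(len(s)-1)) // sample (n-1) stddev
	return median, stddev
}

func main() {
	// Fabricated example timings (microseconds) for 10 runs.
	runs := []float64{9488, 9512, 9470, 9601, 9455, 9530, 9490, 9477, 9520, 9498}
	m, sd := medianAndStddev(runs)
	fmt.Printf("median=%.1fus stddev=%.1fus\n", m, sd)
}
```

Reporting even this single extra number per configuration would let readers check that, say, a 1.8x gap is far outside one standard deviation of noise.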
src/autobench-strings.zig (1)

41-44: Minor: Returns upper-median for even-sized array.

With MEASURED = 10, this returns element at index 5 rather than averaging indices 4 and 5. For benchmark reporting this is acceptable (consistent "upper median"), but worth noting for precision.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/autobench-strings.zig` around lines 41 - 44, The current median function
(median, using sort.insertion and MEASURED) returns the upper-median for
even-sized arrays; to fix, update median to handle even MEASURED by sorting
as-is then computing the average of the two middle elements (use indices
MEASURED/2 - 1 and MEASURED/2) and return their mean as u64 (handle potential
overflow appropriately, e.g., (low / 2) + (high / 2) or use checked arithmetic);
alternatively if you prefer keeping upper-median behavior, add a clear comment
in the median function documenting that it intentionally returns the upper
median for even MEASURED.
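On the overflow point raised above: for two adjacent elements of a sorted slice, `lo + (hi-lo)/2` avoids overflow entirely and gives the floor of the true mean, whereas `lo/2 + hi/2` can be one less than the floor when both values are odd. A minimal Go sketch of the idea (illustrative; the repo's code in question is Zig, not Go):

```go
package main

import "fmt"

// avgNoOverflow returns floor((lo+hi)/2) without risking overflow in
// lo+hi. Assumes lo <= hi, which holds for adjacent sorted elements.
func avgNoOverflow(lo, hi uint64) uint64 {
	return lo + (hi-lo)/2
}

func main() {
	fmt.Println(avgNoOverflow(4, 8)) // 6
	const max = ^uint64(0)
	// Near the top of the range, lo+hi would wrap; this form stays exact.
	fmt.Println(avgNoOverflow(max-1, max))
}
```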
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@BENCHMARK-M4.md`:
- Around line 103-107: The current manual swap of src/autobench.zig to run the
shuffled verification benchmark is fragile and lacks error handling; add a
proper build target in build.zig that compiles and runs
src/autobench-strings-verify.zig (e.g., create an executable named
"autobench-verify" and a run step that the "autobench-verify" step depends on)
and replace the swap-and-build shell sequence with a single invocation of that
new target (zig build autobench-verify), or if you must keep the swap approach
wrap the swap/restore in a safe cleanup handler (trap/EXIT) to always restore
src/autobench.zig and remove the .bak on exit/failure so autobench and
autobench-strings-verify.zig are not left out of sync.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ad92223-b64f-4c85-90c4-af44175198bd

📥 Commits

Reviewing files that changed from the base of the PR and between 3be46ed and 93444af.

⛔ Files ignored due to path filters (4)
  • results-m4-abseil.log is excluded by !**/*.log
  • results-m4-elastic.log is excluded by !**/*.log
  • results-m4-go.log is excluded by !**/*.log
  • results-m4-rust.log is excluded by !**/*.log
📒 Files selected for processing (5)
  • BENCHMARK-M4.md
  • README.md
  • cross-lang-results.md
  • src/autobench-strings-verify.zig
  • src/autobench-strings.zig
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/autobench-strings-verify.zig

Comment on lines +9 to +34
## Apple Silicon M4 results

Tested on M4 (~16MB shared L2). The hypothesis was that elastic hash's advantage was cache-density-specific: our 1MB fingerprint array fits in x86's ~512KB L2 while abseil's 2MB control bytes spill to L3. On M4, both should fit in L2, so the advantage should shrink or disappear.

**It didn't.** It got bigger.

### Shuffled hit lookup (us, lower is better)

| Load | Elastic (Zig) | Abseil (C++) | Ratio |
|------|--------------|-------------|-------|
| 10% | **719** | 2,861 | **3.98x** |
| 25% | **2,276** | 10,169 | **4.47x** |
| 50% | **8,863** | 22,984 | **2.59x** |
| 75% | **15,972** | 33,624 | **2.11x** |
| 90% | **22,118** | 41,671 | **1.88x** |
| 99% | **25,748** | 46,543 | **1.81x** |

### x86 vs M4 comparison (shuffled hit lookup at 50% load)

| Platform | Elastic | Abseil | Gap |
|----------|---------|--------|-----|
| x86 (~512KB L2) | 11,119 | 19,312 | **1.74x** |
| M4 (~16MB L2) | 8,863 | 22,984 | **2.59x** |

The advantage grew from 1.74x to 2.59x. The tiered metadata advantage is not about L2 vs L3 spill -- it's about cache lines touched per probe. Separated, dense fingerprint arrays mean fewer cache line fetches under random access, regardless of which cache level they live in.
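The cache-lines-per-probe claim can be made concrete with a back-of-envelope calculation. This is a sketch under assumed parameters (64-byte cache lines, 16-slot probe groups, 1-byte fingerprints, 16-byte entries, line-aligned groups), not measured values from the benchmarks:

```go
package main

import "fmt"

// linesTouched estimates how many cache lines are fetched to scan `slots`
// probe slots whose per-slot metadata occupies bytesPerSlot bytes,
// assuming the group starts on a line boundary. Illustrative model only.
func linesTouched(slots, bytesPerSlot, lineBytes int) int {
	return (slots*bytesPerSlot + lineBytes - 1) / lineBytes
}

func main() {
	// Dense, separate fingerprint array: 1 byte of metadata per slot,
	// so a 16-slot group fits in a single 64-byte line.
	fmt.Println("dense fingerprints:", linesTouched(16, 1, 64), "line(s) per group")

	// Hypothetical layout with metadata interleaved next to 16-byte
	// entries: the same 16 slots now span 16*17 bytes of memory.
	fmt.Println("interleaved:", linesTouched(16, 1+16, 64), "line(s) per group")
}
```

Under these assumptions the dense layout touches 1 line per group versus 5 for the interleaved one, and that ratio is independent of whether those lines come from L2 or L3, which matches the observed behavior on both platforms.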


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if documentation mentions specific CPU models
rg -i "cpu|processor|intel|amd|apple m[0-9]" --type=md -C 2

Repository: joshuaisaact/elastic-hash

Length of output: 1433


🌐 Web query:

Apple M4 chip L2 cache size specifications official

💡 Result:

Apple does not officially publish detailed L2 cache sizes for the M4 family chips in their specifications or press releases. Official Apple sources like the M4 announcement, iPad Pro specs, MacBook Pro tech specs, and MacBook Air specs mention CPU core counts, GPU, Neural Engine, memory bandwidth, and unified memory, but no L2 cache details. Third-party sources based on analysis provide the following:

  • Base M4 (9-10 core CPU: 3-4 P-cores + 4-6 E-cores): P-core L2 cache 16 MB (shared per P-core cluster); E-core L2 cache 4 MB (shared per E-core cluster).
  • M4 Pro (12-14 core CPU: 8-10 P-cores + 4 E-cores): similar per-cluster L2 — P-cores likely 16 MB per cluster (two P-clusters for 10 P-cores), E-cores 4 MB.
  • M4 Max (14-16 core CPU: 10-12 P-cores + 4 E-cores): P-core L2 32 MB (per third-party analysis, possibly shared differently); E-core L2 4 MB.

Sources cite Wikipedia, teardowns, and benchmarks. P-cores are clustered (e.g., base M4 has one 4 P-core cluster sharing 16 MB L2), E-cores share 4 MB L2. No official confirmation; Apple silicon cache details are typically reverse-engineered.


🏁 Script executed:

# Check more context around the M4 and x86 cache mentions
rg -B5 -A5 "~16MB|M4.*L2|x86.*512KB" --type=md cross-lang-results.md

Repository: joshuaisaact/elastic-hash

Length of output: 1610


Clarify M4 cache hierarchy details; x86 specification is adequate.

The document already specifies "AMD/Intel ~512KB L2" for x86, so generalization is not a concern. However, the M4 claim needs clarification:

  • M4 P-core L2 is indeed ~16MB, but this is per P-core cluster, not a single shared pool across all cores
  • Different M4 variants have different L2 configurations (M4 Pro similar, M4 Max has 32MB P-core L2)
  • The phrasing "~16MB shared L2" is imprecise without context

Consider noting which specific M4 variant was tested, or clarifying that the 16MB refers to the P-core cluster L2 cache. The performance conclusion about cache lines touched per probe remains sound regardless.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cross-lang-results.md` around lines 9 - 34, Update the "Apple Silicon M4
results" section to clarify L2 cache details: state which M4 variant was tested
(or note that the exact variant is unspecified), change the phrase "~16MB shared
L2" to explicitly say "≈16MB P‑core cluster L2" (or "≈16MB per P‑core cluster
L2"), and add a brief parenthetical that different M4 variants have different
P‑core L2 sizes (e.g., M4 Max ≈32MB) so readers understand the scope of the
measurement; keep the performance conclusion about cache lines/tiered metadata
unchanged.

…mpetitors

Size sweep at 50% load with shuffled access on Apple M4. Elastic hash
beats abseil (1.3-2.4x), Rust hashbrown+ahash (1.2-3x), and Go
swiss.Map (1.8-3.5x) at every tested size. The x86 finding that small
tables favored abseil does not hold on M4. Added proper build step for
shuffled verification benchmark.
@joshuaisaact joshuaisaact changed the title cross-language benchmark: elastic hash vs abseil, Rust hashbrown, Go swiss.Map cross-language benchmark: elastic hash 2-3x faster on both x86 and Apple M4 Mar 24, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
cross-lang-results.md (1)

11-11: ⚠️ Potential issue | 🟡 Minor

Clarify cache-size claims; current sentence is self-contradictory and imprecise.

The phrase “1MB fingerprint array fits in x86’s ~512KB L2” is mathematically incorrect, and “M4 (~16MB shared L2)” should be scoped as P-core cluster L2 (variant-dependent). Please revise this sentence to avoid misleading the cache argument.

Proposed wording fix
-Tested on M4 (~16MB shared L2). The hypothesis was that elastic hash's advantage was cache-density-specific: our 1MB fingerprint array fits in x86's ~512KB L2 while abseil's 2MB control bytes spill to L3. On M4, both should fit in L2, so the advantage should shrink or disappear.
+Tested on Apple M4 (≈16MB P-core cluster L2 on base M4; cache layout varies by M4 variant). The hypothesis was that elastic hash's advantage was cache-density-specific: elastic’s metadata footprint is smaller than Abseil’s, so on x86 (~512KB L2) Abseil should spill earlier to lower cache levels. On M4, both metadata regions should fit more comfortably in upper caches, so the advantage should shrink or disappear.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cross-lang-results.md` at line 11, The sentence claiming “1MB fingerprint
array fits in x86’s ~512KB L2” is incorrect and the M4 L2 description is
ambiguous; update the sentence that discusses cache sizes (the line referencing
“1MB fingerprint array” and “M4 (~16MB shared L2)”) to correctly state that a
1MB fingerprint array exceeds a typical per-core x86 L2 (~512KB) and instead
note that on Apple M4 the P‑core cluster’s shared L2 (variant-dependent, ~16MB)
can accommodate both our 1MB fingerprint array and abseil’s 2MB control bytes,
so the L2-fitting advantage should shrink or disappear—also scope “M4”
explicitly to the P‑core cluster L2.
🧹 Nitpick comments (2)
README.md (1)

73-76: Use architecture-neutral SIMD wording here.

“SSE2 vector comparison” is x86-specific, but this README also presents Apple M4 measurements. Consider documenting this as generic SIMD (SSE2 on x86 / NEON on ARM) to keep the architecture section accurate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 73 - 76, Update the "SIMD bucketed probing" wording
to be architecture-neutral: replace the phrase "SSE2 vector comparison" with
something like "SIMD vector comparison (e.g., SSE2 on x86, NEON on ARM)" in the
README section titled "SIMD bucketed probing" so the description covers both x86
and ARM implementations while leaving the rest of the bullets (16-element
buckets, 8-bit fingerprints) unchanged.
bench-rust/src/main.rs (1)

112-115: Output format inconsistency with ahash benchmark.

The siphash benchmark outputs only lookup_us while the ahash version outputs all four metrics. If the comparison scripts expect uniform output, this will cause parsing issues. Consider either:

  • Matching the ahash output format with all metrics
  • Clearly documenting that siphash results are reference-only and excluded from automated parsing
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-rust/src/main.rs` around lines 112 - 115, The current println in the
siphash benchmark only emits lookup_us
(println!("RESULT\timpl=rust-siphash\tn={}\tload={}\tlookup_us={}", n, load_pct,
median(&mut lkp))), causing output format mismatch with the ahash benchmark;
update this print to emit the same four metrics the ahash runner prints (e.g.,
n, load_pct and the three timing metrics including lookup_us plus the other
timing fields used by ahash) so automated parsers can consume siphash results
identically, ensuring you read the same variables used in the benchmark (n,
load_pct, median(&mut lkp) and the other timing/stat vars) and format them in
the same RESULT\timpl=... schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bench-rust/src/main.rs`:
- Around line 84-88: The RESULT line printed by the println! macro currently
emits an extra "impl=rust-ahash" field and places miss_us before delete_us,
which breaks bench-v2.sh parsing; update the println! invocation (the formatting
string in main where println!(... ) is called) to match the expected order and
field names: emit "RESULT\tn={}\tload={}\tinsert_us={}\tlookup_us={}\tdelete_us={}\tmiss_us={}" (remove the impl= field and swap delete_us and
miss_us positions) while keeping the same variables (n, load_pct, median(&mut
ins), median(&mut lkp), median(&mut del), median(&mut mis)); if you still need
to identify the implementation, write that name to stderr separately instead of
in the RESULT line.

In `@bench-strings-verify.cpp`:
- Around line 73-74: The printf uses %lu for lkp_ordered and lkp_shuffled which
are uint64_t values; make the format portable by including <inttypes.h> and
using the PRIu64 macro (e.g., use "%" PRIu64) for those fields while keeping n
and load_pct as-is—update the printf that prints ABSEIL, n, load_pct,
median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED) to use
PRIu64 for the two uint64_t results and add the inttypes.h include.

In `@README.md`:
- Around line 111-115: Update the fenced code block containing the commands "zig
build test", "zig build bench", and "bash bench-v2.sh" to include the language
identifier "bash" after the opening backticks (i.e., change ``` to ```bash) so
the block conforms to Markdown lint rule MD040 and improves syntax highlighting.

---

Duplicate comments:
In `@cross-lang-results.md`:
- Line 11: The sentence claiming “1MB fingerprint array fits in x86’s ~512KB L2”
is incorrect and the M4 L2 description is ambiguous; update the sentence that
discusses cache sizes (the line referencing “1MB fingerprint array” and “M4
(~16MB shared L2)”) to correctly state that a 1MB fingerprint array exceeds a
typical per-core x86 L2 (~512KB) and instead note that on Apple M4 the P‑core
cluster’s shared L2 (variant-dependent, ~16MB) can accommodate both our 1MB
fingerprint array and abseil’s 2MB control bytes, so the L2-fitting advantage
should shrink or disappear—also scope “M4” explicitly to the P‑core cluster L2.

---

Nitpick comments:
In `@bench-rust/src/main.rs`:
- Around line 112-115: The current println in the siphash benchmark only emits
lookup_us (println!("RESULT\timpl=rust-siphash\tn={}\tload={}\tlookup_us={}", n,
load_pct, median(&mut lkp))), causing output format mismatch with the ahash
benchmark; update this print to emit the same four metrics the ahash runner
prints (e.g., n, load_pct and the three timing metrics including lookup_us plus
the other timing fields used by ahash) so automated parsers can consume siphash
results identically, ensuring you read the same variables used in the benchmark
(n, load_pct, median(&mut lkp) and the other timing/stat vars) and format them
in the same RESULT\timpl=... schema.

In `@README.md`:
- Around line 73-76: Update the "SIMD bucketed probing" wording to be
architecture-neutral: replace the phrase "SSE2 vector comparison" with something
like "SIMD vector comparison (e.g., SSE2 on x86, NEON on ARM)" in the README
section titled "SIMD bucketed probing" so the description covers both x86 and
ARM implementations while leaving the rest of the bullets (16-element buckets,
8-bit fingerprints) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2b7783f9-dcc6-44a1-9a40-ee024b6920e8

📥 Commits

Reviewing files that changed from the base of the PR and between 93444af and a18b161.

📒 Files selected for processing (7)
  • README.md
  • bench-go/main.go
  • bench-rust/src/main.rs
  • bench-strings-verify.cpp
  • build.zig
  • cross-lang-results.md
  • src/autobench-strings-verify.zig
🚧 Files skipped from review as they are similar to previous changes (2)
  • bench-go/main.go
  • build.zig

Comment on lines +84 to +88
println!(
"RESULT\timpl=rust-ahash\tn={}\tload={}\tinsert_us={}\tlookup_us={}\tmiss_us={}\tdelete_us={}",
n, load_pct,
median(&mut ins), median(&mut lkp), median(&mut mis), median(&mut del)
);

⚠️ Potential issue | 🟠 Major

RESULT output format incompatible with bench-v2.sh parser.

The output format differs from what bench-v2.sh expects in two ways:

  1. Extra impl= field: The parser expects n= as the 2nd field, but impl=rust-ahash shifts all positions.
  2. Field order mismatch: This outputs miss_us before delete_us, but the parser and C++ benchmark use delete_us before miss_us.

This will cause the comparison scripts to extract incorrect values or fail entirely.

Proposed fix to match the expected format
     println!(
-        "RESULT\timpl=rust-ahash\tn={}\tload={}\tinsert_us={}\tlookup_us={}\tmiss_us={}\tdelete_us={}",
+        "RESULT\tn={}\tload={}\tinsert_us={}\tlookup_us={}\tdelete_us={}\tmiss_us={}",
         n, load_pct,
-        median(&mut ins), median(&mut lkp), median(&mut mis), median(&mut del)
+        median(&mut ins), median(&mut lkp), median(&mut del), median(&mut mis)
     );

If you need to distinguish implementations in the output, consider updating bench-v2.sh to handle an optional impl= field, or output the implementation name to stderr separately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-rust/src/main.rs` around lines 84 - 88, The RESULT line printed by the
println! macro currently emits an extra "impl=rust-ahash" field and places
miss_us before delete_us, which breaks bench-v2.sh parsing; update the println!
invocation (the formatting string in main where println!(... ) is called) to
match the expected order and field names: emit
"RESULT\tn={}\tload={}\tinsert_us={}\tlookup_us={}\tdelete_us={}\tmiss_us={}" (remove the impl= field
and swap delete_us and miss_us positions) while keeping the same variables (n,
load_pct, median(&mut ins), median(&mut lkp), median(&mut del), median(&mut
mis)); if you still need to identify the implementation, write that name to
stderr separately instead of in the RESULT line.

Comment on lines +73 to +74
printf("ABSEIL\tn=%zu\tload=%d\tordered=%lu\tshuffled=%lu\n",
n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED));

⚠️ Potential issue | 🟡 Minor

Use PRIu64 for portable uint64_t formatting.

%lu assumes uint64_t is unsigned long, which isn't true on all platforms (e.g., 32-bit systems or some Windows configurations where uint64_t is unsigned long long).

🔧 Suggested fix

Add the include:

 #include <cstdio>
+#include <cinttypes>

Then update the printf:

-    printf("ABSEIL\tn=%zu\tload=%d\tordered=%lu\tshuffled=%lu\n",
-           n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED));
+    printf("ABSEIL\tn=%zu\tload=%d\tordered=%" PRIu64 "\tshuffled=%" PRIu64 "\n",
+           n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
printf("ABSEIL\tn=%zu\tload=%d\tordered=%lu\tshuffled=%lu\n",
n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED));
printf("ABSEIL\tn=%zu\tload=%d\tordered=%" PRIu64 "\tshuffled=%" PRIu64 "\n",
n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled, MEASURED));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bench-strings-verify.cpp` around lines 73 - 74, The printf uses %lu for
lkp_ordered and lkp_shuffled which are uint64_t values; make the format portable
by including <inttypes.h> and using the PRIu64 macro (e.g., use "%" PRIu64) for
those fields while keeping n and load_pct as-is—update the printf that prints
ABSEIL, n, load_pct, median_val(lkp_ordered, MEASURED), median_val(lkp_shuffled,
MEASURED) to use PRIu64 for the two uint64_t results and add the inttypes.h
include.

Comment on lines 111 to 115
```
zig build bench
zig build test # run tests
zig build bench # full benchmark
bash bench-v2.sh # comparison vs abseil (requires abseil-cpp)
```

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced code block.

This trips markdown lint (MD040). Use bash for this command block.

Proposed fix
-```
+```bash
 zig build test       # run tests
 zig build bench      # full benchmark
 bash bench-v2.sh     # comparison vs abseil (requires abseil-cpp)

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 111-111: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 111 - 115, Update the fenced code block containing
the commands "zig build test", "zig build bench", and "bash bench-v2.sh" to
include the language identifier "bash" after the opening backticks (i.e., change
``` to ```bash) so the block conforms to Markdown lint rule MD040 and improves
syntax highlighting.

