[Autoloop: tsb-perf-evolve] by github-actions[bot] · Pull Request #255 · githubnext/tsb

github-actions · 2026-04-30T19:00:38Z

🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.

Program Goal

Minimize fitness = tsb_mean_ms / pandas_mean_ms for Series.sortValues at n=100k. Lower is better; < 1.0 means tsb beats pandas.

Current best metric: 21.841 (tsb≈116ms / pandas≈5.34ms, iteration 28 — merged via #249)

Iteration 29: AoS scatter layout

Switch the LSD radix sort ping-pong buffers from SoA (6 separate typed arrays: _rxA_idx, _rxA_lo, _rxA_hi, _rxB_idx, _rxB_lo, _rxB_hi) to a single AoS layout (_rxA, _rxB — each element uses 3 consecutive uint32 words: [origRowIdx, loKey, hiKey]).
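As a rough illustration of the layout described above, the sketch below packs each element into 3 consecutive `uint32` words of one `Uint32Array`. Only the field order `[origRowIdx, loKey, hiKey]` comes from the PR text; the helper names `allocAoS`, `writeElem`, and `readElem` are hypothetical and are not taken from the tsb source.

```typescript
// Hypothetical helpers illustrating the AoS layout; not the tsb implementation.
const WORDS_PER_ELEM = 3; // [origRowIdx, loKey, hiKey]

// One buffer holds all three fields for n elements (12 bytes per element).
function allocAoS(n: number): Uint32Array {
  return new Uint32Array(n * WORDS_PER_ELEM);
}

// Write the three fields of one element into consecutive words.
function writeElem(
  buf: Uint32Array,
  slot: number,
  origRowIdx: number,
  loKey: number,
  hiKey: number,
): void {
  const base = slot * WORDS_PER_ELEM;
  buf[base] = origRowIdx;
  buf[base + 1] = loKey;
  buf[base + 2] = hiKey;
}

// Read the three fields of one element back as a plain array.
function readElem(buf: Uint32Array, slot: number): number[] {
  const base = slot * WORDS_PER_ELEM;
  return Array.from(buf.subarray(base, base + WORDS_PER_ELEM));
}
```

Under the previous SoA layout the same element would live at index `slot` in three different arrays, so the three stores would go to three unrelated addresses.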

With SoA, each scatter step writes 3 words to 3 separate large arrays at random positions, touching 3 separate cache lines per element. With AoS, all 3 writes target consecutive addresses in a single array, so they usually land in one cache line per element (a 12-byte record can occasionally straddle a line boundary), giving roughly 3× fewer cache-line evictions during scatter.
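The scatter step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: it assumes 8-bit digits (8 passes over a 64-bit key split into `loKey` and `hiKey`), a 256-entry histogram per pass, and keys that are already unsigned integers. The function names `scatterPassAoS` and `sortOrderAoS` are hypothetical, and the float-to-key mapping and NaN/null handling are not shown.

```typescript
const W = 3; // words per element: [origRowIdx, loKey, hiKey]

// One stable counting-sort pass on an 8-bit digit (assumed digit width).
function scatterPassAoS(src: Uint32Array, dst: Uint32Array, n: number, pass: number): void {
  const field = pass < 4 ? 1 : 2; // passes 0-3 read loKey, passes 4-7 read hiKey
  const shift = (pass & 3) * 8;
  const offsets = new Uint32Array(256);
  // Histogram of digit values.
  for (let i = 0; i < n; i++) {
    offsets[(src[i * W + field] >>> shift) & 0xff]++;
  }
  // Exclusive prefix sum turns counts into starting slots.
  let sum = 0;
  for (let d = 0; d < 256; d++) {
    const count = offsets[d];
    offsets[d] = sum;
    sum += count;
  }
  // Scatter: the three stores per element go to consecutive words of dst.
  for (let i = 0; i < n; i++) {
    const s = i * W;
    const d = (src[s + field] >>> shift) & 0xff;
    const t = offsets[d]++ * W;
    dst[t] = src[s];
    dst[t + 1] = src[s + 1];
    dst[t + 2] = src[s + 2];
  }
}

// Returns the original row indices in ascending key order.
function sortOrderAoS(lo: Uint32Array, hi: Uint32Array): Uint32Array {
  const n = lo.length;
  let a = new Uint32Array(n * W);
  let b = new Uint32Array(n * W);
  for (let i = 0; i < n; i++) {
    a[i * W] = i;
    a[i * W + 1] = lo[i];
    a[i * W + 2] = hi[i];
  }
  for (let pass = 0; pass < 8; pass++) {
    scatterPassAoS(a, b, n, pass);
    [a, b] = [b, a]; // ping-pong: the output of this pass feeds the next
  }
  const order = new Uint32Array(n);
  for (let i = 0; i < n; i++) order[i] = a[i * W];
  return order;
}
```

For example, `sortOrderAoS(Uint32Array.from([5, 1, 5, 0]), Uint32Array.from([0, 2, 0, 0]))` yields the row order `[3, 0, 2, 1]`; the two equal keys keep their input order because each pass is stable.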

Hypothesis: The 8×n random scatter writes are the dominant bottleneck at n=100k. Packing all three fields into one cache line per element should reduce cache pressure and improve throughput.
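A quick way to sanity-check the "one cache line per element" part of the hypothesis is to count how many lines a 12-byte record touches. The sketch below assumes 64-byte cache lines and a buffer whose first byte is line-aligned; neither assumption is stated in the PR, and typed-array alignment in a given JavaScript engine may differ. The function names are hypothetical.

```typescript
const LINE_BYTES = 64; // assumed cache-line size
const ELEM_BYTES = 12; // 3 × uint32 per element

// Number of cache lines covered by the record stored at `slot`.
function linesTouched(slot: number): number {
  const firstByte = slot * ELEM_BYTES;
  const lastByte = firstByte + ELEM_BYTES - 1;
  return Math.floor(lastByte / LINE_BYTES) - Math.floor(firstByte / LINE_BYTES) + 1;
}

// Fraction of the first `slots` records that straddle a line boundary.
function straddleFraction(slots: number): number {
  let straddling = 0;
  for (let k = 0; k < slots; k++) {
    if (linesTouched(k) === 2) straddling++;
  }
  return straddling / slots;
}
```

Under these assumptions 2 of every 16 records straddle a boundary, so an AoS scatter touches about 1.125 lines per element on average, compared with 3 for SoA. That is somewhat less than a full 3× reduction but close to it.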

Invariants preserved: same algorithm (8-pass LSD radix), same public signature, same NaN/null handling, same sort correctness. Tests unchanged.

Related issue: #189

Run: https://github.com/githubnext/tsessebe/actions/runs/25183052353

Generated by Autoloop · ● 4.4M

Switch the radix sort ping-pong buffers from SoA (_rxA_idx, _rxA_lo, _rxA_hi, _rxB_idx, _rxB_lo, _rxB_hi: 6 separate typed arrays) to a single AoS layout (_rxA, _rxB, where each element occupies 3 consecutive uint32 words: [origRowIdx, loKey, hiKey]).

With AoS, all three scatter writes per element target the same cache line (12 consecutive bytes), reducing random-write cache-line pressure ~3× versus the previous SoA layout where each write touched a separate cache line in a separate 1MB buffer.

Run: https://github.com/githubnext/tsessebe/actions/runs/25183052353

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot added the autoloop and automation labels Apr 30, 2026

mrjf marked this pull request as ready for review April 30, 2026 21:13

mrjf merged commit 36a2857 into main Apr 30, 2026
12 checks passed

mrjf deleted the autoloop/tsb-perf-evolve branch April 30, 2026 22:54

github-actions Bot mentioned this pull request May 2, 2026

[Autoloop: tsb-perf-evolve] #262

Merged

mrjf merged 1 commit into main from autoloop/tsb-perf-evolve
