@ctsk ctsk commented Oct 13, 2025

Which issue does this PR close?

This PR does not close an issue. It adds a reproducer for #17897.

What changes are included in this PR?

The PR includes a small benchmark that measures the performance of the MinMaxBytesAccumulator with a growing number of groups. Executing it shows a significant drop in throughput as the number of groups in the accumulator grows.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the functions Changes to functions implementation label Oct 13, 2025
@ctsk ctsk force-pushed the fix/min-max-reproduce branch from ad377e4 to 7f77730 Compare October 14, 2025 09:42
@alamb alamb changed the title Reproduce quadratic runtime in min_max_bytes Add benchmark min_max_bytes benchmark (Reproduce quadratic runtime in min_max_bytes) Oct 14, 2025
@alamb alamb left a comment

Thank you @ctsk

I ran this benchmark locally and also observed the increase in time:

nable flat sampling, or reduce sample count to 50.
min_max_bytes/10        time:   [1.9633 ms 1.9760 ms 1.9886 ms]
                        thrpt:  [41.195 Melem/s 41.458 Melem/s 41.726 Melem/s]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
min_max_bytes/20        time:   [4.4614 ms 4.4844 ms 4.5076 ms]
                        thrpt:  [36.348 Melem/s 36.536 Melem/s 36.724 Melem/s]
min_max_bytes/50        time:   [14.882 ms 14.942 ms 15.004 ms]
                        thrpt:  [27.299 Melem/s 27.413 Melem/s 27.523 Melem/s]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
min_max_bytes/100       time:   [41.292 ms 41.448 ms 41.626 ms]
                        thrpt:  [19.680 Melem/s 19.765 Melem/s 19.839 Melem/s]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
Benchmarking min_max_bytes/150: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.4s, or reduce sample count to 50.
min_max_bytes/150       time:   [84.874 ms 85.346 ms 85.841 ms]
                        thrpt:  [14.315 Melem/s 14.398 Melem/s 14.478 Melem/s]
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
Benchmarking min_max_bytes/200: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 13.4s, or reduce sample count to 30.
min_max_bytes/200       time:   [131.32 ms 132.06 ms 132.83 ms]
                        thrpt:  [12.334 Melem/s 12.406 Melem/s 12.477 Melem/s]
Benchmarking min_max_bytes/300: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 36.9s, or reduce sample count to 10.
min_max_bytes/300       time:   [304.79 ms 321.95 ms 340.51 ms]
                        thrpt:  [7.2174 Melem/s 7.6336 Melem/s 8.0632 Melem/s]
Found 14 outliers among 100 measurements (14.00%)
  14 (14.00%) high mild
Benchmarking min_max_bytes/400: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 45.2s, or reduce sample count to 10.
min_max_bytes/400       time:   [438.54 ms 442.35 ms 448.10 ms]
                        thrpt:  [7.3126 Melem/s 7.4077 Melem/s 7.4721 Melem/s]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe
Benchmarking min_max_bytes/500: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 128.3s, or reduce sample count to 10.
min_max_bytes/500       time:   [1.2893 s 1.3077 s 1.3226 s]
                        thrpt:  [3.0969 Melem/s 3.1323 Melem/s 3.1770 Melem/s]
Found 16 outliers among 100 measurements (16.00%)
  4 (4.00%) low severe
  8 (8.00%) high mild
  4 (4.00%) high severe
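
To put a number on the slowdown, here is a rough sketch that estimates the scaling exponent k in time ∝ n^k from the median times reported above. It assumes the benchmark parameter n is proportional to the amount of input processed, which the reported time × throughput figures suggest (about 82k elements at n = 10 and about 4.1M at n = 500). The names `MEDIANS` and `scaling_exponent` are made up for this illustration and are not part of the benchmark in this PR.

```rust
/// Median times copied from the output above, as
/// (benchmark parameter, median time in seconds).
const MEDIANS: [(f64, f64); 9] = [
    (10.0, 1.9760e-3),
    (20.0, 4.4844e-3),
    (50.0, 14.942e-3),
    (100.0, 41.448e-3),
    (150.0, 85.346e-3),
    (200.0, 132.06e-3),
    (300.0, 321.95e-3),
    (400.0, 442.35e-3),
    (500.0, 1.3077),
];

/// Exponent k implied by two measurements under the model time ∝ n^k:
/// k = ln(t_b / t_a) / ln(n_b / n_a).
fn scaling_exponent(a: (f64, f64), b: (f64, f64)) -> f64 {
    (b.1 / a.1).ln() / (b.0 / a.0).ln()
}

fn main() {
    // Compare every measurement against the smallest one.
    let first = MEDIANS[0];
    for &point in &MEDIANS[1..] {
        println!(
            "{:>4} vs {:>4}: k = {:.2}",
            first.0,
            point.0,
            scaling_exponent(first, point)
        );
    }
}
```

If runtime were linear in the input size, k would stay close to 1. With these numbers every pair gives k above 1, and the estimate rises as the parameter grows, which is consistent with the falling throughput in the output.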

@alamb alamb changed the title Add benchmark min_max_bytes benchmark (Reproduce quadratic runtime in min_max_bytes) Add min_max_bytes benchmark (Reproduce quadratic runtime in min_max_bytes) Oct 14, 2025
@alamb alamb added this pull request to the merge queue Oct 14, 2025
Merged via the queue into apache:main with commit a4c95aa Oct 14, 2025
33 checks passed