-
Notifications
You must be signed in to change notification settings - Fork 34
feat(collection): Replace quickSort with pdqsort for performance and robustness #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(collection): Replace quickSort with pdqsort for performance and robustness #922
Conversation
This commit introduces a comprehensive benchmark harness to provide empirical evidence for the proposal to replace the existing quickSort implementation with a more performant and robust pdqsort algorithm. The harness evaluates performance across five critical data patterns to ensure a thorough comparison: - **Random:** Tests performance on typical, unsorted data. - **Sorted & Reverse Sorted:** Tests for handling of common presorted patterns. - **Few Unique:** Tests efficiency on data with high duplication. - **Pathological:** Tests robustness against inputs designed to cause worst-case behavior in quicksorts. The baseline implementation (`quickSortBaseline`) is a direct copy of the existing SDK code to ensure the comparison is fair and accurate. The results from this benchmark will be used to justify the enhancement in the accompanying pull request.
PR Health
Breaking changes
|
| Package | Change | Current Version | New Version | Needed Version | Looking good? |
|---|---|---|---|---|---|
| collection | Breaking | 1.19.1 | 1.20.0-wip | 2.0.0 Got "1.20.0-wip" expected >= "2.0.0" (breaking changes) |
This check can be disabled by tagging the PR with skip-breaking-check.
Changelog Entry ❗
| Package | Changed Files |
|---|---|
| package:collection | pkgs/collection/lib/src/algorithms.dart |
Changes to files need to be accounted for in their respective changelogs.
This check can be disabled by tagging the PR with skip-changelog-check.
Coverage ⚠️
| File | Coverage |
|---|---|
| pkgs/collection/benchmark/sort_comparison_benchmark.dart | 💔 Not covered |
| pkgs/collection/lib/src/algorithms.dart | 💔 91 % ⬇️ 9 % |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with skip-coverage-check.
API leaks ✔️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
| Package | Leaked API symbol | Leaking sources |
|---|
This check can be disabled by tagging the PR with skip-leaking-check.
License Headers ✔️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
| Files |
|---|
| no missing headers |
All source files should start with a license header.
This check can be disabled by tagging the PR with skip-license-check.
This addresses the failing 'Changelog Entry' CI check by adding the required entry for the quickSort implementation replacement.
|
Looks pretty convincing. I'll have to look at it a little more to see if it's so universally better that we want to replace the simple quicksort, or if we may want to have both. I don't know how many people actually use the sort functions in The only nit I noticed in a quick read-through is that I'd avoid calling @natebosch WDYT? |
| /// Centralized generation of datasets for all benchmarks. | ||
| /// | ||
| /// Ensures all algorithms are tested on the exact same data. | ||
| class DatasetGenerator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Don't use classes with only static members. Put the members into a separate library that can be imported with a prefix if desired.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right — using classes with only static members isn't ideal. I just wrote this file as a temporary setup so we could benchmark the new sort implementation against the old one. It’s not meant to be merged into the main codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been resolved in this commit.
| } | ||
| } | ||
|
|
||
| /// Represents the final aggregated result of a benchmark. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Drop the "Represents" word. All classes represent something. Just say what that is.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been resolved in the recent commit.
| final left = 2 * root + 1; | ||
| final right = 2 * root + 2; | ||
| var largest = root; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Here we could optimize:
var largest = root;
var rootKey = keyOf(elements[start + root]);
while (true) {
var child = 2 * root + 1; // Left child.
if (child < n) {
var largestKey = rootKey;
var childKey = keyOf(elements[start + child]);
if (compare(largestKey, childKey) < 0) {
largest = child;
largestKey = childKey;
}
child++; // Right child.
if (child < n) {
var childKey = keyOf(elements[start + child]);
if (compare(largestKey, childKey) < 0) {
largest = child;
largestKey = childKey;
}
}
if (largest != root) {
_pdqSwap(elements, start + root, start + largest);
root = largest;
continue;
}
}
break;
}This should avoid calling keyOf on the same element more than once, including the root. (I'll assume this only gets called if the root is not a leaf.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review!
The latest commit reflects these updates. Please let me know if there's anything else that needs attention!
|
Hi @lrn, thank you for the thoughtful review and feedback! This is a great discussion. Let me address your points directly.
That's the key question. I believe replacing it is the right move because
For these reasons, I don't see a scenario where a user would knowingly choose the older, less robust implementation, which argues against keeping both.
This was an excellent question, and it prompted me to add Benchmark Results: pdqsort (Enhancement) vs. SDK quickSort (Baseline) vs. List.sort
This shows that the goal of this PR isn't to replace
Excellent point, and thank you for catching that! You are absolutely right. I've just pushed a new commit that caches the keys for the pivot and the currently inspected element in the partitioning and insertion sort loops. This will avoid the redundant calls and improve performance when Thank you again for the review! Let me know if you have any other questions. |
This commit refactors the _pdqSiftDown helper function, which is used as part of the heapsort fallback mechanism within the quickSort implementation. The changes improve both performance and readability: 1- Key Caching: The key of the current largest element is now cached in a local variable (largestKey). This avoids multiple redundant calls to the keyOf function within the loop, reducing overhead. 2- Clearer Logic: The comparison logic has been restructured. Instead of potentially re-evaluating keyOf(elements[start + largest]) after the first comparison, the cached key is used, and the flow for comparing the root with its left and right children is now more explicit and easier to follow. 3- Early Exit: An early break is introduced for leaf nodes (when left >= n), which slightly simplifies the control flow. These changes do not alter the algorithm's behavior but make the implementation more efficient and maintainable.
Updated the comment for the BenchmarkResult class to enhance clarity by rephrasing it from "Represents the final aggregated result of a benchmark" to "The final aggregated result of a benchmark." This change aims to improve the readability of the code documentation.
…ckSort and pdqsort This commit introduces a new dataset generator for creating various data patterns used in benchmarking sorting algorithms. It also implements a comprehensive benchmark harness that compares the performance of the baseline quickSort and the enhanced pdqsort across multiple data conditions, including random, sorted, reverse sorted, few unique, and pathological datasets. The results will help evaluate the effectiveness of the pdqsort enhancement.
| if (compare(keyOf(elements[b]), keyOf(elements[c])) > 0) { | ||
| _pdqSwap(elements, b, c); | ||
| if (compare(keyOf(elements[a]), keyOf(elements[b])) > 0) { | ||
| _pdqSwap(elements, a, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can also optimize the keyOf computations.
var keyA = keyOf(elements[a]);
var keyB = keyOf(elements[b]);
var keyC = keyOf(elements[c]);
if (compare(keyA, keyB) > 0) {
_pdqSwap(elements, a, b);
(keyA, keyB) = (keyB, keyA);
}
if (compare(keyB, keyC) > 0) {
_pdqSwap(elements, b, c);
if (compare(keyA, keyC) > 0) {
_pdqSwap(elements, a, b);
}
}I think that should do it.
(It's an unrolled three-element insertion sort, basically.)
Could possibly do the swapping in local memory instead of in the array:
var elementA = elements[a];
var keyA = keyOf(elementA);
var elementB = elements[b];
var keyB = keyOf(elementB);
var elementC = elements[c];
var keyC = keyOf(elementC);
if (compare(keyA, keyB) > 0) {
(elementA, keyA, elementB, keyB) = (elementB, keyB, elementA, keyA);
}
if (compare(keyB, keyC) > 0) {
if (compare(keyA, keyC) > 0) {
(elementA, elementB, elementC) = (elementC, elementB, elementA);
} else {
(elementB, elementC) = (elementC, elementB);
}
}
elements[a] = elementA;
elements[b] = elementB;
elements[c] = elementC;I don't know if that's an advantage, but there should be fewer index checks than with array writes. With the cost of actually writing the elements back if they were already in order.
But more complicated, may not be worth the effort.
| const int _pdqInsertionSortThreshold = 24; | ||
|
|
||
| /// Computes the base-2 logarithm of [n]. | ||
| int _log2(int n) => n == 0 ? 0 : (log(n) / ln2).floor(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can just be => n.bitLength - 1;.
It seems to work for positive numbers, even when compiled to JS.
| target.setRange(min + 1, targetOffset + i + 1, target, min); | ||
| target[min] = element; | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing end-of-file EOL.
(Did you run dart format? 😉 )
|
I think this looks fine, so LGTM from me. @natebosch WDYT? |
| if (i.isOdd) base[i], | ||
| ]; | ||
| return List.generate(count, (_) => List<int>.from(pathological)); | ||
| } No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing EOL.
| List<List<int>> _generateRandom() { | ||
| final r = Random(12345); | ||
| return List.generate( | ||
| count, (_) => List.generate(size, (_) => r.nextInt(2000))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use r.nextInt(size) to give it the same spread as the sorted lists. A size of 50000 and only 2000 distinct elements should give an average of 25 of each element, which is (just) above the insertion-sort threshold. I'd increase to at least 4000, probably more, just to make sure that all the paths are tested in this case too. (Otherwise it's effectively the same as "random with few unique values".)
| } | ||
|
|
||
| List<List<int>> _generateSorted() { | ||
| final base = List.generate(size, (i) => i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This list is used a lot.
Consider having a top-level
/// Numbers from 0 up to [size] in ascending order.
final List<int> _sizeList = [for (var i= 0; i < size; i++) i];Then this would be just:
=> [for (var i = 0; i < count; i++) [..._sizeList]];(Yes, I like list literals better than constructors. Not everybody has to agree. 😅 )
|
|
||
| List<List<int>> _generateReverse() { | ||
| final base = List.generate(size, (i) => size - 1 - i); | ||
| return List.generate(count, (_) => List<int>.from(base)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(And this could be
=> [for (var i = 0; i < count; i++) [..._sizeList.reversed]];)
| // for quicksort implementations by promoting unbalanced partitions. | ||
| final pathological = <int>[ | ||
| for (int i = 0; i < size; i++) | ||
| if (i.isEven) base[i], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be:
... <int> [
for (int i = 0; i < size; i += 2) base[i],
for (int i = 1; i < size; i += 2) base[i],
]I think it's more direct.
|
|
||
| List<int> getNextList() { | ||
| // Cloning the list is crucial so each run sorts an unsorted list. | ||
| return List<int>.from(datasets[_iteration++ % datasets.length]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you clone the list content anyway, there is no need to have 128 different instances of the same list in the sorted/reversed sets. One should be enough.
(This cloning takes time. It's probably unavoidable, we'd have to allocate a lot of lists ahead of time to avoid that, and the memory pressure starts mattering. It should be the same time for all the benchmarks, so at least it's a consistent overhead.
It should use .toList(), not List.from. That's likely to be more performant. The list knows its own length and the fastest way to copy its elements.
I'd not use % with a variable value. That's also expensive. Probably not remotely measurable compared to sorting 50000 elements, though. Still,
could be:
var dataset = datasets[_iteration];
iteration++;
if (iteration == datasets.length) iteration = 0;
return dataset.toList();)
| /// A base class for our sort benchmarks to reduce boilerplate. | ||
| /// Note: We extend `BenchmarkBase` for its structure but will use our own | ||
| /// timing. | ||
| abstract class SortBenchmarkBase extends BenchmarkBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can generate the dataset in a setUp function instead of creating it ahead of time.
That would also allow doing benchmarks with varying sizes of inputs, either all small, all large, or mixed.
This PR enhances the performance and robustness of the default unstable sort in
package:collectionby replacing the existingquickSortimplementation with a modern, state-of-the-art Pattern-Defeating Quicksort (pdqsort) algorithm. The implementation is a Dart port of the concepts from the canonical C++ version by Orson Peters.Motivation
The
pdqsortalgorithm is the default unstable sort in other high-performance ecosystems (like Rust) for good reason. It offers two key advantages over the existing implementation:This change is a pure, non-breaking enhancement that makes the standard library sort faster and safer with no API changes or regressions.
Benchmark Results
To validate this enhancement, a comprehensive benchmark was created, comparing the new
pdqsortimplementation against a direct copy of the existing SDKquickSortacross five critical data patterns.The results demonstrate a consistent and significant performance improvement across all tested scenarios, with no regressions.
Benchmark Results:
pdqsort (Enhancement) vs. SDK quickSort (Baseline)
(Results are the average/median time in microseconds to sort 50,000 integers, aggregated over 12 runs.)
Contribution guidelines:
dart format.Many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.
Note: The Dart team is trialing Gemini Code Assist. Don't take its comments as final Dart team feedback. Use the suggestions if they're helpful; otherwise, wait for a human reviewer.