-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add heapsort fallback in select_nth_unstable
#106997
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -831,6 +831,15 @@ fn partition_at_index_loop<'a, T, F>( | |||||||||||||||||||||||||||
) where | ||||||||||||||||||||||||||||
F: FnMut(&T, &T) -> bool, | ||||||||||||||||||||||||||||
{ | ||||||||||||||||||||||||||||
// Limit the amount of iterations and fall back to heapsort, similarly to `slice::sort_unstable`. | ||||||||||||||||||||||||||||
// This lowers the worst case running time from O(n^2) to O(n log n). | ||||||||||||||||||||||||||||
// FIXME: Investigate whether it would be better to use something like Median of Medians | ||||||||||||||||||||||||||||
// or Fast Deterministic Selection to guarantee O(n) worst case. | ||||||||||||||||||||||||||||
let mut limit = usize::BITS - v.len().leading_zeros(); | ||||||||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. suggestion: if you're doing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just copied the limit calculation from the sorting code, but I can change it if that's preferred. In that case, I'll also change it for sort I guess. |
||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
// True if the last partitioning was reasonably balanced. | ||||||||||||||||||||||||||||
let mut was_balanced = true; | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
loop { | ||||||||||||||||||||||||||||
// For slices of up to this length it's probably faster to simply sort them. | ||||||||||||||||||||||||||||
const MAX_INSERTION: usize = 10; | ||||||||||||||||||||||||||||
|
@@ -839,6 +848,18 @@ fn partition_at_index_loop<'a, T, F>( | |||||||||||||||||||||||||||
return; | ||||||||||||||||||||||||||||
} | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
if limit == 0 { | ||||||||||||||||||||||||||||
heapsort(v, is_less); | ||||||||||||||||||||||||||||
return; | ||||||||||||||||||||||||||||
} | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
// If the last partitioning was imbalanced, try breaking patterns in the slice by shuffling | ||||||||||||||||||||||||||||
// some elements around. Hopefully we'll choose a better pivot this time. | ||||||||||||||||||||||||||||
if !was_balanced { | ||||||||||||||||||||||||||||
break_patterns(v); | ||||||||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. unsure: This looks like it makes it randomized? Does it need both randomization and the heapsort fallback? Could we just use one or the other? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To be honest, I just copied this from the sort implementation. But the fact that sort does need both the pattern breaking and the heapsort fallback makes it seem to me like they're both needed. Alternatively, we could just remove this check, decrement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, I see, this is exactly what sort is doing rust/library/core/src/slice/sort.rs Lines 746 to 758 in 273c6c3
Makes sense to do the same thing, I guess. |
||||||||||||||||||||||||||||
limit -= 1; | ||||||||||||||||||||||||||||
} | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
// Choose a pivot | ||||||||||||||||||||||||||||
let (pivot, _) = choose_pivot(v, is_less); | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
|
@@ -863,6 +884,7 @@ fn partition_at_index_loop<'a, T, F>( | |||||||||||||||||||||||||||
} | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
let (mid, _) = partition(v, pivot, is_less); | ||||||||||||||||||||||||||||
was_balanced = cmp::min(mid, v.len() - mid) >= v.len() / 8; | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
// Split the slice into `left`, `pivot`, and `right`. | ||||||||||||||||||||||||||||
let (left, right) = v.split_at_mut(mid); | ||||||||||||||||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wikipedia says that median-of-median is somewhat slow, and not used normally, so I don't know if we'd want to start there, but maybe using it as fallback instead of heapsort. But that's a discussion for another time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what I meant was to use median of medians as a fallback instead of heapsort to keep the fast average case of quickselect and still ensure O(n) worst case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think I agree that fallback to median-of-medians would make sense.
And it sounds like it is used in practice in some contexts, see: https://en.wikipedia.org/wiki/Introselect