Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected/undocumented binary_search behavior with duplicates #51817

Closed
scottjmaddox opened this issue Jun 26, 2018 · 6 comments
Closed

Unexpected/undocumented binary_search behavior with duplicates #51817

scottjmaddox opened this issue Jun 26, 2018 · 6 comments
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@scottjmaddox
Copy link

scottjmaddox commented Jun 26, 2018

slice::binary_search (and perhaps its cousins?) behave in a fairly unexpected (and undocumented) way in the presence of duplicates:

let s = &[0, 1, 1, 1, 1, 2];
// What I would expect:
assert_eq!(s.binary_search(&1), Ok(1));
// Actual result:
assert_eq!(s.binary_search(&1), Ok(4));

While not strictly a bug (since the documentation does not state a particular behavior in this case) it would be fairly simple to tweak the implementation to provide the more expected behavior of returning the first match. Either way, it would be preferable if the behavior were documented.

Making the change at this point might result in some unexpected code breakage, even though the behavior is not documented. In order to ease the transition, it may be best to rename the current implementation to reverse_binary_search or similar, and provide the expected implementation for binary_search.

@kennytm kennytm added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools labels Jun 26, 2018
@kennytm
Copy link
Member

kennytm commented Jun 26, 2018

The exact behavior is deliberately unspecified. In fact it is "documented" that the position is unspecified:

... the fourth could match any position in [1, 4].

let s = [0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
// ...
let r = s.binary_search(&1);
assert!(match r { Ok(1...4) => true, _ => false, });

See #45333 for a discussion.

(As discussed in #45333, we may want a binary_search_with_{upper,lower}_bound function which guarantee the behavior on duplicates. After that we could update the documentation of binary_search to refer to the new methods.)

@kennytm kennytm added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 26, 2018
@scottjmaddox
Copy link
Author

Thanks for your response. I completely missed that detail when reading the documentation. Perhaps an extra phrase/sentence could be added to the documentation summary to make it more explicit? I'm probably not alone in skipping reading the Examples section if the function's usage is self explanatory. It could be changed, for example, to:

If the value is found then Ok is returned, containing the index of the matching element. If there are multiple matches, then any one of the matches could be returned. If the value is not found then Err is returned, containing the index where a matching element could be inserted while maintaining sorted order.

@frewsxcv
Copy link
Member

@scottjmaddox That wording looks good to me! Do you want to open a pull request with that change? If not, I can 💁‍♂️

@frewsxcv
Copy link
Member

Also we may want to add the same addition to binary_search_by and binary_search_by_key

@scottjmaddox
Copy link
Author

Hey, sorry, just now seeing your response (it was buried in gmail). If you would be willing to submit the PR, that would be awesome!

@frewsxcv
Copy link
Member

Opened a PR: #54700

frewsxcv added a commit to frewsxcv/rust that referenced this issue Oct 7, 2018
bors added a commit that referenced this issue Oct 8, 2018
Clarify docs for when binary_search has many matches.

Fixes #51817.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants