Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a single binary search for array contains_range #385

Merged

Conversation

Dr-Emann
Copy link
Member

The rest can be done by checking a single memory location

@lemire
Copy link
Member

lemire commented Aug 29, 2022

Very nice.

@lemire lemire merged commit 7434563 into RoaringBitmap:master Aug 29, 2022
@lemire
Copy link
Member

lemire commented Aug 29, 2022

Merged.

clap ! clap ! clap !

@Dr-Emann Dr-Emann deleted the array_contains_range_one_bsearch branch August 29, 2022 20:17
@gregsadetsky
Copy link
Contributor

gregsadetsky commented Sep 7, 2022

Hi,

Sorry to intervene with a potentially useless comment, but this change seems to introduce an issue (that may be irrelevant). I was able to generate an incorrect answer with this newly merged code in the following way:

array_container_t * a = array_container_create_range(3, 10);
bool answer = array_container_contains_range(a, 3, 0);
std::cout << "contains? " << (answer? "yes" : "no") << std::endl;

I'm getting an incorrect positive bool from array_container_contains_range when passing an invalid range where start > end. The same incorrect true result happens if you pass 1, -3 as arguments to array_container_contains_range.

The previous version of array_container_create_range correctly returns false for both [3, 0] and [1, -3]

Very briefly:

  • the upstream roaring_bitmap_contains_range function correctly checks here whether the range it is passed is invalid or empty e.g., if (range_start >= range_end) return true; -- EDIT no, it would actually obviously return true for start > end.... that's what the code says. duh
  • roaring_bitmap_contains_range however calls container_contains_range, which calls array_container_contains_range, which does not do specifically check for the start >= end condition. The new code that's discussed in this PR checks if (range_count <= 0) { return true; } which spuriously returns true for any range where start > end.

My suggestion - again, if this is all relevant and apologies otherwise! - would be to slightly modify this PR so that

if (range_count <= 0)

becomes

if (range_start >= range_end)

mimicking roaring_bitmap_contains_range, and returning the correct answer for the (potentially irrelevant) :-) cases described here.

I'd be happy to open a PR for this (or to go away ha) ;-) Cheers!

@lemire
Copy link
Member

lemire commented Sep 7, 2022

I'd be happy to open a PR for this

Please do so.

@gregsadetsky
Copy link
Contributor

gregsadetsky commented Sep 7, 2022

I'm sorry... I was confused. Checking range_start >= range_end would do the same thing as the code I was commenting on -- if start > end, it would return true... That's literally what the code says it's doing.

I guess the real question is: should array_container_contains_range or roaring_bitmap_contains_range care about invalid ranges passed to them (where start > end), or is this a useless bounds check?

@lemire
Copy link
Member

lemire commented Sep 7, 2022

We definitively want roaring_bitmap_contains_range to be well behaved because it is part of our public API.

Meanwhile array_container_contains_range is not part of our public API.

@gregsadetsky
Copy link
Contributor

gregsadetsky commented Sep 7, 2022

Makes sense, thanks.

So what should roaring_bitmap_contains_range(r, 3, 0) i.e. start > end return? false? or should it have an assert (checking end >= start) and blow up if it receives incorrect arguments?

Right now, passing start=3 end=0 returns a spurious true -- regardless of whether start and/or end (3, 0) are contained in the bitmap.

@Dr-Emann
Copy link
Member Author

Dr-Emann commented Sep 8, 2022

I think the existing behavior is actually correct: roaring_bitmap_contains_range(r, 3, 3) (an empty range) returns true for all bitmaps: All (0) values in the range are also contained by the bitmap. I think it also makes sense that a range from [3, 0) is also an empty range, so it should also return true.

@Dr-Emann Dr-Emann restored the array_contains_range_one_bsearch branch August 7, 2023 23:02
@Dr-Emann Dr-Emann deleted the array_contains_range_one_bsearch branch August 8, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants