Added and_scalar and or_scalar for boolean #707
Codecov Report
@@            Coverage Diff             @@
##             main     #707      +/-   ##
==========================================
+ Coverage   69.77%   70.19%   +0.41%
==========================================
  Files         303      309       +6
  Lines       16855    16815      -40
==========================================
+ Hits        11761    11803      +42
+ Misses       5094     5012      -82
Continue to review full report at Codecov.
Thank you so much for this! The amount of testing that is here is really impressive.
I think that there is an optimization here:
- When the scalar is true, A & true = A and A | true = true.
- When the scalar is false, A & false = false and A | false = A.

i.e. we do not need to compute this item by item; we just need to clone the array or initialize all items to a constant (true or false) accordingly. This should be much faster: the clone case is not O(N), and the constant case is a bulk fill rather than a per-item computation.
What do you think?
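The identities above can be sketched on a plain slice of booleans. This is only an illustration under stated assumptions: `and_scalar_sketch` and `or_scalar_sketch` are hypothetical names, the input is a `&[bool]` rather than a `BooleanArray`, and validity (nulls) is ignored. Note that `to_vec()` here is still an O(N) copy, whereas cloning a shared, reference-counted buffer would not be.

```rust
/// Hypothetical stand-in for `and_scalar` over plain bools (no validity bitmap).
pub fn and_scalar_sketch(values: &[bool], scalar: bool) -> Vec<bool> {
    if scalar {
        // A & true = A: reuse the input as-is.
        values.to_vec()
    } else {
        // A & false = false: constant fill, no per-item AND.
        vec![false; values.len()]
    }
}

/// Hypothetical stand-in for `or_scalar` over plain bools (no validity bitmap).
pub fn or_scalar_sketch(values: &[bool], scalar: bool) -> Vec<bool> {
    if scalar {
        // A | true = true: constant fill, no per-item OR.
        vec![true; values.len()]
    } else {
        // A | false = A: reuse the input as-is.
        values.to_vec()
    }
}
```

Both functions branch once on the scalar and never inspect individual items, which is the point of the suggestion.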
Yes. That should be great. I will update the code. Thanks.
Great one. Left two minor suggestions :)
src/compute/boolean.rs
Outdated
pub fn or_scalar(array: &BooleanArray, scalar: &BooleanScalar) -> BooleanArray {
    match scalar.value() {
        Some(true) => {
            let values = Bitmap::from_trusted_len_iter(std::iter::repeat(true).take(array.len()));
- let values = Bitmap::from_trusted_len_iter(std::iter::repeat(true).take(array.len()));
+ let values = Bitmap::from_len_zeroed(array.len());
For this or version, I want to initialize a Bitmap with all true bits to implement A | true = true. Is that right? Thanks.
Oh, yes, you are right!
Unfortunately we do not have a specific method for that, but it is worth checking MutableBitmap::extend_constant here, since it sets 8 bits per write, which is still much faster than from_trusted_len_iter ^_^
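To make the "8 bits per write" remark concrete, here is a minimal sketch of a packed bitmap with a byte-at-a-time constant fill. `PackedBits` and its methods are hypothetical and written for illustration only; this is not the crate's `MutableBitmap`, and the LSB-first bit order is an assumption.

```rust
/// Hypothetical packed bitmap, LSB-first within each byte.
/// Invariant: bits past `len` in the last byte are always zero.
#[derive(Debug, Default)]
pub struct PackedBits {
    bytes: Vec<u8>,
    len: usize, // number of valid bits
}

impl PackedBits {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn get(&self, i: usize) -> bool {
        assert!(i < self.len);
        (self.bytes[i / 8] >> (i % 8)) & 1 == 1
    }

    /// Slow path: append a single bit.
    pub fn push(&mut self, value: bool) {
        if self.len % 8 == 0 {
            self.bytes.push(0);
        }
        if value {
            let last = self.bytes.len() - 1;
            self.bytes[last] |= 1 << (self.len % 8);
        }
        self.len += 1;
    }

    /// Fast path: append `additional` copies of `value`, one byte per write.
    pub fn extend_constant(&mut self, additional: usize, value: bool) {
        let new_len = self.len + additional;
        let offset = self.len % 8;
        // Fill the unused high bits of a partially used last byte.
        if value && offset != 0 && additional > 0 {
            let last = self.bytes.len() - 1;
            self.bytes[last] |= 0xFFu8 << offset;
        }
        // Append whole bytes, each write covering 8 bits.
        let fill = if value { 0xFF } else { 0x00 };
        self.bytes.resize((new_len + 7) / 8, fill);
        // Restore the invariant: clear bits past `new_len`.
        let tail = new_len % 8;
        if tail != 0 {
            let last = self.bytes.len() - 1;
            self.bytes[last] &= (1u8 << tail) - 1;
        }
        self.len = new_len;
    }
}
```

The fast path does a constant amount of bit fiddling at the two ends and lets `Vec::resize` fill everything in between, whereas pushing from an iterator touches every bit individually.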
For context, the reason we have all these seemingly similar APIs is that offering a single API (like Rust's Vec::extend) would require trait specialization, which is not available on the stable channel. This is why we currently give every method a different name.
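As a rough illustration of why the methods end up with different names, the sketch below offers two separately named entry points instead of one that picks the fast path automatically. The function names are hypothetical, and `ExactSizeIterator` is used only as a stable stand-in for a trusted-length guarantee; it is an assumption of this sketch, not how the crate expresses it.

```rust
/// Generic path: works for any iterator, grows the buffer as it goes.
pub fn collect_generic<I: Iterator<Item = bool>>(iter: I) -> Vec<bool> {
    let mut out = Vec::new();
    for v in iter {
        out.push(v);
    }
    out
}

/// Fast path: the length is known up front, so allocate exactly once.
/// Without specialization, `collect_generic` cannot dispatch here on its own,
/// so the caller has to choose this function by name.
pub fn collect_exact_size<I: ExactSizeIterator<Item = bool>>(iter: I) -> Vec<bool> {
    let mut out = Vec::with_capacity(iter.len());
    for v in iter {
        out.push(v);
    }
    out
}
```

Both functions return the same values; they differ only in what they may assume about the iterator, which is the information a single specialized API would otherwise infer from the type.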
Thanks for the guidance. I updated the code to use MutableBitmap::extend_constant. Thanks.
…`and_scalar` when scalar value is false.
…all true values.
Merged. Thanks a lot, @silathdiir!
Close #663