-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up InList code to use functions rather than macros #2826
Conversation
/// | ||
/// # Example | ||
/// TODO | ||
fn unary_cmp<I, F>(array: &PrimitiveArray<I>, op: F) -> BooleanArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copy/paste from https://docs.rs/arrow/17.0.0/arrow/compute/kernels/arity/index.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ted-Jiang are there any performance benchmarks that can be run for the InList
implementation?
let null_bit_buffer = $left.data().null_buffer().cloned(); | ||
|
||
let comparison = | ||
(0..$left.len()).map(|i| unsafe { $op($left.value_unchecked(i), $right) }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general using iterators (rather than value_unchecked()
) is faster
/// Applies an unary and infallible comparison function to a primitive | ||
/// array. This is the fastest way to perform an operation on a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Applies an unary and infallible comparison function to a primitive | |
/// array. This is the fastest way to perform an operation on a | |
/// TODO: move to arrow-rs: https://github.com/apache/arrow-rs/issues/1987 | |
/// | |
/// Applies an unary and infallible comparison function to a primitive | |
/// array. This is the fastest way to perform an operation on a |
@@ -230,39 +248,37 @@ macro_rules! set_contains_with_negated { | |||
fn in_list_primitive<T: ArrowPrimitiveType>( | |||
array: &PrimitiveArray<T>, | |||
values: &[<T as ArrowPrimitiveType>::Native], | |||
) -> Result<BooleanArray> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made infallible as it can't fail
) -> BooleanArray { | ||
array | ||
.iter() | ||
.map(|x| x.map(|x| !values.contains(&x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change now does check each element if it is Null
before invoking the closure where the compare_op_scalar!
macro does not. However, given that that the closure will do some non trivial string checks, I think first checking for null will likely not slow it down
Hmm, something is not correct. Will investigate |
I don't have time / plans to work on this any more |
Which issue does this PR close?
N/A (at the moment)
Rationale for this change
Inspired by @liukun4515 's PR here: #2809 I saw some macro use that could be reduced and may even be faster
What changes are included in this PR?
Are there any user-facing changes?
No