-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-26648: [C++] Optimize union equality comparison #45384
Conversation
|
Thanks! By the way is there any benchmark result for this pr? |
still figuring that part out! 😅 |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your PR @mcshawn10 . This looks mostly good, here are two minor suggestions.
cpp/src/arrow/compare.cc
Outdated
} | ||
} | ||
|
||
// Handle the final run | ||
const auto final_child_num = child_ids[left_codes[left_start_idx_ + run_start]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably only do this final comparison if result_
is still true.
cpp/src/arrow/compare.cc
Outdated
|
||
if (!impl.Compare()) { | ||
result_ = false; | ||
return Status::OK(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can break
just as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the feedback! just implemented the suggestions
It turns out that we didn't have any proper tests for sparse union comparison, so I added one. |
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 1567be0. There were 8 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
#26648 proposes an optimization in checking sparse array equality by detecting contiguous runs, this PR implements that change
What changes are included in this PR?
previously, sparse array comparison was checked one by one, in this change, contiguous runs are detected and compared by checking equality of current and previous child_ids
Are these changes tested?
already covered by existing unit tests
Are there any user-facing changes?
no