Upgrade to arrow 23.0.0 #3483
The failure in the
Looks related to some changes to the arity kernels -- specifically, the code that is panicking seems to have been changed in apache/arrow-rs#2666 by @tustvold. I am not sure if DataFusion is passing bad input or if there is some corner case in arrow that is hit. I plan to look into it later today if no one else beats me to it.
It will actually be apache/arrow-rs#2643 which changed the unchecked divide kernels to panic on divide by zero - apache/arrow-rs#2647
datafusion/physical-expr/src/expressions/binary/kernels_arrow.rs
@@ -24,7 +24,7 @@ use std::{any::Any, sync::Arc};

 use arrow::array::*;
 use arrow::compute::kernels::arithmetic::{
-    add, add_scalar, divide, divide_scalar, modulus, modulus_scalar, multiply,
+    add, add_scalar, divide_opt, divide_scalar, modulus, modulus_scalar, multiply,
The default divide kernel changed behavior -- divide_opt now has the "NULL on divide by zero" behavior DataFusion expects.
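The difference between the two behaviors can be sketched in plain Rust. This is only an illustration over `Option<i32>` slices, not the arrow kernels themselves: the function names `divide_or_null` and `divide_or_error` are hypothetical, and an error return stands in for the panic mentioned above.

```rust
/// Hypothetical stand-in for the "NULL on divide by zero" behavior:
/// a zero divisor (or a null input) yields a null (`None`) in that slot.
/// `checked_div` also returns `None` on overflow (i32::MIN / -1).
fn divide_or_null(left: &[Option<i32>], right: &[Option<i32>]) -> Vec<Option<i32>> {
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| match (l, r) {
            (Some(l), Some(r)) => l.checked_div(*r),
            _ => None,
        })
        .collect()
}

/// Hypothetical stand-in for a strict kernel: a single zero divisor
/// fails the whole operation instead of producing a null.
fn divide_or_error(
    left: &[Option<i32>],
    right: &[Option<i32>],
) -> Result<Vec<Option<i32>>, String> {
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| match (l, r) {
            (Some(_), Some(0)) => Err("divide by zero".to_string()),
            (Some(l), Some(r)) => Ok(l.checked_div(*r)),
            _ => Ok(None),
        })
        .collect()
}
```

With inputs `[10, 7, NULL, 1]` and `[2, 0, 3, NULL]`, the first function returns a result with nulls in the last three slots, while the second fails because of the zero divisor. That is why a query engine expecting SQL-style NULL results needs the first variant.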
@@ -372,14 +372,18 @@ pub(crate) fn multiply_decimal_scalar(
     Ok(array)
 }

-pub(crate) fn divide_decimal(
+pub(crate) fn divide_opt_decimal(
renamed to be consistent with the arrow rename
Codecov Report
@@ Coverage Diff @@
## master #3483 +/- ##
=======================================
Coverage 85.79% 85.79%
=======================================
Files 300 300
Lines 55403 55397 -6
=======================================
- Hits 47533 47530 -3
+ Misses 7870 7867 -3
b935cc5 to 9342d2b
@@ -286,8 +286,7 @@ impl SchemaAdapter {
         let projected_schema = Arc::new(self.table_schema.clone().project(projections)?);

         // Necessary to handle empty batches
-        let mut options = RecordBatchOptions::default();
-        options.row_count = Some(batch.num_rows());
+        let options = RecordBatchOptions::new().with_row_count(Some(batch.num_rows()));
Update to use the nice API @askoa added in apache/arrow-rs#2729
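A minimal sketch of why the explicit row count matters for empty batches and how the builder-style call reads. `BatchOptions` and `num_rows` here are hypothetical stand-ins written for illustration, not the arrow `RecordBatchOptions` type, and the assumption is that a batch with no columns cannot infer its row count from column lengths.

```rust
/// Hypothetical stand-in for an options struct with an optional row count.
#[derive(Debug, Default, Clone, PartialEq)]
struct BatchOptions {
    row_count: Option<usize>,
}

impl BatchOptions {
    fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: consumes and returns `self`, so the options
    /// can be built in one expression without a `mut` binding.
    fn with_row_count(mut self, row_count: Option<usize>) -> Self {
        self.row_count = row_count;
        self
    }
}

/// Resolve the row count: prefer the explicit option, otherwise fall back
/// to the length of the first column. With zero columns and no explicit
/// count there is nothing to infer from, so the result is `None`.
fn num_rows(columns: &[Vec<i64>], options: &BatchOptions) -> Option<usize> {
    options
        .row_count
        .or_else(|| columns.first().map(|c| c.len()))
}
```

For a projection that selects no columns, `num_rows(&[], &BatchOptions::new().with_row_count(Some(3)))` still reports three rows, whereas the default options give `None`.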
-if dep_name in ("arrow", "parquet", "arrow-flight") and constraint.get("version") is not None:
-    doc[section][dep_name]["version"] = new_version
+if dep_name in ("arrow", "parquet", "arrow-flight"):
+    if type(constraint) == tomlkit.items.String:
This is code I wrote in IOx in https://github.com/influxdata/influxdb_iox/blob/main/scripts/update_arrow_deps.py
Basically, the previous version didn't handle TOML entries of the form arrow = "22.0.0".
af16115 to e6c9ced
Benchmark runs are scheduled for baseline = e873423 and contender = 67002a0. 67002a0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
re apache/arrow-rs#2665
Rationale for this change
I am putting up this PR so I can incrementally test the effect on DataFusion of splitting up the arrow crate in PRs such as apache/arrow-rs#2693. Hopefully we can use it to actually upgrade to arrow 23.0.0 when it is released as well.
Arrow 23.0.0 was released -- need to keep up with dependencies.
What changes are included in this PR?
Update to arrow 23.0.0
Are there any user-facing changes?
No