-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve null handling for to_char #9689
Changes from 1 commit
30d9bf5
b2789c2
e505d6c
a3ad030
5156731
7c3d258
1c5c68c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -171,7 +171,11 @@ fn _to_char_scalar( | |
// of the implementation in arrow-rs we need to convert it to an array | ||
let data_type = &expression.data_type(); | ||
let is_scalar_expression = matches!(&expression, ColumnarValue::Scalar(_)); | ||
let array = expression.into_array(1)?; | ||
let array_from_expr = expression.into_array(1)?; | ||
let array = match format { | ||
Some(_) => array_from_expr, | ||
None => ColumnarValue::create_null_array(array_from_expr.len()).into_array(1)?, | ||
}; | ||
let format_options = match _build_format_options(data_type, format) { | ||
Ok(value) => value, | ||
Err(value) => return value, | ||
|
@@ -215,8 +219,19 @@ fn _to_char_array(args: &[ColumnarValue]) -> Result<ColumnarValue> { | |
}; | ||
// this isn't ideal but this can't use ValueFormatter as it isn't independent | ||
// from ArrayFormatter | ||
let formatter = ArrayFormatter::try_new(arrays[0].as_ref(), &format_options)?; | ||
let result = formatter.value(idx).try_to_string(); | ||
let result = match format { | ||
Some(_) => { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think in this case we should also be formatting to None, as in change let mut results: Vec<String> = vec![]; to let mut results: Vec<Option<String>> = vec![]; so that it can properly represent nulls There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Continuing to use the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated: |
||
let formatter = | ||
ArrayFormatter::try_new(arrays[0].as_ref(), &format_options)?; | ||
formatter.value(idx).try_to_string() | ||
} | ||
None => { | ||
let null_array = ColumnarValue::create_null_array(1).into_array(1)?; | ||
let formatter = | ||
ArrayFormatter::try_new(null_array.as_ref(), &format_options)?; | ||
formatter.value(0).try_to_string() | ||
} | ||
}; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something else I noticed while looking at this code is that
Effectively means the strings are copied twice (once to As a follow on PR we could potentially use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For someone whose role is to find ways to optimize things I seem to be rather poor at it sometimes :) Nice catch. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well there are different degrees of optimization for sure ;) |
||
match result { | ||
Ok(value) => results.push(value), | ||
Err(e) => return exec_err!("{}", e), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for Reviewer:
We can't directly return null when we see that the format is
None
becauseformat_options
(passed toArrayFormatter::try_new
) allow specifying the string to show for null & we need to respect this configuration option.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it is true that
ArrayFormatter::try_new
allows specifying the string to use for null values, I don't think that functionality is exposed viato_char
Thus I think this actually should simply return a new
StringArray
of all null values (which confusingly is different thanNullArray
)So in this case I think if the format is
None
the code should return a null string value (namelyColumnarValue::Scalar(ScalarValue::Utf8(None))
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented this change. Slight difference in the type though. Please see #9689 (comment).