-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: unwrap casts of string and dictionary columns #10323
Conversation
cc @jayzhan211 this one doesn't interfere with |
|
||
# Now query using an integer which must be coerced into a dictionary string | ||
query T? | ||
SELECT * from test where column2 = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explain
query for this is helpful too.
@@ -298,19 +306,46 @@ fn is_support_data_type(data_type: &DataType) -> bool { | |||
) | |||
} | |||
|
|||
/// Returns true if [UnwrapCastExprRewriter] supports casting this value as a string | |||
fn is_supported_string_type(data_type: &DataType) -> bool { | |||
matches!(data_type, DataType::Utf8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also support LargeUtf8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
this should prevent returning Transformed in cases where nothing was changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @erratic-pattern and @jayzhan211 -- this is a really nice PR. Thank you for proceeding and pushing to get the "right" solution (rather than changing coercion)
} | ||
if lit_value.is_null() { | ||
// null value can be cast to any type of null value | ||
return Ok(Some(ScalarValue::try_from(target_type)?)); | ||
return ScalarValue::try_from(target_type).ok(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if ignoring errors will make tracking down unsupported types harder (I am imagining when we try and add REEArray or StringViewArray). 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if ignoring errors will make tracking down unsupported types harder (I am imagining when we try and add REEArray or StringViewArray). 🤔
yeah we may want to reintroduce a Result type, but as it stands currently the errors were already being ignored anyway so they didn't serve any purpose.
also the boolean is_supported_* means that we'll just silently skip those cases anyway.
so realistically most of those errors just don't ever occur and if they did occur we wouldn't know about them. we would need to rethink the design of this code to surface those errors.
another interesting possibility is debug assertions that panic, but only on 'debug' builds. I've always found this to be an interesting but underutilized tool for uncovering bugs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I was mistaken, these errors weren't being ignored. I could add them back in in a new PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I was mistaken, these errors weren't being ignored. I could add them back in in a new PR?
I think that would be really nice if you had the time.
I made a PR with a few more tests here: #10335 |
Which issue does this PR close?
Closes #10220
Rationale for this change
Improves performance of queries that filter string/dictionary columns against literals (
WHERE dict_col = 1
) by avoiding casting the columnWhat changes are included in this PR?
Allows the optimizer to unwrap casts of string types in comparison operations.
Are these changes tested?
Yes
Are there any user-facing changes?
No