-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Sort expression (Expr::Sort
)
#12177
Remove Sort expression (Expr::Sort
)
#12177
Conversation
Not ready for review. Just for FYI. |
eaeb216
to
f2a254f
Compare
c29d147
to
cdea34e
Compare
The build is all green (https://github.com/findepi/datafusion/actions/runs/10574858019)! |
cdea34e
to
1c3f194
Compare
Rebased, should be ready for initial review pass. I split the change into couple commits, but bulk of it is still one big commit. cc @alamb @comphead @jonahgao @jayzhan211 @andygrove @crepererum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at all the invalid states (panics and "unimplemented" errors) that are removed in this PR, this is a clear improvement. I also agree with the overall premise that "sorts" are not really a subtype/variant of Expr
.
This should probably get a 2nd ACK by another maintainer.
there is a new conflict (perhaps due to #12196), let me rebase. |
Part of effort to remove `Expr::Sort`.
Part of effort to remove `Expr::Sort`.
Take `expr::Sort` in `LogicalPlanBuilder.sort`. Accept any `Expr` in new function, `LogicalPlanBuilder.sort_by` which apply default sort ordering. Part of effort to remove `Expr::Sort`.
Part of effort to remove `Expr::Sort`.
Part of effort to remove `Expr::Sort`.
Remove sort as an expression, i.e. remove `Expr::Sort` from `Expr` enum. Use `expr::Sort` directly when sorting. The sort expression was used in context of ordering (sort, topk, create table, file sorting). Those places require their sort expression to be of type Sort anyway and no other expression was allowed, so this change improves static typing. Sort as an expression was illegal in other contexts.
1c3f194
to
9840b8f
Compare
})) | ||
} | ||
|
||
pub fn replace_sort_expressions(sorts: Vec<Sort>, new_expr: Vec<Expr>) -> Vec<Sort> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe returns with Result<Vec<Sort>>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. This is used in two places. In one we know the list of expressions is exact size.
the other is LogicalPlan.with_new_exprs
which does assert!
on provided new expr length. so it looks we don't need to be permissive here?
None, | ||
)); | ||
let plan = LogicalPlanBuilder::from(table_scan) | ||
.aggregate(vec![col("c")], vec![expr, count_distinct(col("b"))])? | ||
.build()?; | ||
// Do nothing | ||
let expected = "Aggregate: groupBy=[[test.c]], aggr=[[sum(test.a) ORDER BY [test.a], count(DISTINCT test.b)]] [c:UInt32, sum(test.a) ORDER BY [test.a]:UInt64;N, count(DISTINCT test.b):Int64]\ | ||
let expected = "Aggregate: groupBy=[[test.c]], aggr=[[sum(test.a) ORDER BY [test.a ASC NULLS LAST], count(DISTINCT test.b)]] [c:UInt32, sum(test.a) ORDER BY [test.a ASC NULLS LAST]:UInt64;N, count(DISTINCT test.b):Int64]\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same as previous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default ASC/DESC NULLS FIRST/LAST behavior seems to come from
datafusion/datafusion/sql/src/expr/order_by.rs
Lines 101 to 107 in c22e13f
let asc = asc.unwrap_or(true); | |
expr_vec.push(Expr::Sort(Sort::new( | |
Box::new(expr), | |
asc, | |
// when asc is true, by default nulls last to be consistent with postgres | |
// postgres rule: https://www.postgresql.org/docs/current/queries-order.html | |
nulls_first.unwrap_or(!asc), |
if i read this correctly, ASC NULLS LAST is the default
2f0ea05
to
39877e4
Compare
thank you @jayzhan211 for your review! would you mind taking another look? |
this commit is longer than advised in the review comment, but after squashing the diff will be smaller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
EPIC! |
thank you @jayzhan211 @crepererum @alamb for all your time spent reviewing this! |
❤️ thank you for the (brave) merge, @crepererum ! |
Remove sort as an expression, i.e. remove
Expr::Sort
fromExpr
enum.Use
expr::Sort
directly when sorting.The sort expression was used in context of ordering (sort, topk, create
table, file sorting). Those places require their sort expression to be
of type Sort anyway and no other expression was allowed, so this change
improves static typing. Sort as an expression was illegal in other
contexts.
For #1468 (comment)
Fixes #12193