Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(frontend): order by arbitrary expr #2329

Merged
merged 3 commits into from
May 6, 2022
Merged

feat(frontend): order by arbitrary expr #2329

merged 3 commits into from
May 6, 2022

Conversation

likg227
Copy link
Contributor

@likg227 likg227 commented May 5, 2022

What's changed and what's your intention?

PLEASE DO NOT LEAVE THIS EMPTY !!!

Please explain IN DETAIL what the changes are in this PR and why they are needed:

  • Summarize your change (mandatory)
    This pr solve frontend: Support ORDER BY arbitrary column #2124, so we can order by number of output column, column label and any expression that is valid in select list now. It uses BatchProject for Batch and out_fields for stream to do the projection. Some part of the pr may not be good enough, and I'm open for any advice. Thx!

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

close #2124 #1635

@github-actions github-actions bot added the type/fix Bug fix label May 5, 2022
@codecov
Copy link

codecov bot commented May 5, 2022

Codecov Report

Merging #2329 (b79c018) into main (0193870) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2329      +/-   ##
==========================================
- Coverage   71.07%   71.07%   -0.01%     
==========================================
  Files         675      675              
  Lines       84502    84612     +110     
==========================================
+ Hits        60061    60137      +76     
- Misses      24441    24475      +34     
Flag Coverage Δ
rust 71.07% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/frontend/src/lib.rs 60.00% <ø> (ø)
src/frontend/src/binder/insert.rs 98.52% <100.00%> (+0.02%) ⬆️
src/frontend/src/binder/query.rs 98.64% <100.00%> (+0.34%) ⬆️
src/frontend/src/optimizer/mod.rs 95.02% <100.00%> (+0.67%) ⬆️
src/frontend/src/planner/query.rs 100.00% <100.00%> (ø)
src/frontend/src/planner/select.rs 93.60% <100.00%> (+0.31%) ⬆️
src/frontend/src/planner/set_expr.rs 100.00% <100.00%> (ø)
src/meta/src/hummock/hummock_manager.rs 88.62% <0.00%> (-2.61%) ⬇️
src/meta/src/hummock/vacuum.rs 83.18% <0.00%> (-2.08%) ⬇️
src/common/src/types/ordered_float.rs 24.70% <0.00%> (-0.20%) ⬇️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@xiangjinwu xiangjinwu changed the title fix(frontend): order by. feat(frontend): order by arbitrary expr May 5, 2022
@github-actions github-actions bot added type/feature and removed type/fix Bug fix labels May 5, 2022
Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally lgtm, with some small tweaks needed.

btw did you try any e2e tests?

@@ -30,6 +31,7 @@ pub struct BoundQuery {
pub order: Vec<FieldOrder>,
pub limit: Option<usize>,
pub offset: Option<usize>,
pub order_exprs: Vec<ExprImpl>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • extra_order_exprs
  • Maybe we can put this on BoundSelect rather than BoundQuery, because this feature only works when body matches BoundSetExpr::Select. They cannot be used with VALUES, UNION, or even SELECT DISTINCT.

Side note: as of now we push and pop context in bind_query. But in select a1 from a union select b1 from b, the context should actually be different for each select. We do not need to fix in this PR, but this is why we cannot bind order by a1 here - it is only available in the context of the select.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I orginally wanted to put it in BoundSelect, but ORDER BY clause is bound after BoundSelect was generated and because BoundSetExpr is an enum, it is difficult to put extra_order_exprs into BoundSelect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not difficult as matching enum also helps reject union here.

However, after rethinking, I agree that putting extra_order_exprs at the same level of order and out_fields is better. It can be confusing that each BoundSelect has extra_order_exprs, which is only meaningful when it has no siblings.

As for rejecting union as well as using the correct context when binding extra_order_exprs, the same idea can be extended from values: (1) Without union, the query and its body select share the same context (current behavior); (2) With union, each descendant select will use their own context, leaving the context of query empty, and extra_order_exprs would return bind error as expected.

src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
src/frontend/test_runner/tests/testdata/order_by.yaml Outdated Show resolved Hide resolved
src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
src/frontend/src/planner/select.rs Outdated Show resolved Hide resolved
@@ -30,6 +31,7 @@ pub struct BoundQuery {
pub order: Vec<FieldOrder>,
pub limit: Option<usize>,
pub offset: Option<usize>,
pub order_exprs: Vec<ExprImpl>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not difficult as matching enum also helps reject union here.

However, after rethinking, I agree that putting extra_order_exprs at the same level of order and out_fields is better. It can be confusing that each BoundSelect has extra_order_exprs, which is only meaningful when it has no siblings.

As for rejecting union as well as using the correct context when binding extra_order_exprs, the same idea can be extended from values: (1) Without union, the query and its body select share the same context (current behavior); (2) With union, each descendant select will use their own context, leaving the context of query empty, and extra_order_exprs would return bind error as expected.

@likg227 likg227 merged commit 68dc6fc into main May 6, 2022
@likg227 likg227 deleted the lkg/order-by branch May 6, 2022 08:15
@xiangjinwu xiangjinwu mentioned this pull request May 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

frontend: Support ORDER BY arbitrary column
2 participants