feat(frontend): bind project #615

Merged
merged 8 commits into from
Mar 3, 2022

Conversation

likg227
Contributor

@likg227 likg227 commented Mar 1, 2022

What's changed and what's your intention?

In this PR, I add support for queries like explain select v1 from t;.
Please note that I haven't added support for recursive BindContext yet; I will support it in the future.

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

Member

@fuyufjh fuyufjh left a comment

Generally LGTM

use crate::expr::ExprImpl;

impl Binder {
pub fn bind_projection(&mut self, projection: Vec<SelectItem>) -> Result<Vec<ExprImpl>> {
Member

How about

Suggested change
- pub fn bind_projection(&mut self, projection: Vec<SelectItem>) -> Result<Vec<ExprImpl>> {
+ pub fn bind_select_items(&mut self, projection: Vec<SelectItem>) -> Result<Vec<ExprImpl>> {

use crate::optimizer::PlanRef;

impl Planner {
pub(super) fn plan_projection(
Member

Recommend to call it Project instead of Projection

Contributor

Should we create one file per operator? At least for projection we can just place it under planner/select.rs

Contributor

I was also thinking to put this directly in select, as it just calls the constructor and is only used there.

Contributor Author

@likg227 likg227 Mar 3, 2022

Hmm, I did this because bind_vec_table_with_join() is also in a file,
and although bind_projection() is simple now, it will become complicated as we develop more features.

Contributor Author

@likg227 likg227 Mar 3, 2022

Recommend to call it Project instead of Projection

self.bind_projection(select.projection): the reason I use projection is that BoundSelect uses projection (not project) as its member.

Contributor

@xiangjinwu xiangjinwu Mar 3, 2022

bind_vec_table_with_join is in a file intended to handle all types of TableRefs: base table, join, and subquery. It may be split into smaller files later.

although bind_projection() is simple now, it will become complicated as we develop more features

It is likely that {bind,plan}_select will become complicated, with projection / select_items being one of the steps, or even several parts that are hard to extract into a dedicated function.

Comment on lines 23 to 24
Some(column) => Ok(ExprImpl::InputRef(Box::new(InputRef::new(
column.id,
Contributor

Column id is different from column index. In the catalog, each column has its own id, whereas InputRef cares about how to locate the column in the input chunk. In the simplest form, the input comes from a scan.

For example, if we scan start(id=2), end(id=7) and want to evaluate end - start, it would be Sub(InputRef(1), InputRef(0)).
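
The start/end example can be sketched in a few lines. This is only an illustration under stated assumptions: `Expr`, `eval`, `index_of`, and `end_minus_start` are hypothetical, simplified stand-ins, not the project's `ExprImpl` or `InputRef` types.

```rust
/// Hypothetical, simplified expression tree (not the project's real `ExprImpl`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Refers to a column by its position in the input row, not by catalog id.
    InputRef(usize),
    Sub(Box<Expr>, Box<Expr>),
}

/// Evaluates `expr` against one input row, where `row[i]` is the i-th scanned column.
fn eval(expr: &Expr, row: &[i64]) -> i64 {
    match expr {
        Expr::InputRef(index) => row[*index],
        Expr::Sub(lhs, rhs) => eval(lhs, row) - eval(rhs, row),
    }
}

/// Maps a catalog column id to its position in the scan output, if it was scanned.
fn index_of(scanned_ids: &[u32], column_id: u32) -> Option<usize> {
    scanned_ids.iter().position(|id| *id == column_id)
}

/// Builds `end - start` for a scan that outputs the given column ids in order.
fn end_minus_start(scanned_ids: &[u32], start_id: u32, end_id: u32) -> Option<Expr> {
    let start = index_of(scanned_ids, start_id)?;
    let end = index_of(scanned_ids, end_id)?;
    Some(Expr::Sub(
        Box::new(Expr::InputRef(end)),
        Box::new(Expr::InputRef(start)),
    ))
}
```

With a scan that outputs ids `[2, 7]`, `end_minus_start(&[2, 7], 2, 7)` yields `Sub(InputRef(1), InputRef(0))`: the ids 2 and 7 never appear in the expression, only the positions 0 and 1.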

Err(ErrorCode::ItemNotFound(format!("Invalid table: {}", table_name)).into())
}
},
None => {
Contributor

These two branches are a little bit duplicated. We can start by supporting column names only as there is only one table now. When there are multiple tables, maybe we can use a reversed map column -> table -> schema: (1) when only column name is provided and it is unambiguous after lookup, the column is successfully bound to the only table; (2) when the column lookup yields multiple tables, the extra table name can be used or it is an ambiguous reference error.
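
A rough sketch of the reverse-map idea, assuming a flattened column -> table lookup without the schema level. `ColumnLookup`, `BindError`, and their methods are hypothetical names for illustration and are not taken from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum BindError {
    NotFound(String),
    Ambiguous(String),
}

/// Hypothetical reverse map: column name -> (table name, column index) candidates.
#[derive(Default)]
struct ColumnLookup {
    by_column: HashMap<String, Vec<(String, usize)>>,
}

impl ColumnLookup {
    /// Registers a table's columns; indexes continue after the columns added earlier.
    fn add_table(&mut self, table: &str, columns: &[&str]) {
        let offset: usize = self.by_column.values().map(Vec::len).sum();
        for (i, column) in columns.iter().enumerate() {
            self.by_column
                .entry(column.to_string())
                .or_default()
                .push((table.to_string(), offset + i));
        }
    }

    /// Binds `column`, using the optional table name only to disambiguate.
    fn bind(&self, table: Option<&str>, column: &str) -> Result<usize, BindError> {
        let candidates = self
            .by_column
            .get(column)
            .ok_or_else(|| BindError::NotFound(column.to_string()))?;
        match table {
            // Case (2): a table name was given, so pick the matching candidate.
            Some(t) => candidates
                .iter()
                .find(|(name, _)| name.as_str() == t)
                .map(|(_, index)| *index)
                .ok_or_else(|| BindError::NotFound(format!("{}.{}", t, column))),
            // Case (1): the column name alone is unambiguous.
            None if candidates.len() == 1 => Ok(candidates[0].1),
            // Several tables have this column and no table name was given.
            None => Err(BindError::Ambiguous(column.to_string())),
        }
    }
}
```

Both the qualified and the unqualified form go through the same lookup, which is one way to avoid the duplicated branches.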

Comment on lines 94 to 99
write!(
f,
"LogicalProject {{ exprs: {:?}, expr_alias: {:?} }}",
self.exprs(),
self.expr_alias(),
)
Contributor

Try this: f.debug_struct().field().field().finish().

@likg227 likg227 force-pushed the lkg/bind-project branch from 9e93885 to 747b530 Compare March 3, 2022 05:58
.or_default()
.push(ColumnBinding::new(
table_name.clone(),
column.id() as usize,
Member

If I understand correctly, the bind_table always returns a relation with all columns in that table, right?

Member

@fuyufjh fuyufjh Mar 3, 2022

By design, there will be an ordered list Vec<ColumnDesc> or Vec<ColumnId> in the table catalog to denote the ordering of columns in a table. So here you should get each column according to that list and assign indexes starting from 0, instead of using column.id() as usize.

Member

You may imagine ColumnId as a random number unique for each column :)
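
A minimal sketch of assigning indexes by position in the ordered column list. The `ColumnId`, `ColumnDesc`, and `ColumnBinding` types below are hypothetical, pared-down stand-ins; their real definitions and the signature of `ColumnBinding::new` are not shown in this thread.

```rust
/// Hypothetical stand-in: an opaque identifier, unique per column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ColumnId(u32);

/// Hypothetical stand-in for a catalog column description.
#[derive(Debug, Clone)]
struct ColumnDesc {
    id: ColumnId,
    name: String,
}

/// Hypothetical stand-in for the binding stored in the bind context.
#[derive(Debug, PartialEq)]
struct ColumnBinding {
    table_name: String,
    column_name: String,
    /// Position of the column in the relation's output, starting from 0.
    index: usize,
}

/// Binds every column of a table, taking the index from the column's position
/// in the catalog's ordered list rather than from its id.
fn bind_table_columns(table_name: &str, columns: &[ColumnDesc]) -> Vec<ColumnBinding> {
    columns
        .iter()
        .enumerate()
        .map(|(index, column)| ColumnBinding {
            table_name: table_name.to_string(),
            column_name: column.name.clone(),
            index,
        })
        .collect()
}
```

For columns with ids 2 and 7, the bindings get indexes 0 and 1, so the result stays valid even if ids are sparse or arbitrary.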

@neverchanje
Contributor

@likg227 I suggest adding a test to rust/frontend/tests/testdata/basic_query.yaml, since I've found that this PR may not be runnable.

A simple test case is:

create table t (v1 int, v2 int);
select v1 from t;

@likg227 likg227 force-pushed the lkg/bind-project branch from 747b530 to 6566f98 Compare March 3, 2022 08:23
@neverchanje neverchanje linked an issue Mar 3, 2022 that may be closed by this pull request
@codecov

codecov bot commented Mar 3, 2022

Codecov Report

Merging #615 (7cac54f) into main (b875e31) will increase coverage by 0.06%.
The diff coverage is 63.07%.

@@             Coverage Diff              @@
##               main     #615      +/-   ##
============================================
+ Coverage     71.34%   71.40%   +0.06%     
- Complexity     2701     2706       +5     
============================================
  Files           894      898       +4     
  Lines         51165    51338     +173     
  Branches       1724     1730       +6     
============================================
+ Hits          36503    36659     +156     
- Misses        13849    13864      +15     
- Partials        813      815       +2     
Flag Coverage Δ
java 59.86% <ø> (-0.19%) ⬇️
rust 76.37% <63.07%> (+0.17%) ⬆️

Impacted Files Coverage Δ
rust/frontend/src/binder/expr/column.rs 30.00% <30.00%> (ø)
rust/frontend/src/binder/expr/mod.rs 77.77% <50.00%> (-7.94%) ⬇️
rust/frontend/src/binder/projection.rs 80.00% <80.00%> (ø)
rust/frontend/src/binder/bind_context.rs 100.00% <100.00%> (ø)
rust/frontend/src/binder/mod.rs 100.00% <100.00%> (ø)
rust/frontend/src/binder/select.rs 100.00% <100.00%> (ø)
rust/frontend/src/binder/table_ref.rs 84.84% <100.00%> (+4.07%) ⬆️
...rontend/src/optimizer/plan_node/logical_project.rs 47.91% <100.00%> (+47.91%) ⬆️
rust/frontend/src/planner/select.rs 80.00% <100.00%> (+13.33%) ⬆️
rust/meta/src/hummock/compaction.rs 67.32% <0.00%> (-0.66%) ⬇️
... and 21 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@xiangjinwu xiangjinwu left a comment

lgtm!

fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_struct("LogicalProject")
.field("exprs", self.exprs())
.field("expr_alias", &format_args!("{:?}", self.expr_alias()))
Contributor

Isn't .field("expr_alias", self.expr_alias()) enough?

Contributor Author

I omit the schema field, which is not useful to display, and the input field, which is redundant.

Contributor

I was asking about the field expr_alias. It turns out the following work:

  • .field("expr_alias", &self.expr_alias)
  • .field("expr_alias", &self.expr_alias())

but this does not:

  • .field("expr_alias", self.expr_alias()) because &[Option<String>] cannot be used as &dyn Debug.
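
The difference can be reproduced in a small standalone sketch. The `LogicalProject` struct here is a hypothetical, reduced stand-in (string expressions, no input or schema), and implementing `fmt::Display` is an assumption, since the thread does not show which trait the `fmt` method belongs to.

```rust
use std::fmt;

/// Hypothetical, pared-down stand-in for the plan node under review.
struct LogicalProject {
    exprs: Vec<String>,
    expr_alias: Vec<Option<String>>,
}

impl LogicalProject {
    fn exprs(&self) -> &Vec<String> {
        &self.exprs
    }

    /// Returns a slice, i.e. a reference to an unsized type.
    fn expr_alias(&self) -> &[Option<String>] {
        &self.expr_alias
    }
}

impl fmt::Display for LogicalProject {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("LogicalProject")
            // `&Vec<String>` points at a sized type, so it coerces to `&dyn Debug`.
            .field("exprs", self.exprs())
            // `[Option<String>]` is unsized, so `&[Option<String>]` cannot become
            // `&dyn Debug`; a reference to the (sized) slice reference can.
            .field("expr_alias", &self.expr_alias())
            .finish()
    }
}
```

`field` takes its value as `&dyn Debug`, and only references to sized types can be coerced into trait objects, which is why the extra `&` is enough and `format_args!` is not required.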

}
}

pub fn bind_all_columns(&mut self) -> Result<Vec<ExprImpl>> {
Contributor

This feels more closely related to select/projection than to binder/expr. How about moving it when we fix the ordering of expansion?

.iter()
.find(|column| column.table_name == *table_name)
{
Some(column) => Ok(ExprImpl::InputRef(Box::new(InputRef::new(
Contributor

nit: ExprImpl::InputRef(Box::new(a)) -> a.to_expr_impl()

@likg227 likg227 merged commit 79bdca9 into main Mar 3, 2022
@likg227 likg227 deleted the lkg/bind-project branch March 3, 2022 10:10
Successfully merging this pull request may close these issues.

rust frontend: bind & plan project
4 participants