-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add optimizer config param to avoid grouping partitions prefer_existing_union
#10259
Conversation
plan, | ||
!first_enforce_distribution, | ||
prefer_existing_sort, | ||
prefer_existing_union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observing the test union_to_interleave
above it. If it covers all needed cases, these 2 tests will cover all needed cases for the not-convert, too. Let me know if it is not the case
@alamb Is this PR is good enough? We do not need to set |
plan = if plan.as_any().is::<UnionExec>() && can_interleave(children_plans.iter()) { | ||
|
||
plan = if plan.as_any().is::<UnionExec>() | ||
&& !config.optimizer.prefer_existing_union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that one of the motivations (influxdata#4) for this new flag is to preserve the sorting of the Union - would re-using the existing flag prefer_existing_sort
make sense here?
One argument against that would be if you wanted prefer_existing_sort: false
and prefer_existing_union: true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that one of the motivations (influxdata#4) for this new flag is to preserve the sorting of the Union - would re-using the existing flag
prefer_existing_sort
make sense here?One argument against that would be if you wanted
prefer_existing_sort: false
andprefer_existing_union: true
.
This is an excellent point @phillipleblanc . I just double checked and we actually have prefer_existing_sort
set to true in IOx already (code ref, not public)
What do you think about using the existing flag @NGA-TRAN ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phillipleblanc and @alamb
@mustafasrepo 's comment #10259 (comment) suggest to use the current approach. Do I need to do anything more here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I think this approach looks good! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @NGA-TRAN -- this is looking like it is pretty close in my opinion
I think @phillipleblanc 's suggestions are quite good and we should consider them carefully
plan = if plan.as_any().is::<UnionExec>() && can_interleave(children_plans.iter()) { | ||
|
||
plan = if plan.as_any().is::<UnionExec>() | ||
&& !config.optimizer.prefer_existing_union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that one of the motivations (influxdata#4) for this new flag is to preserve the sorting of the Union - would re-using the existing flag
prefer_existing_sort
make sense here?One argument against that would be if you wanted
prefer_existing_sort: false
andprefer_existing_union: true
.
This is an excellent point @phillipleblanc . I just double checked and we actually have prefer_existing_sort
set to true in IOx already (code ref, not public)
What do you think about using the existing flag @NGA-TRAN ?
}; | ||
|
||
($EXPECTED_LINES: expr, $PLAN: expr, $FIRST_ENFORCE_DIST: expr, $PREFER_EXISTING_SORT: expr) => { | ||
assert_optimized!($EXPECTED_LINES, $PLAN, $FIRST_ENFORCE_DIST, $PREFER_EXISTING_SORT, 10, false, 1024); | ||
assert_optimized!($EXPECTED_LINES, $PLAN, $FIRST_ENFORCE_DIST, $PREFER_EXISTING_SORT, 10, false, 1024, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these macros are getting a bit hairy -- maybe we can clean them up (convert to using functions rather than macros) in a subsequent PR
cc @mustafasrepo and @ozankabak for your thoughts |
I think having dedicated config setting is more verbose and clear (as in plan.as_any().is::<UnionExec>()
&& !config.optimizer.prefer_existing_sort
&& can_interleave(children_plans.iter()) this will prefer In the future, if we add support for |
Sounds good to me.
This is an excellent idea, I will file a ticket to track the idea @NGA-TRAN can you get the tests passing on this PR and we'll give it another review? |
@alamb @phillipleblanc and @mustafasrepo |
prefer_existing_union
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks good to me.
Thank you for the reviews @phillipleblanc @mustafasrepo and @ozankabak
I plan to file the follow on tickets (e.g. about sort preserving interleave) tomorrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is LGTM! Thanks @NGA-TRAN
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
plan = if plan.as_any().is::<UnionExec>() && can_interleave(children_plans.iter()) { | ||
|
||
plan = if plan.as_any().is::<UnionExec>() | ||
&& !config.optimizer.prefer_existing_union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I think this approach looks good! Thanks!
Since this PR is good to go, merging it in. I am also in the process of filing the follow on work |
…_existing_union` (apache#10259) * feat: add a config param to avoid converting union to interleave * chore: update config for the tests * chore: update configs.md
Filed #10314 to track order preserving UnionExec |
I also started collecting sorted related optimizations in #10313 as well |
…_existing_union` (apache#10259) * feat: add a config param to avoid converting union to interleave * chore: update config for the tests * chore: update configs.md
…_existing_union` (apache#10259) * feat: add a config param to avoid converting union to interleave * chore: update config for the tests * chore: update configs.md
Which issue does this PR close?
Closes ##10257
Rationale for this change
What changes are included in this PR?
Add an option to avoid converting Union to Interleave
Are these changes tested?
Yes
Are there any user-facing changes?
No, because by the fault, nothing is changed