
fix: LimitPushdown rule incorrectly removes some GlobalLimitExec #14245

Merged
merged 8 commits into apache:main on Jan 29, 2025

Conversation

zhuqi-lucas
Contributor

@zhuqi-lucas zhuqi-lucas commented Jan 23, 2025

Which issue does this PR close?

Closes #14204

What changes are included in this PR?

Fix the LimitPushdown rule incorrectly removing some GlobalLimitExec nodes.

Are these changes tested?

Yes, added a unit test and slt tests.

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jan 23, 2025
@xudong963 xudong963 self-requested a review January 23, 2025 05:47
@github-actions github-actions bot added the physical-expr Physical Expressions label Jan 23, 2025
@github-actions github-actions bot removed the physical-expr Physical Expressions label Jan 23, 2025
@@ -4225,7 +4225,7 @@ query IIIIB
SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 2;
----
2 2 2 2 true
3 3 2 2 true
2 2 2 2 false
Contributor Author

Why this change?

According to #12963, the limit is pushed down to both sides of a full join.

@@ -4247,8 +4247,10 @@ logical_plan
physical_plan
01)CoalesceBatchesExec: target_batch_size=3, fetch=2
02)--HashJoinExec: mode=CollectLeft, join_type=Full, on=[(c1@0, c1@0)]
03)----MemoryExec: partitions=1, partition_sizes=[1]
04)----MemoryExec: partitions=1, partition_sizes=[1]
03)----GlobalLimitExec: skip=0, fetch=2
Contributor Author

Why this change?

According to #12963, the limit is pushed down to both sides of a full join, so the limit is already applied on both sides.

Contributor

I don't think it is correct to push a limit below a Full join -- a Full join will create null values to match any missing rows 🤔 So even if you limited both sides, you'll still get rows out of there that shouldn't be ...
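This concern can be made concrete with a small standalone sketch (plain Rust, not DataFusion code): a naive full outer join over integer keys shows that limiting an input before a full join can fabricate a null-padded row that the unoptimized query could never produce.

```rust
// Naive full outer join over integer keys. Unmatched rows from either
// side are emitted with a None (NULL) on the other side.
fn full_join(left: &[i32], right: &[i32]) -> Vec<(Option<i32>, Option<i32>)> {
    let mut out = Vec::new();
    let mut right_matched = vec![false; right.len()];
    for &l in left {
        let mut matched = false;
        for (i, &r) in right.iter().enumerate() {
            if l == r {
                out.push((Some(l), Some(r)));
                matched = true;
                right_matched[i] = true;
            }
        }
        if !matched {
            out.push((Some(l), None)); // null-padded left row
        }
    }
    for (i, &r) in right.iter().enumerate() {
        if !right_matched[i] {
            out.push((None, Some(r))); // null-padded right row
        }
    }
    out
}

fn main() {
    // Full result of [1,2,3] FULL JOIN [3] is
    // [(1,NULL), (2,NULL), (3,3)]; (NULL,3) is never a valid output row,
    // so `LIMIT 2` may return any 2 of those rows but never (NULL,3).
    let full = full_join(&[1, 2, 3], &[3]);
    assert!(!full.contains(&(None, Some(3))));

    // Pushing LIMIT 2 into the left input drops the matching key 3, so the
    // join now fabricates (NULL,3) -- a row the unoptimized plan could
    // never produce.
    let pushed = full_join(&[1, 2], &[3]);
    assert!(pushed.contains(&(None, Some(3))));
}
```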

Contributor Author
@zhuqi-lucas zhuqi-lucas Jan 25, 2025

I don't think it is correct to push a limit below a Full join -- a Full join will create null values to match any missing rows 🤔 So even if you limited both sides, you'll still get rows out of there that shouldn't be ...

Thank you @alamb for the review, this is a good point. This PR does not change anything about the full join limit.

The logical plan has already pushed the full join limit down since #12963.

For example, for the following query:

## Test !join.on.is_empty() && join.filter.is_none()
query TT
EXPLAIN SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 2;
----
logical_plan
01)Limit: skip=0, fetch=2
02)--Full Join: t0.c1 = t1.c1
03)----Limit: skip=0, fetch=2
04)------TableScan: t0 projection=[c1, c2], fetch=2
05)----Limit: skip=0, fetch=2
06)------TableScan: t1 projection=[c1, c2, c3], fetch=2

The physical plan will also apply the limit pushdown, but without this PR the child limit was overridden by the parent limit, so the limit did not appear in the physical plan before.

I suggest we create a follow-up issue to discuss whether we need to revert or change the full join limit pushdown from #12963.

What's your opinion? Thanks a lot!

Contributor

If the current datafusion code pushes limits down through FULL OUTER JOINs I agree we should file a bug and fix it.

Contributor Author
@zhuqi-lucas zhuqi-lucas Jan 26, 2025

Hi @alamb:
I looked into the limit pushdown code for joins:

/// Adds a limit to the inputs of a join, if possible
fn push_down_join(mut join: Join, limit: usize) -> Transformed<Join> {
    use JoinType::*;

    fn is_no_join_condition(join: &Join) -> bool {
        join.on.is_empty() && join.filter.is_none()
    }

    let (left_limit, right_limit) = if is_no_join_condition(&join) {
        match join.join_type {
            Left | Right | Full | Inner => (Some(limit), Some(limit)),
            LeftAnti | LeftSemi | LeftMark => (Some(limit), None),
            RightAnti | RightSemi => (None, Some(limit)),
        }
    } else {
        match join.join_type {
            Left => (Some(limit), None),
            Right => (None, Some(limit)),
            Full => (Some(limit), Some(limit)),
            _ => (None, None),
        }
    };

    if left_limit.is_none() && right_limit.is_none() {
        return Transformed::no(join);
    }
    if let Some(limit) = left_limit {
        join.left = make_arc_limit(0, limit, join.left);
    }
    if let Some(limit) = right_limit {
        join.right = make_arc_limit(0, limit, join.right);
    }
    Transformed::yes(join)
}

I think it's safe if we just want to limit the number of result rows; for this optimization we don't care which rows are returned.

If we want the limited result to come from exactly the same rows as the unlimited query, I think we need to remove those join pushdown optimizations?

Thanks a lot!
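For easier experimentation, the decision table inside `push_down_join` above can be reproduced as a standalone sketch (`JoinType` here is a local stand-in for DataFusion's enum, and `limits_for` is a hypothetical helper, not real DataFusion API), including the `Full => (Some(limit), Some(limit))` arm that the review flags as suspicious:

```rust
// Local stand-in for DataFusion's JoinType enum.
enum JoinType {
    Inner, Left, Right, Full,
    LeftAnti, LeftSemi, LeftMark,
    RightAnti, RightSemi,
}

// Mirrors push_down_join's limit assignment: returns the
// (left_limit, right_limit) that would be pushed into the join inputs.
fn limits_for(join_type: JoinType, has_condition: bool, limit: usize) -> (Option<usize>, Option<usize>) {
    use JoinType::*;
    if !has_condition {
        // No ON clause and no filter: a limit on either side bounds the output.
        match join_type {
            Left | Right | Full | Inner => (Some(limit), Some(limit)),
            LeftAnti | LeftSemi | LeftMark => (Some(limit), None),
            RightAnti | RightSemi => (None, Some(limit)),
        }
    } else {
        match join_type {
            Left => (Some(limit), None),
            Right => (None, Some(limit)),
            // The arm flagged as suspicious in review: a FULL join with a
            // join condition still pushes the limit to both sides.
            Full => (Some(limit), Some(limit)),
            _ => (None, None),
        }
    }
}

fn main() {
    // Inner join with a condition: no pushdown at all.
    assert_eq!(limits_for(JoinType::Inner, true, 2), (None, None));
    // Full join with a condition: pushed to both sides (the suspect case).
    assert_eq!(limits_for(JoinType::Full, true, 2), (Some(2), Some(2)));
    // Left join with a condition: only the left side is limited.
    assert_eq!(limits_for(JoinType::Left, true, 2), (Some(2), None));
}
```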

Contributor

I think this line looks suspicious.

            Full => (Some(limit), Some(limit)),

I will try to file a bug tomorrow. BTW, I don't think this was introduced in your PR.

I have also marked this PR as required for the 45 release.

Contributor Author

Thanks a lot @alamb!

Contributor

I am pretty sure this is a bug (not introduced by this PR):

I am now reviewing the rest of this PR more carefully

Thank you for your patience @zhuqi-lucas

@@ -2217,11 +2217,6 @@ async fn write_parquet_with_order() -> Result<()> {
let df = ctx.sql("SELECT * FROM data").await?;
let results = df.collect().await?;

let df_explain = ctx.sql("explain SELECT a FROM data").await?;
Contributor Author

Nit: removed the debug print output.

Contributor
@alamb alamb left a comment

Thank you @zhuqi-lucas -- I looked at this PR carefully and it looks good to me -- both code and tests ❤️ 🦾

Thank you for the fix

@@ -4264,8 +4266,10 @@ logical_plan
physical_plan
01)GlobalLimitExec: skip=0, fetch=2
02)--NestedLoopJoinExec: join_type=Full, filter=c2@0 >= c2@1
03)----MemoryExec: partitions=1, partition_sizes=[1]
04)----MemoryExec: partitions=1, partition_sizes=[1]
03)----GlobalLimitExec: skip=0, fetch=2
Contributor

This plan is incorrect due to

(not this PR)

03)----SubqueryAlias: t1
04)------Limit: skip=0, fetch=10
05)--------TableScan: testsubquerylimit projection=[a, b], fetch=10
06)----Limit: skip=0, fetch=1
Contributor

This plan looks good to me -- the Limit 1 is still here.

global_state.satisfied = true;
// If the plan's children have limit, we shouldn't change the global state to true,
// because the children limit will be overridden if the global state is changed.
if pushdown_plan.children().iter().any(|child| {
Contributor

It looks to me like most of the rest of this rule is implemented in terms of LimitExec (defined in this file) to abstract away directly looking for GlobalLimitExec and LocalLimitExec

I think you could do something like this instead to be more consistent with the rest of the codebase

            if pushdown_plan.children().iter().any(|&child| {
                extract_limit(child).is_some()
            }) {
                global_state.satisfied = false;
            }

However, this works too and fixes the bug so I think it is fine as is.
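As a standalone illustration of that suggestion, here is a hedged sketch. `PlanNode` and this `extract_limit` are illustrative stand-ins, not DataFusion's real types; the point is only to show how checking the children through an `extract_limit`-style helper could gate the `satisfied` flag:

```rust
// Minimal stand-in for a physical plan node tree.
enum PlanNode {
    GlobalLimit { skip: usize, fetch: Option<usize>, child: Box<PlanNode> },
    LocalLimit { fetch: usize, child: Box<PlanNode> },
    MemoryScan,
}

// Returns the fetch count if the node is any kind of limit node,
// abstracting over the concrete global/local limit variants.
fn extract_limit(node: &PlanNode) -> Option<usize> {
    match node {
        PlanNode::GlobalLimit { fetch, .. } => *fetch,
        PlanNode::LocalLimit { fetch, .. } => Some(*fetch),
        PlanNode::MemoryScan => None,
    }
}

fn main() {
    // A plan whose children include a limit node and a plain scan.
    let children = [
        PlanNode::GlobalLimit {
            skip: 0,
            fetch: Some(2),
            child: Box::new(PlanNode::MemoryScan),
        },
        PlanNode::MemoryScan,
    ];

    // The fix's idea: if any child already carries a limit, do not mark the
    // global state satisfied, or the child's limit would be overridden.
    let any_child_limited = children.iter().any(|c| extract_limit(c).is_some());
    assert!(any_child_limited);
}
```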

Contributor Author

I think it's a good suggestion and it makes the code clearer; addressed in the latest revision. Thanks @alamb!

@@ -247,7 +246,15 @@ pub fn pushdown_limit_helper(
}
} else {
// Add fetch or a `LimitExec`:
global_state.satisfied = true;
// If the plan's children have limit, we shouldn't change the global state to true,
Member

If the children's limit is >= the global limit, can we push down the limit?

Contributor Author

Good question @xudong963. I added slt tests for the case where the children's limit is >= the global limit; the limit should still be pushed down, consistent with the current behaviour. Thanks!

@zhuqi-lucas zhuqi-lucas requested a review from xudong963 January 29, 2025 09:14
Comment on lines 252 to 258
if pushdown_plan
.children()
.iter()
.any(|&child| extract_limit(child).is_some())
{
global_state.satisfied = false;
}
Member

I think the logic is strange: if we reach the else branch (line 248), it means global_state.satisfied == false, and then here (line 257) we set global_state.satisfied = false again.

Contributor Author

Got it @xudong963. The previous logic would always set it to true when global_state.satisfied == false; the new logic keeps it false for some cases. I changed the logic to make this clearer: we only set it to true when we are not in the case above.
Thanks!

@zhuqi-lucas zhuqi-lucas requested a review from xudong963 January 29, 2025 15:21
@@ -248,7 +247,15 @@ pub fn pushdown_limit_helper(
}
} else {
// Add fetch or a `LimitExec`:
global_state.satisfied = true;
Contributor Author

Here is the original logic for setting it back to true.

@alamb
Contributor

alamb commented Jan 29, 2025

Thanks again @zhuqi-lucas and @xudong963 -- this PR took a while but I think things are good in the end

@alamb alamb merged commit 2510e34 into apache:main Jan 29, 2025
25 checks passed
Labels
core Core DataFusion crate optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)

Successfully merging this pull request may close these issues.

LimitPushdown rule incorrectly removes some GlobalLimitExec
3 participants