Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

tustvold · 2022-12-14T19:03:13Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Related to #4627, the current representation of LogicalPlan contains Arc<LogicalPlan> at various points, whilst this does reduce the cost of copying a LogicalPlan tree, it:

Complicates rewrites by necessitating clones
Results in double-boxing - e.g. Vec<Arc<LogicalPlan>>
Permits cycles and all the excitement that would entail
Marginal overhead from additional atomics
Unidiomatic is perhaps too strong, but it is strange for a tree datastructure to have shared ownership

Describe the solution you'd like

I would like to remove the Arc, replacing with Box where necessary. Methods that currently take Arc<LogicalPlan> should be updated to take LogicalPlan.

Describe alternatives you've considered

Additional context

This likely wants to wait until we are cloning LogicalPlan less frequently

The text was updated successfully, but these errors were encountered:

alamb · 2022-12-14T20:18:13Z

I believe @mingmwang and @jackwener have noted this in the past -- the idea is good to me

jackwener · 2022-12-15T00:25:13Z

Hope to this change😍, it is very meaningful to optimizer.

jackwener · 2022-12-15T03:09:42Z

If we can remove Arc<LogicalPlan> and use LogicalPlan, wen can use pattern-match to match subtree pattern like:

match projection-filter

LogicalPlan::Projection(Projection{LogicalPlan::Filter, ..})

like this:

mslapek · 2023-03-04T15:16:20Z

How would OptimizerRule look after the change?

Currently we have:

/// Try and rewrite `plan` to an optimized form, returning None if the plan cannot be
/// optimized by this rule.
fn try_optimize(
    &self,
    plan: &LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<Option<LogicalPlan>>;

With the current try_optimize signature, lack of Arc<LogicalPlan> would require to clone whole non-optimised subbranches.

alamb · 2023-03-04T16:35:57Z

How would OptimizerRule look after the change?

Maybe we could change the signature to something like

enum OptimizedPlan {
  // Optimizer did not make any changes to the original input pla
  NoChange(LogicalPlan),
  /// Optimizer rewrote the original plan
  Rewritten(LogicalPlan),
}


/// Try and rewrite `plan` to an optimized form
fn try_optimize(
    &self,
    plan: &LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<OptimizedPlan>;

tustvold · 2023-03-04T17:35:05Z

I think you would want to make try_optimize take ownership not a borrow, and then return it

alamb · 2023-03-04T21:38:06Z

I think you would want to make try_optimize take ownership not a borrow, and then return it

I think there are cases (like deciding when a fixed point is reached) where the caller wants to distinguish between no more optimization and a new plan.

However, now that LogicalPlan supports PartialEq I think maybe we could just use that to check if any changes were made

https://github.com/apache/arrow-datafusion/blob/c37ddf72ec539bd39cce0dd4ff38db2e36ddb55f/datafusion/expr/src/logical_plan/plan.rs#L52

So the signature maybe could be

/// Try and rewrite `plan` to an optimized form
fn try_optimize(
    &self,
    plan: LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<LogicalPlan>;

🤔

mslapek · 2023-03-05T13:11:41Z

To compare a new logical plan with the old one, we need to have both plans in the memory.

Without Arc<...>, it requires to have always doubled RAM usage for plans. With Arc<...> a slightly less? 🤔

tustvold · 2023-03-05T13:16:15Z

However, now that LogicalPlan supports PartialEq I think maybe we could just use that to check if any changes were made

This is not a particularly cheap operation, involving a lot of string comparisons. I think having the return value indicate if changes have been made as in your original example makes sense to me.

My point was if we remove the Arc we need to be careful to move LogicalPlan and avoid cloning, as any clone is then a deep clone of the entire tree. We therefore need to pass in an owned value so that it can be moved into the return type

mslapek · 2023-03-15T14:26:52Z

I think having the return value indicate if changes have been made as in your original example makes sense to me.

At the first sight the idea seems to be compelling...

Nevertheless I'm quite pessimistic about the idea. 😵‍💫

The main reason are optimisations, which UNDO other optimisations!

(sorry for the long post! but it's core architecture stuff...)

Examples of cancelling optimisations

Example 1. Commutative optimisations

Some operations are commutative, like projection or limit - it is used in both PushDownProjection and PushDownLimit - but in a different directions (grep commute in these files).

So for some plans they will always give Some(new_plan). 😕

Example 2. Undoing inside of an optimisation

Inside PushDownLimit we have:

Limit A -> Projection -> Limit B
Added a limit after the projection: Limit A -> Projection -> Limit C -> Limit B
Merge of the Limit C and Limit B in ANOTHER invocation of try_optimize: Limit A -> Projection -> Limit B

We have many invocations of try_optimize due to apply_order option.

The point is that for a fixed point (merge of Limit C + Limit B is Limit B) PushDownLimit will always yield Some(...).

Theoretical pain

Let's look at the issue formally... 😎

Consider small changes Δ_1, Δ_2, ... Δ_n - some signed integers.

The total change is:

Δ_all = Δ_1 + Δ_2 + ... + Δ_n

So Δ_all ≠ 0 implies Δ_i ≠ 0 for some i.

The premise of Option<Plan> is that the implication holds in the another direction - which is NOT the case!

Just give Δ_1 = 1, Δ_2 = -1, Δ_3 = 0... (Δ_2 cancels changes from Δ_1!)

Alternatives?

Let's look again at the implication - how can we use it?

Δ_all ≠ 0 implies Δ_i ≠ 0 for some i.

I would consider to use some stats about the tree - as a kind of heuristics. For example: number of nodes in the tree.

Each optimization could return (LogicalPlan, Δ of number of nodes). If sum of all Δ of number of nodes is nonzero - then we know the tree has changed.

I guess there could be other useful stats, like ∑ (node_type * node depth).

alamb · 2023-03-19T12:04:20Z

#5623 from @mslapek is a nice step towards this goal, I think

sadboy · 2023-12-20T17:41:37Z

To add a counter point to this change -- without sub-tree sharing, and in the presence of CTEs, LogicalPlan trees would be exponential in the size of the SQL query:

WITH
   A as (select 1 x),
   B as (select * from A, A)
   C as (select * from B, B)
   D as (select * from C, C)
select * from D;

tustvold · 2023-12-20T17:57:37Z

without sub-tree sharing

Is this something that is actually practicable? I would have thought the optimizer would simply ruin any effort to do this?

sadboy · 2023-12-20T18:32:26Z

I'm not familiar with the current optimizer implementation details, but this is a problem that manifests way before the optimizer comes into play -- if we take away sub-tree sharing in LogicalPlan, then the SQL compiler would be forced to generate exponential trees right from the start. Whereas in the current setup, (properly) generated LP trees would always be linear in the size of the input query, and if it blows up in some later optimizer stage, I assume it shouldn't be too hard to optimize the optimizer.

tustvold · 2023-12-20T18:58:18Z

Is this a problem? Is the memory usage of the plan representation a concern? This feels like a relatively niche optimisation for plans with repeated CTEs, that may perhaps be a touch premature? I would be very surprised if the optimizer won't blow this away when it rewrites the plans anyway.

sadboy · 2023-12-20T19:43:48Z

Yes this is a very real problem. We see this kind of pattern in production warehouse queries fairly often. They're usually the result of some automated query composition, and can get quite big by themselves. Tacking on an exponential factor on top means the system will be completely unusable (i.e. upwards of an hour just to compile one query, without even invoking the optimizer).

It's not just about the memory footprint -- if your datastructure itself is exponential then that's basically your lower bound for performance, as a simple operation like clone() would take exponential time. In general, exponential blow ups in production systems are deal breakers IMO, and removing them should not be considered premature optimization.

All that is to say, if you plan to remove Arc<LogicalPlan> (which I'm neutral), then you'll have to replace it with some other mechanism for common subtree sharing.

tustvold · 2023-12-20T20:16:56Z

👍 I agree we should not regress any extant functionality in this space. That being said Arc is probably a poor way to go about sub-tree sharing, if it is used for this at all, as shared mutation is not possible. Some sort of mutable interner would likely be a better approach, and would facilitate optimising the given plan only once, as opposed to for every appearance

alamb · 2023-12-20T21:04:09Z

if we take away sub-tree sharing in LogicalPlan, then the SQL compiler would be forced to generate exponential trees right from the start. Whereas in the current setup, (properly) generated LP trees would always be linear in the size of the input query, and if it blows up in some later optimizer stage, I assume it shouldn't be too hard to optimize the optimizer.

For what it is worth, @mustafasrepo and I are working on something similar #8582 (in the physical plans now, any CTEs used more than once will be expanded out and the results not shared)

jayzhan211 · 2024-07-09T14:09:21Z

Note about clone to remove

ScalarSubqueryToJoin

datafusion/datafusion/optimizer/src/scalar_subquery_to_join.rs

Line 153 in e65c3e9

let mut cur_input = projection.input.as_ref().clone();

tustvold added the enhancement New feature or request label Dec 14, 2022

tustvold self-assigned this Dec 14, 2022

mslapek mentioned this issue Mar 16, 2023

Add LogicalPlanSignature and use in the optimizer loop #5623

Merged

alamb changed the title ~~Remove Arc<LogicalPlan> from LogicalPlan~~ Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans Mar 19, 2023

alamb mentioned this issue Mar 19, 2023

[Epic] A collection of issues to improve planning performance / speed / efficiency #5637

Open

15 tasks

doki23 mentioned this issue Oct 9, 2023

reduce clones of LogicalPlan in planner #7775

Closed

alamb mentioned this issue Mar 16, 2024

[EPIC] Stop copying LogicalPlan during OptimizerPasses #9637

Closed

31 tasks

This was referenced Jul 7, 2024

Minor: Remove clone in optimizer #11315

Merged

Preserve LogicalPlan to avoid clone for analyzer #11316

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

tustvold commented Dec 14, 2022

alamb commented Dec 14, 2022

jackwener commented Dec 15, 2022 •

edited

Loading

jackwener commented Dec 15, 2022 •

edited

Loading

mslapek commented Mar 4, 2023

alamb commented Mar 4, 2023 •

edited

Loading

tustvold commented Mar 4, 2023

alamb commented Mar 4, 2023

mslapek commented Mar 5, 2023

tustvold commented Mar 5, 2023 •

edited

Loading

mslapek commented Mar 15, 2023

alamb commented Mar 19, 2023

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023 •

edited

Loading

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023 •

edited

Loading

alamb commented Dec 20, 2023

jayzhan211 commented Jul 9, 2024

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

Comments

tustvold commented Dec 14, 2022

alamb commented Dec 14, 2022

jackwener commented Dec 15, 2022 • edited Loading

jackwener commented Dec 15, 2022 • edited Loading

mslapek commented Mar 4, 2023

alamb commented Mar 4, 2023 • edited Loading

tustvold commented Mar 4, 2023

alamb commented Mar 4, 2023

mslapek commented Mar 5, 2023

tustvold commented Mar 5, 2023 • edited Loading

mslapek commented Mar 15, 2023

Examples of cancelling optimisations

Theoretical pain

Alternatives?

alamb commented Mar 19, 2023

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023 • edited Loading

sadboy commented Dec 20, 2023

tustvold commented Dec 20, 2023 • edited Loading

alamb commented Dec 20, 2023

jayzhan211 commented Jul 9, 2024

jackwener commented Dec 15, 2022 •

edited

Loading

jackwener commented Dec 15, 2022 •

edited

Loading

alamb commented Mar 4, 2023 •

edited

Loading

tustvold commented Mar 5, 2023 •

edited

Loading

tustvold commented Dec 20, 2023 •

edited

Loading

tustvold commented Dec 20, 2023 •

edited

Loading