Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

Open
tustvold opened this issue Dec 14, 2022 · 19 comments
Open

Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans #4628

tustvold opened this issue Dec 14, 2022 · 19 comments
Assignees
Labels
enhancement New feature or request

Comments

@tustvold
Copy link
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Related to #4627, the current representation of LogicalPlan contains Arc<LogicalPlan> at various points, whilst this does reduce the cost of copying a LogicalPlan tree, it:

  • Complicates rewrites by necessitating clones
  • Results in double-boxing - e.g. Vec<Arc<LogicalPlan>>
  • Permits cycles and all the excitement that would entail
  • Marginal overhead from additional atomics
  • Unidiomatic is perhaps too strong, but it is strange for a tree datastructure to have shared ownership

Describe the solution you'd like

I would like to remove the Arc, replacing with Box where necessary. Methods that currently take Arc<LogicalPlan> should be updated to take LogicalPlan.

Describe alternatives you've considered

Additional context

This likely wants to wait until we are cloning LogicalPlan less frequently

@tustvold tustvold added the enhancement New feature or request label Dec 14, 2022
@tustvold tustvold self-assigned this Dec 14, 2022
@alamb
Copy link
Contributor

alamb commented Dec 14, 2022

I believe @mingmwang and @jackwener have noted this in the past -- the idea is good to me

@jackwener
Copy link
Member

jackwener commented Dec 15, 2022

Hope to this change😍, it is very meaningful to optimizer.

@jackwener
Copy link
Member

jackwener commented Dec 15, 2022

If we can remove Arc<LogicalPlan> and use LogicalPlan, wen can use pattern-match to match subtree pattern like:

match projection-filter

LogicalPlan::Projection(Projection{LogicalPlan::Filter, ..})

like this:

image

@mslapek
Copy link
Contributor

mslapek commented Mar 4, 2023

How would OptimizerRule look after the change?

Currently we have:

/// Try and rewrite `plan` to an optimized form, returning None if the plan cannot be
/// optimized by this rule.
fn try_optimize(
    &self,
    plan: &LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<Option<LogicalPlan>>;

With the current try_optimize signature, lack of Arc<LogicalPlan> would require to clone whole non-optimised subbranches.

@alamb
Copy link
Contributor

alamb commented Mar 4, 2023

How would OptimizerRule look after the change?

Maybe we could change the signature to something like

enum OptimizedPlan {
  // Optimizer did not make any changes to the original input pla
  NoChange(LogicalPlan),
  /// Optimizer rewrote the original plan
  Rewritten(LogicalPlan),
}


/// Try and rewrite `plan` to an optimized form
fn try_optimize(
    &self,
    plan: &LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<OptimizedPlan>;

@tustvold
Copy link
Contributor Author

tustvold commented Mar 4, 2023

I think you would want to make try_optimize take ownership not a borrow, and then return it

@alamb
Copy link
Contributor

alamb commented Mar 4, 2023

I think you would want to make try_optimize take ownership not a borrow, and then return it

I think there are cases (like deciding when a fixed point is reached) where the caller wants to distinguish between no more optimization and a new plan.

However, now that LogicalPlan supports PartialEq I think maybe we could just use that to check if any changes were made

https://github.com/apache/arrow-datafusion/blob/c37ddf72ec539bd39cce0dd4ff38db2e36ddb55f/datafusion/expr/src/logical_plan/plan.rs#L52

So the signature maybe could be

/// Try and rewrite `plan` to an optimized form
fn try_optimize(
    &self,
    plan: LogicalPlan,
    config: &dyn OptimizerConfig,
) -> Result<LogicalPlan>;

🤔

@mslapek
Copy link
Contributor

mslapek commented Mar 5, 2023

To compare a new logical plan with the old one, we need to have both plans in the memory.

Without Arc<...>, it requires to have always doubled RAM usage for plans. With Arc<...> a slightly less? 🤔

@tustvold
Copy link
Contributor Author

tustvold commented Mar 5, 2023

However, now that LogicalPlan supports PartialEq I think maybe we could just use that to check if any changes were made

This is not a particularly cheap operation, involving a lot of string comparisons. I think having the return value indicate if changes have been made as in your original example makes sense to me.

My point was if we remove the Arc we need to be careful to move LogicalPlan and avoid cloning, as any clone is then a deep clone of the entire tree. We therefore need to pass in an owned value so that it can be moved into the return type

@mslapek
Copy link
Contributor

mslapek commented Mar 15, 2023

I think having the return value indicate if changes have been made as in your original example makes sense to me.

At the first sight the idea seems to be compelling...

Nevertheless I'm quite pessimistic about the idea. 😵‍💫

The main reason are optimisations, which UNDO other optimisations!

(sorry for the long post! but it's core architecture stuff...)


Examples of cancelling optimisations

Example 1. Commutative optimisations

Some operations are commutative, like projection or limit - it is used in both PushDownProjection and PushDownLimit - but in a different directions (grep commute in these files).

So for some plans they will always give Some(new_plan). 😕

Example 2. Undoing inside of an optimisation

Inside PushDownLimit we have:

  1. Limit A -> Projection -> Limit B
  2. Added a limit after the projection: Limit A -> Projection -> Limit C -> Limit B
  3. Merge of the Limit C and Limit B in ANOTHER invocation of try_optimize: Limit A -> Projection -> Limit B

We have many invocations of try_optimize due to apply_order option.

The point is that for a fixed point (merge of Limit C + Limit B is Limit B) PushDownLimit will always yield Some(...).


Theoretical pain

Let's look at the issue formally... 😎

Consider small changes Δ_1, Δ_2, ... Δ_n - some signed integers.

The total change is:

Δ_all = Δ_1 + Δ_2 + ... + Δ_n

So Δ_all ≠ 0   implies   Δ_i ≠ 0 for some i.

The premise of Option<Plan> is that the implication holds in the another direction - which is NOT the case!

Just give Δ_1 = 1, Δ_2 = -1, Δ_3 = 0... (Δ_2 cancels changes from Δ_1!)

Alternatives?

Let's look again at the implication - how can we use it?

Δ_all ≠ 0   implies   Δ_i ≠ 0 for some i.

I would consider to use some stats about the tree - as a kind of heuristics. For example: number of nodes in the tree.

Each optimization could return (LogicalPlan, Δ of number of nodes). If sum of all Δ of number of nodes is nonzero - then we know the tree has changed.

I guess there could be other useful stats, like ∑ (node_type * node depth).

@alamb alamb changed the title Remove Arc<LogicalPlan> from LogicalPlan Remove Arc<LogicalPlan> from LogicalPlan, stop copying LogicalPlans Mar 19, 2023
@alamb
Copy link
Contributor

alamb commented Mar 19, 2023

#5623 from @mslapek is a nice step towards this goal, I think

@sadboy
Copy link
Contributor

sadboy commented Dec 20, 2023

To add a counter point to this change -- without sub-tree sharing, and in the presence of CTEs, LogicalPlan trees would be exponential in the size of the SQL query:

WITH
   A as (select 1 x),
   B as (select * from A, A)
   C as (select * from B, B)
   D as (select * from C, C)
select * from D;

@tustvold
Copy link
Contributor Author

without sub-tree sharing

Is this something that is actually practicable? I would have thought the optimizer would simply ruin any effort to do this?

@sadboy
Copy link
Contributor

sadboy commented Dec 20, 2023

I'm not familiar with the current optimizer implementation details, but this is a problem that manifests way before the optimizer comes into play -- if we take away sub-tree sharing in LogicalPlan, then the SQL compiler would be forced to generate exponential trees right from the start. Whereas in the current setup, (properly) generated LP trees would always be linear in the size of the input query, and if it blows up in some later optimizer stage, I assume it shouldn't be too hard to optimize the optimizer.

@tustvold
Copy link
Contributor Author

tustvold commented Dec 20, 2023

Is this a problem? Is the memory usage of the plan representation a concern? This feels like a relatively niche optimisation for plans with repeated CTEs, that may perhaps be a touch premature? I would be very surprised if the optimizer won't blow this away when it rewrites the plans anyway.

@sadboy
Copy link
Contributor

sadboy commented Dec 20, 2023

Yes this is a very real problem. We see this kind of pattern in production warehouse queries fairly often. They're usually the result of some automated query composition, and can get quite big by themselves. Tacking on an exponential factor on top means the system will be completely unusable (i.e. upwards of an hour just to compile one query, without even invoking the optimizer).

It's not just about the memory footprint -- if your datastructure itself is exponential then that's basically your lower bound for performance, as a simple operation like clone() would take exponential time. In general, exponential blow ups in production systems are deal breakers IMO, and removing them should not be considered premature optimization.

All that is to say, if you plan to remove Arc<LogicalPlan> (which I'm neutral), then you'll have to replace it with some other mechanism for common subtree sharing.

@tustvold
Copy link
Contributor Author

tustvold commented Dec 20, 2023

👍 I agree we should not regress any extant functionality in this space. That being said Arc is probably a poor way to go about sub-tree sharing, if it is used for this at all, as shared mutation is not possible. Some sort of mutable interner would likely be a better approach, and would facilitate optimising the given plan only once, as opposed to for every appearance

@alamb
Copy link
Contributor

alamb commented Dec 20, 2023

if we take away sub-tree sharing in LogicalPlan, then the SQL compiler would be forced to generate exponential trees right from the start. Whereas in the current setup, (properly) generated LP trees would always be linear in the size of the input query, and if it blows up in some later optimizer stage, I assume it shouldn't be too hard to optimize the optimizer.

For what it is worth, @mustafasrepo and I are working on something similar #8582 (in the physical plans now, any CTEs used more than once will be expanded out and the results not shared)

@jayzhan211
Copy link
Contributor

Note about clone to remove

ScalarSubqueryToJoin

let mut cur_input = projection.input.as_ref().clone();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants