Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve create_initial_plan to get rid of "stack overflow" issues on complex queries #9573

Closed
metesynnada opened this issue Mar 12, 2024 · 1 comment · Fixed by #10023
Closed
Labels
enhancement New feature or request

Comments

@metesynnada
Copy link
Contributor

Is your feature request related to a problem or challenge?

Some of the complex queries #9373, #9375, or like tpcd_64 etc. are facing stack overflow issues.

Through debugging, it appears that the root cause of these stack overflow problems lies in the create_initial_plan function, specifically due to its recursive implementation.

Describe the solution you'd like

I am not sure how to do it, however, create_initial_plan can be rewritten to remove recursive calls.

Describe alternatives you've considered

NA

Additional context

NA

@metesynnada metesynnada added the enhancement New feature or request label Mar 12, 2024
@Jefffrey
Copy link
Contributor

Jefffrey commented Apr 7, 2024

I took at shot at implementing something that might fix this, but one roadblock I am facing is currently this logic for the joins:

https://github.com/apache/arrow-datafusion/blob/1a002bccd420ff91ec149ee1ba9c42061510f906/datafusion/core/src/physical_planner.rs#L975-L989

It seems like this is a special case where join could be mutated before it is then transformed into its physical form, which doesn't sit right with what I'm attempting, and I feel doesn't sit right with the rest of the logic happening in the existing function.

I wonder if we should work on extracting this out first (maybe to the optimizer?) before then proceeding with a refactor of create_initial_plan 🤔

This way create_initial_plan can assume the passed in LogicalPlan is final and simply needs conversion to physical form, and not need to deal with mutating the input plan any more than is required for this 1-1 conversion 🤔

Edit: btw, both issues linked in the body don't seem to fail in create_initial_plan but actually fails in SQL planning for #9375 and analysis for #9373

Edit2: I managed to get around the roadblock, but it is kinda ugly and probably still worth some consideration, in order to keep create_initial_plan simpler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
2 participants