-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add recursion limit configuration to DFParser
#14095
Conversation
Signed-off-by: Tai Le Manh <[email protected]>
Thank you, this fix makes sense to me. I think we can add the bug reproducer to |
8d170b4
to
3e00a75
Compare
3e00a75
to
c512c76
Compare
@@ -52,6 +52,7 @@ const SQLITE_PREFIX: &str = "sqlite"; | |||
pub fn main() -> Result<()> { | |||
tokio::runtime::Builder::new_multi_thread() | |||
.enable_all() | |||
.thread_stack_size(4 * 1024 * 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to adjust this config to avoid error thread 'tokio-runtime-worker' has overflowed its stack
of sqllogictest
process since we change the with_recursion_limit
config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am worried about this. We have recently added the the recursive
protection feature precisely to avoid stack overflows (ever) but now after this change the stacks can overflow.
Here is the stack where it overflows (precisely what the Recursion limit is designed to prevent:
<sqlparser::tokenizer::Token as core::clone::Clone>::clone tokenizer.rs:52
<sqlparser::tokenizer::TokenWithSpan as core::clone::Clone>::clone tokenizer.rs:633
core::option::Option<&T>::cloned option.rs:1917
sqlparser::parser::Parser::next_token mod.rs:3443
sqlparser::parser::Parser::parse_data_type_helper mod.rs:8097
sqlparser::parser::Parser::parse_data_type mod.rs:8083
sqlparser::parser::Parser::parse_prefix::{{closure}} mod.rs:1243
sqlparser::parser::Parser::try_parse mod.rs:3808
sqlparser::parser::Parser::maybe_parse mod.rs:3795
sqlparser::parser::Parser::parse_prefix mod.rs:1242
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
...
std::panicking::try::do_call panicking.rs:557
__rust_try 0x000000010761f11c
[Inlined] std::panicking::try panicking.rs:520
[Inlined] std::panic::catch_unwind panic.rs:358
std::thread::Builder::spawn_unchecked_::{{closure}} mod.rs:559
core::ops::function::FnOnce::call_once{{vtable.shim}} function.rs:250
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:1972
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:1972
std::sys::pal::unix::thread::Thread::new::thread_start thread.rs:105
_pthread_start 0x00000001940832e4
Note @blaginin recently added the recursive
feature to sql parser (but it isn't yet released) which I think will allow this query to run without having to update the stack size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, very clear. So, I will close this PR and wait for the new release of sqlparser-rs
. Tthanks for the review @alamb @2010YOUY01
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for working on this @tlm365 -- however I worry that this change exposes DataFusion users to stack overflows 🤔
It seems like a new version of sqlparser should be able to fix that. Maybe once that fix is released we can make this change without causing stack overflows
Speaking if sqlparser-rs releases, it seems it is (past) time to make a new one:
@@ -257,6 +257,9 @@ fn ensure_not_set<T>(field: &Option<T>, name: &str) -> Result<(), ParserError> { | |||
Ok(()) | |||
} | |||
|
|||
// By default, allow expressions up to this deep before error | |||
const DEFAULT_REMAINING_DEPTH: usize = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked that the default in sqlparser is 50 so this is 2x the size https://docs.rs/sqlparser/latest/src/sqlparser/parser/mod.rs.html#187
I am a little worried about this as it is a hard coded limit but I think we can always make it a config flag later
@@ -257,6 +257,9 @@ fn ensure_not_set<T>(field: &Option<T>, name: &str) -> Result<(), ParserError> { | |||
Ok(()) | |||
} | |||
|
|||
// By default, allow expressions up to this deep before error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can say this is 2x the default limit of sqlparser so it is clear this is meant to increase the limit
// By default, allow expressions up to this deep before error | |
// By default, allow expressions up to this deep before error | |
// (2x the default in sqlparser) |
@@ -52,6 +52,7 @@ const SQLITE_PREFIX: &str = "sqlite"; | |||
pub fn main() -> Result<()> { | |||
tokio::runtime::Builder::new_multi_thread() | |||
.enable_all() | |||
.thread_stack_size(4 * 1024 * 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am worried about this. We have recently added the the recursive
protection feature precisely to avoid stack overflows (ever) but now after this change the stacks can overflow.
Here is the stack where it overflows (precisely what the Recursion limit is designed to prevent:
<sqlparser::tokenizer::Token as core::clone::Clone>::clone tokenizer.rs:52
<sqlparser::tokenizer::TokenWithSpan as core::clone::Clone>::clone tokenizer.rs:633
core::option::Option<&T>::cloned option.rs:1917
sqlparser::parser::Parser::next_token mod.rs:3443
sqlparser::parser::Parser::parse_data_type_helper mod.rs:8097
sqlparser::parser::Parser::parse_data_type mod.rs:8083
sqlparser::parser::Parser::parse_prefix::{{closure}} mod.rs:1243
sqlparser::parser::Parser::try_parse mod.rs:3808
sqlparser::parser::Parser::maybe_parse mod.rs:3795
sqlparser::parser::Parser::parse_prefix mod.rs:1242
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
sqlparser::parser::Parser::parse_prefix mod.rs:1381
sqlparser::parser::Parser::parse_subexpr mod.rs:964
sqlparser::parser::Parser::parse_infix mod.rs:2928
sqlparser::parser::Parser::parse_subexpr mod.rs:974
sqlparser::parser::Parser::parse_expr mod.rs:957
core::ops::function::FnMut::call_mut function.rs:166
sqlparser::parser::Parser::parse_comma_separated_with_trailing_commas mod.rs:3730
sqlparser::parser::Parser::parse_comma_separated mod.rs:3715
...
std::panicking::try::do_call panicking.rs:557
__rust_try 0x000000010761f11c
[Inlined] std::panicking::try panicking.rs:520
[Inlined] std::panic::catch_unwind panic.rs:358
std::thread::Builder::spawn_unchecked_::{{closure}} mod.rs:559
core::ops::function::FnOnce::call_once{{vtable.shim}} function.rs:250
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:1972
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:1972
std::sys::pal::unix::thread::Thread::new::thread_start thread.rs:105
_pthread_start 0x00000001940832e4
Note @blaginin recently added the recursive
feature to sql parser (but it isn't yet released) which I think will allow this query to run without having to update the stack size
To be clear I think we will still need to increase the recursion limit in DataFusion to allow the sqllogictest to pass. But until the next sqlparser upgrade I don't we can do so safely |
Which issue does this PR close?
Closes #14091.
Rationale for this change
What changes are included in this PR?
Set parser config, now support up to 100 expressions.
Are these changes tested?
Yes, by
datafusion-testing
module.Are there any user-facing changes?
No.