Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Medium Array of Vecs cloned causes stack overflow but doesn't provide any details #111733

Open
teken opened this issue May 18, 2023 · 9 comments
Open
Labels
A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@teken
Copy link

teken commented May 18, 2023

Code

fn main() {
    const EMPTY: Vec<u32> = vec![];
    let mut grid: [Vec<u32>; 2500] = [EMPTY; 2500];

    let grid_clone = grid.clone();
}

Current output

thread 'main' has overflowed its stack
error: process didn't exit successfully: `target\debug\t.exe` (exit code: 0xc00000fd, STATUS_STACK_OVERFLOW)

Desired output

thread 'main' has overflowed its stack
stack overflow caused by clone() on line 5
error: process didn't exit successfully: `target\debug\t.exe` (exit code: 0xc00000fd, STATUS_STACK_OVERFLOW)

Rationale and extra context

No response

Other cases

No response

Anything else?

No response

@teken teken added A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 18, 2023
@teken teken changed the title Large Array of Vecs cloned causes stack overflow but doesn't provide any details Medium Array of Vecs cloned causes stack overflow but doesn't provide any details May 18, 2023
@saethlin
Copy link
Member

saethlin commented May 18, 2023

What is your rustc --version --verbose?

The code as-written doesn't stack overflow on Linux, bumping the buffer size up to 30000 will always stack overflow (on stable), but somewhere around 29000, the program sometimes overflows and sometimes doesn't.

@saethlin
Copy link
Member

The backtrace to the crash:

#6  <signal handler called>
#7  0x000055ef6590d911 in core::array::try_from_fn<core::ops::try_trait::NeverShortCircuit<alloc::vec::Vec<u32, alloc::alloc::Global>>, 50000, core::array::try_from_trusted_iterator::next::{closure_env#0}<core::ops::try_trait::NeverShortCircuit<alloc::vec::Vec<u32, alloc::alloc::Global>>, core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<alloc::vec::Vec<u32, alloc::alloc::Global>>>, fn(alloc::vec::Vec<u32, alloc::alloc::Global>) -> core::ops::try_trait::NeverShortCircuit<alloc::vec::Vec<u32, alloc::alloc::Global>>>>> (cb=...)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/array/mod.rs:97
#8  0x000055ef6590de57 in core::array::try_from_trusted_iterator<alloc::vec::Vec<u32, alloc::alloc::Global>, core::ops::try_trait::NeverShortCircuit<alloc::vec::Vec<u32, alloc::alloc::Global>>, 50000, core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<alloc::vec::Vec<u32, alloc::alloc::Global>>>, fn(alloc::vec::Vec<u32, alloc::alloc::Global>) -> core::ops::try_trait::NeverShortCircuit<alloc::vec::Vec<u32, alloc::alloc::Global>>>> (iter=...)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/array/mod.rs:815
#9  0x000055ef6590e677 in core::array::from_trusted_iterator<alloc::vec::Vec<u32, alloc::alloc::Global>, 50000, core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<alloc::vec::Vec<u32, alloc::alloc::Global>>>> (iter=...)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/array/mod.rs:795
#10 core::array::{impl#21}::clone<alloc::vec::Vec<u32, alloc::alloc::Global>, 50000> (array=0x7fff38cc3330)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/array/mod.rs:416
#11 core::array::{impl#20}::clone<alloc::vec::Vec<u32, alloc::alloc::Global>, 50000> (self=0x7fff38cc3330)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/array/mod.rs:400
#12 scratch::main () at src/main.rs:6

The crash is here (I bumped the array size to 50000):

Dump of assembler code for function _ZN4core5array11try_from_fn17h71bcb574ac43dc29E:
   0x00005606ef6db0a0 <+0>:	mov    %rsp,%r11
   0x00005606ef6db0a3 <+3>:	sub    $0x5b8000,%r11
   0x00005606ef6db0aa <+10>:	sub    $0x1000,%rsp
=> 0x00005606ef6db0b1 <+17>:	movq   $0x0,(%rsp)

I think a large fraction of the problem here is that try_from_fn contains enough stack space for 5 of these arrays:

fn try_from_fn(_1: F) -> <<R as Try>::Residual as Residual<[<R as Try>::Output; N]>>::TryType {
    debug cb => _1;
    let mut _0: <<R as ops::try_trait::Try>::Residual as ops::try_trait::Residual<[<R as ops::try_trait::Try>::Output; N]>>::TryType;
    let mut _2: [mem::maybe_uninit::MaybeUninit<<R as ops::try_trait::Try>::Output>; N];
    let mut _3: ops::control_flow::ControlFlow<<R as ops::try_trait::Try>::Residual>;
    let mut _4: &mut [mem::maybe_uninit::MaybeUninit<<R as ops::try_trait::Try>::Output>];
    let mut _5: &mut [mem::maybe_uninit::MaybeUninit<<R as ops::try_trait::Try>::Output>; N];
    let mut _6: F;
    let mut _7: isize;
    let mut _9: [<R as ops::try_trait::Try>::Output; N];
    let mut _10: [mem::maybe_uninit::MaybeUninit<<R as ops::try_trait::Try>::Output>; N];
    let mut _11: bool;
    let mut _12: bool;

I suspect the rest of the memory overhead comes from the fact that this clone() call returns the cloned array through many layers of abstraction that don't get optimized away. Every stack frame along the way probably needs space for the array.

I don't know how to solve this well. But I feel like it should be possible to write a Clone implementation for arrays that doesn't need so many locals that are the size of the array.

@saethlin saethlin added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 18, 2023
@scottmcm
Copy link
Member

I don't know how to solve this well. But I feel like it should be possible to write a Clone implementation for arrays that doesn't need so many locals that are the size of the array.

I did make this better back in 52df055#diff-b83fad8dc170d433a4e4aaa27de41f508c1ec8e854326fe815d779e70338080a, but it still needs things like #111551 before it'll be properly better -- see specifically https://github.com/rust-lang/rust/pull/111551/files#diff-b83fad8dc170d433a4e4aaa27de41f508c1ec8e854326fe815d779e70338080a showing that letting LLVM get rid of another array-sized temporary.

See also the general issues #91521 and #79914

@saethlin
Copy link
Member

So IMO the fact that this crashes at all is already covered better elsewhere, and this is really just a diagnostic request.

@rustbot label -T-libs

@rustbot rustbot removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 18, 2023
@scottmcm
Copy link
Member

scottmcm commented May 18, 2023

I don't know how to solve this well. But I feel like it should be possible to write a Clone implementation for arrays that doesn't need so many locals that are the size of the array.

One problem here is that it needs to return it. At the ABI level, for an array this long that'll be a pointer, which is great, but there's no way in the Rust code to get that pointer, so there's no way for the Rust code to write to it directly.

Thus getting this to not use extra stack space fundamentally depends on either

  • an optimization sufficiently smart to let the code building up its local [MaybeUninit<T>; N] and then doing return transmute(that_local) compile into code that's writing to the return place directly, or
  • some language trick to get access to the return place directly (probably as a &mut MaybeUninit<T>?) and then return the value already in there -- custom MIR could do it, but we probably don't want that in core.

@saethlin
Copy link
Member

Sure. Clone needs to return the array once, but the stack exhaustion cited here occurs because we return it many times. One extra temporary might be in the range of expectation, but currently we have about 10.

@teken
Copy link
Author

teken commented May 19, 2023

What is your rustc --version --verbose?

The code as-written doesn't stack overflow on Linux, bumping the buffer size up to 30000 will always stack overflow (on stable), but somewhere around 29000, the program sometimes overflows and sometimes doesn't.

rustc 1.68.2 (9eb3afe9e 2023-03-27)
binary: rustc
commit-hash: 9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0
commit-date: 2023-03-27
host: x86_64-pc-windows-msvc
release: 1.68.2
LLVM version: 15.0.6

I'm on windows

@scottmcm
Copy link
Member

release: 1.68.2

#107634 landed for 1.69, so definitely check if it's better after an update -- in 1.68.2 it had an extra step of going through Result<[T; N], IntoIter<T, N>> that in practice didn't optimize out.

@teken
Copy link
Author

teken commented May 19, 2023

When I updated graded to rustc 1.69.0 (84c898d65 2023-04-16) it overflows at around 22000 items

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants