core-vec-append has gotten slower #3183
The latest perf graphs show this being back to where it was.
Subtle changes in code gen are causing major performance swings by hitting stack boundaries in different ways.
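The boundary effect can be sketched with a toy model of segmented stacks; `SEGMENT` and `FRAME` below are made-up sizes for illustration, not the real runtime's, and the model only counts segment switches rather than simulating actual calls:

```rust
// Toy model of segmented-stack thrash: each call frame consumes FRAME
// bytes; when the current segment is exhausted, a fresh segment is
// allocated. A hot loop whose callees straddle the segment boundary
// pays the switch on every single iteration, while a slightly different
// code layout that stays inside one segment pays nothing.
const SEGMENT: usize = 1024;
const FRAME: usize = 64;

/// Count segment switches for `iters` iterations of a call that
/// recurses `depth` frames deep, starting from `base` frames already
/// occupying the current segment.
fn segment_switches(base: usize, depth: usize, iters: usize) -> usize {
    let mut switches = 0;
    for _ in 0..iters {
        let mut used = base * FRAME;
        for _ in 0..depth {
            used += FRAME;
            if used > SEGMENT {
                switches += 1; // grow into a fresh segment
                used = FRAME;  // the new segment holds only this frame
            }
        }
        // returning releases the extra segment, so the next iteration
        // crosses the boundary (and pays for it) all over again
    }
    switches
}
```

With `base` near the boundary, every iteration triggers a switch; nudge the frame layout slightly and the cost vanishes entirely, which is why small code-gen changes produce such large swings.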
I mentioned this in private mail, may as well leave it here in case it helps: https://bugzilla.mozilla.org/show_bug.cgi?id=472791, especially /be
Thanks, Brendan.
Yeah, it's possible cache-split loads/stores are magnifying the effect, or penalizing later stack segments by (e.g.) making them not be 64-byte aligned (or whatever the line size is, and presumably whatever LLVM assumes its frames are). But honestly I think the main thing is just that we're hitting cases where recursive code either bottoms out within a stack chunk or thrashes back and forth over the edge of one.

To fix this longer-term I'd like to try moving to a single chunk size (== OS page size) and keeping the pages in a per-task freelist. This will also work nicely with the mostly-copying GC that pcwalton's suggesting we play with: the task can share a single task-local page freelist between stack and GC uses. The only additional check we'd need in that case is for larger-than-a-page allocations / stack frames, which we can service from a slower malloc path (and which should be rare). It should also solve any potential mismatches between the LLVM-assumed alignment of the stack and where we start each frame.
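The per-task freelist idea sketched above might look something like the following; this is a hypothetical outline in modern Rust, not the runtime's actual allocator, and `PAGE_SIZE` is assumed rather than queried from the OS:

```rust
// Hypothetical sketch of a per-task page freelist: stack growth and GC
// both draw fixed-size pages from the same task-local pool, and only
// oversized requests fall through to the slower general allocator.
const PAGE_SIZE: usize = 4096; // assumed; the real value comes from the OS

struct PageFreelist {
    pages: Vec<Box<[u8; PAGE_SIZE]>>,
}

impl PageFreelist {
    fn new() -> Self {
        PageFreelist { pages: Vec::new() }
    }

    /// Reuse a cached page if one is available, otherwise allocate.
    fn alloc(&mut self) -> Box<[u8; PAGE_SIZE]> {
        self.pages
            .pop()
            .unwrap_or_else(|| Box::new([0u8; PAGE_SIZE]))
    }

    /// Return a page so later stack growth (or the GC) can reuse it
    /// without touching the general allocator.
    fn free(&mut self, page: Box<[u8; PAGE_SIZE]>) {
        self.pages.push(page);
    }

    /// Larger-than-a-page requests bypass the freelist entirely and
    /// take the slow malloc path; these should be rare.
    fn alloc_large(&mut self, size: usize) -> Vec<u8> {
        debug_assert!(size > PAGE_SIZE);
        vec![0u8; size]
    }
}
```

Because every page is the same size, a page freed by a shrinking stack is immediately reusable by the GC (and vice versa), and alignment is uniform across all segments.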
This seems to have happened around 0284f94. Normally this test being slower leads to dramatic slowdowns across the board, which doesn't seem to have happened here. Still, we should keep an eye on it. The benchmark time went from about 500ms to 2.25s.