Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better greedy layout computation for generators #62575

Open
tmandry opened this issue Jul 11, 2019 · 4 comments
Open

Better greedy layout computation for generators #62575

tmandry opened this issue Jul 11, 2019 · 4 comments
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tmandry
Copy link
Member

tmandry commented Jul 11, 2019

I've been thinking about a new approach for computing the layout of generators, since I merged #59897. Actually, I've been thinking about it before that, but the approach there felt a bit safer to start with.

The idea is that we use the struct layout computation, rather than the enum computation, as a starting point, and add in the fact that fields can overlap. The basic algorithm is simple:

  • We iterate through every field, in descending order of alignment and number of conflicts with other fields*.
  • For each field, pick the first offset (with respect to its alignment) that it's still allowed to occupy, given the other fields that we've already placed.

Because we're working in descending order of alignment (and alignments are all powers of 2), we should be able to use a "segment-tree-like" data structure with a bitset of fields that are still allowed to be placed at a given position. This means that updating the tree is logarithmic in the number of alignment levels, and so is searching for the first position a field is allowed to occupy.

This greedy algorithm should be an improvement over the current algorithm, which doesn't attempt to overlap any fields that are live across more than one yield in the generator. It also wouldn't force "overlap-zone" fields to always come after "non-overlap" fields, which means we can do a better job of eliminating padding. Finally, the logic should be easier to follow in code than the current optimization (once the data structure is written).

This problem is NP-complete, so there might be better algorithms than this out there. I'd love to hear suggestions!


(*) The number of conflicts is the number of other fields in this struct that cannot be overlapped with the current field. The exact heuristic may change. Today, we use number of conflicts to help us decide which fields get to stay in an "overlap zone" and which get kicked out. Then, we use order of alignment to lay out the fields in each zone.

@csmoe csmoe added the A-coroutines Area: Coroutines label Jul 11, 2019
@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 11, 2019
@jonas-schievink jonas-schievink added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Mar 4, 2020
@Eulogizethesun

This comment has been minimized.

@bjorn3

This comment has been minimized.

@memoryruins

This comment has been minimized.

@jonas-schievink

This comment has been minimized.

Swatinem added a commit to Swatinem/rust that referenced this issue Feb 7, 2023
This adds one more test that should track improvements to generator
layout, like rust-lang#62958 and rust-lang#62575.

In particular, this test highlights suboptimal layout, as the storage
for the argument future is not being reused across its usage as `upvar`,
`local` and `awaitee` (being polled to completion).
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 7, 2023
Add test for Future inflating arg size to 3x

This adds one more test that should track improvements to generator
layout, like rust-lang#62958 and rust-lang#62575.

In particular, this test highlights suboptimal layout, as the storage
for the argument future is not being reused across its usage as `upvar`,
`local` and `awaitee` (being polled to completion).

This is on top of rust-lang#107692 (as those would conflict with each other)

It is a minimal repro for code mentioned in moka-rs/moka#212 (comment) (CC `@tatsuya6502)`
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants