Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast clone graph #7168

Closed
wants to merge 20 commits into from
Closed

Fast clone graph #7168

wants to merge 20 commits into from

Conversation

Eh2406
Copy link
Contributor

@Eh2406 Eh2406 commented Jul 23, 2019

This is a redesign of the Graph used by the Resolver. The Resolver is usually invoked with a lockfile, so the first thing we try will work, and any work we to to prepare for backtracking is wasted. A plane Graph will have an O(n) clone, way too expensive. An Rc<Graph> has O(1) clone but is no savings as we just end up with O(n) make_mut. Before the PR we have a Graph built from im-rs parts, this gives use O(1) clone at only O(ln(n)) overhead on every operation. This PRs StackGraph has O(ln(number of clones)) clone and no overhead! There is a catch, if we have to backtrack then we do a O(delta) reset. In effect this PR moves work from the happy "lockfile" path to the "backtracking" path.

@rust-highfive
Copy link

r? @ehuss

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 23, 2019
@bors
Copy link
Contributor

bors commented Jul 24, 2019

☔ The latest upstream changes (presumably #7174) made this pull request unmergeable. Please resolve the merge conflicts.

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from 98e23c5 to 3a41475 Compare July 24, 2019 19:25
@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 24, 2019

Rebased.

/// This is a directed Graph structure. Each edge can have an `E` associated with it,
/// but may have more then one or none. Furthermore, it is designed to be "append only" so that
/// it can be queried as it would have bean when it was smaller. This allows a `reset_to` method
/// that efficiently undoes the most reason modifications.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the comments here and the fields below be expanded to indicate what all the integer pointers are intended for? From reading this I don't have a great grasp of how this is implemented, and it looks pretty complicated internally. To clarify I don't doubt it, was just hoping to not have to read so closely below after reading this :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a new type for each use of usize and some comments.

A fair bit of the complexity comes from the or none support. This is needed as a lockfile knows all the packages eth package depends on but does not know the real Dependency objects. This data structure supports interweaving of adding links with edge data and links that do not have data. We use both but we don't in practice interweave them. I wonder if there is a way to simplify the structure if we ban interweaving...

@alexcrichton
Copy link
Member

I'll try to get to this tomorrow morning, it looks like it's going to be pretty deep into data structure land so I'm not quite ready to do that today!

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch 4 times, most recently from dfad1cc to 1b2f42d Compare July 25, 2019 17:10
@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 25, 2019

Rebased, simplified, and added some comments. What else can be made clearer?

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version is indeed easier to read, thanks!

I'm still having difficulty getting through this, although resolve stuff is in general really hard to grok. I find myself lacking a lot of context though in the sense that I'd still love to see like a multi-paragraph-long description explaining the problem that this data structure is trying to solve.

I guess another way to put it is that the resolution phase of cargo has a lot of interesting patterns, and those constraints are informing what this data structure is going to look like. It'd be great to have that explicitly spelled out in comments as well as having specific comments as to the implementation and how it works. For exaple the reset_to function seems like the whole crux of this dat structure but it has few comments to explain why it's doing what it's doing and why it works the way it does.

src/cargo/util/graph.rs Outdated Show resolved Hide resolved
src/cargo/util/graph.rs Outdated Show resolved Hide resolved

/// connect `node`to `child` associating it with `edge`.
/// Note that if this and `link` are used on the same graph
/// odd things may happen when `reset_to` is called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a bit of an odd restriction which may be a holdover from some old version of the resolver, would it be possible to only have either link or add_edge?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link is only needed as a lockfile knows all the packages each package depends on but does not know the real Dependency objects (hear). This could be a Graph<_, ()> but it needs to go into a Resolve so it needs to be Graph<_, Dependency>. I am open to suggestions for how to work around this!

Having had a chance to sleep on it, I don't remember why I thought it would be a problem. I will just remove the "Note".

src/cargo/util/graph.rs Outdated Show resolved Hide resolved
/// - no overhead over using the `Graph` directly when modifying
/// Is this too good to be true? There are two caveats:
/// - It can only be modified using a strict "Stack Discipline", only modifying the biggest clone
/// of the graph.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an interesting restriction, but one you could subvert as well, right? Is this what the asserts are intended for in the borrow function to dynamically assert this invariant?

src/cargo/util/graph.rs Show resolved Hide resolved
type Item = &'a E;

fn next(&mut self) -> Option<&'a E> {
while let Some(edge_link) = self.index.and_then(|old_index| self.graph.get(old_index)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the get here required to be a fallible lookup? I'd imagine that if we're iterating all internal edge pointers should already be valid

.nodes
.get(from)
.and_then(|x| x.get(to).copied())
.filter(|idx| idx < &self.age.len_edges),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra filtering done here in the view is somewhat odd, can you be sure to add comments (not sure where, just somewhere would be fine) as to why it's necessary?

@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 25, 2019

Thank you for the on-point feedback. I added some comments and responded to some of your points. I did not yet have time for the multi-paragraph-long description. I will need to think through how to describe only the complexity required to read this module vs providing enough context to grok why it is what we need. You other points should become clearer if (when?) I describe the hole picture well.

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from 38226c1 to d452d95 Compare July 25, 2019 22:12
}

/// A RAII gard that allows getting `&` references to the prefix of a `Graph` as stored in a `StackGraph`.
/// Other views of the inner `Graph` may have added things after this `StackGraph` was created.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I'm not sure how this sentence is true, once this is created it freezes Graph inside it because of the borrow on the RefCell, right?

fn next(&mut self) -> Option<&'a E> {
while let Some(edge_link) = self.index.and_then(|idx| {
// Check that the `idx` points to something in `self.graph`. It may not if we are
// looking at a smaller prefix of a larger graph.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is something I don't quite understand, I thought the point of reset_to was that these sort of extraneous edges were pruned out? How do they linger around?

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from 7cff0f8 to 4410b21 Compare July 27, 2019 22:10
@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 28, 2019

I wrote several paragraphs of words, I can no longer tell if they mean anything. Let me know if any of it makes sense and is worth having around. I added a test that I hope helps show the queries this supports.

Having spent so long trying to describe what I meant by "Stack Discipline", I am now wondering if it would be possible to detect when a deep clone is needed automatically. At runtime given how the Resolver uses the StackGraph, the ancere is never, but it may be easier to have code to deep clone when needed then it is to describe how to arrange not to need deep clones. I will continue to think about it. Now that I have started to think about that, if we conservatively always to a deep clone before befor reset_to we still get the same BigO. I will give that a try, "Stack Discipline" may be premature optimization. I will try.

@bors
Copy link
Contributor

bors commented Jul 29, 2019

☔ The latest upstream changes (presumably #7186) made this pull request unmergeable. Please resolve the merge conflicts.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 29, 2019

I am now setup on my new laptop, so I can do more accurate bench-marking. The version that does a deep clone any time it may be needed is slower then master when backtracking is involved. More experiments are needed.

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from 682cdc1 to 5aff945 Compare July 30, 2019 02:32
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the new comments! They look pretty good to me, and when you're ready I think this is basically good to go

//! `dependency_graph_so_far.clone();`. To make this more annoying the first thing we try will
//! probably work, and any work we do to prepare for the next iteration is wasted. If we had a
//! `undo_activate` we could be much more efficient, completely remove the `.clone()` and just
//! `undo_activate` if things tern out to not work. Unfortunately, making shore `undo_activate`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/shore/sure/


/// This is a directed Graph structure. Each edge can have an `E` associated with it,
/// but may have more then one or none. Furthermore, it is designed to be "append only" so that
/// it can be queried as it would have bean when it was smaller. This allows a `reset_to` method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/bean/been/

@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 31, 2019

Added code to determine when a deep clone is need. This cuts into the wins from this PR, but means that it does not need "Stack Discipline" to be correct, and is faster or the same on all the things I have tried.

Over all, this has been a lot of work to beat im-rs for a small win. @bodil I am very impressed.

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from d69866e to 518028b Compare July 31, 2019 16:14
@alexcrichton
Copy link
Member

Hm I may be getting lost again. I'm sort of confused, if this is all a pretty small win should we stick with the im-rs data structures? Those are presumably much easier to read/write, but especially with the new Clone implementation this is much more complex to understand.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Jul 31, 2019

3 weeks ago when I started committing to this branch I was definitely planning to say: "this adds a large amount of complexity, is it worth it for the performance boost?" I seem to have forgotten to make that clere. Sorry! It is entirely reasonable for us to close this becust the wins are not worth the code. But let's get quantitative:

% improve by commit and commend

Etch average of 3 runs on my new laptop as idle as I can get it, and with -Zno-index-update.
Please report results on your workload.

2000 crate stress test

commit update generate-lockfile check
ab1d533a7265b237ff2c0e8e9fae67d70211f1ca 13.29% 12.73% 2.58%
5aff94507e2f86ee912b0c3c08f9fb2faa9e2e6a -11.16% -11.34% 3.86%
518028b60fca16e66e7639b562a8d0b7ab2deedf (head) 1.85% 10.32% 3.18%

on cargo's repo

commit update generate-lockfile check
ab1d533a7265b237ff2c0e8e9fae67d70211f1ca 5.75% -0.77% 3.13%
5aff94507e2f86ee912b0c3c08f9fb2faa9e2e6a 0.35% 3.27% 3.29%
518028b60fca16e66e7639b562a8d0b7ab2deedf (head) 3.74% 3.32% 4.58%

Is ~5% on no-op checks of Cargo big enough to justify all this new code?

@Eh2406 Eh2406 force-pushed the fast-clone-graph branch from 518028b to 1f405c1 Compare August 1, 2019 14:20
@bors
Copy link
Contributor

bors commented Sep 27, 2019

☔ The latest upstream changes (presumably #7452) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Oct 1, 2019

☔ The latest upstream changes (presumably #7361) made this pull request unmergeable. Please resolve the merge conflicts.

@Eh2406 Eh2406 mentioned this pull request Mar 12, 2020
bors added a commit that referenced this pull request Mar 16, 2020
De-duplicate edges

This is a quick fix for #7985. It is possible to have more than one dependency that connects two packages, if one is a dev and one a regular. The code has use a `Vec` to represent that potential multiplicity. This switches it to a `HashSet` to fix #7985. But if there is only a handful of ways we can have more than one then perhaps we can do something with less indirection/allocations.

Note that #7168 (which was already abandoned) will need to be redesigned for whatever we do for this.
@ehuss ehuss added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 13, 2021
@Eh2406
Copy link
Contributor Author

Eh2406 commented Jan 13, 2021

This is not going to get the attention that is needed to merge. And at this point I'd rather see the effort go into PubGrub not into optimizing this resolver.

A lot of ink has gone into it, a more responsible maintainer would turn it into documentation before closing. But I am not that responsible.

@Eh2406 Eh2406 closed this Jan 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants