Only allow one of each links attribute in resolver #4978

Eh2406 · 2018-01-25T16:44:05Z

This is a start on fixing the links part of #4902. It needs code that adds the links attribute to the index. But, I wanted to get feedback.

rust-highfive · 2018-01-25T16:44:17Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

alexcrichton · 2018-01-25T21:23:48Z

Awesome, thanks!

It looks though like there's some CI failures? In addition to that, could some tests be added about successfully choosing a resolution when two possible graphs would have otherwise been possible?

The final comment I'd have for this is is that resolution tends to be a pretty performance sensitive path in Cargo, so would it be possible to avoid the HashSet construction on each iteration?

Eh2406 · 2018-01-25T22:48:10Z

The CI failures are test looking for the old links checker, they will probably need to be updated. Maybe we should remove the old cheker entirely? That is a good suggestion for added test.

As to performance:

Do we have a benchmark?
We could use a Vec as n is likely to be small.
We could add it as a field of Context, but we would have to be careful it got updated in skink with activations.
Why are we using the default hasher instead of fnv for all these hash maps?

Eh2406 · 2018-01-26T01:02:30Z

We cant remove the old checker, ~~most~~all indexes don't have the new field. :-)

alexcrichton · 2018-01-26T15:20:49Z

@Eh2406 yeah I thin kit's fine to remove the old checker entirely. Although I will say though that the error messages from the old checker are quite good, and it'd be a shame to lose those... Fixing that may require a more general overhaul to resolver error messages which already aren't the best unfortunately :(

In any case unfortunately I don't have many benchmarks for the resolver. The main one I've used in the past is "how long does it take to resolve Servo's dependency graph?" but without information in the index it'll be deceptively fast I think. In general though we just need to avoid allocations if possible on each turn of the loop if we can, or otherwise make sure the allocations are pretty small. Would it be possible to have a field that's updated over time rather than recreated each time?

The allocations have been most of the runtime so far, I've never seen the hashing part of a hash map show up in profiles.

Eh2406 · 2018-01-26T20:50:42Z

The first time I tried this I had a persistent set, and got myself confused. Then I tried making the set dynamically and submit the pr. This time I got the persistent set to work. I think the third time's the charm!

As to the old checker, I don't think we can remove it until we are sure that all of the index (and clones of the index) are updated to have the new field. I could be wrong, I am new to this codebase. If I am wrong, please enlighten me!

Switching from the good old checker error messages to the resolver messages is a shame. But what can we do. :-(

How hard is it to resolve Servo's dependency graph in isolation as someone that has never touched Servo's code befor?

alexcrichton · 2018-01-26T23:25:16Z

src/cargo/core/resolver/mod.rs

@@ -1047,6 +1053,9 @@ impl<'a> Context<'a> {
                       .or_insert(Vec::new());
        if !prev.iter().any(|c| c == summary) {
            self.resolve_graph.push(GraphNode::Add(id.clone()));
+            if let Some(link) = summary.links() {
+                self.links.insert(link.to_owned());


Could this assert that we're not replacing somethign that was already there?

I could make it assert!(self.links.insert(link.to_owned())); but I think this can get triggered if updating a lock file that already has multiple crates with the same links attribute. Is this acceptable? Is there some way to get better error messages?

Hm that would indeed be bad! Thinking through how this can happen I think we could trigger this if an older cargo created a lock file with a bunch of new crates (that all have links in the manifest) and then a newer cargo tries to use that lock file.

In that case, sounds like we should skip the assert for now!

Er well, but if we do actually run into that case the previous project surely couldn't build previously becuase it would have been caught later...

How about this, could we return an error from here to entirely abort resolution?

I think I have that working now.

alexcrichton · 2018-01-26T23:27:51Z

Ok great! I think this looks good to me in terms of perf, and we can always continue to tweak as more cases come up.

I've actually found it somewhat easy to profile with Servo before, but unfortunately it may not be too interesting to profile with this PR as-is (as there's no links keys in the index). If you'd like though you could perhaps manufacture a links for all Summary instances created (a unique name in links) temporarily to test the performance?

Otherwise I've historically just cd'd into Servos' checkout and run cargo generate-lockfile or something like cargo update -p servo which shouldn't hit het network after the first time and stresseses the path of "we've already got a Cargo.lock so this shoudl be fast". Perhaps worth testing if you can with some synthetic links keys?

alexcrichton · 2018-01-26T23:29:33Z

Oh right, some further thoughts:

Let's actually not remove the old system just yet. Because many crates on crates.io don't have the links key in the index we won't otherwise catch those errors if we remove that logic. We can eventually remove it one day perhaps but only after we've published a lot.
This will need to be coupled with an update to how we publish crates to the registry. That way information will actually go into the index!

Eh2406 · 2018-01-27T14:34:35Z

So I got Servo set up so I can run the test. I benchmarked 4 versions. I am running windows so I don't have a time command. I am just looking at wall time for now. For each one I am running cargo update -p servo.

The base, the commit before my PR. Takes approximately 3 second to run. Is it usually this fast? Maybe I just got a lucky checkout of Servo.
The initial version of this PR, with a new HashSet for each loop. I killed it after a minute had already gone by. So, ya that is not acceptable.
The current head of this PR. This tests rolling out this PR without backfilling the index. Takes approximately 2 second to run.
If I add a line links.or_else(|| if name.starts_with("sys-") {Some(name.to_string())} else {None}) to registry\index.rs. This tests after the index is filled in. Takes approximately 4 second to run.

alexcrichton · 2018-01-27T18:55:23Z

Thanks for investigating @Eh2406! I'm actually surprised it took 3 seconds at all, that sort of update should happen in milliseconds ather than seconds...

Well in any case sounds like the current PR as-is is certainly acceptable speed-wise!

Eh2406 · 2018-01-28T04:05:57Z

I added the links attribute to the NewCrate struct then followed the error messages. Please let me know if I did that correctly.

alexcrichton · 2018-01-29T15:00:37Z

Ok thanks @Eh2406! Thinking about this some more...

I'm getting more and more worried as I think about it about the error messages here. It's pretty well-known at this point I believe that the resolve error messages coming out of Cargo are pretty bad (almost un-actionable!). That's also coupled with the fact that non-resolvable crate graphs look like hangs I think could get us into a bit of a tight spot with this feature.

For example we've, since inception, had requests and an implementation for better error messages around the links key. That implies to me that this situation is actually coming up in the wild a nontrivial number of times. If we were to move the errors to graph resolution then I fear that all of these cases coming up in the wild will become un-debuggable.

In that sense I'm not entirely sure how to balance this implementation. On one hand I definitely believe it's technically correct (although would be good to have a test to assert that this error can guide crate graph resolution). On the other hand though I'm not sure that this is worth the loss in error message quality. From my experience cases like #4902 are coming up pretty rarely where there's a resolvable crate graph but Cargo isn't finding it immediately.

I'm curious what you think about this as well?

Eh2406 · 2018-01-29T21:28:15Z

I have a lot to say in response to your comment. How much will actually be relevant to this pr I won't know till I try and say it. But I hope to do a lot of work on the resolver and it will be relevant to that larger context. I will try and respond point by point.

It's pretty well-known at this point I believe that the resolve error messages coming out of Cargo are pretty bad (almost un-actionable!).

I agree! I would love to work on fixing them when I understand the resolver deeply so I can see how to improve it. I started working on this PR so I could begin to understand the resolver. If I begin with some low hanging problems, even if I never understand the resolver deeply I will still have contributed.

I fear that making good error messages for resolver will be hard. Resolver currently returns some random crate it can not resolve, usually one that the user did not even know was being depend upon. It is not clear how to synthesize it into a larger coherent user centric message.

that non-resolvable crate graphs look like hangs

I think this is surprisingly tightly tied to the error messages. I think the solution to failing fast on most non-resolvable crate graphs in valves taking the fact that some random crate can not be resolved synthesize it into a larger coherent problem then back track until the bigger problem goes away. If we had a good way of building a larger coherent problem we would have them both solved.

For example we've, since inception, had requests and an implementation for better error messages around the links key. That implies to me that this situation is actually coming up in the wild a nontrivial number of times. If we were to move the errors to graph resolution then I fear that all of these cases coming up in the wild will become un-debuggable.

I think all the cases that are debugabal in the current system, would just resolve into something that worked. I.E. If you can fix it by specifying a specific version of a dependency, then it will just no longer be an issue. The process of backtracking will find the dependency that works.

Of course the resolver error messages are offil so if you have a bigger problem than pinning a dependency the error messages will get a lot worse.

although would be good to have a test to assert that this error can guide crate graph resolution

I'd love some guidance on making a test in which cargo has multiple versions to choose from only the oldest of witch does not have a links attribute that conflicts with the main crate. Or any other test you would like to suggest.

From my experience cases like #4902 are coming up pretty rarely where there's a resolvable crate graph but Cargo isn't finding it immediately.

I don't think that is an accurate description of #4902. I don't see that particular report mentioning cargo being slow. I see it reporting that cargo always builds a lock file that it then cannot use. With, in that case, no way to fix it. That is a really bad user experience.

I think over all this replaces friendly error messages that clearly tell you what problem cargo made for you with cargo silently not making the error in the first place.

Eh2406 · 2018-01-30T16:22:28Z

I think this is a ergonomic win, but reasonable people can disagree. If you decide to postpone this until we somehow get better error messages then I will open up PRs related to other issues with the resolver and let this one bit rot. Knowing me, If I end up in a holding pattern I will likely get distracted, so one way or the other I'd like to keep things moving so I can contribute.

alexcrichton · 2018-01-31T19:44:58Z

Thanks for the writeup @Eh2406! (and sorry for taking a few days to get back to you)

Mostly that all sounds great to me, I'd be thrilled to help you learn about the resolver if you'd like!

One point though is...

I think over all this replaces friendly error messages that clearly tell you what problem cargo made for you with cargo silently not making the error in the first place.

Almost! A few days ago we talked about this more broadly in the cargo triage meeting and what to do with this PR. Our conclusion though was that we probably don't want to land this yet, but we're curious to learn more information!

My fear, more specifically, is that entirely unresolvable crate graphs will start showing up (producing bad error messages). I suspect that, because we've improved the error messages here, it's somewhat often coming up in practice that there's multiple native libraries of the same name that can't get built together. While in #4902 it's possible that Cargo could have actually found a successful resolution I suspect that is not the case for most of the times this comes up in practice. Whenever I've seen this before it's been impossible to resolve the crate graph with this constraint, but rather a dependency simply needed to be updated.

Along those lines though, @SergioBenitez as the reporter of #4902 is this an issue that's coming up in practice for you a lot? The solution in this PR will solve #4902 but will unfortunately create some other problems as well, so we're trying to figure out how to best weigh the priorities here.

@Eh2406 if you'd like to tackle resolution one of the highest priority bits of it I think would be the error messages, and I'd be more than willing to help mentor you on improvements to that!

Eh2406 · 2018-02-05T19:50:17Z

Just a note for when someone/I come back to this, in addition to the error messages, It will take some work to rebase/merge this with #4834, and that is a much bigger/more common UX improvement.

bors · 2018-02-06T17:33:59Z

☔ The latest upstream changes (presumably #4834) made this pull request unmergeable. Please resolve the merge conflicts.

# Conflicts: # src/cargo/core/resolver/mod.rs # tests/resolve.rs

Eh2406 · 2018-02-15T03:30:34Z

Worked on the error messages, dose that look better?

bors · 2018-02-21T20:21:09Z

☔ The latest upstream changes (presumably #5063) made this pull request unmergeable. Please resolve the merge conflicts.

alexcrichton · 2018-02-24T20:46:52Z

@bors: r+

Looks amazing, thanks so much @Eh2406!

bors · 2018-02-24T20:46:53Z

📌 Commit 68a40ad has been approved by alexcrichton

bors · 2018-02-24T20:47:00Z

⌛ Testing commit 68a40ad with merge fe0c18b...

Only allow one of each links attribute in resolver This is a start on fixing the `links` part of #4902. It needs code that adds the links attribute to the index. But, I wanted to get feedback.

alexcrichton · 2018-02-24T20:47:08Z

@Eh2406 also, would you be up for following through with the crates.io changes needed here as well?

Eh2406 · 2018-02-24T21:05:57Z

Um befor we merge, how do we test that adding links to the index will not break old versions of cargo?
@joshtriplett asked us to test it explicitly.

bors · 2018-02-24T21:09:25Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing fe0c18b to master...

alexcrichton · 2018-02-25T11:59:56Z

@Eh2406 oh I'd be fine adding a test yeah, although I'm quite confident that it won't require any code changes.

Eh2406 · 2018-02-25T13:43:52Z

I don't think it's as simple as adding a test. Adding a test will test that it works with current cargo. I guess we would need to manually alter the index, and then every previous version of cargo.

Eh2406 · 2018-02-26T03:23:18Z

Ok so I am trying to test if the links attribute will brake old cargo. This is a log of my progress.

set up a new project called testlinks, use rustup override set 1.0.0-gnu to make sure we are using an old cargo.
dell my C:\Users\...\.cargo\registry then run cargo search winapi inside testlinks. This checks out the index exactly as rust 1.0.0 wanted it.
Add a dependency to testlinks on cpython = {version = "*", features=["python27-sys"]} python27-sys = "0.0.6" and generate a lock file. This set of dependencies are chosen to lead to having both [[package]] name = "python27-sys" version = "0.0.6" and [[package]] name = "python27-sys" version = "0.1.2"
edit the index to add ,"links":"python27"} to the end of each row of the python27-sys file. save and commit.
rerun testlinks>cargo generate-lockfile and insure it has the same result.
try with a fresh build of cargo from master. witch should error but gives the same results, something is wrong. looks like something throw out my commit. re committing and still nothing.

This test failed! In 2 ways:

With cargo from 1.0.0 the Updating registry throws out my changes making the test useless.
With cargo from master it doesn't seem to be reading the attribute.

TLDR:
How do I get cargo from 1.0.0 to see an index with a links attribute? Why is cargo from master not reading the links attribute? Help!?

alexcrichton · 2018-02-26T16:22:32Z

@Eh2406 IIRC Cargo at 1.0.0 actually unconditionally reads the registry, so you should be able to do:

Use Cargo 1.0.0 to generate a lock file
Edit the index. Take a crate that's in the lock file and add an index key to the row in the index for that crate
Rerun Cargo 1.0.0, it'll read the registry entry and ingore the index key

For master cargo are you using nightly or building from source? I think these changes haven't made their way into nightly yet unfortunately.

Eh2406 · 2018-02-26T16:30:56Z

I think that is what I tried, but when Cargo 1.0.0 is Updating registry it does a clean checkout that removes my change to the index. I.E. change files to have the links and commit. Check that files are changed, And that commit in log. Run testlinks>cargo generate-lockfile. Index files do not have links, nor is my commit in the git log.

For the master test I am building from source.

alexcrichton · 2018-02-26T16:48:15Z

Hm I'm not so sure in that case :(. Been awhile since I used Cargo 1.0.0... In any case using Cargo 1.20.0 or something like that (not too old) is probably sufficient. No one's using Rust 1.0.0 in production right now I'd imagine.

As to why master isn't reading the attribute... not sure!

Eh2406 · 2018-02-26T17:40:46Z

made a clone of the index at https://github.com/Eh2406/crates.io-index. use .cargo/config to override. pushed https://github.com/Eh2406/crates.io-index/commit/2d969f4712c5aaa9083e5e964879b73c8ed59ae2

Good news:

testlinks>rustup override set 1.0.0-x86_64-pc-windows-gnu
info: using existing install for '1.0.0-x86_64-pc-windows-gnu'
testlinks>cargo generate-lockfile
    Updating registry `https://github.com/Eh2406/crates.io-index`

Bad news:
from source, master cargo generate-lockfile still is not returning an error.

Eh2406 · 2018-02-26T18:39:02Z

Good news I figured it out, and all is well!
cpython = {version = "*", features=["python27-sys"]} python27-sys = "0.0.6" Does have a solution it is to use an old version of cpython No error message needed.

Switching to cpython = {version = "0.1.0", features=["python27-sys"]} python27-sys = "0.0.6" gives error: failed to select a version for `python27-sys`. as intended!

alexcrichton · 2018-02-26T20:03:29Z

Nice!

1248: Try to add links to the index r=carols10cents This is a companion to rust-lang/cargo#4978 It is adding the links attribute to the index git. I am working on windows so cannot test locally, I.E. this may involve some sparing with the CI to get this green. Advice and help are welcome!

Eh2406 added 2 commits January 25, 2018 11:33

add the Manifest links attribute to Summary

78f3564

Only allow one of each links attribute in resolver

732d29d

rust-highfive assigned alexcrichton Jan 25, 2018

Get ci passing

1ee77e4

Keep the list of links as part of Context

87d188f

alexcrichton reviewed Jan 26, 2018

View reviewed changes

Eh2406 added 2 commits January 27, 2018 15:44

ensure we are not using links more then once

3fb2981

Send the links attribute to the registry

58590ae

Eh2406 mentioned this pull request Jan 28, 2018

Try to add links to the index rust-lang/crates.io#1248

Merged

Add a test for issues/4902

affddaa

alexcrichton mentioned this pull request Feb 5, 2018

backtrack if can not activate #5000

Merged

Eh2406 force-pushed the master branch from 8785cf7 to affddaa Compare February 13, 2018 22:02

Merge branch 'conflict_tracking' into links

4d2d444

# Conflicts: # src/cargo/core/resolver/mod.rs # tests/resolve.rs

better error messages

70dada8

Merge remote-tracking branch 'origin/master' into links

68a40ad

bors merged commit 68a40ad into rust-lang:master Feb 24, 2018

This was referenced Feb 24, 2018

Some Clippy suggestions #5070

Merged

make RemainingCandidates::next peekable. #5044

Merged

Eh2406 mentioned this pull request Oct 18, 2023

I still get links errors despite #4978 landing rust-lang/crates.io#7310

Closed

bsteinb mentioned this pull request Apr 4, 2018

Linking to native library multiple times through build dependencies #5237

Open

Eh2406 mentioned this pull request Sep 17, 2018

Cargo fails to find minimal satisfiable set of dependencies #4902

Closed

Eh2406 mentioned this pull request May 2, 2019

Consider using Cargo's links feature to ensure use only once rust-embedded/cortex-m#137

Closed

jebrosen mentioned this pull request Jun 2, 2019

multiple allowed versions of packages that depend on ring rwf2/Rocket#1016

Closed

ehuss added this to the 1.26.0 milestone Feb 6, 2022

Only allow one of each links attribute in resolver #4978

Only allow one of each links attribute in resolver #4978

Conversation

Eh2406 commented Jan 25, 2018 • edited Loading

rust-highfive commented Jan 25, 2018

alexcrichton commented Jan 25, 2018

Eh2406 commented Jan 25, 2018

Eh2406 commented Jan 26, 2018

alexcrichton commented Jan 26, 2018

Eh2406 commented Jan 26, 2018

alexcrichton Jan 26, 2018

Choose a reason for hiding this comment

Eh2406 Jan 27, 2018

Choose a reason for hiding this comment

alexcrichton Jan 27, 2018

Choose a reason for hiding this comment

alexcrichton Jan 27, 2018

Choose a reason for hiding this comment

Eh2406 Jan 27, 2018

Choose a reason for hiding this comment

alexcrichton commented Jan 26, 2018

alexcrichton commented Jan 26, 2018

Eh2406 commented Jan 27, 2018

alexcrichton commented Jan 27, 2018

Eh2406 commented Jan 28, 2018

alexcrichton commented Jan 29, 2018

Eh2406 commented Jan 29, 2018

Eh2406 commented Jan 30, 2018

alexcrichton commented Jan 31, 2018

Eh2406 commented Feb 5, 2018

bors commented Feb 6, 2018

Eh2406 commented Feb 15, 2018

bors commented Feb 21, 2018

alexcrichton commented Feb 24, 2018

bors commented Feb 24, 2018

bors commented Feb 24, 2018

alexcrichton commented Feb 24, 2018

Eh2406 commented Feb 24, 2018

bors commented Feb 24, 2018

alexcrichton commented Feb 25, 2018

Eh2406 commented Feb 25, 2018

Eh2406 commented Feb 26, 2018

alexcrichton commented Feb 26, 2018

Eh2406 commented Feb 26, 2018

alexcrichton commented Feb 26, 2018

Eh2406 commented Feb 26, 2018

Eh2406 commented Feb 26, 2018

alexcrichton commented Feb 26, 2018

Eh2406 commented Jan 25, 2018 •

edited

Loading