backtrack if can not activate #5000

Eh2406 · 2018-02-01T20:34:20Z

This is a fix for #4347
Unfortunately, this too regresses the error message for the case where you specify a dependency feature that does not exist.
@alexcrichton advice on improving the message?

rust-highfive · 2018-02-01T20:34:31Z

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

alexcrichton · 2018-02-02T23:10:39Z

Oh dear that's a bummer! It seems like a lot of roads are pointing towards error messages not being that great...

This one I would say though yeah is even more pressing than links in that this comes up quite frequently (typos and such). In that sense it'd be pretty bad if we regressed this error message :(

Eh2406 · 2018-02-02T23:44:09Z

Is there some way to print an error just as we would have, but then continue processing? If so, I have a bunch of ideas for how to improve error messages, at least in this case. Or is there some way to tie an error message to a previous one, so we can say "options for dependent b causes include..."?

Eh2406 · 2018-02-05T19:46:47Z

It will take some work to rebase/merge this with #4834, and that is a much bigger/more common UX improvement. I think this should be put on hold for that to land and for a plan for error messages to be developed.
@alexcrichton You are right, I have been thoroughly convinced that error messages should happen next. You offered some mentorship on that; how do you think I should proceed?

alexcrichton · 2018-02-05T19:57:13Z

@Eh2406 certainly! Unfortunately I don't really know how to precisely proceed. I do have a feeling, though, for how to proceed generally!

So prior to #4978 we had a custom error message for the links key which indicated. I think that a great end state would be to end up basically as close as we can to an error message like that where you've got a causal chain through packages and a nice description for what's happening.

As to how to get there... I'm not sure. I think it may also be worth taking a step back and looking at the resolution algorithm from a distance too. Right now Cargo in theory attempts all possible crate resolution graphs. I mean like all possible graphs. This can put us in one of two situations:

1. At least one graph is "successful" and satisfies all the constraints. In this situation Cargo should never return an error and eventually reach such a graph.
2. All possible graphs would lead to a resolution failure, so Cargo should return an error.

Here we're most interested in the latter situation, but it also means there's no one error for us to return. In some sense we tried everything and it could have all failed for a number of reasons. I think we could probably optimize to return the "shortest" error perhaps? (for whatever definition "shortest" actually means)
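To make the two situations concrete, here is a minimal sketch of an exhaustive backtracking search. It is not Cargo's actual resolver: the names `Dep`, `Candidate`, `Registry` and `resolve` are hypothetical, versions are plain integers, and only one version of each package may be active at a time. It returns the first graph that works or, if every graph fails, the failure with the shortest causal chain, assuming "shortest" means the fewest packages in the chain (only one of several possible definitions).

```rust
use std::collections::BTreeMap;

/// Hypothetical dependency edge: a package name and an inclusive version range.
#[derive(Clone, Debug)]
struct Dep {
    name: &'static str,
    min: u32,
    max: u32,
}

/// Hypothetical published version of a package, with its own dependencies.
#[derive(Clone, Debug)]
struct Candidate {
    version: u32,
    deps: Vec<Dep>,
}

/// Package name -> candidates, sorted by ascending version.
type Registry = BTreeMap<&'static str, Vec<Candidate>>;
/// Package name -> activated version.
type Resolution = BTreeMap<&'static str, u32>;

/// Each pending entry is a dependency plus the causal chain that led to it.
/// Returns the first complete resolution, or the shortest failing chain.
fn resolve(
    registry: &Registry,
    pending: &[(Dep, Vec<String>)],
    activated: &Resolution,
) -> Result<Resolution, Vec<String>> {
    let ((dep, parents), rest) = match pending.split_first() {
        Some(split) => split,
        // Nothing left to resolve: this graph is "successful".
        None => return Ok(activated.clone()),
    };
    let mut chain = parents.clone();
    chain.push(dep.name.to_string());

    // The package is already active: it either fits the range or conflicts.
    if let Some(&version) = activated.get(dep.name) {
        return if dep.min <= version && version <= dep.max {
            resolve(registry, rest, activated)
        } else {
            Err(chain)
        };
    }

    let candidates = registry.get(dep.name).map(Vec::as_slice).unwrap_or(&[]);
    let mut shortest: Option<Vec<String>> = None;
    // Try the newest candidate first; on failure, backtrack to the next one.
    for cand in candidates.iter().rev() {
        if cand.version < dep.min || cand.version > dep.max {
            continue;
        }
        let mut next_activated = activated.clone();
        next_activated.insert(dep.name, cand.version);
        let mut next_pending: Vec<(Dep, Vec<String>)> =
            cand.deps.iter().map(|d| (d.clone(), chain.clone())).collect();
        next_pending.extend_from_slice(rest);
        match resolve(registry, &next_pending, &next_activated) {
            Ok(resolution) => return Ok(resolution),
            Err(failure) => {
                if shortest.as_ref().map_or(true, |s| failure.len() < s.len()) {
                    shortest = Some(failure);
                }
            }
        }
    }
    // No candidate worked: report the shortest failure, or this dep itself.
    Err(shortest.unwrap_or(chain))
}
```

Cloning the activation map at every step keeps the sketch short; a real resolver would want something cheaper, which is part of why the backtracking frames matter for performance.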

There's also the separate bug that Cargo's backtracking and exploration of the problem space takes forever, but hopefully #4834 can help with that!

Hm so that's unfortunately not super actionable, but is that helpful at least to get started?

Eh2406 · 2018-02-05T21:37:06Z

Thanks for the pointer to the links code, I will go and read that. I like the idea of having a good causal chain for the cases when we can.

I think it may be useful to break it down into more situations.

1. The tree is trivially solvable, even if it is big. This is common. For example Servo, when I tested it, does no backtracking at all. -> No error messages.
2. The tree is solvable, given an exponential amount of time. Make resolution backtracking smarter #4834 will make it much more rare, but the problem is NP-complete so this is always a possibility. -> We should print some kind of diagnostic to help the programmer make it easier for us, either at some time into resolution, like we do with Resolving dependency graph..., or when we bail after a user-configured amount of time, as you suggested in Abort crate resolution if too many candidates have been tried #4066.
3. The tree has no solutions. -> We need to report something useful to the programmer.

We should keep in mind that the most common reason for any kind of error is a simple typo.
And that the user's mental model does not include millions of permutations of ancient versions of the dependencies.

The first time we discover that the tree is not trivial, we are in the state closest to the user's mental model of "just take the newest version of everything". And almost anything we print will be helpful in discovering that it is just a typo. So I think we should print a good causal chain for the backtracking to the user, and continue the search.

Another heuristic: if something deep in our tree is unsolvable, then we probably hit almost the same error message lots of times. For example, we depend on A=* (with 100 versions to choose from), which depends on B=* (with 100 versions to choose from), which depends on C=*, all versions of which have been yanked. If we printed all errors (a thought experiment) we would have 10k "can't find a candidate for C ....", just 100 "can't find a candidate for B ...." and 1 "can't find a candidate for A ....". The most common one is the most helpful. So maybe 10 minutes into resolution we can print "we are still working on finding something that works but, here are the 3 most often encountered difficulties:", each with a good causal chain for the backtracking.
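The thought experiment can be checked with a small tally. This is only a sketch of the counting idea, with hypothetical helper names (`record_failures`, `most_common`); it models a straight chain of packages where every level has the same number of versions and the deepest package has none.

```rust
use std::collections::HashMap;

/// Walk a chain of packages (outermost first). Every package except the last
/// has `versions` candidates; the last one has none (all yanked). Each time a
/// package runs out of candidates, record one failure message for it.
fn record_failures(names: &[&str], versions: usize, tally: &mut HashMap<String, usize>) {
    if let Some((name, rest)) = names.split_first() {
        let candidates = if rest.is_empty() { 0 } else { versions };
        for _ in 0..candidates {
            // Every candidate leads to the same doomed subtree.
            record_failures(rest, versions, tally);
        }
        *tally
            .entry(format!("can't find a candidate for {}", name))
            .or_insert(0) += 1;
    }
}

/// Return the `n` most frequently recorded messages, most common first.
fn most_common(tally: &HashMap<String, usize>, n: usize) -> Vec<(&str, usize)> {
    let mut entries: Vec<(&str, usize)> =
        tally.iter().map(|(msg, count)| (msg.as_str(), *count)).collect();
    // Sort by descending count, breaking ties by message for stable output.
    entries.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    entries.truncate(n);
    entries
}
```

Calling `record_failures(&["A", "B", "C"], 100, &mut tally)` yields 10,000 entries for C, 100 for B and 1 for A, so `most_common(&tally, 3)` lists the failure for C first.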

All of these thoughts depend on being able to print the message as part of some context, and then continue with the search. How do I do that? I could make a function that returns the errors as strings. But the words "errors as strings" make me very nervous.

alexcrichton · 2018-02-05T21:47:02Z

All excellent points! I think we should take it as an assumption that if we're going to produce a human readable error then we can reach that conclusion in a reasonable amount of time. In that sense I hope that #4834 will get us closer to that goal but we may have other bugs of "Cargo goes in the weeds too much". I don't think, though, that such a situation should impact the design of the error messages here.

Now I will say we have two other tricks up our sleeve for "gee resolution is taking forever":

1. First we can actually inform the user what's happening. Recently Cargo changed to do this and will print out, after I think a second or so of resolution, "Resolving crate graph...". It's not much but it at least lets the user know we're not stuck in network traffic.
2. Second we should probably implement Cargo just giving up at some point. If Cargo has attempted a million graphs and they're all failing, it's doubtful that the million+1th graph will work. This is sort of covered by Abort crate resolution if too many candidates have been tried #4066 where we should (a) have a hard limit on the amount of time Cargo takes during resolution and (b) when this error happens mention something like "hey this may be a bug in Cargo, just letting you know"
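As a rough sketch of the "just give up" idea, the search below carries a budget and stops once it is exhausted. It assumes a limit on the number of attempts rather than on wall-clock time, and `Budget`, `ResolveError` and `search` are made-up names, not anything in Cargo.

```rust
#[derive(Debug, PartialEq)]
enum ResolveError {
    /// Every graph was tried and none worked.
    NoSolution,
    /// The attempt limit was hit first; this may be a bug in the resolver.
    GaveUp { limit: u64 },
}

struct Budget {
    attempts: u64,
    limit: u64,
}

impl Budget {
    fn new(limit: u64) -> Self {
        Budget { attempts: 0, limit }
    }

    /// Count one attempt and fail once the limit has been exceeded.
    fn tick(&mut self) -> Result<(), ResolveError> {
        self.attempts += 1;
        if self.attempts > self.limit {
            Err(ResolveError::GaveUp { limit: self.limit })
        } else {
            Ok(())
        }
    }
}

/// Exhaustively explore a tree with `branching` candidates per level and
/// `depth` levels in which no leaf is a solution.
fn search(depth: u32, branching: u32, budget: &mut Budget) -> Result<(), ResolveError> {
    budget.tick()?;
    if depth == 0 {
        return Err(ResolveError::NoSolution);
    }
    for _ in 0..branching {
        match search(depth - 1, branching, budget) {
            Ok(()) => return Ok(()),
            // An ordinary failure: backtrack and try the next candidate.
            Err(ResolveError::NoSolution) => continue,
            // Out of budget: stop exploring and propagate the error.
            Err(gave_up) => return Err(gave_up),
        }
    }
    Err(ResolveError::NoSolution)
}
```

Keeping `GaveUp` distinct from `NoSolution` lets the caller attach the "this may be a bug in Cargo" note only when the limit, and not a genuine conflict, ended the search.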

In that sense I think your second and third cases would fold into one another. I definitely agree that mistakes tend to be common typos before we get to serious "oh dear what to do" scenarios. Despite that though if Cargo is indeed successful I don't think that we should print out anything (even if we hit errors along the way). I think that only if we end up failing entirely should we print an error.

We could certainly explore though a more progress-bar like situation for Cargo where it prints incrementally what it's doing for debugging perhaps?

> All of these thoughts depend on being able to print the message as part of some context, and then continue with the search. How do I do that? I could make a function that returns the errors as strings. But the words "errors as strings" make me very nervous.

I think it's sort of encoded as CargoResult or weird options in resolve today, but I think we should stick with that. An error happening only happens in the "slow" case where "slow" means that you've modified or don't already have a Cargo.lock. On the fast path we'll never hit an error because the Cargo.lock means we'll resolve precisely to the same one as before.

In that sense I wouldn't be too worried about making errors expensive to compute, it hopefully isn't too bad! We can of course profile and test on Servo though to test this out.

Eh2406 · 2018-02-05T22:58:36Z

Ok, I think we mostly agree, and if we don't it is because we are talking past each other. I think I need to start thinking more "incrementally". I should start with reading the old 'links' code and seeing if I can make the current error messages better. Only then come back to the larger questions of when to print and for which backtracks.

Needs a test where we have an activation_error, then try to activate something that does not work and backtrack to where we had the activation_error, and then either:
- hit fast backtracking that goes past the crate with the missing features, or
- give a bad error message that does not mention the activation_error.

The test will pass, but there is code that is not yet justified by tests.

alexcrichton · 2018-03-01T17:12:33Z

tests/testsuite/features.rs

+    ... required by package `foo v0.0.1 ([..])`
+versions that meet the requirements `*` are: 0.0.1
+
+the package `bar` depends on `bar`, with features: `bar` but it does not have these features.


Hm shouldn't this read "the package foo" instead of "the package bar"?

Yes, and it is fixed. Thanks. (Sorry about all the force pushing.)

alexcrichton · 2018-03-01T17:13:29Z

tests/testsuite/features.rs

+package `bar v0.0.1 ([..])`
+    ... which is depended on by `foo v0.0.1 ([..])`
+
+all possible versions conflict with previously selected packages.


Would this clause be possible to remove in this scenario? I think it's more applicable to version conflicts than feature conflicts, right?

Eh2406 · 2018-03-01T18:41:46Z

I think this is finally ready to go.

Some things I did not figure out how to test:

- Where to clone BacktrackFrame. I think I have this right. I think my initial version could have had O(n^2), but I could not get a test case that proved it.
- Adding conflicting_activations to the BacktrackFrame. I think this is needed as I described in the commit, but I could not get a test case that proved it.

alexcrichton · 2018-03-01T21:27:14Z

Nah this looks great to me, thanks @Eh2406

@bors: r+

bors · 2018-03-01T21:27:15Z

📌 Commit 2cbd1dd has been approved by alexcrichton

bors · 2018-03-01T21:27:22Z

⌛ Testing commit 2cbd1dd with merge 382967a...


bors · 2018-03-01T21:49:06Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 382967a to master...

missed this important bug In the PR #5000 I finished and we merged yesterday I missed a bug and left in an outdated comment. @alexcrichton

Faster resolver: Cache past conflicting_activations, prevent doing the same work repeatedly.

This work is inspired by @alexcrichton's [comment](#4066 (comment)) that a slow resolver can be caused by all versions of a dependency being yanked. That stuck in my brain, as I did not understand why it would happen: if a dependency has no candidates then it will be the most constrained and will trigger backtracking in the next tick. Eventually I found a reproducible test case. If the bad dependency is deep in the tree of dependencies then we activate and backtrack `O(versions^depth)` times. Even though it is fast to identify the problem, that is a lot of work.

**The set up:**
1. Every time we backtrack, cache the (dep, `conflicting_activations`).
2. Build on the work in #5000: fail to activate if any of its dependencies will just backtrack to this frame. I.e., for each dependency, check if any of its cached `conflicting_activations` are already all activated. If so, we can just skip to the next candidate. We also add that bad `conflicting_activations` to our set of `conflicting_activations`, so that we can...

**The pay off:**
If we fail to find any candidates that we can activate in light of 2, then we cannot be activated in this context, so we add our (dep, `conflicting_activations`) to the cache so that next time our parent will not bother trying us.

I hear you saying "but the error messages, what about the error messages?" So if we are at the end (`!has_another`) then we disable this optimization. After we mark our dep as being not activatable, we activate anyway. It won't resolve, but it will have the same error message as before this PR. If we have been activated for the error messages then skip straight to the last candidate, as that is the only backtrack that will end with the user.

I added a test in the vein of #4834. With the old code the time to run was `O(BRANCHING_FACTOR ^ DEPTH)` and took ~3min with DEPTH = 10; BRANCHING_FACTOR = 5; with the new code it runs almost instantly with 200 and 100.
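A toy model of the caching idea, to show why the activation count drops from exponential to linear. This is a sketch under strong assumptions and not the code from #5168: `ToyResolver` and its fields are hypothetical, the dependency graph is a straight chain with `branching` versions per level, and because the deepest dependency fails no matter what is active, every cached conflict set here is empty.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical package id: (level in the dependency chain, version number).
type PackageId = (usize, u32);

struct ToyResolver {
    depth: usize,
    branching: u32,
    /// When true, the dependency below the last level has no candidates.
    leaf_yanked: bool,
    use_cache: bool,
    /// level -> conflict sets already known to make that dependency fail.
    past_conflicts: HashMap<usize, Vec<BTreeSet<PackageId>>>,
    activations: u64,
}

impl ToyResolver {
    fn new(depth: usize, branching: u32, leaf_yanked: bool, use_cache: bool) -> Self {
        ToyResolver {
            depth,
            branching,
            leaf_yanked,
            use_cache,
            past_conflicts: HashMap::new(),
            activations: 0,
        }
    }

    /// Try to satisfy the dependency at `level`. On failure, return the set
    /// of active packages involved in the conflict.
    fn resolve_dep(
        &mut self,
        level: usize,
        active: &mut BTreeSet<PackageId>,
    ) -> Result<(), BTreeSet<PackageId>> {
        if level == self.depth && !self.leaf_yanked {
            return Ok(());
        }
        let mut conflicts = BTreeSet::new();
        let versions = if level == self.depth { 0 } else { self.branching };
        for version in 0..versions {
            let id = (level, version);
            active.insert(id);
            // Skip the candidate if its dependency is already known to fail
            // with what is currently activated.
            let known_bad = if self.use_cache {
                self.known_conflict(level + 1, active)
            } else {
                None
            };
            let failure = match known_bad {
                Some(cached) => cached,
                None => {
                    self.activations += 1;
                    match self.resolve_dep(level + 1, active) {
                        Ok(()) => return Ok(()),
                        Err(found) => found,
                    }
                }
            };
            active.remove(&id);
            conflicts.extend(failure.into_iter().filter(|p| *p != id));
        }
        if self.use_cache {
            // Remember why this dependency could not be activated.
            self.past_conflicts.entry(level).or_default().push(conflicts.clone());
        }
        Err(conflicts)
    }

    /// A cached conflict applies if all of its packages are currently active.
    fn known_conflict(
        &self,
        level: usize,
        active: &BTreeSet<PackageId>,
    ) -> Option<BTreeSet<PackageId>> {
        self.past_conflicts
            .get(&level)?
            .iter()
            .find(|set| set.is_subset(active))
            .cloned()
    }
}
```

With depth 3 and branching 2, the uncached search performs 2 + 4 + 8 = 14 activations, while the cached one performs 3, one per level. The sketch leaves out the error-message handling described above (disabling the optimization at the last candidate).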

rust-highfive assigned alexcrichton Feb 1, 2018

Eh2406 mentioned this pull request Feb 14, 2018

Conflict tracking #5037

Merged

Eh2406 force-pushed the i4347 branch 2 times, most recently from c1abbab to 4ece446 Compare March 1, 2018 04:40

Eh2406 added 2 commits February 28, 2018 23:40

add a test for cargo/issues/4347

8348469

WIP: make it recurs if activation fails

3300728

Eh2406 force-pushed the i4347 branch from 4ece446 to 0a064b4 Compare March 1, 2018 04:41

Eh2406 force-pushed the i4347 branch from 0a064b4 to 5d4402c Compare March 1, 2018 11:34

One more todo to fix

6218584

alexcrichton reviewed Mar 1, 2018

Eh2406 force-pushed the i4347 branch 2 times, most recently from d0a828d to 26b96a4 Compare March 1, 2018 18:23

fix cause and the error messages

2cbd1dd

Eh2406 force-pushed the i4347 branch from 26b96a4 to 2cbd1dd Compare March 1, 2018 18:24

bors merged commit 2cbd1dd into rust-lang:master Mar 1, 2018

Eh2406 mentioned this pull request Mar 2, 2018

missed this important bug #5104

Merged

bors added a commit that referenced this pull request Mar 2, 2018

Auto merge of #5104 - Eh2406:bug_fix, r=alexcrichton

8f76fde

Eh2406 mentioned this pull request Mar 6, 2018

make RemainingCandidates::next peekable. #5044

Merged

alexcrichton mentioned this pull request Mar 6, 2018

Regression in resolver performance #5130

Closed

Eh2406 deleted the i4347 branch March 6, 2018 20:46

Eh2406 mentioned this pull request Mar 12, 2018

Faster resolver: Cache past conflicting_activations, prevent doing the same work repeatedly. #5168

Merged

Eh2406 mentioned this pull request Dec 20, 2018

Cargo silently ignores patch crates with a missing feature. #6444

Closed

ehuss added this to the 1.26.0 milestone Feb 6, 2022
