pip uses backtracking when dependency installation fails #9764
Comments
This is basically an impossible problem. The first design was actually to fail the entire installation on build failures, and we got a flood of requests for the current behaviour, so we implemented it. Judging from the initial backlash (and the quietness after we released the change until this issue), I am assuming more people find the current behaviour more useful. |
I see! I still believe something like |
Yeah, a flag like that makes a lot of sense. Once the legacy resolver is removed entirely, we can start implementing various “strategy flags” for the resolver, and this is one of the first I’m looking to have as well. |
I have also reported the problem, and I don't understand: there is no reason to not fail fast in any circumstances, when the outcome is anyway to fail:
When other people complained, it was probably because several issues around the new resolver were still intermixed, both in the code and in people's minds. Like in chess: if I make that move I will lose my king, ... but maybe I can still take his queen? |
Is there a typo here? The sentence only makes sense if I read it as “there is no reason to not fail fast”. The problem is exactly that not every build failure is created the same, and the final outcome is not always to fail. Indeed it makes no sense to continue if a package version fails to build when it is expected to build, but not all projects manage their distributions like this. Some release versions that only target certain platforms and are not expected to build on others, for example. You may argue they are using the project version incorrectly (full disclosure, I’m personally quite annoyed by them), but pip needs to take a very lax position toward those scenarios since Python packaging has been historically permissive in this area (far too much IMO), especially regarding sdists. Again, I do feel failing immediately on a build failure is a reasonable approach, but that’s not something we can just do. (Another consideration is that you can always add constraints to limit backtracking, but there’s no way to tell pip to continue if a build failure exits the process immediately.) |
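To make the trade-off above concrete, here is a minimal sketch of the two behaviours. It is a toy model, not pip's code: `pick_version`, `BuildError`, `fake_build`, and the `fail_fast` parameter are all hypothetical names invented for illustration. It assumes a project whose newest release cannot be built on the current platform while an older one can.

```python
from typing import Callable, Iterable, Optional


class BuildError(Exception):
    """Raised by the hypothetical build step when a version cannot be built."""


def pick_version(
    versions: Iterable[str],
    build: Callable[[str], None],
    fail_fast: bool = False,
) -> Optional[str]:
    """Try versions newest-first and return the first one that builds."""
    for version in versions:
        try:
            build(version)
        except BuildError:
            if fail_fast:
                raise  # stop at the very first build failure
            continue  # backtrack: move on to the next older version
        return version
    return None  # every version failed to build


def fake_build(version: str) -> None:
    # Assumed scenario: 2.0 has no build for this platform, older releases do.
    if version == "2.0":
        raise BuildError(f"cannot build {version} on this platform")


print(pick_version(["2.0", "1.9", "1.8"], fake_build))  # prints 1.9
```

With `fail_fast=True` the same call raises `BuildError` on 2.0 and never reaches 1.9, which is the case the lax default is meant to keep working.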
Typo corrected |
A quick failure would be great! An example requirements.txt that creates endless backtracking (using pip 21.0.1) :
...but! If we install the 2 dependencies in the following way, pip will break nicely:
The nice way of breaking occurs as the second step fails:
When we install the two dependencies in combination with |
But that's essentially the reason why we have a resolver at all - "figuring out a set of versions that works" is in general a really hard problem, and people aren't able to do it by hand (without significant effort). In some cases - possibly including this one - working out a usable set of restrictions is easy enough, but how do we know if that's the case up front (without doing the resolve that caused the problem in the first place)? In this particular case, is there an earlier version of module-docker which supports docker 3.7? How would you find that version to install it? The only way I can imagine is checking each older version of module-docker to find one that works - which is all that pip is doing. So how would it be any better if you did that yourself? As @uranusjr said, "stop on first failure" is a reasonable option to offer, and once we've managed to remove all of the "old resolver" code so we can make changes like that, we'll probably look at offering it as an option. But I don't know how useful it will be in practice - I suspect we'll just have people requesting that when failing, pip provide more information to help them work out how to fix the problem, which is precisely what we won't be able to do, because we stopped when we hit the error! |
I would also like to point out again the distinction between
|
Again, I think the problem is how does pip know it's a missing system library (for example) that caused the problem? That's the key issue. |
It can't, but it should know when the problem is due to a version conflict, shouldn't it? |
No, because it can't get the dependencies without doing a build, which is what fails (at least sometimes)... But if you're saying that any build failure should cause an immediate stop, isn't that what both @uranusjr and I have confirmed is reasonable to have as an option? |
Got it, I thought that could be inferred from the metadata without doing a build. I'm happy with such an option then! |
Sorry if this sent the explanations back to square one. I think we agree on everything except what to do about actual hours of backtracking. As I hear you, it's a "wontfix" because that's how it's designed? Bug or design? This post became a bit long, but please don't read it as demanding or negative: the dependency resolver is great and I hope it's okay to suggest improvements. I'll keep working with the same isolated example: one package is pinned to an old version, and another, more recent package dependency has a reverse dependency on the same package, but with a new version.
I can't think of a simpler and smaller example than this. It spawned hours of backtracking, and I don't even know where it ends :) The resources used are (a lot of) network, 100% CPU, a local pip .cache filling up, and possibly network proxies filling up too. You have the helicopter perspective of this from the PyPI usage data. The default behavior is very bad and causes CIs to lock up for hours. With pre-commit, environments are created in a hidden way without the user's knowledge. Even when the user monitors the terminal, thousands of lines of output are hard to understand, and identifying which version specification causes the issue is unlikely. I would expect things to Just Break®, rather than the current behavior, but I understand that the solution here is perhaps to find a compromise. It's possible to say that pip<20 (old dependency resolver) was already a compromise: it proceeded despite the version conflict but showed the user a warning, IIRC. But better solutions surely exist. The example isn't something that should be automatically backtracked or silently ignored; it should ideally break so the developer can fix it. Consider the example again:
Scenarios:
Question: Maybe I haven't noticed cases where backtracking was active and helpful because it Just Worked®? When you find that people have benefited from backtracking, how many steps have typically run?
Suggestion/Solution:
Example sketch (final ERROR text is a new mockup)
|
I love how every discussion on resolution eventually arrives at some kind of "counter + exit after a certain number of rounds" proposal. That works in the abstract, until you actually try to define what a “step” is in the resolver. I’m not saying it won’t work, but I have been through this multiple times and don’t really want to go into the same abstract discussions again. Feel free to try your idea out and come up with a proof of concept; we can continue the discussion from there. |
I agree that the behaviour here is bad, but I don't have enough information to understand why it's happening yet. Let's come back to that, though. Regarding your proposed solution:
We have that. It's called The problem you have appears to be that in your case, the time spent is not because of too many rounds. But we don't know what it is. So we need more information. If you were to profile your case, and identify:
Then, we might be able to determine where the problem lies in your case. Without trying to pre-judge, I'm fairly certain that the answer won't be something pip can address easily (typically, it's builds that take a long time to complete). Some workarounds which I'm sure aren't acceptable, but may give you some food for thought:
None of these will fix the issue, but they may give you insights, and possibly even suggest a way forward. If you produce a proof of concept fix from that which helps your issue, we'd love to know.
Quite probably. We have many millions of people using pip daily. And we've had people comment that the new resolver was a significant benefit for them. Honestly, do you really think we would have released the new resolver if we'd had feedback that it was a net loss? This is probably the most extensively publicised feature pip has ever released, and we did more user research on it than we ever had before (thanks to the funding we received). So yes, I'm afraid you are in a small minority here. I know that's no help to you personally, but as pip maintainers we have to look at the wider picture.
We have no idea. Nobody tells us anything when things work well. Maybe you can imagine how demoralising that can be? Particularly when people who raise issues assume we have all that information to hand 🙁 I think we're just going round in circles now (ironic, really 😉). I suggest that if you want to make progress with this, you profile where pip is spending its time, as I suggested above, and give us some feedback on precisely what pip (or the build tool) is doing in all that time. |
Once again, thanks for taking the time to elaborate, @pfmoore. I hope it is both encouraging and motivating that everyone here is happy with the new resolver as well and is just trying to find solutions to perceived problems. |
I tried reducing the number 2000000 down to 2, or even 0. My issue is that when it fails (now quickly), it still doesn't tell me what the problem is, or the first problem it had, so I don't know which package is causing it:
|
Well, a single round can include backtracking various versions of a single package. Setting it to 2, however, would mean you can't install more than 2 packages. :) |
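The point about rounds can be illustrated with a simplified model. This is a sketch under stated assumptions, not pip's or resolvelib's actual algorithm: `resolve`, `TooManyRounds`, and `acceptable` are hypothetical names, and the model assumes each round pins exactly one package, trying all of that package's versions inside the same round.

```python
from typing import Callable, Dict, List, Tuple


class TooManyRounds(Exception):
    """Raised when the round budget runs out before every package is pinned."""


def resolve(
    candidates: Dict[str, List[str]],
    acceptable: Callable[[str, str], bool],
    max_rounds: int,
) -> Tuple[Dict[str, str], int]:
    """Pin one package per round; return the pins and the rounds used."""
    pinned: Dict[str, str] = {}
    rounds = 0
    while len(pinned) < len(candidates):
        if rounds >= max_rounds:
            raise TooManyRounds(f"gave up after {rounds} rounds")
        rounds += 1
        name = next(n for n in candidates if n not in pinned)
        # Trying version after version of one package stays inside one round.
        for version in candidates[name]:
            if acceptable(name, version):
                pinned[name] = version
                break
        else:
            raise LookupError(f"no acceptable version of {name}")
    return pinned, rounds


# Hypothetical index: versions are listed newest first.
candidates = {"a": ["3", "2", "1"], "b": ["2", "1"], "c": ["1"]}


def only_oldest(name: str, version: str) -> bool:
    # Assumed rule for the example: only version "1" of anything is usable.
    return version == "1"


print(resolve(candidates, only_oldest, max_rounds=3))
```

In this model three packages need three rounds no matter how many versions are rejected along the way, so `max_rounds=2` raises `TooManyRounds`. The exception also says nothing about which package was at fault, which matches the uninformative failure described above.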
Merging into #10655 since it covers this problem, and we don’t really need two issues on this. |
Description
When installation of a dependency fails, pip uses the backtracking feature to try other versions of the package (even if the failure is not due to a version conflict).
Expected behavior
I understand that backtracking is useful for solving version conflicts. Trying different versions when the installation fails for a reason other than a version conflict is IMO not useful most of the time, as the failure often indicates a missing system package.
I find this particularly annoying during CI tests, as it takes forever before the test actually fails. If this is intended behavior, it would be great to have a flag to disable it.
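Until such a flag exists, one possible stopgap for CI is to give the install step a time budget from the outside. The sketch below is only an assumption about how a CI script might do that: `run_with_timeout` is a hypothetical helper, and the 600-second budget and the `requirements.txt` path are placeholders. It relies only on the standard library's `subprocess.run` and its `timeout` parameter.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(command: List[str], timeout_seconds: float) -> Optional[int]:
    """Run a command; return its exit code, or None if it ran out of time."""
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None
    return completed.returncode


# Hypothetical CI step (placeholder path and budget), left commented out here:
PIP_INSTALL = [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
# exit_code = run_with_timeout(PIP_INSTALL, timeout_seconds=600)
# if exit_code is None:
#     sys.exit("pip install exceeded its time budget (possibly backtracking)")
```

This does not make pip fail on the first build error; it only bounds how long a CI job can spend before it is reported as failed.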
pip version
21.0.1
Python version
3.7.10
OS
arch linux
How to Reproduce
As an example, I install scikit-bio into a clean environment (which fails, because the package doesn't properly declare the numpy dependency)
Output
Code of Conduct
I agree to follow the PSF Code of Conduct.