
New resolver downloads hundreds of different package versions, without giving reason #9215

Closed
anetbnd opened this issue Dec 3, 2020 · 56 comments
Labels
C: dependency resolution About choosing which dependencies to install

Comments

@anetbnd

anetbnd commented Dec 3, 2020

  • pip version: 20.3.1
  • Python version: Python 3.8.5
  • Operating system: Windows 10 or Ubuntu

We switched today from pip 20.2.2 to 20.3.1, and suddenly, in a big project with a very long dependency list (over 60 packages), pip no longer manages to install the dependencies from the requirements.txt file. It always tries to install hundreds of different versions of the same package. Sometimes it only tries out 3 or 10 versions, which takes some time but works in the end; sometimes it downloads and installs every version it can find.

Here is what happens with the package requests:

INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking 
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking 
INFO: pip is looking at multiple versions of requests to determine which version is compatible with other requirements. This could take a while. 
Collecting requests 
Downloading requests-2.24.0-py2.py3-none-any.whl (61 kB) 
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /packages/1a/70/1935c770cb3be6e3a8b78ced23d7e0f3b187f5cbfab4749523ed65d7c9b1/requests-2.23.0-py2.py3-none-any.whl 
Downloading requests-2.23.0-py2.py3-none-any.whl (58 kB) 
INFO: pip is looking at multiple versions of chardet to determine which version is compatible with other requirements. This could take a while. 
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl 
Downloading requests-2.22.0-py2.py3-none-any.whl (57 kB) 
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /packages/7d/e3/20f3d364d6c8e5d2353c72a67778eb189176f08e873c9900e10c0287b84b/requests-2.21.0-py2.py3-none-any.whl 
Downloading requests-2.21.0-py2.py3-none-any.whl (57 kB) 
Downloading requests-2.20.1-py2.py3-none-any.whl (57 kB) 
Downloading requests-2.20.0-py2.py3-none-any.whl (60 kB) 
Downloading requests-2.19.1-py2.py3-none-any.whl (91 kB) 
Downloading requests-2.19.0-py2.py3-none-any.whl (91 kB) 
Downloading requests-2.18.4-py2.py3-none-any.whl (88 kB) 
Downloading requests-2.18.3-py2.py3-none-any.whl (88 kB) 
Downloading requests-2.18.2-py2.py3-none-any.whl (88 kB) 
Downloading requests-2.18.1-py2.py3-none-any.whl (88 kB) 
Downloading requests-2.18.0-py2.py3-none-any.whl (563 kB) 
Downloading requests-2.17.3-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.17.2-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.17.1-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.17.0-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.16.5-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.16.4-py2.py3-none-any.whl (87 kB) 
Downloading requests-2.16.3-py2.py3-none-any.whl (86 kB) 
Downloading requests-2.16.2-py2.py3-none-any.whl (86 kB) 
Downloading requests-2.16.1-py2.py3-none-any.whl (85 kB) 
Downloading requests-2.16.0-py2.py3-none-any.whl (85 kB) 
Downloading requests-2.15.1-py2.py3-none-any.whl (558 kB) 
Downloading requests-2.14.2-py2.py3-none-any.whl (560 kB) 
Downloading requests-2.14.1-py2.py3-none-any.whl (559 kB) 
Downloading requests-2.14.0-py2.py3-none-any.whl (559 kB) 
INFO: pip is looking at multiple versions of requests to determine which version is compatible with other requirements. This could take a while. 
Downloading requests-2.13.0-py2.py3-none-any.whl (584 kB) 
Downloading requests-2.12.5-py2.py3-none-any.whl (576 kB)
Downloading requests-2.12.4-py2.py3-none-any.whl (576 kB) 
Downloading requests-2.12.3-py2.py3-none-any.whl (575 kB) 
Downloading requests-2.12.2-py2.py3-none-any.whl (575 kB) 
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking 
Downloading requests-2.12.1-py2.py3-none-any.whl (574 kB) 
Downloading requests-2.12.0-py2.py3-none-any.whl (574 kB) 
Downloading requests-2.11.1-py2.py3-none-any.whl (514 kB) 
Downloading requests-2.11.0-py2.py3-none-any.whl (514 kB) 
Downloading requests-2.10.0-py2.py3-none-any.whl (506 kB) 
Downloading requests-2.9.2-py2.py3-none-any.whl (502 kB) 
Downloading requests-2.9.1-py2.py3-none-any.whl (501 kB) 
Downloading requests-2.9.0-py2.py3-none-any.whl (500 kB) 
Downloading requests-2.8.1-py2.py3-none-any.whl (497 kB) 
Downloading requests-2.8.0-py2.py3-none-any.whl (476 kB) 
Downloading requests-2.7.0-py2.py3-none-any.whl (470 kB) 
Downloading requests-2.6.2-py2.py3-none-any.whl (470 kB) 
Downloading requests-2.6.1-py2.py3-none-any.whl (469 kB) 
Downloading requests-2.6.0-py2.py3-none-any.whl (469 kB) 
Downloading requests-2.5.3-py2.py3-none-any.whl (468 kB) 
Downloading requests-2.5.2-py2.py3-none-any.whl (474 kB) 
Downloading requests-2.5.1-py2.py3-none-any.whl (464 kB) 
Downloading requests-2.5.0-py2.py3-none-any.whl (464 kB) 
Downloading requests-2.4.3-py2.py3-none-any.whl (459 kB) 
Downloading requests-2.4.2-py2.py3-none-any.whl (459 kB) 
Downloading requests-2.4.1-py2.py3-none-any.whl (458 kB) 
Downloading requests-2.4.0-py2.py3-none-any.whl (457 kB) 
Downloading requests-2.3.0-py2.py3-none-any.whl (452 kB) 
Downloading requests-2.2.1-py2.py3-none-any.whl (625 kB) 
Downloading requests-2.2.0-py2.py3-none-any.whl (623 kB) 
Downloading requests-2.1.0-py2.py3-none-any.whl (445 kB) 
Downloading requests-2.0.1-py2.py3-none-any.whl (439 kB) 
Downloading requests-2.0.0-py2.py3-none-any.whl (391 kB) 
Downloading requests-1.2.3.tar.gz (348 kB) 
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking 
Downloading requests-1.2.2.tar.gz (348 kB) 
Downloading requests-1.2.1.tar.gz (348 kB) 
Downloading requests-1.2.0.tar.gz (341 kB) 
Downloading requests-1.1.0.tar.gz (337 kB) 
Downloading requests-1.0.4.tar.gz (336 kB) 
Downloading requests-1.0.3.tar.gz (335 kB) 
Downloading requests-1.0.2.tar.gz (335 kB) 
Downloading requests-1.0.1.tar.gz (335 kB) 
Downloading requests-1.0.0.tar.gz (335 kB) 
Downloading requests-0.14.2.tar.gz (361 kB) 
ERROR: Command errored out with exit status 1: 
command: /opt/venv/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-t94ak9d1 cwd: /tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/ 
Complete output (11 lines): 
Traceback (most recent call last): 
File "<string>", line 1, in <module> 
File "/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/setup.py", line 6, in <module> 
import requests 
File "/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/requests/__init__.py", line 52, in <module>
 from . import utils 
File "/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/requests/utils.py", line 22, in <module> 
from .compat import parse_http_list as _parse_list_header 
File "/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/requests/compat.py", line 112, in <module> 
from .packages import chardet2 as chardet 
ImportError: cannot import name 'chardet2' from 'requests.packages' (/tmp/pip-install-ilzlnu9j/requests_17e83d46f96d481997d94d444ebabe71/requests/packages/__init__.py) 
----------------------------------------

While I understand that this comes from the new resolver and a potential compatibility conflict with a certain package, I don't understand what the exact issue is here. We don't use requests directly; it seems to be a dependency of one of our own dependencies, so we are not directly in control of it. And pip does not give me any help in understanding which packages have version restrictions that lead to this behavior.

My suggestion here is to provide better debug output for developers. When pip has to try out different versions, it should say why it must do so. For example:

Checking package "requests-2.24.0-py2.py3-none-any.whl", but it cannot be used, because package "<any package>" restricts the version to X and "<another package>" requires version Y. Trying an older version.
...

This would help me understand where to begin fixing those conflicts. At the moment I'm absolutely lost and have to switch back to pip 20.2.2.

@brainwane
Contributor

brainwane commented Dec 3, 2020

Hello and thank you for your bug report! I'm sorry you're having trouble right now. Thank you for sharing your report with us and including some thoughts on how to address it -- I agree that a more specific error message would be better. I am going to defer to the resolver developers on whether this is an issue they are already addressing in a different issue or whether it's distinct.

You probably saw, in that output, the line:

INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking

but maybe we should also link to more info there, or have it show up again later.

Just in case it's useful, I'll mention here some useful troubleshooting and workaround tips from the documentation:

  • If pip is taking longer to install packages, read Dependency resolution backtracking for ways to reduce the time pip spends backtracking due to dependency conflicts.

  • If you don’t want pip to actually resolve dependencies, use the --no-deps option. This is useful when you have a set of package versions that work together in reality, even though their metadata says that they conflict. For guidance on a long-term fix, read Fixing conflicting dependencies.

  • In pip 20.3, you can choose the old resolver behavior using the flag --use-deprecated=legacy-resolver. This will work until we release pip 21.0 (see Deprecation timeline).

Do any of those tips help?

(If you don't mind, please also tell us what could have happened differently so you could have tested and caught and reported this during the pip resolver beta period.)

@anetbnd
Author

anetbnd commented Dec 3, 2020

Thank you very much! The information you provided is very helpful. I think I will use the legacy resolver as long as I don't know the root cause of the issue.

Regarding your last question: I tested this during the beta phase as well and saw the same issue. I reported it via https://pip.pypa.io/surveys/backtracking

But it is also my fault; I did not spend much time understanding the root cause back then. I thought this was a code error in pip rather than a dependency error, because I had not seen a clear dependency-conflict statement and could not believe that this is "normal" behavior in such cases. So I just filled out the survey and forgot about it until today.

@anetbnd
Author

anetbnd commented Dec 3, 2020

One further comment from my side: when I read the documentation, the overall advice for avoiding the issues above is "be more strict".
While it is clear that stricter version constraints reduce the search space and can, in an individual case, reduce pip's runtime, I think this is very bad advice in general. Once developers start pinning library versions more strictly, including in projects that are themselves libraries, we will get more compatibility issues with pip in the future.

Wouldn't it be better to say "be as open as possible and as strict as necessary", in order to guarantee that most libraries will keep working together in the future?

@lelit

lelit commented Dec 3, 2020

I just hit the same: I have the habit of being as strict as possible in my constraints.txt files, pinning almost every needed package, and their dependencies, recursively. I quickly discovered what was going on and was able to fix the issue, because luckily it was in one of my own packages, but I worry about similar cases where the fix won't be so easy.
In my case, the faulty package declared a dependency on celery[redis], and even though constraints.txt contains celery==4.4.7 and redis==3.5.3, the resolver wasn't able to handle the case, backtracking through all celery versions, starting from the current 5.0.3 and going back to 3.0.0, where it stopped because that version's setup.py failed with an error.
The fix was simple: I split the celery[redis] dependency in two, i.e. I made my package depend on celery and on redis.
Given that one cannot put celery[redis]==4.4.7 in constraints.txt, I guess that's the only option. But it is easy to imagine how difficult it can be to disentangle the tree when packages out of my control declare dependencies using extras...
Anyway, couldn't the resolver at least consider the constraints I provide, and avoid looking at versions of celery that will certainly not be accepted? I mean, if I say I want celery==4.4.7, what's the point of downloading all its historic versions, and even higher ones?
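The split described above can be pictured in the package's own metadata. This is only a sketch with a hypothetical install_requires list, since the actual package is not shown in the thread:

```python
# Hypothetical dependency declarations for the package described above.
# Before the fix, the dependency was declared via an extra, which a
# constraints.txt cannot pin (celery[redis]==4.4.7 is not allowed there):
before = ["celery[redis]"]

# After the fix: the extra is split into the two concrete packages, each
# of which *can* be pinned by constraints.txt entries such as
# celery==4.4.7 and redis==3.5.3.
after = ["celery", "redis"]

print(after)  # → ['celery', 'redis']
```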

@pradyunsg
Member

@lelit could you make a GitHub Gist with reproduction instructions for this? The resolver should be considering the constraints you pass it and restricting the versions it explores based on them.

@lelit

lelit commented Dec 3, 2020

Sure, I'll try to distill a recipe.

@lelit

lelit commented Dec 3, 2020

Ok, got it!
As the comment in constraints.txt says, the problem is triggered by the presence of an unsolvable dependency.
See https://gist.github.com/lelit/50410bec3e31bd98af8a1109fed2d673: executing pip install -c constraints.txt -r requirements.txt results in

$ pip install -c constraints.txt -r requirements.txt 
Collecting celery[redis]
  Using cached celery-5.0.3-py3-none-any.whl (392 kB)
  Using cached celery-5.0.2-py3-none-any.whl (392 kB)
  Using cached celery-5.0.1-py3-none-any.whl (392 kB)
  Using cached celery-5.0.0-py3-none-any.whl (389 kB)
  Using cached celery-4.4.7-py2.py3-none-any.whl (427 kB)
Collecting pgcli
  Downloading pgcli-3.0.0-py3-none-any.whl (70 kB)
     |████████████████████████████████| 70 kB 719 kB/s 
INFO: pip is looking at multiple versions of celery[redis] to determine which version is compatible with other requirements. This could take a while.
Collecting celery[redis]
  Using cached celery-4.4.6-py2.py3-none-any.whl (426 kB)
  Using cached celery-4.4.5-py2.py3-none-any.whl (426 kB)
  Using cached celery-4.4.4-py2.py3-none-any.whl (426 kB)
  Using cached celery-4.4.3-py2.py3-none-any.whl (424 kB)
  Using cached celery-4.4.2-py2.py3-none-any.whl (422 kB)
  Using cached celery-4.4.1-py2.py3-none-any.whl (422 kB)
  Using cached celery-4.4.0-py2.py3-none-any.whl (421 kB)
  Using cached celery-4.3.1-py2.py3-none-any.whl (415 kB)
....

@pradyunsg
Member

Hurray! Thanks for sharing. I'm AFK at the moment, but if you could try that with the current master branch, that'd be awesome. There's one fairly major fix merged since the release that'd make things more... efficient.

It can be installed with pip install git+https://github.com/pypa/pip.git.

@lelit

lelit commented Dec 4, 2020

@pradyunsg: using current master the "backtrace" moves to another package (pytz), while for celery it still visits “incompatible” v5.0.x versions.

@hwalinga

hwalinga commented Dec 6, 2020

I had a related problem, but somewhat worse, because pip seemingly not only downloads the versions but also verifies them. And there was one version of this package from 4 years ago that was mispackaged, and pip immediately aborted mission. See spyder-ide/spyder#14365

And there is more: after inspecting the dependency tree of Spyder, none of the dependencies requiring the package decorator specify any version constraint. Pip should just immediately install the newest version.

Based on this I have a few recommendations:

  • When backtracking, don't completely abort mission when a package is mispackaged.
  • Don't initiate backtracking when none of the dependents specify a maximum version; just install the latest version.
  • Don't download and introspect all possible versions when backtracking; evaluate them from most recent to oldest and immediately install the first version that satisfies the requirements.
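The third recommendation, trying candidates newest-first and stopping at the first one whose requirements are satisfiable, can be sketched like this. This is a simplified model, not pip's actual code; versions are plain tuples and the constraint check is illustrative:

```python
# Sketch of newest-first candidate selection during backtracking.
# A "constraint" is modelled as a predicate over a parsed version tuple.
def parse(v):
    """Turn '2.24.0' into the comparable tuple (2, 24, 0)."""
    return tuple(int(p) for p in v.split("."))

def pick_candidate(available, constraints):
    """Return the newest version satisfying every constraint, or None."""
    for v in sorted(available, key=parse, reverse=True):
        if all(c(parse(v)) for c in constraints):
            return v  # stop immediately; older versions are never inspected
    return None

# Example: no dependent imposes a maximum, so the newest version wins outright.
versions = ["2.22.0", "2.23.0", "2.24.0", "1.2.3"]
print(pick_candidate(versions, [lambda v: v >= (2, 0, 0)]))  # → 2.24.0
```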

@pradyunsg
Member

pradyunsg commented Dec 6, 2020

evaluate them from most recent to oldest and immediately install the first version that satisfies the requirements.

This is what the resolver does.

When backtracking don't completely abort mission when a package is mispackaged.

This behaviour is under discussion at #9203

Don't initialize backtracking when none of the dependents specify any maximum version, just install the latest version

In this case, yes indeed. That's what the behavior of the resolver would be.

@anetbnd
Author

anetbnd commented Dec 7, 2020

@hwalinga
It seems that there is some incompatibility in the package list, since otherwise it would not even start backtracking.
But you can work around the breaking installation (I have also seen it in my project above) by specifying in your requirements.txt file a version of this package that is higher than the version that breaks the installation.

@hwalinga

hwalinga commented Dec 7, 2020

Given #9232 it seems more people are seeing unnecessary backtracking.

@ljades

ljades commented Dec 7, 2020

  • In pip 20.3, you can choose the old resolver behavior using the flag --use-deprecated=legacy-resolver. This will work until we release pip 21.0 (see Deprecation timeline).

This needs to remain a feature in one way or another (maybe under a different flag?) when 21.0 releases. The new resolver has made testing several of our packages in Tox, particularly those with many ML-centric dependencies, virtually impossible.

@pfmoore
Member

pfmoore commented Dec 7, 2020

This needs to remain a feature in one way or another (maybe a different flag?) when 21.0 releases.

Is that because you don't expect to have addressed the issue before the 21.0 release? If that's the case, do you have any feel yet for how long it will take before you'll be ready to switch to the new resolver? We won't retain the old resolver indefinitely, but it's certainly an option to delay removing it a little longer than 21.0, if there's sufficient reason. (But there's always the option of pinning to an older version of pip, if you expect it to take a really long time to address the problems).

@hwalinga

hwalinga commented Dec 7, 2020

Is that because you don't expect to have addressed the issue before the 21.0 release?

If 21.0 is released next month, well yes, maybe there is not enough time to fix everything.

I suspect this is the most problematic issue currently: #9203 (comment) (at least, stuff broke for me), and I wonder whether the statement that this is uncommon is true (it wasn't oauthlib in my case, so that makes at least 2).

Given the statement that the problems occur particularly with ML-centric packages, I suspect old mispackaged packages are part of the problem as well. (ML experts aren't known to be experts in software packaging.)

As long as this is in pip, I wouldn't abandon the old resolver just yet:

# TODO: (Longer term) Rather than abort, reject this candidate
# and backtrack. This would need resolvelib support.

@ljades

ljades commented Dec 7, 2020

It's not that we don't expect to have addressed the issue before 21.0; it's that, right now, the new resolver without any flexibility will probably, by design, perpetuate the issue.

Here's a walkthrough of the issue:

The reason it is rough on Tox is that Tox works by creating a fresh virtualenv and reinstalling a package's dependencies from scratch every time it runs. We also dockerize this process so we can test for consistency regardless of platform.

When pip does the fresh reinstallation, running through the entire dependency resolution each time slows the process down dramatically. And on packages with heavier dependencies and wide ranges of version possibilities (we have even encountered 19-hour hangs trying to install future from a third-party sub-dependency), test installation times skyrocket.

Even if it finishes, any time in the future we need to test again, this process has to start all over.

Our projects use Poetry and Pipenv, which lock dependency resolutions so that they do not need to be repeated until dependency requirements explicitly change.

Right now, pip 20.3 does not seem to have this option to lock.

We do currently pin our pip version, for now at 20.2.4, but losing access to new features in later pip versions because we want the option of not always running dependency resolution (even Pipenv gives you the option, when debugging, to install dependencies without resolution and locking) feels like a harsh tradeoff.

@brainwane brainwane added the state: needs eyes Needs a maintainer/triager to take a closer look label Dec 9, 2020
@jaraco
Member

jaraco commented Dec 11, 2020

I've encountered this issue as well in this build. It doesn't happen on my local macOS workstation, nor on an Ubuntu Focal docker image, nor on the Python 3.8 build in GHA, but on Python 3.6 for macOS and Linux as found in GHA, the build fails when it encounters the dependency pytimeparse.

What's particularly interesting about pytimeparse is that it has no dependencies and is only required by one (transitive) dependency in the project (recapturedocs -> jaraco.mongodb -> pytimeparse). That is, I can't think of any reason why pip would spend time downloading anything but the most recent version of pytimeparse, and it should never trigger backtracking.

Also, when it attempts to download every version of pytimeparse, it fails when it gets to 1.0.0, which had a bug 6 years ago.

@uranusjr
Member

uranusjr commented Dec 11, 2020

@jaraco It is clear in retrospect that pytimeparse won't make a difference since it has no dependencies, but the resolver can't know that without downloading and trying the distribution, since Python package distributions store dependency information inside the package.

@hwalinga

@uranusjr That is completely correct, if pytimeparse indeed introduced conflicting dependencies. However, after downloading the most recent version of pytimeparse, it should conclude that that version can be installed without error, and proceed to install it without downloading all the other versions.

I have suggested this before, but when backtracking, pip should check versions of a package from most recent to oldest, and whenever it finds one that satisfies the requirements, go ahead and install it, without going through all possible versions. In almost all cases it will likely conclude that the most recent version suffices and install that.

However, that remark got dismissed because pip supposedly already does this (#9215 (comment)). But given that pip went ahead and seemingly dug through 6 years of the history of a simple package with no dependencies of its own, and which is a dependency of only one other package, I have a hard time believing that pip really does this correctly.

@uranusjr
Member

@hwalinga #9187 (comment)

@jaraco
Member

jaraco commented Dec 11, 2020

After some further investigation of the recapturedocs issue, I found I was able to replicate it locally by building on Python 3.6 (not sure why I didn't consider that earlier). I then attempted to build on Python 3.7 and got the error ERROR: Package 'recapturedocs' requires a different Python: 3.7.4 not in '>=3.8', a constraint I'd forgotten was present. That may explain the trigger for the pytimeparse scan. Still, I would expect two different behaviors:

  • If the root install target (.[testing]) is inherently incompatible with the environment, there's no value in evaluating combinations of dependencies; it will never resolve.
  • When building combinations of candidate dependencies to determine their dependencies, if a version of a package can't be built, it should be excluded, not crash the install.

jaraco added a commit to jaraco/pip-9216-repro that referenced this issue Dec 11, 2020
@pfmoore
Member

pfmoore commented Aug 12, 2021

By the time it had downloaded and analyzed all viable versions of poetry-core, it had all the information it needed to know there was no solution to the request.

One reason pip cannot use this information might be that there's no capability in packaging's specifier implementation to determine that a set of specifiers can never match. For example SpecifierSet(">1.0,<1.0") is "clearly" never going to match, but I can find no way in the SpecifierSet API to determine that fact. If such a feature were added to packaging, we might be able to use it to prune "obviously invalid" branches like this one. But without it, we can only test every version in turn.
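The kind of "obviously never matches" check described above could, in principle, look like the following. This is a minimal sketch for simple range specifiers only; everything that makes this hard in practice (~=, !=, epochs, pre-releases, local versions) is omitted, which is exactly why such a feature would belong in packaging itself:

```python
# Sketch: detect that a set of simple range specifiers can never match.
# Handles only >, >=, <, <= over dotted numeric versions.
def parse(v):
    """Turn '1.0' into the comparable tuple (1, 0)."""
    return tuple(int(p) for p in v.split("."))

def is_empty(specifiers):
    """True if the intersection of the given (op, version) bounds is empty."""
    low, low_incl = None, True    # greatest lower bound seen so far
    high, high_incl = None, True  # least upper bound seen so far
    for op, ver in specifiers:
        v = parse(ver)
        if op in (">", ">="):
            if low is None or v > low:
                low, low_incl = v, (op == ">=")
        elif op in ("<", "<="):
            if high is None or v < high:
                high, high_incl = v, (op == "<=")
    if low is None or high is None:
        return False  # unbounded on one side: always satisfiable
    if low > high:
        return True
    # Bounds meet at a single point: satisfiable only if both are inclusive.
    return low == high and not (low_incl and high_incl)

# The ">1.0,<1.0" example from the comment above: provably empty.
print(is_empty([(">", "1.0"), ("<", "1.0")]))  # → True
```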

Having said this, we shouldn't technically need to download sdists or wheels just to check whether their version matches a specifier - we assume that the version in the filename is accurate (which it should be, according to the specs, and we do later raise an error if the metadata doesn't agree). But in practice we only do this in the finder (which may only know a partial set of specifiers) - we "prepare" the candidate (which includes downloading) as part of constructing it, even if we don't need anything other than name and version yet.

Maybe we should reconsider this, and lazily prepare candidates, only doing so when necessary (to build, or when we need metadata other than name/version). IIRC, we did this originally, but abandoned it when it became complex, as it seemed like a premature optimisation, and eagerly preparing allowed better error reporting. It might be worth reconsidering this decision.

@pfmoore
Member

pfmoore commented Aug 12, 2021

Does the following patch help with this issue?

diff --git a/src/pip/_internal/resolution/resolvelib/candidates.py b/src/pip/_internal/resolution/resolvelib/candidates.py
index 5d510db86..ff2b4019a 100644
--- a/src/pip/_internal/resolution/resolvelib/candidates.py
+++ b/src/pip/_internal/resolution/resolvelib/candidates.py
@@ -153,7 +153,15 @@ class _InstallRequirementBackedCandidate(Candidate):
         self._ireq = ireq
         self._name = name
         self._version = version
-        self.dist = self._prepare()
+        self._is_prepared = False
+        self._dist = None
+
+    @property
+    def dist(self) -> Distribution:
+        if not self._is_prepared:
+            self._dist = self._prepare()
+            self._is_prepared = True
+        return self._dist

     def __str__(self) -> str:
         return f"{self.name} {self.version}"

Sorry, I don't have a Python 3.7 install to hand to test this right now. It's not a complete fix (the test suite fails because inconsistent metadata is reported incorrectly) but if it does address this issue, then that will demonstrate that there might be value in lazily preparing candidates.

@notatallshaw
Member

notatallshaw commented Aug 12, 2021

Does the following patch help with this issue?

I created a git tag by patching 21.2.3: notatallshaw@fd33b5a, and installed it with: python -m pip install git+git://github.com/notatallshaw/pip@lazy_dist

I could not find any improvement in the performance of either this reproducible example or the reproducible example in #10201. Let me know if there's anything you would like me to try.

@jaraco
Member

jaraco commented Aug 12, 2021

One reason pip cannot use this information might be that there's no capability in packaging's specifier implementation to determine that a set of specifiers can never match. For example SpecifierSet(">1.0,<1.0") is "clearly" never going to match, but I can find no way in the SpecifierSet API to determine that fact. If such a feature were added to packaging, we might be able to use it to prune "obviously invalid" branches like this one. But without it, we can only test every version in turn.

I started down this route, and was filing a bug in packaging describing the feature request, when I realized that maybe this feature isn't what's needed.

Having said this, we shouldn't technically need to download sdists or wheels just to check if their version matches a specifier - we assume that the version in the filename is accurate (which it should according to the specs, and we do later raise an error if the metadata doesn't agree). But we only do this in practice in the finder (which may only know a partial set of specifiers) - we "prepare" the candidate (which includes downloading) as part of constructing it, even if we don't need anything other than name and version yet.

Right. I'm also thinking it shouldn't be necessary to determine whether a specifier set is invalid, only that no packages in the index match it. So when the current candidate set of packages is twine==3.4.2 and poetry-core==1.0.3, which when expanded could produce a combined requirement like importlib-metadata<2,>3.6, pip should be able to ascertain from the index that no packages resolve for that Requirement.

And that approach is even more relevant, because the specifier might include valid specifiers that still match no packages in the index.

So it may be more complicated than you first realize.

I agree. And I've pondered this for scores of minutes and haven't imagined an algorithm that's obviously superior. I appreciate the effort and only seek to add examples and suggestions that may prove helpful.
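The index-based approach jaraco suggests, checking a combined requirement against the versions the index actually offers rather than proving it unsatisfiable in the abstract, might be sketched as follows. The version list and the combined requirement here are illustrative only:

```python
# Sketch: given the version strings an index offers for a project, decide
# whether a combined requirement like "<2,>3.6" matches any of them.
def parse(v):
    """Turn '4.6.3' into the comparable tuple (4, 6, 3)."""
    return tuple(int(p) for p in v.split("."))

OPS = {
    "<":  lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">":  lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "==": lambda a, b: a == b,
}

def any_match(index_versions, specifiers):
    """True if at least one index version satisfies every (op, ver) bound."""
    return any(
        all(OPS[op](parse(v), parse(bound)) for op, bound in specifiers)
        for v in index_versions
    )

# Illustrative version listing: a "<2,>3.6" requirement can be rejected
# from the index page alone, with no downloads at all.
offered = ["1.7.0", "2.1.1", "3.6.0", "4.6.3"]
print(any_match(offered, [("<", "2"), (">", "3.6")]))  # → False
```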

@flying-sheep

flying-sheep commented Aug 13, 2021

One suggestion: Please make pip mention why it has to go for backtracking.

In one of my use cases I

  1. tied down my dependencies just to make it backtrack less (e.g. scipy>=1.7.1)
  2. figured out that one of my packages (A) wanted click>=8.0 while celery wants click>=7.0,<8.0 and fixed that
  3. totally despaired when pip started backtracking scipy for another package’s (B) build, since B depends on A, and therefore still has scipy constrained to >=1.7.1.

With no justification, pip just starts downloading every scipy version under the sun. Please let it tell us:

well I tried this configuration, but couldn’t solve it:
[important info about failing constraints here]
so I’m randomly ignoring constraints to fulfill other constraints; have fun watching me download.

@robinpaulson

Is this bug related to the problem the Python Software Foundation is having with PyPI?

https://status.python.org/

It looks like their server overload problems date back to about the time of the changes to the dependency resolver.

@robinpaulson

from the link: " Investigating - PyPI's search backends are experiencing an outage causing the backends to timeout and fail, leading to degradation of service for the web app. Uploads and installs are currently unaffected but logged in actions and search via the web app and API access via XMLRPC are currently experiencing partial outages.
Dec 14, 09:41 UTC "

@notatallshaw
Member

is this bug related to this problem the python foundation is having with pypi?

https://status.python.org/

it looks like their server overload problems date back to about the time of the changes to the dep resolver

No, this is an unrelated issue. pip search was what previously used PyPI's XMLRPC interface; it appears some bad/careless actor was conducting an overwhelming number of searches per second, and XMLRPC was a legacy interface that couldn't handle the load. That interface is not used by pip install or pip download.

albeus pushed a commit to EMBL-EBI-TSI/cpa-bioexcel-cwl that referenced this issue Sep 8, 2021
Pip stalls trying to install cwlref-runner. It is related to the new
dependency resolver:
* pypa/pip#9215
* https://stackoverflow.com/questions/65122957/resolving-new-pip-backtracking-runtime-issue

This changeset introduces the use of pipenv to manage the setup.
facebook-github-bot pushed a commit to facebookresearch/Kats that referenced this issue Sep 20, 2021
Summary:
After D30550349 (5f44b20) and D30652610 (f1694ef), which introduced `neuralprophet` to Kats, a number of issues arose with the build, mostly stemming from the known issue that `neuralprophet` only supports older `torch` versions (ourownstory/neural_prophet#332).

The dependency web was quite complex, as some issues arose from dependencies of dependencies of dependencies. All my solutions are detailed below:

1. The `pip install -r test_requirements.txt` failed as `pip` could not resolve the added dependency complexities from downgrading the required `torch` version in D30884622 (8d0b005). This is a known issue in newer pip versions (pypa/pip#9215), so this diff forces the Github test to use the legacy resolver.
2. `torch>=1.4.0` was still resolving to `torch==1.8.0`, so this diff forces a version downgrade to `torch<1.7.0` which is compatible with `neuralprophet`.
3. Subsequently, the `gpytorch` dependency was failing to install due to it requiring a more recent `torch` version, so this diff downgrades the required `gpytorch` version to `1.2.1` which will still accept `torch==1.6.0`.
4. The newest version of `ax-platform` has `botorch==0.5.1` as a requirement, and this version of `botorch` requires `gpytorch>=1.5.1`, which is incompatible with #3. As such, this diff forces `ax-platform==0.1.18`, which will install `botorch>=0.3.2`, which accepts `gpytorch>=1.2`.
5. Nonetheless, despite #4, sometimes `pip` would still install `botorch==0.5.1`, as it technically satisfies the `ax-platform==0.1.18` requirement of `botorch>=0.3.2`. As such, this diff explicitly adds `botorch==0.3.2` as a dependency and places it before the `ax-platform` installation, ensuring that the correct version of `botorch` is installed, thus allowing the dependencies to resolve.

Reviewed By: michaelbrundage

Differential Revision: D31044893

fbshipit-source-id: 9152fe04da199dd0061472ea60b302ed3945238f
albeus pushed a commit to EMBL-EBI-TSI/cpa-bioexcel-cwl that referenced this issue Sep 21, 2021
Toil setup is incompatible with recent Python versions. Moreover, pip
is currently suffering from an issue in its dependency resolver.

Errors fixed:

* error in rdflib-jsonld setup command: use_2to3 is invalid
  See: https://stackoverflow.com/questions/69100275/error-while-downloading-the-requirements-using-pip-install-setup-command-use-2
* pip stalling in resolving toils dependencies
  See: pypa/pip#9215

Changes:

* Fixed setuptools and other packages' versions (using Pipfile) to overcome
  the "use_2to3 error".
* Removed the toil role (it has one task only). Now toil is installed
  together with cwl using a common pipenv configuration.
gnaponie added a commit to gnaponie/freshmaker that referenced this issue Oct 8, 2021
This is the preferred way. Additionally, it helps prevent this pip
issue: pypa/pip#9215

To use pip-compile, pip-tools is required:
https://github.com/jazzband/pip-tools/

Versions will be automatically updated by dependabot.

Signed-off-by: Giulia Naponiello <[email protected]>
gnaponie added a commit to gnaponie/freshmaker that referenced this issue Oct 11, 2021
This is the preferred way. Additionally, it helps prevent this pip
issue: pypa/pip#9215

To use pip-compile, pip-tools is required:
https://github.com/jazzband/pip-tools/

Versions will be automatically updated by dependabot.

Signed-off-by: Giulia Naponiello <[email protected]>
gnaponie added a commit to gnaponie/freshmaker that referenced this issue Oct 11, 2021
This is the preferred way. Additionally, it helps prevent this pip
issue: pypa/pip#9215

To use pip-compile, pip-tools is required:
https://github.com/jazzband/pip-tools/

Versions will be automatically updated by dependabot.

Signed-off-by: Giulia Naponiello <[email protected]>
@pradyunsg pradyunsg added C: dependency resolution About choosing which dependencies to install and removed C: new resolver labels Oct 12, 2021
gnaponie added a commit to redhat-exd-rebuilds/freshmaker that referenced this issue Oct 13, 2021
This is the preferred way. Additionally, it helps prevent this pip
issue: pypa/pip#9215

To use pip-compile, pip-tools is required:
https://github.com/jazzband/pip-tools/

Versions will be automatically updated by dependabot.

Signed-off-by: Giulia Naponiello <[email protected]>
@Gallaecio

Gallaecio commented Feb 4, 2022

I think #10481 might have fixed this issue, as suggested in #10201 (comment)

@pradyunsg
Member

Indeed! I think we just missed this one while triaging to close the rest. Thanks for the comment @Gallaecio!

See #10201 (comment) for guidance, if you're still seeing this issue -- notably, please file a new issue with clear reproduction steps so that we can investigate the reported behaviour.

odl-github pushed a commit to opendaylight/releng-builder that referenced this issue Feb 20, 2022
Add workaround to pin odltools to the latest.
Changes in the new dependency resolver in 21.3
are likely the cause of downloading multiple
versions of packages.

Issue-ID: LF-JIRA IT-23648
Ref: pypa/pip#9215
Change-Id: Ic40714d449d6c94ec9b8660afc6ad4135e778441
Signed-off-by: Anil Belur <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 7, 2022
@pradyunsg pradyunsg removed the state: needs eyes Needs a maintainer/triager to take a closer look label Dec 9, 2022