Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some pip dependencies optional requirements are now hard in devendored pip #8916

Closed
FFY00 opened this issue Sep 24, 2020 · 8 comments
Closed

Comments

@FFY00
Copy link
Member

FFY00 commented Sep 24, 2020

Environment

  • pip version: latest
  • Python version: 3.8
  • OS: Arch Linux

Description
Seems like #8391 broke devendored pip by making some optional dependencies of pip dependencies hard.

pip seems to be trying to import all modules from the dependencies

if DEBUNDLED:

Which includes modules that would only get imported if you were using an optional feature. Previously, pip would ignore import errors, but #8391 changed that.

I am not sure what would be the best fix here, adjust the module pip imports or revert/add validation to #8391.
It would also be really good to have a test for this, to prevent it from happening again. Actually, having proper integration tests to verify that pip is working when devendored would be great.

This has been blocking pip from being updated in Arch Linux. Although no issue was opened, @felixonmars did point it out in #5354, but it apparently fell on deaf ears.

I have advised for #8391 to be reverted in Arch Linux's pip to allow us to update but I would like a proper solution from the upstream.

Expected behavior
This should not happen 😛
No optional dependencies for pip dependencies should be required when pip is devendored.

How to Reproduce

  1. Install a devendored pip
  2. Run it without python-ntlm

Output

As pointed out in #5354 (comment)

Traceback (most recent call last):
  File "pip_sphinxext.py", line 11, in <module>
    from pip._internal.cli import cmdoptions
  File "/build/python-pip/src/pip-20.2/src/pip/_internal/cli/cmdoptions.py", line 23, in <module>
    from pip._internal.cli.progress_bars import BAR_TYPES
  File "/build/python-pip/src/pip-20.2/src/pip/_internal/cli/progress_bars.py", line 7, in <module>
    from pip._vendor import six
  File "/build/python-pip/src/pip-20.2/src/pip/_vendor/__init__.py", line 83, in <module>
    vendored("requests.packages.urllib3.contrib.ntlmpool")
  File "/build/python-pip/src/pip-20.2/src/pip/_vendor/__init__.py", line 33, in vendored
    __import__(modulename, globals(), locals(), level=0)
  File "/usr/lib/python3.8/site-packages/urllib3/contrib/ntlmpool.py", line 9, in <module>
    from ntlm import ntlm
ModuleNotFoundError: No module named 'ntlm'
@xavfernandez
Copy link
Member

I guess, one could add an optional argument to `vendored() function.

Something along the lines of:

diff --git a/src/pip/_vendor/__init__.py b/src/pip/_vendor/__init__.py
index 581db54c8..17053473f 100644
--- a/src/pip/_vendor/__init__.py
+++ b/src/pip/_vendor/__init__.py
@@ -26,17 +26,18 @@ WHEEL_DIR = os.path.abspath(os.path.dirname(__file__))
 # Define a small helper function to alias our vendored modules to the real ones
 # if the vendored ones do not exist. This idea of this was taken from
 # https://github.com/kennethreitz/requests/pull/2567.
-def vendored(modulename):
+def vendored(modulename, optional=False):
     vendored_name = "{0}.{1}".format(__name__, modulename)
 
     try:
         __import__(modulename, globals(), locals(), level=0)
     except ImportError:
-        # This error used to be silenced in earlier variants of this file, to instead
-        # raise the error when pip actually tries to use the missing module. 
-        # Based on inputs in #5354, this was changed to explicitly raise the error.
-        # Re-raising the exception without modifying it is an intentional choice. 
-        raise
+        if not optional:
+            # This error used to be silenced in earlier variants of this file, to instead
+            # raise the error when pip actually tries to use the missing module. 
+            # Based on inputs in #5354, this was changed to explicitly raise the error.
+            # Re-raising the exception without modifying it is an intentional choice. 
+            raise
     else:
         sys.modules[vendored_name] = sys.modules[modulename]
         base, head = vendored_name.rsplit(".", 1)
@@ -81,7 +82,7 @@ if DEBUNDLED:
     vendored("requests.packages.urllib3.connection")
     vendored("requests.packages.urllib3.connectionpool")
     vendored("requests.packages.urllib3.contrib")
-    vendored("requests.packages.urllib3.contrib.ntlmpool")
+    vendored("requests.packages.urllib3.contrib.ntlmpool", optional=True)
     vendored("requests.packages.urllib3.contrib.pyopenssl")
     vendored("requests.packages.urllib3.exceptions")
     vendored("requests.packages.urllib3.fields")

Or maybe simply removing vendored("requests.packages.urllib3.contrib.ntlmpool") line since pip does not use it.

@FFY00
Copy link
Member Author

FFY00 commented Sep 25, 2020

That is fine, but we should have tests to make sure it doesn't happen again. If we go with this approach, writing the tests might be complicated.

Consider the following use case:

pip depends on A and B
A 1.0 depends on C
B optionally depends on C (like the ntlm example)

If we do a simple test to see if pip works when devendored, it will pass, because A brings C.
Now let's say A 1.1 drops the C dependency, when we update it, devendored pip will break.

So, if we go with that approach, to have silent errors on cherry picked modules or simply not importing them, we need to test pip with only one devendored library at a time.

The problem here is that due this weird import management, imports are error prone and can easily get out of sync. AFAIK, requests.packages.urllib3.contrib.ntlmpool should have never been imported, it wasn't used anywhere, now or in the past.

This could be battled by having a CI check calculating the required imports. https://github.com/Instagram/LibCST could be of help here.

Or, my preferred approach, simply revert #8391. AFAIK no official PyPA pip distribution is devendored, the option is there for users like Linux distros, who are more than capable of doing the dependency management themselves.

Seems like the use case presented in #5354 is that the errors shown when someone tries to use an external devendored pip in a Python environment without the required dependencies can be a bit obscure. The approach taken here of just throwing the errors on the import is reasonable, but as pointed out, they break devendored pip. A better, although not perfect, approach would be to for eg. override sys.excepthook with a custom function that would say something like "Oh, there was an exception, there was also another exception earlier, we couldn't import a module, show import traceback but it was suppressed, that might be the cause of it :)"

cc @pradyunsg @McSinyx @deveshks

@pradyunsg
Copy link
Member

Reverting the said PR works for me.

I'm a strong -1 on having any tests for the devendored pip. It makes things even more complicated for us and expands our test matrix with basically no gain to us since we don't even support it and recommend that downstream repackagers keep things in a vendored state (see Fedora). Our tests already take over an hour locally and ~30 minutes on CI despite massive parallelization, and I'm not interested in any discussion that expands our matrix.

I'm not saying we'll go out of our way to break devendored pip, but the current state of "it's a small additional semi-supported module, to help keep downstream patches smaller" is one I'm interested in maintaining.

If there's some way to make this module work better for downstream Linux distros, I'm happy to hear about it. I don't think anyone else really cares about this module anyway. ;)

@FFY00
Copy link
Member Author

FFY00 commented Sep 25, 2020

I'm a strong -1 on having any tests for the devendored pip. It makes things even more complicated for us and expands our test matrix with basically no gain to us since we don't even support it and recommend that downstream repackagers keep things in a vendored state (see Fedora). Our tests already take over an hour locally and ~30 minutes on CI despite massive parallelization, and I'm not interested in any discussion that expands our matrix.

That can be mitigated. I would propose that the tests are only run when the vendoring machinery is touched or when the pip version is changed, which indicates a new release is coming.

The problem here is that you are in fact taking patches for the vendoring logic, and that has blocked 20.2, 20.2.1, 20.2.2 and 20.2.3 from ever hitting the Arch Linux repos. I understand that this might be a bit of extra work, but please consider the impact on end users, whom mostly get pip via distribution packages.

I don't think I am being unreasonable here with my proposal. Devendored pip is not a priority, and is provided as best-effort but I don't think that should block the addition of tests. I do agree that these tests shouldn't take much impact on pip's CI, and that's why I propose being smart about when you run them.

With that said, I do, however, wish that devendored pip was properly officially supported. As I said, most people consume pip via the distribution packages. If users aren't complaining to you, it's because they are complaining to us 🙃. But I understand that takes maintainer time. I could offer to help out, but I am not sure how much that is worth.

@pfmoore
Copy link
Member

pfmoore commented Sep 26, 2020

I 100% agree with what @pradyunsg said here.

The problem here is that you are in fact taking patches for the vendoring logic

I could interpret that as saying that you don't think we should take patches for the devendoring if we don't commit to testing them. That doesn't seem productive, so I'm not quite sure what you are suggesting is the problem here (unless you're just repeating the point that you'd prefer it if pip tested the devendoring).

Is there a reason why distributions can't do their own testing of the devendoring code on pip master? My immediate thought would be that having the test suite run by (for example) Fedora, on a Fedora Linux system, using a version of pip devendored using their devendoring scripts, would be a much better way of catching problems than us running some generic tests on whatever version of Ubuntu Github actions provides. I imagine it's mainly just a question of resources for the distributions, much the same as it is for pip.

@FFY00
Copy link
Member Author

FFY00 commented Sep 26, 2020

I could interpret that as saying that you don't think we should take patches for the devendoring if we don't commit to testing them. That doesn't seem productive, so I'm not quite sure what you are suggesting is the problem here (unless you're just repeating the point that you'd prefer it if pip tested the devendoring).

Yes, I find it a bit problematic that you take patches for how pip behaves when devendored but never actually test if it still works. To be fair, you did ask Fedora, but they don't even use this. @felixonmars did point out in the issue that the patch broke devendored pip but no one paid attention.

What I am saying is that I do not see how adding tests that could be disabled by default is a bad thing. Exactly what do you have to lose? I think the impact on the maintainer workload would be minimal, and I have already offered to help out if needed.

Is there a reason why distributions can't do their own testing of the devendoring code on pip master? My immediate thought would be that having the test suite run by (for example) Fedora, on a Fedora Linux system, using a version of pip devendored using their devendoring scripts, would be a much better way of catching problems than us running some generic tests on whatever version of Ubuntu Github actions provides. I imagine it's mainly just a question of resources for the distributions, much the same as it is for pip.

Why does the distribution used for the tests matter? You just need to get the dependencies into a virtual environment and install the devendored pip there.

There is no magic script that devendor pip, this is literally all that's needed:
https://github.com/archlinux/svntogit-packages/blob/4b34dd54233fbcf43ae39280ac38e2309ab7cfc8/trunk/PKGBUILD#L28

But anyway, I feel like I am being too pushy already. There is nothing I can do if you just don't want any tests in this repo. Thank you for your work and sorry for bringing this up again, I will refrain from doing it in the future ☹️

@FFY00 FFY00 closed this as completed Sep 26, 2020
FFY00 added a commit to FFY00/pip that referenced this issue Sep 26, 2020
…low-fix"

This reverts commit 7a60395, reversing
changes made to d3ce025.

It fixes devendored pip. See pypa#8916.

Signed-off-by: Filipe Laíns <[email protected]>
@pradyunsg
Copy link
Member

@felixonmars did point out in the issue that the patch broke devendored pip but no one paid attention.

Since you're bringing this up multiple times, yes, and it's super easy to miss a single notification from a merged PR. For context, I have over 75 notifications from pip/PyPA in the past 3 days.

For me personally, if it's closed/merged on pip's repository, and doesn't mention me, I do not look at it. Honestly, that's more because GitHub Notifications isn't great rather than anything else tho.

@FFY00
Copy link
Member Author

FFY00 commented Sep 27, 2020

Bad choice of words, I apologise. What I was trying to say here is that even with us pointing out that the change broke things, no action was taken, the issue was not resolved. This is not an attack on the maintainers, but rather it is pointing out the limitations, which is exactly why I think it is especially important to have tests in open source projects. Maintainer time is not guaranteed and may vary greatly on both long and short term. I think it is much better to be preventive when you have no guarantee you will have time to deal with the fallout. Anywayy, I said I was gonna drop it.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants