Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packages imported with pip/requirements infrastructure cause import issues #14

Closed
cpatrick opened this issue Sep 21, 2017 · 29 comments
Closed

Comments

@cpatrick
Copy link

For example, import google.auth blocks google.storage blocks google.bigquery, etc.

Here's a minimal reproduction: https://github.com/cpatrick/minimal_bazel_python_bug_repro.

Maybe I'm misunderstanding a feature or option I could be using?

@mattmoor
Copy link
Contributor

With zero context, I suspect this is caused by this issue, but can TAL to confirm after lunch.

Thank you for the repro!

@mattmoor mattmoor self-assigned this Sep 21, 2017
@mattmoor
Copy link
Contributor

Interestingly that does not seem like the same issue.

@duggelz the repro prints out sys.path, which seems to include the root of the bigquery whl's repository, so I can't understand why the binary isn't able to load it? Do you have any great insight here?

@mattmoor mattmoor assigned duggelz and unassigned mattmoor Sep 21, 2017
@cpatrick
Copy link
Author

cpatrick commented Sep 21, 2017

So, my initial guess is that since multiple packages contain a google root package and the approach taken is one directory per package added to sys.path, the import system finds the first one and doesn't search through the rest of sys.path.

A solution could be to combine all of the requirements-installed python packages into a single dist-packages-style folder, but that's a little bit scary...

I'm hoping y'all have better ideas. 😸

@mattmoor
Copy link
Contributor

@cpatrick Yeah, that would be trouble if that's what's going on.

If that's truly what's going on, yes we might be able to work around it in the pip rules, but it still represents a somewhat fundamental problem with how the imports=[] kwarg manifests (I assume it is the usage here that creates these sys.path entries)

Perhaps we need a better solution for declaring a library to be a part of the top-level namespace than imports=["."], and py_binary should handle it better than just stacking sys.path (perhaps it should blend these in the runfiles or something?).

Thinking out loud...

@cpatrick
Copy link
Author

Ah, thanks for the pointer and explanation. Yeah, I think that's what happens. the imports=["."] is a bit too much of a hammer.

Blending in runfiles as pip/setuptools does in site-packages feels like the option most consistent with the standard way of doing this. Looking at my virtualenv for the project that I'm bazel-ifying (or from the requirements in my repro-case) that's what you get--a single tree with google/auth, google/cloud/storage, google/cloud/bigquery, etc. all from different packages.

@duggelz
Copy link

duggelz commented Sep 21, 2017

The google package is complicated. Basically every package like google.cloud.foo, or google.protobuf, or google.tonsofotherstuff, defines its own "google" package, and the first one wins. So every package has to carefully construct its "google" package which does the right thing if it happens to be the "winner", which can easily change based on almost anything. This doesn't always happen. We don't have a central "owns the google package" authority unfortunately.

So, either 1) subpar doesn't support the right pkg_resources and pkgutil calls, which is completely possible, or 2) the "winning" google package happens to be one of the ones that doesn't correctly initialize itself in the maximally friendly way.

I'll look into it further.

@mattmoor
Copy link
Contributor

Here it is just py_binary, not even PAR :-/

@cpatrick
Copy link
Author

cpatrick commented Oct 3, 2017

Hi folks. Any update on this? Let me know if there's anything I can do to help reproduce or even contribute to fixing the issue.

@jpoehnelt
Copy link

Any workarounds for this?

@mattmoor
Copy link
Contributor

@damienmg Anyone from the Bazel/Python side we can include here?

@damienmg
Copy link

damienmg commented Oct 18, 2017 via email

@drigz
Copy link
Contributor

drigz commented Nov 1, 2017

The issue appears to be linked to the presence of empty __init__.py files inside the various google/ directories. These don't do the "right thing" described by @duggelz, which involves modifying __path__ so that the other directories are searched for subpackages.

By modifying the runfiles after the build (specifically, replacing those files with this one) I could get the google modules to import (although it still fails with No module named concurrent.futures, which I haven't looked into yet).

The empty __init__.py files are created by Bazel if your package doesn't have __init__.pys already. Maybe we need an optional flag for py_library that marks the library as a "namespace library" and uses puts the right thing into __init__.py? Or pip_import could create these files?

drigz added a commit to drigz/rules_python that referenced this issue Nov 3, 2017
This appears to fix bazelbuild#14, but I haven't tested extensively to see if it
breaks anything else.
drigz added a commit to drigz/minimal_bazel_python_bug_repro that referenced this issue Nov 3, 2017
- workaround bazelbuild/rules_python#14 in rules_python
- workaround bazelbuild/rules_python#30 by installing `futures`
- update deps to fix an incompatibility in google.cloud.exceptions
@drigz
Copy link
Contributor

drigz commented Nov 3, 2017

I was able to work around the issue in 5f78b4a, which generates __init__.py files with namespace support. However, I'm not sure this is the right path, and I haven't tested extensively yet.

The bit I like least is hardcoding that googleapis-common-protos' empty google/api/__init__.py should be overwritten.

drigz/minimal_bazel_python_bug_repro@9644328 shows the changes necessary to use this, if anyone else is interested in trying it out.

@duggelz
Copy link

duggelz commented Nov 3, 2017

My description above was actually somewhat incorrect, since the namespace packages are handled in a different way than I thought. I'm experimenting with a solution.

drigz added a commit to drigz/rules_python that referenced this issue Nov 6, 2017
This appears to fix bazelbuild#14, but I haven't tested extensively to see if it
breaks anything else.
@duggelz
Copy link

duggelz commented Nov 16, 2017

I haven't forgotten about this, but it's a little tricky.

@drigz
Copy link
Contributor

drigz commented Nov 20, 2017

No problem, I've been getting on OK with the workaround.

I'm curious; what approach are you trying? Will it link/copy the files into a single site-packages style tree?

hutchk added a commit to hutchk/rules_python that referenced this issue Nov 27, 2017
in order to address Issue bazelbuild#14
nikhaldi added a commit to nikhaldi/rules_python that referenced this issue Dec 13, 2017
Copied from hutchk@10d8f0f

This is a hack and should be superseded with whatever real solution
people come up with for bazelbuild#14
nikhaldi added a commit to nikhaldi/rules_python that referenced this issue Jan 9, 2018
Copied from hutchk/rules_python@10d8f0f

This is a hack and should be superseded with whatever real solution
people come up with for bazelbuild#14
@drigz
Copy link
Contributor

drigz commented Jan 15, 2018

@duggelz Would you like me to create a pull request with my workaround? Looks like a couple of users have cherry-picked it.

@mouadino
Copy link

mouadino commented Jan 28, 2018

I found this issue in the google group thread here https://groups.google.com/forum/#!topic/bazel-discuss/NMO6KPyPKh4 but isn't this related to this issue #55? (sorry didn't see this issue before I created mine) but there I describe what's the root cause and what we can do about it, I also have a workaround there too.

@duggelz
Copy link

duggelz commented Feb 16, 2018

This is indeed a dup of #55. However, your writeup in #55 is excellent.

drigz added a commit to drigz/rules_python that referenced this issue Apr 6, 2018
This appears to fix bazelbuild#14, but I haven't tested extensively to see if it
breaks anything else.
@brian-peloton
Copy link

What's the current status on this? Between all the activity here and on #55, I'm kind of lost. However, it does appear to still be a problem for some situations.

I managed to work around it with a combination of legacy_create_init = False and creating a separate __init__.py for the namespace package which manually assigns __path__ with hard-coded absolute paths. That's really ugly though.

My specific problem is with zope-interface, which doesn't have a zope/__init__.py file at all, which means pkgutil.extend_path doesn't even pick it up. I'm not quite sure how it's supposed to work with python2 actually...

@drigz
Copy link
Contributor

drigz commented Jun 20, 2018

@brian-peloton My (incomplete) understanding is that this will be resolved as part of Operation Purple Boa but a near-term solution is not planned. https://github.com/TriggerMail/rules_pyz offers an alternative set of rules that may work better for you than the current rules_python.

#55 has more information (especially on the importance of .pth files for zope) but was closed after the introduction of legacy_create_init.

@lberki
Copy link
Collaborator

lberki commented Jun 24, 2018

Adding @brandjon, the new proud owner of our Python support.

@ali5h
Copy link

ali5h commented Jan 10, 2019

using backports.functools_lru_cache seems to cause the same issue. it uses pkgutil.extend_path

drigz added a commit to drigz/rules_python that referenced this issue Feb 21, 2019
This appears to fix bazelbuild#14, but I haven't tested extensively to see if it
breaks anything else.
@rodrigoaliste
Copy link

rodrigoaliste commented Feb 24, 2019

So, just to be clear, there's no current support for using protobuf and google-cloud packages with these rules?

@totorovirus
Copy link

@rodrigoaliste seems so

drigz added a commit to drigz/rules_python that referenced this issue Nov 19, 2019
This appears to fix bazelbuild#14, but I haven't tested extensively to see if it
breaks anything else.
alexeagle pushed a commit to alexeagle/rules_python that referenced this issue Aug 19, 2020
@alexeagle
Copy link
Collaborator

rules_python 0.1.0 has been released which upstreams the rules_python_external repo. Please switch from pip_import to pip_install which doesn't have this issue.

@MushuEE
Copy link

MushuEE commented Oct 30, 2020

@alexeagle chances there is an example of that switch? I am new to bazel and am having issue solving this.

@alexeagle
Copy link
Collaborator

Yeah look at examples in this repo. Sorry no one has had time to write a migration guide. Maybe @AutomatedTester can help

@AutomatedTester
Copy link
Contributor

I can maybe have a look at docs next week but @MushuEE but SeleniumHQ/selenium@40689b3 is my commit that moved us to 0.1.0 rules to get you moving sooner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests