-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Allow custom bootstrap template in python_repository
#2032
base: main
Are you sure you want to change the base?
Conversation
This PR allows users to set custom `bootstrap_template` and `stage2_bootstrap_template` when registering python toolchains using `python_repository` and `python_register_toolchains`. Otherwise users need to reimplement the entire `python_repository` repo rule in order to pass different bootstrap templates to the underlying `py_runtime`.
Thanks for the PR. What is the usecase for overriding these templates? |
A couple of things:
|
Hm, I'm not sure how I feel about this feature. I'm not sure our repo rules having large APIs has really aged very well. This API would also need to be ported over to the bzlmod extension and have the appropriate is_root checks added to it. Going through the repo-rule also means the values have to be re-specified for each registration. Is that a good thing, or bad thing? IDK; answer doesn't seem obvious to me. An alternative would be to use a flag; you would set it in your bazelrc or command line.
Can you tell a bit more about your use case? This sounds like something py_binary et al should handle for you (e.g., if you depend on libfoo, the rules do the needful to ensure libfoo is on the ld search path).
Oh, wow, yeah, you'll probably need to be customizing something about the way a binary is run to do that. You might have a hard time, though; AFICT, Bazel doesn't have any support for a persistent test process. It looks like there used to be experimental support for it, but it was removed a couple years ago due to being buggy. Good luck :) |
Thanks for the detailed response! Setting bootstrap script through flags will work for us too. I am not super clear about how depending on cc_library would work. Our python tests depend on many native libraries at runtime (libtorch, libcudnn, libcudart, etc). Some of these so files are loaded through in-house repo rules that downloads and unpacks Debian packages; some other so files are loaded via custom python wheel builder rules (based on rules_pycross). All of these currently are passed as The test daemon service would behave just like normal python interpreter, except it delegates computation to another shared process. Afaik the biggest trouble with bazel's persistent test worker is related to test state pollution. But since we are doing batch already we are optimistic about this issue. |
BTW, I noticed an issue with stage2_bootstrap_template at
Because the extra path entries are appended, it seems impossible to overwrite packages provided by python itself, for example, a newer version of pip. Also, it also seems that python would honor the user packages under |
This is mostly WAI, for two reasons: The first is it better matches the sys.path of a regular python invocation. The order of sys.path is: PWD, PYTHONPATH, stdlib, site-packages. Things like pypi dependencies are, conceptually, in the position of site-packages. The second is it avoids issues with other directories masking stdlib directories, which results in errors that are very confusing. The main cause of this is repo directories being added to sys.path (which is still the unfortunate default behavior of rules_python)
I don't think pip is actually part of the stdlib? But it is part of site-packages. I think what's happening is, because all the pypi deps are appended, they end up after site-packages. Which, yeah, that would mean a pypi dependency on a newer pip wouldn't be respected. Hm, yeah, maybe we should change that ordering, so long as we keep them after the stdlib entry. I filed #2064 about this |
There is an extra complication - if I remember correctly, pip is included as part of the hermetic toolchain and the behaviour here could be related to that. It is only included on unix platforms, if I remember correctly.
Maybe we should remove it from the toolchain? I think remember a few people maybe mentioning that they patch the toolchains to remove pip.
…On 15 July 2024 10:51:49 GMT+09:00, Richard Levasseur ***@***.***> wrote:
> Because the extra path entries are appended, it seems impossible to overwrite packages provided by python itself
This is mostly WAI, for two reasons:
The first is it better matches the sys.path of a regular python invocation. The order of sys.path is: PWD, PYTHONPATH, stdlib, site-packages. Things like pypi dependencies are, conceptually, in the position of site-packages.
The second is it avoids issues with other directories masking stdlib directories, which results in errors that are very confusing. The main cause of this is repo directories being added to sys.path (which is still the unfortunate default behavior of rules_python)
> overriding pip
I don't think pip is actually part of the stdlib? But it is part of site-packages. I think what's happening is, because all the pypi deps are appended, they end up after site-packages. Which, yeah, that would mean a pypi dependency on a newer pip wouldn't be respected.
Hm, yeah, maybe we should change that ordering, so long as we keep them after the stdlib entry. I filed #2064 about this
--
Reply to this email directly or view it on GitHub:
#2032 (comment)
You are receiving this because your review was requested.
Message ID: ***@***.***>
|
Thanks for explaining. Yeah, using cc_library with a prebuilt shared library is part of the solution (this gets meta-data about the library into Bazel). The other half is generating linker paths (rpath, ld_library_path) that point to where those shared libraries end up in runfiles. Making that work is a bit of a ways out, though |
Before, users need to reimplement the entire
python_repository
repo rule in order to pass different bootstrap templates to the underlyingpy_runtime
.After, users can simply pass custom templates to
bootstrap_template
andstage2_bootstrap_template
when registering python toolchains usingpython_repository
orpython_register_toolchains
.