Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Use -P to enable safe path semantics instead of PYTHONSAFEPATH #2122

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

allsey87
Copy link

@allsey87 allsey87 commented Aug 15, 2024

Setting PYTHONSAFEPATH has an unintentional side effect of propagating to child processes/non-py_binary instances of the Python interpreter that rely on the non-safepath behavior. The best example of this is the Meson build system whose support is included in rules_foreign_cc which will start subprocesses and sub-interpreters for tools such as Cython and the Emscripten compiler.

This PR changes the bootstrap script to use -P instead which will not have the side effect of propagating to child processes. Since versions of Python lower than 3.11.x do not support this argument, a check is necessary to prevent older interpreters from choking on this argument.

This solves the XY problem raised in issue #2060.

Copy link
Collaborator

@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, checking sys.version in the bootstrap isn't correct: the Python used to run the bootstrap may not be the same Python used to run the user program.

What you'll need to is check the Python version in the rule code and modify the templates to have a list of interpreter args to use.

The gist of changes needed are:

Check the target Python version in the rule code:

In
python/private/common/py_executable_bazel.bzl#_create_executable, consult runtime_details.effective_runtime.interpreter_version_info. This will tell you what version Python is being built for.

Note that the value may be None, which means we don't know, at build time, what version is going to be used. I guess just fallback to using the environment variable in that case.

Modify the bootstrap template files:

  • python/private/stage1_bootstrap_template.sh: Add INTERPRETER_OPTS_CSV="%interpreter_opts_csv%" around L16; Add shell code to add the comma-separated values to the interpreter_args array down by L103. I think you want something like IFS=, read -ra foo <<$var (I just got that from the first stackoverflow hit).
  • python/private/python_bootstrap_template.txt: Add `_INTERPRETER_OPTS_CSV = "%interpreter_opts_csv%" around L25. Remember to update both the coverage and non-coverage code paths to insert the interpreter opts before the regular opts.
  • python/private/zip_main_template.py: Add `_INTERPRETER_OPTS_CSV = "%interpreter_opts_csv%" around L28. Down around L211 is where the full argv is built and the interpreter opts need to be inserted at.

The stage2 template doesn't need to be modified; it run by the target interpreter, so it's too late for it to try and affect opts/envvar for the interpreter itself.

Update the rule code to pass the interpreter options to template expansion. In py_executable_bazel.bzl:

  • _create_stage1_bootstrap: this handles expansion of stage1_bootstrap_template.sh and python_bootstrap_template.txt
  • _create_zip_main: this handles expansion of zip_main_template.py

In both cases, pass in a csv list of interpreter args.

I tried to push a basic test to your repo/branch to help, but git seems to have other ideas about where to push to. I pushed to my personal repo, instead; Here's the diff with my extra commit: main...rickeylev:rules_python:use_p_for_safepath

[Here's patch of just my commit](https://github.com/bazelbuild/rules_python/commit/6c083cd82c75e8f20af859e30cafea34cdc838b6.patch

HTH

@aignas
Copy link
Collaborator

aignas commented Aug 16, 2024 via email

@allsey87
Copy link
Author

How does your bazelrc look like? Check out our docs on the flags we support to enable the two stage bootstrap which is not yet the default.

On 16 August 2024 12:52:05 EEST, Michael Allwright @.> wrote: Sorry for what is perhaps a stupid question, but when using rules_python to set up a py_binary it seems that the file python/private/stage1_bootstrap_template.sh is never executed. I can literally delete its contents (as well as running bazel clean --expunge and nuking .cache/bazel) and rebuild and get the same result... Is this file perhaps not used on Linux? My modules.bazel file looks like this: starlark bazel_dep(name = "rules_python", version = "0.33.1") # use my locally checked out version of rules_python local_path_override( module_name = "rules_python", path = "rules_python" ) python = ***@***.***_python//python/extensions:python.bzl", "python") python.toolchain(python_version = "3.12.1", is_default = True) use_repo(python, python_3_12_1 = "python_3_12_1") -- Reply to this email directly or view it on GitHub: #2122 (comment) You are receiving this because your review was requested. Message ID: @.>

Yep I figured this out. I originally missed the is_script_bootstrap_enabled config setting. I have run out of time to work on this today and unfortunately will be abroad for a couple weeks. After which I will pick this up again and try to finish it off.

@allsey87 allsey87 closed this Aug 26, 2024
@allsey87 allsey87 force-pushed the use_p_for_safepath branch from f74ac9b to 2f46873 Compare August 26, 2024 11:14
@allsey87 allsey87 reopened this Aug 26, 2024
@allsey87
Copy link
Author

@rickeylev this PR is ready for review again. I mostly followed your instructions with two exceptions:

  1. I used a space-separated string for interpreter_opts since a comma-separated string would break things like --option=value1,value2
  2. I do not set PYTHONSAFEPATH if the interpreter version cannot be detected.

The reasoning for (2) is I believe it leads to less unexpected behavior. For example, it would be strange and hard to debug if changing the configuration of rules_python broke someone's build system due to PYTHONSAFEPATH being silently switched on. The counter argument is that not setting this environment variable may hide bugs in py_binary that try to access code from unsafe paths. However, the latter was already the case when PYTHONSAFEPATH was set and the Python version was less than 3.11, hence I feel that overall this solution has the least amount of surprising behavior.

@allsey87 allsey87 requested a review from rickeylev August 26, 2024 14:22
@aignas
Copy link
Collaborator

aignas commented Aug 26, 2024

Thanks for the update. Could you add a test to check that a python interpreter launched from within a python interpreter can still access to the same sitepackages. Right nowwe are only checking that we are not regressing, but this PR technically adds a new feature that allows starting new python interpreters from within a py_binary targets and those python processes will share the pythonpath with the parent.

#2076 did not allow this behaviour, if I understand it correctly.

As for reverting #2076, I would love to just have the same behaviour as regular python interpreter. So if reverting brings us closer, I'd be +1 to that, but right now as I understand, we don't need to do that.

@allsey87
Copy link
Author

@aignas it's a little bit unclear to me what this new test would actually be testing. Would you mind elaborating a bit on what you had in mind. For example, what would count as passing the test vs. failing the test?

To the best of my understanding, the site packages will be the same since (1) rules_python uses PYTHONPATH to set the site packages and (2) the child process/python interpreter will inherit the PYTHONPATH environment variable. I believe that safe-path mode should not impact this behavior.

I just checked the changes from #2076 and can confirm that they are already effectively reverted by this PR and that the test from #2076 is updated and used in this PR.

@aignas
Copy link
Collaborator

aignas commented Aug 31, 2024

Sorry for the delay in response. I thought that this PR would help with #2169, but thinking about it one more time, I am not sure. If you have time to experiment, feel free to do so in this PR, but we can look into this later.

@allsey87
Copy link
Author

allsey87 commented Sep 2, 2024

@aignas @rickeylev is it possible to get this reviewed and merged? As I understand, issue #2169 only applies to bootstrap_impl=script which is a new approach and which is not enabled by default (for now). Moreover, the propagation of PYTHONPATH to subprocesses is only tangentially related to this PR since using PYTHONSAFEPATH or -P has no impact on PYTHONPATH.

Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for umplementing this.

Whilst this technically makes some setups not propagate the safepath env var, which could result in different behaviour, that in general should only happen inside a safe-pathed interpreter, so this change in behaviour is fine.

since it is unintentionally inherited by subprocesses. Instead, `-P` is passed to the runtime
interpreter given its version can be detected and is greater than or equal to 3.11. If the
runtime interpreter can not be detected or the version is less than 3.11, safe-path mode is not
enabled.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be only true for bazel 7 and above because for bazel 6 the starlark implementation of the rules is not used. Maybe worth pointing this out in the changelog.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for bazel 6 the starlark implementation of the rules is not used

Just for clarification, do you mean that py_binary and py_library are using the inbuilt Bazel implementations in versions below Bazel 7?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, yes. It is possible to specify the rules_python template to be used in bazel 6, but I am not sure how feasible it is to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can revise the wording here after which I can push a commit:

  • (rules) In Bazel 7 and above, the PYTHONSAFEPATH environment variable is no longer used to enable safe-path mode since it was unintentionally inherited by subprocesses. Instead, -P is passed to the runtime interpreter given its version can be detected and is greater than or equal to 3.11. If the runtime interpreter can not be detected or the version is less than 3.11, safe-path mode is not enabled. Bazel 6 and below do not support the safe-path feature.

Do you think that is concise enough?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bazel 6 and bellow will continue using the embedded python_bootstrap_template.txt from the bazel repository and are unnafected by the change.

could be better wording.

That said I would like @rickeylev to correct me here if I am misremembering the state of bazel 6 and the python bootstrap template usage.

@allsey87
Copy link
Author

allsey87 commented Sep 5, 2024

@rickeylev is there anything blocking this that I can work on?

@aignas
Copy link
Collaborator

aignas commented Oct 14, 2024

I think this got lost in the stream of emails/notifications, a ping to come back to this. :)

@rickeylev
Copy link
Collaborator

I'm really torn about this. I do think using -P is the right thing, but switching to it is almost certainly going to break something. I think we'll just have to accept that, because I don't see a way to work around it. If we have to, we can add a global flag to also set the environ variable.

Ignoring that, two main issues I see with the PR:

(1) Bazel 6 support and python_bootstrap_template.txt -- I don't see how this could be working? I would expect CI to show a failure. In Bazel 6, it's going to use the builtin py_runtime object, but point to rules_python's python_bootstrap_template.txt. Since the builtin py_runtime isn't expanding %interpreter_opts%, that should result in e.g. python.exe %interpreter_opts% blabla, and thus fail.

(2) This PR changes behavior for the "we don't know the version" case, which is, unfortunately, the default situation for workspace builds. This is because workspace builds default to using @bazel_tools//tools/python:autodetecting_toolchain, which define a builtin py_runtime object that won't have any of the version information. I think what we should do is, if the runtime doesn't have the version, check the @rules_python//python/config_settings:python_version flag (ctx.attr._python_version_flag). I think it's reasonable to assume such a situation is the builtin py_runtime case (rules_python py_runtime will use that flag value). This will give users a way to cause the right logic to happen, if necessary.

@aignas
Copy link
Collaborator

aignas commented Dec 3, 2024

Is this still needed given the latest devolpments and fixing of #2409?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants