Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support the "Trusted Publishing" flow directly in twine #999

Open
di opened this issue Jun 7, 2023 · 21 comments
Open

Support the "Trusted Publishing" flow directly in twine #999

di opened this issue Jun 7, 2023 · 21 comments
Labels
feature request security Issue that is related to security features of twine or PyPI

Comments

@di
Copy link
Member

di commented Jun 7, 2023

Last month PyPI added support for "Trusted Publishing": https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/. This uses an credential exchange flow to swap OIDC credentials with a short-lived PyPI API token, removing the need to maintain long-lived credentials/secrets. (More technical details on the exchange are here: https://docs.pypi.org/trusted-publishers/internals/)

We were able to do launch this without modifying twine directly because we instead updated https://github.com/pypa/gh-action-pypi-publish to do the token exchange prior to invoking twine upload.

That said, we're actively working on expanding support to other providers for which we don't have "canonical" workflows (such as pypi/warehouse#13551, pypi/warehouse#13575, and pypi/warehouse#13888).

For these providers (as well as for GitHub users who prefer not to use the pypi-publish GitHub action for whatever reason) it would be preferable to have twine be able to support the OIDC/API token exchange directly.

This would ideally include:

  1. detecting whether an "ambient" OIDC token is available at the time twine upload is run (this can be handled by https://pypi.org/p/id, so twine doesn't need to understand how to do this for N different providers)
  2. verifying that the OIDC token is acceptable for use with PyPI (this just requires that the aud claim of the token is set to the audience of the index that twine is attempting to upload to)
  3. exchanging the OIDC token for a PyPI API token (this is essentially the same as what the pypi-publish workflow does here: https://github.com/pypa/gh-action-pypi-publish/blob/110f54a3871763056757c3e203635d4c5711439f/oidc-exchange.py)

The alternative to twine supporting this directly is that the end user has to perform a fairly manual token exchange themselves directly (https://docs.pypi.org/trusted-publishers/using-a-publisher/#the-manual-way) which would be pretty bad UX overall.

Opening this issue to make sure maintainers are on board with the general idea before any implementation is started, please let me know if you have any thoughts.

(cc @woodruffw)

@di di added feature request security Issue that is related to security features of twine or PyPI labels Jun 7, 2023
@dstufft
Copy link
Member

dstufft commented Jun 7, 2023

I wonder if it makes sense to make authentication pluggable in some way? I'm pretty sure that twine right now assumes that every repository is authenticated with a username/password over basic auth. Things like API Tokens from PyPI get shoehorned into this by using a dummy username, but it would be nice to support any auth flow that a repository might want to implement.

@sigmavirus24
Copy link
Member

Yes, right now a Repository expects username/password

password: Optional[str],
but we could easily refactor that, since no one has picked up on that being a publicly available API. Regardless, we can make an authentication parameter that accepts a pluggable interface. I haven't looked at the interfaces PyPI is expecting, but if this is intended to be contributed, then I'm not sure I or Brian need to care too much.

@woodruffw
Copy link
Member

Thanks for the ping! I agree with the approach that @di has outlined.

Things like API Tokens from PyPI get shoehorned into this by using a dummy username, but it would be nice to support any auth flow that a repository might want to implement.

I agree with this in principle, although that might be a premature optimization/generalization for this particular use case (since the trusted publishing flow ultimately produces a normal looking PyPI API token). But there may be other indices that I'm not aware of that would benefit from pluggable auth methods?

@dstufft
Copy link
Member

dstufft commented Jun 7, 2023

PyPI itself would benefit from not having to use the BasicAuth hack anymore :)

But more generally, most of the cloud providers don't use long lived secrets that are usable directly with twine, and they expect that you run some command to get a short lived token currently or they "fake" a pluggable auth method by making a keyring backend that doesn't actually do keyring stuff, but that does their auth flow.

AWS

$ aws codeartifact login --tool twine --domain my_domain --domain-owner 111122223333 --repository my_repo
$ twine upload -r codeartifact dist/*

Google

# Deps
$ pip install keyring
$ pip install keyrings.google-artifactregistry-auth

# Teach Keyring where your auth file is
export GOOGLE_APPLICATION_CREDENTIALS=KEY-FILE
# OR
gcloud auth activate-service-account --key-file=KEY-FILE

# Upload
twine upload -r $PYTHON-REPO-ID dist/*

Azure

$ pip install artifacts-keyring
$ twine upload --repository-url https://pkgs.dev.azure.com/<org_name>/_packaging/<feed_name>/pypi/upload dist/*

So I think in principle there's already a need for this (and honestly the same is true for pip), and currently people are working around the lack of pluggable authentication by treating the keyring library as that pluggable authentication.

Using the keyring library works, but it's kind of an abuse of the keyring API to do that. Keyrings are supposed to support get, set, and delete for passwords, but all these things (Trusted Publishing included) really care about is supporting get, and the other operations don't make sense and do nothing.

@webknjaz
Copy link
Member

I've been wondering if it makes sense to have vendor-specific OIDC in twine ever since it was implemented in the action… I think that this might not belong in twine because this is something specific to runtimes where twine is used, not twine itself. Implementing it here would introduce the burden of having to track ways of identifying if something is a GitHub or GitLab Runner, or some AWS env etc., and actively monitoring changes in those. Besides, twine calls are often wrapped via other command runners like tox. Tox screens the environment variables by default, making the auto-detection broken, putting the burden of knowing upfront that they must to pre-configure tox to pass specific environment variables that twine needs (and knowing what those are!) in order for it to work. I imagine other command wrappers may behave similarly.

Keeping it in a GitHub Action, though, eliminates said burdens (and it also allows us to push the best practices to the wider community). I know that for GitLab, it'd also make sense to make some reusable pipeline.

The idea with plugins is interesting, but why would one install such a plugin locally? They'd probably need to add their own env detection to install it conditionally, since it'd only be really needed @ GHA.

I don't have very strong opinions on this, but I wanted to post the above thoughts to see if anybody else thinks that maybe having this inside twine might be not what people really need…

@tschm
Copy link

tschm commented Aug 17, 2023

This may help:

https://github.com/marketplace/actions/pypi-token-mint

Gets the token and nothing else...

@abravalheri
Copy link

Keeping it in a GitHub Action, though, eliminates said burdens (and it also allows us to push the best practices to the wider community). I know that for GitLab, it'd also make sense to make some reusable pipeline.

How about people that don't use GitHub actions? Would it be possible for twine to integrate with other CI providers that offer OIDC capability?

@webknjaz
Copy link
Member

How about people that don't use GitHub actions? Would it be possible for twine to integrate with other CI providers that offer OIDC capability?

Each OIDC provider will have to be special-cased. So if an ecosystem has standard means of implementing this, I probably wouldn't drag that into twine.
OTOH, plugins seems like a reasonable middle ground.

@webknjaz
Copy link
Member

webknjaz commented Aug 23, 2023

This may help:

github.com/marketplace/actions/pypi-token-mint

Gets the token and nothing else...

The example there looks harmful security-wise — it puts the build step in a job with OIDC privileges. If you follow the security best practice and build in a separate job, you won't really need poetry for publishing. Hence, no need for the action as pypi-publish has that embedded.

@woodruffw
Copy link
Member

Would it be possible for twine to integrate with other CI providers that offer OIDC capability?

FWIW, this is the idea behind id (https://github.com/di/id) (which is what the current official publish action uses under the hood): it abstracts the OIDC token retrieval operation across any of its supported providers.

In principle, twine should be able to depend on id and do id.detect_credential(audience="pypi") to obtain an OIDC credential in a platform-agnostic manner.

@webknjaz
Copy link
Member

to obtain an OIDC credential in a platform-agnostic manner.

But the calling code would still have to special-case working with different platforms, right?

@woodruffw
Copy link
Member

But the calling code would still have to special-case working with different platforms, right?

Yes, good point: at the moment the token exchange needs to be done on a provider-specific endpoint (e.g. /_/oidc/github/mint-token/ for GitHub). It's possible we could change that in the future (and perform OIDC validation against all known IdPs rather than expecting a match on a specific pre-identified one), however.

@webknjaz
Copy link
Member

It's possible we could change that in the future (and perform OIDC validation against all known IdPs rather than expecting a match on a specific pre-identified one), however.

That does sound excessive, though.

@woodruffw
Copy link
Member

Yeah, potentially 🙂 -- I'll need to think about it some more, but doing so shouldn't be too heavyweight: all of these OIDC tokens are keyed on a keyid anyways, so fanning across multiple different potential verifiers should be on the same order of complexity as a single provider with multiple keys (which we already support).

Either way, that's all hypothetical 🙂

@abravalheri
Copy link

abravalheri commented Aug 23, 2023

I am trying to understand here what are the interoperability challenges (sorry, last time I used openID connect was ages ago in a very specific context). Any pointers on that? As a standard it should be able to offer some form of interoperability...

Is it a matter of parametrising the endpoints and/or asking for the user to provide the token (with the audience set to pypi) as input?1

I tried to have a look on the links in the announcement post (e.g. https://docs.pypi.org/trusted-publishers/) but they are either too generic or too Github-centric, so I am not sure I understand...

Footnotes

  1. I mean: is the interoperability problem only the "automatic detection"?

@abravalheri
Copy link

Thank you very much @webknjaz, after reading it seems to me that PyPI for the moment is not supporting generic "trusted publishers" with OIDC, but rather "one trusted publisher" which is GitHub. If that is the case, it does seem that concentrating in the GitHub actions is the best. (I might be wrong, but that is what seems to me, specially after going through https://docs.pypi.org/trusted-publishers/using-a-publisher/#the-manual-way).

@woodruffw
Copy link
Member

On top of what @webknjaz said: the primary interop problem with OIDC is that each IdP has a lot of control over the claims that it exposes, so PyPI is responsible for papering over the differences between them in a way that's coherent/useful for end users.

But yeah, GitHub is really the only currently supported IdP (support for others is currently in progress).

@di
Copy link
Member Author

di commented Aug 23, 2023

It's possible we could change that in the future (and perform OIDC validation against all known IdPs rather than expecting a match on a specific pre-identified one), however.

I think we could do that (likely based on the iss claim alone, have a provider-specific endpoint is unnecessary).

@webknjaz
Copy link
Member

@abravalheri here's some other OIDC integrations scheduled FYI: https://github.com/pypi/warehouse/issues?q=is%3Aopen+label%3Atrusted-publishing+sort%3Aupdated-desc.

coldfix added a commit to coldfix/udiskie that referenced this issue Nov 23, 2023
In order to support trusted publishing.

I believe twine doesn't support it yet, see:

pypa/twine#999
coldfix added a commit to coldfix/udiskie that referenced this issue Nov 23, 2023
In order to support trusted publishing.

I believe twine doesn't support it yet, see:

pypa/twine#999
@woodruffw
Copy link
Member

Just to post an update here: PyPI now supports 4 different trusted publisher providers (GitHub, GitLab, Google, and ActiveState), and the token minting endpoint is now fully generic (/_/oidc/github/mint-token/ is now /_/oidc/mint-token/). So this should be relatively easy to add to twine in a generic fashion 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request security Issue that is related to security features of twine or PyPI
Projects
None yet
Development

No branches or pull requests

7 participants