Add support for license expressions #707
Do we actually need to have the validation/canonicalization done unconditionally? Creating a cyclic dependency between flit_core and packaging would really suck for Gentoo, and unless the non-canonicalized metadata is completely invalid and breaks everything, we'd rather see fewer moving parts for an end user who just needs to install the package locally, rather than publish it on PyPI. |
From the PEP:
Hatchling uses |
The difference is that |
Also, I dare say that "build and publishing tools" are frontends, not backends. Though admittedly, it's not the first time where people are adding frontend work to backends, or pushing maintainer's work onto end users. |
Yeah, but in that case you could probably special case |
Thanks for looking into this! Flit does have a role in bootstrapping the packaging ecosystem, and as such it's useful to avoid dependencies, even vendored ones (since Linux distros in particular prefer to unvendor such things). We made an exception for tomli, since using TOML was standardised before it was in the stdlib, but I'd rather not make a pattern of this. At the same time, I do think it makes sense to validate this before producing a package. It's frustrating in some ways that we use the same pathways to make packages for publication as for immediate installation, but that's where we are, and there are upsides to this as well. I'd like to consider doing an independent implementation of SPDX validation & case normalisation. Besides preserving the zero-dependency status, this also means the ecosystem has two versions used in earnest to compare, rather than everyone doing the same thing. In a similar vein, Flit implements its own parsing of the PEP 621 It looks like this is tractable for license expressions - at a glance, packaging has done it in about 150 lines, not counting the data of license names. @mgorny I take it you'd be happy with that? @cdce8p are you interested in working on that, or shall I have a go? |
I understand your argument. Just skeptical a custom implementation wouldn't look quite similar to the existing one in packaging. The main advantage IMO of using the vendored version would be that they likely thought of any special case and if there is a bug, it's fixed upstream for everyone.
I'm certainly interested in moving forward on the PEP 639 support. (Not sure I would be the best to write the custom validation though.) Maybe an option would be to defer the validation for a moment and circle back to it when all other things are in place, i.e. before moving to metadata version 2.4? I removed the vendored packaging dependency here and replaced it with If not, what path would you prefer? |
Yeah, that would work for me. Thanks! |
As I see it, PEP 639 is metadata 2.4 - I don't see any reason to separate adding support for license expressions and emitting metadata version 2.4. It looks like validation in the spec is only a SHOULD, not MUST, for build tools. So we could postpone that for now and just accept whatever people give us, leaving validation to PyPI on upload. I don't really like this idea, though, because if we add validation at some point in the future, previously accepted input could suddenly be rejected. However, Flit has also always been a tool that aims to cover the 90% of simple, common use cases, not 100% of everything people might want. And I strongly suspect that 99% of the projects people use Flit for are released under a single license, with no need for In this case, unlike others where Flit has made simplifying assumptions, I expect that we'll add support for the full license expression spec in the future - possibly quite soon. But breaking it up like this seems like it could be a useful way to move forward in smaller steps. When we add compound expressions, more possible inputs will be allowed rather than fewer. |
It's technically possible to backport the PEP 639 data to version I do agree though, that the goal should be to move to 2.4 rather quickly. Splitting it up into #705 and this one just made the most sense to me. The required changes for
I was just proposing to defer it from this PR specifically to make the review easier. Once it is merged, I can take a look at an MVP for the validation part.
That might work if we bail out once we encounter any special characters. Could be a good first step. |
Thanks! If you're happy with that idea, let's go with validating a single license for this PR. I think we can include the optional I'd probably bump the metadata version to 2.4 before releasing, but I like the idea of ensuring that the emitted metadata is always valid, even before a release. 👍 |
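The "single license only" step agreed on above could be sketched roughly as follows. This is a minimal illustration, not Flit's actual code: the function name `normalise_single_license` and the four-entry `KNOWN_LICENSES` sample are hypothetical, and a real implementation would carry the full SPDX list. Anything containing characters outside a plain identifier (spaces, parentheses, `+`) is rejected, which is how compound expressions are kept out for now.

```python
import re

# Hypothetical, deliberately tiny sample; the real SPDX list is much longer.
KNOWN_LICENSES = ["MIT", "BSD-3-Clause", "Apache-2.0", "GPL-3.0-or-later"]

# Map lower-cased identifiers to their canonical spelling for case normalisation.
_BY_LOWER = {name.lower(): name for name in KNOWN_LICENSES}

# A single identifier: letters, digits, dots and hyphens only.
_SIMPLE_ID = re.compile(r"[A-Za-z0-9.\-]+")


def normalise_single_license(value: str) -> str:
    """Return the canonical spelling of one license identifier.

    Raises ValueError for compound expressions and unknown identifiers.
    """
    value = value.strip()
    if not _SIMPLE_ID.fullmatch(value):
        # Spaces, parentheses, '+' etc. mean this is not a single identifier.
        raise ValueError(f"only a single license identifier is supported: {value!r}")
    try:
        return _BY_LOWER[value.lower()]
    except KeyError:
        raise ValueError(f"unrecognised license identifier: {value!r}") from None
```

With this shape, later support for compound expressions only widens the set of accepted inputs, which matches the concern raised earlier about not rejecting previously accepted values.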
Not sure that makes much sense tbh. Sure I could add a single check for say
Getting to |
Sorry, I meant an expression specifying one license, as discussed, not literally just one option. I do expect that to mean including the list of valid licenses in some form. If you don't want to do this, though, I'll try to get to it. |
I don't think having a hardcoded list of valid licenses is a good idea. While new licenses aren't added often, this would imply that you'll need to keep updating the list and making new releases whenever that happens, and people will have to |
Makes sense now. I'd still suggest doing that in a follow-up though. That would also provide a good opportunity to discuss different approaches. The PR here is basically ready on its own. Will start working on it this week. If you have some spare time, I'd appreciate it if you could do a first pass through the change here and in #705 so we can get the fundamentals right.
SPDX identifiers are only ever deprecated and never removed. Thus the only thing missing would be support for new ones. As you've said yourself though, hardcoding the list is the current approach taken by most tools. Downloading it on demand might work but is also prone to errors so I'd recommend against it. |
The hardcoded list doesn't seem like a major issue, to be honest. The vast majority of open source projects use a few familiar licenses which will have been on the list for years. We should make sure to have an 'escape hatch' for that one project blazing the trail for a new not-yet-included license, but I don't think it's going to be needed very often. |
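One possible shape for such an escape hatch is sketched below. The thread does not say which mechanism Flit would use; this sketch assumes the `LicenseRef-` custom identifier form that SPDX and PEP 639 provide, and the names `is_custom_license_ref` and `accept_license` are hypothetical. For simplicity the prefix is matched case-sensitively here, which is an assumption rather than a statement about the spec.

```python
import re

# Assumed escape hatch: custom identifiers of the form "LicenseRef-<idstring>".
_LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def is_custom_license_ref(value: str) -> bool:
    """True if value looks like a custom 'LicenseRef-...' identifier."""
    return _LICENSE_REF.fullmatch(value) is not None


def accept_license(value: str, known: set) -> bool:
    """Accept identifiers on the hardcoded list, or custom LicenseRef- ones."""
    return value in known or is_custom_license_ref(value)
```

A project using a license that is not yet on the hardcoded list could then declare something like `LicenseRef-My-New-License` instead of waiting for a new release of the build backend.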
https://peps.python.org/pep-0639/#add-string-value-to-license-key
https://peps.python.org/pep-0639/#add-license-expression-field
Partially vendor packaging 24.2 to include the license validation added in pypa/packaging#828.
Until Metadata version 2.4 is supported, the license expression will be backfilled to the License field. Hatch does the same atm.
Work on #692
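The backfilling described above could look roughly like this. It is a sketch only: `license_header_lines` is a hypothetical helper, not a function from Flit or Hatch, and it assumes the metadata version is given as a "major.minor" string. The `License-Expression` field name comes from PEP 639.

```python
def license_header_lines(expression: str, metadata_version: str) -> list:
    """Return the core metadata line(s) carrying the license expression.

    From metadata version 2.4 on, PEP 639's License-Expression field is used;
    for older versions the expression is backfilled into the License field.
    """
    major, minor = (int(part) for part in metadata_version.split("."))
    if (major, minor) >= (2, 4):
        return [f"License-Expression: {expression}"]
    return [f"License: {expression}"]
```

For example, `license_header_lines("MIT", "2.3")` yields a `License: MIT` line, while the same call with `"2.4"` yields `License-Expression: MIT`.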