Pre-generate and serve simple index metadata #8487
Comments
For those that aren't as familiar with TUF, a few additional questions about this:
|
Yep! Thanks for the clarifying questions.
TUF is using BLAKE2 for the other target metadata (i.e., the actual distribution packages), so it probably makes sense to use it here as well.
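A minimal sketch of hashing a rendered index with BLAKE2, assuming a 32-byte digest (this is consistent with the 64-character hex figure quoted later in the thread, but the exact digest size is an assumption). The function name and the sample HTML are hypothetical.

```python
import hashlib


def blake2_256_hexdigest(content: bytes) -> str:
    """Return the hex-encoded BLAKE2b digest (32-byte output) of content."""
    # digest_size=32 is an assumption; hashlib's default for blake2b is 64 bytes.
    return hashlib.blake2b(content, digest_size=32).hexdigest()


# Hypothetical rendered simple index for an example project.
index_html = b"<!DOCTYPE html><html><body><a href='#'>example-1.0.tar.gz</a></body></html>"
digest = blake2_256_hexdigest(index_html)
print(len(digest))  # 64 hex characters for a 32-byte digest
```

Only the file contents go into the hash here, which matches the suggestion that "just the file itself should be sufficient".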
I don't believe so; I think just the file itself should be sufficient. @jku may be able to correct me here, if I'm missing something.
Just
This is probably the knottiest part. My first thought for this is that |
Yeah this all seems correct to me. The TUF metadata for a project index will look roughly like this:
A client that sees this metadata will download I'll mention that we can of course agree on a different path for index files if e.g. you don't want to pollute the "/simple/*" namespace with so many new items. If we do that the path should be relative to the pypi index url though (so that the path can be found on all warehouse instances without configuration). Something like https://pypi.org/simple/.project-indexes/ would be fine to me. |
Another option would be to do something like: That would make it easy for mirrors to keep all of the related files colocated, to enable deletion and cleanup work without having to track where those files are for a specific project. |
The reasoning is sound but the TUF client implementation currently expects the target name to include a filename that will then be prefixed with hash: this could of course be worked around but alternatively something like |
Yea those are fine with me. |
Something I did not think of when we last discussed this: it might be a good idea not to use the project name in the file name itself because of filename length limits, so I would suggest something like This is not a practical issue right now (a BLAKE2b hash is 64 bytes and the longest project name on PyPI seems to be 80 bytes, still far from the 255-byte limit), but avoiding the potential problem seems like a good idea if doing so is painless. |
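The length argument above can be checked with a little arithmetic. This is only a sketch: the `.`-separated naming and the `index.html` / `html` suffixes are hypothetical stand-ins, not the layout proposed in the thread.

```python
MAX_FILENAME_BYTES = 255   # limit quoted in the thread
HASH_HEX_LEN = 64          # hex length of the hash quoted in the thread
LONGEST_PROJECT_NAME = 80  # longest project name quoted in the thread


def filename_length(*parts: str, sep: str = ".") -> int:
    """Byte length of a filename built by joining parts with sep."""
    return len(sep.join(parts).encode("utf-8"))


digest = "0" * HASH_HEX_LEN
name = "a" * LONGEST_PROJECT_NAME

# Hypothetical layout that embeds the project name in the filename.
with_name = filename_length(digest, name, "html")
# Hypothetical layout that keeps the project name in the directory instead.
without_name = filename_length(digest, "index", "html")

print(with_name, without_name, MAX_FILENAME_BYTES)
```

Under these assumptions both layouts fit within the limit today; dropping the project name simply makes the filename length independent of any future growth in name length.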
Since we can't use The longest is 80 characters but I'm not sure where the practical limit for this comes from, if any: https://pypi.org/project/Aaaaaaaaaaaaaaaaaaa-aaaaaaaaa-aaaaaaasa-aaaaaaasa-aaaaasaa-aaaaaaasa-bbbbbbbbbbb/ |
The TUF client library by default assumes it is given a URL that ends in a filename: the client library then prefixes the filename with |
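To illustrate why a URL without a trailing filename is awkward here, the sketch below mimics hash-prefixed target naming in the style of TUF consistent snapshots, as I understand that convention. It is not the client library's code: the function name, the `.` separator, the 32-byte BLAKE2b digest, and the example paths are all assumptions.

```python
import hashlib
import posixpath


def hash_prefixed_path(target_path: str, content: bytes) -> str:
    """Prefix the final path component with the hex digest of content."""
    dirname, filename = posixpath.split(target_path)
    if not filename:
        # A path such as "simple/example/" has no filename to prefix.
        raise ValueError("target path must end in a filename")
    digest = hashlib.blake2b(content, digest_size=32).hexdigest()
    return posixpath.join(dirname, f"{digest}.{filename}")


# Hypothetical target path and content.
print(hash_prefixed_path("simple/example/index.html", b"<html></html>"))
```

A path ending in `/` raises `ValueError` in this sketch, which is the same mismatch the thread describes between route-style URLs and a client that expects a filename.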
I think I theoretically can workaround I think the reasonable options are:
I couldn't quite follow why 'index.html' was problematic so do let me know if the first option is not on the table: I'll have to start a discussion in TUF community in that case. |
Bumping the question about Alternatively, would something like |
It could be used but it doesn't make much sense as an endpoint within our routes -- there is no
As such, this is kind of a poor assumption, because virtually all of our routes don't have "filenames", including the ones in question here (unless you consider the last part of the path a filename). If we say that project names are constrained to a maximum of 80 characters, is there any reason why |
I totally agree (I can also understand how they ended up with that design -- the focus was on passive systems where the targets and metadata are pre-generated and then served by a dumb fileserver). I'm just pointing out that the URL must end with
Sure that works. |
That works for me as well! Thanks for the explanation, @di! |
What's the problem this feature will solve?
As part of the TUF rollout (#7488), we will need to store hashes for the simple indices that pip and other resolvers use. These indices are currently generated dynamically from a template when requested, making that difficult. Instead, they should be generated once per relevant event (file upload/release) and stored somewhere (probably GCS). Stale indices should not be deleted from the store, as the TUF metadata may still refer to them.
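A rough sketch of the proposed flow, assuming a content-addressed key per rendered index. `IndexStore`, its in-memory dictionaries, and the `project/digest` key layout are hypothetical stand-ins for whatever object store (for example GCS) and naming scheme is eventually chosen.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class IndexStore:
    """In-memory stand-in for an object store (hypothetical)."""

    blobs: dict[str, bytes] = field(default_factory=dict)
    latest: dict[str, str] = field(default_factory=dict)

    def publish(self, project: str, rendered_index: bytes) -> str:
        """Store a freshly rendered index and return its key.

        Older keys are never deleted, so metadata that still refers
        to a stale index can continue to resolve it.
        """
        digest = hashlib.blake2b(rendered_index, digest_size=32).hexdigest()
        key = f"{project}/{digest}"
        self.blobs.setdefault(key, rendered_index)
        self.latest[project] = key
        return key


store = IndexStore()
# Called once per relevant event (file upload or release), not per request.
old_key = store.publish("example", b"<html>example-1.0</html>")
new_key = store.publish("example", b"<html>example-1.0, example-1.1</html>")
print(old_key in store.blobs, store.latest["example"] == new_key)
```

Keying by content hash means that re-rendering an unchanged index is a no-op, and stale versions stay addressable without extra bookkeeping.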
cc @ewdurbin @dstufft