-
Notifications
You must be signed in to change notification settings - Fork 990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TUF Initialization using python-tuf 2.0.0 #10870
TUF Initialization using python-tuf 2.0.0 #10870
Conversation
I want to thank @lukpueh for sharing his TUF expertise and helping me review and add improvements to this draft PR last few days. The idea of tagging you all here is to say thanks ❤️, and the second intention is to ask for help reviewing this draft PR 😎. I will continue working on the tests for this PR. |
Summarizing some architectural discussions I had with @kairoaraujo on slack: The current design implements Warehouse-specific TUF app code in The intention is to hide TUF metadata handling from Warehouse. And, in the future when better TUF repository tooling is available, replace the repo abstraction by such tooling (see theupdateframework/python-tuf#1136). But it seems hard to draw a clear line between application and repository responsibilities. Some pain points:
The added complexity of the repository abstraction together with an unclear distribution of responsibilities makes it harder to review the correctness of the implementation in terms of PEP and TUF spec compliance. Maybe it would be easier to remove the abstraction and implement all in app code? The two relevant classes All that said, we decided that we will stick to the current design, and wait for feedback from upstream. (cc @jku, who has spent a lot of thoughts on TUF repository abstractions.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-level design seems to be in decent shape and headed in the right direction.
warehouse/tuf/repository.py
Outdated
keyids=[key["keyid"] for key in role_parameter.keys], | ||
threshold=role_parameter.threshold, | ||
terminating=None, | ||
paths=role_parameter.paths, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to pass in copies of the paths
and path_hash_prefixes
lists because they are mutable in both the original RolesPayload
and the new DelegatedRole
objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not seem necessary. RolesPayload
is only used as a transport container to move the data from the service to the repository. But I agree that it's a lot of references being passed around with potential for unwanted side-effects. That's why I've been advocating for merging service and repository (see #10870 (comment))
warehouse/tuf/repository.py
Outdated
key_rolename = key_rolename | ||
else: | ||
key_rolename = rolename | ||
role_metadata.signed.expires = role_expires |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a check that role_expires
is actually in the future? If so, this function and others that set a role's metadata expiration date should be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably does not hurt, other than adding a bit of cruft. The expiration date is determined by a per-role configured interval and a helper function, both controlled by us. Maybe instead of checking here we could add tests that the configured value is greater 0 and the helper indeed adds the value to now?
2bc6c23
to
f300b2f
Compare
As I added all the tests, I moved this PR from Draft. |
4d10b79
to
9eebff6
Compare
9eebff6
to
a8607be
Compare
fea707d
to
41e90a5
Compare
41e90a5
to
cbd1e41
Compare
An alternative implementation for this PR is available in kairoaraujo#1, following the design proposal outlined in #10870 (comment) above. The goal is to reduce complexity to facilitate general code review and review of the implementation correctness with regard to PEP458. I suggest we discuss the alternative implementation internally first (cc @kairoaraujo, @jku, @joshuagl, @mnm678) and update this PR here thereafter. That said, comments from the pypa community and general public are welcome at any time! |
I moved back to draft to implement tests. |
eccef82
to
3649de9
Compare
3649de9
to
815ab11
Compare
Could someone trigger the CI with that button below? |
@kairoaraujo would you please help fix the |
A gentle nudge/reminder that the dependency declarations in this PR are inconsistent, which is why the CI is failing. |
@pradyunsg, I see that dependencies errors are not related directly to my changes, but to the google-cloud-bigquery and protobuf. |
We're fixing this in #12226, nothing to do here. |
Added unit tests for tuf.hash_bins Signed-off-by: Kairo de Araujo <[email protected]>
Added unit tests for warehouse.tuf.repository Signed-off-by: Kairo de Araujo <[email protected]>
Fix general linting for tests added and tuf services Signed-off-by: Kairo de Araujo <[email protected]>
As ``warehouse.tuf.repository`` is using typing, the mypy found some issues and it was fixed. Some tests improvements, added some monkeypatch and new asserts for call recorded using pretend. Signed-off-by: Kairo de Araujo <[email protected]>
Fix some required paramenters for running the development environment. Fix bug on LocalKeyStorage Signed-off-by: Kairo de Araujo <[email protected]>
Remove unnecessary TargetsPayload data structure and use the TargetFile from the Python TUF (python-tuf) Metadata API. The TargetsPayload was used to add hashed targets. However, a similar data structure is provided by python-tuf. Signed-off-by: Kairo de Araujo <[email protected]>
Remove function _make_fileinfo RepositoryService that is not used anywhere. Signed-off-by: Kairo de Araujo <[email protected]>
The RolesPayload was used by the TUF initialization (for development purposes) and during the Roles Delegations. The RolesPayload is no longer necessary during the TUF development initialization once all the configuration is available on request settings. During the Roles Delegations, it was replaced by the python-TUF data structure DelegatedRole, reusing it from ``tuf.repository``. Signed-off-by: Kairo de Araujo <[email protected]>
This commit adds a refactoring on the key signature used. Instead of using from Key Storage Service keys as a dictionary, uses that as a ``securesystemslib.signer.Signer``. It gives more flexibility and uses the same data structure across the services, repository and TUF. Signed-off-by: Kairo de Araujo <[email protected]>
Reduce complexity and lines of code by implementing all current TUF management code inside `RepositoryService`, and thus removing one level of abstraction, previously implemented by the `MetadataRepository` class. NOTE: This patch is marked WIP, as it has not removed all references to `MetadataRepository`, nor adopted the tests. Moreover, it still needs review in terms of correctness wrt PEP458. But the reduced complexity should make this easier. NOTE: (2) There is more potential for DRY code, see reoccurring `_bump_version; _bump_expiration; _sign; _persist;` and `_update_snapshot; _update_timestamp;` call chains. For this iteration of the patch, I chose verbosity/explicitness over saving a few more lines. But maybe both can be achieved. Signed-off-by: Lukas Puehringer <[email protected]>
This commit fixes some bug implementation from the last commit. This improves the internal functions that require the role names to work correctly once the role object has no explicit type (name), adds the SPEC_VERSION in the service, and the JSONSerializer for persisting the files with a better appearance. Signed-off-by: Kairo de Araujo <[email protected]>
The introduction of python-tuf 2.0.0 adds the feature of Succinct Delegation Roles as part of TAP15 (https://github.com/theupdateframework/taps/blob/master/tap15.md) This feature reduces the number of lines as the Hash Bins become built-in on python-tuf. All unit tests updated. Signed-off-by: Kairo de Araujo <[email protected]>
Missing tuf.url setting in the conftest for the app_config. Signed-off-by: Kairo de Araujo <[email protected]>
8058e54
to
f420476
Compare
Fixed typos in conftest and tuf/interfaces. Signed-off-by: Kairo de Araujo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, just some nits, although I wasn't able to run the add-all-packages
command.
For next steps: I want to call out that in addition to generating target metadata w/in the upload request flow, we also need to implement a non-local KeyService
(and choose what service we're going to use as the backing service for this in production).
$(WAREHOUSE_CLI) tuf dev keypair --name targets --path /opt/warehouse/src/dev/tufkeys/targets1 | ||
$(WAREHOUSE_CLI) tuf dev keypair --name targets --path /opt/warehouse/src/dev/tufkeys/targets2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are there two different targets generated here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generated two different keys to use different configuration thresholds in the config, and the KeyService
handle multiple keys in the development environment.
warehouse/warehouse/tuf/__init__.py
Line 28 in bed6362
"tuf.targets.threshold": 2, |
request = config.task(_init_targets_delegation).get_request() | ||
try: | ||
config.task(_init_targets_delegation).run(request) | ||
except (FileExistsError, StorageError) as err: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit weird that we're catching a StorageError
from the underlying sub-dependency all the way up here. I'd expect this to be catching an exception specific to tuf
right where we're invoking that library instead.
@@ -358,6 +368,10 @@ def configure(settings=None): | |||
], | |||
) | |||
|
|||
# For development only: this artificially prolongs the expirations of any | |||
# Warehouse-generated TUF metadata by approximately one year. | |||
settings.setdefault("tuf.development_metadata_expiry", 31536000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be used anywhere? Also, if we want to override this for development, we should set a default value to be used in production, and then override it in the dev/environment
file.
@@ -84,12 +84,12 @@ def render_simple_detail(project, request, store=False): | |||
f"{project.normalized_name}/{content_hash}.{project.normalized_name}.html" | |||
) | |||
|
|||
length = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we can still return the length even if we're not storing the file, correct? Any reason not to return it here? Seems like this would be undefined behavior otherwise.
@@ -598,6 +599,7 @@ def includeme(config): | |||
) | |||
config.add_redirect("/pypi/", "/", domain=warehouse) | |||
config.add_redirect("/packages/{path:.*}", files_url, domain=warehouse) | |||
config.add_redirect("/metadata/{path:.*}", metadata_url, domain=warehouse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is necessary? Why do we need a redirect?
@@ -0,0 +1,242 @@ | |||
# General TUF Warehouse implementation Notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably all move into our docs.
# NOTE: This is a deviation from PEP 458, as published: the PEP | ||
# stipulates that bin-n metadata expires every 24 hours, which is | ||
# both burdensome for mirrors and requires a large number of redundant | ||
# signing operations even when the targets themselves do not change. | ||
# An amended version of the PEP should be published, at which point | ||
# this note can be removed. | ||
"tuf.bin-n.expiry": 604800, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is someone working on this?
|
||
@dev.command() | ||
@click.pass_obj | ||
def add_all_packages(config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running this locally, I got:
Traceback (most recent call last):
File "/opt/warehouse/src/warehouse/tuf/services.py", line 139, in get
file_object = open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/var/opt/warehouse/tuf_metadata/1.bins.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/warehouse/src/warehouse/__main__.py", line 18, in <module>
sys.exit(warehouse())
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/warehouse/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/opt/warehouse/lib/python3.10/site-packages/click/decorators.py", line 38, in new_func
return f(get_current_context().obj, *args, **kwargs)
File "/opt/warehouse/src/warehouse/cli/tuf.py", line 148, in add_all_packages
config.task(_add_hashed_targets).run(request, targets)
File "/opt/warehouse/src/warehouse/tasks.py", line 71, in run
result = original_run(*args, **kwargs)
File "/opt/warehouse/src/warehouse/tuf/tasks.py", line 58, in add_hashed_targets
repository_service.add_hashed_targets(targets)
File "/opt/warehouse/src/warehouse/tuf/services.py", line 428, in add_hashed_targets
bin_n = self._load(RoleType.BINS.value)
File "/opt/warehouse/src/warehouse/tuf/services.py", line 219, in _load
return Metadata.from_file(role_name, None, self._storage_backend)
File "/opt/warehouse/lib/python3.10/site-packages/tuf/api/metadata.py", line 233, in from_file
with storage_backend.get(filename) as file_obj:
File "/usr/local/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
File "/opt/warehouse/src/warehouse/tuf/services.py", line 142, in get
raise StorageError(f"Can't open {filename}")
securesystemslib.exceptions.StorageError: Can't open /var/opt/warehouse/tuf_metadata/1.bins.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this during the make inittuf
?
Or did you follow the sequence:
Line 106 in bed6362
$(WAREHOUSE_CLI) tuf dev init-repo |
@kairoaraujo I see there are some questions from Dustin that I think you have not addressed yet; could you check? Thanks! |
Hi @brainwane, I will continue with that. |
@kairoaraujo I believe all the open questions in that design document have now been resolved; is there anything else you're blocked on before revising this pull request? Thanks! |
Hi folks |
Hi @kairoaraujo we're really excited about this. Anywhere we can see the current progress? milestones on the implementation? |
Thanks for the interest! In fact, should the stakeholders (e.g., Lukas, Kairo, Dustin, Ofek, Donald, Marina, Sumana, etc) have a meeting to sync up? |
Thanks for the friendly nudge, @avishayil! We just posted a status update to the "PEP 458 current status and next steps" -thread on Python discuss. Happy to provide more details! 🎉 |
Should we close this PR in favour of #13943? @miketheman @di |
Superseded by #15241 |
This work refactors the Draft PR by
@woodruffw, to build a new repository tool on top of the Python-TUF
Metadata API, and use it instead of the Python-TUF repository tool
that was deprecated in v1.0.0.
Part of #10672
Note to reviewer
The current implementation has some development-only components, and lacks a few services for full PEP458 compliance as well as extensive tests. However, it should qualify for a review of the overall architecture and flow (see details in 'Overview' below). Components and functionality that are planned for subsequent PRs are listed in 'Next steps' below.
Overview
warehouse.tuf.repository
MetadataRepository
implements a custom TUF metadata repository tool on top ofthe new Python-TUF Metadata API to create and maintain (update, sign, sync with storage) TUF metadata for Warehouse.
warehouse.tuf.services
LocalKeyService
provides a local file storage backend for TUF role keys used by the repository tool (development only!!).LocalStorageService
provides a local file storage backend for TUF role metadata used by the repository tool.RepositoryService
provides methods for common Warehouse-TUF tasks, using the repository tool.warehouse.tuf.tasks
Defines common Warehouse-TUF tasks that use the
RepositoryService
forinit_repository
,init_targets_delegation
),add_hashed_targets
)bump_bin_n_roles
,bump_snapshot
)warehouse.cli.tuf
Defines development commands for bootstrapping a TUF metadata repository (
keypair
,init_repo
,init_delegations
), backsigning existing packages and simple index pages (add_all_packages
,add_all_indexes
), and for manually triggering scheduled tasks (bump_bin_n_roles
,bump_snapshot
). CLI calls go throughwarehouse.cli.tasks
, to take advantage of the Celery/Redis queue.Next steps:
Using the Warehouse development environment for TUF
Follow the official Warehouse until
make initdb
The metadata is available at http://localhost:9001/metadata/
You can also upload a file using the Warehouse and add the targets using CLI
twine
Updated: Removed MetadataRepository implementation