-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TUTORIAL.md and add regression testing for future issues #775
Conversation
e5709b2
to
f7f003d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While going through your test_tutorial.py
and comparing it line by line with the code snippets in TUTORIAL.md
, to see if the instructions are properly replicated (which they seem to be), I had a crazy idea:
Why not run Python's doctest
over the TUTORIAL.md
? That would be an actual regression test, i.e. if the tutorial is changed, doctest
would immediately detect regressions without manually syncing the test code.
From the doctest
docs, one of it's use cases is:
To write tutorial documentation for a package, liberally illustrated with input-output examples. Depending on whether the examples or the expository text are emphasized, this has the flavor of “literate testing” or “executable documentation”.
The function doctest.testfile
should mostly work right away, with a few minor tweaks to the tutorial snippets:
- Add additional function calls and their expected output here and there to assert the proper functioning,
- Replace non-deterministic output, such as keyids (you already do this in
test_tutorial.py
) - Replace shell code with Python code (you already do this in
test_tutorial.py
) - Deal with user input (you already do this in
test_tutorial.py
)
IMHO the easiest way would be to just monkeypatch the corresponding functionsgenerate_and_write_*_keypair
,import_*_privatekey_from_file
. You can usedoctest
'sglobs
variable to pass the patched function names.
What do you think? If you want I can help you. Also let me know if you prefer the setup as it is now. Then I'll resume my comparing lines. :)
tests/test_tutorial.py
Outdated
as those created by this test...). | ||
""" | ||
for directory in ['repository', 'my_repo', 'client', | ||
'repository/targets/my_project']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repository/targets/my_project
is already deleted in the first iteration of this loop, i.e. shutil.rmtree('repository')
tests/test_tutorial.py
Outdated
for fname in [ | ||
os.path.join('repository', 'targets', 'file1.txt'), | ||
os.path.join('repository', 'targets', 'file2.txt'), | ||
os.path.join('repository', 'targets', 'file3.txt'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're already deleting the repository
dir above.
tests/test_tutorial.py
Outdated
'unclaimed_key.pub']: | ||
if os.path.exists(fname): | ||
os.remove(fname) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a more general note, would it make sense to create and change into a temporary test directory in TestTutorial.setUp
and change back and remove the temporary directory in TestTutorial.tearDown
?
It's a good thought. I'll have to try it out to see how practical it proves to be. If it makes the tutorial harder to read, that's a dealbreaker. |
TODO: - Rephrase two comments in TUTORIAL.md snippets - See if there are any issues in "Dump Metadata", "Delegations", "Revoke Delegated Role", "Wrap up" and fix or ticketize - Create regression test for client (or ticketize) - Create commits - Evaluate theupdateframework#808 (some if it is already resolved or will be resolved with theupdateframework#775, split out rest into new tickets, e.g. doctest style tutorial tests)
TODO: - Rephrase comments in TUTORIAL.md snippets (where code changed) - See if there are any issues in "Revoke Delegated Role", consistent snapshot, delegate to hashed bins - Finish regression test for client - Create commits - Evaluate theupdateframework#808 (some if it is already resolved or will be resolved with theupdateframework#775, split out rest into new tickets, e.g. doctest style tutorial tests)
TODO: - Create commits for TUTORIAL.md and test_tutorial.py - Update PR theupdateframework#775 - Evaluate and update issue theupdateframework#808 (some if it is already resolved or will be resolved with theupdateframework#775, split out rest into new tickets, e.g. doctest style tutorial tests)
6b4a4b2
to
09e9137
Compare
Taking over @awwad's work, I addressed my own inline review comments from above and most of the suggestions I made in #808 (comment). The tutorial is now fully copy-pastable (runnable) and should generally not leave the repository in an inconsistent state. Remaining issues as described in #808 should be fixed in separate PRs. I suggest reviewers to:
Please read commit messages for details! |
69dc37f
to
a9c0086
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a really solid set of improvements to the tutorial, great work @lukpueh
docs/TUTORIAL.md
Outdated
>>> repository.dirty_roles() | ||
Dirty roles: ['timestamp', 'snapshot', 'targets'] | ||
Dirty roles: ['root', 'snapshot', 'targets', 'timestamp'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, I only have:
Dirty roles: ['targets']
I believe I ran through the tutorial verbatim, apart from that I didn't exit ipython so I didn't have to re-import and reload the keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the tuf repository code has a bit of a problem with state-keeping. dirty_roles()
shows roles that have been changed in memory, e.g. by loading a private key for a role (load_signing_key()
), which is necessary after loading an existing repository (load_repository()
). This does not mean, however, that the according metadata on disk actually needs to be updated, although this is exactly what's done with dirty roles when calling writeall()
.
This clearly needs a revision. But probably not as part of the current PR.
A related issue, that is writeall()
ignoring roles that were not explicitly changed and thus not marked as dirty, even though they should be updated in dependency of other roles is documented in #958.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @lukpueh. I wasn't sure why the output I saw differed from that shown in the tutorial, so wanted to point it out. I'm happy with this not being addressed in the current PR, should we capture the issue around state-keeping so we don't forget?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the best quick fix for the tutorial would be to just remove the calls to dirty_roles()
and instead add a call to mark_dirty()
passing all the roles that need to be written in a subsequent call to writeall()
. This guarantees that the repository on disk is always in a consistent state, even if the interpreter was exited.
What do you think, @joshuagl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this idea. I think it demonstrates how things fit together and helps make some of the workflow clearer.
# ----- Tutorial Section: Delegations | ||
generate_and_write_rsa_keypair( | ||
'unclaimed_key', bits=2048, password='password') | ||
public_unclaimed_key = import_rsa_publickey_from_file('unclaimed_key.pub') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very minor nit, but the tutorial isn't consistent in its use of '
vs "
for strings. I suspect it highlights the new content Lukas added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are absolutely right! I pointed this out in #808 (comment) (3rd item), which I suggested in #775 (comment) should be fixed along with other remaining issues in separate PRs.
Apologies for the many sources of truth here. I can see how it is hard for a reviewer to keep track of all the things that need to be fixed. Thanks for bearing with me! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'unclaimed_key', bits=2048, password='password') | ||
public_unclaimed_key = import_rsa_publickey_from_file('unclaimed_key.pub') | ||
repository.targets.delegate( | ||
'unclaimed', [public_unclaimed_key], ['myproject/*.txt']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This emits:
'myproject/*.txt' is not located in the repository's targets directory: '/home/joshuagl/tuf-tut/repository/targets'
I think this is the first point in the tutorial we haven't warned the user to expect output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly thought about it but left it to be fixed along with other issues in the tutorial.
Moreover, I have the suspicion that this is a bogus warning. It seems to literally check for a file called "myproject/*.txt" ignoring the fact that this is a path pattern.
What to you think about creating an issue, and adding a comment in the tutorial snippet that asks the reader to ignore that output, referring to that issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable. Let's be sure to have the issue mention that the tutorial needs updating once the issue is resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suspicion was wrong. The warning does not stem from an invalid file check but rather an unexpected file path prefix check. I documented the issue and a proposed fix in #963.
tests/test_tutorial.py
Outdated
os.path.join('repository', 'targets', 'myproject'), recursive_walk=True) | ||
|
||
repository.targets('unclaimed').delegate_hashed_bins( | ||
targets, [public_unclaimed_key], 32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make a remark about the output here? I get:
Creating hashed bin delegations.
1 total targets.
32 hashed bins.
256 total hash prefixes.
Each bin ranges over 8 hash prefixes.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! It is another instance of me thinking this should be addressed. More specifically with the 2nd to last item in #808 (comment).
But adding a comment there that briefly explains the output does not hurt.
docs/TUTORIAL.md
Outdated
|
||
>>> repository.dirty_roles() | ||
Dirty roles: ['00-07', '08-0f', '10-17', '18-1f', '20-27', '28-2f', '30-37', '38-3f', '40-47', '48-4f', '50-57', '58-5f', '60-67', '68-6f', '70-77', '78-7f', '80-87', '88-8f', '90-7', '98-9f', 'a0-a7', 'a8-af', 'b0-b7', 'b8-bf', 'c0-c7', 'c8-cf', 'd0-d7', 'd8-df', 'e0-e7', 'e8-ef', 'f0-f7', 'f8-ff', 'unclaimed'] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the same roles but sorted differently, which I'd expect knowing that various Python data structures have non-deterministic ordering. Does it make sense to make these deterministic (i.e. sorted()
this list)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patch to sort the list before printing in dirty_roles()
submitted in #961
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#961 was merged 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is odd. I had already added this change in ac01033 specifically for that reason, albeit one level deeper. Are you sure you got a different sorting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me
Minor tutorial related tweaks made after reviewing PR #775
Thank you very much for the review, @joshuagl. Let me address/reply to your excellent observations! |
@@ -21,7 +21,7 @@ init: | |||
|
|||
install: | |||
- set PATH=%PYTHON%;%PYTHON%\\Scripts;%PATH% | |||
- python -m pip install -U pip | |||
- python -m pip install -U pip setuptools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to explicitly install mock for Python 2.7 on appveyor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Let me fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshuagl, do you happen to know how to install the dependency only for Python 2.7 builds on appveyor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not off the top of my head, but I think we should be able to use for.matrix.only/.except
:
https://www.appveyor.com/docs/build-configuration/#specializing-matrix-job-configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Maybe 932c025 works too.
7513a98
to
f02c5c4
Compare
- line-wraps are integrated with 190a736 in #775 - sorting the value returned by `get_dirty_roles()` in `dirty_roles()` is not necessary as `get_dirty_roles()` already returns a sorted list per ac01033 in #775. Signed-off-by: Lukas Puehringer <[email protected]>
…utorial Revert "Minor tutorial related tweaks made after reviewing PR #775"
2f0cf72
to
21db98f
Compare
Signed-off-by: Sebastien Awwad <[email protected]>
(that is, despite currently existing issue to be remedied in #774) Currently, repository_tool.get_filepaths_in_directory yields relative paths, not the absolute paths it promises in its docstring. This test will now function despite this and continue to function after #774 is merged. Signed-off-by: Sebastien Awwad <[email protected]>
Note that target filepaths specified in the repo use '/' even on Windows. (That property is important to make sure that metadata is platform- independent.) Signed-off-by: Sebastien Awwad <[email protected]>
Reasons are: - The prompt says 2.7.3 Sep 26 2013, which makes the tutorial look outdated - There is another section that explains how snippets should be executed in a Python interactive interpreter - The only activity in the snippet is importing tuf functions and creating a repo, both of which is done in another snippet below. And the here created repo is not re-used - The tutorial is long enough Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
- Fix expected output - Update comments - Also see #955 Signed-off-by: Lukas Puehringer <[email protected]>
- Correctly show that repo.get_filepaths_in_directory() returns absolute and not relative paths - Pass absolute path to repo.targets.add_target() to fix exception - Also see #957 Signed-off-by: Lukas Puehringer <[email protected]>
The section does not fit in the continuity of the tutorial and misses commands to make it fully copy-pastable. This commit marks the section as "Excursion" and updates the snippets to make the commands fully copy-pastable, using files created in prior tutorial sections. Signed-off-by: Lukas Puehringer <[email protected]>
In "Targets" section: - Remove `repository.targets('<delegated rolename>').add_target(...)` command, because it is not copy-pastable and delegations have not yet been covered at that point. - Update the "remove targets" snippet to remove the previously added "myproject/file4.txt" instead of "file3.txt", because we will add "myproject/file4.txt" to the delegated "unclaimed" role in the Delegation section. In "Delegation" section: - Change "unclaimed" delegation pattern from 'foo*.tgz', for which no file exists in the tutorial, to 'myproject/*.txt'. - Add "myproject/file4.txt" to the delegated unclaimed targets role - Remove the command that updates the version of the "unclaimed" role, because this should not be done manually, and the add_target call shows just as well how to access delegated roles. - Comment out the revoke delgated role section, leaving a TODO note for required updates (should be ticketized). In "Delegate to Hashed Bins": - Add call to remove target "myproject/file4.txt" from "unclaimed", because it is further delgated to hashed bins - Add dirty_roles() call to show all the newly created bins - Add mark_dirty() and writeall() calls to create a consistent state of the repo Signed-off-by: Lukas Puehringer <[email protected]>
- Fix expected output - Update comments - Add a few additional calls, to help the reader understand the repo state - Also see #958 Signed-off-by: Lukas Puehringer <[email protected]>
The text above the snippet explains the basic idea of "consistent snapshots" and how to generate them with `write` and `writeall`. The commands in the snippet just leave the repo in an inconsistent state (see comment). Signed-off-by: Lukas Puehringer <[email protected]>
- Change directory names to what cli tool repo.py expects - Remove unrelated "tufenv" note - Mention that `tuf` must be installed Signed-off-by: Lukas Puehringer <[email protected]>
Instead of keeping track of files created during the tutorial and removing them afterwards, this commit updates the test case to create and change into a temporary directory in setUp and change back and remove the tempdir in tearDown. Signed-off-by: Lukas Puehringer <[email protected]>
roledb.get_dirty_roles(repo_name) returns the list representation of the global _dirty_roles[repo_name] set. To make the return value deterministic this commit sorts the list before returning it. The commit also removes calls to sorted on the return value of get_dirty_roles in test_roledb.py and test_repository_tool.py. Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
Add, remove and update function calls to match code snippets in tutorial. This commit also adds tests for outputs of `repo.status()` and `repo.dirty_roles()` functions. Note that the compare-to strings need to be constructed programatically, akin to how they are constructed in the relevant functions, in order to avoid issues with unicode prefixes in Python2/3, e.g. "Dirty roles: ['root']" vs "Dirty roles: [u'root']" Signed-off-by: Lukas Puehringer <[email protected]>
This should fix the following build error: error in tuf setup command: 'tests_require' must be a string or list of strings containing valid project/version requirement specifiers; Expected version spec in mock; python_version < "3.3" at ; python_version < "3.3" Signed-off-by: Lukas Puehringer <[email protected]>
Co-Authored-By: Joshua Lock <[email protected]> Signed-off-by: Lukas Puehringer <[email protected]>
- Ask the reader to ignore a misleading warning about the location of a delegation path pattern. The comment may be removed when fixing the warning in #963. - Comment out text that has become obsolete when commenting out the "Revoke Delegated Role" section (in an earlier commit). Signed-off-by: Lukas Puehringer <[email protected]>
Explain and show output of delegate_hashed_bins() function call in tutorial snippet. Also update the subsequent comment for better continuity. Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
TUF does not reliably mark roles as dirty whose metadata needs to be re-generated. Only roles that have changed are marked as dirty, but sometimes roles metadata needs to be updated, although the role wasn't changed directly (see #958). Furthermore, the tutorial assumes at one point that the reader leaves and re-enter the interpreter session, being forced to reload the signing keys, roles that later need to be re-written, are marked as dirty. If the reader does not leave the interpreter, the roles are not marked as dirty (see #964). To not confuse the reader with flawed state-keeping, and to never write an inconsistent repository to disk, the tutorial lets the reader explicitly mark all roles that need to be re-written as "dirty". This can be changed once above issues are fixed. Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
- Add missing remove_target call from "Delegate to Hashed Bins" section - Add comments to dirty_roles output assertion Signed-off-by: Lukas Puehringer <[email protected]>
c6aa55d
to
fc74cf2
Compare
Add a regression test for the tutorial instructions. Running test_tutorial.py attempts the commands replicated from TUTORIAL.md. This should help us avoid breaking the tutorial with future changes without noticing by having automated testing run the tutorial and produce helpful output.
It would be wise to add a similar test for the client use instructions.