support for potentially persisting ephemera between builds #32

Open
jclulow opened this issue Jul 17, 2023 · 2 comments

Comments

@jclulow
Collaborator

jclulow commented Jul 17, 2023

At present, buildomat targets provide a pristine environment each time a job is started. This is the best way to provide reliable, understandable results that can be reproduced or debugged after a build completes.

Unfortunately, some software build processes are truly staggeringly expensive. It appears to be possible, at the expense of making the build less hermetic, to persist some portion of the build environment (but not the whole environment) at the end of a successful build, to be unfurled on top of the pristine environment the next time around. This is obviously a facility that will require some care in order to avoid creating a lot of potentially quiet problems; some features we should consider are:

  • allowing this to occur for pull requests but not for builds pushed to the main branch of a repository; GitHub currently forces a new commit hash even when doing a purely fast-forward rebase merge, so our deduplication based on commit hash likely won't be a problem here unless that changes
  • if this is driven by declarative configuration in the job TOML:
    • preserving only a very specific set of files, perhaps using the same rule matching behaviour we get with output_rules directives
    • we should take care to preserve files only after a successful build
    • we will need some way to reliably invalidate any existing persisted ephemera
  • this could also be driven explicitly using the bmat control program that is available inside jobs, which was added in 6cd4797
    • perhaps one could nominate files to persist; e.g., bmat persist cargo-registry ~/.cargo/registry
    • one could unfurl the latest persistent data with, say, bmat restore cargo-registry and it would be unpacked to the location from which it was originally saved
    • these commands would direct the agent to begin managing the persistent files, or to report that it couldn't do it for whatever reason; an advantage might be that jobs could handle failure however they like, and that the operational cost of doing this would itself be visible within the job
  • jobs should report any sets of persistent ephemera that they used, including a link to download the archive itself, as well as any integrity information, or metadata like creation date, which job created the archive, etc
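
As a rough sketch of how the explicit, bmat-driven option might look from inside a job script: the `bmat persist` and `bmat restore` subcommands below are the ones proposed in this list and do not exist yet, and the `build_with_cache` helper and `BMAT` override variable are hypothetical names used only for illustration. The point is that the job decides how to handle a failed restore, and that files are nominated only after the build succeeds.

```sh
# Sketch only: "bmat persist" and "bmat restore" are proposed subcommands,
# not existing ones. BMAT can be overridden to point at a different program.
BMAT="${BMAT:-bmat}"

# Run a build command with a best-effort persisted cache around it.
#   $1:   name of the persisted set (e.g., cargo-registry)
#   $2:   path to persist after a successful build
#   rest: the build command and its arguments
build_with_cache() {
    local name="$1" path="$2"
    shift 2

    # A failed restore is not fatal; the job simply starts cold.
    if ! "$BMAT" restore "$name"; then
        echo "note: no usable '$name' archive; starting cold" >&2
    fi

    # If the build fails, return its exit status without persisting anything.
    "$@" || return $?

    # Nominate files for persistence only after a successful build.
    "$BMAT" persist "$name" "$path" ||
        echo "warning: could not persist '$name'" >&2
    return 0
}
```

A job might then call `build_with_cache cargo-registry ~/.cargo/registry cargo build --release`. Treating both restore and persist failures as warnings rather than errors is one possible policy; a job that depends on the cache could just as easily choose to fail instead.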
@steveklabnik
Contributor

@jclulow if I were interested in trying to implement this, is there a path that you would particularly prefer? That is, which way is most likely to get accepted?

Happy to chat through details as well.

@smklein

smklein commented Nov 8, 2023

Any update here? As I mentioned on oxidecomputer/omicron#4471, this could easily shave ~10 minutes off most of our Omicron jobs.
