Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import base layer images #1688

Closed
wants to merge 1 commit into from

Conversation

jsturtevant
Copy link
Contributor

If a project creates a base layer image and wants to directly import the base then this ensure the baseline file are in place. It is a follow up to #1637 and enables containerd with ctr image import image.tar of a base layer.

@gabriel-samfira @helsaawy @TBBle

I didn't find any tests that I could run but I am happy to add something if I get a pointer in the right direction.

If a project creates a base layer image and wants to directly import the base then this ensure the baseline file are in place.  It is a follow up to microsoft#1637 and enables containerd with ctr image import image.tar of a base layer.

Signed-off-by: James Sturtevant <[email protected]>
@jsturtevant jsturtevant requested a review from a team as a code owner March 7, 2023 23:30
@jsturtevant jsturtevant mentioned this pull request Mar 7, 2023
30 tasks
@TBBle
Copy link
Contributor

TBBle commented Mar 8, 2023

If I understand correctly, this is to make arbitrary tarballs importable as base layers? My initial question is: does this introduce any risks or issues if multiple imports of the same tarball actually end up with different (meta)data inside Files/ (i.e. user-visible differences in creation datestamps etc).

If support for this feature is desired, we might need a stronger ensureBaseLayer that ensures that any created files have metadata based on some immutable information in the layer, or just constant.

Also, I can't think of any issue this can cause, but this does mean that the baseLayerWriter is now always writing into a pre-created directory tree rather than creating everything as it goes during layer writing. I suspect the existing code for handling directory timestamps will fix up any divergence in the Windows/system32/config's resulting timestamps, but perhaps this should be called in baseLayerWriter's cleanup instead, so that it's effectively a no-op for the existing base layer-formatted layers, where the requisite files are already in the tarball.

Edit: Looking at the context, I realise this is for WASI container images, which can't be expected to have Windows-specific stuff in their base layer. So definitely an important use-case, and given existing HCS layer-mounting requirements, I can't see any better approach than running something like ensureBaseLayer during import. (Unless we don't use the Windows Snapshotter on Windows for WASI container images, but that smells worse to me)

Edit again: Actually, perhaps WASI images can't be cross-platform like I thought in my earlier edit, since AFAIR the WCOW layer format includes the top-level Files/ directory (and potentially UtilityVM/Files), so they're never going to be compatible if HCS is involved. (Which brings back the containerd-level "is this the right snapshotter for WASI?" question)

Gah, I'm too far from this stuff these days, my recollections aren't reliable and I don't have time to check the code right now. So I know at least one of the above two paragraphs is incorrect. ^_^

@gabriel-samfira
Copy link
Contributor

I didn't find any tests that I could run but I am happy to add something if I get a pointer in the right direction.

I skipped adding tests in hcsshim, as this is heavily used in containerd, and we run the containerd integration tests as part of the hcsshim CI pipeline. Sadly, the containerd branch using this is not yet merged.

If a project creates a base layer image and wants to directly import the base then this ensure the baseline file are in place. It is a follow up to #1637 and enables containerd with ctr image import image.tar of a base layer.

To be honest, I can't think of any negative implications of being able to import and mount any arbitrary layer. They won't actually be usable to start a container on Windows, but they will be mountable at least. I see no harm in this tbh, but I am hardly an authority on this 😄

@jsturtevant
Copy link
Contributor Author

So definitely an important use-case, and given existing HCS layer-mounting requirements, I can't see any better approach than running something like ensureBaseLayer during import. (Unless we don't use the Windows Snapshotter on Windows for WASI container images, but that smells worse to me)

I considered a different snapshotter as well but this would be quite a bit of additional overhead. Maybe longer term it could be the right approach (and now that @gabriel-samfira has enabled Bind Filters in Windows we might be able to use the native snapshoter with some minor tweaks? I only look briefly and this is a bit out of my comfort zone)

They won't actually be usable to start a container on Windows, but they will be mountable at least. I see no harm in this tbh, but I am hardly an authority on this 😄

The other use case for this is HostProcess Containers for images like https://github.com/microsoft/windows-host-process-containers-base-image. I user could generate an image for this locally, import it and then run it.

Edit again: Actually, perhaps WASI images can't be cross-platform like I thought in my earlier edit, since AFAIR the WCOW layer format includes the top-level Files/ directory (and potentially UtilityVM/Files), so they're never going to be compatible if HCS is involved. (Which brings back the containerd-level "is this the right snapshotter for WASI?" question)

With this change the image generated at https://github.com/containerd/runwasi/blob/main/crates/wasi-demo-app/build.rs doesn't require any changes to run on windows (since this adds those meta data files). It does add some additional files that won't be used (like the vhd files). so the image is technically cross platform if you squint. Again maybe using a custom snapshotter would make sense longer term but this also opens the possibility to do this with HPC.

Also, I can't think of any issue this can cause, but this does mean that the baseLayerWriter is now always writing into a pre-created directory tree rather than creating everything as it goes during layer writing. I suspect the existing code for handling directory timestamps will fix up any divergence in the Windows/system32/config's resulting timestamps, but perhaps this should be called in baseLayerWriter's cleanup instead, so that it's effectively a no-op for the existing base layer-formatted layers, where the requisite files are already in the tarball.

I am not sure I understand all of this, but could try putting it during the cleanup? It needs to be called before ImportLayer or it errors because those files are not present.

@jsturtevant
Copy link
Contributor Author

CI / integration-tests (windows-2022) (pull_request) failure doesn't look related to this change

@gabriel-samfira
Copy link
Contributor

The other use case for this is HostProcess Containers for images like https://github.com/microsoft/windows-host-process-containers-base-image. I user could generate an image for this locally, import it and then run it.

This is actually great. We can finally have FROM scratch with host process containers and not have to pull in a large image just to distribute a binary.

@gabriel-samfira
Copy link
Contributor

CI / integration-tests (windows-2022) (pull_request) failure doesn't look related to this change

For peace of mind, you could try out the snapshotter tests in containerd/containerd#8043. Just replace hcsshim with your branch in go.mod and run:

sh.exe -c "EXTRA_TESTFLAGS=-test.run=TestSnapshotterClient make integration"

@jsturtevant
Copy link
Contributor Author

This is semi-related: opencontainers/runtime-spec#1185

@TBBle
Copy link
Contributor

TBBle commented Mar 9, 2023

Also, I can't think of any issue this can cause, but this does mean that the baseLayerWriter is now always writing into a pre-created directory tree rather than creating everything as it goes during layer writing. I suspect the existing code for handling directory timestamps will fix up any divergence in the Windows/system32/config's resulting timestamps, but perhaps this should be called in baseLayerWriter's cleanup instead, so that it's effectively a no-op for the existing base layer-formatted layers, where the requisite files are already in the tarball.

I am not sure I understand all of this, but could try putting it during the cleanup? It needs to be called before ImportLayer or it errors because those files are not present.

Yeah, I'm thinking maybe put it before

err = reapplyDirectoryTimes(w.root, w.dirInfo)
, so that the "fixup directory times" step is run after ensureBaseLayer.

@TBBle
Copy link
Contributor

TBBle commented Mar 9, 2023

The other use case for this is HostProcess Containers for images like https://github.com/microsoft/windows-host-process-containers-base-image. I user could generate an image for this locally, import it and then run it.

This is actually great. We can finally have FROM scratch with host process containers and not have to pull in a large image just to distribute a binary.

I'm surprised that image works; it's basically the same as the earlier version of the "base layer creation" PR (i.e. before real hive creation was added, when ensureHive was just touch) and that wasn't importable on recent Windows versions. This change would not fix that, because the created image contains invalid hives, and ensureBaseLayer doesn't protect against that, so would be a no-op.

Anyway, FROM scratch is what the current work already implements: I had hoped that the eventual implementation of FROM scratch would not rely on an external image, in the same way it doesn't on Linux.

I'm somewhat curious, is "tar up a filesystem and ctr image import it" a common workflow? Until this PR, I thought ctr image import only accepted Docker and OCI image/image list tarballs.

@jsturtevant
Copy link
Contributor Author

sh.exe -c "EXTRA_TESTFLAGS=-test.run=TestSnapshotterClient make integration"

I am getting an error even when not using my PR.

time="2023-03-17T16:08:22.821364900-07:00" level=info msg="containerd successfully booted in 0.033075s"
time="2023-03-17T16:08:22.932857100-07:00" level=debug msg="garbage collected" d=3.5324ms
time="2023-03-17T16:08:23.278667900-07:00" level=debug msg="event published" ns=testing topic=/namespaces/update type=containerd.events.NamespaceUpdate
time="2023-03-17T16:08:23.592158800-07:00" level=debug msg="(*service).Write started" expected="sha256:3bb312dcc36fa551766a13cf2fb8e2be90726774589c96bea9198a26307bf2a2" ref="index-sha256:3bb312dcc36fa551766a13cf2fb8e2be90726774589c96bea9198a26307bf2a2" total=350

Haven't dug into it, might be related to my dev env.

@jsturtevant
Copy link
Contributor Author

I'm somewhat curious, is "tar up a filesystem and ctr image import it" a common workflow? Until this PR, I thought ctr image import only accepted Docker and OCI image/image list tarballs.

The image https://github.com/microsoft/windows-host-process-containers-base-image isn't in an oci format, but it probably should be.

As for the common, making your own super small compact image for HPC should be something that could be enabled. As an example, provide an oci formatted tar package that has only your binary you want and I would expect directly importing to work.

@TBBle
Copy link
Contributor

TBBle commented Mar 18, 2023

I'm somewhat curious, is "tar up a filesystem and ctr image import it" a common workflow? Until this PR, I thought ctr image import only accepted Docker and OCI image/image list tarballs.

The image https://github.com/microsoft/windows-host-process-containers-base-image isn't in an oci format, but it probably should be.

Skimming https://github.com/microsoft/windows-host-process-containers-base-image/blob/main/New-HostProcessBaseImage.ps1, that does appear to be creating a Docker-formatted image tarball (i.e. with /manifest.json) with the requisite hive files in its rootfs. So it should not be affected by this PR, nor would anything using it as a base image.

As for the common, making your own super small compact image for HPC should be something that could be enabled. As an example, provide an oci formatted tar package that has only your binary you want and I would expect directly importing to work.

Yeah, I think I had a faulty assumption earlier, I thought this was for handling arbitrary tarballs (just a raw rootfs), not an OCI or Docker image tarball with a handrolled rootfs tarball (contrasting with a rootfs tarball prepared by, e.g. wclayer makebaselayer and wclayer export).

The use case I understand we're talking about now is similar to #750, although since that'd be solve in this repo, it'd be able to use wclayer to get a base layer in the currently-supported format.

I think the rest of my concerns (about being able to repeatedly import a given tarball and produce a 100% identical result including file-creation/modification metadata and ensuring this is a no-op when the root tarball does contain all of the necessary files) are applicable though.


Contrasting this to the existing approach (including those files in the exported base layer), this approach provides simpler base-layer hand-assembly, including not relying on access to empty hive templates or a hive-creation library. (Modulo that zero-byte hives failed to be processed last time I tested it, but perhaps that's changed again, since New-HostProcessBaseImage.ps1 does this and (I assume) works on Server 2022.)

However, base layers created without the required hives will only be importable with hcsshim-using systems that have taken a version of hcsshim after this pull is completed, or they'll get failures referencing hcsshim::ProcessBaseLayer coming from docker pull or ctr image pull.

Hence, I'd be concerned to start seeing such images pushed with the windows platform tag. If there's an different platform tag for WASI images, then that would be fine as there's no non-preview system that can pull WASI-platform images on Windows now, so we can ensure this feature is landed and made available to containerd before containerd starts formally supporting WASI on Windows.

So, I wouldn't use this feature to implement FROM scratch for Windows images; I'd still use hcsshim to process and export the base layer to ensure maximum compatibility, or for non-Windows image creation, use empty hive templates or a cross-platform hive-editing library to prepare a fully-compatible base layer.

@jsturtevant
Copy link
Contributor Author

Are you saying it would be preferable to generate the empty hive files like the HPC example does vs generating them on import? This won't work since there are checks to make sure those paths exist, which the don't even with the dummy files.

I think the HPC is a good example of a use case that could be updated to be a valid OCI tar format with a tool like https://github.com/containerd/runwasi/tree/main/crates/oci-tar-builder and should be importable without having to use hcsshim to create base layers.

Or is there another way to something similiar (point at a folder with an executable and generate a base image that can be shared via tarball)?

I will have to dig deeper on your concerns about the reproducibility of imports.

@TBBle
Copy link
Contributor

TBBle commented Mar 25, 2023

Are you saying it would be preferable to generate the empty hive files like the HPC example does vs generating them on import? This won't work since there are checks to make sure those paths exist, which the don't even with the dummy files.

I don't understand what you mean by not working here. If the dummy files exist, then a check for them by path should pass.

Maybe there's some confusion on my part; I have assumed the HPC base image generated by that repo is importable into containerd today, as importing of layers hasn't been changed by the work I've been involved with.

But I haven't tried it myself, so if it's failing, knowing that (and how it fails) would be informative.

But yeah, my general preference is to produce base layers in the existing format, and avoid actions in the import process that produce a Files/ directory that differs from the distributed layer tarball. (Hence my suggestion that when given a current-format base layer, this PR's effect should result in no changes to file data or metadata.)

The mutatedFiles stuff is an example where some data changes can't be avoided during the import process, so they are reverted back after importing. reapplyDirectoryTimes is the same idea for known-affected metadata. So this codebase already contains active effort to ensure Files/ matches the layer tarball's content.

I think the HPC is a good example of a use case that could be updated to be a valid OCI tar format with a tool like https://github.com/containerd/runwasi/tree/main/crates/oci-tar-builder and should be importable without having to use hcsshim to create base layers.

What's invalid about the HPC base-image's format now? It's in Docker format, rather than OCI format, but it looked valid as a base layer from reading the script that produces it. (I didn't run it up side-by-side with the Docker image manifest spec though.)

Or is there another way to something similiar (point at a folder with an executable and generate a base image that can be shared via tarball)?

I don't think there's any complete pipeline that can do that, apart from the containerd snapshotter integration tests, which (once the PR lands) will be using the existing makeBaseLayer functionality in hcsshim to generate the necessary hive files and other things to be mountable.

Now that hcsshim-side stuff has landed, I was considering putting a script together (like the HPC repo) using the wclayer tool here to fulfill #750, but I'm still a fair way from having time to put into building and testing that right now.

@jsturtevant
Copy link
Contributor Author

What's invalid about the HPC base-image's format now? It's in Docker format, rather than OCI format, but it looked valid as a base layer from reading the script that produces it. (I didn't run it up side-by-side with the Docker image manifest spec though.)

nothing was invalid but it wasn't importing. I figured out why though, it doesn't have an annotations so the image wasn't identified. If I run ctr image import --index-name=hpc .\windows-host-process-containers-base-image.tar it does import.

@jsturtevant
Copy link
Contributor Author

jsturtevant commented Mar 31, 2023

I don't understand what you mean by not working here. If the dummy files exist, then a check for them by path should pass.

I get the following when I put dummy files in the oci image. Now that I've figured out why the HPC image wasn't loading this does seem to work for HPC so I'm looking deeper into why its not working for me.

error="failed to extract layer sha256:cbd83a58d0840f98b9458176b4887b3413f06c61dfea6bd0277bce7d9ccbb3ba:
 Failed to safefile.OpenRelative failed in Win32: open 
\\\\?\\C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\1590\\Files\\Windows\\System32\\config\\DEFAULT: The system cannot find the path specified. (0x3) Files\\Windows\\System32\\config\\DEFAULT: unknown" key="extract-705276600-qBGq sha256:cbd83a58d0840f98b9458176b4887b3413f06c61dfea6bd0277bce7d9ccbb3ba"

The mutatedFiles stuff is an example where some data changes can't be avoided during the import process, so they are reverted back after importing. reapplyDirectoryTimes is the same idea for known-affected metadata. So this codebase already contains active effort to ensure Files/ matches the layer tarball's content.

I initially though by writing these files we could solve this problem but I see your concerns about the datetimes. If I can sort out the error above by passing blank files I think that can work for my use case, otherwise I guess I will need to mimic the wclayer calls in rust.

@TBBle
Copy link
Contributor

TBBle commented Apr 1, 2023

Are you building the tarball in code? My best guess for your error is that entries for each directory in the chain Files\Windows\System32\config have not yet been processed so the directory for this tar entry has not been created.

@TBBle
Copy link
Contributor

TBBle commented Apr 1, 2023

For reference, the bare-minimum base-layer tarball looks like this:

> tar tvf t.tar
d---------  0 0      0           0 Apr 01 10:31 Files
d---------  0 0      0           0 Apr 01 10:31 Files/Windows
d---------  0 0      0           0 Apr 01 10:31 Files/Windows/System32
d---------  0 0      0           0 Apr 01 10:31 Files/Windows/System32/config
----------  0 0      0        8192 Apr 01 10:31 Files/Windows/System32/config/DEFAULT
----------  0 0      0        8192 Apr 01 10:31 Files/Windows/System32/config/SAM
----------  0 0      0        8192 Apr 01 10:31 Files/Windows/System32/config/SECURITY
----------  0 0      0        8192 Apr 01 10:31 Files/Windows/System32/config/SOFTWARE
----------  0 0      0        8192 Apr 01 10:31 Files/Windows/System32/config/SYSTEM

That's created with

mkdir t\Files
wclayer makebaselayer t
wclayer export t -o t.tar
# Test that it imports again
wclayer import -i t.tar t2
# Mount-test
wclayer create -l t2 s2
wclayer mount -l t2 s2 m2
# Clean up
wclayer unmount s2 m2

I say "bare mimimum", but those are real (emtpy) hives. I tried with empty files, and it worked, but in the past mounting has failed so this might vary by OS build.

This is an empty-file equivalent to wclayer makebaselayer t3:

mkdir t3\Files\Windows\System32\config
New-Item t3\Files\Windows\System32\config\DEFAULT
New-Item t3\Files\Windows\System32\config\SAM
New-Item t3\Files\Windows\System32\config\SECURITY
New-Item t3\Files\Windows\System32\config\SOFTWARE
New-Item t3\Files\Windows\System32\config\SYSTEM

producing

> tar tvf .\t3.tar
d---------  0 0      0           0 Apr 01 10:34 Files
d---------  0 0      0           0 Apr 01 10:34 Files/Windows
d---------  0 0      0           0 Apr 01 10:34 Files/Windows/System32
d---------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config
----------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config/DEFAULT
----------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config/SAM
----------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config/SECURITY
----------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config/SOFTWARE
----------  0 0      0           0 Apr 01 10:35 Files/Windows/System32/config/SYSTEM

@jsturtevant
Copy link
Contributor Author

Are you building the tarball in code? My best guess for your error is that entries for each directory in the chain Files\Windows\System32\config have not yet been processed so the directory for this tar entry has not been created.

This was it. I was generating the file via code and the tar file didn't have the appropriate header files for the directories (TIL tar -t). I've added those entries explicitly and it is working. This works for me now and is equivalent to the HPC version now.

Thanks for helping me work through the issue. I will close this for now given the fact that this doesn't reproduce repeatable images as you pointed out.

@jsturtevant jsturtevant closed this Apr 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants