go-winio fails to import layers with a file larger than 8gB #43

Closed
TBBle opened this issue Jul 8, 2020 · 8 comments

@TBBle

TBBle commented Jul 8, 2020

Mirroring moby/moby#40444 as it appears to be an HCS-level issue.


Description

When trying to create a container on Windows with a file larger than 8 GB (my test case is 8,589,934,592 bytes of random data), Docker fails to commit the layer, giving an error along the lines of

re-exec error: exit status 1: output: write \\?\C:\ProgramData\docker\tmp\hcs816749287\Files\UnrealEngine\LocalBuilds\Engine\Windows\Engine\Plugins\Experimental\BlastPlugin\Binaries\Win64\UE4Editor-BlastAuthoring.pdb: There is not enough space on the disk.

or

hcsshim::ImportLayer - failed failed in Win32: The system cannot find the path specified. (0x3)


Reproduced on Windows Server LTSC 2019, and Windows 10 1903, 1909, and 2004.

See moby/moby#40444 for more details and reproduction steps, and a log of the hcsshim calls made by Docker in the reproduction case.

If necessary, I could try to produce a reproduction case that calls https://github.com/microsoft/hcsshim directly.

@triage-new-issues triage-new-issues bot added the triage New and needs attention label Jul 8, 2020
@immuzz immuzz added App Compatibility Cross-platform compatibility assurance and removed triage New and needs attention labels Jul 15, 2020
@immuzz

immuzz commented Jul 15, 2020

@weijuans-msft will look into this

@TBBle
Author

TBBle commented Jul 27, 2020

Reproduction case using wclayer utility from microsoft/hcsshim#852.

Given an 8 GB file named 8gfile, e.g.

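# Write 8 GiB (8 blocks of 1 GiB) of random data to 8gfile in the current directory.
# .NET resolves relative paths against the process working directory, which may
# differ from the PowerShell location, so build a full path from $PWD.
$writer = [System.IO.File]::OpenWrite((Join-Path $PWD '8gfile'))
$random = New-Object Random
$blockSize = 1073741824  # 1 GiB
$bytes = New-Object byte[] $blockSize
for ($i = 0; $i -lt 8; $i++)
{
    $random.NextBytes($bytes)
    $writer.Write($bytes, 0, $blockSize)
}
$writer.Close()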

and mcr.microsoft.com/windows/nanoserver:2004 extracted to C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763 (any base layer should work),

and plenty of free disk space (there will be up to three copies of the 8 GB file around by the time we're done),

then, as Administrator:

# Create a new sandbox layer based on mcr.microsoft.com/windows/nanoserver:2004
wclayer create scratch -l C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763
mkdir scmount
# Activate the sandbox layer, and mount the volume to a directory
wclayer mount scratch scmount -l C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763
copy 8gfile scmount
# Confirm it's there, and 8gB
dir scmount
# Unmount the volume mount point and deactivate the read-write volume
wclayer unmount scratch scmount
# Confirm the sandbox.vhdx is now just over 8gB.
dir scratch
# Export the layer as an OCI image layer tarball
# During this process you can see in %TEMP%\hcs* that it extracts the modified files to disk, i.e. just Files/8gfile plus some supporting data, and then tarballs that directory up.
wclayer export scratch -l C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763 -o 8gtest.tar
# Now reimport the layer as a new read-only layer
mkdir import
wclayer import import -l C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763 -i .\8gtest.tar

This last step fails with

archive/tar: invalid tar header

The 8gtest.tar looks fine to me, in the Windows-shipped tar:

PS> tar --version
bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.5.f-ipp
PS> tar tvf 8gtest.tar
d---------  0 0      0           0 Jul 27 17:25 Files
----------  0 0      0  8589934592 Jul 27 17:28 Files/8gfile
d---------  0 0      0           0 Jul 27 17:25 Hives
----------  0 0      0        8192 Jul 27 17:25 Hives/DefaultUser_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Sam_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Security_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Software_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/System_Delta

Now, if I replace the vendored Microsoft/go-winio with one from my PR to use Go's current archive/tar, and rebuild wclayer, and rerun just the last step, it works.

Go releases up to 1.7 failed to read PAX archives containing a file larger than 8 GB (this was fixed in Go 1.8), so this is consistent.

So it seems I have already posted the fix as microsoft/go-winio#175 (replacing the archive/tar forked from Go 1.6). It just needs to be approved, then https://github.com/microsoft/hcsshim needs to be updated to the new version, and then the upstream users (https://github.com/moby/moby and https://github.com/containerd/containerd that I know of) need to be updated as well.

@TBBle TBBle changed the title from "Docker fails to create Windows Containers with a file larger than 8gB" to "hcsshim fails to import layers with a file larger than 8gB" Jul 27, 2020
@TBBle
Author

TBBle commented Jul 27, 2020

On top of my last comment, it also looks like there are issues with the gzip stream produced by wclayer. If I use the wclayer build without the upgraded go-winio and instead export with:

.\wclayer export scratch -l C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\763 -o 8gtest.tgz -z

We get:

PS> dir 8gtest.tgz
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---           27/7/2020  6:14 pm     8589988864 8gtest.tar
-a---           27/7/2020  7:01 pm     8592549114 8gtest.tgz

So the .tgz is slightly larger than the .tar, which kind of makes sense: the only substantial thing in the tar file is 8 GB of random data, so it won't compress well.
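
As a small-scale illustration of that size observation (not code from wclayer), the Go sketch below compresses 1 MiB of pseudo-random bytes and 1 MiB of zeros with compress/gzip. The helper names randomBytes and gzipSize are made up for this example, and the fixed seed is only there to make runs repeatable.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"math/rand"
)

// randomBytes (hypothetical helper) returns n pseudo-random bytes from a
// fixed seed so that repeated runs see the same input.
func randomBytes(n int) []byte {
	data := make([]byte, n)
	rand.New(rand.NewSource(1)).Read(data)
	return data
}

// gzipSize (hypothetical helper) compresses data at the default level and
// returns the size of the complete gzip stream in bytes.
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	// Close writes the final deflate block and the gzip trailer.
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	const n = 1 << 20 // 1 MiB keeps the example fast

	inputs := []struct {
		label string
		data  []byte
	}{
		{"random", randomBytes(n)},
		{"zeros", make([]byte, n)},
	}
	for _, in := range inputs {
		size, err := gzipSize(in.data)
		if err != nil {
			fmt.Println(in.label, "failed:", err)
			continue
		}
		fmt.Printf("%s: %d bytes in, %d bytes out\n", in.label, len(in.data), size)
	}
}
```

Random input should come out slightly larger than it went in, because gzip adds framing but has no redundancy to remove, while the zeros shrink dramatically.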

Windows-shipped tar does not like the compressed version:

PS C:\Users\paulh\Documents\BuildKit\8gtest> tar tzvf .\8gtest.tgz
d---------  0 0      0           0 Jul 27 17:25 Files
----------  0 0      0  8589934592 Jul 27 17:28 Files/8gfile
tar.exe: Truncated input file (needed 8589934592 bytes, only 0 available)
tar.exe: Error exit delayed from previous errors.

To verify that the gzip stream itself was faulty, and to rule out a bug in tar, I tried some other archivers, starting with unpigz, which I happened to have lying around:

PS C:\Users\paulh\Documents\BuildKit\8gtest> unpigz.exe -t .\8gtest.tgz
unpigz: abort: corrupted input -- invalid deflate data: .\8gtest.tgz

although I can't completely rule out faulty tools, since:

> pigz.exe -c .\8gtest.tar > 8gtest.pig.tgz
C:\Users\paulh\bin\pigz.exe: abort: .\8gtest.tar too large -- not compiled with large file support

So I also tried 7z:

PS> & 'C:\Program Files\7-Zip\7z' t .\8gtest.tgz

7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Scanning the drive for archives:
1 file, 8592549114 bytes (8195 MiB)

Testing archive: .\8gtest.tgz
--
Path = .\8gtest.tgz
Type = gzip
Headers Size = 10

ERROR: Unexpected end of data : 8gtest.tar

Sub items Errors: 1

Archives with Errors: 1

Sub items Errors: 1

So I finally recompressed my working 8gtest.tar from the previous comment using 7-Zip (named .gz, not .tgz, to avoid 7-Zip creating a tarball containing my tarball):

PS> & 'C:\Program Files\7-Zip\7z' a 8gtest.7z.gz .\8gtest.tar

7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Scanning the drive:
1 file, 8589988864 bytes (8193 MiB)

Creating archive: 8gtest.7z.gz

Add new data to archive: 1 file, 8589988864 bytes (8193 MiB)


Files read from disk: 1
Archive size: 8593748051 bytes (8196 MiB)
Everything is Ok
PS> dir 8gtest.*
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---           27/7/2020  7:30 pm     8593748051 8gtest.7z.gz
-a---           27/7/2020  6:14 pm     8589988864 8gtest.tar
-a---           27/7/2020  7:01 pm     8592549114 8gtest.tgz

(slightly bigger than the one produced by compress/gzip)

and now Windows-shipped tar is happy:

PS> tar tzvf .\8gtest.7z.gz
d---------  0 0      0           0 Jul 27 17:25 Files
----------  0 0      0  8589934592 Jul 27 17:28 Files/8gfile
d---------  0 0      0           0 Jul 27 17:25 Hives
----------  0 0      0        8192 Jul 27 17:25 Hives/DefaultUser_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Sam_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Security_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/Software_Delta
----------  0 0      0        8192 Jul 27 17:25 Hives/System_Delta

and so is unpigz, suggesting it's only filesystem operations that my build doesn't have large-file support for.

PS> unpigz.exe -t .\8gtest.7z.gz
PS> 

I see the same thing with the winio-upgraded wclayer, which makes sense, since both of them work by simply wrapping compress/gzip.NewWriter around the writer it otherwise creates, suggesting a flaw in Go's own compress/gzip.

However, this code-path is specific to the wclayer utility itself, so should not affect either Docker or containerd, unless they also have the same bug in their own compression routines, which would be shared between Linux and Windows anyway.

@ghost

ghost commented Sep 17, 2020

This issue has been open for 30 days with no updates.
@weijuans-msft, please provide an update or close this issue.

@TBBle
Author

TBBle commented Sep 17, 2020

Update from my side: microsoft/go-winio#175 has been merged, and there are outstanding PRs for Docker Engine (moby/moby#41430) and the hcsshim Go library (microsoft/hcsshim#876) to upgrade to the fixed version of go-winio.

I'm hopeful that the revendoring can be backported into Docker Engine 19.03 as a bug-fix, but have not had any indication from the maintainers as to whether this can happen.

There is also an outstanding PR for containerd (containerd/containerd#4395) to move from in-tree code to using go-winio, but that doesn't change the status of this issue, as the in-tree code did not have this bug.

Basically, all PRs are waiting for project maintainer feedback and/or acceptance to progress.

Edit: The PR for Docker Engine has been merged to master, and is marked for the (increasingly-inaccurately-named) 20.03 milestone.

@TBBle TBBle changed the title from "hcsshim fails to import layers with a file larger than 8gB" to "go-winio fails to import layers with a file larger than 8gB" Oct 3, 2020
@ghost

ghost commented Nov 2, 2020

This issue has been open for 30 days with no updates.
@weijuans-msft, please provide an update or close this issue.

@TBBle
Author

TBBle commented Nov 3, 2020

hcsshim 0.8.10 and Docker Engine 20.10.0-beta1, e.g. from Docker Desktop Edge 2.4.2.0, have the fix merged.

I'm currently tracking another possible size-related issue in Docker Engine 20.10.0-beta1, but I don't believe it's the same problem as this one, as it seems to be more related to total size copied than individual image size. (I also don't have a local reproduction for it yet, for unrelated reasons.)

So we could close this now, or wait for a full release of Docker Engine 20.10.0.

@weijuans-msft

Thanks. Close this now
