Synchronous flushing of bind-mount caches on docker stop #6512

Closed
1 of 2 tasks
rfay opened this issue Apr 30, 2020 · 5 comments

Comments

@rfay
Contributor

rfay commented Apr 30, 2020

Spin-off from #5530 (comment)

  • I have tried with the latest version of my channel (Stable or Edge)
  • I have uploaded Diagnostics
  • Diagnostics ID: (not possible to grab this, as the failure happens unpredictably in a CI environment)

Expected behavior

When a container is destroyed, one should expect that its bind-mount state is destroyed as well, and that it will not bleed into other containers started later.

Actual behavior

#5530 (comment) explains the sequence of events (a minimal code sketch follows the list):

These are sequential tests in a Golang test environment.

  1. There is a bind-mounted directory that has nginx configuration in it.
  2. Test 1 adds a configuration file into the mounted directory, and tests to make sure that the additional configuration is detected.
  3. Test 1 then stops the container and removes the extra configuration from the bind-mounted directory. It does this in a defer, so I can't see any way that cleanup could fail to run.
  4. Test 2 then starts a completely new container, which bind-mounts the same directory.
  5. Test 2 nginx startup tries to read the extra configuration (because it was apparently found in the directory) but fails to open it.
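For context, the pattern looks roughly like the Go sketch below; the directory path, container names, and nginx wiring are illustrative stand-ins, not the actual ddev test code.

```go
// Sketch of the two sequential tests described above (hypothetical names).
package bindmounts

import (
	"os"
	"os/exec"
	"path/filepath"
	"testing"
)

const confDir = "/tmp/nginx-conf" // hypothetical bind-mount source on the host

func runDocker(t *testing.T, args ...string) {
	t.Helper()
	if out, err := exec.Command("docker", args...).CombinedOutput(); err != nil {
		t.Fatalf("docker %v failed: %v\n%s", args, err, out)
	}
}

// Test 1: add an extra nginx config into the bind-mounted directory, verify it
// is detected, then stop the container and remove the file again in a defer.
func TestExtraConfigDetected(t *testing.T) {
	extra := filepath.Join(confDir, "extra.conf")
	if err := os.WriteFile(extra, []byte("# extra config\n"), 0o644); err != nil {
		t.Fatal(err)
	}
	runDocker(t, "run", "-d", "--name", "web1", "-v", confDir+":/etc/nginx/conf.d", "nginx")
	defer func() {
		runDocker(t, "rm", "-f", "web1") // stop/remove the container
		_ = os.Remove(extra)             // then remove the extra config
	}()
	// ... assert that nginx picked up extra.conf ...
}

// Test 2: a brand-new container bind-mounts the same directory. With the
// delayed cache flush, its readdir may still list extra.conf even though the
// file is gone on the host, so nginx fails to open it (ENOENT).
func TestFreshContainerSeesCleanDir(t *testing.T) {
	runDocker(t, "run", "-d", "--name", "web2", "-v", confDir+":/etc/nginx/conf.d", "nginx")
	defer runDocker(t, "rm", "-f", "web2")
	// ... assert that nginx started without the extra config ...
}
```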

@djs55 replied there that

I think your guess is probably right: currently, when a container exits, we flush the caches in the VM, but unfortunately this is a background task (triggered by an event from the docker engine). So at step (4), when test 2 starts a completely new container, the cache might not have been flushed yet. If this happens, the readdir will still show the old config file, but of course the file has actually been deleted, so the open will fail with ENOENT. Although your tests are sequential and doing things in the right order, the delayed cache flush makes it look like the tests have overlapped.

Although flushing caches in the background when we receive events handles the case of containers stopping by themselves reasonably well, I think we should investigate adding a synchronous cache flush on the docker stop codepath as well.
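To make the proposal concrete, here is a conceptual Go sketch of the difference; every name in it (flushMountCache, engineStop, the event handling) is a hypothetical placeholder, not Docker Desktop's real code.

```go
// Illustrative sketch only: placeholders, not Docker Desktop internals.
package flushsketch

// Event stands in for a docker engine event notification.
type Event struct {
	Type        string
	ContainerID string
}

// flushMountCache stands in for the VM-side invalidation of bind-mount caches.
func flushMountCache(containerID string) error { return nil }

// engineStop stands in for asking the docker engine to stop a container.
func engineStop(containerID string) error { return nil }

// Current behaviour (as described above): the flush is kicked off in the
// background when a container-exit event arrives, so the next container
// start can race with it.
func onContainerEvent(ev Event) {
	if ev.Type == "die" { // container-exit event; exact event name is illustrative
		go flushMountCache(ev.ContainerID) // background task; may finish "too late"
	}
}

// Proposed behaviour: make the docker stop codepath wait for the flush, so a
// container started afterwards sees a consistent view of the bind mount.
func stopContainer(containerID string) error {
	if err := engineStop(containerID); err != nil {
		return err
	}
	return flushMountCache(containerID) // synchronous flush before stop returns
}
```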

Information

  • This has generally happened on an older, slower test runner (a 2011 machine): Docker Desktop 2.2.0.5, 8 GB RAM, default Docker settings.

Steps to reproduce the behavior

So far this is intermittent and not reproducible on demand, although it happens often enough to be a real problem.

@rfay
Contributor Author

rfay commented May 15, 2020

This continues to be a real problem; in ddev/ddev#2261 I'm starting to skip tests that use this pattern because they fail too often (sketched in code after the list):

  • Start container with bind mount
  • Stop container with bind mount
  • Rename a subdirectory of the bind mount (on the host)
  • Wait/sleep, or even try to invalidate the cache using @djs55's special tool
  • Start container with bind mount again.
  • Windows denies the rename, because the host subdirectory still appears to be held by the Docker container (which was already stopped)
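A rough Go sketch of that pattern, with hypothetical paths and container names standing in for the real ddev test code:

```go
// Sketch of the failing stop/rename/restart pattern on Windows (hypothetical names).
package renametest

import (
	"os"
	"os/exec"
	"path/filepath"
	"testing"
	"time"
)

const mountDir = `C:\temp\project` // hypothetical bind-mount source; assumes mountDir\sub exists

func docker(t *testing.T, args ...string) {
	t.Helper()
	if out, err := exec.Command("docker", args...).CombinedOutput(); err != nil {
		t.Fatalf("docker %v: %v\n%s", args, err, out)
	}
}

func TestRenameAfterStop(t *testing.T) {
	docker(t, "run", "-d", "--name", "web", "-v", mountDir+":/var/www/html", "nginx")
	docker(t, "rm", "-f", "web") // stop and remove the container

	// Even with a wait (or a cache-invalidation attempt) here, the rename can
	// still be refused on Windows.
	time.Sleep(5 * time.Second)

	oldPath := filepath.Join(mountDir, "sub")
	newPath := filepath.Join(mountDir, "sub-renamed")
	if err := os.Rename(oldPath, newPath); err != nil {
		t.Fatalf("rename refused even though the container is gone: %v", err)
	}

	// Start a fresh container on the same bind mount.
	docker(t, "run", "-d", "--name", "web2", "-v", mountDir+":/var/www/html", "nginx")
	docker(t, "rm", "-f", "web2")
}
```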

@mat007
Member

mat007 commented May 16, 2020

Thanks for the report.
It’s quite difficult to dig into this without diagnostics. It’s possible to gather diagnostics from the command line using:

# "C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe"
Please specify a command.

USAGE: com.docker.diagnose.exe [options] COMMAND [options]

Gather and upload diagnostics bundles for Docker Desktop.
Commands:
  gather
  upload

Options:
  -path string
        Local path prefix where to store the diagnostics bundle.

Run 'com.docker.diagnose.exe COMMAND help' for more information on the command

Common use cases:
- Generate a diagnostics bundle and upload:
    com.docker.diagnose.exe gather -upload
  This will print a diagnostics ID you can supply for further troubleshooting.

- Generate a local diagnostics bundle to upload later:
    com.docker.diagnose.exe gather
  This will print a diagnostics ID you can use to upload with.
    com.docker.diagnose.exe submit <ID>

- Generate a local diagnostics bundle:
    com.docker.diagnose.exe gather <file>
  This will generate a local bundle in <file>.zip.

If -path is specified, place the diagnostics bundle in that directory. Otherwise the system default temporary directory will be used.

Maybe it could be run when a test fails?
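If it helps, one way to hook that into the Go tests might look like the sketch below (a guess at the wiring, not ddev's actual code): register a t.Cleanup that runs com.docker.diagnose.exe gather -upload only when a test has failed, and log the diagnostics ID it prints.

```go
// Sketch: gather and upload Docker Desktop diagnostics when a test fails.
package diagtest

import (
	"os/exec"
	"testing"
)

const diagnoseExe = `C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe`

// gatherDiagnosticsOnFailure registers a cleanup that uploads a diagnostics
// bundle if the test has failed by the time it finishes.
func gatherDiagnosticsOnFailure(t *testing.T) {
	t.Helper()
	t.Cleanup(func() {
		if !t.Failed() {
			return
		}
		out, err := exec.Command(diagnoseExe, "gather", "-upload").CombinedOutput()
		if err != nil {
			t.Logf("failed to gather diagnostics: %v", err)
		}
		t.Logf("docker diagnostics output:\n%s", out) // includes the diagnostics ID
	})
}

func TestSomethingFlaky(t *testing.T) {
	gatherDiagnosticsOnFailure(t)
	// ... the flaky bind-mount test body ...
}
```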

@rfay
Contributor Author

rfay commented May 16, 2020

Since I had already implemented @djs55's workaround and struggled with intermittent test failures for a couple of weeks, I just disabled the tests on Windows, not really willing to put more time into it. As indicated in the OP, @djs55 seemed to think this was a logical outcome of the current architecture, where

when a container exits we flush the caches in the VM but unfortunately this is a background task (triggered by an event from the docker engine)

@docker-robott
Collaborator

Issues go stale after 90 days of inactivity.
Mark the issue as fresh with a /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.

Prevent issues from auto-closing with a /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Oct 13, 2020