Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frequent cache access error warnings on Win32 #169

Closed
schlenk opened this issue Dec 3, 2021 · 8 comments · Fixed by #178
Closed

Frequent cache access error warnings on Win32 #169

schlenk opened this issue Dec 3, 2021 · 8 comments · Fixed by #178
Labels
bug Something isn't working performance Something isn't as fast or responsive as it should be. plat:windows Issues reported on Windows

Comments

@schlenk
Copy link

schlenk commented Dec 3, 2021

Bug description

The cache writing code frequently runs into WinError 32 warnings due to parallel access by different processes.
This might be due to the typical Windows Defender antivirus interfering with a create & rename operation. Files in use may not be renamed on windows.

Reproduction steps

Run pip-audit --local on a Windows machine with the default Defender Antivirus enabled.

Expected behavior

No warnings, at least one retry.

Screenshots and logs

DEBUG:pip_audit._cli:parsed arguments: Namespace(local=True, requirements=None, format=<OutputFormatChoice.Columns: 'columns'>, vulnerability_service=<VulnerabilityServiceChoice.Pypi: 'pypi'>, dry_run=False, strict=False, desc=<VulnerabilityDescriptionChoice.Auto: 'auto'>, cache_dir=WindowsPath('c:/temp/audit'), progress_spinner=<ProgressSpinnerChoice.On: 'on'>, timeout=15)
- Auditing zope.interface (5.4.0)
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/pypi/alabaster/0.7.12/json" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pypi.org:443
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /pypi/alabaster/0.7.12/json HTTP/1.1" 200 6108
DEBUG:cachecontrol.controller:Updating cache with response from "https://pypi.org/pypi/alabaster/0.7.12/json"
DEBUG:cachecontrol.controller:etag object cached for 1209600 seconds
DEBUG:cachecontrol.controller:Caching due to etag
WARNING:pip_audit._service.pypi:Failed to write to cache directory, performance may be degraded: [WinError 32] Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird: 'c:\\temp\\audit\\8\\8\\c\\3\\c\\tmpfibz6hh_' -> 'c:\\temp\\audit\\8\\8\\c\\3\\c\\88c3c4038e9abf700f212a9b6a8b1294aca8991fc28f185b1fd2a782'
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/pypi/argon2-cffi/21.1.0/json" in the cache
DEBUG:cachecontrol.controller:No cache entry available

Platform information

  • OS name and version:
  • pip-audit version (pip-audit -V): pip-audit 1.0.1
  • Python version (python -V or python3 -V): Python 3.9.7
  • pip version (pip -V or pip3 -V): pip 21.3.1

Additional context

Maybe using https://pypi.org/project/atomicwrites/ would help?

@schlenk schlenk added the bug-candidate Might be a bug. label Dec 3, 2021
@di
Copy link
Member

di commented Dec 3, 2021

@schlenk What's the output of pip cache dir here?

@woodruffw woodruffw added the plat:windows Issues reported on Windows label Dec 3, 2021
@schlenk
Copy link
Author

schlenk commented Dec 3, 2021

It should not matter, as it also happens with --cache option.

The pip cache dir is in the typical user profile location:
c:\users\msc\appdata\local\pip\cache

The traceback was captured with --cache c:\temp\audit given.

@woodruffw
Copy link
Member

Interesting that this is a problem -- we shouldn't be doing any parallel access, although I guess we do technically have nested serial access by virtue of running pip within pip-audit whenever we're in requirements mode. But the mode you specified (environment + --local) shouldn't be triggering that case.

@woodruffw woodruffw added bug Something isn't working and removed bug-candidate Might be a bug. labels Dec 3, 2021
@schlenk
Copy link
Author

schlenk commented Dec 3, 2021

The parallel access is probably from the outside (e.g. virus scanner), but as Windows Defender is default on by now, the code should handle that.

I would suspect that this code is vulnerable to a race condition:

      with NamedTemporaryFile(delete=False, dir=os.path.dirname(name)) as io:
            io.write(value)

            # NOTE(ww): Similar to what `pip` does in `adjacent_tmp_file`.
            io.flush()
            os.fsync(io.fileno())

            os.replace(io.name, name)

The os.fsync() might take long enough so the virus scanner notices the new file, then it races against the os.replace(). If any other process holds a file handle for the named temporary file, the os.replace() will fail and should be retried with a tiny delay like 50ms or so.

@woodruffw
Copy link
Member

That makes sense, thanks for pointing that out. I'm a little bit hesitant to go with a hardcoded delay + retry loop, since it turns a failure mode into a potential deadlock (in the unlikely event that Windows Defender or whatever other AV refuses to give up the file handle.)

I'll see if there's a way we can introduce additional creation/access flags here to prevent Defender from trying to obtain access.

@schlenk
Copy link
Author

schlenk commented Dec 3, 2021

The retry loop should probably just retry 1 or 2 times and simply run in the error path afterwards. And yes, the deadlock happens otherwise.
It may be possible to avoid the issue by setting up some share mode like 0x0, but didn't try here. (https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea see dwShareMode ).

@woodruffw
Copy link
Member

Yeah, in an ideal mode we'd use dwShareMode, but it looks like Python's (uniform) file APIs don't expose it yet.

We'll try the retry-with-error-path approach for now. I'll have a PR ready in a moment.

@woodruffw woodruffw added the performance Something isn't as fast or responsive as it should be. label Dec 3, 2021
@schlenk
Copy link
Author

schlenk commented Dec 4, 2021

Yes, sadly https://bugs.python.org/issue15244 is still in limbo. And the SHARE_MODE seems to be the problem here, as the code tries to rename the file held open by the same process, which only works on Windows when the file is opened with SHARE_MODE_DELETE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance Something isn't as fast or responsive as it should be. plat:windows Issues reported on Windows
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants