Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo test on cargo itself fails spuriously on Windows #12042

Closed
jyn514 opened this issue Apr 27, 2023 · 20 comments · Fixed by rust-lang/rust#110918
Closed

cargo test on cargo itself fails spuriously on Windows #12042

jyn514 opened this issue Apr 27, 2023 · 20 comments · Fixed by rust-lang/rust#110918
Labels
C-bug Category: bug S-triage Status: This issue is waiting on initial triage.

Comments

@jyn514
Copy link
Member

jyn514 commented Apr 27, 2023

Problem

cargo test is failing intermittently for me on Windows.

PS C:\Users\jyn\src\cargo> cargo t --test testsuite warn_on_failure::no_warning_on_bin_failure
    Finished test [unoptimized + debuginfo] target(s) in 0.25s
     Running tests\testsuite\main.rs (target\debug\deps\testsuite-96440e92f452dc25.exe)

running 1 test
test warn_on_failure::no_warning_on_bin_failure ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2811 filtered out; finished in 0.82s

PS C:\Users\jyn\src\cargo> cargo t --test testsuite warn_on_failure::no_warning_on_bin_failure
    Finished test [unoptimized + debuginfo] target(s) in 0.26s
     Running tests\testsuite\main.rs (target\debug\deps\testsuite-96440e92f452dc25.exe)

running 1 test
test warn_on_failure::no_warning_on_bin_failure ... FAILED

failures:

---- warn_on_failure::no_warning_on_bin_failure stdout ----
thread 'warn_on_failure::no_warning_on_bin_failure' panicked at 'failed to remove "C:\\Users\\jyn\\src\\cargo\\target\\tmp\\cit\\t0": Os { code: 
5, kind: PermissionDenied, message: "Access is denied." }', crates\cargo-test-support\src\paths.rs:146:17
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/f5559e338256f17ada6d82b429acc2dbd8facc9c/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/f5559e338256f17ada6d82b429acc2dbd8facc9c/library\core\src\panicking.rs:67
   2: cargo_test_support::paths::impl$1::rm_rf
             at .\crates\cargo-test-support\src\paths.rs:146
   3: cargo_test_support::paths::init_root
             at .\crates\cargo-test-support\src\paths.rs:81
   4: testsuite::warn_on_failure::no_warning_on_bin_failure
             at .\tests\testsuite\warn_on_failure.rs:77
   5: testsuite::warn_on_failure::no_warning_on_bin_failure::closure$0
             at .\tests\testsuite\warn_on_failure.rs:77
   6: core::ops::function::FnOnce::call_once<testsuite::warn_on_failure::no_warning_on_bin_failure::closure_env$0,tuple$<> >
             at /rustc/f5559e338256f17ada6d82b429acc2dbd8facc9c\library\core\src\ops\function.rs:250
   7: core::ops::function::FnOnce::call_once
             at /rustc/f5559e338256f17ada6d82b429acc2dbd8facc9c/library\core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    warn_on_failure::no_warning_on_bin_failure

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 2811 filtered out; finished in 0.15s

Steps

Run cargo t --test testsuite warn_on_failure::no_warning_on_bin_failure on windows MSVC until you get a test failure. This usually only takes 2 or 3 run to fail for me, and once it fails once it will continue to fail until I run rm -r -force target\\tmp\\cit\\t0.

I am not sure what -force is doing that CargoPathExt::rm_rf isn't. Without force, rm gives the same error as cargo. https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/remove-item?view=powershell-7.3#example-3-delete-hidden-read-only-files says it might be related to the index being read-only?

Some more info that may or may not be helpful:

PS C:\Users\jyn\src\cargo> rm -r C:\\Users\\jyn\\src\\cargo\\target\\tmp\\cit\\t0
rm : Cannot remove item C:\Users\jyn\src\cargo\target\tmp\cit\t0\home\AppData\Local\Microsoft\Windows\INetCache\IE\NJD6DELP: You do not have 
sufficient access rights to perform this operation.
At line:1 char:1
+ rm -r C:\\Users\\jyn\\src\\cargo\\target\\tmp\\cit\\t0
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : PermissionDenied: (NJD6DELP:DirectoryInfo) [Remove-Item], IOException
    + FullyQualifiedErrorId : RemoveFileSystemItemUnAuthorizedAccess,Microsoft.PowerShell.Commands.RemoveItemCommand
rm : Cannot remove item C:\Users\jyn\src\cargo\target\tmp\cit\t0\home\AppData\Local\Microsoft\Windows\INetCache\IE\container.dat: You do not     
have sufficient access rights to perform this operation.
At line:1 char:1
+ rm -r C:\\Users\\jyn\\src\\cargo\\target\\tmp\\cit\\t0
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : PermissionDenied: (68ce435e0a43416537b0fee6dba5d9f338e038:FileInfo) [Remove-Item], IOException
    + FullyQualifiedErrorId : RemoveFileSystemItemUnAuthorizedAccess,Microsoft.PowerShell.Commands.RemoveItemCommand
rm : Directory C:\Users\jyn\src\cargo\target\tmp\cit\t0\registry\.git\objects\c9 cannot be removed because it is not empty.
At line:1 char:1

that .git/objects path apparently doesn't have write permissions set:

PS C:\Users\jyn\src\cargo> ls C:\Users\jyn\src\cargo\target\tmp\cit\t0\registry\.git\objects\  


    Directory: C:\Users\jyn\src\cargo\target\tmp\cit\t0\registry\.git\objects


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         4/27/2023  12:44 AM                25
d-----         4/27/2023  12:44 AM                26
d-----         4/27/2023  12:44 AM                51
d-----         4/27/2023  12:44 AM                67
d-----         4/27/2023  12:44 AM                9e
d-----         4/27/2023  12:44 AM                a5
d-----         4/27/2023  12:44 AM                bd
d-----         4/27/2023  12:44 AM                c9

I tried switching to https://github.com/XAMPPRocky/remove_dir_all, but it didn't help.

Possible Solution(s)

No response

Notes

No response

Version

8bcf8fe

@jyn514 jyn514 added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels Apr 27, 2023
@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

also, this isn't a bug necessarily, but many other tests will fail unless core.symlinks is set to true.

@ChrisDenton
Copy link
Member

ChrisDenton commented Apr 27, 2023

This seems related to #11442 and #11872 in that cargo (currently) uses its own logic for removing directory trees. Which has nicer error output but doesn't handle all the edge cases on different platforms.

In this particular case, git is setting the readonly attribute, which prevents deletion under win32 semantics. I guess they were expecting it to work like the unix permission. Things like rm -r -force and remove_dir_all essentially ignore the attribute.

EDIT: Oh wait: switching to the remove_dir_all crate didn't help? Does std::remove_dir_all work? I would have expected it to.

@ehuss
Copy link
Contributor

ehuss commented Apr 27, 2023

This is the key part: cit\t0\home\AppData\Local\Microsoft\Windows\INetCache\IE\NJD6DELP

I've been having this same problem on my system, but I was thinking it was something unique to my environment. I'm glad to see it's not just me. 😄

Something is creating this AppData directory in the mock home directory:

./AppData
./AppData/Local
./AppData/Local/Microsoft
./AppData/Local/Microsoft/EdgeBho
./AppData/Local/Microsoft/EdgeBho/IEToEdge
./AppData/Local/Microsoft/Feeds Cache
./AppData/Local/Microsoft/Internet Explorer
./AppData/Local/Microsoft/Internet Explorer/DOMStore
./AppData/Local/Microsoft/Internet Explorer/EmieSiteList
./AppData/Local/Microsoft/Internet Explorer/EmieUserList
./AppData/Local/Microsoft/Windows
./AppData/Local/Microsoft/Windows/History
./AppData/Local/Microsoft/Windows/History/History.IE5
./AppData/Local/Microsoft/Windows/History/History.IE5/MSHist012023032020230327
./AppData/Local/Microsoft/Windows/History/History.IE5/MSHist012023032720230403
./AppData/Local/Microsoft/Windows/History/History.IE5/MSHist012023040320230410
./AppData/Local/Microsoft/Windows/History/History.IE5/MSHist012023041520230416
./AppData/Local/Microsoft/Windows/IECompatCache
./AppData/Local/Microsoft/Windows/IECompatUaCache
./AppData/Local/Microsoft/Windows/IEDownloadHistory
./AppData/Local/Microsoft/Windows/INetCache
./AppData/Local/Microsoft/Windows/INetCache/Content.IE5
./AppData/Local/Microsoft/Windows/INetCache/IE
./AppData/Local/Microsoft/Windows/INetCache/IE/container.dat
./AppData/Local/Microsoft/Windows/INetCache/IE/Z6I2XOZV
./AppData/Local/Microsoft/Windows/INetCache/IE/Z6I2XOZV/dyntelconfig.json[1].cache

I've spent a fairly large amount of time searching on the internet for what could possibly be causing this, but I have not had much luck.

Some of the searches I've had refer to InetCache as being a directory junction. But AFAICT, mine is not a junction.

There is something funky about these directories. Most tools (like pwsh or cmd or explorer) can't even view the contents of the InetCache folder. Somehow bash is able to read it, and indicates there is a link:

lrwxrwxrwx 1 eric 87 Apr 23 12:16 Content.IE5 -> /d/Proj/rust/cargo/target/tmp/cit/t29/home/AppData/Local/Microsoft/Windows/INetCache/IE/
drwxr-xr-x 1 eric  0 Apr 23 12:16 IE/

I've checked permissions, and I don't see why. I saw some comments about "Virtual Folders" in some searches, but I'm not sure if that is the case here.

I would love if someone could help figure this out. I've been hitting this for a very long time, and it has been driving me batty.

The creation of these directories seems semi-random, and I can't find a correlation to what might be causing them.


@jyn514 Can you include the complete version information of cargo -Vv in your report. In particular, the OS info might be relevant.


@ChrisDenton This is using std::fs::remove_dir_all here.

@ehuss
Copy link
Contributor

ehuss commented Apr 27, 2023

One theory is that it is the VCTIP (VC++ Technology Improvement Program). I tried turning off telemetry, but it didn't seem to fix it. 😦 Maybe I didn't turn it off in the correct place?

@ChrisDenton
Copy link
Member

ChrisDenton commented Apr 27, 2023

Ah I see! My workflow here would be to use Process Monitor to record events while cargo's tests are running and to apply filers for the file/directory names.

My first assumption here would be that something in cargo's tests is loading a dll related to web stuff.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

@jyn514 Can you include the complete version information of cargo -Vv in your report. In particular, the OS info might be relevant.

PS C:\Users\jyn\src\cargo> cargo -vV
cargo 1.71.0-nightly (de80432f0 2023-04-21)
release: 1.71.0-nightly
commit-hash: de80432f04da61d98dcbbc1572598071718ccfd2
commit-date: 2023-04-21
host: x86_64-pc-windows-msvc
libgit2: 1.6.3 (sys:0.17.0 vendored)
libcurl: 8.0.1-DEV (sys:0.4.61+curl-8.0.1 vendored ssl:Schannel)
os: Windows 10.0.19045 (Windows 10 Pro) [64-bit]

I can try running process monitor in a bit. Is there a Windows version of strace I can use to see what Remove-Item -Force is doing under the hood? Ideally cargo could just do the same and we wouldn't have to care if these directories get created.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

Oh I see, ProcMon is the Windows version of strace. I'll see what I can find.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

@ehuss I don't think the version of cargo in my default toolchain is relevant. I can reproduce this just by running target\debug\deps\testsuite-96440e92f452dc25.exe warn_on_failure::no_warning_on_bin_failure twice in a row.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

Here's the files the test binary touches; as you can see it never creates AppData: image

However, if I include other processes, some of them do touch AppData:
image

I'm not sure why powershell is running at all, or why it's creating these folders.

@ehuss
Copy link
Contributor

ehuss commented Apr 27, 2023

It might be because powershell is part of the telemetry system. Do you have VSCode running? That has a separate telemetry system from Visual Studio.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

I do. Let me close everything I have open and see what still pops up.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

Now something called VCTIP.exe is creating the directories.

image

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

Hmm, I set HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VSCommon\17.0\SQM\OptIn to 0 as suggested in https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2017/ide/visual-studio-experience-improvement-program?view=vs-2017, but I'm still seeing the folder created by VCTIP, even after a restart.

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

Is there a Windows version of strace I can use to see what Remove-Item -Force is doing under the hood? Ideally cargo could just do the same and we wouldn't have to care if these directories get created.

Here's what it's doing on the inner-most file (the last one that doesn't get removed if I don't pass -Force):
image

Compare that to what it does to a normal file (without -Force):
image

Those ... look the same to me? Not sure what I'm missing.

edit: more details

without Force: image

with -Force:
image

@jyn514
Copy link
Member Author

jyn514 commented Apr 27, 2023

ls shows the directory as a "hidden system" file:

PS C:\Users\jyn\src\cargo> ls "C:\Users\jyn\src\cargo\target\tmp\cit\t0\home\AppData\Local\Microsoft\" -h


    Directory: C:\Users\jyn\src\cargo\target\tmp\cit\t0\home\AppData\Local\Microsoft


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d--hs-         4/27/2023   6:20 PM                Feeds Cache

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-childitem?view=powershell-7.3#example-1-get-child-items-from-a-file-system-directory

@ChrisDenton
Copy link
Member

Ok, I'm at a computer now. I'm not sure if these work to disable telemetry completely but on my own machine I have these two environment variables set globally:

Key Value
VSCMD_SKIP_SENDTELEMETRY 1
VCPKG_KEEP_ENV_VARS VSCMD_SKIP_SENDTELEMETRY

The directory in question seem to have an explicit "Deny" permission of listing the contents of a directory. It seems our remove_dir_all code is returning immediately on error instead of attempting to just delete the directory. That is definitely something that can and should be fixed.

@ehuss
Copy link
Contributor

ehuss commented Apr 28, 2023

I confirmed on my system that it is VCTIP as well.

The item that seems to be the most trouble is InetCache\Content.IE5. In my testing, the important parts are that it is a directory junction, and that its ACL is "Everyone Deny". I can reproduce that and confirm that Rust's remove_dir_all fails, but powershell rm -Force succeeds.

Unless I'm mistaken, I think PowerShell just uses the facilities defined here. However, I can't quite follow from that what is unique about being a Reparse Point + having a restricted ACL.

(The Content.IE5 link also is System+Hidden, but that doesn't seem to matter in my testing.)

@ChrisDenton
Copy link
Member

Linked above is a fix. If anyone were able to test test it then that would be grand. I did so using the following little app compiled first without the patch (which failed) and then with (which worked):

fn main() {
    let path = std::env::args().nth(1).unwrap();
    println!("deleting {path}");
    std::fs::remove_dir_all(&path).unwrap();
}

@ehuss
Copy link
Contributor

ehuss commented Apr 28, 2023

I confirmed that rust-lang/rust#110918 works with deleting the troublesome directories.

I'm going to close this issue as this seems to be a combination of external factors (VCTIP creating weird files and fs::remove_dir_all not being able to remove them). Solutions will hopefully be the linked PR, and disabling VCTIP here (I'm not sure if just the registry entry is enough, or if VSCMD_SKIP_SENDTELEMETRY is also required).

@ehuss ehuss closed this as completed Apr 28, 2023
@ehuss
Copy link
Contributor

ehuss commented Apr 28, 2023

BTW, thanks @jyn514 and @ChrisDenton for helping with this, it is greatly appreciated!

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 28, 2023
…r=cuviper

`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied

If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.

Fixes rust-lang/cargo#12042
RalfJung pushed a commit to RalfJung/miri that referenced this issue Apr 28, 2023
`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied

If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.

Fixes rust-lang/cargo#12042
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Jul 18, 2023
`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied

If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.

Fixes rust-lang/cargo#12042
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 20, 2024
`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied

If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.

Fixes rust-lang/cargo#12042
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied

If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.

Fixes rust-lang/cargo#12042
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug S-triage Status: This issue is waiting on initial triage.
Projects
None yet
3 participants