Windows builds broken #3929

Closed
richardlau opened this issue Oct 9, 2024 · 12 comments

@richardlau
Member

It looks like the Windows test jobs have been failing since this morning.

It looks like they're timing out trying to connect to the binary_tmp git repo:
e.g. https://ci.nodejs.org/job/node-test-binary-windows-js-suites/30748/RUN_SUBSET=0,nodes=win10-COMPILED_BY-vs2022/console

10:50:37 > git fetch --prune --no-tags [email protected]:binary_tmp.git +refs/heads/jenkins-node-test-commit-windows-fanned-037ac1a672300d7e107e547c43dc646c1e227573-bin-win-vs2022:refs/remotes/jenkins_tmp 
10:50:37 Warning: Permanently added '67.158.54.159' (ED25519) to the list of known hosts.
10:50:37 Build timed out (after 30 minutes). Marking the build as failed.
10:50:37 Terminate batch job (Y/N)? 
10:50:37 
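A minimal sketch of how a hang like this could be made to fail fast when reproducing it by hand, rather than waiting for the 30-minute Jenkins timeout. The helper name `run_with_timeout` and the sleeping child process are illustrative stand-ins, not part of the CI setup; the real `git fetch` command from the log would be passed as `cmd` instead.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit code, or None if it exceeded `seconds`."""
    try:
        # Output is inherited rather than captured, so nothing is buffered.
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so no cleanup here.
        return None


if __name__ == "__main__":
    # Stand-in for the hanging fetch: a child process that sleeps for 5 s.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print("exit code:", run_with_timeout(slow, 1))
```

A return value of `None` would point at a hang (as in the log above) rather than an immediate connection or authentication error.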

cc @StefanStojanovic

@aduh95
Contributor

aduh95 commented Oct 10, 2024

Happy to help if I can; this is blocking the release.

@StefanStojanovic
Contributor

I investigated. The breakage started between Oct 9, 2024, 3:23 AM and Oct 9, 2024, 5:20 AM Jenkins local time. One notable thing that happened in that window is the windows-update-reboot trigger, which updated Git from v2.46.2 to v2.47.0 (from the log: You have git.install v2.46.2 installed. Version 2.47.0 is available based on your source(s).).
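As a rough sketch, lines like the one quoted from the log can be pulled out of a larger log to see which packages an update run was about to upgrade. The regular expression is modelled only on that single quoted line, so treating it as the general Chocolatey message format is an assumption; `find_upgrades` is a hypothetical helper name.

```python
import re

# Modelled on the one log line quoted above; other messages may differ (assumption).
UPGRADE_LINE = re.compile(
    r"You have (?P<package>\S+) v(?P<installed>\d+(?:\.\d+)*) installed\. "
    r"Version (?P<available>\d+(?:\.\d+)*) is available"
)


def find_upgrades(log_text):
    """Return (package, installed, available) tuples found in log_text."""
    return [
        (m["package"], m["installed"], m["available"])
        for m in UPGRADE_LINE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "You have git.install v2.46.2 installed. "
        "Version 2.47.0 is available based on your source(s)."
    )
    for package, installed, available in find_upgrades(sample):
        print(f"{package}: {installed} -> {available}")
```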

I'll try taking one machine offline, downgrading Git, and seeing if that fixes the issue. If it does, I'll go ahead and downgrade all of the machines and pin Git to 2.46.2 until we find a way to make it work with v2.47.0.

@aduh95
Contributor

aduh95 commented Oct 10, 2024

@StefanStojanovic did you get some results from the investigation?

@StefanStojanovic
Contributor

I made the change (downgrading Git to 2.46.2) on the Windows 10 machines first, since there are only 4 of them and that was the least work. However, I now get a different error, Permission denied, please try again., when running git fetch .... Currently, I'm not sure what to make of it.

The machines doing the compilation are on the latest git and they seem not to have issues pushing code.

@StefanStojanovic
Contributor

An update: after reinstalling Git, cygpath, which is required, was missing from PATH for some reason (C:\Program Files\Git\usr\bin wasn't added). After adding it manually, canceling all test jobs (they had been queued for a long time because of the timeouts), and retrying a random one, I have this run, and I can see that the Win10 tests are running (the others will time out again).
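A small sketch of how the missing PATH entry could be checked on a machine. The directory is the one named above; the helper `path_contains` and its normalisation rules (case-insensitive, ignoring quotes and trailing slashes) are assumptions about what counts as "the same" entry on Windows.

```python
import os
import shutil

GIT_USR_BIN = r"C:\Program Files\Git\usr\bin"


def path_contains(directory, path_value, sep=";"):
    """Case-insensitive check that `directory` is one of the PATH entries."""

    def norm(entry):
        # Ignore surrounding quotes, trailing slashes, and letter case.
        return entry.strip().strip('"').rstrip("\\/").lower()

    wanted = norm(directory)
    return any(norm(e) == wanted for e in path_value.split(sep) if e.strip())


if __name__ == "__main__":
    current = os.environ.get("PATH", "")
    print("usr\\bin on PATH:", path_contains(GIT_USR_BIN, current, os.pathsep))
    # shutil.which returns None when the executable cannot be resolved.
    print("cygpath resolves to:", shutil.which("cygpath"))
```

Running this before and after the manual PATH change would show whether cygpath is actually resolvable by a newly started process.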

So until now, based on my investigation, I think this needs to be done on all Windows CI machines:

  • choco uninstall git
  • choco install git --version=2.46.2
  • Check that the reinstall was successful, as 1 of the 4 machines still kept v2.47.0
  • choco pin add -n git
  • choco pin add -n git.install
  • Add C:\Program Files\Git\usr\bin to PATH
  • Reboot machines
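The command steps above could be scripted roughly as follows. This is only a sketch: the choco commands are copied from the list as written, while `run_steps`, `parse_git_version`, the `git --version` check used for the verification step, and the dry-run default are my own assumptions. The PATH change and the reboot are left as manual steps.

```python
import re
import subprocess

PINNED_VERSION = "2.46.2"

# Commands copied from the list above, in the same order.
STEPS = [
    ["choco", "uninstall", "git"],
    ["choco", "install", "git", f"--version={PINNED_VERSION}"],
    ["git", "--version"],  # assumed way to do the "check reinstall" step
    ["choco", "pin", "add", "-n", "git"],
    ["choco", "pin", "add", "-n", "git.install"],
]


def parse_git_version(output):
    """Extract 'X.Y.Z' from `git --version` output, or None if absent."""
    match = re.search(r"git version (\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None


def run_steps(dry_run=True):
    for cmd in STEPS:
        if dry_run:
            print("would run:", " ".join(cmd))
            continue
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        if cmd == ["git", "--version"]:
            found = parse_git_version(result.stdout)
            if found != PINNED_VERSION:
                raise RuntimeError(f"expected Git {PINNED_VERSION}, found {found}")


if __name__ == "__main__":
    run_steps(dry_run=True)
```

With `dry_run=False` the commands would actually run; choco may prompt for confirmation when run this way, and a freshly reinstalled Git may not be on PATH for the same process, so the version check is best treated as a hint rather than proof.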

After that is done I think the test jobs will be back to normal.

P.S. While I was writing this update, the Win10 test jobs finished. Some of them failed, but only because of actual test failures.

@aduh95
Contributor

aduh95 commented Oct 10, 2024

Looking at https://ci.nodejs.org/job/node-test-binary-windows-js-suites/30776/, the problem seems resolved for win10-COMPILED_BY-vs2022 machines, but still ongoing for all the other ones

@StefanStojanovic
Contributor

> Looking at https://ci.nodejs.org/job/node-test-binary-windows-js-suites/30776/, the problem seems resolved for win10-COMPILED_BY-vs2022 machines, but still ongoing for all the other ones

Exactly, I'll apply the same fix to the other machines too.

@aduh95
Contributor

aduh95 commented Oct 10, 2024

Any chance you'd be able to share a timeline for when the Windows CI would be ready?

@StefanStojanovic
Contributor

In an hour or two max, I'm already working on it (halfway there I'd say). Will let you know once it's ready.

@StefanStojanovic
Contributor

I fixed all of the machines except 2 Windows 2022 machines in Rackspace with VS 2022 (the ones we use for both compilation and tests), which I marked as temporarily offline (the other 4 are fixed). I'll deal with them first thing tomorrow morning CET.

Anyway, here are the daily master test jobs I retried (https://ci.nodejs.org/job/node-test-binary-windows-js-suites/30782/ and https://ci.nodejs.org/job/node-test-binary-windows-native-suites/25168/). As you can see, all passed, so everything should be fine apart from the 2 missing machines. That is just temporary; I'll re-enable them tomorrow. Regards.

@StefanStojanovic
Contributor

The last 2 machines were fixed and re-enabled a few hours ago. From what I can see, everything is back to normal. I'll keep this issue open until Monday just in case.

@StefanStojanovic
Contributor

Closing this issue now since the problem didn't recur over the weekend.
