-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reenable Windows CI build with Artifactory support #4596
Conversation
* Use github concurrency management * Update to get more in sync with nix.yml * Don't bother caching Conan * Add support for Debug job (disabled for now)
* upstream/develop: APIv2: add error messages for account_tx (4571)
* upstream/develop: ci: cancel overridden workflows (4597) fix: Update Handler::Condition enum values 3417 (4239)
* upstream/develop: APIv2(account_info): handle invalid "signer_lists" value (4585) fix: deb package build (4591)
* upstream/develop: fix: add allowClawback flag for `account_info` (4590) build: add binary hardening compile and link flags (4603) add clang-format pre-commit hook (4599) chore: update checkout action version to v3: (4598)
* upstream/develop: refactor: change the return type of mulDiv to std::optional (4243)
* upstream/develop: APIv2(ledger_entry): return "invalidParams" when fields missing (4552)
.github/workflows/windows.yml
Outdated
conan remote add ripple ${{ env.CONAN_URL }} --insert 0 || \ | ||
conan remote list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hosted runner seems to be keeping some state between runs, which cause errors when the commands are run again. This was a workaround. Here's an example from my own repo: https://github.com/ximinez/rippled/actions/runs/5490473061/jobs/10005957758#step:8:14
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it because the runner does not create a fresh environment for each job? I think we should consider this a blocker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what it looks like.
* upstream/develop: Introduce AMM support (XLS-30d): (4294) Adapt to change in Conan recipe for NuDB (4615) docs(CONTRIBUTING): push beta releases to `release` (4589)
* upstream/develop: Rename `allowClawback` flag to `allowTrustLineClawback` (4617) Update dependencies (4595)
Judging by the tests, the runner has only 4 hardware threads available. We might be able to cut the build time if we can get that number up to 12 or 16. |
* upstream/develop: docs: add API Changelog (4612) Set version to 1.12.0-b2 BUILD: list steps after dependencies update (4623)
* upstream/develop: Set version to 1.12.0-rc1 Set version to 1.12-b3 Several changes to improve Consensus stability: (4505) Apply transaction batches in periodic intervals (4504) Asynchronously write batches to NuDB (4503) refactor: fix typo in FeeUnits.h (4644) Update Ubuntu build image (4650) test: add forAllApiVersions helper function (4611) add view updates for account SLEs (4629) Fix the package recipe for consumers of libxrpl (4631) refactor: improve checking of path lengths (4519) refactor: use C++20 function std::popcount (4389) fix(AMM): prevent orphaned objects, inconsistent ledger state: (4626) feat: support Concise Transaction Identifier (CTID) (XLS-37) (4418)
* upstream/develop: Set version to 1.12.0-rc3 Revert "Asynchronously write batches to NuDB (4503)" Revert "Apply transaction batches in periodic intervals (4504)" Revert "Several changes to improve Consensus stability: (4505)"
* upstream/develop: docs: update SECURITY.md (4338) refactor: simplify `TxFormats` common fields logic (4637) APIv2(ledger_entry): return invalidParams for bad parameters (4630) docs(rippled-example.cfg): clarify ssl_cert vs ssl_chain (4667) Introduce replacement for getting and setting thread name: (4312) Set version to 1.12.0 Set version to 1.12.0-rc4 amm_info: fetch by amm account id; add AMM object entry (4682) AMMBid: use tecINTERNAL for 'impossible' errors (4674)
* upstream/develop: Several changes to improve Consensus stability: (4505) Apply transaction batches in periodic intervals (4504) Asynchronously write batches to NuDB. (4503) Remove CurrentThreadName.h from RippledCore.cmake (4697)
This reverts commit ab26e5c.
* Stolen from (4716 discussion) * Also add note to test step that it can fail
* upstream/develop: docs(BUILD.md): require GCC 11 or higher (4700)
@thejohnfreeman @ckeshava I think this PR is as good as it's going to get given current infrastructure. It's an improvement (given that we have no Windows CI at all right now), and doesn't appear to do any harm. It still takes a long time to build, but cancels old jobs, so it won't tie things up indefinitely. The only place where it might cause a problem is if we need to rush something into develop, and need to wait for this build. There are ways around that, so I don't see it as a blocker. Mind reviewing changes and signing off so we can get this merged? Thanks! |
conan export external/snappy snappy/1.1.9@ | ||
conan export external/soci soci/4.0.3@ | ||
- name: install dependencies | ||
echo outcome=$(conan user --remote ripple ${{ secrets.CONAN_USERNAME }} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. this change obviates the need for this PR right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait, I think this change affects only the windows workflow, never mind
note: awaiting review from @thejohnfreeman to confirm whether prior requests were addressed |
* upstream/develop: refactor: reunify transaction common fields: (4715) fix typo in SECURITY.md (4662) docs(API-CHANGELOG): clarify account_info response (4724) fix: asan stack-use-after-scope in soci::use with rvalues (4676)
* upstream/develop: `fixDisallowIncomingV1`: allow issuers to authorize trust lines (4721) refactor: reduce boilerplate in applySteps: (4710)
Thank you for all the work here @ximinez. I know this was a heavy lift even though the changeset is small. |
My pleasure. I think it will save time and headaches in the long run. |
* upstream/develop: docs(API-CHANGELOG): add `XRPFees` change (4741) docs(rippled-example.cfg): add P2P link compression (4753)
note: intend to merge this shortly. giving CI a chance to run for a bit |
suggested commit message:
|
In Windows, we need to call `python` in order for the `pip` upgrade command to work. This changes the GitHub Actions Windows CI job to use the correct command to upgrade PIP, fixing this error: ``` ERROR: To modify pip, please run the following command: C:\hostedtoolcache\windows\Python\3.9.13\x64\python.exe -m pip install --upgrade pip ``` A future task is to make job run on heavy Windows runners so that it doesn't take so long. Context: #4596
The unity build speeds up compilation by bundling multiple source files into one larger file. This reduces Windows CI build time by up to 50%. As described in #4596, the automatic Windows builds take a very long time. Unity builds are significantly faster - currently about 45 min, much closer to the typical MacOS (35-40 minutes) and nix (~30 minutes) run times. This is intended as a stopgap solution until a more resourced and reliable runner is available. No C++ code was changed. This only affects CI.
* Disable the Windows CI unit tests "allowed to fail" workaround which was previously introduced in #4596. * The runner hardware was upgraded, and the unit tests have been passing since then.
Artifactory support was added to the `nix` builds with XRPLF#4556. This extends that support to the Windows build. Now the Windows build works; CI will build and test a Windows release build. This only affects CI and does not change any C++ code. * Copy the remote setup step outcome fix from XRPLF#4716 discussion * Allow the Windows job to succeed if tests fail: * Currently the tests do not always pass, even on a single threaded run on the GitHub runners. So we are using parallel runs and mark the test step as allowed to fail (continue-on-error). * At this point, it's more important that the build succeeds than that the tests succeed, because: * We've got plenty of test coverage on the other jobs. * Test failures are much rarer than build failures because of cross-platform issues. * Having a test failure locally doesn't interrupt a workflow nearly as much as a build failure. Note that Conan Center cannot hold the binaries we need. They do not build the configurations we need, and they will not add them. ## Future Tasks This introduces a new bottleneck since the build and test takes over an hour. Speed up the job by: * Making this job run on heavy Windows runners. * Increasing the number of hardware threads.
In Windows, we need to call `python` in order for the `pip` upgrade command to work. This changes the GitHub Actions Windows CI job to use the correct command to upgrade PIP, fixing this error: ``` ERROR: To modify pip, please run the following command: C:\hostedtoolcache\windows\Python\3.9.13\x64\python.exe -m pip install --upgrade pip ``` A future task is to make job run on heavy Windows runners so that it doesn't take so long. Context: XRPLF#4596
The unity build speeds up compilation by bundling multiple source files into one larger file. This reduces Windows CI build time by up to 50%. As described in XRPLF#4596, the automatic Windows builds take a very long time. Unity builds are significantly faster - currently about 45 min, much closer to the typical MacOS (35-40 minutes) and nix (~30 minutes) run times. This is intended as a stopgap solution until a more resourced and reliable runner is available. No C++ code was changed. This only affects CI.
* Disable the Windows CI unit tests "allowed to fail" workaround which was previously introduced in XRPLF#4596. * The runner hardware was upgraded, and the unit tests have been passing since then.
High Level Overview of Change
#4556 added Artifactory support to the
nix
builds. This extends that support to the Windows buildThe good news is that the build now works. The bad news is that it introduces a new bottleneck, since the build and test takes over an hour.
Context of Change
Our Windows CI build has been disabled for a while because it was unable to successfully build Conan dependencies. Now that it can download pre-built dependencies from Artificatory, we can put it back in play.
Right now, the job is running on Github's limited hardware. I hope that we can get self-hosted Windows runners available soon. If they become available before this PR is merged, I'll push a trivial commit changing the
runs-on:
line.Before / After
Before: No Windows CI jobs enabled.
After: CI will build and test a Windows release build.
Test Plan
No C++ code was changed. This only affects CI, so there is nothing to test.
Future Tasks
Making this job run on heavy Windows runners, so it doesn't take so dang long.