Spurious appveyor 32-bit test timeouts #46903
Comments
Picking two random logs (one good, one bad), the major difference seems to be that the good log finishes compiling the compiler at 01:01:30, whereas the bad log finishes at 01:21:05, a delay of about 20 minutes. AFAIK no real extra work was done in the bad log. I believe AppVeyor doesn't guarantee a constant level of performance (shared hosting and whatnot), so I think we just get less CPU time during peak hours. In that sense I think the only real solution here is to do less work per job. That may mean cutting tests from the 32-bit MinGW builders or sharding the builder.
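For reference, the gap between the two timestamps quoted above can be checked with a quick sketch. This assumes the stamps are elapsed `HH:MM:SS` values from the job log; `parse_elapsed` is a hypothetical helper, not something from the build tooling.

```python
from datetime import timedelta


def parse_elapsed(stamp: str) -> timedelta:
    """Parse an HH:MM:SS elapsed-time stamp (assumed log format)."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


good = parse_elapsed("01:01:30")  # compiler finished building in the good log
bad = parse_elapsed("01:21:05")   # compiler finished building in the bad log

delay = bad - good
print(delay)  # 0:19:35
```

So the "20 minute" figure is 19 minutes 35 seconds lost before any tests start running.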
#47154 may be a cause of the recent explosion in timeouts. The timing also matches, since #46278 was merged at 2018-01-01T19:04:27Z. There is a fix in #47161. #46910 has caused about a 40–50% increase in time spent on fulldeps tests, but that is not sufficient to explain the previous timeouts, since it means an additional 4 minutes at most.
#47161 has landed but the error rate is still not decreasing 😢
I've done some analysis of our historical trends to see what's going on here. This is specifically for the i686-pc-windows-msvc builder that's running tests on AppVeyor. First up we have the trend of the total build time over time: Clearly we're on the up and up! Next I broke it down by stage. Here I was taking a look at various stages in the build: Here we can see for sure that various stages are getting slower, and if we look at each of them in isolation (not stacked up) we get: which from this seems to indicate:
The raw data (not smoothed, but stacked and not stacked) is unfortunately pretty hard to decipher. I also unfortunately don't quite know where to go from here.
Surely the size of the code base and test suite is growing over time. I think this is the expected result unless compiler speed is improving at a greater rate than the code base is growing (which seems unlikely).
@withoutboats I agree, yeah, but there's been a severe uptick over the past ~200 builds, which means our build time is increasing way faster than it was before, and that seems worrisome.
This seems to be another example: https://ci.appveyor.com/project/rust-lang/rust/build/1.0.6426/job/do1stdu2mywwkyf7
Split MinGW tests into two builders on AppVeyor Run-pass and compile-fail tests appear to take the most significant chunk of time, so split them into their own builder. Should help with #46903. r? @kennytm cc @alexcrichton
Closing as fixed. We've had multiple successful builds on AppVeyor, and the 32-bit MinGW builders are both now around 2 hours.
The 32-bit MinGW test builders on AppVeyor are sometimes slower than expected, which causes some of them to exceed the 3 hour limit and time out (I think this also happened at the start of December, if someone can bother digging up those PRs).
It appears that a "good" build (e.g. https://ci.appveyor.com/project/rust-lang/rust/build/1.0.5766) takes 150 minutes, while a "bad" build on the same code can exceed the 3 hour (180 minutes) limit.
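To put the numbers above in perspective, here is a rough sketch of how much slowdown a "good" build can absorb before hitting the limit. The 150 and 180 minute figures are the ones quoted in this issue; `slowdown_budget` is a hypothetical helper used only for illustration.

```python
GOOD_BUILD_MIN = 150  # duration of a "good" build, as quoted in this issue
LIMIT_MIN = 180       # AppVeyor job time limit (3 hours), as quoted in this issue


def slowdown_budget(build_min: float, limit_min: float) -> float:
    """Fractional slowdown a build can absorb before reaching the limit."""
    return limit_min / build_min - 1.0


budget = slowdown_budget(GOOD_BUILD_MIN, LIMIT_MIN)
print(f"headroom: {LIMIT_MIN - GOOD_BUILD_MIN} min ({budget:.0%})")
# headroom: 30 min (20%)
```

In other words, a machine that is only about 20% slower than on a good day is already enough to push the job over the limit.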
It appears that in some cases (e.g. https://ci.appveyor.com/project/rust-lang/rust/build/1.0.5551) other builders also get close to the limit, but I haven't seen any of them hitting it yet. The reason appears to be that the 32-bit test builders (both MSVC and GNU) are the slowest, taking the "full" 150 minutes even on a good day.
I'm not that sure what the best solution is - eventually we could play with checkpoint/restart, but I would not want to do that on Windows first.
Maybe it's possible to investigate the cause of the slowness, or to bump the time limit, or to split the pc-windows-gnu builders (the last option would also speed up the cycle time).
However, the Windows 32-bit test builders being the slowest of our entire group seems to be a good reason to split them (this also makes some sense, because they spawn a lot of processes, which is slow on Windows).
Cases: