Alternatives for multi-platform CI support in servo/servo #215
Comments
I am in favour of using other services here. I am not particularly concerned about service outages right now, because:
That being said, we wouldn't have the ability to log in to the builders like we do with our own, would we? That might impact our ability to reproduce certain failures.
I really like the solution of adding appveyor/etc support since a lot of our CI problems would just disappear. I'd like to see how well appveyor pans out before committing to it (if we plan to pay them for something, that is). I'm really okay with the "editing buildbot is hard" thing, though (we don't need to edit these much).

My main concern is the time sink that might get created if we have to maintain our own Windows infra (maintaining Linux/Mac already is work), especially since I suspect @larsbergstrom is the only one familiar enough with working on Windows to do it 😄

@jdm note that Travis allows you to ssh into their builders if you need it. Not sure if appveyor has that.
@jdm It appears that AppVeyor allows you to RDP (the Windows version of VNC) into the servers - http://www.appveyor.com/docs/how-to/rdp-to-build-worker - see the sketch below for what that looks like in an appveyor.yml.

@Manishearth I'm probably more worried about the "editing buildbot is hard" thing because I feel like each time we add another builder it doubles in likelihood-to-go-boom. Adding more platforms is definitely going to make things even worse :-/

I'll agree on the Windows buildbot issues. Everyone I've talked with on the Mozilla side of things with experience with it has basically said, "have fun with that."
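A minimal sketch of what that looks like in an appveyor.yml, following the doc linked above (treat the details as illustrative):

```yaml
# Sketch, per the AppVeyor RDP doc linked above: run their helper script
# at the end of the job. Setting $blockRdp pauses the build so the worker
# stays alive for the RDP session instead of being recycled immediately.
on_finish:
  - ps: $blockRdp = $true; iex ((new-object net.webclient).DownloadString('https://raw.githubusercontent.com/appveyor/ci/master/scripts/enable-rdp.ps1'))
```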
I agree with @jdm regarding outages -- for now, having Windows test results sometimes will be better than never having them, and if Appveyor turns out to be down a lot we can look at moving the tests to our own infrastructure or teaching Homu to ignore failures from tests attempted on unreachable platforms.

I can see this potentially causing some confusion about where to put a given piece of testing logic (should Homu track it, or should Buildbot and Travis and Appveyor each track it independently? Will one platform ever need to run a test conditionally on some other platform's result?). As long as we document our intentions for the new system clearly, it shouldn't be much of a problem, though.
For the Windows builds, I agree that unless we have someone with significant Windows + buildbot experience, it makes more sense to let AppVeyor handle them.

For the ARM builds, it looks like the servo-nightly repo @larsbergstrom linked is cross-compiling for ARM. If so, I'd prefer integrating an ARM cross-compile flow into the Buildbot flow instead of adding Travis as another external dependency (a rough sketch of such a job follows this comment). At the very least, we can add it to Buildbot now, and possibly move things to Travis if needed once the appropriate homu work is done. (Also, if we want real ARM hardware, I think I have some spare Raspberry Pis 😜.)

On a side note, are we still using Linode? I thought we had moved away from them to EC2 reserved instances.
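A rough sketch of what such a cross-compile job could look like, in Travis-style syntax (the script step would map to a buildbot shell step just as well; the package names and target triple are assumptions, not taken from the servo-nightly config):

```yaml
# Illustrative sketch of an ARM cross-compile job. The toolchain packages
# and the target triple are assumptions; the real setup lives in
# mmatyas/servo-nightly's .travis.yml.
language: generic
sudo: required
addons:
  apt:
    packages:
      - gcc-arm-linux-gnueabihf    # ARM cross toolchain (assumed package set)
      - libc6-dev-armhf-cross
script:
  - ./mach build --release --target=arm-unknown-linux-gnueabihf
```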
Yeah, we now have EC2 reserved instances but can spin up more (EC2 on-demand, IIRC) if we need to.
We're moving to EC2 as much as we can (EC2 auto-bills to Mozilla; I have to use my personal CC for Linode, which Finance frowns upon). The smaller builders will be easy to move soon, but the actual salt master is going to be a bit of a nightmare. It'll get a new IP, we'll need a new mapping for build.servo.org, and I expect our whole CI to basically be offline for a day while DNS entries propagate and we shake out the "oops, we left a raw IP in that GH repo's config" issues.
Heh, paying off technical debt is never fun. If you give me a heads-up I can try to be around (on IRC, I guess) for the transition.

Also, I'd recommend not doing a clean cutover but phasing it using Salt multimaster: add the new master first, confirm functionality during a deprecation period (leave any webhooks running and check logs to see whether anything is still pinging the old machine), then shut down the old one. The minion side of that is a one-line change; see the sketch below.
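A minimal sketch of that minion-side change (the new hostname is a placeholder, not a real one):

```yaml
# /etc/salt/minion (sketch): list both masters during the transition.
# Salt's multimaster mode accepts a list here, so minions answer to
# either master until the old one is retired.
master:
  - build.servo.org        # current master
  - salt-ec2.servo.org     # placeholder name for the new EC2 master
```

Leaving both entries in place through the deprecation window makes the final cutover a one-line config removal per minion.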
Now that AppVeyor support has landed in master (servo/servo#9863) and we've got the build times down to ~30 minutes, I'm very tempted to just fix the homu bugs to get things running. |
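For anyone skimming, the rough shape of that kind of config is below; the authoritative version is the one that landed in servo/servo#9863, and the directory and cache-key names here are assumptions:

```yaml
# Rough shape of a Windows build config, for illustration only; the real
# one is in servo/servo#9863. AppVeyor invalidates a cache entry when the
# file named after '->' changes.
cache:
  - .cargo -> rust-nightly-date    # assumed cache dir and key file
build_script:
  - mach.bat build --release
test_script:
  - mach.bat test-unit
```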
This has been rolled out! |
Original issue description

Today, we have a default strategy for each new platform or build configuration in the servo/servo repository: add new buildbot rules and rebalance them across a new set of builders that we spin up.
That has a couple of issues:
One alternative that I'm considering is to add both AppVeyor support (barosl/homu#87) and the ability to gate on multiple CI systems (barosl/homu#100) to homu.
This would mean that for some new platforms (Windows - servo/servo#9406, ARM - https://github.com/mmatyas/servo-nightly/blob/master/.travis.yml, etc.) and also some tests (test-tidy; see the sketch at the end of this post), we could run them on Travis or AppVeyor infrastructure and use the merged buildbot+travis+appveyor results.

The upsides are:
Downsides:

We likely won't get machines as powerful as the ones we run now (the c4.4xlarge instances that we use on EC2) on some of these other services, at least in the first 3-6 months, which could put an upper limit on our build time.

Thoughts? CC @Manishearth @metajack @edunham
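As a concrete example of the test-tidy case mentioned above, the whole Travis job could be roughly this small (a sketch; the job definition is assumed, not an existing config):

```yaml
# Sketch of a tidy-only Travis job: no compile step, just the lint pass,
# so it finishes in minutes and frees up a buildbot slot.
language: python
python: "2.7"
script:
  - ./mach test-tidy
```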