test: investigate flaky test-child-process-stdio-big-write-end #13603
Comments
Also, just to make it clear this isn't a one-time fluke: https://ci.nodejs.org/job/node-test-commit-arm/10251/nodes=ubuntu1604-arm64/console
not ok 1516 parallel/test-child-process-stdio-big-write-end
---
duration_ms: 120.600
severity: fail
stack: |-
timeout |
Looks to me like this corresponds precisely with switching from mininodes to packetnet (in the hostname for the test server), but I don't know the first thing about when/why/how that happened or what the implications are or if there are significant differences in memory/CPU/etc. /cc @nodejs/build |
Given the Which makes me wonder if it's all about network or something like that? Also makes me wonder if it belongs in |
I think this is officially a broken test/platform and not merely flaky/unreliable... |
Tried to bisect, but even 8.0.0 fails: |
Got a log in to test-packetnet-ubuntu1604-arm64-2. The test hangs when run from the command line. No hung processes prior, no output... Probably appropriate at this point to loop in @nodejs/testing even though it seems host-configuration specific. Maybe someone will have an idea of what might be up.... |
I did a little more investigating on the problematic host and the issue is that this bit of the test is an infinite loop (or infinite-seeming in any event) on the host for whatever reason:
// Write until the buffer fills up.
let buf;
do {
  buf = Buffer.alloc(BUFSIZE, '.');
  sent += BUFSIZE;
} while (child.stdin.write(buf)); |
Proposed fix coming in another minute or four.... |
test-child-process-stdio-big-write-end was failing on ubuntu1604-arm64 because the while loop that was supposed to fill up the buffer ended up being an infinite loop. This increases the size of the writes in the loop by 1K until the buffer fills up. Fixes: nodejs#13603
Proposed fix: #13626 |
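For readers following along, the idea described above (grow each write by 1K until `write()` reports a full buffer) could look roughly like the sketch below. This is not the patch from #13626. The names `makeStalledWritable` and `fillUntilBackpressure` are made up for illustration, and a `stream.Writable` that never finishes a write stands in for `child.stdin`, which is an assumption so the example runs without spawning a child.

```javascript
'use strict';
const { Writable } = require('stream');

// Hypothetical stand-in for child.stdin: a Writable that accepts chunks but
// never completes a write, so its internal buffer can only grow.
function makeStalledWritable(highWaterMark) {
  return new Writable({
    highWaterMark,
    write(chunk, encoding, callback) {
      // Intentionally never call callback(): simulates a stuck reader.
    }
  });
}

// Write chunks that grow by `step` bytes on each iteration until write()
// returns false, i.e. until the stream signals backpressure.
function fillUntilBackpressure(writable, step) {
  let bufsize = 0;
  let sent = 0;
  let writes = 0;
  let ok;
  do {
    bufsize += step;
    const buf = Buffer.alloc(bufsize, '.');
    sent += bufsize;
    writes++;
    ok = writable.write(buf);
  } while (ok);
  return { sent, writes, lastWriteResult: ok };
}

const stalled = makeStalledWritable(16 * 1024);
const summary = fillUntilBackpressure(stalled, 1024);
console.log(summary);
```

Because the chunk size keeps growing, the total queued eventually passes the stream's `highWaterMark` as long as the consumer drains more slowly than the writes grow. A fixed `BUFSIZE` does not have that property if each chunk is flushed before the next `write()` call.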
@nodejs/testing for context: the mininodes hosts are still active, they have just been relabelled to ubuntu1604-arm64_odroid_c2. The new packet.net hosts are proper server-class ARM machines, not the repurposed mobile chips that we've had access to until now. The major difference you'll find on these new packet.net machines: they have 96 cores and 48G of RAM. We have not virtualized or containerized anything, so they are running on bare metal. At the moment we are running at ~ They have access to very fast SSD, so the bottlenecks appear pretty late in the parallelization. My assumption with this error when I first saw it was that they are too heavily parallelized so I was reducing The hosts are all accessible to everyone that has nodejs_build_test ssh access. Their configs are in the new ansible setup (the ansible directory of the build repo; look in inventory.yml if you want IPs, but there is a way to dump everything to your .ssh/config if you want, I just can't tell you how off the top of my head). There are 2 x Ubuntu 16.04 and 2 x CentOS 7. |
Extra context over at nodejs/build#755 for those interested in the introduction of packet.net resources |
For sshing in, clone the build repo, update your inventory.yml with nodejs/build#754, make sure your , and do:
cd ansible
ansible-playbook playbooks/write-ssh-config.yml |
The test might still be flaky, see node-test-commit-arm/10353, where it fails on centos7-arm64 and ubuntu1604-arm64:
|
test-child-process-stdio-big-write-end was failing on ubuntu1604-arm64 because the while loop that was supposed to fill up the buffer ended up being an infinite loop. This increases the size of the writes in the loop by 1K until the buffer fills up.
PR-URL: #13626
Fixes: #13603
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Alexey Orlenko <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
https://ci.nodejs.org/job/node-test-commit-arm/10256/nodes=ubuntu1604-arm64/