Inspector tests leaving around stale processes #5

Closed
Trott opened this issue Jun 7, 2018 · 14 comments

@Trott
Member

Trott commented Jun 7, 2018

This seems to have started today and seems to be making it all but impossible to get a green CI on Raspberry Pi (arm-fanned).

Inspector tests are leaving around stale jobs. Here's a partial copy/paste of an exchange between @refack and me in the node-build IRC channel:

me:

Hanging process is:
home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=0 -p process.debugPort
I don't have access to test-requireio_kahwee-debian9-arm64_pi3-1 but it looks like it could use the same treatment. refack rvagg
Took that one offline. Guess it needs to be added to the ansible playbook for writing the ssh config file.

refack:

I saw home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=0 -p process.debugPort on ceejbot-debian9-armv6l_pi1p-1 as well. Opening an issue.

me:

Gotta be test/sequential/test-inspector-port-zero-cluster.js since that one fails so often. Probably leaves a process around when it fails.
Different one on test-requireio_mininodes-debian9-armv6l_pi1p-1 but still an inspector test.
/home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect -e const assert = require('assert'); const inspector = process.binding('inspector'); assert( !!inspector.isEnabled(), 'inspector.isEnabled() should be true when run with --inspect'); process._debugEnd(); assert( !inspector.isEnabled(), 'inspector.isEnabled() should be false after _debugEnd()');
Not seeing the issue. Did you open it, refack? Maybe I'm looking in the wrong repo.
Meanwhile test-requireio_continuationlabs-debian9-armv6l_pi1p-1 has 18 of these running:

/home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12391 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js

Same for test-requireio_mininodes-debian9-armv7l_pi2-1
(Although fewer there. Only 6.)

@nodejs/v8-inspector @nodejs/build @nodejs/testing

@Trott
Member Author

Trott commented Jun 7, 2018

Another one:

$ ssh test-requireio_bengl-debian9-armv6l_pi1p-1
The authenticity of host '192.168.2.43 (<no hostip for proxy command>)' can't be established.
ECDSA key fingerprint is SHA256:87BtCkbkmfMLPwfh69OjNKsCHyGkDsuGcAuazlA4QJM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.2.43' (ECDSA) to the list of known hosts.
Linux test-requireio--bengl-debian9-armv6l--pi1p-1 4.14.34+ #1110 Mon Apr 16 14:51:42 BST 2018 armv6l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed May 30 19:58:35 2018 from 192.168.1.10
pi@test-requireio--bengl-debian9-armv6l--pi1p-1:~ $ ps -ef| grep Release
iojs     22070   883  0 19:20 ?        00:00:01 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=0 -e process._debugEnd();process.exit(42);
iojs     22084   883  0 19:26 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-open.js
pi       23194 23150  0 23:10 pts/0    00:00:00 grep --color=auto Release
pi@test-requireio--bengl-debian9-armv6l--pi1p-1:~ $ sudo kill -9 22070 22084
pi@test-requireio--bengl-debian9-armv6l--pi1p-1:~ $ ps -ef | grep Release
pi       23208 23150  0 23:10 pts/0    00:00:00 grep --color=auto Release
pi@test-requireio--bengl-debian9-armv6l--pi1p-1:~ $ exit
logout
Connection to 192.168.2.43 closed.
Killed by signal 1.

Not really sure how many more of these I need to log.

@Trott
Member Author

Trott commented Jun 7, 2018

Another one:

$ ssh test-requireio_continuationlabs-debian9-armv6l_pi1p-1
Linux test-requireio--continuationlabs-debian9-armv6l--pi1p-1 4.14.34+ #1110 Mon Apr 16 14:51:42 BST 2018 armv6l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Jun  7 22:55:54 2018 from 192.168.1.10
pi@test-requireio--continuationlabs-debian9-armv6l--pi1p-1:~ $ ps -ef | grep Release
iojs     21384   875  0 19:08 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect-port=9230 /home/iojs/build/workspace/node-test-binary-arm/test/parallel/test-inspect-support-for-node_options.js
iojs     21389   875  0 19:08 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect-port=9231 /home/iojs/build/workspace/node-test-binary-arm/test/parallel/test-inspect-support-for-node_options.js
iojs     23315   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23320   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=65534 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23321   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12346 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23322   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect --inspect-port=12351 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23327   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect --debug-port=12356 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23328   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=0.0.0.0:12361 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23333   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=127.0.0.1:12366 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23338   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12371 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23343   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12376 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23346   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12381 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23355   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12386 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23360   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12391 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23365   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12396 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23370   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12401 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23380   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12406 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
iojs     23391   875  0 19:27 ?        00:00:02 /home/iojs/build/workspace/node-test-binary-arm/out/Release/node --inspect=12416 /home/iojs/build/workspace/node-test-binary-arm/test/sequential/test-inspector-port-cluster.js
pi       24638 24618  0 23:12 pts/0    00:00:00 grep --color=auto Release
pi@test-requireio--continuationlabs-debian9-armv6l--pi1p-1:~ $

Did a kill -9 there too so it's back to working again.
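
The manual cleanup in these sessions (list with `ps`, pick out the leftover `node` binaries, then `kill`) can be sketched as a small helper. This is only an illustration, not part of the Node.js build tooling: the function name `stale_pids` is hypothetical, and the binary path in the usage comment is simply the one that appears in the listings above.

```shell
#!/bin/sh
# stale_pids BIN  (hypothetical helper name)
# Reads "PID COMMAND ARGS..." lines on stdin, the format produced by
# `ps -e -o pid= -o args=`, and prints the PID of every process whose
# executable field is exactly BIN. Because it compares the executable
# rather than searching the whole line, a `grep Release` process is
# never reported.
stale_pids() {
  awk -v bin="$1" '$2 == bin { print $1 }'
}

# Possible usage on one of the hosts (path taken from the listings above):
#   ps -e -o pid= -o args= |
#     stale_pids /home/iojs/build/workspace/node-test-binary-arm/out/Release/node
# The resulting PIDs could then be passed to `kill`, run as the owning
# user or through sudo.
```

Matching on the full binary path keeps the helper from touching unrelated `node` processes that happen to run on the same machine.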

@eugeneo

eugeneo commented Jun 7, 2018

I'm confident this is because of nodejs/node@327ce2d, assigning to me.

@eugeneo

eugeneo commented Jun 7, 2018

(Can't assign, actually - will be looking into it)

Is there a way to get SSH access to one of those systems?

@Trott
Member Author

Trott commented Jun 7, 2018

> (Can't assign, actually - will be looking into it)

OK, I just assigned you to it.

@joyeecheung It appears that Collaborators only have Read access to this repo. Any reason not to give them Write access? (Or maybe this issue is in the wrong repo and I should open it in the main repo?)

@Trott
Member Author

Trott commented Jun 7, 2018

> Is there a way to get SSH access to one of those systems?

To get SSH access to CI machines, you'd open an issue in the Build repo. To expedite things, I'll do that for you right now. Once one or two other Build WG people approve the request (which hopefully won't take more than 10 or 15 minutes, but we'll see), I or someone else can give you access.

@eugeneo

eugeneo commented Jun 7, 2018

Are those processes from my CI runs (ones I started for nodejs/node#21182) or a result of other collaborators running the tests?

Upd: PR URL

@Trott
Member Author

Trott commented Jun 7, 2018

> Are those processes from my CI runs (ones I started for nodejs/node#21182) or a result of other collaborators running the tests?

@eugeneo I'm not sure. I assumed other collaborators, but I suppose it could have been from your own PRs. I guess we can wait and see if they come back now that I've terminated them all (or at least all the ones I know about). If they come back, they're happening in other PRs. If they don't come back, maybe it was all from your stuff and we can ignore them?

@eugeneo

eugeneo commented Jun 7, 2018

@Trott Sounds good. If they are caused by the pending PR, I will be extra careful when running next CI.

@Trott
Member Author

Trott commented Jun 7, 2018

@eugeneo I checked the build history for one of the hosts that had stalled processes. It last succeeded in https://ci.nodejs.org/job/node-test-binary-arm/RUN_SUBSET=3,label=pi1-docker/1700/, which was started by addaleax, and first failed in https://ci.nodejs.org/job/node-test-binary-arm/RUN_SUBSET=0,label=pi1-docker/1702/, which was started by you. That strongly suggests the stalled processes came from the CI runs done in the course of nodejs/node#21182.

@Trott
Member Author

Trott commented Jun 7, 2018

By the way, this kind of suggests a build issue too, because the Makefile tries to terminate the processes when it finds them. Unfortunately, I was only able to terminate these processes with root privileges, and I suspect we don't run it with root privs on the Raspberry Pi devices (although we do elsewhere).
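
Whether a given user is able to terminate a leftover process can be checked without sending a real signal, since `kill -0` only performs the existence and permission checks. The following is a rough sketch under that assumption; the function names `can_signal` and `report_unkillable` are hypothetical, and this does not reproduce what the Makefile actually does.

```shell
#!/bin/sh
# can_signal PID  (hypothetical helper name)
# Succeeds if the current user could send a signal to PID. Signal 0
# delivers nothing; it only runs the checks. A failure means either
# that the process no longer exists or that the caller lacks permission.
can_signal() {
  kill -0 "$1" 2>/dev/null
}

# report_unkillable PID...  (hypothetical helper name)
# Prints one line for each PID the current user cannot signal.
report_unkillable() {
  for pid in "$@"; do
    can_signal "$pid" ||
      echo "cannot signal $pid as $(id -un) (gone, or insufficient privileges)"
  done
}
```

Running a check like this as the CI user on an affected host would show whether a cleanup step can succeed without root.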

@Trott
Member Author

Trott commented Jun 7, 2018

CI is running fine now that I've terminated all the stalled jobs. I'm going to close this, since @eugeneo is aware. We can re-open if this recurs.

Trott closed this as completed Jun 7, 2018

@joyeecheung
Member

joyeecheung commented Jun 8, 2018

@Trott I've given Collaborators write access now. Should I just give the Core write access?

@Trott
Member Author

Trott commented Jun 8, 2018

> @Trott I've given Collaborators write access now. Should I just give the Core write access?

I think Collaborators is probably fine. We can always expand it further later.
