Skip to content

Commit

Permalink
test: fix test-net-connect-reset-until-connected
Browse files Browse the repository at this point in the history
Fixes a race condition in test that causes the test to randomly timeout on Solaris 11.4, SmartOS and potentially also FreeBSD. The client resets the connection using conn.resetAndDestroy(). This call is asynchronous and if it's effect occurs before server's listening socket accepts the connection, the test hangs. The fix is to put a synchronization barrier that resets the connection only after it is established on both server and client side.

Below is a little bit more about the root cause. I show positive (test works) and negative (when test hangs) scenarios. The output contains only relevant system / library calls that were collected using truss(1). Without the fix the test randomly hangs. With the fix the test completes thousands of runs without single issue.

Race condition scenario:
```
connect(23, 0x7FFFBFFF7F10, 32, SOV_XPG4_2)Err#150 EINPROGRESS
^ client socket connects to server
close(23)= 0
^ client closes the socket too early
accept(21, 0x00000000, 0x00000000, SOV_DEFAULT)Err#130 ECONNABORTED
accept(21, 0x00000000, 0x00000000, SOV_DEFAULT)Err#11 EAGAIN
^ accept on listening socket fails
... test hangs and times out...
```

Working (good) scenario:
```
connect(23, 0x7FFFBFFF7F00, 32, SOV_XPG4_2)Err#150 EINPROGRESS
^ client socket connects to server
accept(21, 0x00000000, 0x00000000, SOV_DEFAULT)= 24
^ server accepts client connection on listening socket
close(23)= 0
^ client socket closes
read(24, 0x046B6010, 65536)Err#131 ECONNRESET
^ test gets so much wanted error while reading on accepted FD
close(24)= 0
^ accepted FD closes
... test completes, passes ...
```

Fixes: #43446
  • Loading branch information
Vita Batrla committed Feb 22, 2023
1 parent ee22706 commit 9af6703
Show file tree
Hide file tree
Showing 2 changed files with 10 additions and 5 deletions.
4 changes: 0 additions & 4 deletions test/parallel/parallel.status
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,6 @@ test-crypto-dh-stateless: SKIP
test-crypto-keygen: SKIP

[$system==solaris] # Also applies to SmartOS
# https://github.com/nodejs/node/issues/43446
test-net-connect-reset-until-connected: PASS, FLAKY
# https://github.com/nodejs/node/issues/43457
test-domain-no-error-handler-abort-on-uncaught-0: PASS, FLAKY
test-domain-no-error-handler-abort-on-uncaught-1: PASS,FLAKY
Expand All @@ -52,8 +50,6 @@ test-domain-with-abort-on-uncaught-exception: PASS, FLAKY
test-fs-stat-bigint: PASS,FLAKY
# https://github.com/nodejs/node/issues/31280
test-worker-message-port-message-before-close: PASS,FLAKY
# https://github.com/nodejs/node/issues/43446
test-net-connect-reset-until-connected: PASS, FLAKY

[$system==ibmi]
# https://github.com/nodejs/node/pull/30819
Expand Down
11 changes: 10 additions & 1 deletion test/parallel/test-net-connect-reset-until-connected.js
Original file line number Diff line number Diff line change
Expand Up @@ -3,18 +3,27 @@
const common = require('../common');
const net = require('net');

function barrier(count, cb) {
return function () {
if (--count == 0)
cb();
}
}

const server = net.createServer();
server.listen(0, common.mustCall(function() {
const port = server.address().port;
const conn = net.createConnection(port);
const connok = barrier(2, () => conn.resetAndDestroy());
conn.on('close', common.mustCall());
server.on('connection', (socket) => {
connok();
socket.on('error', common.expectsError({
code: 'ECONNRESET',
message: 'read ECONNRESET',
name: 'Error'
}));
server.close();
});
conn.resetAndDestroy();
conn.on('connect', connok);
}));

0 comments on commit 9af6703

Please sign in to comment.