Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky test-dgram-exclusive-implicit-bind #10212

Closed
wants to merge 3 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Dec 10, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

test dgram

Description of change

test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

  • var -> const
  • use of common.mustCall() instead of exit listener + boolean

@Trott Trott added dgram Issues and PRs related to the dgram subsystem / UDP. test Issues and PRs related to the tests. labels Dec 10, 2016
@Trott Trott requested a review from santigimeno December 10, 2016 06:49
@nodejs-github-bot nodejs-github-bot added dont-land-on-v7.x test Issues and PRs related to the tests. labels Dec 10, 2016
@Trott
Copy link
Member Author

Trott commented Dec 10, 2016

Sample CI failure indicating this test was flaky:

https://ci.nodejs.org/job/node-test-commit-linux/6627/nodes=fedora24/console

not ok 1295 parallel/test-dgram-exclusive-implicit-bind
  ---
  duration_ms: 60.62
  severity: fail
  stack: |-
    timeout

@Trott
Copy link
Member Author

Trott commented Dec 10, 2016

@Trott Trott requested a review from cjihrig December 10, 2016 06:53
@Trott
Copy link
Member Author

Trott commented Dec 10, 2016

Could try a stress test, but I don't think this happens with enough frequency for that to be useful without a very large number of runs. Here's master on fedora24 run 9999 times. We'll see if it fails at all. If so, we can do another run with this PR and see if it comes up clean.

https://ci.nodejs.org/job/node-stress-single-test/1070/nodes=fedora24/console

@@ -96,4 +99,12 @@ if (process.env.BOUND === 'y') {
source.unref();
}

source.send(Buffer.from('abc'), 0, 3, common.PORT, '127.0.0.1');
function send() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could all of this just be simplified to:

setInterval(() => {
  source.send(Buffer.from('abc'), 0, 3, common.PORT, '127.0.0.1');
}, 1).unref();

I think you shouldn't have to worry about clearing the interval if you unref it. That would reduce some complexity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was concerned that competing 1ms interval timers on some operating systems could result in increased flakiness, but I didn't actually test it, so yeah, let's try that. :-D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't keep the timer-clearing logic, this happens:

Error: Not running
    at Socket._healthCheck (dgram.js:527:11)
    at Socket.send (dgram.js:347:8)
    at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
    at ontimeout (timers.js:365:14)
    at Timer.unrefdHandle (timers.js:471:5)
dgram.js:527
    throw new Error('Not running'); // error message from dgram_legacy.js
    ^

test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean
@Trott
Copy link
Member Author

Trott commented Dec 10, 2016

Updated to use setInterval() per suggestion from @cjihrig.

CI again: https://ci.nodejs.org/job/node-test-pull-request/5352/


source.on('close', function() {
clearTimeout(interval);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a previous comment I'm not sure this was a problem but can't you get rid of this as the interval is already unrefed? At least locally it works for me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove that line, the test still passes, but it produces a messy output that looks like it's failing:

$ ./node test/parallel/test-dgram-exclusive-implicit-bind.js
dgram.js:527
    throw new Error('Not running'); // error message from dgram_legacy.js
    ^

Error: Not running
    at Socket._healthCheck (dgram.js:527:11)
    at Socket.send (dgram.js:347:8)
    at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
    at ontimeout (timers.js:365:14)
    at Timer.unrefdHandle (timers.js:471:5)
dgram.js:527
    throw new Error('Not running'); // error message from dgram_legacy.js
    ^

Error: Not running
    at Socket._healthCheck (dgram.js:527:11)
    at Socket.send (dgram.js:347:8)
    at Timeout.setInterval [as _onTimeout] (/Users/trott/io.js/test/parallel/test-dgram-exclusive-implicit-bind.js:104:10)
    at ontimeout (timers.js:365:14)
    at Timer.unrefdHandle (timers.js:471:5)
$

I'd prefer the processes clean up after themselves and not generate irrelevant errors like that, even if the errors don't cause the test to fail, so I'd prefer to keep the clearInterval() call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood

Copy link
Member

@santigimeno santigimeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a suggestion / question

@Trott
Copy link
Member Author

Trott commented Dec 11, 2016

@italoacasas
Copy link
Contributor

Landed 3ddbea6

italoacasas pushed a commit that referenced this pull request Dec 13, 2016
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: #10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
AnnaMag pushed a commit to AnnaMag/node that referenced this pull request Dec 13, 2016
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: nodejs#10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
evanlucas pushed a commit that referenced this pull request Dec 14, 2016
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: #10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
@italoacasas italoacasas mentioned this pull request Dec 15, 2016
MylesBorins pushed a commit that referenced this pull request Jan 22, 2017
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: #10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jan 24, 2017
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: #10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Jan 24, 2017
MylesBorins pushed a commit that referenced this pull request Jan 31, 2017
test-dgram-exclusive-implicit-bind is written assuming that dgram
messages are received with 100% reliability. While missing a dgram
message sent to localhost is rare, we do see it as evidenced by CI
failures from time to time.

The test has been rewritten to send dgram messages over and over until
the test requirements have been met.

Additional incidental refactoring includes:

* var -> const
* use of common.mustCall() instead of exit listener + boolean

PR-URL: #10212
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Italo A. Casas <[email protected]>
@Trott Trott deleted the fix-implicit branch January 13, 2022 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dgram Issues and PRs related to the dgram subsystem / UDP. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants