Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(iroh-p2p): ensure providers stream closes #625

Merged
merged 4 commits into from
Dec 20, 2022
Merged

fix(iroh-p2p): ensure providers stream closes #625

merged 4 commits into from
Dec 20, 2022

Conversation

ramfox
Copy link
Contributor

@ramfox ramfox commented Dec 20, 2022

Also implements final p2p unit test to close n0-computer/beetle#99
Bug fix fixes the flakey dht tests n0-computer/beetle#74

Some notes:

  • Before we could get to fixing the flakiness, the test was hanging due to a bug in the Providers query manager. To summarize the issue: we weren't reacting to GetProvidersOk::FinishedWithNoAdditionalRecord events, which indicate no more records were found. Because this event was never handled, we never removed the Query from the providers Set, and so the stream of providers never closed, causing the test_dht test to hang.

  • To test reliability, I looped the test_dht 10 times for my initial commits, but removed the loop it once things stabilized.

  • I did a whole pass where the node emitted an event whenever the kbuckets were updated, and used that event to ensure everything was "in place" before attempting to StartProviding and fetch_providers_dht. However... this made things less reliable! So I removed it. It also would have potentially added a lot of "noise" to the network_event stream.

  • The only thing that actually stabilized the test (when attempting multiple loops)... was increasing the timeout by 1 second. I'm not sure how to feel about this.

  • In order to test the CancelListenForIdentify, I had to add a NetworkEvent that would emit after this cancellation happens. There was no other way I could think of that would allow us to inspect the node itself without breaking the borrow checker. This event would only ever be emitted due to a user action (manually attempting to lookup a peer), so I'm not concerned about it polluting the event space.

@ramfox ramfox added this to the v0.2.0 milestone Dec 20, 2022
@ramfox ramfox self-assigned this Dec 20, 2022
@ramfox ramfox added test c-p2p fix Fixes a bug labels Dec 20, 2022
@ramfox ramfox force-pushed the ramfox/p2p_test branch 3 times, most recently from 4c421b5 to 3d8a6d0 Compare December 20, 2022 07:23
@ramfox ramfox marked this pull request as ready for review December 20, 2022 07:27
dignifiedquire
dignifiedquire previously approved these changes Dec 20, 2022
Copy link
Contributor

@dignifiedquire dignifiedquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, nice catch

@ramfox ramfox merged commit bcf7bd3 into main Dec 20, 2022
@ramfox ramfox deleted the ramfox/p2p_test branch December 20, 2022 18:56
ppodolsky pushed a commit to izihawa/iroh that referenced this pull request Dec 20, 2022
* fix flakey `test_dht` test by extending timeout

* add test for  `CancelListenForIdentify`
ppodolsky pushed a commit to izihawa/iroh that referenced this pull request Dec 20, 2022
* fix flakey `test_dht` test by extending timeout

* add test for  `CancelListenForIdentify`
ppodolsky pushed a commit to izihawa/iroh that referenced this pull request Dec 28, 2022
* fix flakey `test_dht` test by extending timeout

* add test for  `CancelListenForIdentify`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Fixes a bug test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

p2p unit tests Fix flakey test_dht function
2 participants