Fix race condition in Global Distribution packet retransmission #1689
0a6dcb2 to 43c6d87
pamparam, don't like any concurrency in this quantum world^W^W^W any tests.
@@ -76,17 +76,20 @@ maybe_reroute({From, To, Acc0, Packet} = FPacket) ->
     Acc = maybe_initialize_metadata(Acc0),
     LocalHost = opt(local_host),
     GlobalHost = opt(global_host),
-    case lookup_recipients_host(To, LocalHost, GlobalHost) of
+    case lookup_recipients_host(get_metadata(Acc, target_host_override, undefined),
Can we have it on a separate line?
TargetHost = get_metadata(Acc, target_host_override, undefined),
case lookup_recipients_host(TargetHost, To, LocalHost, GlobalHost) of
It will still be two lines long :D
target_host_override should be documented; I have no idea what it is for just by looking at the code.
and why it can be undefined should be described too.
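For readers following the thread, here is a minimal sketch of one way a four-argument lookup could treat the override. It is not the PR's actual implementation. It assumes that `undefined` means "no override stored in the accumulator" and that a present override is used directly. The module name `target_host_sketch` and the helper `lookup_by_mapping/3` are hypothetical stand-ins introduced only for illustration.

```erlang
-module(target_host_sketch).
-export([lookup_recipients_host/4]).

%% Sketch only: a four-argument lookup in the shape suggested in the review.
%% Assumption: `undefined` means no override was stored in the accumulator.
-spec lookup_recipients_host(TargetHost :: binary() | undefined,
                             To :: {binary(), binary(), binary()},
                             LocalHost :: binary(),
                             GlobalHost :: binary()) ->
    {ok, binary()} | error.
lookup_recipients_host(undefined, To, LocalHost, GlobalHost) ->
    %% No override: fall back to the regular lookup.
    lookup_by_mapping(To, LocalHost, GlobalHost);
lookup_recipients_host(TargetHost, _To, _LocalHost, _GlobalHost) ->
    %% Override present: use it directly (assumed behaviour).
    {ok, TargetHost}.

%% Hypothetical placeholder for the existing three-argument lookup.
%% It resolves recipients on the global host to the local host and
%% reports everything else as unknown; the real lookup is not shown here.
lookup_by_mapping({_User, Server, _Resource}, LocalHost, GlobalHost)
  when Server =:= GlobalHost ->
    {ok, LocalHost};
lookup_by_mapping(_To, _LocalHost, _GlobalHost) ->
    error.
```

Binding the metadata value to `TargetHost` before the `case`, as suggested above, keeps the call short and gives the `undefined` default an obvious place for a comment.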
comments!
43c6d87 to 59739cf
good
This PR fixes #1683 and obsoletes #1684
The race condition was provoked by the test_in_order_messages_on_multiple_connections_with_bounce test case. It manifested under the following conditions:
known_recipient hook, which in turn triggered retransmission of stored items

I've decided not to create a new test case or make the existing one more predictable: with 100 messages sent it reproduces the issue fairly frequently, and there is very little chance it won't test what it is supposed to. By that I mean the edge case where, e.g., all 100 messages hit a window in which the recipient mapping is available due to test-c2s concurrency.
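As a rough back-of-the-envelope illustration of why that edge case is unlikely: assume, purely for the sake of the estimate, that each message independently lands in that window with probability $p$. Neither the independence nor any value of $p$ comes from the PR; both are assumptions.

$$P(\text{all 100 messages hit the window}) = p^{100}, \qquad p = 0.9 \;\Rightarrow\; 0.9^{100} \approx 2.7 \times 10^{-5}$$

Even with a generously high per-message probability, the chance of a run that never exercises the retransmission path stays very small under these assumptions.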
Note: This version was built on Travis about 6 times with the full set of jobs, and test_in_order_messages_on_multiple_connections_with_bounce was never the reason for a crash.