I discovered a possible race condition in Channel#wait_for_confirms which can lead to wait_for_confirms returning before RabbitMQ has acknowledged (or nack'ed) the published message. This actually hit us in production.
Suppose the following happens:
1. We have a channel in "confirm select" mode.
2. We do a basic_publish to that channel. Some time passes, say >0.5s.
3. During that time, an ack is received by Bunny and handled in handle_ack_or_nack. @unconfirmed_set is emptied and true is pushed to @confirms_continuations.
4. We do a second basic_publish to that same channel.
5. Immediately afterwards we call wait_for_confirms to wait for all confirmations. It sees that @unconfirmed_set has something in it and calls @confirms_continuations.poll, but that returns immediately and unexpectedly because of the true value pushed to it at step 3. We haven't received an ack or nack for the second publish yet.
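To make the sequence concrete, here is a minimal sketch that replays it against a simplified model. The `ConfirmModel` class is hypothetical and is not Bunny's code: it only mimics the two pieces of state named above (a set of unconfirmed tags and a queue of continuation tokens), and it assumes a token is pushed whenever an ack empties the set, as described in step 3.

```ruby
require "set"

# Hypothetical, simplified model of the bookkeeping described in the report.
# This is NOT Bunny's implementation, only an illustration of the race.
class ConfirmModel
  attr_reader :unconfirmed_set

  def initialize
    @unconfirmed_set = Set.new
    @confirms_continuations = Queue.new
    @next_tag = 0
  end

  # Stand-in for basic_publish on a channel in confirm mode.
  def publish
    @next_tag += 1
    @unconfirmed_set << @next_tag
    @next_tag
  end

  # Stand-in for handle_ack_or_nack: the broker confirmed `tag`.
  def handle_ack(tag)
    @unconfirmed_set.delete(tag)
    @confirms_continuations << true if @unconfirmed_set.empty?
  end

  # Stand-in for wait_for_confirms: takes one token if anything is pending.
  def wait_for_confirms
    return true if @unconfirmed_set.empty?
    @confirms_continuations.pop # returns at once if a stale token is queued
  end
end

model = ConfirmModel.new
first = model.publish               # step 2
model.handle_ack(first)             # step 3: ack arrives, token is queued
second = model.publish              # step 4
returned = model.wait_for_confirms  # step 5: consumes the stale token
puts "wait returned: #{returned.inspect}"
puts "second publish still unconfirmed: #{model.unconfirmed_set.include?(second)}"
```

In this model the wait returns `true` even though the second tag is still in the unconfirmed set, which matches the behaviour described in step 5.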
As I understand the documentation, doing a bunch of publishes and then calling wait_for_confirms is a completely valid use case. Unless I'm missing something obvious, though, the code makes it pretty evident that each ack that arrives pushes another true value to @confirms_continuations, and that the same number of subsequent wait_for_confirms calls will succeed immediately, no matter how many messages are in @unconfirmed_set.
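For contrast, here is a sketch of a wait that treats a queued token only as a wake-up hint and re-checks the unconfirmed set before returning. It is an illustration under assumptions, not a claim about how Bunny does or should implement this: the `RecheckingConfirmModel` class and its method bodies are hypothetical, and the sketch has no timeout handling.

```ruby
require "set"

# Hypothetical model, not Bunny's implementation.
class RecheckingConfirmModel
  def initialize
    @lock = Mutex.new
    @unconfirmed_set = Set.new
    @confirms_continuations = Queue.new
    @next_tag = 0
  end

  # Stand-in for basic_publish on a channel in confirm mode.
  def publish
    @lock.synchronize do
      @next_tag += 1
      @unconfirmed_set << @next_tag
      @next_tag
    end
  end

  # Stand-in for handle_ack_or_nack, called from another thread.
  def handle_ack(tag)
    @lock.synchronize do
      @unconfirmed_set.delete(tag)
      @confirms_continuations << true if @unconfirmed_set.empty?
    end
  end

  def unconfirmed?
    @lock.synchronize { !@unconfirmed_set.empty? }
  end

  # Keep consuming tokens until the set is really empty, so a stale
  # token from an earlier ack cannot end the wait early.
  def wait_for_confirms
    @confirms_continuations.pop while unconfirmed?
    true
  end
end

model = RecheckingConfirmModel.new
model.handle_ack(model.publish)   # leaves a stale token behind
second = model.publish
acker = Thread.new { sleep 0.1; model.handle_ack(second) } # simulated late ack
model.wait_for_confirms           # skips the stale token, waits for the real ack
acker.join
puts "unconfirmed after wait: #{model.unconfirmed?}"
```

With the loop in place, the stale token from the first ack is consumed and the call keeps waiting until the second tag has actually been removed from the set.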
In our code we actually do a publish and wait_for_confirms immediately after that, but the race condition can still occur (and it did).
I've also written a small script that demonstrates the issue.
Tested on: Bunny 2.4.0 and Ruby 2.1.2 on macOS