Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No failure when exceeding recovery_attempts #549

Closed
robinroestenburg opened this issue Apr 11, 2018 · 7 comments
Closed

No failure when exceeding recovery_attempts #549

robinroestenburg opened this issue Apr 11, 2018 · 7 comments

Comments

@robinroestenburg
Copy link

Reproduction scenario:

  • Setup a Bunny session using the following configuration parameters:
    • automatically_recover: true
    • recovery_attempts: 5
    • network_recovery_internal: 5
  • Send a couple of messages to RabbitMQ using this session.
  • Stop RabbitMQ server.
  • Bunny starts the network recovery and retries the connection for 5 times. After that it gives up with the message "Ran out of recovery attempts".
    No exception is raised by Bunny.
  • Send more messages to RabbitMQ.
    No exception is generated by Bunny.

Expectation:
Error/exception is generated when either:

  • network recovery fails after the maximum number of attempts, or
  • messages are sent using a Bunny session that ran out of retry attempts.

Cause
See the following lines: https://github.com/ruby-amqp/bunny/blob/master/lib/bunny/session.rb#L745-L756

If a network failure happens the exception is caught and the retry mechanism kicks in. When the maximum number of recovery attempts has been reached, the should_retry_recovery? method will return false and Bunny will swallow the exception.

This is a problem, because:

  • the client code is not informed (through an exception or maybe a special status on the session object) that a problem occurred, and
  • any future messages will also be silently dropped.

The original commit message for the recovery_attempts feature can be found here: fff7f44
It states:

:recovery_attempts options sets the maximum number of attempts Bunny
would make to recover from network failures before giving up and
throwing the original exception
. :recovery_attempts is effective only
if :automatically_recover option is true.

It is not re-raising the original exception right now. I tried to add it locally, but that didn't seem to work. Maybe you have some ideas?

If you need any more information please let me know.

@michaelklishin
Copy link
Member

Feel free to submit a PR.

Connections that are out of retry attempts should transition to the closed state, which will automatically lead to any operation on it throw an exception.

@robinroestenburg
Copy link
Author

@michaelklishin ok, sounds good. I will work on a PR.

@michaelklishin
Copy link
Member

Note that connection recovery happens in the I/O loop thread. But setting an ivar that would stop recovery attempts from happening should be enough and only the I/O loop thread should mutate it (at least in theory).

@arlandism
Copy link
Contributor

Hi @robinroestenburg! Did you have any luck in a PR for this? If not, I can take a look. Thanks

@robinroestenburg
Copy link
Author

@arlandism unfortunately I have not had the time for it yet, using a hack in our code for now (reading @recovery_attempts variable from the Bunny instance). Feel free to give it a go.

arlandism added a commit to arlandism/bunny that referenced this issue Jul 27, 2018
…ery attempts

are exhausted

When the RabbitMQ server goes down, the session attempts to continually
reconnect, only stopping once the recovery_attempts threshold is met (or
forever is this parameter is not provided). In the former case, there
are no exceptions raised to indicate to clients that new messages are
not being processed. This patch provides that by initiating close
from the session, which will close related channels and the reader
loop. Then, an exception will be thrown when subsequent operations are
performed.
michaelklishin added a commit that referenced this issue Jul 28, 2018
Issue #549: Move connection to closed state when recovery attempts fail
@michaelklishin
Copy link
Member

Resolved in #556 contributed by @arlandism.

@robinroestenburg
Copy link
Author

Thanks @arlandism 👍

michaelklishin added a commit that referenced this issue Sep 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants