cannot recover from disconnections #588
Comments
It seems like the heartbeats still work, backtraces from rbtrace:
suggests that the node was stopped and earlier lines suggest that recovery attempts do happen. Note that the consumers section was accidentally hidden in RabbitMQ 3.8.x management UI. This was addressed for 3.8.3. So you may be misinterpreting what you are seeing.
Also,
suggests that a TCP connection to the target host:port was refused. RabbitMQ logs all connection lifecycle events so when in doubt, consult RabbitMQ logs and/or take a traffic capture that can be analyzed with Wireshark.
@michaelklishin Please take a look at the screenshot below. My server opens 7 connections, but one of them has 0 channels. I have confirmed that the other 6 connections work fine. It seems that one of the connections recovered but its channel was not recreated correctly.
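The symptom described above (a connection that is open but has no channels) can also be checked for without the management UI. The following is a minimal sketch, assuming the JSON comes from the management plugin's GET /api/connections endpoint and that each connection object carries `name` and `channels` fields. The helper name and the sample data are made up for illustration.

```ruby
require "json"

# Returns the names of connections that report zero open channels.
# `body` is assumed to be the JSON array returned by the management
# plugin's GET /api/connections endpoint.
def connections_without_channels(body)
  JSON.parse(body)
      .select { |conn| conn.fetch("channels", 0).to_i.zero? }
      .map { |conn| conn["name"] }
end

# Made-up sample payload, trimmed to the two fields used here.
sample = <<~JSON
  [
    {"name": "10.0.0.5:41000 -> 10.0.0.9:5672", "channels": 1},
    {"name": "10.0.0.5:41002 -> 10.0.0.9:5672", "channels": 0}
  ]
JSON

puts connections_without_channels(sample)
```

Running a check like this periodically could help tell a connection that never reopened its channel apart from one that is merely idle.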
Enable debug logging for Bunny and watch RabbitMQ logs for channel exceptions.
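One way to follow this advice is to hand Bunny a standard Ruby logger set to DEBUG. This is a sketch and not the maintainer's exact recipe: the helper name is hypothetical, the recovery settings are illustrative values, and the commented-out lines assume the bunny gem is installed and a broker is reachable.

```ruby
require "logger"

# Builds connection options with a DEBUG-level logger (hypothetical helper).
# :logger, :automatically_recover and :network_recovery_interval are
# options accepted by Bunny.new; the values here are only illustrative.
def debug_connection_options(io = $stdout)
  logger = Logger.new(io)
  logger.level = Logger::DEBUG
  {
    logger: logger,
    automatically_recover: true,
    network_recovery_interval: 5.0
  }
end

options = debug_connection_options

# With the bunny gem installed and a reachable broker:
#   require "bunny"
#   connection = Bunny.new(ENV["RABBITMQ_URL"], options)
#   connection.start
```

With sneakers the connection is usually created by the library, so the logger would have to be supplied through its own configuration instead.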
@michaelklishin I recently ran into the same issue. Has there been any further progress on it?
@ccdredzik I cannot comment on a single-sentence problem definition. My recommendation to inspect server logs still stands, as do the Troubleshooting Networking recommendations: take a traffic capture and use Toxiproxy or something of that kind to try to reproduce what you're seeing. Bunny connection recovery does not do anything particularly sophisticated and the algorithm hasn't changed in years.
I have been fighting the same issue for a while (RabbitMQ cluster in Kubernetes + bunny/sneakers consumers).
@tomlobato did you find a solution?
There were at least five changes related to recovery since 2020:
@michaelklishin we are also facing the same error with the same infrastructure (Kubernetes + bunny + sneakers).
Error: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
Note: we are running 3 instances of RabbitMQ.
Also noted when it occurs,
Not yet, @blackjid.
I'm using sneakers, which uses bunny, to process messages.
When I restarted my rabbitmq-ha (running in Kubernetes) and waited until it was back up, I found that my RabbitMQ consumers no longer worked: no consumers were shown in the RabbitMQ management UI.
BTW, when I created a new bunny connection from inside the pod, it worked without any problems.
My server logged:
It's worth noting that nothing more was logged after 2020-01-05T05:04:39Z.
Then I set the log level to DEBUG, and it only logged:
Bunny 2.14.2, RabbitMQ High Availability (3 nodes) 3.7.15
No errors were found in the RabbitMQ log files.
I can't reproduce it by restarting RabbitMQ; it happens only occasionally. When my server continually outputs reconnection messages, it reconnects correctly.
The issue seems similar to #536, but that one has already been resolved in my version.
Any idea?