Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot recover from disconnections #588

Closed
FX-HAO opened this issue Jan 13, 2020 · 12 comments
Closed

cannot recover from disconnections #588

FX-HAO opened this issue Jan 13, 2020 · 12 comments

Comments

@FX-HAO
Copy link

FX-HAO commented Jan 13, 2020

I'm using sneakers which uses bunny to process messages.
When I restarted my rabbitmq-ha(running in Kubernetes) and waited for a while until it was up, then I found that my rabbitmq consumers didn't work, no consumers shown in rabbitmq management.
image

BTW, I created a new bunny connection in the pod, the connection worked without any problems.

My server logged:

2020-01-05T04:44:56Z p-17 t-gskqnmpb4 WARN: Could not establish TCP connection to qa-rabbitmq:5672: Connection refused - connect(2) for 172.19.3.18:5672
2020-01-05T04:44:56Z p-17 t-gskqnmpb4 WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:44:56Z p-17 t-gskqnmpb4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:44:57Z p-17 t-gskq0uc94 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:44:58Z p-17 t-gskqnmpnw WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:44:59Z p-17 t-gskqoy2c8 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:01Z p-17 t-gskqnmpb4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:02Z p-17 t-gskqnlgdw WARN: Could not establish TCP connection to qa-rabbitmq:5672: Bunny::ClientTimeout
2020-01-05T04:45:02Z p-17 t-gskqnlgdw WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:02Z p-17 t-gskqnlgdw WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:03Z p-17 t-gskqoy37c WARN: Could not establish TCP connection to qa-rabbitmq:5672: Bunny::ClientTimeout
2020-01-05T04:45:03Z p-17 t-gskqoy37c WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:03Z p-17 t-gskqoy37c WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:07Z p-17 t-gskqnlgdw WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:08Z p-17 t-gskqoy37c WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:09Z p-17 t-gskqmlofs WARN: Could not establish TCP connection to qa-rabbitmq:5672: Bunny::ClientTimeout
2020-01-05T04:45:09Z p-17 t-gskqmlofs WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:09Z p-17 t-gskqmlofs WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:13Z p-17 t-gskq0uc94 WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:13Z p-17 t-gskq0uc94 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:13Z p-17 t-gskqnmpnw WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:13Z p-17 t-gskqnmpnw WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:14Z p-17 t-gskqoy2c8 WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:14Z p-17 t-gskqoy2c8 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:14Z p-17 t-gskqmlofs WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:16Z p-17 t-gskqnmpb4 WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:16Z p-17 t-gskqnmpb4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:23Z p-17 t-gskq0uc94 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:23Z p-17 t-gskqoy37c WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T04:45:23Z p-17 t-gskqoy37c WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:23Z p-17 t-gskqnmpnw WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:24Z p-17 t-gskqoy2c8 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:26Z p-17 t-gskqnmpb4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:28Z p-17 t-gskqkwnqo ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: Backtrace: 
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T04:45:28Z p-17 t-gskqkwnqo WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: Backtrace: 
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T04:45:29Z p-17 t-gskqkwwi8 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:29Z p-17 t-gskqku3zw ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: Backtrace: 
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T04:45:29Z p-17 t-gskqku3zw WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:31Z p-17 t-gskqktfr4 ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: Backtrace: 
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T04:45:31Z p-17 t-gskqktfr4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:33Z p-17 t-gskqoy37c WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:38Z p-17 t-gskqkwnqo WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:38Z p-17 t-gskqg1ssw ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: Backtrace: 
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T04:45:38Z p-17 t-gskqg1ssw WARN: Will recover from a network failure (no retry limit)...
2020-01-05T04:45:39Z p-17 t-gskqkwwi8 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:39Z p-17 t-gskqku3zw WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:41Z p-17 t-gskqktfr4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T04:45:48Z p-17 t-gskqg1ssw WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:03:51Z p-17 t-gskqdtzy4 WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqdtzy4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskqdrf1o WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqdrf1o WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskq9sk2g WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqdxglc WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqdmla4 WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskq9sk2g WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskqdxglc WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskqdmla4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskqfntas WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqg9ht4 WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:03:51Z p-17 t-gskqfntas WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:03:51Z p-17 t-gskqg9ht4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:01Z p-17 t-gskqdtzy4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskqdrf1o WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskq9sk2g WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskqdmla4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskqdxglc WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskqg9ht4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:01Z p-17 t-gskqfntas WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:16Z p-17 t-gskqg9ht4 WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T05:04:16Z p-17 t-gskqg9ht4 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:16Z p-17 t-gskqfntas WARN: TCP connection failed, reconnecting in 5.0 seconds
2020-01-05T05:04:16Z p-17 t-gskqfntas WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:26Z p-17 t-gskqg9ht4 WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:26Z p-17 t-gskqfntas WARN: Retrying connection on next host in line: qa-rabbitmq:5672
2020-01-05T05:04:32Z p-17 t-osc6a94ts ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: Backtrace: 
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T05:04:32Z p-17 t-osc6a94ts WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:32Z p-17 t-osc924mt0 ERROR: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: Exception in the reader loop: Timeout::Error: IO timeout when reading 7 bytes
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: Backtrace: 
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:65:in `rescue in read_fully'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/cruby/socket.rb:53:in `read_fully'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:239:in `read_fully'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/transport.rb:261:in `read_next_frame'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:73:in `run_once'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:39:in `block in run_loop'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `loop'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: 	/usr/local/bundle/gems/bunny-2.14.2/lib/bunny/reader_loop.rb:36:in `run_loop'
2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:39Z p-17 t-osccdzpn8 WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:04:39Z p-17 t-osccdzobg WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')
2020-01-05T05:04:39Z p-17 t-osc56ajgk WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')

It's worth to note that no more logs outputted since 2020-01-05T05:04:39Z.
Then I set log level to DEBUG, it only logged:

2020-01-13T07:08:34Z p-19237 t-gmtooe34o DEBUG: Sending a heartbeat, last activity time: 2020-01-10 17:42:50 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtora2cs DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoohfus DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoqkhug DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:10:39 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtojg27k DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:09:58 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtojfz14 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:57:50 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoobep0 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:52:55 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtooe34o DEBUG: Sending a heartbeat, last activity time: 2020-01-10 17:42:50 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtora2cs DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoohfus DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoqkhug DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:10:39 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtojg27k DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:09:58 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoobep0 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:52:55 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtojfz14 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:57:50 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtooe34o DEBUG: Sending a heartbeat, last activity time: 2020-01-10 17:42:50 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtora2cs DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoohfus DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:34Z p-19237 t-gmtoqkhug DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:10:39 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtoobep0 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:52:55 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtojg27k DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:09:58 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtojfz14 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:57:50 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtooe34o DEBUG: Sending a heartbeat, last activity time: 2020-01-10 17:42:50 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtora2cs DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtoohfus DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:08:29 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtoqkhug DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:10:39 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtoobep0 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:52:55 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtojg27k DEBUG: Sending a heartbeat, last activity time: 2020-01-09 00:09:58 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtojfz14 DEBUG: Sending a heartbeat, last activity time: 2020-01-13 10:57:50 +0800, interval (s): 0.4
2020-01-13T07:08:35Z p-19237 t-gmtooe34o DEBUG: Sending a heartbeat, last activity time: 2020-01-10 17:42:50 +0800, interval (s): 0.4
...
  • Which Bunny and RabbitMQ versions are used
    Bunny 2.14.2, RabbitMQ High Availability(3 nodes) 3.7.15
  • Recent RabbitMQ log file contents
    No errors found in RabbimtMQ log files.
  • Steps to reproduce or a failing test case
    I can't reproduce it with restarting RabbitMQ, it happens occasionally. When my server continually outputs re-connection messages, it reconnects correctly.

The issue seems to be similar to #536, but it has been resolved in my version.

Any idea?

@FX-HAO
Copy link
Author

FX-HAO commented Jan 13, 2020

It seems like the heartbeats still work, backtraces from rbtrace:

Bunny::HeartbeatSender#beat
  Time.now
    Time#initialize
      Integer#+ <0.000030>
    Time#initialize <0.000143>
  Time.now <0.000671>
  Time#plus_with_duration
    ActiveSupport::Duration.===
      Kernel#is_a? <0.000007>
    ActiveSupport::Duration.=== <0.000028>
    Time#+
      Float#to_r
        Integer#div <0.000028>
        Integer#div <0.000030>
      Float#to_r <0.000231>
      Rational#* <0.000060>
      Integer#+
        Rational#coerce <0.000011>
        Rational#+ <0.000011>
      Integer#+ <0.000658>
    Time#+ <0.001295>
  Time#plus_with_duration <0.001350>
  Comparable#>
    Time#compare_with_coercion
      Kernel#class <0.000009>
      Time#<=>
        Integer#<=>
          Rational#coerce <0.000004>
          Rational#<=> <0.000006>
        Integer#<=> <0.000021>
      Time#<=> <0.000030>
    Time#compare_with_coercion <0.000539>
  Comparable#> <0.001090>
  Logger#debug
    ServerEngine::DaemonLogger#add <0.000004>
  Logger#debug <0.000025>
  AMQ::Protocol::HeartbeatFrame.encode
    AMQ::Protocol::HeartbeatFrame.encode
      AMQ::Protocol::HeartbeatFrame.encode
        AMQ::Protocol::HeartbeatFrame.encode_to_array
          ActiveSupport::IncludeWithRange#include?
            Kernel#is_a? <0.000006>
            Range#include?
              Integer#<=> <0.000049>
              Integer#<=> <0.000056>
            Range#include? <0.000184>
          ActiveSupport::IncludeWithRange#include? <0.000252>
          Kernel#nil? <0.000005>
          AMQ::Protocol::HeartbeatFrame.find_type
            Module#=== <0.000006>
            Integer#==
              BasicObject#== <0.000534>
            Integer#== <0.000560>
            Hash#has_key? <0.000006>
          AMQ::Protocol::HeartbeatFrame.find_type <0.000622>
          String#bytesize <0.000005>
          Array#pack <0.000083>
          AMQ::Protocol::HeartbeatFrame.encoded_payload
            Kernel#respond_to? <0.000011>
            String#encoding <0.000005>
            Encoding#name <0.000011>
            Kernel#dup
              Kernel#initialize_dup
                String#initialize_copy <0.000005>
              Kernel#initialize_dup <0.000499>
            Kernel#dup <0.000516>
            String#force_encoding <0.000007>
          AMQ::Protocol::HeartbeatFrame.encoded_payload <0.001062>
        AMQ::Protocol::HeartbeatFrame.encode_to_array <0.002683>
        Array#join <0.000052>
      AMQ::Protocol::HeartbeatFrame.encode <0.002797>
    AMQ::Protocol::HeartbeatFrame.encode <0.002853>
  AMQ::Protocol::HeartbeatFrame.encode <0.002912>
  Bunny::Transport#write_without_timeout
    MonitorMixin#mon_synchronize
      MonitorMixin#mon_enter
        Thread.current <0.000054>
        Thread::Mutex#lock <0.000039>
        Thread.current <0.000016>
      MonitorMixin#mon_enter <0.000448>
      IO#write <0.000054>
      Bunny::HeartbeatSender#beat
        Time.now
          Time#initialize
            Integer#+ <0.000044>
          Time#initialize <0.000125>
        Time.now <0.000186>
        Time#plus_with_duration
          ActiveSupport::Duration.===
            Kernel#is_a? <0.000027>
          ActiveSupport::Duration.=== <0.000074>
          Time#+
            Float#to_r
              Integer#div <0.000083>
              Integer#div <0.000035>
            Float#to_r <0.000238>
            Rational#* <0.000046>
            Integer#+
              Rational#coerce <0.000040>
              Rational#+ <0.000033>
            Integer#+ <0.000204>
          Time#+ <0.001280>
        Time#plus_with_duration <0.001451>
        Comparable#>
          Time#compare_with_coercion
            Kernel#class <0.000005>
            Time#<=>
              Integer#<=>
                Rational#coerce <0.000005>
                Rational#<=> <0.000047>
              Integer#<=> <0.000126>
            Time#<=> <0.000213>
          Time#compare_with_coercion <0.000273>
        Comparable#> <0.000323>
        Logger#debug
          ServerEngine::DaemonLogger#add <0.000051>
        Logger#debug <0.004315>
        AMQ::Protocol::HeartbeatFrame.encode
          AMQ::Protocol::HeartbeatFrame.encode
            AMQ::Protocol::HeartbeatFrame.encode
              AMQ::Protocol::HeartbeatFrame.encode_to_array
                ActiveSupport::IncludeWithRange#include?
                  Kernel#is_a? <0.000006>
                  Range#include?
                    Integer#<=> <0.000005>
                    Integer#<=> <0.000519>
                  Range#include? <0.000542>
                ActiveSupport::IncludeWithRange#include? <0.000564>
                Kernel#nil? <0.000005>
                AMQ::Protocol::HeartbeatFrame.find_type
                  Module#=== <0.000006>
                  Integer#==
                    BasicObject#== <0.000005>
                  Integer#== <0.000017>
                  Hash#has_key? <0.000053>
                AMQ::Protocol::HeartbeatFrame.find_type <0.000158>
                String#bytesize <0.000027>
                Array#pack <0.000047>
                AMQ::Protocol::HeartbeatFrame.encoded_payload
                  Kernel#respond_to? <0.000612>
                  String#encoding <0.000006>
                  Encoding#name <0.000009>
                  Kernel#dup
                    Kernel#initialize_dup
                      String#initialize_copy <0.000005>
                    Kernel#initialize_dup <0.000014>
                  Kernel#dup <0.000025>
                  String#force_encoding <0.000506>
                AMQ::Protocol::HeartbeatFrame.encoded_payload <0.001231>
              AMQ::Protocol::HeartbeatFrame.encode_to_array <0.002208>
              Array#join <0.000009>
            AMQ::Protocol::HeartbeatFrame.encode <0.002232>
          AMQ::Protocol::HeartbeatFrame.encode <0.002242>
        AMQ::Protocol::HeartbeatFrame.encode <0.002254>
        Bunny::Transport#write_without_timeout
          MonitorMixin#mon_synchronize
            MonitorMixin#mon_enter
              Thread.current <0.000009>
              Thread::Mutex#lock <0.000005>
              Thread.current <0.000005>
            MonitorMixin#mon_enter <0.000611>
            IO#write <0.000048>
            MonitorMixin#mon_exit
              MonitorMixin#mon_check_owner
                Thread.current <0.000103>
              MonitorMixin#mon_check_owner <0.000128>
              Thread::Mutex#unlock <0.000006>
            MonitorMixin#mon_exit <0.000352>
          MonitorMixin#mon_synchronize <0.001054>
          IO#flush <0.000042>
        Bunny::Transport#write_without_timeout <0.001278>
      Bunny::HeartbeatSender#beat <0.010129>
      Kernel#sleep <0.000079>
      MonitorMixin#mon_exit
        MonitorMixin#mon_check_owner
          Thread.current <0.000054>
        MonitorMixin#mon_check_owner <0.000137>
        Thread::Mutex#unlock <0.000037>
      MonitorMixin#mon_exit <0.000279>
    MonitorMixin#mon_synchronize <0.011336>
    IO#flush <0.000024>
  Bunny::Transport#write_without_timeout <0.011635>
Bunny::HeartbeatSender#beat <0.018483>

Kernel#sleep <0.010833>

Array#shift <0.000003>

Thread::ConditionVariable#wait
  Thread::Mutex#sleep <0.030343>
  ServerEngine::ProcessManager::Target#heartbeat!
    IO#write <0.000027>
  ServerEngine::ProcessManager::Target#heartbeat! <0.000038>
  Kernel#sleep <0.209858>
  Bunny::HeartbeatSender#beat
    Time.now
      Time#initialize
        Integer#+ <0.000005>
      Time#initialize <0.000014>
    Time.now <0.000029>
    Time#plus_with_duration
      ActiveSupport::Duration.===
        Kernel#is_a? <0.000007>
      ActiveSupport::Duration.=== <0.000014>
      Time#+
        Float#to_r
          Integer#div <0.000031>
          Integer#div <0.000034>
        Float#to_r <0.000197>
        Rational#* <0.000042>
        Integer#+
          Rational#coerce <0.000035>
          Rational#+ <0.000063>
        Integer#+ <0.000315>
      Time#+ <0.000729>
    Time#plus_with_duration <0.000917>
    Comparable#>
      Time#compare_with_coercion
        Kernel#class <0.000045>
        Time#<=>
          Integer#<=>
            Rational#coerce <0.000049>
            Rational#<=> <0.000051>
          Integer#<=> <0.000261>
        Time#<=> <0.000376>
      Time#compare_with_coercion <0.000603>
    Comparable#> <0.001355>
    Logger#debug
      ServerEngine::DaemonLogger#add <0.000005>
    Logger#debug <0.000016>
    AMQ::Protocol::HeartbeatFrame.encode
      AMQ::Protocol::HeartbeatFrame.encode
        AMQ::Protocol::HeartbeatFrame.encode
          AMQ::Protocol::HeartbeatFrame.encode_to_array
            ActiveSupport::IncludeWithRange#include?
              Kernel#is_a? <0.000005>
              Range#include?
                Integer#<=> <0.000038>
                Integer#<=> <0.000063>
              Range#include? <0.000245>
            ActiveSupport::IncludeWithRange#include? <0.000376>
            Kernel#nil? <0.000134>
            AMQ::Protocol::HeartbeatFrame.find_type
              Module#=== <0.000048>
              Integer#==
                BasicObject#== <0.000040>
              Integer#== <0.000131>
              Hash#has_key? <0.000042>
            AMQ::Protocol::HeartbeatFrame.find_type <0.000406>
            String#bytesize <0.000040>
            Array#pack <0.000049>
            AMQ::Protocol::HeartbeatFrame.encoded_payload
              Kernel#respond_to? <0.000040>
              String#encoding <0.000045>
              Encoding#name <0.000107>
              Kernel#dup
                Kernel#initialize_dup
                  String#initialize_copy <0.000052>
                Kernel#initialize_dup <0.000126>
              Kernel#dup <0.000223>
              String#force_encoding <0.000046>
            AMQ::Protocol::HeartbeatFrame.encoded_payload <0.000714>
          AMQ::Protocol::HeartbeatFrame.encode_to_array <0.002057>
          Array#join <0.000042>
        AMQ::Protocol::HeartbeatFrame.encode <0.002187>
      AMQ::Protocol::HeartbeatFrame.encode <0.002229>
    AMQ::Protocol::HeartbeatFrame.encode <0.002279>
    Bunny::Transport#write_without_timeout
      MonitorMixin#mon_synchronize
        MonitorMixin#mon_enter
          Thread.current <0.000039>
          Thread::Mutex#lock <0.000053>
          Thread.current <0.000049>
        MonitorMixin#mon_enter <0.000349>
        IO#write <0.000099>
        MonitorMixin#mon_exit
          MonitorMixin#mon_check_owner
            Thread.current <0.000047>
          MonitorMixin#mon_check_owner <0.000131>
          Thread::Mutex#unlock <0.000039>
        MonitorMixin#mon_exit <0.000319>
      MonitorMixin#mon_synchronize <0.000973>
      IO#flush <0.000055>
    Bunny::Transport#write_without_timeout <0.001182>
  Bunny::HeartbeatSender#beat <0.006459>
  Kernel#sleep <0.057976>
  Kernel#loop
    String#bytesize <0.000005>
    IO#__read_nonblock <0.000018>
    String#bytesize <0.000004>
  Kernel#loop <0.000057>
Thread::ConditionVariable#wait <0.304869>

AMQ::Protocol::Frame.decode_header
  String#== <0.000046>
  String#unpack <0.000139>
AMQ::Protocol::Frame.decode_header <0.000339>

Bunny::Transport#read_fully
  Bunny::Socket#read_fully
    Kernel#loop
      String#bytesize <0.000043>
      IO#__read_nonblock <0.000057>
      String#bytesize <0.000497>
    Kernel#loop <0.000688>
  Bunny::Socket#read_fully <0.000725>
Bunny::Transport#read_fully <0.000827>

String#bytesize <0.000004>

AMQ::Protocol::Frame.new
  AMQ::Protocol::Frame.find_type
    Module#=== <0.000003>
    Hash#has_key? <0.000064>
  AMQ::Protocol::Frame.find_type <0.000124>
  AMQ::Protocol::HeartbeatFrame.new
    AMQ::Protocol::FrameSubclass#initialize <0.000034>
  AMQ::Protocol::HeartbeatFrame.new <0.000157>
AMQ::Protocol::Frame.new <0.000370>

Kernel#is_a? <0.000032>

Thread::Mutex#synchronize <0.000030>

Bunny::ReaderLoop#run_once
  Bunny::Transport#read_next_frame
    Bunny::Transport#read_fully
      Bunny::Socket#read_fully
        Kernel#loop
          String#bytesize <0.000005>
          IO#__read_nonblock
            SystemCallError#initialize
              Exception#initialize <0.000026>
            SystemCallError#initialize <0.000051>
            Exception#exception <0.000030>
            Exception#backtrace <0.000025>
          IO#__read_nonblock <0.000309>
        Kernel#loop <0.000369>
      Bunny::Socket#read_fully <0.000450>
      Module#=== <0.000029>
      Errno::EAGAIN.=== <0.000004>
      IO.select <0.003731>
      Kernel#loop
        String#bytesize <0.000003>
        IO#__read_nonblock <0.000012>
        String#bytesize <0.000004>
      Kernel#loop <0.000041>
    Bunny::Transport#read_fully <0.004507>
  Bunny::Transport#read_next_frame <0.004546>
  AMQ::Protocol::Frame.decode_header
    String#== <0.000004>
    String#unpack <0.000034>
  AMQ::Protocol::Frame.decode_header <0.000118>
  Bunny::Transport#read_fully
    Bunny::Socket#read_fully
      Kernel#loop
        String#bytesize <0.000029>
        IO#__read_nonblock <0.000061>
        String#bytesize <0.000037>
      Kernel#loop <0.000199>
    Bunny::Socket#read_fully <0.000288>
  Bunny::Transport#read_fully <0.000335>
  String#bytesize <0.000026>
  AMQ::Protocol::Frame.new
    AMQ::Protocol::Frame.find_type
      Module#=== <0.000031>
      Hash#has_key? <0.000067>
    AMQ::Protocol::Frame.find_type <0.000176>
    AMQ::Protocol::HeartbeatFrame.new
      AMQ::Protocol::FrameSubclass#initialize <0.000010>
    AMQ::Protocol::HeartbeatFrame.new <0.000455>
  AMQ::Protocol::Frame.new <0.000682>
Bunny::ReaderLoop#run_once <0.005968>

Kernel#is_a? <0.000005>

Thread::Mutex#synchronize <0.000006>

@michaelklishin
Copy link
Member

2020-01-05T05:04:32Z p-17 t-osc924mt0 WARN: Will recover from a network failure (no retry limit)...
2020-01-05T05:04:39Z p-17 t-osccdzpn8 WARN: Recovering from connection.close (CONNECTION_FORCED - broker forced connection closure with reason 'shutdown')

suggests that the node was stopped and earlier lines suggest that recovery attempts do happen.

Note that the consumers section was accidentally hidden in RabbitMQ 3.8.x management UI. This was addressed for 3.8.3. So you may be misinterpreting what you are seeing.
Use rabbitmqctl list_connections and connection data from your monitoring system.

@michaelklishin
Copy link
Member

Also,

2020-01-05T04:44:56Z p-17 t-gskqnmpb4 WARN: Could not establish TCP connection to qa-rabbitmq:5672: Connection refused - connect(2) for 172.19.3.18:5672

suggests that a TCP connection to the target host:port was refused. RabbitMQ logs all connection lifecycle events so when in doubt, consult RabbitMQ logs and/or take a traffic capture that can be analyzed with Wireshark.

@FX-HAO
Copy link
Author

FX-HAO commented Jan 14, 2020

@michaelklishin Please take a look at the screenshot below, my server opens 7 connections, but the number of channels is 0 in one of the connections. I have confirmed that the other 6 connections work great. It seems like one of the connections recovered but the channel wasn't created correctly.
image
89a3c246-4e1c-49c3-a0b0-2a534141db4d

@michaelklishin
Copy link
Member

Enable debug logging for Bunny and watch RabbitMQ logs for channel exceptions.

@ccdredzik
Copy link

ccdredzik commented Aug 10, 2020

@michaelklishin I have recently stumbled onto the same issue. Was there any further development made about it?

@michaelklishin
Copy link
Member

@ccdredzik I cannot comment on a single sentence problem definition. My recommendation on inspecting server logs still stands, as does Troubleshooting Networking recommendations, taking a traffic capture and using Toxiproxy or something of that kind to try to reproduce what you're seeing. Bunny connection recovery does not do anything particularly sophisticated and the algorithm hasn't changed in years.

@tomlobato
Copy link

I fight the same issue for a while (rabbitmq cluster in kubernetes + bunny/sneakers consumers).
Is there any development reached on this?

@blackjid
Copy link

blackjid commented Apr 4, 2023

@tomlobato did you find solutions?

@sajangupta74
Copy link

@michaelklishin we are also facing same error with same infra (kubernetes + bunny + sneakers)

Error: Got an exception when receiving data: IO timeout when reading 7 bytes (Timeout::Error)

Note: running 3 instances of rabbit

Also noted when it occurs,

  1. one of the rabbit instance is not in running state
  2. bunny start reconnecting
  3. while bunny waiting for rabbit response, all rabbit instances got UP

@tomlobato
Copy link

@tomlobato did you find solutions?

Not yet, @blackjid.
I will try the list of issues from michaelklishin above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants