Redis Connection does not reconnect when failed, causing Queue Workers to loop #41302
Comments
Right, so, what I'm interested in is what the value for I suspect the failed jobs still stay in Redis, waiting to be processed, and that they get processed once you restart the worker?
https://github.com/laravel/framework/blob/9.x/src/Illuminate/Queue/Jobs/Job.php#L315
The error is saying $nextJob is an array that only consists of ONE ITEM, not two (hence the
The problem to me appears to be that there's no way that I could find to catch the \Redis exception in Illuminate/Redis/Connections/Connection.php:116 and mark the connection as failed. It's not possible to reconnect there, as it needs to be destroyed and recreated in RedisManager::resolve() https://github.com/laravel/framework/blob/9.x/src/Illuminate/Redis/RedisManager.php#L103
I feel like you could catch this error in: https://github.com/laravel/framework/blob/9.x/src/Illuminate/Queue/Worker.php#L392 The reason is that when the worker loses a database connection, shouldQuit is set to true, so the worker quits and then reconnects when it starts again. The same concept could apply to Redis.
Good spotting @tm1000 (FYI, he is a co-worker of mine)
@driesvints I didn't want to submit a PR until someone decides on the 'correct' way to handle this, and as it's getting into the fiddly in-depth bits of queue workers, I thought it would be better to report it as a bug first and then get some guidance on how you guys would like it handled.
What version is your Redis?
This seems highly unlikely, as the current code does not expect that format at all. I have no idea how to solve this, unfortunately. Ideally, someone starts a PR which we can start off from.
@driesvints well, you're the guy who makes the call! What would you prefer?
I'm happy to do the PR, but I'd want to do it in a way you're happy with. The OTHER problem is that writing a unit test for this requires at least three moving parts: a working redis, stopping that working redis, and then restarting it. Would you need a unit test for this, or would you be happy without it?
Sorry if I didn't make this clear - that response comes from the \Redis object when it is in its broken state, not from redis-the-actual-process. If it's actually important I can spend some time building a Dockerfile that demonstrates this, but it's super simple to duplicate manually, so I haven't worried about trying to figure out a way to prove this automatically (hence my 'Is this OK without a unit test?' question above 8))
Not really. I don't merge PRs 😅 Let's try the first option and see how it goes.
Feel free to send in that PR when you can, thanks.
Why is this ticket closed? The bug still exists, and neither I nor anyone else has created a PR, so it still needs to be open.
@xrobau just send in that PR if you can please.
Thank you, I'll try to get onto it today.
…nections When a \Redis connection has a socket/connection error, there's no reliable way to re-establish the connection. This adds a simple public 'shouldRestart' bool (based on the existing Queue\Worker 'stopWorkerIfLostConnection' code) to the Redis Connection, which is checked in RedisManager and recreated if needed there. Signed-Off-By: Rob Thomas <[email protected]>
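The flag-and-recreate idea in that commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SketchConnection`, `SketchManager` and the factory closure are hypothetical stand-ins for the real Connection and RedisManager classes, and the only name taken from the thread is `shouldRestart`.

```php
<?php

// Hypothetical stand-in for a Redis connection wrapper (not Laravel's class).
final class SketchConnection
{
    // Set to true once any command fails, so the manager knows to rebuild us.
    public bool $shouldRestart = false;

    public function __construct(private \Closure $client)
    {
    }

    public function command(string $method, array $parameters = []): mixed
    {
        try {
            return ($this->client)($method, $parameters);
        } catch (\Throwable $e) {
            // Mark the connection as broken, then let the caller see the error.
            $this->shouldRestart = true;

            throw $e;
        }
    }
}

// Hypothetical stand-in for the manager that caches connections by name.
final class SketchManager
{
    /** @var array<string, SketchConnection> */
    private array $connections = [];

    public function __construct(private \Closure $factory)
    {
    }

    public function connection(string $name = 'default'): SketchConnection
    {
        $existing = $this->connections[$name] ?? null;

        // Recreate when nothing is cached yet or the cached one is flagged.
        if ($existing === null || $existing->shouldRestart) {
            $this->connections[$name] = ($this->factory)($name);
        }

        return $this->connections[$name];
    }
}
```

The point of the sketch is that the name stays valid while the object behind it is swapped out, which is the gap described in the issue: a cached name alone says nothing about whether the socket is still usable.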
My default PHP Linters decided that
The code ended up being pretty trivial, as soon as I looked into it.
This is an alternative option to PR laravel#41502, and adds a match for 'socket' ("socket error when (reading|writing)") to trigger a reconnection. I am not sure if this is valid in other languages, too. My personal preference would be to remove the entire if and ALWAYS reconnect on any Exception Signed-Off-By: Rob Thomas [email protected]
* Bugfix #41302 - Alternative
This is an alternative option to PR #41502, and adds a match for 'socket' ("socket error when (reading|writing)") to trigger a reconnection. I am not sure if this is valid in other languages, too. My personal preference would be to remove the entire if and ALWAYS reconnect on any Exception
Signed-Off-By: Rob Thomas [email protected]
* Fix typo (used vim instead of my usual IDE)
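The message-matching approach described in that commit could look something like the sketch below. `looksLikeLostConnection` and its `$needles` parameter are hypothetical names for illustration; only the 'socket' substring comes from the commit message, and the actual change in the framework may be structured differently.

```php
<?php

// Hypothetical helper: decide whether an exception message suggests the
// underlying connection is gone, by case-insensitive substring matching.
function looksLikeLostConnection(\Throwable $e, array $needles = ['socket']): bool
{
    $message = strtolower($e->getMessage());

    foreach ($needles as $needle) {
        if (str_contains($message, strtolower($needle))) {
            return true;
        }
    }

    return false;
}
```

As the commit message itself notes, matching on message text is fragile: if the error string is localised or reworded, the check silently stops matching, which is the argument for reconnecting on any exception instead.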
I saw this being fixed. I am currently running on Laravel 7.30.6... which version of Laravel should I be upgrading to? Looks like I am doomed.
Description:
Originally reported in #30081 and #29969
If the Redis connection fails, workers do not realise this, and error with:
and
The code flow basically ends up calling https://github.com/laravel/framework/blob/9.x/src/Illuminate/Redis/RedisManager.php#L82 which doesn't check if the connection is still valid. As the connection NAME is still valid, but the connection itself is closed, there is no reconnect attempt (as there is with other drivers).
This is triggered when there is a socket error:
which probably THERE should unset itself, or at least mark itself as failed.
Steps To Reproduce:
socket error to occur
At this point, all jobs will immediately fail with the null/undefined offset, and the only way to fix them is to kill and restart the worker.