Is your feature request related to a problem? Please describe.
When a change such as a Redis version upgrade alters the desired state of a StatefulSet, the StatefulSet's `updateStrategy` causes its Pods to go through a `RollingUpdate`.
In a 3-member replication setup, there is a risk of data loss if a pod goes down, even momentarily, before a synchronized replica has been secured, because the operator does not reconcile during the `RollingUpdate`.
Therefore, during the `RollingUpdate` driven by the StatefulSet, it is crucial to ensure that at least one replica synchronized with the leader is secured.
Setting the StatefulSet's `terminationGracePeriodSeconds` long enough to delay the `RollingUpdate` might seem adequate,
but I believe using Container Lifecycle Hooks to guarantee this functionally would significantly enhance the project's reliability.
Describe the solution you'd like
Describe alternatives you've considered
I propose writing a `PreStop` hook that checks whether a failover-capable replica is secured before the container is terminated:
- If the pod designated for deletion has a `redis-role` of `slave`, it is safe to delete the pod.
- If it is a `master`, wait until a currently synced replica is secured.
  - If one is already secured, proceed.
  - If syncing is ongoing, remain in the loop until it completes (`masterSyncInProgress == 0`).
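To make the flow above concrete, here is a minimal sketch of the check in Python (a real hook would more likely be a shell script in the Redis container or Go code in the operator). It assumes the hook can obtain the text of `INFO replication` from the local Redis instance. The names `safe_to_stop`, `wait_until_safe`, `fetch_info`, and `max_lag_bytes` are hypothetical and only for illustration. It also judges sync state from the master's `slaveN` lines (`state=online` plus replication offsets) rather than from each replica's `master_sync_in_progress` field, which is one possible interpretation of "synced".

```python
import time
from typing import Callable, Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse `INFO replication` output into a key/value dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def safe_to_stop(info_text: str, max_lag_bytes: int = 0) -> bool:
    """Return True if this pod can be terminated without losing data.

    Assumption: a replica counts as "secured" when the master reports it as
    online and its offset is within max_lag_bytes of master_repl_offset.
    """
    info = parse_info(info_text)
    if info.get("role") == "slave":
        return True  # Replicas are always safe to delete.
    master_offset = int(info.get("master_repl_offset", "0"))
    for key, value in info.items():
        # Replica entries look like: slave0:ip=...,port=...,state=online,offset=...,lag=0
        if not (key.startswith("slave") and key[5:].isdigit()):
            continue
        replica = dict(p.split("=", 1) for p in value.split(",") if "=" in p)
        if replica.get("state") != "online":
            continue  # Still syncing (for example state=wait_bgsave).
        if master_offset - int(replica.get("offset", "0")) <= max_lag_bytes:
            return True
    return False


def wait_until_safe(
    fetch_info: Callable[[], str],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until safe_to_stop() holds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if safe_to_stop(fetch_info()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Note that Kubernetes still kills the container once `terminationGracePeriodSeconds` expires, even if the `PreStop` hook has not finished, so any timeout used in such a loop would need to stay below that value.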
I would like to hear what the maintainers think about this issue and the development of this feature.
If it's difficult for you to allocate time, I would like to add this feature myself and submit a Pull Request.
What version of redis-operator are you using?
redis-operator version:
Additional context
Here's the pseudo-code of the `PreStop` event code.