Investigate Bookkeeper disruption cases #114
Labels
DR: Disaster Recovery
kind/enhancement: Enhancement of an existing feature
priority/P2: Slight inconvenience or annoyance to applications, system continues to function
status/needs-investigation: Further investigation is required
We need to investigate and implement an action plan to cover all BookKeeper disruption cases. Some disruption cases that come to my mind are:
- `kubectl drain` to remove a node from the K8s cluster.
- `kubectl delete pod` to delete a particular BookKeeper pod, probably accidentally.

For graceful terminations, we may want to run a pre-delete hook (a `preStop` handler in K8s terminology) to make sure that the ledgers stored on that bookie are re-replicated before it shuts down, probably by running BookKeeper's manual recovery process.

For unexpected terminations, we may want to rely on BookKeeper's autorecovery feature and a pod disruption budget to prevent a second graceful pod termination until the terminated pod is rescheduled and recovered.
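As a starting point for discussion, the two mechanisms mentioned above might look roughly like the manifests below. This is only a sketch under assumptions: the resource names, the `app: bookkeeper` label, the container image, the grace period, and the `/opt/bookkeeper/scripts/pre-stop.sh` script are all hypothetical placeholders, and the script that would trigger re-replication still has to be designed as part of this investigation.

```yaml
# Sketch only: names, labels, image, grace period, and script path are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: bookie-pdb                  # hypothetical name
spec:
  maxUnavailable: 1                 # at most one bookie may be evicted voluntarily at a time
  selector:
    matchLabels:
      app: bookkeeper               # hypothetical label; must match the bookie pods
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: bookie                      # hypothetical name
spec:
  serviceName: bookie
  replicas: 3
  selector:
    matchLabels:
      app: bookkeeper
  template:
    metadata:
      labels:
        app: bookkeeper
    spec:
      # The preStop hook has to finish within this window,
      # otherwise the container is killed anyway.
      terminationGracePeriodSeconds: 600
      containers:
        - name: bookie
          image: example/bookkeeper:latest   # placeholder image
          lifecycle:
            preStop:
              exec:
                # Hypothetical script that would re-replicate this bookie's
                # ledgers before the process is stopped.
                command: ["/bin/sh", "-c", "/opt/bookkeeper/scripts/pre-stop.sh"]
```

Note that a pod disruption budget only limits evictions, which is what `kubectl drain` uses. A direct `kubectl delete pod` bypasses the budget, although the `preStop` handler still runs for it as long as the deletion is graceful.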