Issue 579: Leader election fails after node reboot #580
Change log description
In a VMware cluster, some pods get stuck in the ProviderFailed state after a node reboot. The leader election function provided by the operator SDK does not handle that state, so new pods are stuck in a wait cycle.
Purpose of the change
Fixes #579
What the code does
Customise the leader.Become() function of operator-sdk: if the pod is in the ProviderFailed state, delete the pod and the ConfigMap so that a new pod can come up.
How to verify it
Verify in a VMware setup that pods come up successfully after a node reboot.
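
For reviewers, the cleanup described under "What the code does" can be sketched roughly as below. This is a simplified in-memory model, not the code in this PR: `PodStatus`, `Cluster`, `isStaleLeader` and `tryBecomeLeader` are hypothetical names, no Kubernetes client is involved, and it assumes the stuck pod reports phase `Failed` with reason `ProviderFailed`.

```go
package main

import "fmt"

// PodStatus is a hypothetical, trimmed-down stand-in for the pod status
// fields this sketch looks at.
type PodStatus struct {
	Phase  string
	Reason string
}

// isStaleLeader reports whether a lock holder can no longer release the lock
// on its own. Assumption: a stuck pod shows phase "Failed" and reason
// "ProviderFailed".
func isStaleLeader(s PodStatus) bool {
	return s.Phase == "Failed" && s.Reason == "ProviderFailed"
}

// Cluster is a hypothetical in-memory stand-in for the API server.
type Cluster struct {
	Pods  map[string]PodStatus // pod name -> status
	Locks map[string]string    // lock ConfigMap name -> owning pod name
}

// tryBecomeLeader models one iteration of the wait loop. It returns true
// once self holds the lock. If the current holder is stale, the holder pod
// and the lock ConfigMap are deleted so that a later attempt can succeed.
func (c *Cluster) tryBecomeLeader(lockName, self string) bool {
	owner, held := c.Locks[lockName]
	if !held {
		c.Locks[lockName] = self
		return true
	}
	if owner == self {
		return true
	}
	if status, ok := c.Pods[owner]; ok && isStaleLeader(status) {
		delete(c.Pods, owner)
		delete(c.Locks, lockName)
	}
	return false
}

func main() {
	c := &Cluster{
		Pods: map[string]PodStatus{
			"operator-old": {Phase: "Failed", Reason: "ProviderFailed"},
			"operator-new": {Phase: "Running"},
		},
		Locks: map[string]string{"operator-lock": "operator-old"},
	}
	// The first attempt cleans up the stale holder; the second acquires the lock.
	for attempt := 1; attempt <= 2; attempt++ {
		fmt.Println("attempt", attempt, "leader:", c.tryBecomeLeader("operator-lock", "operator-new"))
	}
}
```

In this model the new pod does not take the lock in the same pass that removes the stale holder; it acquires it on the next attempt, which mirrors a retry loop.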