Implementation Owner:
Status: implemented
Operators need leader election to ensure that if the same operator is running in two separate pods within the same namespace, only one of them will be active. The primary goal of leader election is to avoid contention between multiple active operators of the same type.
High availability is not a goal of leader election for operators.
Controller-runtime is adding leader election based on functionality present in client-go. However, that implementation allows brief periods during which multiple leaders are active.
Requirements have been discussed on GitHub.
This proposal is to add leader election to the SDK that follows a "leader for life" model, which does not allow for multiple concurrent leaders.
- Provide leader election that is easy to use
- Provide leader election that prohibits multiple leaders
- Non-goal: make operators highly available
The "leader for life" approach uses Kubernetes features to detect when a leader has disappeared and then automatically remove its lock.
The approach and a PoC are detailed in a separate repository. This proposal is to move that implementation into operator-sdk and finish or modify it as appropriate.
```go
func main() {
	// Create a lock named "myapp-lock", retrying every 5 seconds until it succeeds.
	// Become blocks until this process holds the lock.
	err := leader.Become("myapp-lock", 5)
	if err != nil {
		log.Error(err, "Failed to become leader")
		os.Exit(1)
	}
	...
	// do whatever else your app does
}
```
Once accepted into operator-sdk, this would be valuable to contribute back to either controller-runtime directly or client-go.