Webhooks are not guaranteed to start before cache sync is started #1685
A concrete example of how we end up with a deadlock:
Now the following happens:
In the new controller:
>> deadlock (xref: log of a CAPI controller stuck in that deadlock: manager.log; notably absent: a log line with …)
Enabling leader election is a valid workaround, as it delays …
That is bad. Can we try to find a way to reliably start the conversion webhooks before anything else?
Yeah, it would be good to find a way. Two naive ideas (I don't really know enough about controller-runtime to assess whether these would lead to new issues):
We could probably add a new lock just for starting the cache, and use an RW lock for the two runnable starter funcs and …
I have played around a little bit with @alvaroaleman's idea in #1689; I hope this can help.
However, some webhooks need to get/list resources from the cache while handling requests. For example, a webhook that injects fields into or validates a CR may depend on another resource (e.g. a ConfigMap). So if we start webhooks before the cache has synced, some requests will fail because of … The ideal sequence might be …
This would probably need a bigger refactor to separate conversion webhooks from the other ones, and to add the validators & mutators after the webhook server is started. The …
I would also like to mention that webhooks (both conversion and validating/mutating) have different availability requirements than controllers, and I would strongly recommend running multiple replicas of the webhook servers at all times, whereas this isn't needed for controllers. While I am OK with improving the situation for the use case of resource-constrained environments where everything runs in a single replica, this is by definition never going to be particularly resilient (kube waits six full minutes to reschedule workloads after a node becomes unavailable; that is six full minutes during which your API is entirely unavailable), and I would very much recommend not doing that.
@FillZpp @randomvariable If you have a test environment, could you run some tests on top of #1695?
Ah, cool, I missed that PR. I'll give it a go later today.
Sure. I have fetched #1695 into a branch in my fork repo and packaged a test tag. Then I upgraded the dependencies in openkruise/kruise, including k8s to 1.22 and controller-runtime to v0.10, and replaced it with the test tag. Turns out all unit tests and e2e tests work fine. https://github.com/FillZpp/kruise/actions?query=branch%3Atest-controller-runtime-pr-1695
@sbueringer: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
In the process of debugging kubernetes-sigs/cluster-api-provider-aws#2834 we spotted the following behaviour:
If resources are present in the API server/etcd at an older storage version AND --leader-election is set to false when starting controllers with a newer CRD version, then there is a high probability that conversion webhooks will not start before cache syncs begin. Because the conversion webhooks are not yet serving and a mutex is being held, this indefinitely delays the start of the webhook server. On some infrastructures this happens closer to 90% of the time (e.g. Amazon EC2 t3.large: 2 CPU, 8 GB RAM). When leader election is enabled, the probability drops to 20%, and when using health checks for the webhooks, enough restarts by the kubelet lead to the conversion webhooks running sufficiently early to allow cache syncs to occur.
4e548fa was intended to ensure that webhooks start first, but this doesn't seem to have had the desired effect.
We believe that the occurrence is lower when leader election happens, since the leader election process delays the call of cm.startLeaderElectionRunnables() until after the election is completed.
Example logs are here: capi-logs.txt
Controller-runtime-version: v0.10.2
Affected consumers: Cluster API v1.0.0
cc @fabriziopandini @sbueringer