etcd fails kubelet's health checks #720
Comments
Looking at the kubeadm code, it appears we statically set the probe scheme to http, but etcd is https. Even if we didn't set client auth to true, I don't know how this could ever have worked. Did etcd's behavior change under us?
Hi @Q-Lee -- these were the changes that introduced etcd TLS:
Here is the bugfix:
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue on Mar 5, 2018
…tcd_tls

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add mTLS to kubeadm etcd liveness probe.

**What this PR does / why we need it**:
We switched etcd over to using mTLS, but the liveness probe is still using http.
Disabling the liveness probe allows etcd to continue operating.
The real fix isn't simple, because we need to generate a client certificate for healthchecking and update the probe to exec `etcdctl` like so:
https://sourcegraph.com/github.com/coreos/etcd-operator/-/blob/pkg/util/k8sutil/pod_util.go#L71-89

~Working on patching this now.~
This PR now generates the healthcheck identity and updates the liveness probe to use it.

**Which issue(s) this PR fixes**
Fixes #59766
Fixes kubernetes/kubeadm#720

**Special notes for your reviewer**:
We should generate a client cert specifically for etcd health checks so that the apiserver certs can be revoked independently.
This will be stored in `/etc/kubernetes/pki/etcd/` so that we don't have to change the pod's hostMount.

**Release note**:
```release-note
NONE
```
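To make the shape of that fix concrete, here is a minimal sketch of an exec-based liveness probe, written as a Python function that builds the probe fragment of a pod spec. It is not the code from the PR. The certificate file names (`ca.crt`, `healthcheck-client.crt`, `healthcheck-client.key`), the timing values, and the presence of `/bin/sh` and `etcdctl` in the etcd image are all assumptions for illustration; the commit message only states that the healthcheck identity is stored in `/etc/kubernetes/pki/etcd/`.

```python
import json

# Assumption: the commit message only names the directory; the file names
# below are hypothetical placeholders for the healthcheck identity.
PKI_DIR = "/etc/kubernetes/pki/etcd"
CA_CERT = f"{PKI_DIR}/ca.crt"
CLIENT_CERT = f"{PKI_DIR}/healthcheck-client.crt"
CLIENT_KEY = f"{PKI_DIR}/healthcheck-client.key"


def etcd_exec_liveness_probe(endpoint: str = "https://127.0.0.1:2379") -> dict:
    """Build a livenessProbe fragment that runs etcdctl with a client cert.

    The probe uses `exec` instead of `httpGet`, so the health check can
    present a client certificate to etcd.
    """
    etcdctl = (
        f"ETCDCTL_API=3 etcdctl --endpoints={endpoint} "
        f"--cacert={CA_CERT} --cert={CLIENT_CERT} --key={CLIENT_KEY} "
        "endpoint health"
    )
    return {
        # Assumes the etcd image ships both /bin/sh and etcdctl.
        "exec": {"command": ["/bin/sh", "-ec", etcdctl]},
        # Timing values are illustrative, not taken from kubeadm.
        "initialDelaySeconds": 15,
        "timeoutSeconds": 15,
        "failureThreshold": 8,
    }


if __name__ == "__main__":
    print(json.dumps({"livenessProbe": etcd_exec_liveness_probe()}, indent=2))
```

Running the script prints a `livenessProbe` fragment that could be pasted into a static pod manifest for experimentation. A dedicated client certificate keeps the probe independent of the apiserver's etcd client certificate, which matches the reviewer note above.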
What keywords did you search in kubeadm issues before filing this one?
etcd ssl health
Is this a BUG REPORT or FEATURE REQUEST?
Choose one: BUG REPORT
Versions
kubeadm version (use kubeadm version): HEAD
Environment: custom/docker-in-docker, docker-in-docker-in-docker
kubectl version: HEAD
uname -a: 4.9.0-5
What happened?
tl;dr: the kubelet probes etcd over plain http and doesn't have the key/crt for etcd, so etcd answers its health check with "nonsense" (a TLS alert instead of an HTTP response).
$ curl -k https://127.0.0.1:2379/health --key apiserver-etcd-client.key --cert apiserver-etcd-client.crt
{"health": "true"}
$ curl -k https://127.0.0.1:2379/health
curl: (35) error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
$ journalctl | grep "Liveness probe" | grep -v succeeded | tail -n1
Mar 03 09:49:08 7e928d1f665e kubelet[151]: I0303 09:49:08.142334 151 server.go:422] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"etcd-7e928d1f665e", UID:"17c2da92666b4a9242f31873234f3101", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{etcd}"}): type: 'Warning' reason: 'Unhealthy' Liveness probe failed: Get http://127.0.0.1:2379/health: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
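The "malformed HTTP response" bytes in that log line can be read as the start of a TLS record. The sketch below decodes the record header as laid out in RFC 5246; the function name `describe_tls_record` is made up for this example. The log only shows six bytes, so the alert description byte is not available and is not decoded.

```python
import struct

# Leading bytes quoted in the kubelet log line above.
RESPONSE_PREFIX = b"\x15\x03\x01\x00\x02\x02"

# TLS record content types (RFC 5246, section 6.2.1).
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}
# TLS alert levels (RFC 5246, section 7.2).
ALERT_LEVELS = {1: "warning", 2: "fatal"}


def describe_tls_record(data: bytes) -> dict:
    """Decode the 5-byte TLS record header and, for alerts, the level byte."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes for a TLS record header")
    # Header layout: type (1 byte), version major/minor (2 bytes), length (2 bytes).
    content_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    info = {
        "content_type": CONTENT_TYPES.get(content_type, "unknown"),
        "version": (major, minor),
        "length": length,
    }
    # The first payload byte of an alert record is the alert level.
    if content_type == 21 and len(data) >= 6:
        info["alert_level"] = ALERT_LEVELS.get(data[5], "unknown")
    return info


if __name__ == "__main__":
    print(describe_tls_record(RESPONSE_PREFIX))
```

The header decodes to a fatal alert record, which is consistent with a TLS listener rejecting a plain-HTTP request rather than etcd returning an HTTP body.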
What you expected to happen?
I expect etcd to pass health checks.
How to reproduce it (as minimally and precisely as possible)?
I suspect any kubeadm cluster built from HEAD will have this quirk.
Anything else we need to know?