-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
As a tekton pipeline user, I want to use liveness/readiness probes to check controller pod health #3111
Comments
Now, I check pod event and find liveness/readiness probes failed message is gone:
So I think previous failure is expected, Because pod is restarting, do liveness and readiness detect failed. Once pod is ready, liveness and readiness detections are normal. But it's better for your side to add an explicit probes for controller. I think it's necessary. |
@mattmoor is there a health check enpoint provided by knative/pkg by any chance? Perhaps hooked into any signal that reconciling is happening, and into the new HA stuff? |
Yeah, but I think it's probably only exposed when the webhook is active, since the main use case would be lame ducking webhooks so the K8s Service drops the endpoint before it quits. |
@imjasonh any comments around using the |
It's honestly a bit surprising that it reports unhealthy in the example above. I'd want to look into that and figure out why that is. Maybe for simplicity we should add a new That seems a bit less useful though, since ideally we'd like to only report "ready" when listers/informers are set up, or when the webhook is registered, and should probably have some way to programmatically be able to report unhealthy/non-live. That's why I roped in @mattmoor, in case these considerations have already come up and been solved in Knative-land. |
@imjasonh Thanks for your attention!
|
And this liveness and readiness is also required for tekton pipeline |
This sounds the same to the request in #1586. I tried adding a port in the controller (commit) but still didn't work. Still trying adding this to controller. The two probes are just added to webhook via this commit. |
@ywluogg Why do you use |
@xiujuan95 Ah that's because I wanted to separate the probes' ports from metrics port. I'm able to add the probes and the pending changes are in: 1d0f3d3 But as @imjasonh mentioned it seems more useful that if we can connect probes' ports to a signal that can clearly tells the controller is processing reconciliations, which needs much more changes. The current controller setup is a single controller replica and its time to restart itself when it crashes and the time it restarts itself using probes are roughly the same. Are you trying to use the probes for some other goals? I'm going to wait for the discussion about this and then probably send a PR. |
Thanks @ywluogg ! No, I just want to use probes to detect the health of controller, don't have other goals. |
@imjasonh do you think we should add the probes for simple health check purposes for now? |
any updates on this issue? It seems that if we use the |
The probes are available now on the webhook: Lines 93 to 103 in cfe2fe0
|
@mattmoor @pmorie I found knative/pkg#1048 but it's not clear to me then whether it is available yet or not. If not we could perhaps add a "shallow" |
Agreed with @afrittoli. It seems still more suitable if we add the health checks after knative/pkg#1048 becomes available. |
Hi, @ywluogg any updates about adding livenss/readiness probes for controller? |
Feature request
I want to use liveness and readiness probes to detect if my tekton controller pod is healthy or not. However, seem like liveness and readiness field don't be included in controller deployment:https://github.com/tektoncd/pipeline/blob/master/config/controller.yaml
About this request, actually, I have done some experiments. I configure liveness and readiness like below:
And I check the pod event, it tells me the probes failed:
However, my pod is still running normally:
This is not expected.
BTW, I can do
curl http://localhost:9090/metrics
command successfully within tekton controller container:Use case
The text was updated successfully, but these errors were encountered: