Skip to content
This repository has been archived by the owner on Nov 5, 2021. It is now read-only.

[Feature Request] Allow k8s to discover ingress resources #419

Closed
networkop opened this issue Jun 23, 2020 · 19 comments · Fixed by #469 or #470
Closed

[Feature Request] Allow k8s to discover ingress resources #419

networkop opened this issue Jun 23, 2020 · 19 comments · Fixed by #469 or #470
Assignees
Milestone

Comments

@networkop
Copy link
Contributor

Similar to how dynamic targets can be discovered from pods, endpoints and services, it'd be nice to have RDS discover ingress resources as well. The result would be an FQDN with optional set of IP (if resolved by the RDS server).

@manugarg
Copy link
Contributor

manugarg commented Jul 4, 2020

Just for the record, ingress's status object is same as service's status object[1]. We can probably just reuse most of the code after some refactoring.

[1] https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#ingress-v1beta1-networking-k8s-io

@networkop
Copy link
Contributor Author

Yes, in fact, this status is copied from an underlying LB service by the ingress controller (e.g. [1]).
However, I think we still need to know (and pass to the client) the spec.rules[*].host to build the correct HTTP host header.

[1] https://github.com/kubernetes/ingress-nginx/blob/3eafaa35a171e2d91d75690461ef45e721411aec/internal/ingress/status/status.go#L237

@manugarg
Copy link
Contributor

manugarg commented Jul 5, 2020

However, I think we still need to know (and pass to the client) the spec.rules[*].host to build the correct HTTP host header.

That's an interesting detail. Thanks @networkop.

There are two interesting problems:

  1. We need to have a way to transfer the host information from ingress definition to the HTTP probe. We can probably use an internal/private label to achieve that, say something like _cloudprober_http_host.

  2. How to handle multiple hosts in ingress definition. Is it common to have multiple hosts in an ingress definition?

@networkop
Copy link
Contributor Author

We can probably use an internal/private label to achieve that, say something like _cloudprober_http_host.

Yeah, I think something like that should work and it's fairly easy to add.

How to handle multiple hosts in ingress definition. Is it common to have multiple hosts in an ingress definition?

This may not be common but it's definitely possible and even mentioned in the offical k8s ingress docs.

Another questions is how to do DNS resolution for these hosts. The most straigh-forward way is to do it locally on the client, based on information provided in Resource.IP as collected from Ingress.status.LoadBalancer. However, this may result in false negatives in certain situations. If a user relies on external DNS integration and this integration fails for some reason (e.g. IAM permissions changed), then Ingress will not be accessible but cloudprober will continue reporting it as healthy.

@infa-kparida
Copy link

This will be really good if cloudprober can discover ingress and then provide the golden signals from those ingress.This will be one of the killer feature of cloudprober if can be implemented

@manugarg
Copy link
Contributor

@infa-kparida thanks for the input. Yes, we are planning to add support for the "ingress" resource type.

I've not yet figured out how to name targets for multiple host rules within an ingress, for example, we can do:
<ingress_name>_<host_fqdn> or just <host_fqdn>

I guess second option is cleaner, though I think we should provide a way to identify which ingress that particular host belongs to.

Let me know if you have any opinions on that.

Also, we recently added special handling for the target's 'fqdn' label; if present, it becomes the HTTP host header for the probe:

if target.Labels["fqdn"] != "" {

We can use this feature and automatically add 'fqdn' label to ingress targets if not present already.

We can probably use an internal/private label to achieve that, say something like _cloudprober_http_host.

Yeah, I think something like that should work and it's fairly easy to add.

How to handle multiple hosts in ingress definition. Is it common to have multiple hosts in an ingress definition?

This may not be common but it's definitely possible and even mentioned in the offical k8s ingress docs.

Another questions is how to do DNS resolution for these hosts. The most straigh-forward way is to do it locally on the client, based on information provided in Resource.IP as collected from Ingress.status.LoadBalancer. However, this may result in false negatives in certain situations. If a user relies on external DNS integration and this integration fails for some reason (e.g. IAM permissions changed), then Ingress will not be accessible but cloudprober will continue reporting it as healthy.

I think for DNS resolution we don't have to depend on the IP provided in the resource; we can let system resolve the IP address. There is a config option for that (though probably not that clear): resolve_first. If resolve_first is not configured, we use system's DNS resolver.

@manugarg manugarg added this to the v0.10.10 milestone Sep 1, 2020
@networkop
Copy link
Contributor Author

I think <ingress_name>_<host_fqdn> or <ingress_name>_<index> would do, as you can have multiple unique targets defined in a single ingress resource.

Also @manugarg , I was thinking about implementing this locally but then I noticed you've added it to the next milestone. I'm happy to help with the implementation or testing to help speed things up, so please let me know if you need any help.

@manugarg
Copy link
Contributor

manugarg commented Sep 3, 2020

Thanks @networkop. I had started a change some time back. I'll try to finish it in a couple of days (it will mostly get submitted around Sep 08 though). I'll let you know how it goes.

@manugarg manugarg self-assigned this Sep 3, 2020
@manugarg
Copy link
Contributor

Thanks @networkop. I had started a change some time back. I'll try to finish it in a couple of days (it will mostly get submitted around Sep 08 though). I'll let you know how it goes.

I was over-optimistic to think that I'll get it done over the long weekend :) I've the change out for review (internally) now. It will take a few more days I think.

manugarg added a commit that referenced this issue Sep 16, 2020
As ingresses can have rules with multiple hosts and paths, ingress resources are named as follows:

<ingress_name> (if no rules at all)
<ingress_name>_<host> (if path is "/")
<ingress_name>_<host>__<path with / replaced by _>

We attach "fqdn" (=host) and "relative_url" (=path) labels to the resources if these labels are not set already. These labels can then be used inside HTTP probe for automatic configuration.

Implements: #419
PiperOrigin-RevId: 331914954
manugarg added a commit that referenced this issue Sep 16, 2020
Add ingress resource type to kubernetes targets provider.

As ingresses can have rules with multiple hosts and paths, ingress resources are named as follows:

<ingress_name> (if no rules at all)
<ingress_name>_<host> (if path is "/")
<ingress_name>_<host>__<path with / replaced by _>

We attach "fqdn" (=host) and "relative_url" (=path) labels to the resources if these labels are not set already. These labels can then be used inside HTTP probe for automatic configuration.

Implements: #419
PiperOrigin-RevId: 331914954
@manugarg
Copy link
Contributor

Alright, #469 adds support for ingress resources.

Following example in the test shows how resources end up getting named and can be filtered using labels:

wantNames: []string{"rds-ingress_foo.bar.com__health", "rds-ingress_foo.bar.com__rds", "rds-ingress_prometheus.bar.com", "cloudprober-ingress"},

Ingress resources get the following two labels:

"fqdn"         : is set to the "host" in the ingress rule.
"relative_url" : is set to the "path" in the ingress rule.

@networkop and @infa-kparida, please let me know if you have any thoughts on it.

@networkop
Copy link
Contributor Author

Great, thanks @manugarg , I'll give it a spin in the next few days and let you know.

@manugarg manugarg linked a pull request Sep 16, 2020 that will close this issue
@networkop
Copy link
Contributor Author

Hi @manugarg , here are a few issues I've picked up so far:

  1. RDS now needs and updated set of RBAC rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
  name: rds-reader
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["get", "list"]
- apiGroups:
  - extensions
  - "networking.k8s.io" # k8s 1.14+
  resources:
  - ingresses
  - ingresses/status
  verbs: ["get", "list"]
  1. URL is malformed when the client uses custom surfacer.prometheus_surfacer.metrics_prefix. For example if the prefix is set to cloudprober_, the URL becomes: http://cloudprober_google.com

  2. HTTP path gets appended to the URL incorrectly. E.g. if I have an ingress with path /download, the URL becomes: http://google.com__download

@networkop
Copy link
Contributor Author

another issue is that the URL that gets built and the client tries to resolve is in the format ingressName_ingressHostname, e.g. my-ingress_google.com

@networkop
Copy link
Contributor Author

seems like the above DNS resolution issues are fixed with http_proxy.resolve_first: true

@manugarg
Copy link
Contributor

Thanks a lot for checking @networkop.

RDS now needs and updated set of RBAC rules:

Yeah, this is expected. I'll update the document.

URL is malformed when the client uses custom surfacer.prometheus_surfacer.metrics_prefix. For example if the prefix is set to cloudprober_, the URL becomes: http://cloudprober_google.com

I didn't get this one. This prefix is added to the metric names. Which URL is this?

HTTP path gets appended to the URL incorrectly. E.g. if I have an ingress with path /download, the URL becomes: http://google.com__download

another issue is that the URL that gets built and the client tries to resolve is in the format ingressName_ingressHostname, e.g. my-ingress_google.com

So both of these issues were supposed to be resolved by target labels. HTTP probe should use "fqdn" and "relative_url" labels, which are added by RDS automatically ("fqdn" label is used for URL host as well as host header).

func urlHostForTarget(target endpoint.Endpoint) string {

Just to confirm, you tried with cloudprober head, right? Not just the ingress change.

I'll investigate more. Thanks a lot for giving it a try. That's very helpful.

@manugarg
Copy link
Contributor

@networkop I tried running an HTTP probe like this:

probe {
  name: "pod-to-ingress"
  type: HTTP

  targets {
    rds_targets {
      resource_path: "k8s://ingresses"
      filter {
        key: "namespace"
        value: "default"
      }
    }
  }

  additional_label {
    key: "fqdn"
    value: "@target.label.fqdn@"
  }

  additional_label {
    key: "probeurl"
    value: "@target.label.relative_url@"
  }

  http_probe {}
}

Where ingress is:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rds-ingress
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /health
        backend:
          serviceName: cloudprober-test
          servicePort: 9313
  - host: prometheus.bar.com
    http:
      paths:
      - path: /
        backend:
          serviceName: cloudprober-test
          servicePort: 9313

With this I get logs like this:

2020-09-16 12:17:31.005 PDT
Target:foo.bar.com, URL:http://foo.bar.com/health, http.doHTTPRequest: Get "http://foo.bar.com/health": dial tcp: lookup foo.bar.com on 10.31.240.10:53: no such host
Expand all | Collapse all{
 insertId: "1sxfsuig1rbeq6l"  
 logName: "projects/google.com:bbmc-stackdriver/logs/cloudprober.pod-to-ingress"  
 receiveTimestamp: "2020-09-16T19:17:41.024229028Z"  
 resource: {…}  
 severity: "WARNING"  
 textPayload: "Target:foo.bar.com, URL:http://foo.bar.com/health, http.doHTTPRequest: Get "http://foo.bar.com/health": dial tcp: lookup foo.bar.com on 10.31.240.10:53: no such host"  
 timestamp: "2020-09-16T19:17:31.005359502Z"  
}

Notice that the URL is correct here.

And metrics look like this (notice additional labels here):

total{ptype="http",probe="pod-to-ingress",dst="rds-ingress_foo.bar.com__health",fqdn="foo.bar.com",probeurl="/health"} 25 1600284207212
total{ptype="http",probe="pod-to-ingress",dst="rds-ingress_prometheus.bar.com",fqdn="prometheus.bar.com",probeurl="/"} 25 1600284207212
success{ptype="http",probe="pod-to-ingress",dst="rds-ingress_foo.bar.com__health",fqdn="foo.bar.com",probeurl="/health"} 0 1600284207212
success{ptype="http",probe="pod-to-ingress",dst="rds-ingress_prometheus.bar.com",fqdn="prometheus.bar.com",probeurl="/"} 0 1600284207212
latency{ptype="http",probe="pod-to-ingress",dst="rds-ingress_foo.bar.com__health",fqdn="foo.bar.com",probeurl="/health"} 0.000 1600284207212
latency{ptype="http",probe="pod-to-ingress",dst="rds-ingress_prometheus.bar.com",fqdn="prometheus.bar.com",probeurl="/"} 0.000 1600284207212
timeouts{ptype="http",probe="pod-to-ingress",dst="rds-ingress_foo.bar.com__health",fqdn="foo.bar.com",probeurl="/health"} 0 1600284207212
timeouts{ptype="http",probe="pod-to-ingress",dst="rds-ingress_prometheus.bar.com",fqdn="prometheus.bar.com",probeurl="/"} 0 1600284207212

@manugarg manugarg linked a pull request Sep 16, 2020 that will close this issue
@networkop
Copy link
Contributor Author

hm.. maybe they were false positives. Let me retest tomorrow and report back.

@networkop
Copy link
Contributor Author

it looks like it was false positives. I may have had DNS not fully functioning at one point and I looked at dst and assumed that's what was getting resolved. So far I've reverted back all of the workarounds I put in place, including the metrics_prefix and everything seems to be working fine.

@manugarg
Copy link
Contributor

Thanks so much for confirming @networkop.

I'll go ahead mark this issue closed. We can re-open this or file another one if we find a problem with the ingress resources discovery.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants