This repository has been archived by the owner on Apr 25, 2024. It is now read-only.

201 - Prometheus not loading any metrics. #491

Open
andrewhertog opened this issue Jun 11, 2018 · 4 comments


@andrewhertog

andrewhertog commented Jun 11, 2018

I'm currently following
https://github.com/aws-samples/aws-workshop-for-kubernetes/tree/master/02-path-working-with-clusters/201-cluster-monitoring

I've successfully loaded Prometheus in a browser after running the port-forward command kubectl port-forward $(kubectl get po -l prometheus=prometheus -n monitoring -o jsonpath={.items[0].metadata.name}) 9090 -n monitoring, but I am not seeing any metrics at localhost:9090.
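
For readers reproducing this step, the port-forward one-liner can be split into two stages so that an empty pod lookup fails early. This is only a sketch: the function name forward_prometheus is hypothetical, while the label, namespace, and port are the ones quoted above.

```shell
# Look up the Prometheus pod, then forward local port 9090 to it.
# forward_prometheus is a hypothetical helper name, not part of the workshop.
forward_prometheus() {
  # Quote the jsonpath expression so the shell does not interpret the braces.
  pod=$(kubectl get po -l prometheus=prometheus -n monitoring \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  if [ -z "$pod" ]; then
    echo "no pod with label prometheus=prometheus in namespace monitoring" >&2
    return 1
  fi
  kubectl port-forward "$pod" 9090 -n monitoring
}

# Usage (requires cluster access):
#   forward_prometheus
```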

This is all I see:
[Screenshot: 2018-06-11, 12:04:07 pm]

I have gone through 201 from the beginning twice, following the cleanup shown at the end of the tutorial, with the same results.

Update

I just did some digging and noticed a lot of the following in the logs for the prometheus-prometheus-0 pod:

level=error ts=2018-06-11T18:11:56.227803158Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.231432703Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.231436883Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.231516245Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.243360069Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.243441332Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.243462191Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.24351401Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.24351825Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.243564991Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"default\""
level=error ts=2018-06-11T18:11:56.243573509Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.243612852Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.243625904Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.254559683Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.254651698Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"default\""
level=error ts=2018-06-11T18:11:56.254773778Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.254851617Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:177: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list endpoints in the namespace \"monitoring\""
level=error ts=2018-06-11T18:11:56.254910369Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:178: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-operator\" cannot list services in the namespace \"kube-system\""
level=error ts=2018-06-11T18:11:56.449285051Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.452019179Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.452942474Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.453427575Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.49484826Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.496102528Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
level=error ts=2018-06-11T18:11:56.498910773Z caller=main.go:212 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to watch *v1.Pod: unknown (get pods)"
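
The log above repeats a small number of distinct denials. As a rough aid, the sketch below reduces such output to unique "verb resource namespace" triples. The function name summarize_forbidden is hypothetical, and the pattern assumes the message wording shown in these log lines.

```shell
# Read Prometheus log lines on stdin and print each distinct denied
# "verb resource namespace" triple once.
# summarize_forbidden is a hypothetical helper name.
summarize_forbidden() {
  # [\\"]* tolerates the escaped quotes (\") around the namespace name.
  sed -n 's/.*cannot \([a-z]*\) \([a-z]*\) in the namespace [\\"]*\([a-z0-9-]*\)[\\"]*.*/\1 \2 \3/p' |
    sort -u
}

# Usage (requires cluster access; the container name "prometheus" is the one
# mentioned later in this thread):
#   kubectl logs prometheus-prometheus-0 -c prometheus -n monitoring | summarize_forbidden
```

Lines that do not match the "cannot ... in the namespace" wording, such as the "Failed to watch *v1.Pod" entries, are skipped by this sketch and still need to be read by hand.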
@CharlyF
Contributor

CharlyF commented Jul 6, 2018

[UPDATE] I think the issue is just that prometheus-operator and prometheus are sharing the same RBAC here, while they should each have their own.
The other Cluster Role that should be used is: https://github.com/coreos/prometheus-operator/blob/v0.14.1/Documentation/rbac.md#prometheus-rbac
and it should be bound to a different SA, referenced here: https://github.com/aws-samples/aws-workshop-for-kubernetes/blob/master/02-path-working-with-clusters/201-cluster-monitoring/templates/prometheus/prometheus.yaml#L228

I realized this because the API servers appeared to be down, which happened because get on /metrics is not listed in the prometheus-operator cluster role.
So the underlying problem is that there is a missing cluster role.
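
A minimal sketch of what a separate identity for the Prometheus server could look like, loosely modeled on the Prometheus RBAC section linked above. The resource name "prometheus", the output file name, and the helper write_prometheus_rbac are assumptions for illustration; the linked document remains the authoritative source for the rules.

```shell
# Write a dedicated ServiceAccount, ClusterRole, and ClusterRoleBinding for
# the Prometheus server to the file given as $1.
# The name "prometheus" is an assumption; "monitoring" is the namespace
# used throughout this thread.
write_prometheus_rbac() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
EOF
}

# Usage (requires cluster access):
#   write_prometheus_rbac prometheus-rbac.yaml
#   kubectl apply -f prometheus-rbac.yaml
```

With this approach, the Prometheus resource in prometheus.yaml would also need to reference the new account (for example through its serviceAccountName field) instead of prometheus-operator.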


I also hit this issue. You can just run kubectl edit clusterrole prometheus-operator -n monitoring
and add the missing verbs: in your case, I think, list for endpoints/services and watch for pods.
This is the RBAC that got the UI up in my case.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
[...]
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - '*'
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - servicemonitors
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - watch
  - list
  - delete
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  verbs:
  - get
  - list
  - create
  - watch
  - update
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - list

Interestingly, these verbs are not required according to the v0.14.1 documentation.
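
One way to check whether the edited cluster role actually grants the missing verbs is kubectl auth can-i with impersonation. The sketch below assumes your own user is allowed to impersonate service accounts (cluster admins usually are); the function name check_sa_access is hypothetical, and the namespaces are the three that appear in the error log.

```shell
# Ask the API server whether the prometheus-operator service account may
# list and watch the resources named in the error log.
# check_sa_access is a hypothetical helper name.
check_sa_access() {
  sa="system:serviceaccount:monitoring:prometheus-operator"
  for ns in default kube-system monitoring; do
    for res in endpoints services pods; do
      for verb in list watch; do
        printf '%s %s in %s: ' "$verb" "$res" "$ns"
        # can-i prints "yes" or "no" and exits non-zero on "no",
        # so keep going either way.
        kubectl auth can-i "$verb" "$res" --as="$sa" -n "$ns" || true
      done
    done
  done
}

# Usage (requires cluster access):
#   check_sa_access
```

Running it before and after the kubectl edit should show the relevant lines flip from "no" to "yes".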

@jicowan
Contributor

jicowan commented Jul 17, 2018

Can someone make the changes that @CharlyF mentioned? As of 7/14/18 this was still not working.

@dannyvargas23

dannyvargas23 commented Sep 6, 2018

I had the same issue and had to modify the cluster role.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-operator
  namespace: monitoring
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - "*"
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - "*"
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - servicemonitors
  - prometheusrules
  verbs:
  - "*"
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - pods
  verbs: ["list", "delete", "watch"]
- apiGroups: [""]
  resources:
  - services
  - endpoints
  verbs: ["get", "create", "update", "watch", "list"]
- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list", "watch"]
- nonResourceURLs:
  - /metrics
  verbs: ["get"]
- apiGroups: [""]
  resources:
  - namespaces
  verbs: ["list"]

@geerlingguy

To be a little more precise: if you followed the directions in the guide and you have a blank Targets page (and the prometheus container in the prometheus-prometheus-1 pod is showing errors in its log like the ones shown earlier in this thread), then you need to:

  1. Copy the entire code block in @dannyvargas23's comment above and paste it directly into the prometheus-bundle.yaml file.
  2. Run the command kubectl apply -f templates/prometheus/prometheus-bundle.yaml again, to apply the changes.

After a couple minutes, you should start seeing Targets 'UP' in the Prometheus UI.
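
The two steps above, plus a quick check that the RBAC denials have stopped, can be sketched as follows. The function name reapply_and_check is hypothetical; the pod defaults to prometheus-prometheus-0 from the original report and can be overridden, since this thread also mentions prometheus-prometheus-1.

```shell
# Re-apply the bundle, then count "forbidden" errors in recent Prometheus logs.
# reapply_and_check is a hypothetical helper name.
# $1 (optional): Prometheus pod name, defaulting to prometheus-prometheus-0.
reapply_and_check() {
  pod=${1:-prometheus-prometheus-0}
  kubectl apply -f templates/prometheus/prometheus-bundle.yaml || return 1
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  denied=$(kubectl logs "$pod" -c prometheus -n monitoring --since=2m |
    grep -c 'forbidden' || true)
  echo "forbidden errors in the last 2 minutes: $denied"
}

# Usage (requires cluster access; run from the 201-cluster-monitoring directory):
#   reapply_and_check
#   reapply_and_check prometheus-prometheus-1
```

A count of 0 a few minutes after applying suggests the new rules are in effect; the Targets page remains the final check.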

I'll file a PR with this change; hopefully it can get merged soon!
