Skip to content
This repository has been archived by the owner on Apr 25, 2023. It is now read-only.

bugfix: register workqueue metrics in controller-runtime firstly #1382

Merged

Conversation

mars1024
Copy link
Contributor

What this PR does / why we need it:
Ref to #1372 , because of importing race as #1372 (comment), workqueue metrics do not work now.
And this PR try to use an import with highest order to fix this.

Which issue(s) this PR fixes:
Fixes #1372

Signed-off-by: Bruce Ma [email protected]

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 18, 2021
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 18, 2021
@hectorj2f
Copy link
Contributor

@mars1024 Could you share an example in this PR of what you are getting solved by importing this package ? What were we missing before ?

@mars1024
Copy link
Contributor Author

Yeah, let's start with the precondition that kubefed is exposing the metrics registry defined in controller-runtime/pkg/metrics.
https://github.com/kubernetes-sigs/kubefed/blob/16b2f5cc4f05c9e9b696970295f353c56d33aa3a/cmd/controller-manager/app/controller-manager.go#L402-L404

So we must make sure that the metrics collectors defined in controller-runtime/pkg/metrics will be used by client-go/util/workqueue through calling SetProvider function to get itself registered, and the SetProvider function has a sync.Once to admit only the first call.

SetProvider is being called in init functions of two packages, k8s.io/component-base/metrics/prometheus/workqueue and sigs.k8s.io/controller-runtime/pkg/metrics, and both of them are imported by kubefed, here is the dependency paths,

kubefed-controller-manager -> k8s.io/apiserver/pkg/server -> k8s.io/apiserver/pkg/storageversion -> k8s.io/component-base/metrics/prometheus/workqueue
kubefed-controller-manager -> sigs.k8s.io/kubefed/cmd/controller-manager/app -> sigs.k8s.io/controller-runtime/pkg/metrics

From the specification of golang, we will know that k8s.io/component-base/metrics/prometheus/workqueue wins, and client-go/util/workqueue will never use the metrics collectors(defined in sigs.k8s.io/controller-runtime/pkg/metrics) which are being exposed.

So this PR is trying to import sigs.k8s.io/controller-runtime/pkg/metrics before k8s.io/component-base/metrics/prometheus/workqueue to make sure calling SetProvider firstly.

@mars1024
Copy link
Contributor Author

In our product environments, we are using kubfed on dependencies of kubernetes 1.17 which has no k8s.io/apiserver/pkg/storageversion.
So the race condition of SetProvider was missed before.

@mars1024
Copy link
Contributor Author

/assign @hectorj2f

Copy link
Contributor

@hectorj2f hectorj2f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hectorj2f, mars1024

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2021
@k8s-ci-robot k8s-ci-robot merged commit 22635e7 into kubernetes-retired:master Mar 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

workqueue metrics do not work.
3 participants