Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics/GenevaActions for Clustersync #3785

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rhamitarora
Copy link
Collaborator

@rhamitarora rhamitarora commented Aug 21, 2024

Which issue this PR addresses:

ARO-9545 and ARO-8659 both JIRA's have common code

What this PR does / why we need it:

  1. Create new clustersync metrics under monitor package. Both syncSets and selectorSyncSets should be merged into the same Geneva metric.
  2. Create a Geneva Action to show the clustersync resource of a cluster.

Test plan for issue:

Unit test cases added.
Need to create respective metrics dashboard in Geneva.

Is there any documentation that needs to be updated for this PR?

Will create TSGs for respective metrics.

How do you know this will function as expected in production?

Monitor from Geneva Dashboard.

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 5 times, most recently from 3cdbf8c to 8d1a6e9 Compare August 27, 2024 11:30
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 6 times, most recently from b5ac73b to 99fa8df Compare September 11, 2024 07:40
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 18 times, most recently from 08835f8 to 41a6e7c Compare September 17, 2024 12:43
@rhamitarora rhamitarora marked this pull request as ready for review September 17, 2024 14:16
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 4 times, most recently from 8d33911 to afbaf3b Compare October 22, 2024 05:12
Copy link
Collaborator

@LiniSusan LiniSusan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks good to me

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from afbaf3b to 548e097 Compare October 22, 2024 09:24
bitoku
bitoku previously approved these changes Oct 22, 2024
pkg/frontend/admin_hive_syncset_resources.go Outdated Show resolved Hide resolved
pkg/frontend/admin_hive_syncset_resources.go Show resolved Hide resolved
@bitoku
Copy link
Collaborator

bitoku commented Oct 22, 2024

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 2 times, most recently from 88d5029 to 99c51f8 Compare October 23, 2024 09:29
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from 99c51f8 to a04c6bb Compare October 24, 2024 09:13
bitoku
bitoku previously approved these changes Oct 24, 2024
@github-actions github-actions bot added the needs-rebase branch needs a rebase label Oct 24, 2024
Copy link

Please rebase pull request.

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from a04c6bb to 99c3f75 Compare October 24, 2024 17:22
@github-actions github-actions bot removed the needs-rebase branch needs a rebase label Oct 24, 2024
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 2 times, most recently from 60aee73 to e884d16 Compare October 25, 2024 03:45
merging 8659 and 9545

Metrics for SyncSet and SelectorSyncSets
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from e884d16 to 4522c6e Compare October 28, 2024 06:28
Copy link
Collaborator

@tsatam tsatam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just a few questions about the metric we emit to make sure the metric works for us downstream (dashboarding/alerting).

if clusterSync.Status.SyncSets != nil {
for _, s := range clusterSync.Status.SyncSets {
mon.emitGauge("hive.clustersync", 1, map[string]string{
"metric": "SyncSets",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: having a dimension on the metric named "metric" might be a little confusing - should we rename this to something else? for example syncType?

if clusterSync != nil {
if clusterSync.Status.SyncSets != nil {
for _, s := range clusterSync.Status.SyncSets {
mon.emitGauge("hive.clustersync", 1, map[string]string{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Do we want to change what "value" we emit here based on the success/failure state of the syncset? For example, return 1 for Successful syncsets and 0 for failed syncsets?

This might make it easier for us to, for example, build downstream dashboards or alerts based off of this metric.

@@ -23,7 +23,9 @@ var _ = Describe("Monitor", func() {
wg.Add(1)
mon, err := cluster.NewMonitor(log, clients.RestConfig, &api.OpenShiftCluster{
ID: resourceIDFromEnv(),
}, &noop.Noop{}, nil, true, &wg)
}, &api.OpenShiftClusterDocument{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be worth adding E2E tests for both the monitor and Geneva Actions functionality. We have various contexts in which E2E runs against an RP with Hive enabled (production/release E2E, PR E2E after we move to the containerized implementation).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is too difficult to add the E2E tests in this PR, this can become follow-up work, however.

}

func NewMonitor(log *logrus.Entry, restConfig *rest.Config, oc *api.OpenShiftCluster, m metrics.Emitter, hiveRestConfig *rest.Config, hourlyRun bool, wg *sync.WaitGroup) (*Monitor, error) {
func NewMonitor(log *logrus.Entry, restConfig *rest.Config, oc *api.OpenShiftCluster, doc *api.OpenShiftClusterDocument, m metrics.Emitter, hiveRestConfig *rest.Config, hourlyRun bool, wg *sync.WaitGroup, hiveClusterManager hive.ClusterManager) (*Monitor, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little strange to me that we have both oc and doc here (as oc should be a subproperty of doc), and hiveRestConfig and hiveClusterManager.

I think there's an opportunity for us to deduplicate some of these dependencies, but that can be a follow-up refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
firefly Issues or Pull requests owned by Team Firefly
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants