Consolidate backend metrics and add hedged request metric #790

Merged

Conversation

josephwoodward
Contributor

@josephwoodward josephwoodward commented Jun 23, 2021

What this PR does:

This PR does two things:

  • Consolidates the per-backend metrics into a single, general-purpose set of metrics
  • Adds a new metric to track the number of hedged requests

Which issue(s) this PR fixes:
Fixes #760

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@joe-elliott
Member

Thanks for digging into this. I was thinking more like consolidating the three instrumentation.go files we have in gcs/s3/azure and then rolling this new metric into that.

It's a shame that these metric names are different, but we should keep the metrics backwards compatible:
tempodb_s3_request_duration_seconds
tempodb_azure_request_duration_seconds
tempodb_gcs_request_duration_seconds

All new metrics should not contain the name of the backend and should be the same no matter what backend you choose.
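A minimal sketch of how one shared instrumented transport could keep a legacy per-backend name alive while also recording under a backend-agnostic name. This is an illustration only, not the code from this PR: the `recorder` type is a hypothetical in-memory stand-in for a real metrics registry, and the unified name `tempodb_backend_request_duration_seconds` is an assumed example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
	"time"
)

// unifiedMetric is an assumed backend-agnostic name, used for illustration.
const unifiedMetric = "tempodb_backend_request_duration_seconds"

// recorder is a hypothetical in-memory stand-in for a metrics registry.
// It only counts observations per metric name and label set.
type recorder struct {
	mu     sync.Mutex
	counts map[string]int
}

func (r *recorder) observe(metric, method, status string, _ float64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[metric+"{"+method+","+status+"}"]++
}

func (r *recorder) count(key string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[key]
}

// instrumentedTransport wraps another RoundTripper and times each request.
type instrumentedTransport struct {
	next    http.RoundTripper
	rec     *recorder
	metrics []string // every name listed here receives each observation
}

func (t instrumentedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	start := time.Now()
	resp, err := t.next.RoundTrip(req)
	if err == nil {
		elapsed := time.Since(start).Seconds()
		for _, m := range t.metrics {
			t.rec.observe(m, req.Method, strconv.Itoa(resp.StatusCode), elapsed)
		}
	}
	return resp, err
}

// newTransport records under the legacy name and the unified name.
func newTransport(legacyName string, rec *recorder) http.RoundTripper {
	return instrumentedTransport{
		next:    http.DefaultTransport,
		rec:     rec,
		metrics: []string{legacyName, unifiedMetric},
	}
}

// demo sends one request through the transport and returns the number of
// observations recorded under the legacy and unified names.
func demo() (int, int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	legacy := "tempodb_s3_request_duration_seconds"
	rec := &recorder{counts: map[string]int{}}
	client := &http.Client{Transport: newTransport(legacy, rec)}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return 0, 0, err
	}
	resp.Body.Close()
	return rec.count(legacy + "{GET,200}"), rec.count(unifiedMetric + "{GET,200}"), nil
}

func main() {
	legacy, unified, err := demo()
	fmt.Println(legacy, unified, err)
}
```

With a real metrics library the `recorder` would be replaced by one histogram per metric name; the wrapping pattern stays the same for every backend.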

@josephwoodward
Contributor Author

josephwoodward commented Jun 23, 2021

> Thanks for digging into this. I was thinking more like consolidating the three instrumentation.go files we have in gcs/s3/azure and then rolling this new metric into that.
>
> It's a shame that these metric names are different, but we should keep the metrics backwards compatible:
> tempodb_s3_request_duration_seconds
> tempodb_azure_request_duration_seconds
> tempodb_gcs_request_duration_seconds
>
> All new metrics should not contain the name of the backend and should be the same no matter what backend you choose.

I see, thanks for the clarification.

So to reiterate: capture the duration metrics in a consolidated instrumentation.go (using the current instrumented transport approach), and collect the hedging-specific metrics by calling the stats.Snapshot() method every 10 seconds in a goroutine. Is that right? Sorry for all of the questions!
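The polling half of that plan could be sketched roughly as follows. Everything here is an assumption made for illustration: `statsSource`, `hedgeSnapshot`, and `gauge` are hypothetical stand-ins (the hedging library's real snapshot type and a real metrics gauge may look different), and the hedged count is assumed to be the number of round trips actually made minus the number requested.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// hedgeSnapshot is a hypothetical snapshot of hedging statistics.
type hedgeSnapshot struct {
	RequestedRoundTrips uint64
	ActualRoundTrips    uint64
}

// statsSource is a hypothetical interface over whatever exposes Snapshot().
type statsSource interface {
	Snapshot() hedgeSnapshot
}

// gauge is a hypothetical stand-in for a metrics gauge.
type gauge struct {
	mu  sync.Mutex
	val float64
}

func (g *gauge) Set(v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.val = v
}

func (g *gauge) Get() float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.val
}

// hedgedCount returns the extra round trips made beyond those requested.
func hedgedCount(s hedgeSnapshot) uint64 {
	if s.ActualRoundTrips < s.RequestedRoundTrips {
		return 0
	}
	return s.ActualRoundTrips - s.RequestedRoundTrips
}

// publishHedgedMetrics starts a goroutine that copies the hedged count into
// the gauge on every tick until ctx is cancelled.
func publishHedgedMetrics(ctx context.Context, src statsSource, g *gauge, every time.Duration) {
	go func() {
		ticker := time.NewTicker(every)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				g.Set(float64(hedgedCount(src.Snapshot())))
			}
		}
	}()
}

// fixedStats is a fake source that always returns the same snapshot.
type fixedStats struct{ requested, actual uint64 }

func (f fixedStats) Snapshot() hedgeSnapshot {
	return hedgeSnapshot{RequestedRoundTrips: f.requested, ActualRoundTrips: f.actual}
}

// demoPoll runs the publisher with a short interval and waits (up to two
// seconds) for the first published value.
func demoPoll() float64 {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	g := &gauge{}
	publishHedgedMetrics(ctx, fixedStats{requested: 10, actual: 13}, g, 5*time.Millisecond)

	deadline := time.Now().Add(2 * time.Second)
	for g.Get() == 0 && time.Now().Before(deadline) {
		time.Sleep(time.Millisecond)
	}
	return g.Get()
}

func main() {
	fmt.Println(demoPoll())
}
```

In production the interval would be the 10 seconds mentioned above; the demo uses a few milliseconds only so that it finishes quickly. Passing a context gives the caller a way to stop the goroutine on shutdown.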

@joe-elliott
Member

Yup, that's what I was thinking.

@josephwoodward josephwoodward force-pushed the ConsolodateHedgeRequestMetrics branch from e89b478 to 1d6d7ed Compare June 28, 2021 09:40
@josephwoodward
Contributor Author

Hi @joe-elliott, I've been working on this over the past few evenings and feel it's getting to a point where it'd be valuable to get some feedback on the approach. When you have a moment, would you mind having a quick look?

Thanks

Member

@joe-elliott joe-elliott left a comment

A well thought out addition. Thank you.

Please add a changelog entry mentioning the new metrics and the deprecation of the old ones. Other than that, this looks ready to go.

@josephwoodward josephwoodward force-pushed the ConsolodateHedgeRequestMetrics branch from a834cc5 to dc06bd0 Compare June 30, 2021 02:43
@josephwoodward josephwoodward marked this pull request as ready for review June 30, 2021 02:44
@josephwoodward josephwoodward changed the title from "Consolidating hedged request metrics" to "Consolidate backend metrics and add hedged request metric" Jun 30, 2021
@josephwoodward
Contributor Author

@joe-elliott This PR is good to go if you're happy with it.

@josephwoodward josephwoodward force-pushed the ConsolodateHedgeRequestMetrics branch from 873f441 to 4a1572f Compare June 30, 2021 13:09
Member

@joe-elliott joe-elliott left a comment

Excellent! Thank you for the contribution.

@joe-elliott joe-elliott merged commit f542976 into grafana:main Jun 30, 2021
@josephwoodward josephwoodward deleted the ConsolodateHedgeRequestMetrics branch June 30, 2021 14:30
@josephwoodward
Contributor Author

@joe-elliott No problem! I'm pretty new to Go so this has been a great learning opportunity on a reasonably large, real-world Go product. I'll look at picking up another issue now that this has been merged.

Development

Successfully merging this pull request may close these issues.

Add metrics for better visibility around hedged requests
2 participants