feat(spans): Use gauges to report self and total time to lower costs #3448

phacops · 2024-04-17T16:19:03Z

We currently store self and total times for spans as distribution metrics. Distributions keep a lot of information so that percentiles can be calculated from them.

Querying percentiles from these distributions has been too slow to display on any screen, so we decided to get percentiles a different way. That means we no longer need to store these times as distributions and can use gauges instead to lower our cost.
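To illustrate the cost difference, here is a minimal sketch of the two bucket shapes. `DistributionBucket` and `GaugeBucket` are hypothetical stand-ins, not Relay's actual types, and the gauge field set (`last`, `min`, `max`, `sum`, `count`) is an assumption. The point is that a distribution grows with every sample, while a gauge stays a fixed size and still yields an average.

```rust
/// Hypothetical distribution bucket: keeps every sample so that
/// percentiles can be computed later. Storage grows with sample count.
#[derive(Debug, Default)]
pub struct DistributionBucket {
    values: Vec<f64>,
}

impl DistributionBucket {
    pub fn insert(&mut self, value: f64) {
        self.values.push(value);
    }

    /// Number of stored samples.
    pub fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical gauge bucket: a fixed-size summary of all samples seen.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct GaugeBucket {
    pub last: f64,
    pub min: f64,
    pub max: f64,
    pub sum: f64,
    pub count: u64,
}

impl GaugeBucket {
    /// Starts a gauge from its first sample.
    pub fn single(value: f64) -> Self {
        Self { last: value, min: value, max: value, sum: value, count: 1 }
    }

    /// Folds one more sample into the summary without storing it.
    pub fn insert(&mut self, value: f64) {
        self.last = value;
        self.min = self.min.min(value);
        self.max = self.max.max(value);
        self.sum += value;
        self.count += 1;
    }

    /// Average of all samples, which is all the product needs here.
    pub fn avg(&self) -> f64 {
        self.sum / self.count as f64
    }
}
```

After inserting the same three samples into both, the distribution holds three values while the gauge still holds five numbers, and would do so after a million samples.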

The plan is to add those new gauge metrics, record them for a while, add support to query them in the product behind a feature flag and switch to them when we're happy with the result.

jjbayer

I like the idea! This will pretty much double the number of buckets we send between Relays and to Kafka. Can we put the new metrics behind a feature flag?

jjbayer · 2024-04-18T07:15:16Z

relay-dynamic-config/src/defaults.rs

+        },
+        MetricSpec {
+            category: DataCategory::Span,
+            mri: "g:spans/self_time@millisecond".into(),


Can we stick with exclusive_time and duration instead of self_time and total_time? Or would that cause naming clashes in some downstream component that ignores the type prefix?

phacops · 2024-04-18

I think we can stick with them, but we do rename them to total_time and self_time in the UI. I thought this was a good occasion to stop having to do that.

Is there any reason not to change the names?
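The naming-clash concern can be made concrete with a small sketch. It assumes the MRI layout `<type>:<namespace>/<name>@<unit>`, as in `g:spans/self_time@millisecond` from the diff. The `Mri` struct and both functions are hypothetical helpers, not Relay's parser, and `d:spans/exclusive_time@millisecond` in the usage note is an assumed MRI for the existing distribution.

```rust
/// Parts of a metric resource identifier (MRI) such as
/// `g:spans/self_time@millisecond`. Illustrative type, not Relay's own.
#[derive(Debug, PartialEq)]
pub struct Mri<'a> {
    pub ty: &'a str,
    pub namespace: &'a str,
    pub name: &'a str,
    pub unit: &'a str,
}

/// Splits an MRI into its four parts; returns `None` if a delimiter is missing.
pub fn parse_mri(s: &str) -> Option<Mri<'_>> {
    let (ty, rest) = s.split_once(':')?;
    let (namespace, rest) = rest.split_once('/')?;
    let (name, unit) = rest.split_once('@')?;
    Some(Mri { ty, namespace, name, unit })
}

/// True when two MRIs differ only in their type prefix. A downstream
/// component that ignores the prefix would treat them as the same metric.
pub fn clash_without_type(a: &str, b: &str) -> bool {
    match (parse_mri(a), parse_mri(b)) {
        (Some(a), Some(b)) => {
            a.ty != b.ty
                && a.namespace == b.namespace
                && a.name == b.name
                && a.unit == b.unit
        }
        _ => false,
    }
}
```

Under these assumptions, `d:spans/exclusive_time@millisecond` and `g:spans/exclusive_time@millisecond` would clash for such a component, while `g:spans/self_time@millisecond` would not.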

jjbayer

Is the purpose of this PR to experiment with what the product would look like without distributions, or is it already decided and we now need the double write for the transition period?

If it's just for experimenting, I don't think we actually need to record a gauge metric to see what the product would look like without distributions. Distributions are (almost) a superset of gauges, so we should be able to query min, max, sum, and count from the distributions table just as easily as from the gauges table.
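As a rough sketch of the "superset" point: min, max, sum, and count can all be derived from the raw samples a distribution already keeps. `Summary` and `summarize` are hypothetical names used only for illustration. The real query would run against the distributions table rather than an in-memory slice.

```rust
/// Hypothetical gauge-style summary; the names are illustrative.
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub min: f64,
    pub max: f64,
    pub sum: f64,
    pub count: usize,
}

/// Derives the summary from raw distribution samples.
/// Returns `None` for an empty slice, since min and max are undefined there.
pub fn summarize(values: &[f64]) -> Option<Summary> {
    let (&first, rest) = values.split_first()?;
    let mut s = Summary { min: first, max: first, sum: first, count: 1 };
    for &v in rest {
        s.min = s.min.min(v);
        s.max = s.max.max(v);
        s.sum += v;
        s.count += 1;
    }
    Some(s)
}
```

One plausible reading of the "(almost)" is that a gauge may also track a last value, which cannot be recovered if the distribution does not preserve sample order. That reading is an assumption, not something stated in this thread.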

phacops · 2024-04-19T13:50:28Z

This is not an experiment. The goal is to "downgrade" the metric to lower its cost, since we only use averages and a solution for percentiles won't come from improving distributions. Gauges are cheaper to store than distributions.

jjbayer · 2024-04-22T11:44:21Z

relay-dynamic-config/src/defaults.rs

+    ];
+
+    if double_write_distributions_as_gauges {
+        metrics.append(&mut vec![


Suggested change

-        metrics.append(&mut vec![
+        metrics.extend([
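A minimal sketch of the gated double write using `extend`, as suggested. `MetricSpec` is reduced here to a single field, `span_metrics` is a hypothetical function name, and the `d:spans/exclusive_time@millisecond` and `g:spans/total_time@millisecond` MRIs are assumptions. Only `g:spans/self_time@millisecond` and the flag name appear in the diff.

```rust
/// Illustrative stand-in for the `MetricSpec` in the diff; only the MRI is kept.
#[derive(Debug, Clone, PartialEq)]
pub struct MetricSpec {
    pub mri: String,
}

/// Builds the span metric specs, adding the gauge variants only when the
/// double-write flag is enabled.
pub fn span_metrics(double_write_distributions_as_gauges: bool) -> Vec<MetricSpec> {
    // Assumed existing distribution metric.
    let mut metrics = vec![MetricSpec {
        mri: "d:spans/exclusive_time@millisecond".into(),
    }];

    if double_write_distributions_as_gauges {
        // `extend` takes the array by value, so no temporary `Vec` is
        // allocated, unlike `append(&mut vec![...])`.
        metrics.extend([
            MetricSpec { mri: "g:spans/self_time@millisecond".into() },
            MetricSpec { mri: "g:spans/total_time@millisecond".into() },
        ]);
    }

    metrics
}
```

With the flag off, only the distribution spec is returned, so the extra buckets are not produced until the feature is enabled.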

feat(spans): Use gauges to report self and total time to lower costs

be81d36

phacops requested a review from a team as a code owner April 17, 2024 16:19

Add a CHANGELOG entry

326a2f1

jjbayer reviewed Apr 18, 2024

jjbayer reviewed Apr 19, 2024

phacops added 3 commits April 19, 2024 07:53

Merge branch 'master' into pierre/spans-use-gauges-to-lower-cost

3f78011

Fix CHANGELOG

8c3df3e

Fix CHANGELOG

ae1f2a3

phacops mentioned this pull request Apr 19, 2024

feat(spans): Add a feature to control double writing distributions as gauges getsentry/sentry#69344

Merged

Gate behind a feature flag

0f455c4

phacops requested a review from jjbayer April 19, 2024 21:26

phacops self-assigned this Apr 19, 2024

phacops enabled auto-merge (squash) April 22, 2024 11:39

jjbayer approved these changes Apr 22, 2024

phacops merged commit 03643c3 into master Apr 22, 2024
20 checks passed

phacops deleted the pierre/spans-use-gauges-to-lower-cost branch April 22, 2024 11:44

jjbayer mentioned this pull request Apr 23, 2024

feat(mobile-ui): Add slow, frozen, and total frames metrics #3473

Merged
