Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(spans): Reduce number of tags on span metrics #2907

Merged
merged 14 commits into from
Jan 8, 2024

Conversation

jjbayer
Copy link
Member

@jjbayer jjbayer commented Jan 4, 2024

During the development of Starfish we've repeatedly added new tags to span metrics without restricting the conditions under which a tag is set (that is, what queries actually require a tag).

This PR introduces a new config for the exclusive_time and exclusive_time_light metrics, which limits the number of cases in which the metrics and / or their tags are extracted. Some tags are removed entirely.

For now, the new config will only be enabled for projects that have the all-modules feature flag enabled, i.e. sentry-internal. After we've verified that the new config covers all required cases, we can enable the new config for all orgs.

ref: https://github.com/getsentry/team-ingest/issues/252

@jjbayer jjbayer changed the title Ref/span metrics by module ref(spans): Reduce number of tags on span metrics Jan 4, 2024
@jjbayer jjbayer requested a review from phacops January 4, 2024 16:03
@jjbayer jjbayer marked this pull request as ready for review January 4, 2024 16:32
@jjbayer jjbayer requested a review from a team as a code owner January 4, 2024 16:32
Comment on lines 1657 to 1696
{
"org_id": 1,
"project_id": 42,
"name": "d:spans/exclusive_time@millisecond",
"type": "d",
"value": [500.0, 1500],
"timestamp": expected_timestamp,
"tags": {"span.status": "ok", "span.op": "default"},
"retention_days": 90,
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No exclusive time for op:default.

Comment on lines -53 to -95
Bucket {
timestamp: UnixTimestamp(1597976302),
width: 0,
name: "d:spans/exclusive_time@millisecond",
value: Distribution(
[
2000.0,
],
),
tags: {
"environment": "fake_environment",
"span.action": "GET",
"span.category": "http",
"span.description": "GET http://domain.tld",
"span.domain": "domain.tld",
"span.group": "d9dc18637d441612",
"span.op": "http.client",
"transaction": "gEt /api/:version/users/",
"transaction.method": "POST",
"transaction.op": "myop",
},
},
Bucket {
timestamp: UnixTimestamp(1597976302),
width: 0,
name: "d:spans/exclusive_time_light@millisecond",
value: Distribution(
[
2000.0,
],
),
tags: {
"environment": "fake_environment",
"span.action": "GET",
"span.category": "http",
"span.description": "GET http://domain.tld",
"span.domain": "domain.tld",
"span.group": "d9dc18637d441612",
"span.op": "http.client",
"transaction.method": "POST",
"transaction.op": "myop",
},
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shruthilayaj @gggritso no more exclusive time metrics for http.client, except on mobile screen load spans (see test_extract_span_metrics_mobile_screen).

Is there any other use case where we still need http.client spans?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of, but it'll come up immediately when we work on an API module (unsure when, if ever)

@@ -637,11 +152,8 @@ expression: metrics
"span.domain": "table",
"span.group": "f4a7fef06db3d88e",
"span.op": "db.sql.query",
"span.status": "ok",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso span.status will be removed for db spans.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@@ -637,11 +152,8 @@ expression: metrics
"span.domain": "table",
"span.group": "f4a7fef06db3d88e",
"span.op": "db.sql.query",
"span.status": "ok",
"span.system": "postgresql",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso we still use span.system in Relay to pick an SQL parser, but it won't be a metric tag anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 makes sense! Though, caveat: there's a chance we'll need this back in the future, when we come back to do a database server selector. Right now we're planning to use the database host (doesn't exist as a tag yet) but we might need the system, too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, brining it back should not be a problem!

"transaction": "gEt /api/:version/users/",
"transaction.method": "POST",
"transaction.op": "myop",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso transaction.op will be removed for db spans.

Copy link
Member

@gggritso gggritso Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for now, but it's a little risky! When we get back to working on a Web Service View (which hopefully is soon, but I don't know!) we'll probably want to filter time spent to http.server spans vs. task spans. Again, this is just me speculating, but still!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can live with not having backdated data at the point where we start developing that view, I would drop it for now and bring it back when it's needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@@ -661,10 +173,6 @@ expression: metrics
"span.domain": "table",
"span.group": "f4a7fef06db3d88e",
"span.op": "db.sql.query",
"span.status": "ok",
"span.system": "postgresql",
"transaction.method": "POST",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso transaction.method is still a tag on exclusive_time, but not on exclusive_time_light.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I don't know of any product use-case for querying by transaction method but no transaction

Comment on lines -1199 to -1239
Bucket {
timestamp: UnixTimestamp(1597976302),
width: 0,
name: "d:spans/exclusive_time@millisecond",
value: Distribution(
[
2000.0,
],
),
tags: {
"environment": "fake_environment",
"span.category": "cache",
"span.description": "GET *",
"span.group": "37e3d9fab1ae9162",
"span.op": "cache.get_item",
"span.status": "ok",
"transaction": "gEt /api/:version/users/",
"transaction.method": "POST",
"transaction.op": "myop",
},
},
Bucket {
timestamp: UnixTimestamp(1597976302),
width: 0,
name: "d:spans/exclusive_time_light@millisecond",
value: Distribution(
[
2000.0,
],
),
tags: {
"environment": "fake_environment",
"span.category": "cache",
"span.description": "GET *",
"span.group": "37e3d9fab1ae9162",
"span.op": "cache.get_item",
"span.status": "ok",
"transaction.method": "POST",
"transaction.op": "myop",
},
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso no more exclusive time metrics for cache spans.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense! Same caveat about Web Service view, we'll probably want cache spans then

"span.category": "db",
"span.description": "GET *",
"span.group": "37e3d9fab1ae9162",
"span.op": "db.redis",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gggritso no more exclusive time metrics for redis spans (for now).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense! Same caveat about Web Service view, we'll probably want redis spans then

"transaction": "gEt /api/:version/users/",
"transaction.method": "POST",
"transaction.op": "myop",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DominikB2014 span.domain, span.status, transaction.method and transaction.op will disappear for resource spans.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjbayer We use span.domain for the domain dropdown, will there be an alternative?
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. how did I miss that. Will add it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏 Thanks @jjbayer!

@@ -218,7 +221,6 @@ expression: "(&event.value().unwrap().spans, metrics)"
"os.name": "iOS",
"release": "1.2.3",
"span.op": "ui.load.initial_display",
"transaction.method": "GET",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shruthilayaj transaction.method will disappear for mobile span metrics.

// Add conditions to filter spans if a specific module is enabled.
// By default, this will extract all spans.
let (span_op_conditions, resource_condition) = if project_config
// "all-modules" is our de-facto feature flag for experimental features.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// "all-modules" is our de-facto feature flag for experimental features.
// The "all-modules" is our de-facto feature flag for experimental features.

@@ -637,11 +152,8 @@ expression: metrics
"span.domain": "table",
"span.group": "f4a7fef06db3d88e",
"span.op": "db.sql.query",
"span.status": "ok",
"span.system": "postgresql",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, brining it back should not be a problem!

"transaction": "gEt /api/:version/users/",
"transaction.method": "POST",
"transaction.op": "myop",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can live with not having backdated data at the point where we start developing that view, I would drop it for now and bring it back when it's needed.

relay-dynamic-config/src/defaults.rs Outdated Show resolved Hide resolved
@jjbayer jjbayer enabled auto-merge (squash) January 8, 2024 11:19
@jjbayer jjbayer merged commit f7aea71 into master Jan 8, 2024
20 checks passed
@jjbayer jjbayer deleted the ref/span-metrics-by-module branch January 8, 2024 11:39
jjbayer added a commit that referenced this pull request Jan 10, 2024
Follow-up to #2907: Apply reduced
set of span metrics and tags regardless of feature flag.

ref: getsentry/team-ingest#252
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants