ref(spans): JSON Kafka message with metadata #2556
Can you add `profile_id` as well? Only filled if the `profile` context contains a `profile_id` key. It should be filled for every span generated by the transaction (and the segment as well).
I noticed some values missing from before or which could be added to help:
- `description`
- `start_timestamp_ms`
Is `group` in the `sentry_tags` guaranteed to be a good value (convertible to a hex value)?
And regarding `tags` and `sentry_tags`, can we make sure we don't add a `None` value?
relay-server/src/actors/processor.rs
Outdated
        exclusive_time
            .value()
            .ok_or(anyhow::anyhow!("missing exclusive_time"))?;

    if let Some(sentry_tags) = sentry_tags.value_mut() {
        sentry_tags.retain(|key, value| match value.value() {
            Some(s) => {
                if key == "group" {
                    // Only allow 16-char hex strings in group.
                    s.len() == 16 && s.chars().all(|c| c.is_ascii_hexdigit())
                } else {
                    true
                }
            }
            // Drop empty string values.
            None => false,
        });
    }
    if let Some(tags) = tags.value_mut() {
        tags.retain(|_, value| !value.is_empty())
    }
@phacops this should give some additional guarantees about the data.
If `tags` is not `None` but it's empty, is it forwarded as an empty object?
Probably. The consumer should be able to handle that though.
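The filtering discussed here can be sketched with plain standard-library types. This is a simplified illustration, not Relay's actual code: it assumes the tags are a `BTreeMap<String, Option<String>>` instead of Relay's annotated values, and the name `retain_valid_tags` is hypothetical. It shows that a map whose entries are all dropped is left empty rather than removed, which is why an empty object may still be forwarded.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of Relay): keeps only entries whose value
/// is present and non-empty. The map itself is never removed, so it can end
/// up empty and would then serialize as an empty object.
fn retain_valid_tags(tags: &mut BTreeMap<String, Option<String>>) {
    tags.retain(|_, value| value.as_deref().map_or(false, |s| !s.is_empty()));
}
```

A consumer reading such a message would therefore need to accept both a missing `tags` field and an empty one.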
Please, fix the changelog. Otherwise lgtm
CHANGELOG.md
Outdated
- Exclude more spans fron metrics extraction. ([#2522](https://github.com/getsentry/relay/pull/2522), [#2525](https://github.com/getsentry/relay/pull/2525), [#2545](https://github.com/getsentry/relay/pull/2545))
pull/2522), [#2525](https://github.com/getsentry/relay/pull/2525), [#2545](https://github.com/getsentry/relay/pull/2545))
Something strange here, looks like it got broken, maybe on `master` merge?
👍 fixed now
Is a new release of the sentry kafka schemas library required before deploying this?
relay-server/src/actors/processor.rs
Outdated
    // Only allow 16-char hex strings in group.
    s.len() == 16 && s.chars().all(|c| c.is_ascii_hexdigit())
Q: why not longer/shorter strings?
Yes, I think this is a mistake. We are converting from a hex string into a `u64`, but the length of the hex string can be 8, I believe.
The group hash is a 64-bit hash, i.e. 8 bytes, but it takes two hex characters to encode a byte. So the group hash must always be 16 characters long. I'm fine with relaxing the condition to `<= 16` though.
relay/relay-event-normalization/src/normalize/span/tag_extraction.rs
Lines 339 to 341 in e27eff0
    let mut span_group = format!("{:?}", md5::compute(scrubbed_desc));
    span_group.truncate(16);
    span_tags.insert(SpanTagKey::Group, span_group);
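As a rough illustration of why the group is 16 characters long: an MD5 digest is 128 bits, which prints as 32 hex characters, and truncating to 16 keeps the first 64 bits. The sketch below avoids the `md5` crate and works on an already formatted digest string. All function names are hypothetical, and the relaxed check (including its non-empty requirement) is only an assumption about what "relaxing the condition to `<= 16`" could look like, not necessarily what was merged.

```rust
/// Hypothetical helper mirroring the truncation step: keep the first 16
/// characters of a hex digest. `String::truncate` works on byte offsets,
/// which is safe here because hex digits are single-byte ASCII.
fn truncate_group(hex_digest: &str) -> String {
    let mut group = hex_digest.to_owned();
    group.truncate(16);
    group
}

/// Strict check as in the reviewed diff: exactly 16 ASCII hex digits.
fn is_valid_group_strict(s: &str) -> bool {
    s.len() == 16 && s.chars().all(|c| c.is_ascii_hexdigit())
}

/// Assumed relaxed variant: between 1 and 16 ASCII hex digits.
fn is_valid_group_relaxed(s: &str) -> bool {
    !s.is_empty() && s.len() <= 16 && s.chars().all(|c| c.is_ascii_hexdigit())
}
```

With the MD5 of the empty input, `d41d8cd98f00b204e9800998ecf8427e`, `truncate_group` yields `d41d8cd98f00b204`, which passes the strict check, while the full 32-character digest does not.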
`String::len` returns bytes, so 16 bytes * 2 chars / byte = 32 chars?
16 is the length of the hexadecimal representation, e.g. `0123456789abcdef`. But it still only encodes 8 bytes of data: `01 23 45 67 89 ab cd ef`.
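The relationship between string length and encoded bytes can be checked with a small sketch. It assumes the consumer wants the group as a `u64`; the function name `parse_group` is hypothetical and not taken from Relay or Sentry. For ASCII hex strings, `len()` (bytes of the string) equals the number of characters, and 16 such characters decode to 8 bytes of data.

```rust
/// Hypothetical helper: parses a span group of 1 to 16 hex digits into a u64.
/// Returns None for empty, over-long, or non-hex input. The explicit digit
/// check matters because `u64::from_str_radix` also accepts a leading `+`.
fn parse_group(s: &str) -> Option<u64> {
    if s.is_empty() || s.len() > 16 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    u64::from_str_radix(s, 16).ok()
}
```

For the example value, `parse_group("0123456789abcdef")` gives `0x0123456789abcdef`, whose big-endian bytes are `01 23 45 67 89 ab cd ef`.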
    def get_span(self):
        message = self.poll()
Should we use the timeout here? Calling the `spans_consumer` with a timeout that is then not used may cause some confusion.
See relay/tests/integration/fixtures/processing.py
Lines 160 to 162 in 415e5cb
    def poll(self, timeout=None):
        if timeout is None:
            timeout = self.timeout
Is there still a possibility for `status` and `op` to be in `sentry_tags` and at the span top-level and have a different value? Would be good to clean this up and have only 1 value, wherever it's easier to read from.
No, we haven't written one yet for
Can you also validate
Done.
Removed relay/relay-event-normalization/src/normalize/span/tag_extraction.rs Lines 235 to 238 in b45dce8
This PR makes the following changes to the relay -> sentry kafka schema for spans:
- Add `organization_id` and `retention_days` so the consumer does not have to look them up.
- `span.sentry_tags` can be used instead of `span.data`. It now contains the same keys as expected by snuba. The values are guaranteed to be strings.

Sentry counterpart: getsentry/sentry#57284