Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve servicegraph handling of newer OTEL semantic conventions #3711

Merged
merged 7 commits into from
May 28, 2024

Conversation

zalegrala
Copy link
Contributor

@zalegrala zalegrala commented May 23, 2024

What this PR does:

Currently, the db.name attribute is used as the node identity on the service graph. This leads to situations where the name of the database may be reused between services, leading to a single node on the graph where more than one actual service is in use.

This PR changes the node identity when newer OTEL semantic conventions are available as attributes on the span. The following attributes are considered in order:

  • peer.service
  • server.address
  • network.peer.address -- additionally network.peer.port will be used if available
  • fall back to db.name

Which issue(s) this PR fixes:
Fixes #3010

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

)

// Check for db.name first. The dbName is set initially to maintain backwards compatbility.
if name, ok := processor_util.FindAttributeValue(string(semconv.DBNameKey), resourceAttr, span.Attributes); ok {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern here loops over the span and resource attributes to find specific keys. This means that if you use the preferred attributes in the spans, then the generator has to do less work. I'd considered finding all available keys in one pass, but following the existing pattern elsewhere feels simpler.

Copy link
Contributor

@hedss hedss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a GTM perspective, this is great, thank you Zach!

Copy link
Contributor

@knylander-grafana knylander-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for updating the docs!

// Check for network.peer.address and network.peer.port. Use port if it is present.
if host, ok := processor_util.FindAttributeValue(string(semconv.NetworkPeerAddressKey), resourceAttr, span.Attributes); ok {
if port, ok := processor_util.FindAttributeValue(string(semconv.NetworkPeerPortKey), resourceAttr, span.Attributes); ok {
e.ServerService = fmt.Sprintf("%s:%s", host, port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

little concerned about cardinality here. it should be fine b/c it should accurately reflect real databases in an org.

we had similar issues before and @mapno submitted this.

#2505

my guess is that it's fine, but we should keep an eye out for cardinality jumps in prod as we roll this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about the carnality here too, but I think because this is server side information, we should be okay since I expect server name/port combos to be significantly less than the ephemeral ports in use by the client. Good thing to watch out for when we roll for sure.

Copy link
Member

@yvrhdn yvrhdn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 🙂

CHANGELOG.md Outdated
@@ -49,6 +49,7 @@
* [ENHANCEMENT] TraceQL - Add support for scoped intrinsics using `:` [#3629](https://github.com/grafana/tempo/pull/3629) (@ie-pham)
available scoped intrinsics: trace:duration, trace:rootName, trace:rootService, span:duration, span:kind, span:name, span:status, span:statusMessage
* [ENHANCEMENT] Performance improvements on TraceQL and tag value search. [#3650](https://github.com/grafana/tempo/pull/3650),[#3667](https://github.com/grafana/tempo/pull/3667) (@joe-elliott)
* [ENHANCEMENT] Improve use of OTEL semantic conventions on the service graph [#3711](https://github.com/grafana/tempo/pull/3711) (@zalegrala)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might have to move this back up the changelog, unless we backport this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you.

// if we have a server.address, use it as the database ServerService
// if we have a network.peer.address, use it as the database ServerService. Include :port if network.peer.port is present
// if we have a db.name, use it as the database ServerService, which is the backwards-compatible behavior
func (p *Processor) upsertDatabaseRequest(e *store.Edge, resourceAttr []*v1_common.KeyValue, span *v1_trace.Span) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to break this out in a separate function 👍

@zalegrala zalegrala force-pushed the otelDatabaseSemconv branch from 6211030 to ca5189a Compare May 28, 2024 16:27
@zalegrala zalegrala merged commit c6bf298 into grafana:main May 28, 2024
15 checks passed
@zalegrala zalegrala deleted the otelDatabaseSemconv branch May 28, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Metrics generator] Prioritise OTel DB semantic attributes over db.name and db.system
5 participants