pubsub: native OpenTelemetry support #4665
Hiya, thanks for filing this issue. OT tracing support is definitely something that's on our roadmap, and we're actively planning to support this. At the moment, it's mostly blocked on standardization across Pub/Sub client library languages and waiting on the Go implementation for 1.0. I'll post here with updates. I also appreciate the detailed info. If you have specific asks or features you want, I'd be happy to relay them to the rest of the client library maintainers while we work on this.
Hi again. otel-go released 1.0 yesterday! 🎉 Could you please shed some insight on what
means? I talked to @aabmass and @weyert on the CNCF Slack about the
Hey! Yeah, we (myself and the Node library maintainer) are in contact with @aabmass. Yeah, that's one of the few things we're trying to standardize, though we're not 100% sure how we're going to fix that for Node since it's already released. We're discussing the merits of making a breaking change/major version bump versus just supporting both attributes. Among the other open questions is which parts of publishing/subscribing we want to trace (and what the end result should look like). We kind of have an idea about this from the Node implementation, but it seems like it might have issues. I'll start hacking away at a draft PR in Go and ask for feedback there, if you're willing to help with that!
Did you have any luck with this?
Sorry for the lack of an update. Yeah, I've had success with a basic span that encompasses send/receive/consume. I'm trying to make sure the spans look like what you get out of other messaging systems, following semconv, and I should have a draft ready by the end of this week.
@hongalex That's great! If that's the case, I will try to match the semconv usage in the
Could you clarify what the issue with message handling is? I did notice that there is some issue with context propagation in Node (contexts are marshalled with JSON instead of a propagator that follows the W3C tracestate/traceparent conventions). One thing that I'm still missing right now in Go is [linking spans when doing batch receiving](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.1.0/specification/trace/semantic_conventions/messaging.md#batch-receiving). After this, the PR should be ready for feedback.
Yeah, I meant that there is a ticket in the
Alongside that, I want to move towards the same approach you are using with the W3C tracestate/traceparent conventions. For the
I finally opened #5034, please take a look/try it out for yourself. It's very much a work in progress still (need to figure out how to pass
Thanks, I will check it out :)
Hi, what is the current status of this feature?
Hi folks, is there any news? Very excited for Google to support OTel.
We really need this feature. Right now we don't have a very elegant way of passing trace info from publisher to subscriber.
What is the situation here, given #5034 (comment)? If we want to keep a consistent trace across all messages, is GCP going to prevent that?
Is there any update?
I wrote a blog post about how to enable tracing in Google Pub/Sub. Basically, you need to wrap tracing information inside a message attribute. This is the link to my blog: https://www.nylas.com/blog/distributed-tracing-with-opentelemetry/. Search for "Tracing PubSub" and you will find my helper functions.
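For anyone who wants this before native support lands, here is a minimal sketch of that attribute-based approach (not the blog's actual helpers; `pubsubAttrCarrier`, `publishWithTrace`, and `extractTrace` are illustrative names). It assumes a W3C propagator has been registered globally, e.g. via `otel.SetTextMapPropagator(propagation.TraceContext{})`:

```go
package tracing

import (
	"context"

	"cloud.google.com/go/pubsub"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

// pubsubAttrCarrier adapts a Pub/Sub message's attribute map to the
// OpenTelemetry TextMapCarrier interface so a standard propagator can
// inject/extract W3C traceparent/tracestate values.
type pubsubAttrCarrier map[string]string

var _ propagation.TextMapCarrier = pubsubAttrCarrier(nil)

func (c pubsubAttrCarrier) Get(key string) string { return c[key] }
func (c pubsubAttrCarrier) Set(key, value string) { c[key] = value }
func (c pubsubAttrCarrier) Keys() []string {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	return keys
}

// publishWithTrace injects the current span context into message attributes
// before publishing.
func publishWithTrace(ctx context.Context, topic *pubsub.Topic, data []byte) *pubsub.PublishResult {
	attrs := map[string]string{}
	otel.GetTextMapPropagator().Inject(ctx, pubsubAttrCarrier(attrs))
	return topic.Publish(ctx, &pubsub.Message{Data: data, Attributes: attrs})
}

// extractTrace restores the remote span context on the subscriber side.
func extractTrace(ctx context.Context, msg *pubsub.Message) context.Context {
	return otel.GetTextMapPropagator().Extract(ctx, pubsubAttrCarrier(msg.Attributes))
}
```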
Is there any update on when this will be picked up again/become available? 😄
Any update here? It seems demand for this feature is strong.
Same question here, any plans to implement it? 🙇🏽 Thanks
Thanks for the patience on this. We're still working out a few kinks, but if you're interested, this is available to try out as a beta release. This can be done by running
Official documentation is still pending, but you should be able to see traces once you do the following:
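For reference, a minimal sketch of what opting in looks like, assuming the beta exposes the `EnableOpenTelemetryTracing` field on `pubsub.ClientConfig` (the name the feature eventually shipped under; treat it as an assumption for this beta build):

```go
package tracing

import (
	"context"

	"cloud.google.com/go/pubsub"
)

// newTracedClient opts the client into the library's native OpenTelemetry
// spans; they are recorded against the globally registered TracerProvider.
func newTracedClient(ctx context.Context) (*pubsub.Client, error) {
	return pubsub.NewClientWithConfig(ctx, "my-project", &pubsub.ClientConfig{
		EnableOpenTelemetryTracing: true, // assumed field name, see note above
	})
}
```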
Each message is given its own trace, from publish to subscribe. Batch RPC spans (publish, ack, modack) will live in their own traces, but durations can be viewed on the message trace via span events. While we are close to launching this feature, we welcome any feedback.
@hongalex Could you clarify how the new support connects with #5034 (comment)? Has your thinking changed? Are there best practices or suggested options you have in mind?
The main change is that we are now relying more heavily on links. The design at the time of the previous comment duplicated batch RPC spans to every single message trace. So if you make a publish RPC call with 10 messages in it, then each message trace will contain a copy of the publish RPC span. Now, that span exists in its own trace, and is linked to each message trace it is affiliated with. In addition, in each message trace, we added span events to say how long the RPC took. There were some span name changes and attribute changes that were made with updated messaging semantic conventions in mind. This is still an evolving standard so we reserve the right to make breaking changes to things like spans, span names, attribute names, and links.
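To illustrate the linking model (this is not the library's internal code), a batch span rooted in its own trace can carry one link back to each message trace via the OpenTelemetry Go API:

```go
package tracing

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

// startBatchSpan starts a batch-RPC-style span in its own trace, carrying
// one link per message trace it is affiliated with.
func startBatchSpan(msgCtxs []context.Context) (context.Context, trace.Span) {
	links := make([]trace.Link, 0, len(msgCtxs))
	for _, mc := range msgCtxs {
		links = append(links, trace.Link{SpanContext: trace.SpanContextFromContext(mc)})
	}
	// context.Background() deliberately severs the parent/child relationship,
	// so the batch span becomes the root of its own trace.
	return otel.Tracer("example").Start(context.Background(), "publish RPC", trace.WithLinks(links...))
}
```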
One issue I noticed is that the parent's context is not being used in the
I left a few comments on the PR. We ran this in production for a while but quickly noticed memory leaks, I believe from the
This is unfortunately intentional. The short story is that batched spans (like ack) have multiple parents, so they cannot rely solely on parent/child relationships, which is why I gave them a new context. Instead, batched spans live in their own trace and are linked to the main message trace. I plan to add a new option to link back from the message trace to the batched-ops traces.
Thanks for the comments. Fixed the memory leak issues (and others) and cut another release under
@hongalex thanks for the quick turnaround. I can confirm we aren't seeing the memory leak in the latest release.
Two changes:
- Use a Pub/Sub client library version that enables trace support. This allows a header `googclient-traceparent` to be propagated to push subscribers. We could also populate this ourselves (by setting a message attribute), but the library also adds a couple of nice spans inside the Pub/Sub client library.
- On handlers, overwrite `traceparent` with `googclient-traceparent` if available (see the sketch below).

This Pub/Sub client library is at 1.37, slightly older than our current version (1.38). According to googleapis/google-cloud-go#4665 (comment) this will be launching soonish, so we won't have to use an older version for much longer. If it takes very long, we can DIY.

Signed-off-by: Nghia Tran <[email protected]>
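A sketch of the handler-side overwrite described in the second bullet, taking the commit at its word that `googclient-traceparent` arrives as an HTTP header on push deliveries (`withPubSubTraceparent` is an illustrative name):

```go
package tracing

import "net/http"

// withPubSubTraceparent copies googclient-traceparent over traceparent so
// that standard W3C extraction on the push endpoint picks up the message's
// trace context.
func withPubSubTraceparent(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if tp := r.Header.Get("googclient-traceparent"); tp != "" {
			r.Header.Set("traceparent", tp)
		}
		next.ServeHTTP(w, r)
	})
}
```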
@hongalex do you have an estimated timeline for the launch of this feature? Also, do you plan to cut regular releases following the main versions (1.38.0 now)? Thanks a lot!
Thanks all for your patience. This feature is now released as part of
Please use the issue tracker for any problems you may notice while enabling tracing. As a reminder, tracing is GA, but span names, attributes, links, and other parts of the trace are subject to change as we align with the OpenTelemetry messaging semantic conventions.
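For anyone wiring this up end to end, one possible setup, assuming the Cloud Trace exporter from `opentelemetry-operations-go` (any OTLP-compatible exporter works the same way):

```go
package tracing

import (
	"context"

	texporter "github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"go.opentelemetry.io/otel"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// initTracing registers a global TracerProvider that batches spans to
// Cloud Trace; the Pub/Sub client's spans are recorded against it.
func initTracing(ctx context.Context, projectID string) (*sdktrace.TracerProvider, error) {
	exporter, err := texporter.New(texporter.WithProjectID(projectID))
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	otel.SetTracerProvider(tp)
	return tp, nil // call tp.Shutdown(ctx) on exit to flush remaining spans
}
```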
function in main line now googleapis/google-cloud-go#4665 (comment)

bump to go 1.23

Signed-off-by: Kenny Leung <[email protected]>
cc @aabmass
Is your feature request related to a problem? Please describe.
As a developer using Pub/Sub via the Go client library, I would like to propagate OpenTelemetry span contexts via messages so that I can better understand the entire lifecycle of a message.
Describe the solution you'd like
I would expect a working solution to look similar to the NodeJS implementation.
Maybe something like this?
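(The sketch below is purely hypothetical usage, not an API that existed at the time: publish inside an active span and expect the client to propagate and record its context.)

```go
package tracing

import (
	"context"

	"cloud.google.com/go/pubsub"
	"go.opentelemetry.io/otel"
)

// Hypothetical sketch: publish inside an active span and let the client
// propagate the span context along with the message.
func handle(ctx context.Context, topic *pubsub.Topic) error {
	ctx, span := otel.Tracer("my-app").Start(ctx, "handle-request")
	defer span.End()

	res := topic.Publish(ctx, &pubsub.Message{Data: []byte("hello")})
	_, err := res.Get(ctx) // the publish span would appear as a child of handle-request
	return err
}
```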
Then I would expect the result to look something like this when configured with a Cloud Trace exporter:
Image is from this article by the former intern who implemented the NodeJS and Python versions of this request.
Describe alternatives you've considered
The OpenTelemetry-OpenCensus Bridge would be great if Pub/Sub were already instrumented to propagate OpenCensus spans through messages like this, but unfortunately the behavior I describe is orthogonal to the existing issue of porting the existing tracing from OpenCensus to OpenTelemetry.