Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Micrometer Tracing Too Granular #34133

Closed
jimbogithub opened this issue Feb 9, 2023 · 8 comments
Closed

Micrometer Tracing Too Granular #34133

jimbogithub opened this issue Feb 9, 2023 · 8 comments
Labels
status: declined A suggestion or change that we don't feel we should currently apply

Comments

@jimbogithub
Copy link

jimbogithub commented Feb 9, 2023

For a simple HTTP Server call trace we are seeing multiple Spans with two of them having a dozen or so attached Events. This feels like way too much detail for "application level" tracing. It is almost exclusively "framework" level tracing which as the application developer I have little interest in or control over. I'm relying on you guys to keep up the good work on that front.

We see Spans for:

  • the expected http get/post url...
  • security filterchain before with a dozen attached ...Filter before Events
  • secured request
  • security filterchain after with a dozen attached ...Filter after Events

I'm guessing there are probably ways to filter them out but I think the default behaviour should be less granular.

[Spring-Boot 3.0.2]

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Feb 9, 2023
@wilkinsona
Copy link
Member

wilkinsona commented Feb 9, 2023

This feels quite subjective to me and an area where it won't be possible to please all of the people all of the time. I've asked the observability team to take a look and share their opinions.

@wilkinsona wilkinsona added the status: waiting-for-internal-feedback An issue that needs input from a member or another Spring Team label Feb 9, 2023
@jonatan-ivanov
Copy link
Member

@jimbogithub Could you please give us a screenshot about your tracing UI showing us the parts you found too granular?
Also, what is the underlying issue (is there any)? Is it messing up your UI, are more important details lost because of them? Or you are not interested in them?

@jimbogithub
Copy link
Author

A few screenshots...

Jaeger for a simple HTTP request that does one JDBC call. There are 6 Spans when 2 would be sufficient and there are 14 Events each attached to 2 of the Spans:
Graph3

DataDog for a simple HTTP request (prometheus metric scrape). 5 Spans when I think 1 is sufficient. (DataDog hasn't caught up yet with displaying the Events properly):
Graph1

Graph2

Issues:

  1. Yes it does over complicate the FlameGraph displays. The above examples are the most trivial flows but have at least 5 Spans. In real world cases a Trace for a business level activity will have a series of REST, REST client, JDBC, JMS, Kafka etc Spans distributed across multiple services and these additional "framework" Spans are noise.
  2. Where's my Tag? If I add a Tag to the current Span during request processing it ends up on the "secured request" Span when all the other interesting Tags (URI, Status etc) are on the initial Span. I have to hunt in multiple places to find the info I want. Search tools will also not be able to find Spans of interest if the Tags we are searching on are on different Spans.
  3. Costs. There will be CPU, network and storage costs both in terms of performance and dollars (e.g. DataDog Span retention) for these Spans and Events. Some I can mitigate by discarding Spans but should they really be there in the first place?

Tracing is awesome. The ability to see the distributed application flow and identify latency, bottlenecks, call frequencies, changes in behaviour over time etc is great. But I don't need to know that the Spring HeaderWriterFilter took 9.6 microseconds. I get that the Spring developers might want to know that so maybe you have some config to turn it on and off but I doubt it's what most app developers are looking for and should be off by default.

@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Feb 10, 2023

TL;DR: All of these can be valid for other users but you can turn everything off, see the last paragraph.

~There are X Spans when Y would be sufficient

I think this really depends on your use-case. Different users have different needs but I agree that the http span is likely more important than the security spans for most of the users. Maybe we should consider adding a flag that is opt-out by default and you need to opt-in if you want this data.

there are 14 Events each attached to 2 of the Spans

This is exactly why I asked for a screenshot: I thought something is messed up and they are polluting other spans. Please notice that you don't need to look at these events on your UI, they should be hidden from you under the security spans. This can also be solved by the flag I mentioned before.

DataDog for a simple HTTP request (prometheus metric scrape). 5 Spans when I think 1 is sufficient.

Again, you might think 1 is sufficient but one can argue that 0 should be sufficient since they don't want to observe that Prometheus scrapes their app. Also some can argue that 1 is not sufficient because this way it is visible if the Prometheus endpoint is secured (and how). Also, since these security components can call out to other components over the network (LDAP, DB, Auth server) or just slow by design, this information actually can be useful.

Yes it does over complicate the FlameGraph displays...

By FlameGraph you mean the trace vies with the spans and timeline or do you have an actual FlameGraph?

Where's my Tag? If I add a Tag to the current Span...

You should not add tags to the current Span you should add them to the current Observation but in both cases you can get the parent and add the tags to it. Also, there are components that you can inject and provide the data you want, you don't necessarily need to manually get the Observation/Span and tag it.

But I don't need to know that the Spring HeaderWriterFilter took 9.6 microseconds.

I get that but please understand that there are users who might. Although I agree that being this on by default is arguable but to me the value of it in certain scenarios is not a question.

Costs. There will be CPU, network and storage costs both in terms of performance and dollars

Please look at the ObservationPredicate component. With the help of it, you can disable the observations you want, like security (but also actuator, etc). If you do this, the observation will be no-op makink the cost ~zero.
Fyi: we are also thinking about introducing some sort of granularity concept to the Observation API (think about log levels) which might be similar to the optin/out flag but with finer control.

@jimbogithub
Copy link
Author

jimbogithub commented Feb 12, 2023

@jonatan-ivanov Thanks for the detailed info. I've solved the "granularity" issues with:

	@Bean
	ObservationRegistryCustomizer<ObservationRegistry> observationRegistryCustomizer() {
		return (registry) -> registry.observationConfig()
				.observationPredicate(ObservationPredicates.noSpringSecurity())
				.observationPredicate(ObservationPredicates.noActuator());
	}
import org.springframework.http.server.observation.ServerRequestObservationContext;
import io.micrometer.observation.ObservationPredicate;

public final class ObservationPredicates {

	private ObservationPredicates() {
	}

	public static ObservationPredicate noSpringSecurity() {
		return (name, context) -> !name.startsWith("spring.security.");
	}

	public static ObservationPredicate noActuator() {
		return (name, context) -> {
			if (context instanceof ServerRequestObservationContext srCtx) {
				return !srCtx.getCarrier().getRequestURI().startsWith("/actuator");
			}
			return true;
		};
	}

}

I've also extended the org.springframework.http.server.observation.DefaultServerRequestObservationConvention to inject additional Tags via KeyValues (also related to #34132).

@wilkinsona
Copy link
Member

@jimbogithub Good to hear you've got things working in a way that suits you. Thanks for letting us know. I'll close this one now as I don't think there's anything that we should change in Spring Boot. If we get more feedback that the defaults are to granular we can reconsider the defaults in the future, but they won't change until a new minor release (at the earliest) along with a way to restore the current behavior.

@wilkinsona wilkinsona closed this as not planned Won't fix, can't repro, duplicate, stale Feb 12, 2023
@wilkinsona wilkinsona added status: declined A suggestion or change that we don't feel we should currently apply and removed status: waiting-for-triage An issue we've not yet triaged status: waiting-for-internal-feedback An issue that needs input from a member or another Spring Team labels Feb 12, 2023
@alexismanin
Copy link

Adding my opinion to the debate : the Spring security "secured request" span causes me trouble when I want to get or amend the root HTTP request span, as any component (web filter, rest controller, etc.) that want to access the root http request span must run before the security filter (which is nearly impossible except for a filter with higher precedence order).

I think it is good to add a 'security filter chain before', to be able to identify/isolate any security related action. But once security checks are done, we should resume on the base HTTP span, not force the rest of the filter chain to run inside a spring security based span ('secured request').

For clarity, here is an example: in the below trace, 1 Is the root span containing HTTP request information, 2 is useful, it encloses exactly the security chain, but 3 is annoying because it "hides" the HTTP span and make it difficult to access it. Also, it does not add much value, I think. If the security filter is successful, we resume HTTP request processing, I am really not sure we require a "secured request" span.

spring-security-span

I wonder if other people have the same opinion, or if the "secured request" span is useful for some use-cases ?

@jonatan-ivanov
Copy link
Member

Adding my opinion to the debate : the Spring security "secured request" span causes me trouble when I want to get or amend the root HTTP request span, as any component (web filter, rest controller, etc.) that want to access the root http request span must run before the security filter (which is nearly impossible except for a filter with higher precedence order).

I might not understand this or "nearly impossible" might be an overstatement, this should be possible in multiple ways:

  • Use an ObservationFilter or an ObservationConventon to modify the HTTP Observation
  • Use a SpanFilter to modify the HTTP Span
  • Call observation.getParent() recursively until it returns null to get the root (HTTP) Observation

not force the rest of the filter chain to run inside a spring security based span ('secured request').

As far as I know, the secured request Observation is there so that you can see if there any "rogue" operations that are not falling under the scope of authentication/authorization (not secured). You can easily disable it via an ObservationPredicate.

What is useful/unnecessary varies a lot, what feels annoying to you might be useful to other users and there are users who disable the whole security observability feature entirely that's why there are many options for you to customize the behavior (see above).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: declined A suggestion or change that we don't feel we should currently apply
Projects
None yet
Development

No branches or pull requests

5 participants