Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[diagnostics] Add InstrumentAdvice details to instrumentation doc #43640

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

CodeBlanch
Copy link

@CodeBlanch CodeBlanch commented Nov 19, 2024

Summary

Add content + example showing the System.Diagnostics.DiagnosticSource v9.0.0 InstrumentAdvice API.

The goal is to link to this from the OTel .NET docs: https://github.com/open-telemetry/opentelemetry-dotnet/blob/e3665c95c21082f5675cb4d5863051201f640397/docs/metrics/customizing-the-sdk/README.md?plain=1#L244-L252


Internal previews

📄 File 🔗 Preview link
docs/core/diagnostics/metrics-instrumentation.md docs/core/diagnostics/metrics-instrumentation

@CodeBlanch CodeBlanch requested review from tommcdon and a team as code owners November 19, 2024 19:21
@dotnetrepoman dotnetrepoman bot added this to the November 2024 milestone Nov 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates PR is created by someone from the .NET community. label Nov 19, 2024
class Program
{
static Meter s_meter = new Meter("HatCo.Store");
static readonly IReadOnlyList<double> s_shortSecondsBucketBoundaries = [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static readonly IReadOnlyList s_shortSecondsBucketBoundaries = [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10];

nit: does this need to be static at all. can you add the list inline when creating InstrumentAdvice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline list does seem like it would be a little easier to read. Perhaps something like:

        static Histogram<double> s_orderProcessingTime = s_meter.CreateHistogram(
            name: "hatco.store.order_processing_time",
            unit: "s",
            description: "Duration order processing.",
            advice: new InstrumentAdvice<double> { HistogramBucketBoundaries = [0.01, 0.05, 0.1, 0.5, 1, 5] });

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@@ -224,6 +224,11 @@ Types of instruments currently available:
<xref:System.Diagnostics.Metrics.Histogram%601.Record%2A> to record these measurements during the collection tool's update interval: 1,5,2,3,10,9,7,4,6,8. A collection tool
might report that the 50th, 90th, and 95th percentiles of these measurements are 5, 9, and 9 respectively.

> [!NOTE]
> For details about how to set the recommended bucket boundaries when creating
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to tell somewhere InstrumentAdvice<T> can be used in the future with more instruments as the scenario permit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you. I feel like we don't need to mention it now. When we have more advice features we can make doc updates to show them?

Aggregation](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#explicit-bucket-histogram-aggregation).

When using explicit bucket aggregation the measurements recorded through a
Histogram will be grouped into buckets which track the sum and count of values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sum is overall, only count is tracked per bucket..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Specification: `[ 0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000,
7500, 10000 ]`.

The default values may not lead to the best granularity for every Histogram. For
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://opentelemetry.io/blog/2023/why-histograms/
https://opentelemetry.io/blog/2023/histograms-vs-summaries/

are 2 useful links that maybe shared here for users to read more on this area...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added links as a sub-section after the main content


The `InstrumentAdvice` API may be used by instrumentation authors to specify the
set of recommended default bucket boundaries for a given Histogram. Consumers
can then choose to use those values when configuring aggregation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a note about -> more buckets gives more precise data, but at the cost of more memory/cpu, so users wont go and add millions of buckets.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an "IMPORTANT" section to cover that

@@ -542,6 +547,66 @@ Name Current Value
> [!NOTE]
> OpenTelemetry refers to tags as 'attributes'. These are two different names for the same functionality.

## Using Advice to customize Histogram instruments

Consumers of Histogram instruments can choose different aggregation strategies.
Copy link
Member

@noahfalk noahfalk Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth spending a few sentences just to describe what histogram bucketing is. Some readers may not be familiar with them. For example:

When using histograms, it is the responsibility of the tool or library collecting the data to decide how best to represent the distribution of values that were recorded. A common strategy is to divide up the range of possible values into sub-ranges called buckets and report how many recorded values were in each bucket. For example a tool might divide numbers into three buckets, those less than 1, those between 1-10, and those greater than 10. If your app recorded the values 0.5, 6, 0.1, 12 then there would be two data points the first bucket, one in the second, and one in the 3rd. Using more buckets with narrower ranges allows representing a distribution with more accuracy, but the tradeoff is that it takes more space to store the results.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added at the top. Removed the last sentence because I added an "IMPORTANT" section to cover that based on @cijothomas's feedback.

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looked good to me overall. A few suggestions inline.


> [!IMPORTANT]
> In general more buckets will lead to more precise data for a given Histogram
> but each bucket requires memory to store the aggregated details. It is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think the cost is not just memory, but more CPU to iterate through to find the right slot.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tweaked it to mention CPU as well

The default values may not lead to the best granularity for every Histogram. For
example, sub-second request durations would all fall into the `0` bucket.

To solve this problem the `9.0.0` version of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested rephrase (to account for the fact that this problem is already solved elsewhere; this is just making it easier or making it work for non-otel scenario too)

The solution used by OpenTelemetry was to customize the buckets, by using OpenTelemetry specific mechanism (link to view doc), but this only worked when OpenTelemetry was used.
Starting 9.0.0, histogram creation API is enhanced to allowed accepting buckets..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some content to mention View API

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-contribution Indicates PR is created by someone from the .NET community. dotnet-fundamentals/svc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants