Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Better ILogger sampling and filtering #82465

Open
Tracked by #97522
noahfalk opened this issue Feb 22, 2023 · 26 comments
Open
Tracked by #97522

[Feature Request]: Better ILogger sampling and filtering #82465

noahfalk opened this issue Feb 22, 2023 · 26 comments
Assignees
Milestone

Comments

@noahfalk
Copy link
Member

noahfalk commented Feb 22, 2023

Problem:

Large scale .NET customers are constantly complaining that logging is too expensive to turn on in production, perf, or stress scenarios. Their external log storage supports sampling of logs to limit storage costs, but that doesn't address the perf impact of collecting and exporting the data from the app.

Each log level adds an order of magnitude to the number of records produced, making collection prohibitive, but not collecting them reduces the chances of detecting problems earlier, or further optimizing performance.

Existing partial solutions:

  • When log statements are created in code controlled by the application developer they could add logic to make the log conditional. For example they might write if(Activity.Current?.IsRecorded) { _logger.LogInfo(...); }. This solution doesn't work for log statements in other libraries and it feels awkward to tie logging code from all over to a specific log sampling strategy.
  • When logging with the console sink developers can use QueueFullMode to drop log messages that can't be flushed fast enough. This does avoid some performance issues with blocking on console output, but with little control on which messages get preserved the log might be filled with low value messages while important content got dropped.

Better potential solutions:

UPDATE 1/12: I have added a more specific proposed solution in the discussion below

There are several things we might want to look into and they aren't mutually exclusive:

  • filtering by event ID - this would allow users to turn on/off specific messages from a library without necessarily seeing all other messages at the same logging level.
  • random sampling - The app developer can select some % of the total messages they want to keep and each message uses an RNG to decide if it will be filtered out.
  • request based sampling - Rather than deciding on a message by message basis what should be preserved, this approach tries to decide for an entire request will it be logged or not. This can be an improvement over random sampling because it ensures that when you do see message in the log such as an error or some interesting state you can also find all the other log messages for the same request which lead up to it.
  • rate limiting - rather than trying to define a specific fraction or type of message to filter in advance, we could conditionally filter messages only when load spikes. This can be a useful safety backstop to ensure that sudden failure or load increase on logging doesn't spillover into other problems because the logging is doing too much IO or using too much CPU.
    • combining rate limiting and event ID to only send N per time window for each ID

Ideally any/all of the above options would be configurable in code or via reloadable configuration and they are generic enough to work with any log source and sink. SRE may want to use config settings to mitigate an active issue where a production is logging too much or they may want to proactively configure these options to curate their log output. Being able to change configuration on the fly enables rapid exploration of problem hypotheses without requiring taking the service offline to update.

@noahfalk noahfalk added this to the Future milestone Feb 22, 2023
@ghost
Copy link

ghost commented Feb 22, 2023

Tagging subscribers to this area: @dotnet/area-extensions-logging
See info in area-owners.md if you want to be subscribed.

Issue Details

Problem:

Large scale .NET customers are constantly complaining that logging is too expensive to turn on in production, perf, or stress scenarios. Their external log storage supports sampling of logs to limit storage costs, but that doesn't address the perf impact of collecting and exporting the data from the app.

Existing partial solutions:

  • When log statements are created in code controlled by the application developer they could add logic to make the log conditional. For example they might write if(Activity.Current?.IsRecorded) { _logger.LogInfo(...); }. This solution doesn't work for log statements in other libraries and it feels awkward to tie logging code from all over to a specific log sampling strategy.
  • When logging with the console sink developers can use QueueFullMode to drop log messages that can't be flushed fast enough. This does avoid some performance issues with blocking on console output, but with little control on which messages get preserved the log might be filled with low value messages while important content got dropped.

Better potential solutions:

There are several things we might want to look into and they aren't mutually exclusive:

  • filtering by event ID - this would allow users to turn on/off specific messages from a library without necessarily seeing all other messages at the same logging level.
  • random sampling - The app developer can select some % of the total messages they want to keep and each message uses an RNG to decide if it will be filtered out.
  • request based sampling - Rather than deciding on a message by message basis what should be preserved, this approach tries to decide for an entire request will it be logged or not. This can be an improvement over random sampling because it ensures that when you do see message in the log such as an error or some interesting state you can also find all the other log messages for the same request which lead up to it.
  • rate limiting - rather than trying to define a specific fraction or type of message to filter in advance, we could conditionally filter messages only when load spikes. This can be a useful safety backstop to ensure that sudden failure or load increase on logging doesn't spillover into other problems because the logging is doing too much IO or using too much CPU.

Ideally any/all of the above options would be configurable in code or via reloadable configuration and they are generic enough to work with any log source and sink. Customers may want to use config settings to mitigate an active issue where a production is logging too much or they may want to proactively configure these options to curate their log output.

Author: noahfalk
Assignees: -
Labels:

area-Extensions-Logging

Milestone: Future

@samsp-msft
Copy link
Member

Added some additional comments

@Tratcher
Copy link
Member

Ideally all of these would be configurable by category like log level today. E.g. for a random request I want to see the routing, auth, and MVC debug level and higher logs.

@reyang
Copy link

reyang commented Feb 23, 2023

SRE may want to use config settings to mitigate an active issue where a production is logging too much or they may want to proactively configure these options to curate their log output. Being able to change configuration on the fly enables rapid exploration of problem hypotheses without requiring taking the service offline to update.

Does this imply that the SRE should be able to get a full list of logs that are available? (e.g. discover all the logger categories/instances)

@Tratcher
Copy link
Member

SRE may want to use config settings to mitigate an active issue where a production is logging too much or they may want to proactively configure these options to curate their log output. Being able to change configuration on the fly enables rapid exploration of problem hypotheses without requiring taking the service offline to update.

Does this imply that the SRE should be able to get a full list of logs that are available? (e.g. discover all the logger categories/instances)

More likely they're told which logs to enable for a given feature area or situation.

@samsp-msft
Copy link
Member

Discovery of telemetry is a general problem, without good solutions. The same problem exists for Activities, Counters, EventSource, & Logging.

Its further complicated by conditional code in the app that means that log messages will only be sent if that log provider and level are supported.

There is no good way to statically analyze the application and produce a list of log entries, events etc. The main way to discover them seems to be through reading the code, documentation or a tool that will capture all of them and enumerate the results. I wrote this for eventsource that will listen to all the events while running, and then provide details of them. But using it requires modifying the app code.

I theory an SDK like open-telemetry could include a similar feature in it - as a form of collection mode that can be run for a period, that will then produce a report with all the telemetry that could be collected. Its semi-expensive to run in that all the code paths for generating all the telemetry will be run, but its not egressing much data out of the application.

For that reason it would probably a good feature for dotnet-counters, dotnet-trace and dotnet-monitor.

Does the app-insights SDK have any command/control capability from the backend to the app? As this is something I'd expect a n SRE to run occasionally when examining a problem, rather than something the app is written to send?

Maybe this is a case where you'd use one of the dotnet-* tools from a console to interrogate the running app to determine what its emitting.

@reyang
Copy link

reyang commented Feb 24, 2023

Discovery of telemetry is a general problem, without good solutions. The same problem exists for Activities, Counters, EventSource, & Logging.

I have different understanding. I think ActivitySource, Meter and EventSource allow folks to inject a "listener", and that listener will get notified for all the existing sources and the new sources that will be created in the future. Which means there is a central way of discovering all the sources.

ILogger has a different philosophy that there is no such central place.

@samsp-msft
Copy link
Member

With the above, it will tell you the sources, but not the events that occur - which is half the value so that you can know what settings to use to see the events fired so you can pick the appropriate settings.

With ILogger, you could create a new provider, and configure it to enumerate the events that are fired, rather than log each of them individually.

@reyang
Copy link

reyang commented Feb 24, 2023

With the above, it will tell you the sources, but not the events that occur - which is half the value so that you can know what settings to use to see the events fired so you can pick the appropriate settings.

With ILogger, you could create a new provider, and configure it to enumerate the events that are fired, rather than log each of them individually.

Would you elaborate a bit?

With the listener/source model, it is possible to build a listener that catches all the events, although it might not be the most efficient way, it has provided a mechanism for folks to get what they want if they are willing to pay the cost.

I don't know a way how to get a holistic view of "given an application which uses several 3rd party libs, find all the loggers and hook all the logs/records" unless I use a debugger.

@samsp-msft
Copy link
Member

samsp-msft commented Feb 24, 2023

Following the code in https://learn.microsoft.com/en-us/dotnet/core/extensions/custom-logging-provider
I would expect to have a new Logger, say LogEntryEnumeratorLogger, that:

  • Implements ILogger,
  • Has a corresponding ILoggerProvider
  • Has an AddLogEntryEnumeratorLogger() extension method for the ILoggingBuilder

It would collect information on each log entry passed through it. It would build up a dictionary with each log message, including:

  • Category
  • Level
  • EventID
  • Log format string
  • Count of how many times it was fired

Then when Dispose is called, it would write out the entries to either the console or a file.

It would need to be enabled via program.cs, in addition to whatever other providers the app wants to use:

var builder = WebApplication.CreateBuilder(args);

builder.Logging.AddLogEntryEnumeratorLogger();

And the configured in AppSettings.json, as to whether to collect the info, and how to output it.

{
  "Logging" : {
    "LogEntryEnumeratorLogger" : {
      "Enabled" : true,
      "OutputTo" : "File",
      "FilePath" : "./log_entries.json"
    }
  }
}

There would be a cost to operate the logger, as it would be building up a structure of all the unique log messages. It wouldn't want to format the messages, just collect the format string itself. If the log entry has an ID, then for efficiency it can assume that all messages with that ID would be the same, so it can shortcut having to compare them all. For messages without an ID, it would compare the format string.

When 3rd party libraries are logging, they should be retrieving the ILogger from DI, and so as long as this was initialized before they are called, then it will be able to capture their entries too.

@samsp-msft
Copy link
Member

So it is Friday, and it was a good opportunity to write some code to delve more into Logging implementation, I hacked up a quick demo.

Using https://gist.github.com/samsp-msft/37a5aba3ef7abb5bbadf01ed48477886 and

builder.Logging.AddLogEnumeratorLogger();
...
app.MapGet("/Logs", () => LogEnumeratorLoggerProvider.dumpLogInfo());

and a configuration of

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning",
      "System.Net.Http": "Warning"
    },
    "Console": {
      "IncludeScopes": true,
      "LogLevel": {
        "Microsoft": "Warning",
        "Default": "Information"
      }
    },
    "LogEnumerator": {
      "LogLevel": {
        "Default" : "Trace"
      }
    }
  }
}
Category	Level	ID	Message Format	Count
System.Net.Http.HttpClient.Default.ClientHandler	Information	100	Sending HTTP request {HttpMethod} {Uri}	120
System.Net.Http.HttpClient.Default.ClientHandler	Information	101	Received HTTP response headers after {ElapsedMilliseconds}ms - {StatusCode}	119
System.Net.Http.HttpClient.Default.ClientHandler	Trace	102		120
System.Net.Http.HttpClient.Default.ClientHandler	Trace	103		119
Microsoft.AspNetCore.HostFiltering.HostFilteringMiddleware	Trace	 2	All hosts are allowed.	4
Microsoft.AspNetCore.HostFiltering.HostFilteringMiddleware	Debug	--	Wildcard detected, all requests with hosts will be allowed.	1
Microsoft.AspNetCore.Routing.EndpointMiddleware	Information	 1	Executed endpoint '{EndpointName}'	3
Microsoft.AspNetCore.Routing.EndpointMiddleware	Information	--	Executing endpoint '{EndpointName}'	4
Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware	Error	 1	An unhandled exception has occurred while executing the request.	1
Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware	Warning	 2	The response has already started, the error page middleware will not be executed.	1
Microsoft.Hosting.Lifetime	Information	14	Now listening on: {address}	2
Microsoft.Hosting.Lifetime	Information	--	Hosting environment: {EnvName}	1
Microsoft.Hosting.Lifetime	Information	--	Content root path: {ContentRoot}	1
Microsoft.Hosting.Lifetime	Information	--	Application started. Press Ctrl+C to shut down.	1
Microsoft.Extensions.Hosting.Internal.Host	Debug	 1	Hosting starting	1
Microsoft.Extensions.Hosting.Internal.Host	Debug	 2	Hosting started	1
Microsoft.AspNetCore.Routing.Matching.DfaMatcher	Debug	1001	{CandidateCount} candidate(s) found for the request path '{Path}'	4
Microsoft.AspNetCore.Routing.Matching.DfaMatcher	Debug	1005	Endpoint '{Endpoint}' with route pattern '{RoutePattern}' is valid for the request path '{Path}'	2
Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets	Debug	 6	Connection id "{ConnectionId}" received FIN.	2
Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets	Debug	 7	Connection id "{ConnectionId}" sending FIN because: "{Reason}"	2
Microsoft.AspNetCore.Hosting.Diagnostics	Information	 1		4
Microsoft.AspNetCore.Hosting.Diagnostics	Information	 2		3
Microsoft.AspNetCore.Hosting.Diagnostics	Debug	13	Loaded hosting startup assembly {assemblyName}	3
System.Net.Http.HttpClient.Default.LogicalHandler	Information	100	Start processing HTTP request {HttpMethod} {Uri}	120
System.Net.Http.HttpClient.Default.LogicalHandler	Information	101	End processing HTTP request after {ElapsedMilliseconds}ms - {StatusCode}	119
System.Net.Http.HttpClient.Default.LogicalHandler	Trace	102		120
System.Net.Http.HttpClient.Default.LogicalHandler	Trace	103		119
Microsoft.AspNetCore.Server.Kestrel.Http2	Debug	35	Trace id "{TraceIdentifier}": HTTP/2 stream error "{error}". A Reset is being sent to the stream.	1
Microsoft.AspNetCore.Server.Kestrel.Http2	Trace	37	Connection id "{ConnectionId}" received {type} frame for stream ID {id} with length {length} and flags {flags}.	9
Microsoft.AspNetCore.Server.Kestrel.Http2	Trace	49	Connection id "{ConnectionId}" sending {type} frame for stream ID {id} with length {length} and flags {flags}.	91
Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionMiddleware	Debug	 1	Failed to authenticate HTTPS connection.	2
Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionMiddleware	Debug	 3	Connection {connectionId} established using the following protocol: {protocol}	1
Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware	Debug	 1	Request matched endpoint '{EndpointName}'	4
Microsoft.AspNetCore.Server.Kestrel.Connections	Debug	 1	Connection id "{ConnectionId}" started.	3
Microsoft.AspNetCore.Server.Kestrel.Connections	Debug	 2	Connection id "{ConnectionId}" stopped.	2
Microsoft.AspNetCore.Server.Kestrel.Connections	Debug	39	Connection id "{ConnectionId}" accepted.	3
LoadTest	Information	--	using service url {url}	2
LoadTest	Trace	--	Thread {threadID} making request {count}	120
LoadTest	Information	--	Loading service #{index} for {duration} with {threads}	2
Microsoft.AspNetCore.Server.Kestrel	Error	13	Connection id "{ConnectionId}", Request id "{TraceIdentifier}": An unhandled exception was thrown by the application.	1

@samsp-msft
Copy link
Member

This also makes me question whether we need to have filtering built in - or whether it could be another Logging provider that you would nest other loggers under.

builder.Logging.AddLogFilter( logs =>
{
  logs.AddConsoleLogger();
  Logs.AddXXXLogger();
}

Which would be all managable through config as to what filtering is desired.

@cijothomas
Copy link
Contributor

It would collect information on each log entry passed through it. It would build up a dictionary with each log message, including:

@samsp-msft You'll be to collect information those logs only which were made using the ILogger instances obtained from the same LoggerFactory to which the LoggerProvider was added. I could setup one/many other LoggerFactory instances with different providers, and there is no way to get that information inside the provider you create. (Which is different from Activity/Meter/EventSource - they all allow setting up listeners, so there is way to "hide" anything.)

@Tratcher
Copy link
Member

More examples of wanting to emit a specific log from a category: microsoft/reverse-proxy#2202

@halter73
Copy link
Member

#49924 appears to be a special case of this issue.

@tarekgh tarekgh modified the milestones: Future, 9.0.0 Nov 20, 2023
@noahfalk
Copy link
Member Author

I'd like to propose a narrow and more concrete solution to see if we can move this forward. We would add a new API similar to this (I'm not too concerned with exact naming or signature, the proposal is primarily about the general concept and scope):

public static class LoggingFilter
{
    public static void AddDynamicFilter(
        this ILoggingBuilder builder,
        string category, 
        Func<LogLevel, EventId, TState, Exception?, Func<TState,Exception?,string>, bool> filter);
}

Every time ILogger.Log() gets called on an ILogger produced from this ILoggerFactory it would do:

  1. [pre-existing behavior] if IsEnabled() is false return without logging
  2. [new behavior] call each dynamic filter delegate with matching category prefix, if any of the filters returns false return without logging.
  3. [pre-existing behavior] forward the call to registered ILoggerProviders

Some possible usage examples:

// sample a randomly selected 5% of logs
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => Random.Shared.NextDouble() < 0.05);

// only log messages if there is an active Activity which is marked as Recorded
// (distributed tracing sampling rules determine this)
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => Activity.Current?.IsRecorded);

// eliminate a known noisy log message
builder.Logging.AddDynamicFilter("FooCorp.SomeCategoryName",
    (level, eventId, state, exception, formatter) => eventId != UndesiredEventId);

// rate limiting
// assuming some helper rateLimiter object has already been written that keeps track
// of an incrementing count and resets it on a timer
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => rateLimiter.IsNewMessageWithinLimit());

My immediate focus is less on the details and more on getting some agreement about the scope of solution. In particular I am suggesting these things are out of scope for this feature:

  • Programmatic discovery of different events - there is a lot of discussion above about discovery of categories and events which I think can be useful in its own right, but I don't see it as a hard dependency to do filtering. All of those examples above seem useful, I don't know how someone would implement them today, and they don't rely on any form of dynamic event discovery.
  • Higher level abstractions that define filtering policies or compose them in unique ways - There is certainly room for external libraries (perhaps OpenTelemetry?) to implement a variety of filtering strategies out-of-the-box and provide strongly typed APIs for them, but I don't think there is an immediate need to have such APIs baked into M.E.Logging.Abstractions.
  • Tail-sampling/filtering - Recording logged events into a buffer and then filtering them later can be a powerful technique to only log data about interesting requests (such as slow requests or failed requests). However ILoggerProviders are generally not designed to expect data that has been buffered so implementing this kind of filtering at the ILogger API level is not the ideal place for it. This type of filtering is probably better suited within the implementation of specific ILoggerProviders that are managing buffers of serialized log data, or out-of-process in collection agents and backend processing systems.

@JamesNK
Copy link
Member

JamesNK commented Jan 16, 2024

public static class LoggingFilter
{
    public static void AddDynamicFilter(
        this ILoggingBuilder builder,
        string category, 
        Func<LogLevel, EventId, TState, Exception?, Func<TState,Exception?,string>, bool> filter);
}

I'm not sure TState would work here. Where does the type information come from?

Being able to access the state is useful. It would require casting it to IReadOnlyList<KeyValuePair<string, object>> but finding a pay-for-play way of doing that is tough.

We don't want to preemptively cast state to a list because that could cause boxing, and state doesn't have to implement the list interface.

Update: Would changing the signature to AddDynamicFilter<TState> be enough?

@reyang
Copy link

reyang commented Jan 16, 2024

  1. Do we want category to be the exact category name or it'll support wildcard/prefix? (I think the latter would be more useful, and better aligned with existing pattern; the downside is potential overlap and perf cost).
  2. With AddDynamicFilter, I wonder if folks would need RemoveDynamicFilter or ClearDynamicFilters / RemoveAllDynamicFilters. SetDynamicFilter could be another option. Considering the business need "this would allow users to turn on/off specific messages from a library without necessarily seeing all other messages at the same logging level", removing existing filters could be useful although the user could choose to put the switch/on/off logic in their callbacks.

@samsp-msft
Copy link
Member

samsp-msft commented Jan 16, 2024

  1. Do we want category to be the exact category name or it'll support wildcard/prefix? (I think the latter would be more useful, and better aligned with existing pattern; the downside is potential overlap and perf cost).

Through system.configuration, the settings for categories is a prefix match, that would probably be best here. Its not general wildcards, or regex, so should be about as fast as a normal string match.

How will this work with a hierarchy. If one filter is filtering out "foo.bar" and another is trying to filter in "foo.bar.baz" won't the former deny the latter, and so break what the developer is trying to achieve?

Maybe they need to be called from least to most specific, and have the current on/off state included in the parameters. That way more specific filters will still be called, and can make the call based on previous filters in the chain.

@noahfalk
Copy link
Member Author

So far all the feedback sounds like it is refining the design so I'm taking it as people are on-board with the general premise/scope so far. I'm not trying to dive too far into the details yet, but off-the-cuff thoughts for various points raised:

I'm not sure TState would work here. Where does the type information come from?

You are right a delegate is insufficient. We probably need an interface definition where one of the methods has a generic type parameter on it or we could exclude TState from the filtering parameter list.

ILoggerDynamicFilter
{
    bool Filter<TState>(...);
}

Do we want category to be the exact category name or it'll support wildcard/prefix? (I think the latter would be more useful, and better aligned with existing pattern; the downside is potential overlap and perf cost).

I was imagining it would be the latter but omitted this for brevity. I expect all the perf cost is paid at startup when determining which filters apply to which ILogger instances. Once that is determined the category string no longer plays a role and the Logger is just enumerating and invoking a list of filters that were determined to match.

With AddDynamicFilter, I wonder if folks would need RemoveDynamicFilter or ClearDynamicFilters / RemoveAllDynamicFilters.

Yeah we might want those. I'd consider any form of Add/Remove/Clear or any other API that manages the list to be in the scope of the proposal. We'd just have to narrow in on exactly which ones we want.

How will this work with a hierarchy. If one filter is filtering out "foo.bar" and another is trying to filter in "foo.bar.baz" won't the former deny the latter, and so break what the developer is trying to achieve?

My initial thought (which I am not at all tied to) was that we'd enumerate every filter that matched and as soon as one returned false then we short circuit. I think this is an area of the design that would need more investigation and/or feedback.

@CodeBlanch
Copy link
Contributor

Being able to access the state is useful. It would require casting it to IReadOnlyList<KeyValuePair<string, object>> but finding a pay-for-play way of doing that is tough.

Getting into the weeds here a bit but I agree with @JamesNK. Working with TState in the ILogger world is tough.

Imagine a bunch of filters doing...

if (state is IReadOnlyList<KeyValuePair<string, object>> list)
{
    ...
}

That is going to cause a lot of boxing.

One thing I would like to see which might help here is for FormattedLogValues to be made public (could be renamed). That would allow code to be written like:

if (typeof(TState) == typeof(FormattedLogValues))
{
    var typedState = (FormattedLogValues)(object)state;
    ...
}
else if (state is IReadOnlyList<KeyValuePair<string, object>> list)
{
   ...
}

The idea being we could avoid boxing (for these filtering cases but also normal log handling) for the common FormattedLogValues-based log messages. And the JIT should be able to specialize the code.

@noahfalk
Copy link
Member Author

The idea being we could avoid boxing

If you are careful about the exact code pattern you use, the JIT can already optimize away the boxing operation without needing to special case the type in code (Example). In the short term you can change the code to this and avoid the boxing:

if (state is IReadOnlyList<KeyValuePair<string, object>>)
{
   ((IReadOnlyList<KeyValuePair<string, object>>)state).WhateverFunction();
}

In the longer term the JIT is still improving their analysis to recognize the box removal optimization in more code patterns. I wouldn't be surprised if your else if code above compiles without boxing in a near future .NET release. I don't think we need to expose more types to allow people to micro-optimize, we can just let the JIT get better at doing the optimization automatically.

@CodeBlanch
Copy link
Contributor

Interesting thanks for sharing @noahfalk! That sharplab example is 🤯 Improving the JIT to handle this would be 💯 🚀

@cijothomas
Copy link
Contributor

I'd like to propose a narrow and more concrete solution to see if we can move this forward. We would add a new API similar to this (I'm not too concerned with exact naming or signature, the proposal is primarily about the general concept and scope):

public static class LoggingFilter
{
    public static void AddDynamicFilter(
        this ILoggingBuilder builder,
        string category, 
        Func<LogLevel, EventId, TState, Exception?, Func<TState,Exception?,string>, bool> filter);
}

Every time ILogger.Log() gets called on an ILogger produced from this ILoggerFactory it would do:

  1. [pre-existing behavior] if IsEnabled() is false return without logging
  2. [new behavior] call each dynamic filter delegate with matching category prefix, if any of the filters returns false return without logging.
  3. [pre-existing behavior] forward the call to registered ILoggerProviders

Some possible usage examples:

// sample a randomly selected 5% of logs
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => Random.Shared.NextDouble() < 0.05);

// only log messages if there is an active Activity which is marked as Recorded
// (distributed tracing sampling rules determine this)
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => Activity.Current?.IsRecorded);

// eliminate a known noisy log message
builder.Logging.AddDynamicFilter("FooCorp.SomeCategoryName",
    (level, eventId, state, exception, formatter) => eventId != UndesiredEventId);

// rate limiting
// assuming some helper rateLimiter object has already been written that keeps track
// of an incrementing count and resets it on a timer
builder.Logging.AddDynamicFilter(null,
    (level, eventId, state, exception, formatter) => rateLimiter.IsNewMessageWithinLimit());

My immediate focus is less on the details and more on getting some agreement about the scope of solution. In particular I am suggesting these things are out of scope for this feature:

  • Programmatic discovery of different events - there is a lot of discussion above about discovery of categories and events which I think can be useful in its own right, but I don't see it as a hard dependency to do filtering. All of those examples above seem useful, I don't know how someone would implement them today, and they don't rely on any form of dynamic event discovery.
  • Higher level abstractions that define filtering policies or compose them in unique ways - There is certainly room for external libraries (perhaps OpenTelemetry?) to implement a variety of filtering strategies out-of-the-box and provide strongly typed APIs for them, but I don't think there is an immediate need to have such APIs baked into M.E.Logging.Abstractions.
  • Tail-sampling/filtering - Recording logged events into a buffer and then filtering them later can be a powerful technique to only log data about interesting requests (such as slow requests or failed requests). However ILoggerProviders are generally not designed to expect data that has been buffered so implementing this kind of filtering at the ILogger API level is not the ideal place for it. This type of filtering is probably better suited within the implementation of specific ILoggerProviders that are managing buffers of serialized log data, or out-of-process in collection agents and backend processing systems.

This proposal looks good to me. Thank you.

@samsp-msft
Copy link
Member

I tried to create a sampler for logs via OpenTelemetry using the OTel processor extension, but it turns out that there is no way to suppress entries with that API. I tried setting the log level of the discarded messages to None, but the evaluation of the level happens before the processor and is not re-evaluated.

The semantics of what I would want the sampler to do is captured in https://gist.github.com/samsp-msft/e37120be7322d17500dcc797a840faef.

The API proposed above would work perfectly for this scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants