feat(metrics): Add a metrics aggregator #958
Since the aggregator now requires a recipient for flushed buckets, testing it requires merging #957 first, which moves the Actix test utilities into their own crate.
relay-metrics/src/bucketing.rs
Outdated
    }
    // Bucket from distant future. Best we can do is schedule it after initial delay
    // TODO: or refuse metric altogether?
    UnboundedInstant::Later => now + self.initial_delay(),
@jan-auer Let me know if this makes sense. I'm unsure what to do about metrics with future timestamps.
If the timestamp is not representable, then we're so far out of the valid ingestion range (something we've not yet implemented) that we can just drop the metric. WDYT if we just make this Option<Instant> everywhere and then bail out with an error to keep it simple?
I think we need the ternary enum to support backdated metrics - at least if we want the tests to pass on macOS, where even the following panics:
    use std::time::{Duration, Instant};

    fn main() {
        let now = Instant::now();
        let then = now - Duration::from_secs(24 * 60 * 60);
        println!("{:?}", then);
    }
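For reference, a minimal sketch of how the ternary enum could avoid that panic by using `Instant::checked_add` and `Instant::checked_sub` instead of the panicking operators. Only `UnboundedInstant::Later` appears in the diff above; the `Earlier` and `Instant` variant names, the `to_unbounded` helper, and its Unix-seconds parameters are assumptions made for illustration, not the code in this PR.

```rust
use std::time::{Duration, Instant};

/// Hypothetical shape of the ternary enum; only `Later` is visible in the diff.
#[derive(Clone, Copy, Debug, PartialEq)]
enum UnboundedInstant {
    /// The timestamp lies before the earliest representable `Instant`.
    Earlier,
    /// The timestamp maps to a representable `Instant`.
    Instant(Instant),
    /// The timestamp lies after the latest representable `Instant`.
    Later,
}

/// Maps a Unix timestamp (seconds) onto the monotonic clock, given a
/// reference pair of `now` and its Unix time `now_unix`.
/// Uses the checked operations so an out-of-range result never panics.
fn to_unbounded(now: Instant, now_unix: u64, timestamp: u64) -> UnboundedInstant {
    if timestamp >= now_unix {
        let delta = Duration::from_secs(timestamp - now_unix);
        match now.checked_add(delta) {
            Some(instant) => UnboundedInstant::Instant(instant),
            None => UnboundedInstant::Later,
        }
    } else {
        let delta = Duration::from_secs(now_unix - timestamp);
        match now.checked_sub(delta) {
            Some(instant) => UnboundedInstant::Instant(instant),
            None => UnboundedInstant::Earlier,
        }
    }
}
```

With this shape, a backdated metric that cannot be represented resolves to `Earlier` and can be flushed right away, while the `Later` case remains open to either scheduling after the initial delay or dropping the metric.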
* master:
  fix(server): Remove dependent items from envelope when dropping transaction item (#960)
  fix(clippy): Fix clippy 1.51.0 warnings (#965)
  feat(server): Add support for breakdowns ingestion (#934)
  build: Update schemars and remove workarounds (#961)
  feat(server): Add rule id to outcomes coming from transaction sampling (#953)
Uses the metrics aggregator from #958 to send batches of pre-aggregated buckets to the upstream instead of forwarding individual metric values. If sending fails for any reason, the metrics are merged back into the aggregator, which will retry flushing after the next interval.

Metric envelopes are not queued like regular envelopes. Instead, they go straight to the EventProcessor worker pool, where they are parsed, normalized and sent to the project's aggregator. This ensures that metric requests do not create long-running futures that would slow down the system. For mixed envelopes, metric items are split off and handled separately.

Metrics aggregators are spawned on the projects thread, which runs the project cache and manages all project state access. In the future, metrics aggregation will have to be moved to a separate resource to ensure that project state requests remain instant.
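The flush-and-merge-back behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the actor from #958: it handles counters only, and `BucketKey`, `Aggregator::insert`, `flush`, and `merge_back` are hypothetical names chosen for the example.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical bucket key: bucket start time in seconds, metric name, and tags.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct BucketKey {
    timestamp: u64,
    name: String,
    tags: BTreeMap<String, String>,
}

/// Simplified aggregator that only sums counter values per bucket.
struct Aggregator {
    bucket_interval: u64,
    buckets: HashMap<BucketKey, f64>,
}

impl Aggregator {
    fn new(bucket_interval: u64) -> Self {
        assert!(bucket_interval > 0, "bucket interval must be non-zero");
        Aggregator { bucket_interval, buckets: HashMap::new() }
    }

    /// Rounds the timestamp down to the start of its bucket and adds the value.
    fn insert(&mut self, timestamp: u64, name: &str, tags: BTreeMap<String, String>, value: f64) {
        let key = BucketKey {
            timestamp: timestamp / self.bucket_interval * self.bucket_interval,
            name: name.to_owned(),
            tags,
        };
        *self.buckets.entry(key).or_insert(0.0) += value;
    }

    /// Removes all buckets and hands them to the caller for sending upstream.
    fn flush(&mut self) -> Vec<(BucketKey, f64)> {
        self.buckets.drain().collect()
    }

    /// Re-inserts buckets whose upstream request failed, merging them with
    /// any values that arrived in the meantime.
    fn merge_back(&mut self, buckets: Vec<(BucketKey, f64)>) {
        for (key, value) in buckets {
            *self.buckets.entry(key).or_insert(0.0) += value;
        }
    }
}
```

If sending the result of `flush()` fails, passing the same vector to `merge_back()` restores it, so the next flush retries those values together with anything received since.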
Adds an Aggregator actor which places metrics into buckets by time and tag value and flushes them out in regular time intervals to a recipient. Aggregators have their own internal lifecycle and will be instantiated per organization.

This PR also updates the protocol implementation. Instead of allowing any value with any metric type, the parser now requires f64 for counters, distributions, and gauges. This will have sufficient accuracy for real-world use cases and allows for a fixed storage model. For sets, it allows arbitrary values and hashes them into a u32.

Follow-up to #948
Requires #957
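
To illustrate the fixed storage model from the description, here is a rough sketch of a value type where counters, distributions, and gauges hold an `f64` and set members are hashed into a `u32`. The `MetricValue` enum, the single-character type codes, and the use of the standard library's `DefaultHasher` are assumptions for this example; the hash function used by the PR is not stated here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical value type with a fixed-size payload per metric type.
#[derive(Debug, PartialEq)]
enum MetricValue {
    Counter(f64),
    Distribution(f64),
    Gauge(f64),
    Set(u32),
}

/// Hashes an arbitrary set member into a `u32`.
/// Assumption: `DefaultHasher` truncated to 32 bits stands in for whatever
/// hash the real parser uses.
fn hash_set_value(raw: &str) -> u32 {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    hasher.finish() as u32
}

/// Parses a raw value according to a hypothetical one-letter type code:
/// 'c' counter, 'd' distribution, 'g' gauge, 's' set.
/// Numeric types must parse as `f64`; sets accept any string.
fn parse_value(kind: char, raw: &str) -> Option<MetricValue> {
    match kind {
        'c' => raw.parse::<f64>().ok().map(MetricValue::Counter),
        'd' => raw.parse::<f64>().ok().map(MetricValue::Distribution),
        'g' => raw.parse::<f64>().ok().map(MetricValue::Gauge),
        's' => Some(MetricValue::Set(hash_set_value(raw))),
        _ => None,
    }
}
```

Because every variant has a fixed size, a bucket can store its values without carrying the original strings, at the cost of possible hash collisions between distinct set members.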