feat(metrics): Add a metrics aggregator #958

Merged
jan-auer merged 32 commits into master from feat/metrics-aggregate-envelopes
Mar 26, 2021


@jan-auer commented Mar 22, 2021

Adds an Aggregator actor that places metrics into buckets by time and tag values and flushes them to a recipient at regular intervals. Aggregators have their own internal lifecycle and will be instantiated per organization.
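
To illustrate the bucketing described above, here is a minimal sketch, not the code from this PR. It assumes timestamps in UNIX seconds and a fixed bucket interval, and it only sums counter values. `BucketKey`, `CounterAggregator`, and their methods are hypothetical names chosen for this example; the actor plumbing and the recipient are left out.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical bucket key: start of the time window plus metric name and tags.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct BucketKey {
    /// UNIX seconds, rounded down to a multiple of the bucket interval.
    pub window_start: u64,
    pub name: String,
    /// A BTreeMap keeps tags ordered, so equal tag sets hash identically.
    pub tags: BTreeMap<String, String>,
}

/// Hypothetical aggregator that only handles counters (values are summed).
pub struct CounterAggregator {
    bucket_interval: u64,
    buckets: HashMap<BucketKey, f64>,
}

impl CounterAggregator {
    pub fn new(bucket_interval: u64) -> Self {
        assert!(bucket_interval > 0, "bucket interval must be non-zero");
        Self { bucket_interval, buckets: HashMap::new() }
    }

    /// Adds a value to the bucket selected by timestamp, name, and tags.
    pub fn insert(&mut self, timestamp: u64, name: &str, tags: BTreeMap<String, String>, value: f64) {
        let window_start = timestamp - timestamp % self.bucket_interval;
        let key = BucketKey { window_start, name: name.to_owned(), tags };
        *self.buckets.entry(key).or_insert(0.0) += value;
    }

    /// Removes and returns every bucket whose time window has ended by `now`.
    pub fn flush(&mut self, now: u64) -> Vec<(BucketKey, f64)> {
        let interval = self.bucket_interval;
        let ready: Vec<BucketKey> = self
            .buckets
            .keys()
            .filter(|key| key.window_start + interval <= now)
            .cloned()
            .collect();

        ready
            .into_iter()
            .filter_map(|key| self.buckets.remove(&key).map(|value| (key, value)))
            .collect()
    }
}
```

In this sketch, two values with the same name and tags that fall into the same window end up in one bucket, and `flush` would be called by a timer at the regular interval.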

This PR also updates the protocol implementation. Instead of allowing any value with any metric type, the parser now requires f64 for counters, distributions, and gauges. This will have sufficient accuracy for real-world use cases and allows for a fixed storage model. For sets, it allows arbitrary values and hashes them into a u32.
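
A rough sketch of the fixed storage model described above, under stated assumptions: the type and function names are hypothetical, and the hash function is a stand-in, since the description does not say which one the parser uses. Here the standard library's `DefaultHasher` output is truncated to 32 bits purely for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical metric type tag, as it might come from the protocol.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MetricType {
    Counter,
    Distribution,
    Gauge,
    Set,
}

/// Hypothetical parsed value: f64 for numeric types, a u32 hash for sets.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum MetricValue {
    Counter(f64),
    Distribution(f64),
    Gauge(f64),
    Set(u32),
}

/// Hashes an arbitrary set member into a u32.
/// Assumption: DefaultHasher is a placeholder, not the hash used by the PR.
pub fn hash_set_value(raw: &str) -> u32 {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    // Keep only the low 32 bits of the 64-bit hash.
    hasher.finish() as u32
}

/// Parses a raw value according to the metric type.
/// Numeric types must parse as f64; sets accept any string.
pub fn parse_value(ty: MetricType, raw: &str) -> Option<MetricValue> {
    let number = || raw.parse::<f64>().ok();
    Some(match ty {
        MetricType::Counter => MetricValue::Counter(number()?),
        MetricType::Distribution => MetricValue::Distribution(number()?),
        MetricType::Gauge => MetricValue::Gauge(number()?),
        MetricType::Set => MetricValue::Set(hash_set_value(raw)),
    })
}
```

With this shape, every stored value has a fixed size regardless of what the client sent, which is the property the description refers to.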

Follow-up to #948
Requires #957

@jan-auer self-assigned this Mar 22, 2021
@jan-auer
Member Author

The aggregator now requires a recipient for flushed buckets. To test this, we will need to merge #957 first, which moves the Actix test utilities into their own crate.

}
// Bucket from distant future. Best we can do is schedule it after initial delay
// TODO: or refuse metric altogether?
UnboundedInstant::Later => now + self.initial_delay(),
Member

@jan-auer Let me know if this makes sense. I'm unsure what to do about metrics with future timestamps.

Member Author

If the timestamp is not representable, then we're so far out of the valid ingestion range (something we've not yet implemented) that we can just drop the metric. WDYT if we just make this Option<Instant> everywhere and then bail out with an error to keep it simple?

Member

I think we need the ternary enum to support backdated metrics - at least if we want the tests to pass on macOS, where even the following panics:

use std::time::{Duration, Instant};

fn main() {
    let now = Instant::now();
    let then = now - Duration::from_secs(24 * 60 * 60);
    println!("{:?}", then);
}
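
The trade-off discussed in this thread can be sketched with the checked arithmetic on `Instant`, which returns `None` instead of panicking when the result is not representable. The sketch below is an illustration under assumptions, not the PR's code: only `UnboundedInstant::Later` appears in the diff above, so the other variant names and the `backdate` / `postdate` helpers are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Three-state instant. Only `Later` is visible in the reviewed diff;
/// `Earlier` and `Instant` are assumed names for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UnboundedInstant {
    /// Too far in the past to be represented as an `Instant`.
    Earlier,
    /// A representable point in time.
    Instant(Instant),
    /// Too far in the future to be represented as an `Instant`.
    Later,
}

/// Moves `now` into the past without panicking (hypothetical helper).
pub fn backdate(now: Instant, age: Duration) -> UnboundedInstant {
    match now.checked_sub(age) {
        Some(instant) => UnboundedInstant::Instant(instant),
        None => UnboundedInstant::Earlier,
    }
}

/// Moves `now` into the future without panicking (hypothetical helper).
pub fn postdate(now: Instant, delay: Duration) -> UnboundedInstant {
    match now.checked_add(delay) {
        Some(instant) => UnboundedInstant::Instant(instant),
        None => UnboundedInstant::Later,
    }
}
```

A plain `Option<Instant>` would collapse `Earlier` and `Later` into a single `None`, so a caller could no longer tell a backdated metric (which may still be flushable) from one in the distant future.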

jjbayer and others added 5 commits March 25, 2021 15:43
* master:
  fix(server): Remove dependent items from envelope when dropping transaction item (#960)
  fix(clippy): Fix clippy 1.51.0 warnings (#965)
  feat(server): Add support for breakdowns ingestion (#934)
  build: Update schemars and remove workarounds (#961)
  feat(server): Add rule id to outcomes coming from transaction sampling (#953)
@jan-auer marked this pull request as ready for review March 26, 2021 10:46
@jan-auer requested a review from a team March 26, 2021 10:46
@jan-auer enabled auto-merge (squash) March 26, 2021 11:05
@jan-auer merged commit 80481c7 into master Mar 26, 2021
@jan-auer deleted the feat/metrics-aggregate-envelopes branch March 26, 2021 11:06
jjbayer added a commit that referenced this pull request Mar 30, 2021
Uses the metrics aggregator from #958 to send batches of pre-aggregated
buckets to the upstream instead of forwarding individual metric values.
If sending fails for any reason, the metrics are merged back into the
aggregator, which will retry flushing after the next interval.

Metric envelopes are not queued like regular envelopes. Instead, they go
straight to the EventProcessor worker pool, where they are parsed,
normalized and sent to the project's aggregator. This ensures that
metric requests do not create long-running futures that would slow down
the system. For mixed envelopes, metric items are split off and handled
separately.

Metrics aggregators are spawned on the projects thread, which runs the
project cache and manages all project state access. In the future,
metrics aggregation will have to be moved to a separate resource to
ensure that project state requests remain instant.
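
The retry behavior from the first paragraph of this commit message (failed batches are merged back and retried on the next interval) might look roughly like the following. This is a simplified, synchronous sketch under assumptions: buckets are reduced to summed counters keyed by a string, and `Aggregator`, `merge`, and `flush_with` are hypothetical names rather than the actual API.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified aggregator: one summed value per bucket key.
pub struct Aggregator {
    buckets: HashMap<String, f64>,
}

impl Aggregator {
    pub fn new() -> Self {
        Self { buckets: HashMap::new() }
    }

    /// Adds a single value to its bucket.
    pub fn add(&mut self, key: &str, value: f64) {
        *self.buckets.entry(key.to_owned()).or_insert(0.0) += value;
    }

    /// Merges pre-aggregated buckets into the current state.
    pub fn merge(&mut self, buckets: HashMap<String, f64>) {
        for (key, value) in buckets {
            *self.buckets.entry(key).or_insert(0.0) += value;
        }
    }

    /// Current value of a bucket, if it exists.
    pub fn get(&self, key: &str) -> Option<f64> {
        self.buckets.get(key).copied()
    }

    /// Takes all buckets and hands them to `send`. If sending fails, the
    /// buckets are merged back so that the next flush retries them.
    pub fn flush_with<E>(
        &mut self,
        send: impl FnOnce(&HashMap<String, f64>) -> Result<(), E>,
    ) -> Result<(), E> {
        let pending = std::mem::take(&mut self.buckets);
        if let Err(error) = send(&pending) {
            self.merge(pending);
            return Err(error);
        }
        Ok(())
    }
}
```

Merging rather than overwriting matters once sending is asynchronous: values that arrive while a batch is in flight are combined with the returned batch instead of being lost.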