Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(relay): Implement factor based multi-matching #1790

Merged
merged 50 commits into from
Feb 14, 2023

Conversation

iambriccardo
Copy link
Member

@iambriccardo iambriccardo commented Jan 30, 2023

This PR implements a new sampling algorithm that has two new major changes, support for factors and multi-matching.

The rationale behind this new algorithm is to enable the interpolation of different matching rules due to the increasing number of biases created on the sentry side (e.g., we want to mix the latest release and team key transaction biases together). With the current system, this was not possible, because matching was performed in a FIFO fashion.

This algorithm can result in inconsistent trace sampling, which is a drawback we decided to accept. The algorithm does perform inconsistent sampling because it will take into consideration both transaction and trace rules while computing the final sample rate.

The algorithm revolves around two configurations:

  • root config -> contains the rules of the head of the trace to which the incoming event belongs.
  • non-root config -> contains the rules of the project to which the incoming event belongs.

It also introduces the concept of samplingValue. A samplingValue is nothing more than a type of heuristic we apply to sampling, which has a value. The two types we currently have implemented are:

  • factor -> the factor which the algorithm will apply on a matched sample rate.
  • sampleRate -> the classic sample rate.

The new algorithm works in the following way:

  1. It merges the sampling rules of the root and non-root projects, concatenating first the transaction/error rules of the non-root project and then the trace rules of the root project.
  2. It performs matching from top to bottom with the following approach:
  3. If a factor rule is matched, it will take the factor and multiply it with past accumulated factors (starting with default 1).
  4. If a sampleRate rule is matched, it will take the accumulated factors and multiply that by the sampleRate of the matched rule.

The matching process will return the final sampleRate that will be used to decide whether to keep or not keep the event. In case no matches or unsupported rules are found, the algorithm will return a no match, which will mean that we will keep the event (we prefer to oversample than drop events).

@iambriccardo iambriccardo marked this pull request as ready for review February 3, 2023 12:40
@iambriccardo iambriccardo requested a review from a team February 3, 2023 12:40
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, mostly nitpicks at this stage. I did not look at the tests yet.

@@ -242,7 +242,10 @@ def test_validate_sampling_configuration():
"rules": [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be rulesV2 now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, i didn't catch it as CI was blocked and integration tests weren't run again. Will fix!

Comment on lines 512 to 515
match self.sampling_value {
SamplingValue::SampleRate { value: _ } => true,
SamplingValue::Factor { value: _ } => false,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
match self.sampling_value {
SamplingValue::SampleRate { value: _ } => true,
SamplingValue::Factor { value: _ } => false,
}
matches!(self.sampling_value, SamplingValue::SampleRate { _ })

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely!

Comment on lines 518 to 523
fn get_sampling_base_value(&self) -> f64 {
match self.sampling_value {
SamplingValue::SampleRate { value: sample_rate } => sample_rate,
SamplingValue::Factor { value: factor } => factor,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we put this helper in impl SamplingValue?

Comment on lines 889 to 890
/// The ordered sampling rules for the project.
#[serde(default, skip_deserializing)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The ordered sampling rules for the project.
#[serde(default, skip_deserializing)]
/// Legacy sampling rules.
///
/// We need to serialize an empty list of rules such that outdated downstream Relays
/// accept the config, but at the same time, we do not want to actually use these rules
/// in the current Relay.
#[serde(default, skip_deserializing)]

Comment on lines 895 to 898
#[serde(default, skip_deserializing)]
pub rules: Vec<SamplingRule>,
/// The ordered sampling rules v2 for the project.
pub rules_v2: Vec<SamplingRule>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might make the protocol a bit confusing, but I think it makes the rest of the code / diff easier to read. Feel free to disagree.

Suggested change
#[serde(default, skip_deserializing)]
pub rules: Vec<SamplingRule>,
/// The ordered sampling rules v2 for the project.
pub rules_v2: Vec<SamplingRule>,
#[serde(default, skip_deserializing, rename="rules")]
pub legacy_rules: Vec<SamplingRule>,
/// The ordered sampling rules v2 for the project.
#[serde(rename = "rulesV2")]
pub rules: Vec<SamplingRule>,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep the rules_v2 schema, to be more explicit about the newer version, like it is typically done in APIs. The reasoning for avoiding legacy is that if we upgrade again the schema, we will have to define legacy_legacy? I envision a simpler naming scheme with rules_v*. Ofc the best solution would be to implement proper versioning.


/// Represents the specification for sampling an incoming event.
#[derive(Clone, Debug, PartialEq)]
pub struct SamplingConfigMatchResult {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would rename this to SamplingMatch.

relay-sampling/src/lib.rs Show resolved Hide resolved
Comment on lines 937 to 939
if let Some(event_id) = event_id {
return Some(event_id.0);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let Some(event_id) = event_id {
return Some(event_id.0);
}
event_id.map(|id| id.0)

None
}

pub fn has_unsupported_rules(&self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this function back to the top? Just to make the diff easier to read.

};
) -> Option<SamplingConfigMatchResult> {
let mut matched_rule_ids = vec![];
let mut has_matched_trace_rule = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of storing a boolean and using get_seed, I would just

let mut seed = event.id.value();

and then overwrite it with the DSC seed further down if necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made my implementation under the assumption of having trace and transaction rules mixed, as shown in the tests. For this reason I use a boolean to keep track of any seen trace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that on line 1000, instead of flipping a boolean, you could overwrite the seed directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah sorry, I get now what you mean. Will do that, sorry for the confusion.

@iambriccardo iambriccardo changed the title feat(relay): Implement factor based multi-matching feat(relay): Implement factor based multi-matching [TET-665] Feb 8, 2023
@iambriccardo iambriccardo requested a review from jjbayer February 8, 2023 14:53
@@ -374,6 +374,15 @@ pub enum SamplingValue {
Factor { value: f64 },
}

impl SamplingValue {
fn get_sampling_base_value(&self) -> f64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Our getters usually do not have a get_ prefix (see also rust docs). Should we just call this value()?

};
) -> Option<SamplingConfigMatchResult> {
let mut matched_rule_ids = vec![];
let mut has_matched_trace_rule = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that on line 1000, instead of flipping a boolean, you could overwrite the seed directly.

Comment on lines 138 to 140
let merged_config = SamplingConfigs::new(sampling_config)
.add_root_config(root_project_state)
.get_merged_config();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not clone configs / create new vectors for every event here. Ideally, we operate on a get_merged_rules() function that returns an iterator (not a vector) of rules.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to discuss this indeed, I can try to implement the approach with iterators.

Comment on lines 39 to 43
Match {
sample_rate: f64,
matched_rule_ids: MatchedRuleIds,
seed: Uuid,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very similar to the SamplingMatch defined in relay-sampling. Can we just use an Option<SamplingMatch> instead of this enum?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They look similar but we actually change the sample rate, I would prefer to keep this Match and NoMatch for expressiveness and I can use the SamplingMatch inside the Match variant of the enum.

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are close to merging this. Please take a look at the remaining comments. Could you also write a PR description that includes the motivation for this change and a summary of the change itself?

relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-server/src/utils/dynamic_sampling.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
relay-sampling/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! Final notes:

  1. Do not merge until feat(dynamic-sampling): Add support for rules v2 in Sentry [TET-666] sentry#44229 has landed (as discussed). Otherwise, Relay will fail to parse project configs because of the missing rulesV2 field.
  2. Please update the PR description (see docs).
  3. There's some new deprecation warnings, but I assume they are easy to fix.

@iambriccardo
Copy link
Member Author

Thanks for contributing! Final notes:

  1. Do not merge until getsentry/sentry#44229 has landed (as discussed). Otherwise, Relay will fail to parse project configs because of the missing rulesV2 field.
  2. Please update the PR description (see docs).
  3. There's some new deprecation warnings, but I assume they are easy to fix.

I will have also to write the changelog.

@iambriccardo iambriccardo merged commit f92c968 into master Feb 14, 2023
@iambriccardo iambriccardo deleted the riccardo/feat/multi-matching-and-factors branch February 14, 2023 08:18
@iambriccardo iambriccardo changed the title feat(relay): Implement factor based multi-matching [TET-665] feat(relay): Implement factor based multi-matching Feb 14, 2023
jan-auer added a commit that referenced this pull request Feb 14, 2023
* master:
  feat(relay): Implement factor based multi-matching [TET-665] (#1790)
  fix(rate-limits): Rate limit profiles attached to transactions (#1843)
  fix(txnames): Avoid replacing url encoded transaction names (#1839)
  meta: Remove cargo vet (#1841)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants