Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(metrics): Light normalize before extracting metrics [INGEST-1517 INGEST-1424] #1366

Merged
merged 42 commits into from
Aug 4, 2022

Conversation

iker-barriocanal
Copy link
Contributor

@iker-barriocanal iker-barriocanal commented Jul 29, 2022

Context

Currently, all relay modes extract metrics from transactions. However, only processing relays are performing normalization operations that sample out events and modify the values that relay extracts metrics from, in the store normalization step. Not sampling out events early enough also introduces the consequence of extracting metrics from transactions that are discarded a bit later.

What this PR does

Starting from envelopes.rs, there's a new function light_normalize_event to perform minimal event normalization to filter them later, and that minimal normalization has been extracted from the processors.

The changes in the processors are mostly refactors extracting functions to make them callable from the outside (from light_normalize_event, in this case). The normalization processor doesn't re-do the normalization work that was previously done in the light processing step.

An alternative to all this extraction into functions is to have another processor. This was discarded because of its complexity. No additional design choices have been considered.

Other considerations

Performing light normalization in non-processing relays impacts any test depending on the amount of time relay takes to process an envelope. The cause is the long amount of time relay takes to initialize lazily the regex to parse user agents. Since that work is only done once, it doesn't impact relay instances on production.

@iker-barriocanal iker-barriocanal self-assigned this Jul 29, 2022
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall! I did not yet verify whether everything that we need for inbound filters and / or metrics extraction is now actually part of light_normalize_event.

relay-general/src/store/normalize.rs Outdated Show resolved Hide resolved
}
}
}

pub fn light_normalize_event(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We should probably come up with a better name for this function, but I don't have one yet. Maybe pre_normalize_event?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hard agree "light normalization" isn't descriptive enough. I don't like pre_normalize_* because that indicates the operation happens before normalization, and this function actually performs some normalization. I think that some is what causes troubles for the naming. I'd also like to avoid referring to any filtering in the name, if possible -- the function may be used in another context in the future and shouldn't be tied up to current behavior.

What about something like minimal_normalize_event? That indicates there's some normalization happening, it doesn't perform the whole normalization, and the explanation of what it does and the motivation behind it could be added to the docstrings. What do you think?

relay-server/src/actors/envelopes.rs Outdated Show resolved Hide resolved
@@ -1882,6 +1912,8 @@ impl EnvelopeProcessor {

self.finalize_event(state)?;

self.light_normalize_event(state)?;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this, we can remove a lot of ad-hoc normalization from extract_transaction_metrics. But that can be a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that double-normalizing (such as computing the breakdowns) doesn't break anything now?

relay-general/src/store/normalize.rs Show resolved Hide resolved
relay-general/src/store/normalize.rs Outdated Show resolved Hide resolved
relay-general/src/store/transactions.rs Show resolved Hide resolved
Copy link
Member

@untitaker untitaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check out sentry, do pip install -e ~/projects/relay/py/ and run its tests. I think you may break some tests by moving normalization out of the store normalizer, which we use in Python to fully normalize events.

It may be easier to introduce a flag on StoreNormalizer to opt into "light normalization", but it all depends on how many tests you're going to break.

@jjbayer
Copy link
Member

jjbayer commented Aug 2, 2022

I think you may break some tests by moving normalization out of the store normalizer, which we use in Python to fully normalize events.

@untitaker This is a good point, should we just re-do everything in store normalization, and for the time being accept the additional cost for the sake of correctness?

Copy link
Contributor Author

@iker-barriocanal iker-barriocanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a decent amount of stuff added to the output of integration tests of security reports. By looking at the code, all changes seem they make sense to me, so I assume they weren't running before.

Comment on lines +137 to +141
if "received" in event:
event.pop("received")
if "timestamp" in event:
event.pop("timestamp")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in the expect_ct case the event has the received and timestamp fields. I haven't thought if we want them, but I don't think this is that important right now and I took the short and hacky path deleting these fields here. We may want to revisit this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, just a result of timestamp normalization.

@iker-barriocanal iker-barriocanal marked this pull request as ready for review August 4, 2022 00:03
@iker-barriocanal iker-barriocanal requested review from a team, untitaker and jjbayer August 4, 2022 00:03
@@ -46,7 +46,7 @@ def test_uses_origins(mini_sentry, relay, json_fixture_provider, allowed_origins
)

if should_be_allowed:
mini_sentry.captured_events.get(timeout=1).get_event()
mini_sentry.captured_events.get(timeout=10).get_event()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is necessary because the non-processing relay now does normalization, which includes UA parsing, which apparently has slow initialization.

CHANGELOG.md Outdated Show resolved Hide resolved
relay-general/src/store/normalize.rs Outdated Show resolved Hide resolved
relay-general/src/store/mod.rs Outdated Show resolved Hide resolved
Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, we need to run the SchemaProcessor at the very least since that enforces things like: max_chars, required, trim_whitespace, and nonempty. In store normalization, the schema processor runs before the normalizer, which means that it can change values in such a way that the normalizer behaves differently.

relay-general/src/store/mod.rs Outdated Show resolved Hide resolved
relay-general/src/store/mod.rs Outdated Show resolved Hide resolved
relay-general/src/store/mod.rs Outdated Show resolved Hide resolved
@@ -1882,6 +1912,8 @@ impl EnvelopeProcessor {

self.finalize_event(state)?;

self.light_normalize_event(state)?;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that double-normalizing (such as computing the breakdowns) doesn't break anything now?

})?;

// Default required attributes, even if they have errors
normalize_release_dist(event); // dist is a tag extracted along with other metrics from transactions
Copy link
Member

@jan-auer jan-auer Aug 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Do we need all these comments here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked for these comments because I figured it might be helpful to know in the future why we decided to pull those into "light" normalization.

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to ignore nitpick.

relay-general/src/store/normalize.rs Show resolved Hide resolved
@iker-barriocanal iker-barriocanal merged commit f7a577e into master Aug 4, 2022
@iker-barriocanal iker-barriocanal deleted the iker/fix/filter-before-mep-extract branch August 4, 2022 13:13
jan-auer added a commit that referenced this pull request Aug 4, 2022
* master:
  fix(metrics): Light normalize before extracting metrics (#1366)
tobias-wilfert pushed a commit that referenced this pull request Aug 5, 2022
Follow-up to #1366. By applying `InvalidTransaction`, we do not report these
dropped envelopes to Sentry and apply a dedicated reason code.

The store normalizer returns an error for invalid transactions, which has to be
special cased. Other than that, the store normalizer is infallible.
@github-actions
Copy link

Fails
🚫 Please consider adding a changelog entry for the next release.
Instructions and example for changelog

For changes exposed to the Python package, please add an entry to py/CHANGELOG.md. This includes, but is not limited to event normalization, PII scrubbing, and the protocol.

For changes to the Relay server, please add an entry to CHANGELOG.md under the following heading:

  1. Features: For new user-visible functionality.
  2. Bug Fixes: For user-visible bug fixes.
  3. Internal: For features and bug fixes in internal operation, especially processing mode.

To the changelog entry, please add a link to this PR (consider a more descriptive message):

- Light normalize before extracting metrics. ([#1366](https://github.com/getsentry/relay/pull/1366))

If none of the above apply, you can opt out by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against c646b5a

jjbayer added a commit that referenced this pull request Oct 25, 2022
The `normalize` in the python package and C-ABI receives a structure
with config arguments for the various normalizers. The `renormalize`
flag is more central, as it is supposed to disable most of the
normalizers. This is used in Sentry after the event has passed
normalization to ensure that it is still structurally valid, but it may
contain fields now that are prohibited during ingestion.

In #1366 large parts of normalization were split into a dedicated
`light_normalize` function which did not honor the renormalize flag.
This PR restores correct behavior and disables the entire renormalize
call.

Co-authored-by: Jan Michael Auer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants