feat(replays): Use Annotated struct definition for replay-event parsing #1582
@jan-auer or @jjbayer Having a tough time finishing this one. Trying to figure out the best place to put things and also how all these things fit together. Goal: normalize the replay event payload and provide data scrubbing functionality that is configurable from sentry.io. I have a Questions: How should |
@cmanallen haven't reviewed in detail, but I think it might make more sense to write a separate To pass the client IP, you could give your processor a config similar to relay/relay-general/src/store/mod.rs Lines 33 to 35 in 4fc8084
@jjbayer Updated with PII scrubbing. Question: the Edit: also I did not follow this advice:
I did not know how process_replay would be called and wasn't really sure what to populate it with. Thankfully process_value with the unmodified PiiProcessor seems to partially work. I wonder what else I need to tweak to get to 100%?
@jjbayer Also the linter is freaking out because it says my imports are unnecessary but if I remove them the tests break.
@cmanallen is this PR ready for final review? Or are you still making changes?
@jjbayer Yes ready for final review! Question: the PII scrubber removes values and replaces them with |
As far as I know, Sentry & Snuba do not treat scrubbed values differently from regular values, i.e. they are still strings, just with a different value. If you'd rather have these values set to relay/relay-general/src/pii/builtin.rs Lines 103 to 106 in 4cb923f
@jjbayer Currently I'm getting the |
@cmanallen the short answer is yes. Do I understand correctly that in this version, replay events are scrubbed with |
@jjbayer We're happy with the configured behavior ( |
relay-general/src/protocol/replay.rs
fn test_scrub_pii_from_annotated_replay() {
    let mut pii_config = PiiConfig::default();
    pii_config.applications =
        BTreeMap::from([(SelectorSpec::And(vec![]), vec!["@common".to_string()])]);
When we fixed this test, I did not realize you were planning to read the PII config from project configs instead of using a hard-coded one. So it might make sense to change this test back to a PII config that looks like the most common data scrubbing settings coming from Sentry:
relay/relay-general/src/pii/convert.rs
Lines 196 to 203 in e87b92e
fn simple_enabled_config() -> DataScrubbingConfig {
    DataScrubbingConfig {
        scrub_data: true,
        scrub_ip_addresses: true,
        scrub_defaults: true,
        ..Default::default()
    }
}
But with that, even the credit card scrubbing seems to fail and I'm not sure why. I can investigate later this week.
Looks good to me! Could you please add a PR description including a motivation for this change?
@jjbayer Updated description. Thanks for helping me through this marathon of a PR! :D
* master: (35 commits)
  ref(actix): Migrate ProjectUpstream to `relay_system::Service` (#1727)
  feat(general): Add unknown SessionStatus variant (#1736)
  ref: Convert integration tests about dropping transactions to unit tests (#1720)
  release: 0.8.16
  ci: Skip redundant self-hosted E2E on library release (#1755)
  doc(changelog): Add relevant changes to python changelog (#1753)
  feat(profiling): Add profile context (#1748)
  release: 23.1.0
  profiling(fix): use an unpadded base64 encoding (#1749)
  Revert "feat(replays): Enable PII scrubbing for all organizations" (#1747)
  feat: Switch from base64 to data-encoding (#1743)
  instr(replays): Add timer metric to recording processing (#1742)
  feat(replays): Use Annotated struct definition for replay-event parsing (#1582)
  feat(sessions): Retire session duration metric (#1739)
  feat(general): Scrub all fields with IP address (#1725)
  feat(replays): Enable PII scrubbing for all organizations (#1678)
  chore(project): Add backoff mechanism for fetching projects (#1726)
  feat(profiling): Add new measurement units for profiling (#1732)
  chore(toolchain): update rust to 1.66.1 (#1735)
  ref(actix): Migrate server actor to the "service" arch (#1723)
  ...
Removes the direct dependency on Serde and replaces it with Annotated. The motivation for moving to Annotated was to maintain strong compatibility with the wider Sentry ecosystem and to streamline the implementation of PII scrubbing.
closes: https://github.com/getsentry/replay-backend/issues/183
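To illustrate why an annotated wrapper can streamline scrubbing compared with plain deserialized fields, here is a minimal, self-contained sketch. It is not Relay's code: the `Annotated`, `Meta`, `ReplayEvent`, and `scrub_email` definitions below are simplified, hypothetical stand-ins written only for this example, and the remark text is made up. The idea shown is that each field carries its value together with metadata, so a processor can remove a value and still record what happened to it.

```rust
// Simplified stand-in for field metadata; NOT Relay's actual type.
#[derive(Debug, Default, PartialEq)]
struct Meta {
    // Human-readable notes about what processing did to the value.
    remarks: Vec<String>,
}

// Simplified stand-in for an annotated value: an optional value plus metadata.
#[derive(Debug, PartialEq)]
struct Annotated<T> {
    value: Option<T>,
    meta: Meta,
}

impl<T> Annotated<T> {
    fn new(value: T) -> Self {
        Annotated {
            value: Some(value),
            meta: Meta::default(),
        }
    }
}

// Hypothetical replay event with a single sensitive field.
#[derive(Debug)]
struct ReplayEvent {
    user_email: Annotated<String>,
}

// Hypothetical scrubbing step: drop the value and record why it was dropped.
fn scrub_email(event: &mut ReplayEvent) {
    if event.user_email.value.take().is_some() {
        event
            .user_email
            .meta
            .remarks
            .push("removed: email".to_string());
    }
}

fn main() {
    let mut event = ReplayEvent {
        user_email: Annotated::new("jane@example.com".to_string()),
    };
    scrub_email(&mut event);

    // The value is gone, but the metadata explains what happened.
    assert_eq!(event.user_email.value, None);
    assert_eq!(
        event.user_email.meta.remarks,
        vec!["removed: email".to_string()]
    );
    println!("{:?}", event);
}
```

With a plain struct of `String` fields, the same step would have to either overwrite the value or lose the information that scrubbing occurred; keeping metadata next to each value avoids that trade-off in this sketch.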