Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(replays): Add default user object and fill its ip-address #1805

Merged
merged 9 commits into from
Feb 10, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
**Bug Fixes**:

- Improve performance for replays, especially memory usage during data scrubbing.([#1800](https://github.com/getsentry/relay/pull/1800))
- Fix a bug where the replays ip-address normalization was not being applied when the user object was omitted. ([#1805](https://github.com/getsentry/relay/pull/1805))

**Internal**:

Expand Down
38 changes: 26 additions & 12 deletions relay-general/src/protocol/replay.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ use std::net::IpAddr as RealIPAddr;
use crate::protocol::{
ClientSdkInfo, Contexts, EventId, IpAddr, LenientString, Request, Tags, Timestamp, User,
};
use crate::store::{self, user_agent};
use crate::store::{self, normalize_ip_addresses_generic, user_agent};
use crate::types::{Annotated, Array};
use crate::user_agent::RawUserAgentInfo;

Expand Down Expand Up @@ -255,13 +255,12 @@ impl Replay {
}

fn normalize_ip_address(&mut self, ip_address: Option<RealIPAddr>) {
if let Some(addr) = ip_address {
if let Some(user) = self.user.value_mut() {
if user.ip_address.value().is_none() {
user.ip_address.set_value(Some(IpAddr(addr.to_string())));
}
}
}
normalize_ip_addresses_generic(
&mut self.request,
&mut self.user,
&self.platform,
ip_address.map(|ip| IpAddr(ip.to_string())).as_ref(),
)
}

fn normalize_user_agent(&mut self, default_user_agent: Option<&str>) {
Expand Down Expand Up @@ -446,12 +445,27 @@ mod tests {
fn test_missing_user() {
let payload = include_str!("../../tests/fixtures/replays/replay_missing_user.json");

let mut replay: Annotated<Replay> = Annotated::from_json(payload).unwrap();
let replay_value = replay.value_mut().as_mut().unwrap();
replay_value.normalize(None, None);
let mut annotated_replay: Annotated<Replay> = Annotated::from_json(payload).unwrap();
let replay = annotated_replay.value_mut().as_mut().unwrap();

let user = replay_value.user.value();
// No user object and no ip-address was provided.
replay.normalize(None, None);
let user = replay.user.value();
assert!(user.is_none());

// No user object but an ip-address was provided.
let ip_address = IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1));
replay.normalize(Some(ip_address), None);

let ip_addr = replay
.user
.value()
.unwrap()
.ip_address
.value()
.unwrap()
.as_str();
assert!(ip_addr == "127.0.0.1");
}

#[test]
Expand Down
27 changes: 20 additions & 7 deletions relay-general/src/store/normalize.rs
Original file line number Diff line number Diff line change
Expand Up @@ -591,6 +591,20 @@ fn normalize_security_report(

/// Backfills IP addresses in various places.
fn normalize_ip_addresses(event: &mut Event, client_ip: Option<&IpAddr>) {
normalize_ip_addresses_generic(
&mut event.request,
&mut event.user,
&event.platform,
client_ip,
)
}

pub fn normalize_ip_addresses_generic(
request: &mut Annotated<Request>,
user: &mut Annotated<User>,
platform: &Annotated<String>,
client_ip: Option<&IpAddr>,
) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking out loud, instead of a function with a large signature like this, we could decompose it into smaller functions. For instance, I'm noticing that we don't have to pass the user in. Instead, this function could return an Option<IpAddr> and then the caller runs get_or_insert_with(|| infer_ip_address(/* ... */)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jan-auer I did break out some of the functionality into different units but I kept this function with the large signature. I didn't want to duplicate implementation details across two modules. If there's a better way to do this let me know. I'd also like to expedite this particular pull if possible. Its preventing ip-address ingest in prod currently.

// NOTE: This is highly order dependent, in the sense that both the statements within this
// function need to be executed in a certain order, and that other normalization code
// (geoip lookup) needs to run after this.
Expand All @@ -601,7 +615,7 @@ fn normalize_ip_addresses(event: &mut Event, client_ip: Option<&IpAddr>) {

// Resolve {{auto}}
if let Some(client_ip) = client_ip {
if let Some(ref mut request) = event.request.value_mut() {
if let Some(ref mut request) = request.value_mut() {
if let Some(ref mut env) = request.env.value_mut() {
if let Some(&mut Value::String(ref mut http_ip)) = env
.get_mut("REMOTE_ADDR")
Expand All @@ -614,7 +628,7 @@ fn normalize_ip_addresses(event: &mut Event, client_ip: Option<&IpAddr>) {
}
}

if let Some(ref mut user) = event.user.value_mut() {
if let Some(ref mut user) = user.value_mut() {
if let Some(ref mut user_ip) = user.ip_address.value_mut() {
if user_ip.is_auto() {
*user_ip = client_ip.to_owned();
Expand All @@ -624,22 +638,21 @@ fn normalize_ip_addresses(event: &mut Event, client_ip: Option<&IpAddr>) {
}

// Copy IPs from request interface to user, and resolve platform-specific backfilling
let http_ip = event
.request
let http_ip = request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, here we could have two separate functions. One that resolves {{auto}}, and another one that copies IP addresses over.

.value()
.and_then(|request| request.env.value())
.and_then(|env| env.get("REMOTE_ADDR"))
.and_then(Annotated::<Value>::as_str)
.and_then(|ip| IpAddr::parse(ip).ok());

if let Some(http_ip) = http_ip {
let user = event.user.value_mut().get_or_insert_with(User::default);
let user = user.value_mut().get_or_insert_with(User::default);
user.ip_address.value_mut().get_or_insert(http_ip);
} else if let Some(client_ip) = client_ip {
let user = event.user.value_mut().get_or_insert_with(User::default);
let user = user.value_mut().get_or_insert_with(User::default);
// auto is already handled above
if user.ip_address.value().is_none() {
let platform = event.platform.as_str();
let platform = platform.as_str();

// In an ideal world all SDKs would set {{auto}} explicitly.
if let Some("javascript") | Some("cocoa") | Some("objc") = platform {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
"4"
],
"dist": "1.12",
"platform": "Python",
"platform": "javascript",
"environment": "production",
"release": "[email protected]",
"tags": {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
"4"
],
"dist": "1.12",
"platform": "Python",
"platform": "javascript",
"environment": "production",
"release": "[email protected]",
"tags": {
Expand Down