-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spans): initial MongoDB description scrubbing support #3912
Conversation
We would like to support MongoDB queries in the Queries insight module (which is currently SQL-only). Metrics will be shown on the Queries insights page when they fulfill the query `has:span.description span.module:db`. Without the changes in this PR, MongoDB spans would either lack a `span.description` or `span.module` so they were hidden from the module. In addition, to fully support the Queries page, the `span.action` and `span.domain` tags are also required. Without this PR, MongoDB spans' `span.action` was being set if it was present as `db.operation` in span `data`, but `span.domain` was not set. This PR: - Allows MongoDB spans to be tagged with `span.module`: `db` - Adds MongoDB query scrubbing and writes the scrubbed query to `span.description`, conditional on the `organizations:performance-queries-mongodb-extraction` feature flag - Writes the collection name to `span.domain` for MongoDB spans Because the new query scrubbing is behind a feature flag, orgs without it will continue to have no `span.description` and therefore continue to not show up in the Queries module. The short-term plan is to turn this on for a single test org to iterate on the frontend experience, and also begin to test the cardinality of the query scrubbing. The scrubbing keeps the operation/collection key/value pair but otherwise replaces leaf values in the query JSON with `"?"`. This might need to be tuned further if there are cases where people are likely to use high-cardinality keys. Valid MongoDB spans must: - contain a JSON-encoded query in their `description` - have a `db.system` of `"mongodb"` - have either `db.collection.name` or `db.mongodb.collection` set - have `db.operation` set Our JavaScript v8 SDK fulfills these criteria for MongoDB spans. Co-authored-by: Ash <[email protected]>
Disabled, | ||
/// Enable scrubbing of MongoDB span descriptions. | ||
Enabled, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this value has to get passed through so many different functions, it seemed clearer to use an explicit enum argument instead of a dangling boolean for the arguments.
relay-dynamic-config/src/feature.rs
Outdated
/// | ||
/// Serialized as `organizations:performance-queries-mongodb-extraction`. | ||
#[serde(rename = "organizations:performance-queries-mongodb-extraction")] | ||
ScrubMongoDBDescriptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this feature make sense the way it is right now?
It seems like mongodb
is unconditionally enabled with this PR only the scrubbing is conditional. For me this seems like we should have a feature flag to enable/disable mongodb alltogether not just the scrubbing which is supposed to help us with cardinality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my understanding.
There are two places that Relay currently special-cases MongoDB. The first is disabled databases.
relay/relay-dynamic-config/src/defaults.rs
Lines 11 to 17 in 70c15da
const DISABLED_DATABASES: &[&str] = &[ | |
"*clickhouse*", | |
"*compile*", | |
"*mongodb*", | |
"*redis*", | |
"db.orm", | |
]; |
relay/relay-dynamic-config/src/defaults.rs
Lines 123 to 126 in 70c15da
let is_db = RuleCondition::eq("span.sentry_tags.category", "db") | |
& !(RuleCondition::eq("span.system", "mongodb") | |
| RuleCondition::glob("span.op", DISABLED_DATABASES) | |
| RuleCondition::glob("span.description", MONGODB_QUERIES)); |
is_db
is used to control the presence of certain metrics and tags in hardcoded_span_metrics
.
This PR removes that check, which means we will produce more metrics. However, note that:
- a number of SDKs (including v8 of the JS SDK) don't actually have
mongodb
as part of the spanop
, and were never subject to this check - high cardinality of the resulting metrics comes from the
span.description
tag, which is protected against separately (see following)
The second existing special case is in the span description scrubbing.
relay/relay-event-normalization/src/normalize/span/description/mod.rs
Lines 63 to 70 in 70c15da
("db", sub) => { | |
if sub.contains("clickhouse") | |
|| sub.contains("mongodb") | |
|| sub.contains("redis") | |
|| is_legacy_activerecord(sub, db_system) | |
|| is_sql_mongodb(description, db_system) | |
{ | |
None |
We check again for mongodb
in the op (which as I mentioned, is often already bypassed), but the is_sql_mongodb
function catches them all at this point by checking for a db.system
of mongodb
or JSON-looking description. This is the handling we fall through to when the feature is off.
So:
- This PR will change the situation from "some MongoDB spans are counted as
is_db
" to "all MongoDB spans are counted asis_db
", but - There will be no increase in cardinality, as all MongoDB span descriptions will continue to be
None
in cases where the flag is not set.
In an ideal world, I would have kept the behaviour exactly the same and only conditionally changed DISABLED_DATABASES
. But, looking at how hardcoded_span_metrics
, I can't see any way to get feature flags down to it. So I reasoned that the outcomes above were probably acceptable, but I could definitely be wrong! Please let me know what you think. Thanks so much 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a number of SDKs (including v8 of the JS SDK) don't actually have mongodb as part of the span op, and were never subject to this check
I'd be worried about the Otel integrations here, not the SDKs necessarily.
high cardinality of the resulting metrics comes from the span.description tag, which is protected against separately (see following)
If we have some confidence in the scrubbing (which seems pretty sound to me, we can even start with a lower 'recursion limit'), I'd not use the feature flag at all. You will only find the outliers after it's too late anyways. And the feature flag just adds more conditionals and boilerplate while only toggling scrubbing not the full extraction.
That's the same approach that was taken for Redis.
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
- replaced copies with in-place modification - added a recursion limit to the query scrubbing - used for loops instead of `for_each` - `to_owned` instead of `to_string`
Thanks for the review @Dav1dde! I appreciate the corrections to my first Rust. I think I addressed all your lower level comments, and left a longer explanation for the high level concern re: the feature flag if you get a chance to check that out. Thanks again! 🙏 |
@mjq I took the liberty to update the scrubbing code to be clone free and mutate only in place. Few things:
Do you think a depth/recursion limit of 3 is enough? Having no experience with mongodb it does feel quite small. |
a7cb870
to
9759b15
Compare
9759b15
to
3bb717b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted I'd consider dropping the feature flag all together, especially since you expect almost no mongodb data in the first place. Just removes a few more conditionals and overhead. Up to you.
Conditionally changing is_db
would be preferrable and possible, but that requires some deeper changes in Relay we can always do later and I don't want to block you on this.
relay-dynamic-config/src/feature.rs
Outdated
/// | ||
/// Serialized as `organizations:performance-queries-mongodb-extraction`. | ||
#[serde(rename = "organizations:performance-queries-mongodb-extraction")] | ||
ScrubMongoDBDescriptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a number of SDKs (including v8 of the JS SDK) don't actually have mongodb as part of the span op, and were never subject to this check
I'd be worried about the Otel integrations here, not the SDKs necessarily.
high cardinality of the resulting metrics comes from the span.description tag, which is protected against separately (see following)
If we have some confidence in the scrubbing (which seems pretty sound to me, we can even start with a lower 'recursion limit'), I'd not use the feature flag at all. You will only find the outliers after it's too late anyways. And the feature flag just adds more conditionals and boilerplate while only toggling scrubbing not the full extraction.
That's the same approach that was taken for Redis.
@Dav1dde Thanks for the review (and the help with the in-place data transformations!). With hack week over, I'd like to get this merged. I've re-merged master to get a new test run. Even though I'm confident in the scrubbing, personally I would feel better about having the feature flag for the initial release. I can loop back and remove it within a day or two afterwards once we're sure things are looking okay. How does that sound?
My thinking was to start conservative and expand it if we find we need it in real use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment about scrubbing collection names. The rest of the PR looks good to me!
relay-dynamic-config/src/feature.rs
Outdated
/// | ||
/// Serialized as `organizations:performance-queries-mongodb-extraction`. | ||
#[serde(rename = "organizations:performance-queries-mongodb-extraction")] | ||
ScrubMongoDBDescriptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
ScrubMongoDBDescriptions, | |
ScrubMongoDbDescriptions, |
& (RuleCondition::eq("span.system", "mongodb") | ||
| !RuleCondition::glob("span.description", MONGODB_QUERIES)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took me a while to parse this condition, might benefit from a comment along the lines of "we disallow mongodb, unless span.system
is set to mongodb
".
@@ -332,6 +337,7 @@ fn normalize(event: &mut Event, meta: &mut Meta, config: &NormalizationConfig) { | |||
event, | |||
config.max_tag_value_length, | |||
config.span_allowed_hosts, | |||
config.scrub_mongo_description.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can derive the Copy
trait on ScrubMongoDescription
to make the .clone()
unnecessary.
@@ -33,12 +34,22 @@ const MAX_EXTENSION_LENGTH: usize = 10; | |||
/// Domain names that are preserved during scrubbing | |||
const DOMAIN_ALLOW_LIST: &[&str] = &["localhost"]; | |||
|
|||
#[derive(PartialEq, Clone, Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[derive(PartialEq, Clone, Debug)] | |
#[derive(PartialEq, Clone, Copy, Debug)] |
.and_then(|data| data.db_collection_name.value()) | ||
.and_then(|collection| collection.as_str()); | ||
|
||
command.zip(collection).and_then(|(command, collection)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
command.zip(collection).and_then(|(command, collection)| { | |
if let (Some(command), Some(collection)) = (command, collection) { |
for value in root.values_mut() { | ||
scrub_mongodb_visit_node(value, 3); | ||
} | ||
root.insert(command, Value::String(collection)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: query
might not parse into a valid object, so I would pass &str
into this function and only convert command
and collection
to String
in this line.
|
||
mongodb_scrubbing_test!( | ||
mongodb_basic_query, | ||
"{\"find\": \"documents\", \"showRecordId\": true}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"{\"find\": \"documents\", \"showRecordId\": true}", | |
r#"{"find": "documents", "showRecordId": true}"#, |
etc,
.and_then(|command| command.as_str()); | ||
|
||
let collection = data | ||
.and_then(|data| data.db_collection_name.value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With SQL, we learned the hard way that collections / tables can contain all kinds of identifiers (see test cases here). IMO we should pass the mongodb collection name through the same scrubbing as SQL table names to catch those identifers, i.e. apply the TABLE_NAME_REGEX
.
- Rename `ScrubMongoDBDescriptions` to `ScrubMongoDbDescriptions` - comment complicated is_db conditional - add `Copy` trait to `ScrubMongoDescription` - clarify option checking (zip -> conditional) - move `String` conversion after failure cases - use r# strings to make embedded JSON easier to read - scrub collection names from descriptions and tags
@jjbayer Thanks so much for the thorough review 🙏 I've made all of your suggested changes. Please take a look at the collection name scrubbing changes in particular. I've applied it both places the collection is used in a tag (the description and the domain) since either could affect metric cardinality. To reuse the Thanks again! |
let scrubbed_collection_name = | ||
if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(collection, "{%s}") { | ||
s | ||
} else { | ||
collection.to_owned() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this is the same
let scrubbed_collection_name = | |
if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(collection, "{%s}") { | |
s | |
} else { | |
collection.to_owned() | |
}; | |
let scrubbed_collection_name = TABLE_NAME_REGEX.replace_all(collection, "{%s}").to_owned(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above returns the type error:
--> relay-event-normalization/src/normalize/span/description/mod.rs:575:51
|
575 | root.insert(command.to_owned(), Value::String(scrubbed_collection_name));
| ------------- ^^^^^^^^^^^^^^^^^^^^^^^^- help: try using a conversion method: `.to_string()`
| | |
| | expected `String`, found `Cow<'_, str>`
| arguments to this enum variant are incorrect
|
= note: expected struct `std::string::String`
found enum `std::borrow::Cow<'_, str>`
But
let scrubbed_collection_name = TABLE_NAME_REGEX.replace_all(collection, "{%s}").to_string();
compiles, if that seems reasonable?
Either way I'm going to merge without this change just in case, and loop back afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that works!
if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(db_collection, "{%s}") { | ||
s | ||
} else { | ||
db_collection.to_owned() | ||
} | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about to_owned
.
It's a bit unfortunate that we scrub the collection name again here, but that's a basic flaw of extract_tags
(it works on each tag separately instead of collecting all tags for a given span type at once).
* master: (27 commits) build: Update dialoguer and hostname (#4009) build: Update opentelemetry-proto to 0.7.0 (#4000) build: Update lru to 0.12.4 (#4008) build: Update cookie to 0.18.1 (#4007) feat(spans): Extract standalone CLS span metrics and performance score (#3988) build: Update cadence to 1.4.0 and statsdproxy to 0.2.0 (#4005) build: Update maxminddb to 0.24.0 (#4003) build: Update multer to 3.1.0 (#4002) build: Update regex and aho-corasick (#4001) build: Update sentry-kafka-schemas to 1.0.107 (#3999) build: Update dev-dependencies (#3998) build: Update itertools to 0.13.0 (#3993) build: Update brotli, zstd, flate2 (#3996) build: Update rdkafka to 0.36.2 (#3995) build: Update tikv-jemallocator to 0.6.0 (#3994) build: Update minidump to 0.22.0 (#3992) build: Update bindgen to 0.70.1 (#3991) build: Update chrono to 0.4.38 (#3990) feat(spans): initial MongoDB description scrubbing support (#3912) fix(spooler): Reduce number of disk reads (#3983) ...
We would like to support MongoDB queries in the Queries insight module (which is currently SQL-only).
Metrics will be shown on the Queries insights page when they fulfill the query
has:span.description span.module:db
. Without the changes in this PR, MongoDB spans would either lack aspan.description
orspan.module
so they were hidden from the module.In addition, to fully support the Queries page, the
span.action
andspan.domain
tags are also required. Without this PR, MongoDB spans'span.action
was being set if it was present asdb.operation
in spandata
, butspan.domain
was not set.This PR:
span.module
:db
span.description
, conditional on theorganizations:performance-queries-mongodb-extraction
feature flag (added here)span.domain
for MongoDB spansBecause the new query scrubbing is behind a feature flag, orgs without it will continue to have no
span.description
and therefore continue to not show up in the Queries module.The short-term plan is to turn this on for a single test org to iterate on the frontend experience, and also begin to test the cardinality of the query scrubbing.
The scrubbing keeps the operation/collection key/value pair but otherwise replaces leaf values in the query JSON with
"?"
. This might need to be tuned further if there are cases where people are likely to use high-cardinality keys.Valid MongoDB spans must:
description
db.system
of"mongodb"
db.collection.name
ordb.mongodb.collection
setdb.operation
setOur JavaScript v8 SDK fulfills these criteria for MongoDB spans, and I believe the latest Python does too (but haven't personally checked).