
feat(spans): initial MongoDB description scrubbing support #3912

Merged
merged 10 commits into master from mjq-ash-mongodb-queries on Sep 4, 2024

Conversation

mjq
Member

@mjq mjq commented Aug 8, 2024

We would like to support MongoDB queries in the Queries insight module (which is currently SQL-only).

Metrics are shown on the Queries insights page when they match the query has:span.description span.module:db. Without the changes in this PR, MongoDB spans lacked either a span.description or a span.module, so they were hidden from the module.

In addition, fully supporting the Queries page also requires the span.action and span.domain tags. Without this PR, a MongoDB span's span.action was set when it was present as db.operation in the span's data, but span.domain was never set.

This PR:

  • Allows MongoDB spans to be tagged with span.module: db
  • Adds MongoDB query scrubbing and writes the scrubbed query to span.description, conditional on the organizations:performance-queries-mongodb-extraction feature flag (added here)
  • Writes the collection name to span.domain for MongoDB spans

Because the new query scrubbing is behind a feature flag, spans from orgs without the flag will continue to have no span.description and therefore will still not show up in the Queries module.

The short-term plan is to turn this on for a single test org to iterate on the frontend experience, and also begin to test the cardinality of the query scrubbing.

The scrubbing keeps the operation/collection key/value pair but otherwise replaces leaf values in the query JSON with "?". This might need to be tuned further if there are cases where people are likely to use high-cardinality keys.
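
For example (illustrative only, not an actual test case from this PR), a find query like

{"find": "documents", "filter": {"email": "someone@example.com"}, "limit": 1}

would be scrubbed to roughly

{"find": "documents", "filter": {"email": "?"}, "limit": "?"}

keeping the "find": "documents" operation/collection pair and replacing every other leaf value.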

Valid MongoDB spans must:

  • contain a JSON-encoded query in their description
  • have a db.system of "mongodb"
  • have either db.collection.name or db.mongodb.collection set
  • have db.operation set

Our JavaScript v8 SDK fulfills these criteria for MongoDB spans, and I believe the latest Python does too (but haven't personally checked).

Co-authored-by: Ash <[email protected]>
@mjq mjq requested a review from a team as a code owner August 8, 2024 16:44
    Disabled,
    /// Enable scrubbing of MongoDB span descriptions.
    Enabled,
}
Member Author

Because this value has to be passed through so many different functions, it seemed clearer to use an explicit enum argument instead of a dangling boolean.
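
A minimal sketch of the pattern (the enum mirrors the snippet above; the call sites below are hypothetical and only illustrate the readability difference):

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScrubMongoDescription {
    /// Disable scrubbing of MongoDB span descriptions.
    Disabled,
    /// Enable scrubbing of MongoDB span descriptions.
    Enabled,
}

// At a call site the intent is explicit:
//     normalize_spans(event, ScrubMongoDescription::Enabled);
// whereas a bare boolean gives no hint about what `true` toggles:
//     normalize_spans(event, true);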

///
/// Serialized as `organizations:performance-queries-mongodb-extraction`.
#[serde(rename = "organizations:performance-queries-mongodb-extraction")]
ScrubMongoDBDescriptions,
Member

Does this feature make sense the way it is right now?

It seems like MongoDB is unconditionally enabled with this PR; only the scrubbing is conditional. To me this seems like we should have a feature flag to enable/disable MongoDB altogether, not just the scrubbing, which is supposed to help us with cardinality.

Member Author

Here's my understanding.

There are two places that Relay currently special-cases MongoDB. The first is disabled databases.

const DISABLED_DATABASES: &[&str] = &[
    "*clickhouse*",
    "*compile*",
    "*mongodb*",
    "*redis*",
    "db.orm",
];

let is_db = RuleCondition::eq("span.sentry_tags.category", "db")
    & !(RuleCondition::eq("span.system", "mongodb")
        | RuleCondition::glob("span.op", DISABLED_DATABASES)
        | RuleCondition::glob("span.description", MONGODB_QUERIES));

is_db is used to control the presence of certain metrics and tags in hardcoded_span_metrics.

This PR removes that check, which means we will produce more metrics. However, note that:

  • a number of SDKs (including v8 of the JS SDK) don't actually have mongodb as part of the span op, and were never subject to this check
  • high cardinality of the resulting metrics comes from the span.description tag, which is protected against separately (see following)

The second existing special case is in the span description scrubbing.

("db", sub) => {
if sub.contains("clickhouse")
|| sub.contains("mongodb")
|| sub.contains("redis")
|| is_legacy_activerecord(sub, db_system)
|| is_sql_mongodb(description, db_system)
{
None

We check again for mongodb in the op (which, as I mentioned, is often already bypassed), but the is_sql_mongodb function catches them all at this point by checking for a db.system of mongodb or a JSON-looking description. This is the handling we fall through to when the feature is off.


So:

  • This PR will change the situation from "some MongoDB spans are counted as is_db" to "all MongoDB spans are counted as is_db", but
  • There will be no increase in cardinality, as all MongoDB span descriptions will continue to be None in cases where the flag is not set.

In an ideal world, I would have kept the behaviour exactly the same and only conditionally changed DISABLED_DATABASES. But, looking at how hardcoded_span_metrics is built, I can't see any way to get feature flags down to it. So I reasoned that the outcomes above were probably acceptable, but I could definitely be wrong! Please let me know what you think. Thanks so much 🙏

Member

a number of SDKs (including v8 of the JS SDK) don't actually have mongodb as part of the span op, and were never subject to this check

I'd be worried about the Otel integrations here, not the SDKs necessarily.

high cardinality of the resulting metrics comes from the span.description tag, which is protected against separately (see following)

If we have some confidence in the scrubbing (which seems pretty sound to me, we can even start with a lower 'recursion limit'), I'd not use the feature flag at all. You will only find the outliers after it's too late anyways. And the feature flag just adds more conditionals and boilerplate while only toggling scrubbing not the full extraction.
That's the same approach that was taken for Redis.

- replaced copies with in-place modification
- added a recursion limit to the query scrubbing
- used for loops instead of `for_each`
- `to_owned` instead of `to_string`
@mjq
Member Author

mjq commented Aug 9, 2024

Thanks for the review @Dav1dde! I appreciate the corrections to my first Rust. I think I addressed all your lower level comments, and left a longer explanation for the high level concern re: the feature flag if you get a chance to check that out. Thanks again! 🙏

@mjq mjq requested a review from Dav1dde August 9, 2024 15:50
@Dav1dde
Member

Dav1dde commented Aug 12, 2024

@mjq I took the liberty of updating the scrubbing code to be clone-free and mutate only in place.

Few things:

  • for_each: this is almost a code smell and can just be expressed with a for loop
  • collections have a bunch of utility methods, like values_mut, which gives you mutable access to only the values (just what you need here); this also allows you to modify in place
  • you can match on the mutable Values and modify them in place as well; no need to create new values, just pass &mut references into the visitor
  • there is a small optimization: when you already have a Value::String, we don't need to allocate a new string; we can clear the string and push a ? into it, which saves an allocation
  • you can early-return with the question mark (?) operator if the function returns Result or Option (used on serde_json::from_str)
  • there is an additional optimization we could take: not actually parsing the JSON into a proper data structure, but visiting it and emitting new JSON immediately (SAX-like); another would be to not parse Strings at all and just borrow from the original description (since we re-serialize later anyway), but that requires our own Value enum, so I decided not to do it for simplicity's sake
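
A rough sketch of the shape these points add up to (illustrative, not the exact PR code; the function names are placeholders, and the real code also re-inserts the operation/collection pair after scrubbing):

use serde_json::Value;

/// Replaces leaf values with "?" in place, collapsing anything deeper than `depth`.
fn scrub_visit_node(value: &mut Value, depth: usize) {
    if depth == 0 {
        *value = Value::String("?".to_owned());
        return;
    }
    match value {
        // `values_mut` gives mutable access to just the values, so we edit in place.
        Value::Object(map) => {
            for child in map.values_mut() {
                scrub_visit_node(child, depth - 1);
            }
        }
        Value::Array(items) => {
            for child in items.iter_mut() {
                scrub_visit_node(child, depth - 1);
            }
        }
        // Reuse the existing String allocation instead of creating a new one.
        Value::String(s) => {
            s.clear();
            s.push('?');
        }
        other => *other = Value::String("?".to_owned()),
    }
}

fn scrub_query(description: &str) -> Option<String> {
    // `ok()?` early-returns None if the description is not valid JSON.
    let mut root: Value = serde_json::from_str(description).ok()?;
    if let Value::Object(map) = &mut root {
        for child in map.values_mut() {
            scrub_visit_node(child, 3);
        }
    }
    serde_json::to_string(&root).ok()
}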

Do you think a depth/recursion limit of 3 is enough? Having no experience with MongoDB, it does feel quite small.

@Dav1dde Dav1dde force-pushed the mjq-ash-mongodb-queries branch from a7cb870 to 9759b15 Compare August 12, 2024 10:58
@Dav1dde Dav1dde force-pushed the mjq-ash-mongodb-queries branch from 9759b15 to 3bb717b Compare August 12, 2024 10:59
Dav1dde
Dav1dde previously approved these changes Aug 12, 2024
Member

@Dav1dde Dav1dde left a comment

As noted, I'd consider dropping the feature flag altogether, especially since you expect almost no MongoDB data in the first place. It just removes a few more conditionals and overhead. Up to you.

Conditionally changing is_db would be preferable and possible, but that requires some deeper changes in Relay that we can always do later, and I don't want to block you on this.


@mjq
Member Author

mjq commented Aug 26, 2024

@Dav1dde Thanks for the review (and the help with the in-place data transformations!). With hack week over, I'd like to get this merged. I've re-merged master to get a new test run.

Even though I'm confident in the scrubbing, personally I would feel better about having the feature flag for the initial release. I can loop back and remove it within a day or two afterwards once we're sure things are looking okay. How does that sound?

Do you think a depth/recursion limit of 3 is enough? Having no experience with mongodb it does feel quite small.

My thinking was to start conservative and expand it if we find we need it in real use.

@mjq mjq requested review from Dav1dde and jjbayer August 26, 2024 16:57
@mjq
Member Author

mjq commented Aug 29, 2024

@jjbayer added you as a reviewer (along with a re-request to @Dav1dde) if you wouldn't mind helping, since we're blocked on this now. Thanks!

@jjbayer jjbayer self-assigned this Aug 30, 2024
Member

@jjbayer jjbayer left a comment

See comment about scrubbing collection names. The rest of the PR looks good to me!


Comment on lines +119 to +120
    & (RuleCondition::eq("span.system", "mongodb")
        | !RuleCondition::glob("span.description", MONGODB_QUERIES));
Member

Took me a while to parse this condition, might benefit from a comment along the lines of "we disallow mongodb, unless span.system is set to mongodb".
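
Something along these lines, for example (a sketch of how such a comment could read, not the committed wording):

// Descriptions that look like MongoDB queries only count as `db` spans when
// `span.system` is explicitly set to "mongodb"; otherwise they stay disallowed.
let is_db = RuleCondition::eq("span.sentry_tags.category", "db")
    & (RuleCondition::eq("span.system", "mongodb")
        | !RuleCondition::glob("span.description", MONGODB_QUERIES));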

@@ -332,6 +337,7 @@ fn normalize(event: &mut Event, meta: &mut Meta, config: &NormalizationConfig) {
        event,
        config.max_tag_value_length,
        config.span_allowed_hosts,
        config.scrub_mongo_description.clone(),
Member

You can derive the Copy trait on ScrubMongoDescription to make the .clone() unnecessary.

@@ -33,12 +34,22 @@ const MAX_EXTENSION_LENGTH: usize = 10;
/// Domain names that are preserved during scrubbing
const DOMAIN_ALLOW_LIST: &[&str] = &["localhost"];

#[derive(PartialEq, Clone, Debug)]
Member

Suggested change
#[derive(PartialEq, Clone, Debug)]
#[derive(PartialEq, Clone, Copy, Debug)]

.and_then(|data| data.db_collection_name.value())
.and_then(|collection| collection.as_str());

command.zip(collection).and_then(|(command, collection)| {
Member

Suggested change
command.zip(collection).and_then(|(command, collection)| {
if let (Some(command), Some(collection)) = (command, collection) {

    for value in root.values_mut() {
        scrub_mongodb_visit_node(value, 3);
    }
    root.insert(command, Value::String(collection));
Member

nit: query might not parse into a valid object, so I would pass &str into this function and only convert command and collection to String in this line.


mongodb_scrubbing_test!(
    mongodb_basic_query,
    "{\"find\": \"documents\", \"showRecordId\": true}",
Member

Suggested change
"{\"find\": \"documents\", \"showRecordId\": true}",
r#"{"find": "documents", "showRecordId": true}"#,

etc,

.and_then(|command| command.as_str());

let collection = data
.and_then(|data| data.db_collection_name.value())
Member

With SQL, we learned the hard way that collections / tables can contain all kinds of identifiers (see test cases here). IMO we should pass the MongoDB collection name through the same scrubbing as SQL table names to catch those identifiers, i.e. apply the TABLE_NAME_REGEX.

mjq added 2 commits September 3, 2024 11:27
- Rename `ScrubMongoDBDescriptions` to `ScrubMongoDbDescriptions`
- comment complicated is_db conditional
- add `Copy` trait to `ScrubMongoDescription`
- clarify option checking (zip -> conditional)
- move `String` conversion after failure cases
- use r# strings to make embedded JSON easier to read
- scrub collection names from descriptions and tags
@mjq
Member Author

mjq commented Sep 3, 2024

@jjbayer Thanks so much for the thorough review 🙏 I've made all of your suggested changes. Please take a look at the collection name scrubbing changes in particular. I've applied it in both places the collection is used in a tag (the description and the domain), since either could affect metric cardinality. To reuse the TABLE_NAME_REGEX, I moved it up to the closest ancestor crate of all its users - if you have any suggestions for a better spot for it (or a better design), please let me know.

Thanks again!

@mjq mjq requested a review from jjbayer September 3, 2024 18:05
Comment on lines +574 to +579
    let scrubbed_collection_name =
        if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(collection, "{%s}") {
            s
        } else {
            collection.to_owned()
        };
Member

nit: I think this is the same

Suggested change
    let scrubbed_collection_name =
        if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(collection, "{%s}") {
            s
        } else {
            collection.to_owned()
        };
    let scrubbed_collection_name = TABLE_NAME_REGEX.replace_all(collection, "{%s}").to_owned();

Member Author

The above returns the type error:

   --> relay-event-normalization/src/normalize/span/description/mod.rs:575:51
    |
575 |     root.insert(command.to_owned(), Value::String(scrubbed_collection_name));
    |                                     ------------- ^^^^^^^^^^^^^^^^^^^^^^^^- help: try using a conversion method: `.to_string()`
    |                                     |             |
    |                                     |             expected `String`, found `Cow<'_, str>`
    |                                     arguments to this enum variant are incorrect
    |
    = note: expected struct `std::string::String`
                 found enum `std::borrow::Cow<'_, str>`

But

let scrubbed_collection_name = TABLE_NAME_REGEX.replace_all(collection, "{%s}").to_string();

compiles, if that seems reasonable?

Either way I'm going to merge without this change just in case, and loop back afterwards.

Member

Yep, that works!
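
For anyone following along, a small self-contained illustration of the difference (not the PR code):

use std::borrow::Cow;

fn main() {
    let cow: Cow<'_, str> = Cow::Borrowed("documents");

    // `.to_owned()` resolves to the blanket `ToOwned` impl for `Clone` types,
    // so it just clones the value and the type stays `Cow<'_, str>`.
    let _still_cow: Cow<'_, str> = cow.to_owned();

    // `.to_string()` goes through `Display` and always yields a `String`.
    let _string: String = cow.to_string();

    // `.into_owned()` also yields a `String`, reusing the buffer when the Cow is owned.
    let _owned: String = cow.into_owned();
}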

Comment on lines +648 to +653
        if let Cow::Owned(s) = TABLE_NAME_REGEX.replace_all(db_collection, "{%s}") {
            s
        } else {
            db_collection.to_owned()
        }
    })
Member

Same comment about to_owned.

It's a bit unfortunate that we scrub the collection name again here, but that's a basic flaw of extract_tags (it works on each tag separately instead of collecting all tags for a given span type at once).

@mjq mjq merged commit 9c5f180 into master Sep 4, 2024
23 checks passed
@mjq mjq deleted the mjq-ash-mongodb-queries branch September 4, 2024 15:04
jan-auer added a commit that referenced this pull request Sep 6, 2024
* master: (27 commits)
  build: Update dialoguer and hostname (#4009)
  build: Update opentelemetry-proto to 0.7.0 (#4000)
  build: Update lru to 0.12.4 (#4008)
  build: Update cookie to 0.18.1 (#4007)
  feat(spans): Extract standalone CLS span metrics and performance score (#3988)
  build: Update cadence to 1.4.0 and statsdproxy to 0.2.0 (#4005)
  build: Update maxminddb to 0.24.0 (#4003)
  build: Update multer to 3.1.0 (#4002)
  build: Update regex and aho-corasick (#4001)
  build: Update sentry-kafka-schemas to 1.0.107 (#3999)
  build: Update dev-dependencies (#3998)
  build: Update itertools to 0.13.0 (#3993)
  build: Update brotli, zstd, flate2 (#3996)
  build: Update rdkafka to 0.36.2 (#3995)
  build: Update tikv-jemallocator to 0.6.0 (#3994)
  build: Update minidump to 0.22.0 (#3992)
  build: Update bindgen to 0.70.1 (#3991)
  build: Update chrono to 0.4.38 (#3990)
  feat(spans): initial MongoDB description scrubbing support (#3912)
  fix(spooler): Reduce number of disk reads (#3983)
  ...