Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dynamic-config): use matches_any_origin to scrub HTTP hosts in spans #3939

Merged

Conversation

aldy505
Copy link
Contributor

@aldy505 aldy505 commented Aug 16, 2024

Closes #3936

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@aldy505 aldy505 requested a review from a team as a code owner August 16, 2024 14:21
Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also the option we move away from Host and use matches_any_origin

pub fn matches_any_origin(url: Option<&str>, origins: &[SchemeDomainPort]) -> bool {
for matching and deserialize into SchemeDomainPort.

@jjbayer wdyt?

Comment on lines 395 to 400
while let Some(value) = seq.next_element()? {
let this = Host::parse(value);
if let Ok(t) = this {
values.push(t)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This visitor still fails if either one of the elements in the list has the wrong type or even the entire object is of the wrong type (more likely).

We can keep this fallible and not ignore invalid hosts (it's not user config, they should be always valid) and instead handle the entire error of the visitor at the deserialize.deserialize_* call.

}
}

deserializer.deserialize_any(HostVisitor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
deserializer.deserialize_any(HostVisitor)
deserializer.deserialize_seq(HostVisitor)

Can handle the error here.

@jjbayer
Copy link
Member

jjbayer commented Aug 26, 2024

There is also the option we move away from Host and use matches_any_origin

pub fn matches_any_origin(url: Option<&str>, origins: &[SchemeDomainPort]) -> bool {

for matching and deserialize into SchemeDomainPort.
@jjbayer wdyt?

Good idea, this would even come with wildcard matching built in. Parsing into SchemaDomainPort for every span is not ideal, but at least we know it works, i.e. @aldy505 you could basically copy-paste what is_valid_origin does.

@aldy505 aldy505 requested a review from Dav1dde August 31, 2024 11:31
@aldy505
Copy link
Contributor Author

aldy505 commented Sep 2, 2024

@Dav1dde @jjbayer good morning, can you please re-run the failing CI? it's kind of flaky.

Another note: the SchemeHostPort can't do well with IP globs, so 172.16.0.* is invalid. But I think that should be tackled on another PR.

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the title to match the new behavior.

CHANGELOG.md Outdated Show resolved Hide resolved
@@ -29,6 +29,7 @@ relay-log = { workspace = true }
relay-protocol = { workspace = true }
relay-statsd = { workspace = true }
relay-ua = { workspace = true }
relay-filter = { workspace = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We should move the utility function to relay-common instead of pulling in relay-filter here.

@aldy505 aldy505 changed the title fix(dynamic-config): custom Vec<Host> deserializer method fix(dynamic-config): use matches_any_origin to scrub HTTP hosts in spans Sep 2, 2024
@aldy505
Copy link
Contributor Author

aldy505 commented Sep 2, 2024

@jjbayer all greens now

@jjbayer jjbayer merged commit 516f45a into getsentry:master Sep 2, 2024
23 checks passed
@aldy505 aldy505 deleted the fix/dynamic-config/vec-host-deserializer branch September 3, 2024 03:59
jjbayer added a commit that referenced this pull request Sep 3, 2024
@jjbayer
Copy link
Member

jjbayer commented Sep 3, 2024

@aldy505 we had to revert this PR because it caused panics for URLs that contain [] brackets in their path. I'm working on a fix now to replace the manual parser with url::parse.

jjbayer added a commit that referenced this pull request Sep 3, 2024
…spans (#3939)

Closes #3936

### Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated
in the State of Delaware in 2015 as Functional Software, Inc. and is
gonna need some rights from me in order to utilize my contributions in
this here PR. So here's the deal: I retain all rights, title and
interest in and to my contributions, and by keeping this boilerplate
intact I confirm that Sentry can use, modify, copy, and redistribute my
contributions, under Sentry's choice of terms.

---------

Co-authored-by: Joris Bayer <[email protected]>
jjbayer added a commit that referenced this pull request Sep 4, 2024
Re-apply #3939, which caused
panics in urls with brackets.

This PR contains a minimal fix to prevent the panic:
22233f3

---------

Co-authored-by: Reinaldy Rafli <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

relay.span-normalization.allowed_hosts global options failed to deserialized
4 participants