ref(store): Random Kafka partitioning for sessions #1194

jan-auer · 2022-02-23T10:03:43Z

Instead of using the session ID as partition key for producing sessions, assign
a random partitioning key. This ensures more even distribution of sessions on
the Kafka topic, even if clients are sending many updates for the same session.

We have noticed intermittent imbalance on the partitions. The theory is that
clients are sending repeat updates for the same session (ID) when a high number
of errors are occurring within that session. Every session update would go to
the same partition unless we use random partitioning.

Snuba has no logical constraints on how sessions are partitioned.
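The effect described above can be illustrated with a small, std-only simulation. This is a sketch under stated assumptions, not Relay's or Kafka's actual code: `partition_for`, `distribute`, and `XorShift` are hypothetical helpers, a `u128` stands in for a UUID key, and Rust's `DefaultHasher` stands in for whatever hash the real Kafka partitioner applies to the key.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps a partitioning key to a partition by hashing it.
/// Assumption: a stand-in for the real Kafka partitioner.
fn partition_for(key: u128, partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % partitions as u64) as usize
}

/// Tiny xorshift generator, used here only as a stand-in for random UUIDs.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    fn next_key(&mut self) -> u128 {
        ((self.next_u64() as u128) << 64) | self.next_u64() as u128
    }
}

/// Counts how many messages land on each partition for a sequence of keys.
fn distribute(keys: impl Iterator<Item = u128>, partitions: usize) -> Vec<usize> {
    let mut counts = vec![0; partitions];
    for key in keys {
        counts[partition_for(key, partitions)] += 1;
    }
    counts
}

fn main() {
    let partitions = 8;
    let updates = 10_000;

    // One noisy session sending many updates, all keyed by its session ID.
    let session_id: u128 = 0x1234_5678_9abc_def0_1234_5678_9abc_def0;
    let keyed = distribute(std::iter::repeat(session_id).take(updates), partitions);

    // The same number of updates, each with a fresh random key.
    let mut rng = XorShift(0x2545_F491_4F6C_DD1D);
    let random = distribute((0..updates).map(|_| rng.next_key()), partitions);

    println!("keyed by session ID: {:?}", keyed);
    println!("random keys:         {:?}", random);
}
```

With a fixed key, every update hashes to the same partition, so one busy session loads a single partition. With a fresh key per update, the same traffic spreads across all partitions.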

jan-auer · 2022-02-23T10:06:17Z

relay-server/src/actors/store.rs

@@ -673,8 +673,8 @@ impl KafkaMessage {
            Self::Attachment(message) => message.event_id.0,
            Self::AttachmentChunk(message) => message.event_id.0,
            Self::UserReport(message) => message.event_id.0,
-            Self::Session(message) => message.session_id,
-            Self::Metric(_message) => Uuid::nil(), // TODO(ja): Determine a partitioning key
+            Self::Session(_message) => Uuid::nil(), // Explicit random partitioning for sessions


Note that this NIL is overwritten immediately after the match block.

nit, can we note down the observed scenario that motivated this change?

I was hoping in code it's enough to say that we explicitly want uniform distribution. The full reasoning is in the PR description, and given that it's a hypothesis I wasn't gonna put this in the code.
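For readers without the surrounding code at hand, the pattern discussed in this thread (return a nil key from the match, then overwrite it afterwards) might look roughly like the sketch below. It is a simplified illustration, not Relay's actual implementation: `Message`, `Key`, `NIL`, `random_u64`, and `random_key` are hypothetical names, a `u128` stands in for `Uuid`, and `RandomState` is used as a std-only source of randomness in place of a randomly generated UUID.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Stand-in for `Uuid`: a 128-bit key where zero plays the role of `Uuid::nil()`.
type Key = u128;
const NIL: Key = 0;

/// Hypothetical, simplified message enum (not Relay's actual `KafkaMessage`).
#[allow(dead_code)] // `session_id` is intentionally unused for partitioning.
enum Message {
    Event { event_id: Key },
    Session { session_id: Key },
}

/// Returns a random `u64` using only the standard library.
/// Each `RandomState` is created with randomly seeded hasher keys.
fn random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Returns a random, non-nil key.
fn random_key() -> Key {
    loop {
        let key = ((random_u64() as u128) << 64) | random_u64() as u128;
        if key != NIL {
            return key;
        }
    }
}

impl Message {
    /// Returns the partitioning key for this message.
    fn key(&self) -> Key {
        let key = match self {
            Message::Event { event_id } => *event_id,
            // Sessions explicitly opt out of keyed partitioning.
            Message::Session { .. } => NIL,
        };

        // A nil key is overwritten after the match, so it never reaches Kafka.
        if key == NIL {
            random_key()
        } else {
            key
        }
    }
}

fn main() {
    let event = Message::Event { event_id: 42 };
    let session = Message::Session { session_id: 7 };

    println!("event key:   {:032x}", event.key());
    println!("session key: {:032x}", session.key());
    println!("session key: {:032x}", session.key());
}
```

In this sketch, an event always yields its own ID as the key, while two calls for the same session yield different keys.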

jjbayer

As long as we are sure that the partitioning by ID wasn't there for a reason, this looks good to me!

jan-auer · 2022-02-23T10:55:41Z

Asking for @fpacifici's blessing before merging this :)

untitaker · 2022-02-23T13:57:22Z

nit, can we note down the observed scenario that motivated this change?

* master:
  ref(make): Simplify M1 exports in Makefile (#1206)
  fix(metrics): Wait for project states during aggregator shutdown (#1205)
  fix(test): Find librdkafka on Apple M1 (#1204)
  build: Bump sentry-relay in dev dependencies to 0.8.9 (#1202)
  ref(metrics): Tag backdated bucket creations in statsd (#1200)
  feat(metrics): Extract user satisfaction as tag (#1197)
  fix(statsd): Add new metric_type tag to existing metrics (#1199)
  fix: Apply clippy 1.59 suggestions (#1198)

* master:
  ci(gcb): Increase timeout for self-hosted integration test (#1208)
  build: Update regex to 1.5.5 (#1207)

ref(store): Random Kafka partitioning for sessions

fa07de8

jan-auer requested a review from a team February 23, 2022 10:03

jan-auer self-assigned this Feb 23, 2022

meta: Changelog

3745794

jjbayer approved these changes Feb 23, 2022

jan-auer requested a review from fpacifici February 23, 2022 10:55

untitaker approved these changes Feb 23, 2022

jan-auer added 3 commits March 9, 2022 08:39

Merge branch 'master' into ref/session-partitioning

dc9316e

meta: Fix changelog for 21.12.0

6c217de

jan-auer merged commit 6c3415c into master Mar 10, 2022

jan-auer deleted the ref/session-partitioning branch March 10, 2022 09:33
