[Receive] Fix race condition when adding multiple new tenants at once #7941
So it's not really a fix but a revert. Thanks for the test that reproduces the bug; it will make the fix much easier. I think we should fix this instead of reverting, because the cost of creating this slice is not trivial when you have thousands of selects per second.
Hi @GiedriusS, I got your point: this client list should be relatively stable most of the time, so creating a new slice on every call wastes memory. I spent some time on an actual fix and would appreciate another review.
I think the e2e test failure is transient, but I don't have permission to rerun it.
519b937 to ea3c2a0
Signed-off-by: Yi Jin <[email protected]>
8582bfc to 982408e
🚀 thanks a lot for this and sorry for the problems
Head branch was pushed to by a user without write access
28580fd to 83b09f5
Signed-off-by: Yi Jin <[email protected]>
No worries. I've tried to fix the tests, which should help with merging now that all checks pass.
Please don't use force pushes in the future as it makes it hard to follow what changes between reviews, thanks!
…thanos-io#7941)
* [Receive] fix race condition
Signed-off-by: Yi Jin <[email protected]>
* add a change log
Signed-off-by: Yi Jin <[email protected]>
* memorize tsdb local clients without race condition
Signed-off-by: Yi Jin <[email protected]>
* fix data race in testing with some concurrent safe helper functions
Signed-off-by: Yi Jin <[email protected]>
* address comments
Signed-off-by: Yi Jin <[email protected]>
---------
Signed-off-by: Yi Jin <[email protected]>
…thanos-io#7941)
* [Receive] fix race condition
Signed-off-by: Yi Jin <[email protected]>
* add a change log
Signed-off-by: Yi Jin <[email protected]>
* memorize tsdb local clients without race condition
Signed-off-by: Yi Jin <[email protected]>
* fix data race in testing with some concurrent safe helper functions
Signed-off-by: Yi Jin <[email protected]>
* address comments
Signed-off-by: Yi Jin <[email protected]>
---------
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
* Merge pull request #7674 from didukh86/query_frontend_tls_redis_fix
Query-frontend: Fix connection to Redis cluster with TLS.
Signed-off-by: Saswata Mukherjee <[email protected]>
* Capnp: Use segment from existing message (#7945)
* Capnp: Use segment from existing message
Signed-off-by: Filip Petkovski <[email protected]>
* Downgrade capnproto
Signed-off-by: Filip Petkovski <[email protected]>
---------
Signed-off-by: Filip Petkovski <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
* [Receive] Fix race condition when adding multiple new tenants at once (#7941)
* [Receive] fix race condition
Signed-off-by: Yi Jin <[email protected]>
* add a change log
Signed-off-by: Yi Jin <[email protected]>
* memorize tsdb local clients without race condition
Signed-off-by: Yi Jin <[email protected]>
* fix data race in testing with some concurrent safe helper functions
Signed-off-by: Yi Jin <[email protected]>
* address comments
Signed-off-by: Yi Jin <[email protected]>
---------
Signed-off-by: Yi Jin <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
* Cut patch release v0.37.1
Signed-off-by: Saswata Mukherjee <[email protected]>
* Update promql-engine for subquery fix (#7953)
Signed-off-by: Saswata Mukherjee <[email protected]>
* Sidecar: Ensure limit param is positive for compatibility with older Prometheus (#7954)
Signed-off-by: Saswata Mukherjee <[email protected]>
* Update changelog
Signed-off-by: Saswata Mukherjee <[email protected]>
* Fix changelog
Signed-off-by: Saswata Mukherjee <[email protected]>
---------
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]>
Signed-off-by: Yi Jin <[email protected]>
Co-authored-by: Filip Petkovski <[email protected]>
Co-authored-by: Yi Jin <[email protected]>
Changes
Update: this PR now actually fixes the issue instead of reverting the earlier change. Memoizing the TSDB client list is valuable because it avoids creating thousands of slices in memory; see a6fbb9f.
The first version of this PR reverted PR #7782. It fixes issue #7892.
Verification
The bug is reproducible with the newly added unit test TestMultiTSDBAddNewTenant. After this fix, the unit test passes.