Commit
117692: changefeedccl: introduce quota to parallelio r=jayshrivastava a=jayshrivastava

Problem:
In the investigation for #111829, it was observed that a lot of CPU was wasted performing unions in intsets. This is caused by the `parallelIO` struct, which relies on unions to check for conflicting keys. The problem is not that intsets are slow, but how they are used:

`parallelIO` uses a single goroutine to both process incoming requests and emit outgoing results. It accepts incoming requests unconditionally, enqueuing them if they cannot be emitted because they conflict with in-flight requests. As outgoing results are processed, each result is cross-checked against, in the worst case, all enqueued requests to see whether they can be emitted. The cross-checking requires unions.

```
incoming request -> request queue -> request handler -> result
                         ^                                 ^
                         | cross check all entries to see  |
                         | if a new request can be emitted |
```

A problem arises when the request queue grows to a critical length at which it significantly slows down the cross-checking. This slows down result processing and, ultimately, consumption from the request queue. The result is a self-reinforcing feedback loop: the request queue grows so large that results take very long to process. This creates a bottleneck that throttles the entire changefeed. See the comments in #115536 for more details.

The request queue is unbounded. The only reason it does not cause an OOM is that the incoming requests are bounded by the per-changefeed memory limit.

Solution:
This change solves the problem by setting a quota on the maximum number of events processed by the library at the same time. The quota defaults to 128 requests and can be changed using a new cluster setting, `changefeed.parallel_io.request_quota`.

Before this change, the API for the parallelio library was hard to use correctly. It required the caller to select on both the request channel and the result channel to prevent deadlock. There were also no public methods, which made it unclear how to use the API properly. This change introduces an explicit API with public methods. However, it keeps the same two-channel scheme because removing it would require a larger refactor. This is left as a TODO.

Closes: #115536
Release note: None
Epic: None

118288: roachtest: make ruby-pg test work on Ubuntu 22.04 r=rafiss a=rafiss

This required updating the version under test, which led to a few new test failures that we track now.

fixes #112109
Release note: None

Co-authored-by: Jayant Shrivastava <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>