changefeedccl: better handle kafka quotas #92290
Comments
cc @cockroachdb/cdc |
Overall, the answer seems to be to first design a throttling system for changefeeds, similar in spirit to memory monitoring, admission control, or quota pools. We should consider the scope of each pool/bucket: is it per-cluster, per-database, or per-changefeed? Alternatively, changefeeds could perhaps be grouped into arbitrary pools. |
@HonoreDB, I believe you're working on this right now. |
Not going to do this in 23.2. |
Creating a list of sub-issues to track the work that still needs to be done:
|
I discussed this with yev and my understanding of this issue is:
The work that needs to be done in this area is:
@miretskiy Could you help me confirm my understanding here? |
Your understanding is exactly right. |
Thanks for the confirmation! |
For my education, I wanted to check how this effect gets propagated. Are we just busy-waiting here (https://github.com/cockroachdb/cockroach/blob/edd9e94cfd8489492816617200aae8c12946b83f/pkg/ccl/changefeedccl/sink_kafka.go#L558-L577), since the producer is blocked while it waits out the throttle (https://github.com/IBM/sarama/blob/2767191b19b2e190f7095f21cac2a014de80e92c/broker.go#L1009-L1010)? And since sarama is waiting, we also wait in workerLoop(), which in turn slows down message emission. If this is correct, could you point me to where emitting fewer messages in turn slows down the rangefeed in the code? |
@miretskiy ^ : ) last question 😬 |
Let's discuss offline... |
Previously, users could set only a single Kafka quota configuration for CockroachDB, which was then applied to, and restricted, all changefeeds. This patch introduces a new changefeed configuration option that lets users define a client ID per changefeed, so that different Kafka quota configurations can be specified for different changefeeds. To use it, specify a unique client ID in `kafka_sink_config` and configure the corresponding quota settings on the Kafka server as described in https://kafka.apache.org/documentation/#quotas.

```
CREATE CHANGEFEED FOR foo WITH kafka_sink_config='{"ClientID": "clientID1"}'
```

Fixes: cockroachdb#92290

Release note: `kafka_sink_config` now supports specifying a different client ID for different changefeeds, enabling users to define distinct Kafka quota configurations for various changefeeds. For Kafka versions >= V1_0_0_0 ([KIP-190: Handle client-ids consistently between clients and brokers](https://cwiki.apache.org/confluence/display/KAFKA/KIP-190%3A+Handle+client-ids+consistently+between+clients+and+brokers)), any string can be used as the client ID. For earlier Kafka versions, the client ID can contain only the characters [A-Za-z0-9._-]. For example,

```
CREATE CHANGEFEED FOR ... WITH kafka_sink_config='{"ClientID": "clientID1"}'
```
118643: changefeedccl: allow per changefeed kafka quota config r=rharding6373 a=wenyihu6

Previously, users could set only a single Kafka quota configuration for CockroachDB, which was then applied to, and restricted, all changefeeds. This patch introduces a new changefeed configuration option that lets users define a client ID per changefeed, so that different Kafka quota configurations can be specified for different changefeeds. To use it, specify a unique client ID in `kafka_sink_config` and configure the corresponding quota settings on the Kafka server as described in https://kafka.apache.org/documentation/#quotas.

```
CREATE CHANGEFEED FOR foo WITH kafka_sink_config='{"ClientID": "clientID1"}'
```

Fixes: #92290

Release note: `kafka_sink_config` now supports specifying a different client ID for different changefeeds, enabling users to define distinct Kafka quota configurations for various changefeeds. For Kafka versions >= V1_0_0_0 ([KIP-190: Handle client-ids consistently between clients and brokers](https://cwiki.apache.org/confluence/display/KAFKA/KIP-190%3A+Handle+client-ids+consistently+between+clients+and+brokers)), any string can be used as the client ID. For earlier Kafka versions, the client ID can contain only the characters [A-Za-z0-9._-]. For example,

```
CREATE CHANGEFEED FOR ... WITH kafka_sink_config='{"ClientID": "clientID1"}'
```

Co-authored-by: Wenyi Hu <[email protected]>
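The client ID character rule from the release note can be checked with a few lines of Go. This is a sketch under assumptions, not the validation CockroachDB or sarama performs: `sinkConfig` and `checkClientID` are hypothetical names, only the `ClientID` field of `kafka_sink_config` is modeled, and an empty client ID is treated as "not set".

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// sinkConfig models only the one kafka_sink_config field discussed here.
type sinkConfig struct {
	ClientID string `json:"ClientID"`
}

// legacyClientID matches the character set accepted by Kafka versions
// before V1_0_0_0, as stated in the release note.
var legacyClientID = regexp.MustCompile(`^[A-Za-z0-9._-]+$`)

// checkClientID parses a kafka_sink_config JSON string and returns its
// client ID. When legacy is true (assumed to mean a pre-1.0 Kafka version),
// a non-empty ID with characters outside [A-Za-z0-9._-] is rejected.
func checkClientID(rawConfig string, legacy bool) (string, error) {
	var cfg sinkConfig
	if err := json.Unmarshal([]byte(rawConfig), &cfg); err != nil {
		return "", fmt.Errorf("parsing kafka_sink_config: %w", err)
	}
	if cfg.ClientID == "" {
		return "", nil // no client ID set
	}
	if legacy && !legacyClientID.MatchString(cfg.ClientID) {
		return "", fmt.Errorf("client ID %q has characters outside [A-Za-z0-9._-]", cfg.ClientID)
	}
	return cfg.ClientID, nil
}

func main() {
	id, err := checkClientID(`{"ClientID": "clientID1"}`, true)
	fmt.Println(id, err) // clientID1 <nil>

	// A space and a slash are fine on newer versions but not on older ones.
	_, err = checkClientID(`{"ClientID": "team a/feed"}`, true)
	fmt.Println(err != nil) // true
	id, err = checkClientID(`{"ClientID": "team a/feed"}`, false)
	fmt.Println(id, err) // team a/feed <nil>
}
```

Staying within the restricted character set works on both old and new Kafka versions, so it is the safer choice when the broker version is not known.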
Kafka users can set a data quota in Kafka. As a high-throughput, high-scale database, we can easily run up against those quotas. Customers cannot have an arbitrarily large Kafka cluster to handle traffic bursts.
Likely what we want to do is:
We should also add end-to-end testing for the scenario where Kafka is purposely slow.
Jira issue: CRDB-21692
Epic CRDB-21691