
kv: set sane default for kv.transaction.write_pipelining_max_batch_size #32606

Merged

Conversation

nvanbenschoten
Member

Informs #32522.

There is a tradeoff here between the overhead of waiting for consensus on a batch if we don't pipeline and the overhead of later proving that all of the writes in the batch succeeded if we do. We set this default to a value that experimentally strikes a balance between the two costs.
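
As a rough illustration of what this setting gates (a minimal sketch; the function and its placement are hypothetical, not CockroachDB's actual pipelining code), a batch is only eligible for pipelining when its write count stays at or below the configured maximum:

```go
package main

import "fmt"

// shouldPipelineBatch is a hypothetical stand-in for the gate that
// kv.transaction.write_pipelining_max_batch_size imposes: batches with
// more writes than the limit skip pipelining and wait for consensus up
// front, since proving many in-flight writes later would cost more than
// the synchronous replication round trip.
func shouldPipelineBatch(numWrites, maxBatchSize int) bool {
	return numWrites <= maxBatchSize
}

func main() {
	const defaultMax = 128 // the default chosen in this PR
	fmt.Println(shouldPipelineBatch(8, defaultMax))   // true: small batch, pipeline it
	fmt.Println(shouldPipelineBatch(256, defaultMax)) // false: wait for consensus
}
```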

To determine the best value for this setting, I ran a three-node single-AZ AWS cluster with 4 vCPU nodes (`m5d.xlarge`). I modified the `kv` workload to perform its writes in an explicit txn and to run multiple statements per txn. I then ran `kv0` with 8 DML statements per txn (a reasonable estimate for the average number of statements that an **explicit** txn runs) and varied the batch size of these statements from 1 to 256; a sketch of this transaction shape follows the graph discussion below. This resulted in the following graph:

![image](https://user-images.githubusercontent.com/5438456/49038443-fc91e200-f18a-11e8-810d-1172821e63ea.png)

We can see that the cross-over point where txn pipelining stops being beneficial is with batch sizes somewhere between 128 and 256 rows. Given this information, I set the default for `kv.transaction.write_pipelining_max_batch_size` to 128.
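
For concreteness, one transaction in this experiment looked roughly like the following Go sketch. This is illustrative only: the schema, driver, and statement construction are assumptions, not the actual modified `kv` workload code.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"math/rand"
	"strings"

	_ "github.com/lib/pq" // PostgreSQL wire protocol driver; CockroachDB speaks it
)

// runTxn issues one explicit transaction containing stmtsPerTxn UPSERT
// statements, each writing batchSize rows, mirroring the experiment's
// shape (8 statements per txn, batch size swept from 1 to 256).
// Assumes a table like: CREATE TABLE kv (k INT PRIMARY KEY, v STRING).
func runTxn(db *sql.DB, stmtsPerTxn, batchSize int) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	for i := 0; i < stmtsPerTxn; i++ {
		var b strings.Builder
		b.WriteString("UPSERT INTO kv (k, v) VALUES ")
		for j := 0; j < batchSize; j++ {
			if j > 0 {
				b.WriteString(", ")
			}
			fmt.Fprintf(&b, "(%d, 'v')", rand.Int63())
		}
		if _, err := tx.Exec(b.String()); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit()
}

func main() {
	// Illustrative connection string: a local insecure single-node cluster.
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	// One sample point on the sweep: 8 statements of 128 rows each.
	if err := runTxn(db, 8, 128); err != nil {
		log.Fatal(err)
	}
}
```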

Of course, there are a lot of variables at play here: storage throughput, replication latency, node size, etc. I think the setup I used hits a reasonable middle ground among them.
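
Deployments that land elsewhere on these tradeoffs can adjust the setting at runtime. A minimal Go sketch, assuming a local insecure cluster and the lib/pq driver (the connection string and the value 64 are illustrative):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // PostgreSQL wire protocol driver; CockroachDB speaks it
)

func main() {
	// Illustrative connection string: a local insecure single-node cluster.
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Lower the pipelining cutoff for a workload dominated by large batches.
	if _, err := db.Exec(
		"SET CLUSTER SETTING kv.transaction.write_pipelining_max_batch_size = 64",
	); err != nil {
		log.Fatal(err)
	}

	// Read the setting back to confirm.
	var v string
	if err := db.QueryRow(
		"SHOW CLUSTER SETTING kv.transaction.write_pipelining_max_batch_size",
	).Scan(&v); err != nil {
		log.Fatal(err)
	}
	fmt.Println("max batch size:", v)
}
```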

Release note: None

@nvanbenschoten nvanbenschoten requested review from tbg and a team November 26, 2018 19:53
@cockroach-teamcity
Member

This change is Reviewable

@tbg tbg (Member) left a comment

:lgtm:

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/txnPipelineBatch branch from 9d7db78 to 52242a7 Compare November 26, 2018 22:28
@nvanbenschoten
Member Author

bors r+

Planning on backporting to 2.1.

craig bot pushed a commit that referenced this pull request Nov 26, 2018
32606: kv: set sane default for kv.transaction.write_pipelining_max_batch_size r=nvanbenschoten a=nvanbenschoten


Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig
Contributor

craig bot commented Nov 26, 2018

Build succeeded

@craig craig bot merged commit 52242a7 into cockroachdb:master Nov 26, 2018
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/txnPipelineBatch branch November 27, 2018 19:43