Optimize metadata merge protocol #44
Comments
Measurements showing performance improvement from v1.0 with the optimizations in #68:
These measurements come from my laptop, with 4 vCPUs (2 cores, 4 threads) and pretty lousy bandwidth and latency from the peer node.
Measurements from an ec2 test node:
100k merge in ec2 in 35s:
1MM merge in ec2:
👍
A final measurement with an additional optimization (occurs check delayed until request time, so that it can happen in parallel):
so we are merging at a cool 2.5K writes/s
Metadata merges as initially implemented in #41 fetch data batches synchronously within the flow of the query stream.
This may be fine for small merges, but in larger merges throughput will suffer, and the synchronous fetches may hold up open query result sets in the source.
The data merge can be implemented with a background goroutine fetching data batches as requested by the primary merge goroutine through a buffered channel.