Don't suspend writes based on batch size. Instead, log if it seems to be too big #4762
Conversation
This looks fine to me, but I have a question: It looks like the code just adds logging. I don't see code that does the other part of the title: "Don't suspend writes based on batch size." What am I missing?
    {
        std::unique_lock<decltype(mWriteMutex)> sl(mWriteMutex);

        // If the batch has reached its limit, we wait
        // until the batch writer is finished
        while (mWriteSet.size() >= batchWriteLimitSize)
@HowardHinnant Here's the logic change.
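For readers skimming the diff, here is a minimal sketch of the shape of the change, reusing the member names visible in the snippet above; the condition variable, the journal member j_, and the parameter object are assumptions for illustration, not taken from the actual patch:

    // Before: block the caller until the background batch writer
    // drains the set below the limit.
    //
    //     while (mWriteSet.size() >= batchWriteLimitSize)
    //         mWriteCondition.wait(sl);
    //
    // After: never block; log when the batch looks too big, then queue.
    {
        std::unique_lock<decltype(mWriteMutex)> sl(mWriteMutex);

        if (mWriteSet.size() >= batchWriteLimitSize)
        {
            // j_ is an assumed beast::Journal member; rippled logs
            // through the JLOG macro.
            JLOG(j_.warn()) << "Write batch size " << mWriteSet.size()
                            << " exceeds limit " << batchWriteLimitSize;
        }

        mWriteSet.push_back(object);
    }

This keeps store() wait-free for callers (notably the consensus path) at the cost of letting mWriteSet grow past batchWriteLimitSize under load.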
fixed
👍 This patch fixes a potential deadlock as well. Needs to be reformatted before merging though. Nice job Mark!
@sophiax851 this may require perf signoff
Internal tracker: RPFC-80
May be replaced by #4882
#4503 writes asynchronously to NuDB with a background thread, but it places an upper limit on the number of records that can be queued for writing. When the limit is reached, writes block. Online deletion copies a ledger and ensures that the entire ledger is written to NuDB; that volume of data likely causes further writes to block until it has been persisted. Those blocked writes include the new records for each new ledger produced, which are persisted synchronously during consensus, so the stall causes desyncs. This PR (#4762) is expected to fix this. If #4882 (which simply reverts #4503) is merged, then #4503+#4762 combined can still be considered for testing and merging in the future.
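To make the failure mode concrete, here is a small self-contained sketch (all names hypothetical, not rippled code) of a bounded batch writer like the one #4503 introduced. Once the queue reaches its limit, every producer, including the thread persisting a new ledger for consensus, blocks until the background writer drains it:

    #include <chrono>
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    // Bounded batch writer: store() blocks when the queue is full,
    // mirroring the behavior #4762 removes.
    class BatchWriter
    {
        std::mutex mutex_;
        std::condition_variable notFull_;
        std::queue<std::string> writeSet_;
        std::size_t const limit_ = 4;

    public:
        void store(std::string record)
        {
            std::unique_lock<std::mutex> lock(mutex_);
            // A burst (e.g. online deletion copying a whole ledger)
            // fills the queue; later callers stall right here.
            notFull_.wait(lock, [this] { return writeSet_.size() < limit_; });
            writeSet_.push(std::move(record));
        }

        // The background thread drains one record to the backend (simulated).
        void drainOne()
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!writeSet_.empty())
            {
                std::cout << "wrote " << writeSet_.front() << '\n';
                writeSet_.pop();
                notFull_.notify_one();
            }
        }
    };

    int main()
    {
        BatchWriter writer;
        std::thread background([&] {
            for (int i = 0; i < 8; ++i)
            {
                std::this_thread::sleep_for(std::chrono::milliseconds(50));
                writer.drainOne();
            }
        });
        // The "consensus" thread: stores stall once the queue is full,
        // so each new ledger ends up waiting on the slow background writer.
        for (int i = 0; i < 8; ++i)
            writer.store("ledger-record-" + std::to_string(i));
        background.join();
    }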
Note via @mtrippled: there is no reason to merge this PR. We can consider #4503 (comment) instead; tweaking NuDB's write buffer is probably the better design, and we can write micro-benchmarks to show how it affects NuDB itself. Write prioritization adds too much complexity, but if NuDB could write faster, that was the whole point of #4503.
High Level Overview of Change
Removes a limit that blocks nodestore writes. Blocking those writes delays consensus, because each new ledger is persisted during consensus processing.
Context of Change
This removes the check that limits the write batch size and blocks when the limit is reached, and it adds logging to indicate when the limit is exceeded. The trade-off is that pending writes consume memory until they are persisted, so without the limit memory use can grow. But since blocked writes cause consensus instability, the existing behavior is a proactive pessimization. Ultimately, pending writes grow either because input increases (transaction volume) or because I/O slows down; if the I/O and memory subsystems can't handle the throughput, the process will be unstable regardless.
Type of Change