Move record-progress processing to separate thread #618
This is intended to let us prioritize work on other requests over work on
record-progress, thereby avoiding some of the timeouts and "database is locked"
errors we would otherwise see when the record-progress requests happen to take
priority.
This separate thread is designed to run only when the server has no requests
in flight (other than a short, bounded queue of record-progress requests). If
that queue fills up, we tell workers to slow down, causing them to retry their
requests. Retries currently happen at fixed intervals and per worker thread,
but a future commit might clean that up a little to have a more intentional delay.
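
As a rough illustration of the queueing behaviour described above (not the code in this PR), here is a minimal sketch using only the standard library. The names `Progress`, `Enqueue`, `progress_queue`, `enqueue`, and `drain_when_idle` are hypothetical, and so is the `in_flight` counter, which is assumed to count the non-record-progress requests currently being served.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical payload; a real record-progress request carries more data.
#[derive(Debug, PartialEq)]
pub struct Progress {
    pub step: String,
}

/// What the request handler reports back to the worker.
#[derive(Debug, PartialEq)]
pub enum Enqueue {
    /// The event was queued for the background thread.
    Accepted,
    /// The queue is full (or closed); the worker should retry later.
    SlowDown,
}

/// Create the short, bounded queue of pending record-progress events.
pub fn progress_queue(capacity: usize) -> (SyncSender<Progress>, Receiver<Progress>) {
    sync_channel(capacity)
}

/// Called from the request handler: never blocks, only reports back-pressure.
pub fn enqueue(tx: &SyncSender<Progress>, event: Progress) -> Enqueue {
    match tx.try_send(event) {
        Ok(()) => Enqueue::Accepted,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => Enqueue::SlowDown,
    }
}

/// Called from the background thread: process queued events, but stop as soon
/// as any other request is in flight or the queue is empty. Returns the number
/// of events processed.
pub fn drain_when_idle(
    rx: &Receiver<Progress>,
    in_flight: &AtomicUsize,
    mut record: impl FnMut(Progress),
) -> usize {
    let mut processed = 0;
    while in_flight.load(Ordering::SeqCst) == 0 {
        match rx.try_recv() {
            Ok(event) => {
                record(event);
                processed += 1;
            }
            Err(_) => break, // queue empty or sender dropped
        }
    }
    processed
}
```

Because the idle check happens between events rather than during one, another request arriving mid-drain waits for at most the single event already being recorded.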
In general this should, hopefully, decrease the error rate: human-initiated
requests in particular should never have to wait for more than one
record-progress event to complete before having largely uncontended access to the
database. (Other requests still happen concurrently, but they are typically
very rare compared to record-progress requests, which arrive multiple times a
second and are effectively being processed constantly.)
Errors like rust-lang/rust#94775 (comment) are the primary motivation here, and I hope this change is enough to largely clear them up.