Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move record-progress processing to separate thread #618

Merged
merged 1 commit into from
Mar 11, 2022

Conversation

Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented Mar 11, 2022

This is intended to let us prioritize work on other requests over work on
record-progress, thereby avoiding some of the timeouts and "database is locked"
errors we would otherwise see when the record-progress requests happen to take
priority.

This separate thread is designed to only run when the server has no requests
in-flight (other than a short, bounded, queue of record-progress requests). If
that queue fills up, we will tell workers to slow down, causing them to retry
requests -- currently at fixed intervals and per worker thread, but a future
commit might clean that up a little to have a more intentional delay.

In general this should, hopefully, decrease the error rate as particularly
human-initiated requests should never have to wait for more than one
record-progress event to complete before having largely uncontended access to the
database. (Other requests still happen concurrently, but requests are typically
very rare in comparison to record-progress which are multiple times a second,
effectively constantly processing).

Errors like rust-lang/rust#94775 (comment) are the primary motivation here, which I hope this is enough to largely clear up.

This is intended to let us prioritize work on other requests over work on
record-progress, thereby avoiding some of the timeouts and "database is locked"
errors we would otherwise see when the record-progress requests happen to take
priority.

This separate thread is designed to only run when the server has no requests
in-flight (other than a short, bounded, queue of record-progress requests). If
that queue fills up, we will tell workers to slow down, causing them to retry
requests -- currently at fixed intervals and per worker thread, but a future
commit might clean that up a little to have a more intentional delay.

In general this should, hopefully, decrease the error rate as particularly
human-initiated requests should never have to wait for more than one
record-progress event to complete before having largely uncontended access to the
database. (Other requests still happen concurrently, but requests are typically
very rare in comparison to record-progress which are multiple times a second,
effectively constantly processing).
@Mark-Simulacrum
Copy link
Member Author

@bors r+

@bors
Copy link
Contributor

bors commented Mar 11, 2022

📌 Commit 229283e has been approved by Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Mar 11, 2022

⌛ Testing commit 229283e with merge f9baf09...

@bors
Copy link
Contributor

bors commented Mar 11, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing f9baf09 to master...

@bors bors merged commit f9baf09 into rust-lang:master Mar 11, 2022
@Mark-Simulacrum Mark-Simulacrum deleted the opt branch March 11, 2022 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants