Address sporadic hanging of evals on certain samples #1482
Merged
As has been brought up before (#1384, #1292, #270), evals suffer from a hanging issue, where an evaluation run will hang for a very long time (if not indefinitely) at the end of a run (say, on the 99th sample out of 100).
This PR addresses the issue by replacing a seemingly redundant piece of thread creation: each request spawned its own single-worker thread, nested inside the already multi-threaded eval loop. My impression is that this nested multithreading introduced overhead that caused the observed hanging.
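To illustrate the general shape of the problem, here is a minimal sketch (with hypothetical function names; the actual `request_with_timeout` implementation in evals differs in its details). The first version spawns a fresh single-worker thread pool for every request; note that on timeout the pool's shutdown can still block on the worker thread, which is the kind of nested-threading overhead described above. The second version lets the request run on the caller's own thread and delegates the timeout to the underlying client:

```python
import concurrent.futures

def request_with_timeout_nested(fn, timeout, *args, **kwargs):
    # Anti-pattern (simplified): a brand-new single-worker pool (i.e. a new
    # thread) per request, inside an already multi-threaded eval loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # result() raises TimeoutError after `timeout` seconds, but the
        # context-manager exit still waits for the worker to finish.
        return pool.submit(fn, *args, **kwargs).result(timeout=timeout)

def request_with_timeout_flat(fn, timeout, *args, **kwargs):
    # Flatter alternative: no nested thread; rely on the request function's
    # own timeout support (e.g. the HTTP client's timeout parameter).
    return fn(*args, timeout=timeout, **kwargs)
```

The key design difference is where the timeout is enforced: in the nested version it lives in an extra thread layer that the eval loop must wait on, while in the flat version it lives in the request call itself.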
I had also noticed this hanging issue in `EVALS_SEQUENTIAL=1` mode, where it no longer occurs at the end but instead randomly in the middle of the run. I was able to identify the source of the issue through debugging print statements, which ultimately pointed to the `request_with_timeout` function as the culprit.

We have tested the new `request_with_timeout` code on a fork, where we have run multiple new and pre-existing evals (including with 3rd party solvers), and found no change in behaviour or errors, and a clear improvement on the hanging issue.