Roslyn failing to clean up listeners #48816
Comments
One of those PRs has been open for five months, the other six days. If we can't merge a fix today then we need to push forward with disabling the assert. This is the number 1 blocker for our builds. It's been a known issue for some time and hasn't gotten the necessary attention. |
Having a little issue with the live stats function but you can see the damage here. |
Historically there were many other causes for this. Here are some of the other PRs that helped: It was difficult to demonstrate the need for #44514 when more acute problems remained, but now that it's clear and the complex |
The live issue is now fixed so stats are updating. The data only goes back ~15 days for error messages though. I didn't start tracking that until yesterday and only ran a backfill push for the last 200 or so builds. |
Part of the issue here appears to be that the code is using a one-minute timeout to gate cancellation. The IDE team also made a change to CI to overutilize cores on all of our test legs. That isn't a reasonable combination. The timeout can and will get hit for normal starvation reasons on a regular basis. We need to pull this timeout entirely if we want to have stable CI. |
Tests are expected to cancel all pending operations before the end of the test. The cleanup code has fast-path handling for this case to guarantee the timeout will not come into play for correctly written code. All failures of this check are either a true product bug or a true test bug for the test that timed out. Some known product bugs remain. For example, #44522 is a case where we fail to cancel asynchronous operations when the owning object is disposed. We just haven't reached a point where we could demonstrate that particular code path was responsible for a failure because the investigations keep revealing other cases that we fix. |
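To make the fast-path argument above concrete, here is a minimal sketch in Python of cleanup logic with that shape. It is not the Roslyn implementation: the function name `find_leaked_operations`, its parameters, and the use of `asyncio` are all assumptions made for illustration. The point it shows is that the timeout is only consulted when some operation is still pending at cleanup time.

```python
import asyncio


async def find_leaked_operations(operations, timeout=60.0):
    """Hypothetical cleanup check: return the operations still running.

    `operations` is a list of asyncio tasks/futures started by a test.
    An empty result means the test cleaned up after itself.
    """
    pending = [op for op in operations if not op.done()]

    # Fast path: a test that cancelled or awaited everything never
    # touches the timeout, no matter how starved the machine is.
    if not pending:
        return []

    # Slow path: something is still running, so give it a bounded
    # grace period before reporting it as leaked.
    _, still_pending = await asyncio.wait(pending, timeout=timeout)
    return list(still_pending)
```

Under these assumptions, whether the timeout can fire on a loaded machine depends only on whether anything is still pending when cleanup starts, which is the point the two comments above disagree on.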
So how do we move forward here? This is the primary issue causing instability in CI. My instinct is to say that we should delete the assert until we get fixes for the underlying product bugs. |
Closing out due to lack of movement in 4 years. If we need to do something here, we should do it. |
Runfo Tracking Issue: Roslyn failing to clean up listeners
Build Result Summary