[Mono]: Potential deadlock during EventPipe rundown using interpreter. #58996
Identified a potential deadlock between the finalizer thread and rundown enumerating all interpreter methods. Rundown queries the method name in the callback when iterating interpreter methods through interp_jit_info_foreach, which might lead to additional loader activity. The hash map in the interpreter that keeps the methods is locked using the default JIT memory manager, but since the callback might end up in mono_class_create_from_typedef, which takes the loader lock, we get the lock order memory manager -> loader lock on that code path. The finalizer thread invokes OnThreadExiting using the interpreter, and that might end up with the reverse lock order, loader lock -> memory manager, so these two code paths have the potential to deadlock. This is not a problem under JIT or AOT since the JIT hash table is lock free and therefore cannot cause a deadlock due to lock order between the memory manager and the loader lock.

This could be fixed by changing interp_code_hash in the interpreter into a lock-free hash table, but that might be too risky at this point. A safer fix is to take a copy of the pointers while holding the lock and then iterate using the local copy (a simple array of pointers). Since this method is only called during rundown, only when using the interpreter, and the copy only includes the pointers (InterpMethod *) we use with the callback, it will have some temporary memory impact (allocating an array of pointers), but it will mitigate the deadlock since we can safely call the iterator callback without holding the lock. It will also improve interpreter performance in situations where we run session rundown, since the lock will be held for a much shorter amount of time.
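The copy-then-iterate approach described above can be sketched roughly as follows. This is a minimal illustration using pthreads, not the actual Mono change: `Method`, `MethodTable`, `method_table_foreach_snapshot`, `Probe`, and `probe_callback` are hypothetical names standing in for InterpMethod, the locked interpreter method table, and the rundown callback. It also assumes the pointed-to entries stay valid after the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for InterpMethod; only the pointer matters here. */
typedef struct { const char *name; } Method;

/* Hypothetical stand-in for the lock-protected interpreter method table. */
typedef struct {
    pthread_mutex_t lock;
    Method **items;
    size_t count;
} MethodTable;

typedef void (*MethodCallback) (Method *method, void *user_data);

/*
 * Copy the method pointers while holding the table lock, release the lock,
 * then invoke the callback on the local copy. The callback is free to take
 * other locks because the table lock is no longer held at that point.
 * Returns the number of entries visited, or -1 if the copy could not be
 * allocated.
 */
long
method_table_foreach_snapshot (MethodTable *table, MethodCallback callback, void *user_data)
{
    assert (table != NULL && callback != NULL);

    pthread_mutex_lock (&table->lock);
    size_t count = table->count;
    Method **copy = NULL;
    if (count > 0) {
        copy = malloc (count * sizeof (Method *));
        if (copy)
            memcpy (copy, table->items, count * sizeof (Method *));
    }
    pthread_mutex_unlock (&table->lock);

    if (count > 0 && !copy)
        return -1;

    /* Iterate the snapshot without holding the table lock. */
    for (size_t i = 0; i < count; i++)
        callback (copy [i], user_data);

    free (copy);
    return (long) count;
}

/* Demo state: records how often the table lock was free inside the callback. */
typedef struct {
    MethodTable *table;
    int visited;
    int lock_was_free;
} Probe;

/* Demo callback: trylock succeeds only if the iterator is not holding the lock. */
void
probe_callback (Method *method, void *user_data)
{
    Probe *probe = user_data;
    (void) method;
    probe->visited++;
    if (pthread_mutex_trylock (&probe->table->lock) == 0) {
        probe->lock_was_free++;
        pthread_mutex_unlock (&probe->table->lock);
    }
}
```

The trade-off matches the one described above: a temporary array of pointers is allocated per call, in exchange for a short critical section and a callback that never runs under the table lock.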
Tagging subscribers to this area: @BrzVlad
@lateralusX Is it possible that this also fixes #56449?
@BrzVlad A test must start/stop EventPipe sessions with rundown enabled to potentially hit this deadlock. Normally only specific EventPipe tests do that on CI, so unless this test runs an EventPipe session, it won't trigger these code paths.
/backport to release/6.0
Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1228737819
The deadlock was hit on CI by at least #58781 and #58599.