Fix a deadlock scenario in envfiles usage #2569

Closed
wants to merge 2 commits into from

Conversation

saich
Contributor

@saich saich commented Aug 14, 2020

Summary

Fixes a deadlock that occurs when the agent restores its state from its data file and the tasks use the environment files feature.

Implementation details

In certain cases, the agent tries to acquire a lock that it already holds and has not yet released. The second acquire waits indefinitely, resulting in a deadlock.

Testing

New tests cover the changes: no

Description for the changelog

Bug - Fixed a bug that could cause a deadlock resulting in the agent not functioning as expected

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sparrc
Contributor

sparrc commented Aug 14, 2020

Hello, thank you for the contribution. Rather than keeping the deferred unlock and creating a new "unsafe" function, could you change the implementation to call Unlock as soon as the lock is no longer needed? I believe something like this would avoid the deadlock: 913e3b5

@saich saich force-pushed the fix/envfiles-deadlock branch from efa970f to 31d0fa0 Compare August 18, 2020 01:34
@saich saich force-pushed the fix/envfiles-deadlock branch from 31d0fa0 to f9bfa5c Compare August 18, 2020 01:36
@saich
Contributor Author

saich commented Aug 18, 2020

Hey @sparrc, thanks for the feedback. Incorporated your suggestion.

@saich
Contributor Author

saich commented Aug 18, 2020

The tests seem to be failing on Travis CI because of an unrelated issue that looks intermittent. I would appreciate it if someone could restart the tests.

@saich saich marked this pull request as ready for review August 18, 2020 02:06
@yhlee-aws
Contributor

We have created a new PR with your fix, #2580, to better manage it given some testing issues we are seeing at the moment (we are actively working on unblocking it). Please track this fix through the new PR.

@yhlee-aws yhlee-aws closed this Aug 18, 2020
6 participants