Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No more use of unordered reactions in federated programs; fix for deadlocks in some federated programs #1684

Merged
merged 144 commits into from
Jul 27, 2023

Conversation

arengarajan99
Copy link
Contributor

@arengarajan99 arengarajan99 commented Apr 2, 2023

The current runtime support for federated execution relies on a set of blocking unordered "control reactions" for each input port that to block the handling of reactions until the required port statuses are known. The drawbacks of this design are discussed in Rengarajan 2023, along with a solution that no longer relies on the use of blocking unordered reactions. This PR implements that solution.

This work was initiated by @arengarajan99 and completed by @petervdonovan and @byeong-gil.

Companion PRs

Tasks completed

  • Use a less hacky mechanism for marking ports that receive from delayed connections as not part of the MLAA calculation (@petervdonovan)
  • Find dead/unnecessary code in the code generator and delete it (split this into sub-tasks as needed) (the CodeCov reports may help with this)
  • Pass the other C tests besides the C federated tests
  • Ensure IntelliJ does not emit warnings related to the code that is touched by this and the other related PRs
  • Get this branch up to date with master (@petervdonovan)
  • Check that this also works for the Python target (@petervdonovan)
  • Factor C-specific code out of FedASTUtils and into CExtension or CExtensionUtils (@petervdonovan)
  • Get all tests in reactor-c to pass (@petervdonovan)
  • Get the TypeScript runtime to be compatible with this branch (@byeong-gil)
  • Get the Python tests to pass (before fixing the Python tests) (@arengarajan99)
  • Fix the broken C tests and check that they still pass See Update level assignment algorithm #1865 (comment) (EDIT: This was explained by Edward here and has since been properly addressed.)
  • Fix the broken Python tests and check that they still pass or that they are at least not regressing wrt master (@petervdonovan)
  • Check that all code added has JavaDoc (@petervdonovan)
  • Everywhere rename outputControlReaction -> portAbsentReaction
  • Do not require lots of threads to be available in order to correctly execute federated programs
  • Find dead/unnecessary code in the C runtime and delete it

For remaining non-required tasks, see: #1922

Fixes #1708.

The ability to do this is the initial motivation for this PR.
@petervdonovan
Copy link
Collaborator

@lhstrh I anticipate that the tests will pass now (they pass locally and have been passing in recent commits). I know that this has been marked "ready for review" for some time just to trigger CI, but now I have cleaned it up and believe it is truly ready for review.

common class `NetworkReactor`

Those two classes have no common method or variable. However, it does
when we want to apply the TPO level concept to reactor-ts. The common
parent class `NetworkReactor` will be added for that in future work.
In reactor-ts, from now on, network receivers are stored in a map
consists of portID and network receivers. Thus, portID should be passed
when network receivers are registered to the map.
Copy link
Member

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am only able to spot check this PR because it is so large, but my impression is that this yielded significant improvement in overall code quality with the many fixes and refactoring steps carried out in this effort. Thank you a ton, @petervdonovan and @byeong-gil, for bringing this to the finish line. Considering all the troubles you encountered on the way, I'm thoroughly impressed that you managed to get all the tests to pass! Congrats on reaching this milestone, everyone!

@lhstrh lhstrh enabled auto-merge July 27, 2023 19:40
@lhstrh lhstrh changed the title Remove Unordered Reactions from Federated Runtimes No more use of unordered reactions in federated programs Jul 27, 2023
@lhstrh lhstrh added this pull request to the merge queue Jul 27, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 27, 2023
@lhstrh lhstrh added this pull request to the merge queue Jul 27, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 27, 2023
@lhstrh lhstrh added this pull request to the merge queue Jul 27, 2023
Merged via the queue into master with commit 48b1b37 Jul 27, 2023
@lhstrh lhstrh deleted the remove-unordered-reactions branch July 27, 2023 22:52
@petervdonovan petervdonovan added refactoring Code quality enhancement bugfix labels Aug 26, 2023
@petervdonovan petervdonovan changed the title No more use of unordered reactions in federated programs No more use of unordered reactions in federated programs; fix for deadlocks in some federated programs Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix refactoring Code quality enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ambiguous reaction precedence rules cause deadlock
5 participants