Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for cycle detection issue #2112

Merged
merged 5 commits into from
Jan 19, 2024
Merged

Fix for cycle detection issue #2112

merged 5 commits into from
Jan 19, 2024

Conversation

lhstrh
Copy link
Member

@lhstrh lhstrh commented Nov 20, 2023

Fixes #2033.

@lhstrh lhstrh added the bugfix label Nov 20, 2023
@lhstrh lhstrh requested review from edwardalee and cmnrd November 20, 2023 07:57
@lhstrh
Copy link
Member Author

lhstrh commented Nov 20, 2023

Looks like the code I touched is also used for federation to figure out the topology. For cycle detection, a physical connection should break the cycle, which it does now, but it makes federated tests fail. Not sure how to address this. @edwardalee, could you take a look?

@edwardalee
Copy link
Collaborator

It is not surprising that this change would cause failures. It changes the semantics of the eventualDestinations method. I'm surprised there aren't unfederated failures. I also think that excluding connections with delays is suds. Maybe this is related to federated failures with multiport's and delays?

@lhstrh
Copy link
Member Author

lhstrh commented Nov 20, 2023

To me, it is surprising. This code is used for causality analysis in the validator. Maybe it should not be, but for that use, connections that are physical or have delay should not imply dependencies.

@edwardalee
Copy link
Collaborator

The eventualDestinations method comment says this:

   * Return a list of ranges of this port, where each range sends to a list of destination ports
   * that receive data from the range of this port. Each destination port is annotated with the
   * channel range on which it receives data. The ports listed are only ports that are sources for
   * reactions, not relay ports that the data may go through on the way. Also, if there is an
   * "after" delay anywhere along the path, then the destination is not in the resulting list.

I think your reasoning is that a physical connection is much like an after connection with an unspecified amount of delay. This is reasonable, so changing this method (and its description) is defensible, but it still means making sure that all uses still work. If I had to guess, there is a place where this method is used to set num_destinations, and maybe this is being used in federated execution.

@byeonggiljun
Copy link
Collaborator

byeonggiljun commented Jan 18, 2024

@lhstrh I fixed the compile error by not assigning the TPO level to physical connections and added a test for this PR's change. Could you review them?

@lhstrh lhstrh requested a review from petervdonovan January 18, 2024 23:24
Copy link
Collaborator

@petervdonovan petervdonovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of any way that these changes could cause problems. I think it's true that physical connections do not need to be addressed in TPO.

Copy link
Member Author

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the right solution to me!

@lhstrh lhstrh added this pull request to the merge queue Jan 19, 2024
Merged via the queue into master with commit eca1847 Jan 19, 2024
43 checks passed
@lhstrh lhstrh deleted the physical-cycle branch January 19, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cycle detection erroneously reports cycles with physical connections
4 participants