Optimize the invocation of parallel methods (HAST-246) #36

Open
Piedone opened this issue Oct 23, 2019 · 0 comments

Comments

@Piedone
Member

Piedone commented Oct 23, 2019

When Tasks are started in a loop, there will be an InvocationProxy-like structure in the VHDL (see invocationIndex) where we select the next available instance of the parallelized method (which was created from a lambda expression) to start. This has as many branches as there are instances. It's not an issue with lower instance counts or simpler initialization logic, but once these get more complex, timing errors can happen.
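
To make the shape of this concrete, here is a heavily simplified sketch of such a selection structure. All entity, signal, and process names are made up for illustration (with only three hard-coded instances) and don't match the actual generated VHDL, which is more involved:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative only: made-up names, three hard-coded instances.
entity InvocationProxySketch is
    port (
        Clock           : in std_logic;
        StartInvocation : in boolean;
        Instance0_Start : out boolean;
        Instance1_Start : out boolean;
        Instance2_Start : out boolean
    );
end entity;

architecture Sketch of InvocationProxySketch is
begin
    SelectInstance: process (Clock)
        -- Selects the next instance of the parallelized method to start.
        variable invocationIndex : integer range 0 to 2 := 0;
    begin
        if rising_edge(Clock) then
            Instance0_Start <= false;
            Instance1_Start <= false;
            Instance2_Start <= false;

            if StartInvocation then
                -- One branch per instance: this chain grows with the degree
                -- of parallelism and, together with complex initialization
                -- logic, is where timing errors can creep in.
                case invocationIndex is
                    when 0 => Instance0_Start <= true;
                    when 1 => Instance1_Start <= true;
                    when 2 => Instance2_Start <= true;
                end case;

                invocationIndex := (invocationIndex + 1) mod 3;
            end if;
        end if;
    end process;
end Sketch;
```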

Then there's a corresponding wait state ("Waiting for the state machine invocation of the following method to finish...") that checks all the instances' start and finish signals, which again can get complex.
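
Again only as a made-up sketch (the real generated code differs), the wait-state check has the same linear shape, with one condition term per instance:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative only: made-up names and a fixed instance count.
entity WaitStateSketch is
    port (
        Clock              : in std_logic;
        Instance0_Finished : in boolean;
        Instance1_Finished : in boolean;
        Instance2_Finished : in boolean;
        AllFinished        : out boolean
    );
end entity;

architecture Sketch of WaitStateSketch is
begin
    WaitForInvocations: process (Clock)
    begin
        if rising_edge(Clock) then
            -- The "Waiting for the state machine invocation ... to finish"
            -- state boils down to AND-ing every instance's finish condition,
            -- so this too grows linearly with the number of instances.
            AllFinished <= Instance0_Finished and
                           Instance1_Finished and
                           Instance2_Finished;
        end if;
    end process;
end Sketch;
```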

While both only scale linearly, we could still optimize them to be simpler somehow. I don't know whether we can really do anything about the wait state, as we unavoidably have to check at some point whether all the FSMs have finished. But the invocation part could be made simpler by pairing invocations, similarly to how invocations between standard methods' FSMs work.
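
For reference, this is roughly the one-to-one pairing shape used between standard methods' FSMs that the above refers to: the caller drives a single Start signal of a single callee and waits on that one callee's Finished signal, with no per-instance selection branching in between. All names here are invented for illustration:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative only: made-up names, trivial caller state machine.
entity PairedInvocationSketch is
    port (
        Clock          : in std_logic;
        CalleeFinished : in boolean;
        CalleeStart    : out boolean
    );
end entity;

architecture Sketch of PairedInvocationSketch is
    type CallerState is (Invoke, WaitForCallee, Done);
    signal State : CallerState := Invoke;
begin
    Caller: process (Clock)
    begin
        if rising_edge(Clock) then
            case State is
                when Invoke =>
                    CalleeStart <= true;
                    State <= WaitForCallee;
                when WaitForCallee =>
                    CalleeStart <= false;
                    -- Only a single Finished signal to check, unlike the
                    -- parallel wait state that has to check every instance.
                    if CalleeFinished then
                        State <= Done;
                    end if;
                when Done =>
                    null;
            end case;
        end if;
    end process;
end Sketch;
```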

Possible approaches:

  • The most promising: Since only a single FSM is started at a given time, it would work to push input data into common global registers and have an invocationIndex register as well (though a single signal can't have multiple drivers). If invocationIndex contains the index corresponding to a given FSM, then that FSM starts itself. However, this would need significant architectural changes. Alternatively, we could add small pieces of glue logic between the existing parallel FSMs and such global registers (for every FSM there would be some combinatorial logic listening to its corresponding invocationIndex); see the sketch after this list. However, this might not help at all, because the current logic is supposed to describe the same thing anyway.
  • Possibly related: Loop unrolling (HAST-114) #14. One solution might be to unroll the Task-creating loops and pair an instance with each unrolled loop body. This, however, would be pretty hard to implement, and if the loop body is complex, it would also greatly increase resource usage.
  • This SO answer mentions using shift registers instead of multiplexers. However, at higher levels of parallelism, shifting inputs out of a register would take a lot of time for higher FSM indices.
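
For the first bullet point, here's a hypothetical sketch of what such per-FSM glue logic could look like. Nothing like this exists in the generated code today; the entity, ports, and the OwnIndex generic are all invented for illustration:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: the caller only writes the shared input registers and
-- the single invocationIndex register; each parallel FSM instance has a small
-- piece of glue logic that starts it when invocationIndex matches its own
-- index, so no wide per-instance selection branch is needed on the caller's
-- side and the shared register keeps a single driver.
entity ParallelFsmGlueSketch is
    generic (OwnIndex : natural := 0);
    port (
        Clock           : in std_logic;
        InvocationIndex : in unsigned(7 downto 0);  -- shared, single-driver register
        InvocationValid : in boolean;               -- pulsed by the caller
        SharedInput     : in unsigned(31 downto 0); -- common global input register
        Finished        : out boolean
    );
end entity;

architecture Sketch of ParallelFsmGlueSketch is
    signal Start : boolean := false;
begin
    -- Glue logic: this instance watches the shared invocationIndex register
    -- and only starts itself when addressed.
    Glue: process (Clock)
    begin
        if rising_edge(Clock) then
            Start <= InvocationValid and (to_integer(InvocationIndex) = OwnIndex);
        end if;
    end process;

    -- Placeholder for the parallelized method's own state machine, which
    -- would latch SharedInput when Start is asserted and only assert Finished
    -- at the end; here it "finishes" immediately.
    Fsm: process (Clock)
    begin
        if rising_edge(Clock) then
            Finished <= not Start;
        end if;
    end process;
end Sketch;
```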

Jira issue

@github-actions github-actions bot changed the title Optimize the invocation of parallel methods Optimize the invocation of parallel methods (HAST-246) Sep 18, 2022