proc_dff: bit-granularity optimizations and refactoring #4781

Open: 8 commits to merge into base: main

@georgerennie (Collaborator) commented Nov 28, 2024

proc_dff converts processes (sets of sync rules) into flip-flops through what is essentially structural pattern matching. As part of this, it tries to optimize the sync rules so that it can produce simpler flip-flops (e.g. $adff and $dff instead of $aldff and $dffsr). These optimizations were previously applied in a fairly ad hoc manner, sprinkled throughout the inference, which made the pass a pain to improve and made it hard to have confidence in its correctness. It also limited how far the optimizations could be applied: they operated on full signals as found on the left-hand side of sync rules, and so could miss opportunities where different parts of a signal are best matched by different flip-flops.

This was motivated by a pattern I saw with sv2v, where a struct is lowered to a single wire and may be only partially reset even though the whole wire is assigned at once. The example below was being lowered to an $aldff, even though it is really just the combination of a $dffe and an $adff on different parts of the signal.

module top(input wire clk, input wire rst, output reg [7:0] q, input wire [7:0] d);
always @(posedge clk or posedge rst) begin
	if (rst) q[3:0] <= '0;
	else     q <= d;
end
endmodule
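
To make the claimed decomposition concrete, here is a small behavioural model in Python. It is only a sketch and not how Yosys represents cells: the function names are hypothetical, and it assumes that q[3:0] behaves like a flop with async reset to 0 and q[7:4] like a clocked flop with enable `!rst`. It compares that split against the original always block over random clk/rst edges.

```python
import random

MASK_LO = 0x0F  # q[3:0]
MASK_HI = 0xF0  # q[7:4]


def original_step(q, d, rst, event):
    """One triggering edge of the always block above.

    event is "clk" or "rst"; a "rst" event is a rising edge of rst,
    so rst is 1 whenever it fires.
    """
    if rst:
        return q & MASK_HI  # only q[3:0] is cleared, q[7:4] holds
    return d & 0xFF


def split_step(q, d, rst, event):
    """The same edge modelled as two independent flops (assumed split)."""
    hi = q & MASK_HI
    # q[3:0]: async reset to 0, otherwise loads d[3:0] ($adff-like)
    lo = 0 if rst else d & MASK_LO
    # q[7:4]: only clk edges matter, enabled when rst is low ($dffe-like)
    if event == "clk" and not rst:
        hi = d & MASK_HI
    return lo | hi


def models_agree(seed, steps=1000):
    """Drive both models with the same random edges and compare q."""
    rng = random.Random(seed)
    q_a = q_b = rng.randrange(256)
    for _ in range(steps):
        event = rng.choice(["clk", "rst"])
        rst = 1 if event == "rst" else rng.randrange(2)
        d = rng.randrange(256)
        q_a = original_step(q_a, d, rst, event)
        q_b = split_step(q_b, d, rst, event)
        if q_a != q_b:
            return False
    return True
```

The model ignores x-propagation and the uninitialised start-up state, so it is a sanity check on the reasoning rather than a replacement for formal equivalence checking.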

This PR refactors proc_dff into three steps that are iterated while there are still signals needing DFFs: extracting the relevant sync rules from the process, optimizing them, and then generating flip-flop cells. The optimizations narrow the signal that the flop is currently being generated for to the largest range of bits, starting at the LSB, to which all the same optimizations as the LSB can be applied. That range is therefore as optimized as it can be. The bits removed by this narrowing are not deleted from the process, so they are picked up as a target in the next iteration. It is probably easiest to understand the optimizations and the choice of flip-flops by looking at the code, which should be fairly well documented. For standard use cases this should give basically the same results as before; the change just allows more corner cases to be supported.
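
The narrowing loop can be sketched in a few lines of Python. This is an illustration of the idea only, not the code in the PR: it assumes each bit has already been given a hashable "class" describing which optimizations apply to it, and the class labels and function names are made up for the example.

```python
from typing import Hashable, List, Sequence, Tuple


def lsb_run_length(bit_classes: Sequence[Hashable]) -> int:
    """Length of the longest run starting at bit 0 that shares bit 0's class."""
    n = 1
    while n < len(bit_classes) and bit_classes[n] == bit_classes[0]:
        n += 1
    return n


def split_into_flops(
    bit_classes: Sequence[Hashable],
) -> List[Tuple[int, int, Hashable]]:
    """Peel off the LSB run repeatedly; leftover bits are the next target.

    Returns (offset, width, class) for each flop that would be generated.
    """
    chunks = []
    offset = 0
    remaining = list(bit_classes)
    while remaining:
        width = lsb_run_length(remaining)
        chunks.append((offset, width, remaining[0]))
        remaining = remaining[width:]  # kept for the next iteration
        offset += width
    return chunks
```

For the 8-bit example above, with the hypothetical labels `["adff"] * 4 + ["dffe"] * 4`, `split_into_flops` returns `[(0, 4, "adff"), (4, 4, "dffe")]`. Because each range has to start at the LSB, bits of the same class that are not adjacent end up in separate flops.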

This PR also fixes an issue in opt_dff where sigmap wasn't being used, so it would fail to fold some muxes into enable signals. This caused test failures with the proc_dff changes. The PR also adds test cases for the new proc_dff optimizations.
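
As a rough illustration of why signal canonicalization matters for this kind of folding, the Python sketch below uses a toy union-find over wire names. It is not the Yosys sigmap class or the opt_dff code, and the exact failure mode shown is an assumption: a mux feedback input that reaches the flop output only through an alias wire is missed by a raw comparison but matches once both sides are mapped to a canonical representative.

```python
class ToySigMap:
    """Toy union-find over wire names (hypothetical, not the Yosys class)."""

    def __init__(self):
        self.parent = {}

    def find(self, wire):
        """Return the canonical representative of wire."""
        self.parent.setdefault(wire, wire)
        while self.parent[wire] != wire:
            # path halving keeps the lookup chains short
            self.parent[wire] = self.parent[self.parent[wire]]
            wire = self.parent[wire]
        return wire

    def connect(self, a, b):
        """Record that wires a and b carry the same signal."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def mux_holds_q(mux_hold_input, flop_q, sigmap=None):
    """True if the mux's 'hold' input is the flop output (enable pattern)."""
    if sigmap is None:
        return mux_hold_input == flop_q  # raw comparison misses aliases
    return sigmap.find(mux_hold_input) == sigmap.find(flop_q)
```

With `q_alias` connected to `q` in the map, `mux_holds_q("q_alias", "q")` is false while `mux_holds_q("q_alias", "q", sigmap)` is true, which is the difference between leaving the mux in place and folding it into an enable.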

To test this, it would be good to run reasonably sized Verilog designs (ideally with async resets) through proc and check that the inferred flops are not a regression from previous Yosys versions. I believe Amaranth doesn't use sync processes, so read_verilog and yosys-slang are probably the main frontends affected by this.

* Instead of an ad hoc mix of optimizations and inferences, this tries
  to make it more principled by first extracting a set of asynchronous
  update rules from the process, then optimizing them before lowering
  them to a concrete flip-flop type, preferring simpler ones
@georgerennie (Collaborator, Author) commented:

As a side note, I think there are other bits of proc that could do with a bit of tidying up and being adapted to cover more general patterns. Maybe I'll have a look at proc_arst at some point...

@georgerennie georgerennie marked this pull request as ready for review November 28, 2024 22:56