Add -fpga flag to enable FPGA-oriented compilation strategies (currently for memories) #2111
I started reviewing before I realized this was a draft, looks good though!
src/main/scala/firrtl/passes/memlib/SetDefaultReadUnderWrite.scala
@albert-magyar, could you add some HighFirrtl circuits which would synthesize differently with your new fpga target? |
Here is an example that stresses all of the improvements. With the FPGA flag enabled, this will emit a 1024x32b, 2-RW, synchronous-read memory. You will need to target an FPGA architecture that provides underlying memories that can be configured to support two RW ports, so I would recommend starting with the 7-series architecture if you're going to try with Yosys.
47544e4
to
df5c997
Compare
Thanks for the example @albert-magyar. |
df5c997
to
1896cca
Compare
1896cca
to
2cee5fc
Compare
Thanks for looking into this regardless. It is always nice to try to make this open-source "CI-able." |
@albert-magyar we found another case where FPGA and ASIC generation differ: reading memory files in a SYNTHESIS block.
I agree that |
I put the whole memory initialization question on the dev meeting agenda for Monday. We will put whatever solution we come up with in a follow-up PR.
I would be happy to merge this as is. But we should wait for @seldridge or @jackkoenig to also give their OK.
One note for discussion/review is that the enhancements to Also, I am leaving this as a draft for now, since there is still testing of various cases that I would like to do with Vivado. I am not sure when I will have time to do that. |
2cee5fc
to
173ece0
Compare
d048bb9
to
19d76d2
Compare
I updated this with some Scaladoc. |
19d76d2
to
31c94d6
Compare
* This is enabled by adding a PassthroughSimpleSyncReadMemsAnnotation
* Can be emitted directly with new changes to the Verilog emitter
* Add some new deprecations to VerilogMemDelays
* Run scalafmt on VerilogMemDelays

* Optionally defines read-under-write behavior for all 'undefined' memories
* Use DefaultReadFirstAnnotation to choose read-first default
* Use DefaultWriteFirstAnnotation to choose write-first default
* Seal DefaultReadUnderWriteAnnotation based on Jack's feedback

* Update test to include both 'old' and 'new' read-under-write values

* Address @ekiwi comments from review
* Change match cases to scalafmt-mandated lined-up style

* Update name of FPGA flag based on Jack's comment
* Add Scaladoc to describe what each constituent transform does
* Add SeparateWriteClocks to --target:fpga
d3bbfca
to
a95df00
Compare
a95df00
to
1afa3b4
Compare
I tried to test it with Intel Quartus, using the following Chisel code:
I am using x.5-SNAPSHOT. Without the "--target:fpga" flag it generates registers, as expected. With the flag enabled it generates invalid Verilog code:
Quartus complains:
Thanks for the bug report @schoeberl ! Could you try to fix the generated Verilog manually so that the memory is correctly inferred by Quartus? The diff of changes that you had to apply might be helpful in fixing this issue. |
OK, I will try. BTW, I am wondering why the generated Verilog code has actually 3 read ports and from 3 possible different clocks. One of them, |
Update: the Verilog error resulted from an error in my Chisel code, as follows:
having the wrong address for the second write port. After fixing it, Quartus compiles the design but infers registers instead of an on-chip memory. I will try to propose working Verilog code.
Quartus needs Verilog in which a read returns the newly written data. So the following code works:
The only differences are the following lines added:
and
However, a better style would be to have the read on an else branch:
BTW, I am wondering what signal
According to firrtl semantics, when the |
At least on Xilinx FPGAs the read enable (as currently supported by firrtl) can be inferred for clock gating BRAMs. I'm also using it as an implementation trick to keep the read data value unchanged (which probably doesn't quite match the firrtl semantics but nonetheless the generated Verilog works fine with Vivado). |
I wish we had a good way to automatically test for regressions in how Quartus/Vivado synthesize our memories. |
Some hardware behavior simply cannot be expressed in Verilog or VHDL. One example is that the result of a read during a write to the same address is undefined. Bad, but that is what HW and HW languages are. Keeping read data unchanged when read enable is deasserted is, I guess, not a strong property of the language and not so easily achieved in HW. What is the implementation in an ASIC RAM? Using read enable to enable a register update for the read address? I think we would need to reread the Verilog and VHDL specs for this corner case. For other logic, an update that depends on an enable signal would generate latches, something we would like to avoid.
I have no experience with ASIC tapeouts. I did find this manual for a suite of memory compilers: https://users.ece.cmu.edu/~koopman/ece548/hw/hw5/meml80.pdf On page 17 it describes the interface for a
So if we use this cell from a Chisel design, we could wire the |
The behavior of retaining the last read value is definitely not guaranteed by the Chisel semantics. |
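The read-enable and read-under-write questions raised in this thread can be made concrete with a small behavioral model. The following is only a sketch in plain Scala: `SyncReadMemModel`, `ReadUnderWrite`, and the `step` signature are hypothetical names, not FIRRTL or Chisel APIs. It assumes one read port, one write port, a read latency of one cycle, and that a deasserted read enable leaves the read data undefined (modelled as `None`) rather than holding the last value.

```scala
// Behavioral sketch only; all names here are hypothetical, not FIRRTL or Chisel APIs.
sealed trait ReadUnderWrite
case object ReadFirst extends ReadUnderWrite  // collision returns the old contents
case object WriteFirst extends ReadUnderWrite // collision returns the newly written data
case object Undefined extends ReadUnderWrite  // collision guarantees nothing

final class SyncReadMemModel(depth: Int, ruw: ReadUnderWrite) {
  private val cells = Array.fill(depth)(BigInt(0))
  // None models "undefined": any value may appear on the read port.
  private var readData: Option[BigInt] = None

  /** Advance one clock edge and return the read data visible after that edge. */
  def step(rEn: Boolean, rAddr: Int, wEn: Boolean, wAddr: Int, wData: BigInt): Option[BigInt] = {
    val collision = rEn && wEn && rAddr == wAddr
    readData =
      if (!rEn) None                          // assumption: no hold behavior when disabled
      else if (!collision) Some(cells(rAddr)) // ordinary read of the stored contents
      else ruw match {
        case ReadFirst  => Some(cells(rAddr))
        case WriteFirst => Some(wData)
        case Undefined  => None
      }
    if (wEn) cells(wAddr) = wData             // the write lands after the read is sampled
    readData
  }
}
```

A tool that happens to hold the previous read value when the enable is low is still consistent with this model, since `None` permits any value; a design that relies on the hold is not.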
Type of change: enhancement
API impact: API addition + make an internal method private (can keep public in backport).
Backend code-generation impact: More readwrite inference. Also, significant changes to emitted Verilog iff new flags set.
Desired merge strategy: merge
Release notes (WIP):
FirrtlStage now supports new flags to incorporate compilation strategies intended to better integrate with downstream FPGA tools. When the --target:fpga flag is used, the Verilog emitted for synchronous-read memories will change significantly. However, the default flow will produce Verilog that is identical to head-of-tree before this PR, aside from the purely syntactic change of using more net-declaration assignments.

The existing, optional InferReadWrite pass may now infer readwrite ports from read/write port pairs that share the same clock and address for undefined read-under-write synchronous-read memories. Previously recognized cases with mutually exclusive read and write enables will continue to be combined as before.

Commentary:
(This is a draft; it still needs documentation and more tests.)
After the discussion on #2092 and #2094, I am opening this to subsume both of those PRs and a few related changes. This PR introduces a number of optional compiler strategies that are all enabled by the --target:fpga flag:
VerilogMemDelays without introducing explicit pipeline registers or splitting ports.
InferReadWrite (which also receives some enhancements; see below).

Most of these are effectively vendor-neutral, as they simply avoid the elaborate and inference-unfriendly permutation of FIRRTL memories and work around the behavioral memory API in Chisel. While the strong preference for read-first ports in (3) is based largely on experience with Xilinx FPGAs, the extremely strict semantics of write-first FIRRTL memories are also not the simplest or most generic pattern for inference to any underlying macro.
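The readwrite-port inference rule described in the release notes can be sketched as a predicate over one read port and one write port. This is a simplified illustration in plain Scala, not the InferReadWrite implementation: `Port`, `Mem`, `canMerge`, and the `mutuallyExclusive` callback are hypothetical, signals are compared by name only, and requiring a shared clock in the older mutually-exclusive case is an assumption of the sketch.

```scala
object InferReadWriteSketch {
  // Hypothetical, simplified view of a memory port: each field is just a signal name.
  final case class Port(clock: String, addr: String, en: String)
  // Hypothetical memory summary: synchronous-read, and read-under-write left undefined.
  final case class Mem(syncRead: Boolean, ruwUndefined: Boolean)

  /** True if the read/write pair could be combined into a single readwrite port.
    * `mutuallyExclusive` stands in for an analysis proving the two enables are never both high.
    */
  def canMerge(
    mem: Mem,
    r: Port,
    w: Port,
    mutuallyExclusive: (String, String) => Boolean
  ): Boolean = {
    val sameClock = r.clock == w.clock
    // Previously recognized case: the enables can never be asserted together.
    val exclusiveEnables = mutuallyExclusive(r.en, w.en)
    // Newer case: same address on an undefined read-under-write, synchronous-read memory.
    val sharedAddress = mem.syncRead && mem.ruwUndefined && r.addr == w.addr
    sameClock && (exclusiveEnables || sharedAddress)
  }
}
```

The shared-address case is only sound because an undefined read-under-write memory makes no promise about the read data when both ports hit the same address in the same cycle.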
Furthermore, these are combined with some other enhancements. The VerilogEmitter is now capable of emitting simple synchronous-read memories intact; however, this will never happen with the default flow, as VerilogMemDelays retains its previous, conservative behavior without the --target:fpga flag. Finally, when InferReadWrite is enabled, some extra matching-address port pairs (from #2094) will be combined. One open question is whether this enhancement should be tied to the --target:fpga flag or made available by default as part of --infer-rw.

Milestone:
This will backport cleanly to 1.4.x with a couple of minor changes. Notably, this sets up a deprecation for a future change of VerilogMemDelays from a Pass to a Transform so that it can look at annotations. For now, it uses an override of execute and a deprecation warning. Since this flow is generally made up of optional features, the consensus in the meeting was that this would make sense as a 1.4.x milestone if there is general support to add these features as options.

Contributor Checklist
Reviewer Checklist (only modified by reviewer)
Please Merge?