[rtl,racl] Fixes to RACL signaling to ensure non-X #26180

alees24 · 2025-02-09T20:26:25Z

Presently all RACL log-generating IP include an assertion that all of 'racl_error_log_o' is known. This is not true in simulation unless the 'a_valid' input signal is asserted. In simulation the other signals are deliberately driven to 'X' when a_valid is deasserted.

Therefore this change suppresses the RACL log information when it is invalid.

Attention please: This is a draft PR to raise awareness of the issue, and it allows simulations to proceed. However:

I presume that the racl_error_log_o data is used only when qualified racl_error_o is asserted, since there is no obvious validity indicator in the log information itself. racl_error_o and racl_error_log_o presently may not have the same timing if the tlul_adapter_reg timing is changed, since reg_re/reg_we may have different timing to tl_i. I have raised this concern before, but note that the racl_error_log_o signaling is already driven directly from tl_i so this change introduces no new fragility.
This is to resolve an assertion issue, since ASSERT_KNOWN is used in the top module of the block-level IP, rather than a design issue and yet it does introduce additional logic into the design that is unnecessary if its use is qualified with 'racl_error_o' assertion.
Is it worth considering merging 'racl_error_o' and 'racl_error_log_o' into a single structure, so that they can be treated as a consistent whole, or would that change be too far-reaching at this stage?
My preference would be to modify the assertions and annotate their behavior.

Aside: it's presently built upon the ac_range_check fixes just because both affect the auto-generated RTL.

Use strict SystemVerilog typing in assignments. Introduce assertion that prim_count errors become alerts. Signed-off-by: Adrian Lees <[email protected]>

Presently all RACL log-generating IP includes an assertion that all of 'racl_error_log_o' is known. This is not true in simulation unless the 'a_valid' is asserted. Therefore suppress the RACL log information when it is invalid. Signed-off-by: Adrian Lees <[email protected]>

Razer6 · 2025-02-09T20:50:32Z

hw/top_darjeeling/ip_autogen/racl_ctrl/rtl/racl_ctrl_reg_top.sv

  assign racl_error_o = |addr_hit & ((reg_re & ~|racl_addr_hit_read) |
                                     (reg_we & ~|racl_addr_hit_write));
-  assign racl_error_log_o.racl_role  = racl_role;
+  assign racl_error_log_o.racl_role  = tl_i.a_valid ? racl_role : '0;


Since in racl_ctrl, capturing the logs is qualified with racl_error_o, we can remove these additional qualifications and tweak the assertions I guess.

However when thinking about it, it might become necessary because of the way racl_ctrl at the moment combines all racl errors. There is no arbitration and all error logs are simply ORed together, assuming that only one is valid and all others are 0.

Do you care about multiple RACL errors occurring simultaneously? Surely this is possible with multiple outstanding requests from multiple hosts?

However when thinking about it, it might become necessary because of the way racl_ctrl at the moment combines all racl errors. There is no arbitration and all error logs are simply ORed together, assuming that only one is valid and all others are 0.

Then it's already necessary, and qualifying the 'racl_error_log_o' with tl_i.a_valid is not sufficient. It must be qualified with 'racl_error_o' for each individual IP, because nothing prevents a transaction on another device occurring simultaneously with a non-zero 'racl_role' and/or non-zero 'ctn_uid', whether or not that transaction generates a RACL error. Also, other devices will see 'Get' on their TL input ports when they are idle because the TL-UL traffic is not zeroed in the TL-UL fabric (see prim_fifo_sync implementation when using 'Pass'through).

Razer6 · 2025-02-09T20:51:30Z

I presume that the racl_error_log_o data is used only when qualified racl_error_o is asserted, since there is no obvious validity indicator in the log information itself.

That is correct.

racl_error_o and racl_error_log_o presently may not have the same timing if the tlul_adapter_reg timing is changed, since reg_re/reg_we may have different timing to tl_i. I have raised this concern before, but note that the racl_error_log_o signaling is already driven directly from tl_i so this change introduces no new fragility.

Within the reg_tops, I agree. If the tlul_adapter_reg is configured differently, i.e., adding a flop stage inside, this would change the timings of the signals and impacting RACL. Though this is not the general use case, and strictly controlled by the templates, I think we are safe to go as there is a valid and RACL compatible configuration inside.

When using the tlul_adapter_reg with RACL, there is tlul_adapter_reg_racl, which implements a RACL shim in front of the adapter itself that works independently of the timings of the used tlul_adapter_reg.

Does that clarify the implementation?

3. Is it worth considering merging 'racl_error_o' and 'racl_error_log_o' into a single structure, so that they can be treated as a consistent whole, or would that change be too far-reaching at this stage?
My preference would be to modify the assertions and annotate their behavior.

We can go that route now.

alees24 · 2025-02-10T09:37:06Z

Is it worth considering merging 'racl_error_o' and 'racl_error_log_o' into a single structure, so that they can be treated as a consistent whole, or would that change be too far-reaching at this stage?
My preference would be to modify the assertions and annotate their behavior.

We can go that route now.

Thanks for clarifying the implementation details. Given my observation above that the error log output must be qualified with 'racl_error_o' and not simply 'a_valid' I think the assertion should probably remain as it is (ASSERT_KNOWN).

Razer6 · 2025-02-11T16:31:07Z

Witht the changes of #26184, I think this PR is superseeded. error_log.valid is always known. An assertion on that is placed. Reading the data part of the error log is qualified witherror_log.valid.

alees24 · 2025-02-14T16:23:06Z

Superseded by #26184

alees24 added 2 commits February 9, 2025 20:02

[rtl,ac_range_check] SystemVerilog type and assertion fixes

226e689

Use strict SystemVerilog typing in assignments. Introduce assertion that prim_count errors become alerts. Signed-off-by: Adrian Lees <[email protected]>

alees24 requested review from Razer6 and a-will February 9, 2025 20:26

Razer6 reviewed Feb 9, 2025

View reviewed changes

alees24 closed this Feb 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[rtl,racl] Fixes to RACL signaling to ensure non-X #26180

[rtl,racl] Fixes to RACL signaling to ensure non-X #26180

alees24 commented Feb 9, 2025 •

edited

Loading

Razer6 Feb 9, 2025

Razer6 Feb 10, 2025

alees24 Feb 10, 2025

alees24 Feb 10, 2025 •

edited

Loading

Razer6 commented Feb 9, 2025

alees24 commented Feb 10, 2025

Razer6 commented Feb 11, 2025

alees24 commented Feb 14, 2025

[rtl,racl] Fixes to RACL signaling to ensure non-X #26180

[rtl,racl] Fixes to RACL signaling to ensure non-X #26180

Conversation

alees24 commented Feb 9, 2025 • edited Loading

Razer6 Feb 9, 2025

Choose a reason for hiding this comment

Razer6 Feb 10, 2025

Choose a reason for hiding this comment

alees24 Feb 10, 2025

Choose a reason for hiding this comment

alees24 Feb 10, 2025 • edited Loading

Choose a reason for hiding this comment

Razer6 commented Feb 9, 2025

alees24 commented Feb 10, 2025

Razer6 commented Feb 11, 2025

alees24 commented Feb 14, 2025

alees24 commented Feb 9, 2025 •

edited

Loading

alees24 Feb 10, 2025 •

edited

Loading