On Zen 4, the summary of the vpternlogd latency experiments is given as
Latency operand 1 → 1: 1
Latency operand 2 → 1: 2
Latency operand 3 → 1: 1
https://uops.info/html-lat/ZEN4/VPTERNLOGD_ZMM_ZMM_ZMM_I8-Measurements.html
but I don't see a substantial difference between the 3 → 1 and 2 → 1 experiments, or any difference w.r.t. its vpternlogq sibling, where all latencies are listed as 1. Shouldn't both dword and qword variants be listed with latency 2 for operands 2 and 3? What am I missing?
If I'm reading Agner's testing harness right, his latency experiment times
repeated 50 times. He lists the latency of ternlog on Zen 4 as 1 cycle in all cases (but if the latency from the second operand is indeed 2, his experiment wouldn't uncover that).
(unfortunately I do not have access to a Zen 4 machine to run more experiments)
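To make the concern concrete, here is a toy model (not a measurement) of why a chain test can miss a slower operand. It assumes the steady-state cost of a dependency chain is set by the slowest operand path that actually carries the dependency, and that dependencies through the other operands are broken. The function name and the choice of carried operands are hypothetical; I am not claiming this is what either harness does.

```python
# Toy model: steady-state cycles per instruction of a loop-carried chain.
# Assumption: only the operands listed in carried_operands receive the
# previous result; dependencies through all other operands are broken.
def chain_cycles_per_instr(latency_by_operand, carried_operands):
    # The next instruction can start only after the slowest carried path.
    return max(latency_by_operand[op] for op in carried_operands)

# Per-operand latencies as listed for vpternlogd on Zen 4 (operand -> cycles).
listed = {1: 1, 2: 2, 3: 1}

via_op1 = chain_cycles_per_instr(listed, [1])      # chain through operand 1 only
via_op2 = chain_cycles_per_instr(listed, [2])      # chain through operand 2 only
via_all = chain_cycles_per_instr(listed, [1, 2, 3])  # result feeds every operand

print(via_op1, via_op2, via_all)  # prints "1 2 2"
```

Under this model, a test whose chain never passes through operand 2 would report 1 cycle even if the 2 → 1 latency really were 2.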