Arm SVE CGEMM / ZGEMM Natural Kernels #542
In fact I haven't considered writing ZGEMM kernels until I realized there's no But anyway, the kernel's out and it's better than
Excellent!
@xrq-phys have the cgemm/zgemm kernels gotten the beta == 0 treatment?
@devinamatthews Not yet. Working on
Converted to draft while working on beta == 0. @xrq-phys please convert back to a regular PR when it's ready for merging.
The picture size seems a bit different from upstream. Generated with MATLAB. Open to any change.
`FMOV [hsd]M, #imm` does not accept a zero immediate. Use `wzr` / `xzr` as the source register instead.
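As background on why zero is rejected: the immediate form of AArch64 `FMOV` encodes its constant in 8 bits, which expand to values of the form ±(n/16)·2^r with 16 ≤ n ≤ 31 and −3 ≤ r ≤ 4. The smallest magnitude is therefore 0.125, and 0.0 has no encoding. The sketch below simply enumerates that set in portable C to check whether a constant is encodable. The function name `fmov_imm8_representable` is a hypothetical helper for illustration and is not part of BLIS or any toolchain.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if x can be encoded as the 8-bit
 * immediate of AArch64 FMOV (scalar, immediate), i.e. if
 * x == +/- (n/16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4. */
bool fmov_imm8_representable(double x)
{
    double scale = 0.125;                 /* 2^-3, the smallest exponent */

    for (int r = -3; r <= 4; ++r) {
        for (int n = 16; n <= 31; ++n) {
            double v = ((double)n / 16.0) * scale;  /* exact in binary FP */
            if (v == x || -v == x) {
                return true;
            }
        }
        scale *= 2.0;                     /* advance to the next exponent */
    }
    return false;                         /* includes 0.0 */
}
```

Constants such as 1.0 or 0.5 pass the check, while 0.0 does not, which matches the fix described above: zeroing a register goes through the zero register rather than an immediate.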
Fixed. Pushing rebased for convenience.
These new kernels (2vx10) yield a peak efficiency of around 95% on GW4 Isambard.
![zgemm_ukr](https://user-images.githubusercontent.com/7891482/133478227-6d4d66aa-eebe-48e3-878a-4dcd51bea2e4.png)
As GW4 Isambard is observed to be working at a frequency a bit lower than 1.8GHz, these numbers could be a little higher on SC Fugaku (which is very crowded at the moment due to end-of-term job accounting).
Here's also a comparison between `nat` and `1m`:
![cgemm_nat](https://user-images.githubusercontent.com/7891482/133478750-0911049f-feca-4f67-a719-8b0655775445.png)
![zgemm_nat](https://user-images.githubusercontent.com/7891482/133478770-ed60809e-df9d-420b-ba63-db24b6f99ad5.png)
The new kernels should also compile with Clang.