Is your feature request related to a problem? Please describe.
Yes. I'm trying to analyze the following kernel that uses SVE instructions with OSACA 0.6.1.
Performance data is missing for quite a few instructions. Some, like the SVE signed unpacks, aren't in the database for V2 at all. Others, like sub and mul, are present, but not for SVE z-registers. Finally, the load ld1w is present but doesn't seem to match (maybe because of the predicate?).
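To illustrate why a mnemonic can be "present" and still fail to match, here is a minimal Python sketch. It assumes a performance model keyed by mnemonic plus operand classes; the functions `operand_class`, `split_operands` and `instruction_form` and the class names are hypothetical, made up for this example, and are not OSACA's parser or database schema.

```python
import re

def operand_class(op: str) -> str:
    """Map one AArch64 operand string to a coarse class (hypothetical naming)."""
    op = op.strip().lower().strip("{}")
    if re.fullmatch(r"z\d+(\.[bhsdq])?", op):
        return "sve_vector"
    if re.fullmatch(r"p\d+(\.[bhsdq])?(/[mz])?", op):
        return "sve_predicate"
    if re.fullmatch(r"[wx]\d+|sp|wzr|xzr", op):
        return "gpr"
    if op.startswith("#"):
        return "immediate"
    if op.startswith("["):
        return "memory"
    return "other"

def split_operands(text: str) -> list:
    """Split on commas that are not nested inside [] or {}."""
    parts, depth, cur = [], 0, ""
    for ch in text:
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(cur.strip())
            cur = ""
        else:
            cur += ch
    if cur.strip():
        parts.append(cur.strip())
    return parts

def instruction_form(line: str) -> tuple:
    """Return (mnemonic, operand classes) for one assembly line."""
    fields = line.strip().split(None, 1)
    mnemonic = fields[0].lower()
    rest = fields[1] if len(fields) > 1 else ""
    return mnemonic, tuple(operand_class(o) for o in split_operands(rest))

# Same mnemonic, different forms: an entry for the first does not cover the second.
scalar_sub = instruction_form("sub x0, x1, x2")
sve_sub = instruction_form("sub z0.s, z1.s, z2.s")
# A predicated load carries an extra predicate operand in its form.
sve_load = instruction_form("ld1w {z0.s}, p0/z, [x0, x1, lsl #2]")
```

Under this assumption, a database entry for the scalar `sub` form would not cover the z-register form, and a predicated `ld1w` would only match an entry that also lists a predicate operand.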
Describe the solution you'd like
The 'Code Quality Analyzer' in the MAQAO tools seems to provide performance data. Apparently, for ARM64 the latencies are taken from a developer manual rather than benchmarked. The current released version of MAQAO doesn't support the V2 architecture, so this analysis was done for V1, hence the different port model, but something similar would be great.
stefandesouza changed the title from "[REQUEST] Add missing SVE instructions to database" to "[REQUEST] Add missing SVE instructions to database for V2" on Dec 18, 2024
This is the output I get from OSACA using the Neoverse V2 architecture flag: