Defined invscalv, invscalm, invscald operations. #661

fgvanzee · 2022-09-08T20:38:16Z

Details:

Defined invert-scale (invscal) operation on vectors (level-1v), matrices (level-1m), and diagonals (level-1d).
Added test modules for invscalv and invscalm to the testsuite.
Updated BLISObjectAPI.md and BLISTypedAPI.md API documentation to reflect the new operations. Also updated KernelsHowTo.md accordingly.
Renamed 'beta' to 'alpha' in scalv and scalm testsuite modules (and input.operations files) so that the parameter name matches the parameter used in the documentation.
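
To make the semantics concrete, here is a minimal reference sketch of what an invert-scale operation computes: every affected element is divided by the scalar `alpha`. The names `ref_invscalv`, `ref_invscalm`, and `ref_invscald` are hypothetical and the code does not call BLIS. It assumes real double-precision data, a nonzero `alpha` (enforced here with an assertion, which is a choice of this sketch and not something the PR states), and for the diagonal variant only the main diagonal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routines for illustration only.
   They are NOT the BLIS implementations added by this PR. */

/* Vector case: x[i] := x[i] / alpha for n elements spaced incx apart. */
void ref_invscalv(size_t n, double alpha, double *x, ptrdiff_t incx)
{
    assert(alpha != 0.0); /* assumption of this sketch */
    for (size_t i = 0; i < n; ++i)
        x[(ptrdiff_t)i * incx] /= alpha;
}

/* Matrix case: a(i,j) := a(i,j) / alpha for an m x n matrix stored
   with row stride rs and column stride cs. */
void ref_invscalm(size_t m, size_t n, double alpha,
                  double *a, ptrdiff_t rs, ptrdiff_t cs)
{
    assert(alpha != 0.0); /* assumption of this sketch */
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            a[(ptrdiff_t)i * rs + (ptrdiff_t)j * cs] /= alpha;
}

/* Diagonal case: only a(i,i) is divided by alpha; off-diagonal
   elements are left untouched. Main diagonal only in this sketch. */
void ref_invscald(size_t m, size_t n, double alpha,
                  double *a, ptrdiff_t rs, ptrdiff_t cs)
{
    assert(alpha != 0.0); /* assumption of this sketch */
    size_t k = (m < n) ? m : n;
    for (size_t i = 0; i < k; ++i)
        a[(ptrdiff_t)i * (rs + cs)] /= alpha;
}
```

For a column-major 2x2 matrix, the call would pass `rs = 1` and `cs = 2`. The actual BLIS object and typed API signatures are documented in BLISObjectAPI.md and BLISTypedAPI.md, which this PR updates.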

(cherry picked from commit 4afe0cf)

Details:
- Updated Makefile and common.mk so that the targeted configuration's kernel CFLAGS are applied to source files that are found in a 'kernels' subdirectory within an enabled addon. For now, this behavior only applies when the 'kernels' directory is at the top level of the addon directory structure. For example, if there is an addon named 'foobar', the source code must be located in addon/foobar/kernels/ in order for it to be compiled with the target configuration's kernel CFLAGS. Any other source code within addon/foobar/ will be compiled with general-purpose CFLAGS (the same ones that were used on all addon code prior to this commit). Thanks to AMD (esp. Mithun Mohan) for suggesting this change and catching an intermediate bug in the PR.
- Comment/whitespace updates.
- (cherry picked from commit fd885cf)

Fix line number issue in flattened blis.h. (#660)

Details:
- Updated the top-level Makefile so that it invokes flatten-headers.py without the -c option, which was requesting that comments be stripped (since comment stripping is disabled by default).
- Updated flatten-headers.py to accept a new option (-l) to enable insertion of #line directives into the output file. This new option is enabled by default.
- Also added logic to flatten-headers.py that outputs a warning if both comment stripping and line numbers are requested since the comment stripping will cause the line numbers to become inaccurate.
- (cherry picked from commit 6e5431e)

Defined invscalv, invscalm, invscald operations. (#661)

Details:
- Defined invert-scale (invscal) operation on vectors (level-1v), matrices (level-1m), and diagonals (level-1d).
- Added test modules for invscalv and invscalm to the testsuite.
- Updated BLISObjectAPI.md and BLISTypedAPI.md API documentation to reflect the new operations. Also updated KernelsHowTo.md accordingly.
- Renamed 'beta' to 'alpha' in scalv and scalm testsuite modules (and input.operations files) so that the parameter name matches the parameter used in the documentation.
- (cherry picked from commit 4afe0cf)

Added '-q' quiet mode option to testsuite. (#657)

Details:
- Added support for a '-q' command line option to the testsuite. This option suppresses most informational output that would normally clutter up the screen. By default, verbose mode (the previous status quo) will be operative, and so quiet mode must be requested.
- (cherry picked from commit a87eae2)

Arm64 dgemmsup with extended MR&NR (#655)

Details:
- Since the number of registers in NEON is large but their lengths are short, I'm here extending both MR and NR.
- The approach is to represent the C microtile in registers optionally in columns, so for sizes like 6x7m, the 'crr' kernel is the default with 'rrr' supported through an in-register transpose.
- A few asm kernels are crafted for 'rv' to complete this extended size support.
- For 'rd' I'm still relying heavily on C99 intrinsic kernels with branching so the performance might not be optimal. (Sorry for that.)
- So far, these changes only affect the 'firestorm' subconfig.
- This commit also contains row-preferential s12x8 and d6x8 gemm ukernels. These microkernels are templatized versions of the existing s8x12 and d6x8 ukernels defined in bli_gemm_armv8a_asm_d6x8.c.
- (cherry picked from commit dfa5413)

Temporarily disabled #line directives from 6826c1c.

Details:
- Commented out the inclusion of #line preprocessor directives in the flattened header output provided by build/flatten-headers.py. This output was added recently in 6826c1c, but was later found to have thrown off the line numbering referenced by compiler warnings and errors (possibly due to license comment blocks, which are stripped from source headers as they are inlined into the monolithic header).
- (cherry picked from commit 9e5594a)

fgvanzee merged commit 4afe0cf into master Sep 8, 2022

fgvanzee deleted the invscal branch September 23, 2022 01:34
