Commit
Implemented chol, trinv, ttmm, hpdinv.
Details:
- Implemented an initial set of "level-4" operations:
  - 'chol': Cholesky factorization
  - 'trinv': Triangular matrix inversion
  - 'ttmm': Triangular-transpose matrix multiply (that is, either
    L^H * L or U * U^H, where the diagonal of L or U is real)
  - 'hpdinv': Hermitian-positive definite matrix inversion (also known
    as spdinv, or symmetric-positive definite matrix inversion for
    real-domain matrices)
  The first three operations each contain three kinds of algorithmic
  variants:
  - blocked ("blk"): blocked algorithms expressed in terms of object
    APIs.
  - unblocked ("unb"): unblocked algorithms expressed in terms of
    object APIs.
  - optimized unblocked ("opt"): optimized unblocked algorithms
    expressed in terms of typed APIs.
  except for ttmm, which omits the unblocked ("unb") implementations.
  (In contrast to the first three operations, 'hpdinv' is implemented
  as a composite operation in terms of chol, trinv, and ttmm, and so it
  does not have any algorithmic variants of its own.) For every variant
  that is implemented, there are two separate functions, one each to
  handle lower- and upper-triangular matrices. In the case of 'trinv',
  unit and non-unit diagonals are also supported, albeit via
  conditional statements in a unified set of variants that work for
  both cases. Each of 'chol', 'trinv', and 'ttmm' employs an extra
  level of recursion for the self-similar subproblem, with 4*KC and KC
  used for the outer and inner algorithmic blocksizes, respectively.
  All four operations provide object and typed APIs. (NOTE: The
  variants added by this commit were inspired and modeled after those
  present in libflame.)
- Added testsuite modules to test the chol, trinv, ttmm, and hpdinv
  operations for correctness and updated the input.operations files
  accordingly.
- Changed invertsc operation to be a non-destructive operation; that
  is, it now takes separate input and output operands. This change
  applies to both the object and typed APIs.
- Defined an alternative square root operation, sqrtrsc, which, when
  operating on complex scalars, assumes the imaginary part of the input
  to be zero.
- Changed the semantics of addm, subm, copym, axpym, scal2m, and xpbym
  so that when the source matrix has an implicit unit diagonal, the
  operation leaves the diagonal of the destination matrix untouched.
  Previously, the operations would interpret an implicit unit diagonal
  on the source matrix as a request to manifest the unit diagonal
  *explicitly* on output (either as something to copy in the case of
  copym, or something to compute with in the cases of addm, subm,
  axpym, scal2m, and xpbym). It turns out that this behavior was too
  cute by half and could cause unintended headaches for practical use
  cases. (This change in behavior also required small modifications to
  the trmv and trsv testsuite modules so that they would properly test
  matrices with unit diagonals.)
- Added missing dependencies for copym to gemv, ger, hemv, trmv, and
  trsv testsuite modules.
- Implemented level-0-like ltsc, ltesc, gtsc, gtesc operations in
  frame/util, which use lt, lte, gt, and gte level-0 scalar macros.
- Implemented bli_acquire_mparts_tl2br() in bli_part.c, which provides
  selected subpartitions of a larger matrix. Also made a trivial
  variable rename in bli_part.c to harmonize with variable naming
  conventions elsewhere in BLIS.
- Due to the fact that this code was developed against a more recent
  commit of BLIS (bce86b1) which employs const correctness, this commit
  adds -Wno-discarded-qualifiers for gcc, or
  -Wno-incompatible-pointer-types-discards-qualifiers for clang, to the
  list of compiler flags used for all source code. In the case of
  clang, -Wno-unused-but-set-variable is also thrown in just to pacify
  clang's protest of some unused variables in select files.
1 parent 0e4491d, commit 02b5acd
Showing 146 changed files with 11,960 additions and 118 deletions.