
Commit

Implemented chol, trinv, ttmm, hpdinv.
Details:
- Implemented an initial set of "level-4" operations:
  - 'chol': Cholesky factorization
  - 'trinv': Triangular matrix inversion
  - 'ttmm': Triangular-transpose matrix multiply (that is, either
    L^H * L or U * U^H, where the diagonal of L or U is real)
  - 'hpdinv': Hermitian-positive definite matrix inversion (also known
    as spdinv, or symmetric-positive definite matrix inversion for
    real-domain matrices)
  The first three operations each contain three kinds of algorithmic
  variants:
  - blocked ("blk"): blocked algorithms expressed in terms of object
    APIs.
  - unblocked ("unb"): unblocked algorithms expressed in terms of object
    APIs.
  - optimized unblocked ("opt"): optimized unblocked algorithms
    expressed in terms of typed APIs.
  The one exception is ttmm, which omits the unblocked ("unb") implementations.
  (In contrast to the first three operations, 'hpdinv' is implemented as
  a composite operation in terms of chol, trinv, and ttmm, and so it
  does not have any algorithmic variants of its own.) For every variant
  that is implemented, there are two separate functions, one each to
  handle lower- and upper-triangular matrices. In the case of 'trinv',
  unit and non-unit diagonals are also supported, albeit via conditional
  statements in a unified set of variants that work for both cases. Each
  of 'chol', 'trinv', and 'ttmm' employs an extra level of recursion for
  the self-similar subproblem, with 4*KC and KC used for the outer and
  inner algorithmic blocksizes, respectively. All four operations
  provide object and typed APIs. (NOTE: The variants added by this
  commit were inspired by and modeled after those present in libflame.)
- Added testsuite modules to test the chol, trinv, ttmm, and hpdinv
  operations for correctness and updated the input.operations files
  accordingly.
- Changed invertsc operation to be a non-destructive operation; that is,
  it now takes separate input and output operands. This change applies
  to both the object and typed APIs.
- Defined an alternative square root operation, sqrtrsc, which, when
  operating on complex scalars, assumes the imaginary part of the input
  to be zero.
- Changed the semantics of addm, subm, copym, axpym, scal2m, and xpbym
  so that when the source matrix has an implicit unit diagonal, the
  operation leaves the diagonal of the destination matrix untouched.
  Previously, the operations would interpret an implicit unit diagonal
  on the source matrix as a request to manifest the unit diagonal
  *explicitly* on output (either as something to copy in the case of
  copym, or something to compute with in the cases of addm, subm, axpym,
  scal2m, and xpbym). It turns out that this behavior was too cute by
  half and could cause unintended headaches for practical use cases.
  (This change in behavior also required small modifications to the trmv
  and trsv testsuite modules so that they would properly test matrices
  with unit diagonals.)
- Added missing dependencies for copym to gemv, ger, hemv, trmv, and
  trsv testsuite modules.
- Implemented level-0-like ltsc, ltesc, gtsc, gtesc operations in
  frame/util, which use lt, lte, gt, and gte level-0 scalar macros.
- Implemented bli_acquire_mparts_tl2br() in bli_part.c, which provides
  selected subpartitions of a larger matrix. Also made a trivial
  variable rename in bli_part.c to harmonize with variable naming
  conventions elsewhere in BLIS.
- Because this code was developed against a more recent commit of
  BLIS (bce86b1), which employs const correctness, this commit
  adds -Wno-discarded-qualifiers for gcc, or
  -Wno-incompatible-pointer-types-discards-qualifiers for clang, to
  the list of compiler flags used for all source code. In the case of
  clang, -Wno-unused-but-set-variable is also thrown in just to pacify
  clang's protest of some unused variables in select files.
jay-acosta committed Jan 14, 2023
1 parent 0e4491d commit 02b5acd
Showing 146 changed files with 11,960 additions and 118 deletions.
3 changes: 2 additions & 1 deletion CREDITS
@@ -3,12 +3,13 @@ BLIS framework
Acknowledgements
---

The BLIS framework was primarily authored by
The BLIS framework was originally authored by

Field Van Zee @fgvanzee (The University of Texas at Austin)

but many others have contributed code and feedback, including

Jay Acosta @jay-acosta (Oracle)
Sameer Agarwal @sandwichmaker (Google)
Murtaza Ali (Texas Instruments)
Sajid Ali @s-sajid-ali (Northwestern University)
12 changes: 12 additions & 0 deletions common.mk
@@ -678,6 +678,18 @@ ifeq ($(CC_VENDOR),clang)
CWARNFLAGS += -Wno-tautological-compare -Wno-pass-failed
endif

# Disable discarded qualifier warnings.
# NOTE: This is a temporary hack until the 'ampere' branch can catch up to the
# point in the 'master' branch lineage where const correctness is implemented
# throughout BLIS's higher-level APIs.
ifeq ($(CC_VENDOR),gcc)
CWARNFLAGS += -Wno-discarded-qualifiers
else
ifeq ($(CC_VENDOR),clang)
CWARNFLAGS += -Wno-incompatible-pointer-types-discards-qualifiers -Wno-unused-but-set-variable
endif
endif

$(foreach c, $(CONFIG_LIST_FAM), $(eval $(call append-var-for,CWARNFLAGS,$(c))))

# --- Position-independent code flags (shared libraries only) ---
2 changes: 1 addition & 1 deletion examples/oapi/04level0.c
@@ -166,7 +166,7 @@ int main( int argc, char** argv )
bli_normfsc( &zeta, &alpha );
bli_printm( "alpha := normf( zeta ) # normf() = complex modulus in complex domain.", &alpha, "%4.1f", "" );

bli_invertsc( &gamma );
bli_invertsc( &gamma, &gamma );
bli_printm( "gamma := 1.0 / gamma", &gamma, "%4.2f", "" );


16 changes: 2 additions & 14 deletions frame/0/bli_l0_check.c
@@ -55,20 +55,8 @@ GENFRONT( copysc )
GENFRONT( divsc )
GENFRONT( mulsc )
GENFRONT( sqrtsc )
GENFRONT( sqrtrsc )
GENFRONT( subsc )


#undef GENFRONT
#define GENFRONT( opname ) \
\
void PASTEMAC(opname,_check) \
( \
obj_t* chi \
) \
{ \
bli_l0_xsc_check( chi ); \
}

GENFRONT( invertsc )


Expand Down Expand Up @@ -357,7 +345,7 @@ void bli_l0_xxbsc_check
(
obj_t* chi,
obj_t* psi,
bool* is_eq
bool* is
)
{
err_t e_val;
13 changes: 2 additions & 11 deletions frame/0/bli_l0_check.h
@@ -51,17 +51,8 @@ GENTPROT( copysc )
GENTPROT( divsc )
GENTPROT( mulsc )
GENTPROT( sqrtsc )
GENTPROT( sqrtrsc )
GENTPROT( subsc )


#undef GENTPROT
#define GENTPROT( opname ) \
\
void PASTEMAC(opname,_check) \
( \
obj_t* chi \
);

GENTPROT( invertsc )


Expand Down Expand Up @@ -152,5 +143,5 @@ void bli_l0_xxbsc_check
(
obj_t* chi,
obj_t* psi,
bool* is_eq
bool* is
);
1 change: 1 addition & 0 deletions frame/0/bli_l0_fpa.c
@@ -56,6 +56,7 @@ GENFRONT( mulsc )
GENFRONT( subsc )
GENFRONT( invertsc )
GENFRONT( sqrtsc )
GENFRONT( sqrtrsc )
GENFRONT( unzipsc )
GENFRONT( zipsc )

1 change: 1 addition & 0 deletions frame/0/bli_l0_fpa.h
@@ -50,6 +50,7 @@ GENPROT( mulsc )
GENPROT( subsc )
GENPROT( invertsc )
GENPROT( sqrtsc )
GENPROT( sqrtrsc )
GENPROT( unzipsc )
GENPROT( zipsc )

15 changes: 2 additions & 13 deletions frame/0/bli_l0_ft.h
@@ -37,7 +37,7 @@
// -- Level-0 function types ---------------------------------------------------
//

// addsc, divsc, subsc
// addsc, divsc, subsc, invertsc

#undef GENTDEF
#define GENTDEF( ctype, ch, opname, tsuf ) \
@@ -52,18 +52,6 @@ typedef void (*PASTECH2(ch,opname,tsuf)) \
INSERT_GENTDEF( addsc )
INSERT_GENTDEF( divsc )
INSERT_GENTDEF( subsc )

// invertsc

#undef GENTDEF
#define GENTDEF( ctype, ch, opname, tsuf ) \
\
typedef void (*PASTECH2(ch,opname,tsuf)) \
( \
conj_t conjchi, \
ctype* chi \
);

INSERT_GENTDEF( invertsc )

// mulsc
@@ -118,6 +106,7 @@ typedef void (*PASTECH2(ch,opname,tsuf)) \
);

INSERT_GENTDEF( sqrtsc )
INSERT_GENTDEF( sqrtrsc )

// getsc

33 changes: 1 addition & 32 deletions frame/0/bli_l0_oapi.c
@@ -115,38 +115,6 @@ GENFRONT( addsc )
GENFRONT( divsc )
GENFRONT( mulsc )
GENFRONT( subsc )


#undef GENFRONT
#define GENFRONT( opname ) \
\
void PASTEMAC0(opname) \
( \
obj_t* chi \
) \
{ \
bli_init_once(); \
\
num_t dt = bli_obj_dt( chi ); \
\
conj_t conjchi = bli_obj_conj_status( chi ); \
\
void* buf_chi = bli_obj_buffer_for_1x1( dt, chi ); \
\
if ( bli_error_checking_is_enabled() ) \
PASTEMAC(opname,_check)( chi ); \
\
/* Query a type-specific function pointer, except one that uses
void* for function arguments instead of typed pointers. */ \
PASTECH(opname,_vft) f = PASTEMAC(opname,_qfp)( dt ); \
\
f \
( \
conjchi, \
buf_chi \
); \
}

GENFRONT( invertsc )


@@ -181,6 +149,7 @@ void PASTEMAC0(opname) \
}

GENFRONT( sqrtsc )
GENFRONT( sqrtrsc )


#undef GENFRONT
11 changes: 1 addition & 10 deletions frame/0/bli_l0_oapi.h
@@ -63,17 +63,8 @@ GENPROT( addsc )
GENPROT( divsc )
GENPROT( mulsc )
GENPROT( sqrtsc )
GENPROT( sqrtrsc )
GENPROT( subsc )


#undef GENPROT
#define GENPROT( opname ) \
\
BLIS_EXPORT_BLIS void PASTEMAC0(opname) \
( \
obj_t* chi \
);

GENPROT( invertsc )


24 changes: 22 additions & 2 deletions frame/0/bli_l0_tapi.c
@@ -67,7 +67,8 @@ INSERT_GENTFUNC_BASIC( subsc, subs )
void PASTEMAC(ch,opname) \
( \
conj_t conjchi, \
ctype* chi \
ctype* chi, \
ctype* psi \
) \
{ \
bli_init_once(); \
@@ -76,7 +77,7 @@ void PASTEMAC(ch,opname) \
\
PASTEMAC(ch,copycjs)( conjchi, *chi, chi_conj ); \
PASTEMAC(ch,kername)( chi_conj ); \
PASTEMAC(ch,copys)( chi_conj, *chi ); \
PASTEMAC(ch,copys)( chi_conj, *psi ); \
}

INSERT_GENTFUNC_BASIC( invertsc, inverts )
@@ -176,6 +177,25 @@ void PASTEMAC(ch,opname) \
INSERT_GENTFUNC_BASIC0( sqrtsc )


#undef GENTFUNCR
#define GENTFUNCR( ctype, ctype_r, ch, chr, opname ) \
\
void PASTEMAC(ch,opname) \
( \
ctype* chi, \
ctype* psi \
) \
{ \
bli_init_once(); \
\
const ctype_r chi_r = PASTEMAC(ch,real)( *chi ); \
\
PASTEMAC2(chr,ch,sqrt2s)( chi_r, *psi ); \
}

INSERT_GENTFUNCR_BASIC0( sqrtrsc )


#undef GENTFUNC
#define GENTFUNC( ctype, ch, opname ) \
\
12 changes: 1 addition & 11 deletions frame/0/bli_l0_tapi.h
@@ -51,17 +51,6 @@ INSERT_GENTPROT_BASIC0( addsc )
INSERT_GENTPROT_BASIC0( divsc )
INSERT_GENTPROT_BASIC0( mulsc )
INSERT_GENTPROT_BASIC0( subsc )


#undef GENTPROT
#define GENTPROT( ctype, ch, opname ) \
\
BLIS_EXPORT_BLIS void PASTEMAC(ch,opname) \
( \
conj_t conjchi, \
ctype* chi \
);

INSERT_GENTPROT_BASIC0( invertsc )


@@ -88,6 +77,7 @@ BLIS_EXPORT_BLIS void PASTEMAC(ch,opname) \
);

INSERT_GENTPROT_BASIC0( sqrtsc )
INSERT_GENTPROT_BASIC0( sqrtrsc )


#undef GENTPROT
30 changes: 30 additions & 0 deletions frame/1m/bli_l1m_tapi.c
@@ -83,6 +83,11 @@ void PASTEMAC2(ch,opname,EX_SUF) \
\
/* When the diagonal of an upper- or lower-stored matrix is unit,
we handle it with a separate post-processing step. */ \
/* NOTE: This code was disabled after I realized that when matrix A has the
properties of having a unit diagonal (and being lower or upper stored),
the operation should only read the strictly lower/upper triangle and
leave the diagonal of B untouched. */ \
/*
if ( bli_is_upper_or_lower( uplox ) && \
bli_is_unit_diag( diagx ) ) \
{ \
@@ -99,6 +104,7 @@ void PASTEMAC2(ch,opname,EX_SUF) \
rntm \
); \
} \
*/ \
}

INSERT_GENTFUNC_BASIC( addm, addd )
@@ -148,6 +154,11 @@ void PASTEMAC2(ch,opname,EX_SUF) \
\
/* When the diagonal of an upper- or lower-stored matrix is unit,
we handle it with a separate post-processing step. */ \
/* NOTE: This code was disabled after I realized that when matrix A has the
properties of having a unit diagonal (and being lower or upper stored),
the operation should only read the strictly lower/upper triangle and
leave the diagonal of B untouched. */ \
/*
if ( bli_is_upper_or_lower( uplox ) && \
bli_is_unit_diag( diagx ) ) \
{ \
@@ -169,6 +180,7 @@ void PASTEMAC2(ch,opname,EX_SUF) \
rntm \
); \
} \
*/ \
}

INSERT_GENTFUNC_BASIC0( copym )
@@ -222,6 +234,11 @@ void PASTEMAC2(ch,opname,EX_SUF) \
\
/* When the diagonal of an upper- or lower-stored matrix is unit,
we handle it with a separate post-processing step. */ \
/* NOTE: This code was disabled after I realized that when matrix A has the
properties of having a unit diagonal (and being lower or upper stored),
the operation should only read the strictly lower/upper triangle and
leave the diagonal of B untouched. */ \
/*
if ( bli_is_upper_or_lower( uplox ) && \
bli_is_unit_diag( diagx ) ) \
{ \
@@ -239,6 +256,7 @@ void PASTEMAC2(ch,opname,EX_SUF) \
rntm \
); \
} \
*/ \
}

INSERT_GENTFUNC_BASIC0( axpym )
@@ -311,6 +329,11 @@ void PASTEMAC2(ch,opname,EX_SUF) \
\
/* When the diagonal of an upper- or lower-stored matrix is unit,
we handle it with a separate post-processing step. */ \
/* NOTE: This code was disabled after I realized that when matrix A has the
properties of having a unit diagonal (and being lower or upper stored),
the operation should only read the strictly lower/upper triangle and
leave the diagonal of B untouched. */ \
/*
if ( bli_is_upper_or_lower( uplox ) && \
bli_is_unit_diag( diagx ) ) \
{ \
@@ -331,6 +354,7 @@ void PASTEMAC2(ch,opname,EX_SUF) \
rntm \
); \
} \
*/ \
}

INSERT_GENTFUNC_BASIC0( scal2m )
@@ -448,6 +472,11 @@ void PASTEMAC2(ch,opname,EX_SUF) \
\
/* When the diagonal of an upper- or lower-stored matrix is unit,
we handle it with a separate post-processing step. */ \
/* NOTE: This code was disabled after I realized that when matrix A has the
properties of having a unit diagonal (and being lower or upper stored),
the operation should only read the strictly lower/upper triangle and
leave the diagonal of B untouched. */ \
/*
if ( bli_is_upper_or_lower( uplox ) && \
bli_is_unit_diag( diagx ) ) \
{ \
@@ -465,6 +494,7 @@ void PASTEMAC2(ch,opname,EX_SUF) \
rntm \
); \
} \
*/ \
}

INSERT_GENTFUNC_BASIC0( xpbym )
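
The new unit-diagonal semantics (read only the strictly lower or upper triangle of the source and leave the destination's diagonal untouched) can be illustrated with a plain C copy routine. The following is a minimal sketch, assuming row-major storage and a lower-stored source; copy_lower and its unit_diag flag are hypothetical and are not part of BLIS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of copym for a lower-stored n x n source A.
   If unit_diag is nonzero, A's diagonal is implicit: only the strictly
   lower triangle of A is read, and the diagonal of B is left untouched.
   Otherwise the diagonal is copied along with the rest of the triangle. */
void copy_lower(int n, int unit_diag, const double *a, double *b)
{
    assert(n >= 0 && a != NULL && b != NULL);

    for (int i = 0; i < n; ++i) {
        /* Last column to copy in row i: stop before the diagonal when
           the source has an implicit unit diagonal. */
        const int jmax = unit_diag ? i - 1 : i;

        for (int j = 0; j <= jmax; ++j)
            b[(size_t)i * (size_t)n + (size_t)j] =
                a[(size_t)i * (size_t)n + (size_t)j];
    }
}
```

Under the previous behavior described in the commit message, the unit-diagonal case would also have written ones onto the diagonal of B. In this sketch, a caller who wants that effect has to set the diagonal explicitly in a separate step.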