Skip to content

Releases: wu-kan/HPL-AI

HPL-AI Version 2.3d

24 Apr 01:13
Compare
Choose a tag to compare

-- [M] testing/ptest/HPLAI_pdinfo.cc

Kan Wu added support for AscendCL, and added a new strategy
to enable GEMM to use CPU and GPU/NPU at the same time:
-- [M] src/pgesv/HPLAI_blas.cc

Kan Wu added added a compilation option BLAS_ILP64 to better
support ILP64/32 BLAS:
-- [M] src/pgesv/HPLAI_pdgesv.cc

HPL-AI Version 2.3c

18 Apr 19:50
d059487
Compare
Choose a tag to compare

Kan Wu fixed the bug when n > 65536.
-- [M] src/pgesv/HPLAI_pdgesv.cc

Kan Wu added an option to use the half-precision computeType
in cublasGemmEx (cuda@11: is required):
-- [M] src/blas/HPLAI_blas.cc

Kan Wu renamed the executable "xhplai" to "xhpl_ai", to be
synced to HPL-AI-NVIDIA v1.0.0 in nvidia:hpc-benchmarks:
-- [M] src/Makefile.am
-- [M] testing/Makefile.am

Kan Wu release the software at
https://github.com/SYSU-SCC/sysu-scc-spack-repo:
-- [M] testing/ptest/HPLAI_pdinfo.cc

HPL-AI Version 2.3b

24 Mar 07:01
5b2f262
Compare
Choose a tag to compare
HPL-AI Version 2.3b Pre-release
Pre-release

Kan Wu optimized the device blaspp routines through the
strategy of buffering memory: the program will check whether
the buffer size is sufficient at runtime, free and reallocate
the buffer when necessary. Therefore, it is recommended to
run the benchmark twice with the same parameters to obtain
higher computation rate:
-- [M] src/blas/HPLAI_blas.cc
-- [M] testing/ptest/HPLAI_pdinfo.cc

HPL-AI Version 2.3a

14 Mar 09:03
Compare
Choose a tag to compare
HPL-AI Version 2.3a Pre-release
Pre-release

Junkang Huang added the parallel GMRES and matrix construction
part (https://github.com/schuangs/hpl-ai-with-IR) for HPL-AI.
Note that the GMRES did not check the backward-error, which may
cause the result to be invalid.

Kan Wu reimplemented HPL-AI with some bug fixes and some minor
optimizations. The C souce code of hpl-2.3 remained the same,
while took advantages of modern C++ features such as templates,
overloading, and GPU backends via blaspp.