v0.2.2
This is a feature and bugfix release, which adds new functionality and resolves a number of problems.
What's New
- Loki supports a new, streamlined way of composing transformation pipelines from individual Transformation classes. Transformation arguments are shared among transformations, ensuring consistency, e.g., for Dimension parameters. Pipelines and transformation arguments can even be constructed purely from the config file, which will become the default for the loki-transform.py convert command in the future. See #217 for more details on how this works.
- The pool allocator transformation has a new option to improve compatibility with Cray Compiler Environment 16 on AMD platforms. With this option, the pointer arithmetic is removed and LOC calls are used directly in the kernel to determine the offset of a temporary in the scratch allocation. See #231 for more details.
- A new RemoveCodeTransformation has been added, replacing the RemoveCallsTransformation and incorporating the dead code removal. Additionally, it provides a new feature to remove pragma-annotated code sections via !$loki remove / !$loki end remove (#276).
- Loki's JIT functionality that is used to build and run tests has been amended so that it honours environment variables and no longer depends on gfortran exclusively. Instead, the environment variables CC, FC, F90, and LD are inspected to determine the compile commands to use, and CFLAGS, FCFLAGS, F90FLAGS, and LDFLAGS can be used to set the corresponding flags. Default values are provided for GNU and NVHPC compilers. With this, it is now possible to run the test suite on macOS as well, after installing gcc and gfortran (e.g., via Homebrew) and setting the environment variables accordingly. Note that NumPy's F2PY, which is used to call Fortran routines from the Python test base, also works with non-GNU compilers (e.g., NVHPC) but requires gcc to compile the C interface routines. Also, not all tests are compatible with NVHPC, and test failures are a known issue that will be resolved in the future (#301). See #294 for more details.
- The parse_expr utility's functionality has been expanded to support derived types, and it now underpins the get_pragma_parameters utility, providing vastly expanded functionality for expressions in pragma annotations (#292).
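The idea of sharing arguments across a pipeline can be illustrated with a small, self-contained sketch. This is not Loki's implementation: the classes DemoteTransformation and HoistTransformation, the Pipeline stand-in, and the argument names horizontal and block_dim are all hypothetical and only show how one set of keyword arguments might be routed to each transformation based on its constructor signature.

```python
import inspect


class DemoteTransformation:
    """Hypothetical transformation that needs only a horizontal dimension."""

    def __init__(self, horizontal):
        self.horizontal = horizontal

    def apply(self, source):
        return source + [f"demote({self.horizontal})"]


class HoistTransformation:
    """Hypothetical transformation that needs two shared arguments."""

    def __init__(self, horizontal, block_dim):
        self.horizontal = horizontal
        self.block_dim = block_dim

    def apply(self, source):
        return source + [f"hoist({self.horizontal}, {self.block_dim})"]


class Pipeline:
    """Illustrative stand-in: instantiates each class from one shared argument set."""

    def __init__(self, classes, **kwargs):
        self.transformations = []
        for cls in classes:
            # Pass only the shared arguments that this constructor accepts
            accepted = inspect.signature(cls).parameters
            args = {name: value for name, value in kwargs.items() if name in accepted}
            self.transformations.append(cls(**args))

    def apply(self, source):
        # Run the transformations in the order they were given
        for transformation in self.transformations:
            source = transformation.apply(source)
        return source


pipeline = Pipeline(
    classes=(DemoteTransformation, HoistTransformation),
    horizontal="jl",
    block_dim="ibl",
)
result = pipeline.apply([])
print(result)  # ['demote(jl)', 'hoist(jl, ibl)']
```

Because both transformations receive the same horizontal value from a single place, they cannot drift apart, which is the consistency benefit described above.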
What's Changed
- [CMake] Expose GLOBAL_VAR_OFFLOAD and INCLUDES in loki_transform_target by @awnawab in #264
- Preserve imported statement functions by @awnawab in #251
- Fix codecov by adding CODECOV_TOKEN by @reuterbal in #278
- cgen: multiconditional/switch/select case statement by @MichaelSt98 in #267
- Introducing the Pipeline class by @mlange05 in #260
- Alternative stack/pool allocator implementation based on Cray pointers compatible with Cray+AMD stack by @MichaelSt98 in #231
- improved replace_intrinsics and added rename_variables by @MichaelSt98 in #266
- Revert "DEPENDENCY TRAFO: statement functions included via c-style imports preserved" (#251) by @reuterbal in #282
- cgen: return type and var for function(s) by @MichaelSt98 in #269
- Pipeline configuration from file by @mlange05 in #271
- Fixing nested associate scope-parentage tracking after inlining by @mlange05 in #281
- F2C: DeReferenceTrafo by @MichaelSt98 in #273
- REGEX frontend: white space and nesting bugfix by @reuterbal in #274
- Preserve import statement functions - take II by @awnawab in #283
- Skip driver routine in GlobalVariableAnalysis by @awnawab in #265
- MaskedTransformer: Fix in-place rebuilding of scoped nodes by @mlange05 in #284
- Avoid variable_map in TypedSymbol.get_derived_type_member and verify type information is derived correctly by @reuterbal in #285
- SCCHoist: hoist inline call temporaries and don't hoist statically declared arrays by @awnawab in #268
- Pool allocator: correctly resolve derived type member as block dimension and ignore pointer/allocatable arrays by @awnawab in #249
- Marked region removal and general code removal transformation by @mlange05 in #276
- SCC: make vertical dimension optional by @awnawab in #270
- SCCBaseTransformation.get_integer_variable now also checks module imports by @awnawab in #279
- Improve performance of pragma-region attach/detach by using transformers by @mlange05 in #286
- Reorganising test directories by @mlange05 in #287
- [Bugfix] available_frontends: Import pytest locally to make dependency optional by @reuterbal in #290
- DataflowAnalysis bugfix: preserve body nesting in visit_MaskedStatement by @awnawab in #288
- Loki expression parser based on pymbolic parser by @MichaelSt98 in #272
- F2C: optional case-sensitivity for variables/symbols by @MichaelSt98 in #277
- Transformation to hoist temporaries in kernel language transpilation by @MichaelSt98 in #291
- fix scoping for global var hoisting by @MichaelSt98 in #293
- SCC: Support for bounds aliases and derived type members as bounds by @awnawab in #250
- Consistent, environment-configurable use of Compiler class in JIT compilation by @reuterbal in #294
- Derived-type inheritance by @awnawab in #295
- Improve parse_expr and use in process_dimension_pragmas by @MichaelSt98 in #292
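As a rough illustration of the environment-configurable compiler selection from #294, the sketch below resolves compile commands and flags from the environment and falls back to defaults. It is not Loki's Compiler class: the helper names compiler_settings and fortran_compile_command are hypothetical, and the default values (GNU executables, empty flags) are placeholder assumptions rather than the defaults Loki ships.

```python
import os
import shlex

# Placeholder defaults, assumed for illustration only
DEFAULTS = {
    "CC": "gcc",
    "FC": "gfortran",
    "F90": "gfortran",
    "LD": "gfortran",
    "CFLAGS": "",
    "FCFLAGS": "",
    "F90FLAGS": "",
    "LDFLAGS": "",
}


def compiler_settings(env=None):
    """Look up each variable in the environment, falling back to the default."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


def fortran_compile_command(source, obj, env=None):
    """Assemble a compile-only command line for a free-form Fortran source file."""
    cfg = compiler_settings(env)
    flags = shlex.split(cfg["F90FLAGS"])  # split the flag string like a shell would
    return [cfg["F90"], "-c", *flags, source, "-o", obj]


# Example: select NVHPC's Fortran compiler via the environment
cmd = fortran_compile_command(
    "kernel.F90", "kernel.o", env={"F90": "nvfortran", "F90FLAGS": "-O2 -g"}
)
print(cmd)  # ['nvfortran', '-c', '-O2', '-g', 'kernel.F90', '-o', 'kernel.o']
```

Passing the environment as an argument keeps the lookup easy to test; calling the helpers without env reads the real process environment instead.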
Full Changelog: v0.2.1...v0.2.2