Releases: ecmwf-ifs/loki
v0.2.10
This is mostly a maintenance release with a set of minor bugfixes and improvements.
Included is a conceptual change to the CMake plan mode in preparation for future improvements: A new plan mode has been added to Transformation objects, which allows encoding dependency graph changes and other build-system relevant transformation steps in a dry-run mode. Instead of deriving the CMake plan heuristically from the Scheduler graph, the plan mode now performs a dry-run of the entire pipeline.
As a consequence, plan mode is no longer supported without a pipeline definition in the config file. Usage without a pipeline definition in a config file has been deprecated since v0.2.8.
What's Changed
- Correct symbols after derived type enrichment by @mlange05 in #450
- resolve_vector_notation: fix/improvement by @MichaelSt98 in #454
- SGraph (re-)build optimisations by @reuterbal in #449
- CMake Plan: Pipeline dry-run and transformation-based plan creation by @reuterbal in #453
- Sanitise: Resolve free range indices when resolving associates by @mlange05 in #455
- extending dependency trafo to access specs by @MichaelSt98 in #436
Full Changelog: v0.2.9...v0.2.10
v0.2.9
What's new
- A new FieldOffloadTransformation was added to inject FIELD API boilerplate code to driver routine (#437)
- do_unroll_loop supports negative loop bounds (#443)
- A number of bugfixes
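As a rough illustration of what negative loop bounds mean for unrolling, the following Python sketch enumerates the iteration values of a Fortran-style loop with inclusive bounds and replicates a one-line body for each of them. It is not Loki's do_unroll_loop; the helper names unrolled_iterations and unroll are hypothetical and exist only for this example.

```python
def unrolled_iterations(start, stop, step=1):
    """Return the iteration values of a Fortran-style DO loop.

    Fortran loop bounds are inclusive, and bounds or step may be negative.
    Hypothetical helper for illustration only; not part of Loki's API.
    """
    if step == 0:
        raise ValueError("loop step must be non-zero")
    # Python's range excludes the end point, so move it one unit past
    # the inclusive bound in the direction of travel.
    end = stop + (1 if step > 0 else -1)
    return list(range(start, end, step))


def unroll(body_template, start, stop, step=1):
    """Replicate a one-line loop body once per iteration value."""
    return [body_template.format(i=i) for i in unrolled_iterations(start, stop, step)]
```

For example, unroll("a({i}) = 0", -1, 1) yields one assignment each for the indices -1, 0 and 1.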
All changes
- Loki: Pin pymbolic version to 2022.2 due to upstream incompatibility by @mlange05 in #434
- Sanitise: New transformation sub-package and some refactoring by @mlange05 in #433
- Transformations: "Parallel" sub-module with driver-level parallelisation utilities by @mlange05 in #415
- JoinableStringList: Do not break lines within quoted strings by @reuterbal in #440
- Sanitise: Update scope on unchanged expressions in remove_associates by @mlange05 in #439
- Continued: F2C/CUDA transpilation by @MichaelSt98 in #424
- DataOffload: Add new FieldOffloadTransformation for FIELD API boilerplate injection by @wertysas in #437
- Expressions: Handle intrinsic function calls by @mlange05 in #416
- Inline: Fix rescoping of intrinsic procedure symbols in elementals by @mlange05 in #445
- DataOffload: fix generation of offload pragmas by @awnawab in #442
- Unroll negative loop bounds and retain pragmas inside unrolled loop body by @awnawab in #443
- Offload: Refactor loki.transformations.data_offload into separate sub-package by @mlange05 in #446
- Loki: Turn test sub-directories into sub-packages by @mlange05 in #447
- Module: Fix enrichment of type info via Module imports by @mlange05 in #448
- CMake plan and enrichment bugfix by @reuterbal in #441
- Fix Linter warning by @reuterbal in #451
- [F2C transpilation] (driver level) convert interface to import by @MichaelSt98 in #422
Full Changelog: v0.2.8...v0.2.9
v0.2.8
This release consists of fixes, refactoring, additions and deprecates a number of outdated or redundant features and APIs.
Deprecations
- The Open Fortran Parser frontend is no longer supported by Loki. It is still available in this release but its use will print deprecation warnings and it is no longer tested in the CI. OFP will be removed from Loki in the next release (see #411 and #406)
- The CLAW compiler is no longer maintained. In order to use the OMNI frontend, the recommended procedure is no longer to install CLAW but to use the latest OMNI compiler frontend directly, e.g., via the --with-omni flag in the install script (see #408 and #406)
- The ability to use loki-transform.py without a config file is deprecated and will be removed in the next release. The config file is far superior when parameterising transformation pipelines and the config file can easily be versioned together with the code it is meant to transform. See the CLOUDSC config file for an example (#429)
- The Maxeler transpilation module has been removed (#405)
What's new
- The handling of typed symbols for expressions moves closer to their corresponding scopes, with a new convenience API introduced in #375
- The CLOUDSC2 mini-app, a simplified cloud microphysics scheme with tangent-linear and adjoint code paths, is now part of the regression test suite (#230)
- A new vertical loop fusion transformation has been added that is guided via in-source annotations (#374)
- f90wrap has been updated to 0.2.15+, which restores compatibility with Numpy 2.0+ (#407)
- Pragma-guided high-order loop transformations, such as fusion, fission, interchange or unrolling, can now be triggered via a single transformation (#430)
All changes
- IR: Move expr_visitors to loki.ir by @mlange05 in #372
- Improve on multiconditionals/switch/select case by @MichaelSt98 in #384
- Transpilation: optional arguments by @MichaelSt98 in #385
- Fix edge case for vector section mapping by @MichaelSt98 in #382
- IR: Symbol management on scoped nodes by @mlange05 in #375
- Extend 'resolve_vector_notation' to look for available and appropriate loops by @MichaelSt98 in #386
- Fix representation of array return type in OMNI frontend by @reuterbal in #391
- SingleColumn: Fix vectorisation of nested else-if bodies by @mlange05 in #392
- Inline functions (including multi-line and non-elemental functions) by @MichaelSt98 in #378
- handle modulo operator/function for c-like-backends by @MichaelSt98 in #383
- Transformations: ResolveAssociateTransformer re-write to in-place substitution by @mlange05 in #387
- Transformations: Remove routine pragmas when inlining functions by @mlange05 in #395
- Utility to remove duplicate arguments for calls and callees by @MichaelSt98 in #367
- Pytest CLI option for log-level by @reuterbal in #396
- Logging: Small log-level sanitisation and CLI flags by @mlange05 in #394
- Transformations: Re-organise inline and extract sub-packages by @mlange05 in #376
- Vertical loop fusion and demotion of temporaries by @MichaelSt98 in #374
- Skip privatization of arrays with existing data declarations by @awnawab in #389
- Update f90wrap to 0.2.15 as minimum to ensure compatibility with numpy 2.0+ by @reuterbal in #407
- Remove Maxeler transpilation module by @reuterbal in #405
- Fix Scheduler instantiation without config (fix #373) by @reuterbal in #403
- Fix SccAnnotate when existing acc pragmas declare a copy category more than once by @reuterbal in #409
- Improve representation of procedure pointers (fix #393) by @reuterbal in #399
- Add option to install "plain OMNI" to install script and upgrade Github actions runners by @reuterbal in #408
- Fix function inlining when only interface is available (fixes #397) by @reuterbal in #402
- Regression test for CLOUDSC2 by @reuterbal in #230
- Remove duplicate declarations for external statements (fix #57) by @reuterbal in #404
- Utilities to merge associate blocks and restrict depth of associate resolution by @mlange05 in #388
- Prevent superfluous clone of loki in ecwam regression test by @awnawab in #410
- Inline elemental functions: skip calls with args being array (slices) by @MichaelSt98 in #401
- Frontend: Deprecate OFP and purge from test base by @mlange05 in #411
- CMake/python_venv: Do not request COMPONENT Development by @reuterbal in #413
- Dimension: Support stepping, implicit aliases and remove constraints by @mlange05 in #414
- Handle Loki dimension pragmas for modules (and not only routines) for FP by @MichaelSt98 in #417
- Allow for optional case-sensitive 'recursive_expression_map_update' by @MichaelSt98 in #418
- extend 'remove_explicit_array_dimensions' by @MichaelSt98 in #421
- C-like-backends: skip/don't write Fortran interfaces by @MichaelSt98 in #423
- Make builddir a runtime argument of FileWriteTransformation by @awnawab in #425
- Loki-transform: Add deprecation message about custom entry points by @mlange05 in #429
- Expression: Expression cloning and mapper tests by @mlange05 in #419
- Extract: Improved region-outlining for complex procedures by @mlange05 in #412
- IR: Fix false "end" matches in pragma_regions_attached utility by @mlange05 in #431
- Transformation to call loop transform utilities by @awnawab in #430
- Bump version number to 0.2.8 by @reuterbal in #432
Full Changelog: v0.2.7...v0.2.8
v0.2.7
What's New
- Experimental Fortran-to-CUDA transpilation demonstrated on CLOUDSC (#328)
- A new SplitReadWriteTransformation that allows user-guided GPU optimisation to make loads independent from stores (#329)
- A new LowerConstantArrayIndices transformation to pass full arrays instead of constant slices in kernel calls (#348)
- New transformation utilities to introduce loop blocking for driver loops (#362)
- A new string-based substitution mechanism for expressions (#366)
- Refactoring of SCC tests (#353) and transformation utilities (#354)
- And many small improvements and bug fixes (see below)
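The idea behind loop blocking can be sketched in a few lines of Python: a single loop over an index range is replaced by an outer loop over blocks and an inner loop within each block. This is a generic illustration under assumed names (block_ranges, blocked_iterations), not the utilities added in #362.

```python
def block_ranges(start, stop, block_size):
    """Split the inclusive range [start, stop] into consecutive blocks.

    Returns a list of (block_start, block_end) pairs with inclusive bounds;
    the last block may be shorter. Hypothetical helper, not Loki's API.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for block_start in range(start, stop + 1, block_size):
        # Clamp the final block so it never runs past the original bound.
        block_end = min(block_start + block_size - 1, stop)
        blocks.append((block_start, block_end))
    return blocks


def blocked_iterations(start, stop, block_size):
    """Visit indices in blocked order: outer loop over blocks, inner loop within a block."""
    return [
        i
        for block_start, block_end in block_ranges(start, stop, block_size)
        for i in range(block_start, block_end + 1)
    ]
```

The blocked traversal visits exactly the same indices in the same order as the original loop; only the loop structure changes, which is what makes the outer block loop a convenient unit for further driver-level treatment.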
All Changes
- IR: Automatic sanitisation of tuples in IR constructors by @mlange05 in #350
- Run pytest on macos in GH actions by @reuterbal in #262
- SCC test reshuffle by @mlange05 in #353
- Transformations: Move common SCC utility routines to utilities by @mlange05 in #354
- Transformations: Test and fix corner case in get_local_arrays by @mlange05 in #355
- Tools: Disable timeout utility test on MacOS due to sporadic failures by @mlange05 in #356
- Fixed logical evaluation of PRESENT intrinsics on Array variables by @JoeffreyLegaux in #341
- ecWAM regression tests: switch to develop-1.3 branch by @awnawab in #358
- Split reads and writes for certain accumulation patterns by @awnawab in #329
- fix for 'resolve_vector_notation' utility by @MichaelSt98 in #361
- Transformations: Internalise IdemTransformation by @mlange05 in #360
- New transformation 'LowerConstantArrayIndices' to allow to … by @MichaelSt98 in #348
- OMNI: Fix dimension range-indexing in frontend by @mlange05 in #363
- Loki-transform: Pass cuf option to FilewriteTrafo by @mlange05 in #364
- Filter out globals in get_local_arrays by @awnawab in #370
- extend hoist variables functionality by @MichaelSt98 in #357
- Change/fix pipeline for mode 'scc-raw-stack' by @MichaelSt98 in #371
- Minimal padding in pool allocator by @awnawab in #365
- CLOUDSC low-level GPU (transpilation) via Loki (CUF/CUDA) by @MichaelSt98 in #328
- Loop splitting/blocking of block loops by @wertysas in #362
- String-based expression substitution and moar expression tests! by @mlange05 in #366
- SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate by @mlange05 in #359
- Update VERSION to 0.2.7 by @reuterbal in #381
Full Changelog: v0.2.6...v0.2.7
v0.2.6
This is a minor release with a number of housekeeping changes and some new features.
What's new
- We had a dependency on the Pydantic 1.x releases until now, and this release adds support for Pydantic 2. The next release will require Pydantic 2. (#349)
- The InlineTransformation now allows inlining statement functions (#345)
- A new LoopUnrollTransformation allows explicitly unrolling pragma-annotated loops (#347)
- Loki IR now supports the FORALL statement and construct. However, this feature is only fully supported with the Fparser2 frontend (#210)
- Cray pointers are now represented in the Loki IR as Intrinsic nodes (#342)
- Python package installation now also works correctly from tarballs and other non-git versioned installation sources (#344)
- The test base has been cleaned up: all regression tests now use publicly available source branches, and all tests should now create temporary files in test-local temporary directories to avoid littering the source tree (#335, #343)
All changes
- Add support of the FORALL statement and construct (fparser/fgen) by @quepas in #210
- Rigorous use of tmp_path in tests by @reuterbal in #343
- DrHookTransformation: Add explicit label renaming by @mlange05 in #346
- Support for representing cray pointers using OFP or FP (fixes #338) by @reuterbal in #342
- Housekeeping on CMakeLists.txt and pyproject.toml by @reuterbal in #344
- BlockIndexInject: Exclude non-target calls from arg_map (fixes #336) by @reuterbal in #337
- Allow inlining of Statement Functions by @MichaelSt98 in #345
- IR: Update to Pydantic >2.0 compatibility by @mlange05 in #349
- Fix DEV_ALLOC_SIZE for ecwam regression and add SCC-HOIST variant by @awnawab in #351
- Update ecwam regression tests to use develop branch by @reuterbal in #335
- Add loop unroll transformation by @Andrew-Beggs-ECMWF in #347
New Contributors
- @Andrew-Beggs-ECMWF made their first contribution in #347
Full Changelog: v0.2.5...v0.2.6
v0.2.5
A minor release adding new transformations and fixing issues in the frontends, handling of derived types, dataflow analysis and transformations.
What's New
- A general BlockIndexInjectTransformation that injects the block-index into all array subscripts that have a local rank one less than their declared rank (#303)
- A corresponding, IFS-specific BlockViewToFieldViewTransformation to replace per-block view pointers with full field pointers (#303)
- A new SCCRawStackPipeline that uses a pool-allocator variant where each use of temporaries is replaced with fixed offsets into a pre-allocated scratch memory (#314, incorporating #201 by @rolfhm)
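The rank rule behind block-index injection can be illustrated with a small Python sketch. It is a toy model, not the BlockIndexInjectTransformation itself: the function name, the block-index name "ibl", and the choice to append the block index as the trailing dimension are all assumptions made for this example.

```python
def inject_block_index(subscripts, declared_rank, block_index="ibl"):
    """Append a block index to a subscript tuple that is one rank short.

    `subscripts` is a tuple of index strings as used in the code and
    `declared_rank` is the rank from the array declaration. The name
    "ibl" and the trailing position are assumptions for illustration.
    """
    if len(subscripts) == declared_rank - 1:
        return tuple(subscripts) + (block_index,)
    # Fully indexed or otherwise mismatching accesses are left untouched.
    return tuple(subscripts)
```

An access such as ("jl", "jk") to an array declared with rank 3 becomes ("jl", "jk", "ibl"), while an access that already carries three subscripts is returned unchanged.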
All Changes
- Block-index injection transformations by @awnawab in #303
- Fix parse failures with REGEX frontend due to white space in declarations by @reuterbal in #323
- DataFlowAnalysis bug fixes by @awnawab in #320
- Fix derived type inheritance when parent type is not available (#330) by @reuterbal in #331
- InlineTransformation: Update Scheduler SGraph if marked_inline is activated by @awnawab in #322
- HoistVariablesAnalysis: remove unused explicit interfaces after inlining by @awnawab in #319
- Fix Linter warnings for inline calls with interface block imported from header with func.h suffix by @reuterbal in #332
- Add transformation generated imports to driver or after inlining by @awnawab in #321
- Fix wrong classification as StatementFunction in translation to Loki IR by @reuterbal in #327
- get_pragma_parameters: Fix parsing clauses without parentheses in the tail string by @reuterbal in #324
- ProgramUnit.resolve_typebound_var: raise error if top-level parent is not declared by @reuterbal in #325
- Transformations: SCCRawStackPipeline and SCC config-from-file by @mlange05 in #314
Full Changelog: v0.2.4...v0.2.5
v0.2.4
This is a minor maintenance release matching the declaration of Hybrid 2024 Milestone 1.
What's Changed
- Repo reorganisation: Moving transformations by @mlange05 in #296
- Fix: import of private symbols affects the type inference by @quepas in #308
- JIT compilation updates and compatibility with f90wrap v0.2.14 by @reuterbal in #315
- IR: Fix get_pragma_params for multiline pragmas by @mlange05 in #313
- Transformations: Remap declaration symbols and adjust imports when inlining by @mlange05 in #311
- Docs: Update to links from static doc pages by @mlange05 in #312
Full Changelog: v0.2.3...v0.2.4
v0.2.3
This is a minor bugfix/maintenance release to resolve some issues around the Loki installation and version number discovery, particularly when installing from a code version that is not under Git version control.
What's Changed
- Fix installation without git checkout by @reuterbal in #302
- Fetch tags in Github workflows by @reuterbal in #305
- Update version number to 0.2.3 by @reuterbal in #306
Full Changelog: v0.2.2...v0.2.3
v0.2.2
This is a feature and bugfix release, which adds new functionality and resolves a number of problems.
What's New
- Loki supports a new, streamlined way of composing transformation pipelines from individual Transformation classes. Transformation arguments are shared among transformations, ensuring consistency, e.g., for Dimension parameters. Pipelines and transformation arguments can even be constructed purely from the config file, which will become the default for the loki-transform.py convert command in the future. See #217 for more details on how this works.
- The pool allocator transformation has a new option to improve compatibility with Cray Compiler Environment 16 on AMD platforms. For that, the pointer arithmetic is removed and LOC calls are used directly in the kernel to determine the offset of a temporary in the scratch allocation. See #231 for more details.
- A new RemoveCodeTransformation has been added, replacing the RemoveCallsTransformation and incorporating the dead code removal. Additionally, it provides a new feature to remove pragma-annotated code sections via !$loki remove / !$loki end remove (#276).
- Loki's JIT functionality that is used to build and run tests has been amended so that it honours environment variables and no longer depends on gfortran exclusively. Instead, environment variables CC, FC, F90, and LD are inspected to determine the compile commands to use, and CFLAGS, FCFLAGS, F90FLAGS, and LDFLAGS can be used to set corresponding flags. Default values are provided for GNU and NVHPC compilers. With this, it is now possible to run the test suite also on MacOS after installing gcc and gfortran (e.g., via Homebrew), and setting the environment variables accordingly. Note that Numpy's F2PY, which is used to call Fortran routines from the Python test base, also works with non-GNU compilers (e.g., NVHPC) but requires gcc to compile the C interface routines. Also, not all tests are compatible with NVHPC and test failures are a known issue that will be resolved in the future (#301). See #294 for more details.
- The parse_expr utility's functionality has been expanded to support derived types and now underpins the get_pragma_parameters utility, providing a vastly expanded functionality for expressions in pragma annotations (#292).
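The environment-driven compiler selection described for the JIT functionality could be sketched as follows. Only the environment variable names are taken from the note above; the function resolve_compile_settings and the GNU fallback values are assumptions for illustration and do not reproduce Loki's actual compiler classes.

```python
import os


def resolve_compile_settings(env=None):
    """Pick compiler commands and flags from environment variables.

    The variable names follow the release note; the fallback values and
    this function itself are assumptions for illustration, not Loki's code.
    """
    env = os.environ if env is None else env
    # Assumed GNU fallbacks, used when a variable is not set.
    defaults = {"CC": "gcc", "FC": "gfortran", "F90": "gfortran", "LD": "gfortran"}
    settings = {name: env.get(name, default) for name, default in defaults.items()}
    for name in ("CFLAGS", "FCFLAGS", "F90FLAGS", "LDFLAGS"):
        # Flags are split on whitespace; an unset variable yields no flags.
        settings[name] = env.get(name, "").split()
    return settings
```

Passing a plain dictionary instead of the process environment, e.g. resolve_compile_settings({"FC": "nvfortran", "FCFLAGS": "-O2 -g"}), makes the lookup easy to try out without modifying the shell environment.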
What's Changed
- [CMake] Expose GLOBAL_VAR_OFFLOAD and INCLUDES in loki_transform_target by @awnawab in #264
- Preserve imported statement functions by @awnawab in #251
- Fix codecov by adding CODECOV_TOKEN by @reuterbal in #278
- cgen: multiconditional/switch/select case statement by @MichaelSt98 in #267
- Introducing the Pipeline class by @mlange05 in #260
- Alternative stack/pool allocator implementation based on Cray pointers compatible with Cray+AMD stack by @MichaelSt98 in #231
- improved replace_intrinsics and added rename_variables by @MichaelSt98 in #266
- Revert "DEPENDENCY TRAFO: statement functions included via c-style imports preserved" (#251) by @reuterbal in #282
- cgen: return type and var for function(s) by @MichaelSt98 in #269
- Pipeline configuration from file by @mlange05 in #271
- Fixing nested associate scope-parentage tracking after inlining by @mlange05 in #281
- F2C: DeReferenceTrafo by @MichaelSt98 in #273
- REGEX frontend: white space and nesting bugfix by @reuterbal in #274
- Preserve import statement functions - take II by @awnawab in #283
- Skip driver routine in GlobalVariableAnalysis by @awnawab in #265
- MaskedTransformer: Fix in-place rebuilding of scoped nodes by @mlange05 in #284
- Avoid variable_map in TypedSymbol.get_derived_type_member and verify type information is derived correctly by @reuterbal in #285
- SCCHoist: hoist inline call temporaries and don't hoist statically declared arrays by @awnawab in #268
- Pool allocator: correctly resolve derived type member as block dimension and ignore pointer/allocatable arrays by @awnawab in #249
- Marked region removal and general code removal transformation by @mlange05 in #276
- SCC: make vertical dimension optional by @awnawab in #270
- SCCBaseTransformation.get_integer_variable now also checks module imports by @awnawab in #279
- Improve performance of pragma-region attach/detach by using transformers by @mlange05 in #286
- Reorganising test directories by @mlange05 in #287
- [Bugfix] available_frontends: Import pytest locally to make dependency optional by @reuterbal in #290
- DataflowAnalysis bugfix: preserve body nesting in visit_MaskedStatement by @awnawab in #288
- Loki expression parser based on pymbolic parser by @MichaelSt98 in #272
- F2C: optional case-sensitivity for variables/symbols by @MichaelSt98 in #277
- Transformation to hoist temporaries in kernel language transpilation by @MichaelSt98 in #291
- fix scoping for global var hoisting by @MichaelSt98 in #293
- SCC: Support for bounds aliases and derived type members as bounds by @awnawab in #250
- Consistent, environment-configurable use of Compiler class in JIT compilation by @reuterbal in #294
- Derived-type inheritance by @awnawab in #295
- Improve parse_expr and use in process_dimension_pragmas by @MichaelSt98 in #292
Full Changelog: v0.2.1...v0.2.2
v0.2.1
This is a bugfix release that contains a number of small fixes in transformations and Scheduler.
What's New
- Utility methods have been added to CallStatement, which simplify inspecting, validating and converting keyword-arguments to positional arguments (see #235)
- The batch-processing module loki.bulk has been renamed to loki.batch
What's Changed
- kwargs utilities by @MichaelSt98 in #235
- Allow to ignore specific dimensions in "shift to zero indexing" by @MichaelSt98 in #236
- Add 'reverse_traversal=True' to DerivedTypeArgumentsTransformation manifest by @MichaelSt98 in #238
- Create a pid-specific temporary directory and clean it up at the end by @reuterbal in #261
- SCC-HOIST: Hoist variables as kwargs (optionally) by @MichaelSt98 in #237
- GlobalVarHoistTransformation: fix for functions/inline calls by @MichaelSt98 in #240
- Support colon notation for all dimensions in flatten_arrays by @MichaelSt98 in #239
- Small CMake layer fixes for SL by @awnawab in #248
- Rename bulk -> batch and create ir sub-package by @mlange05 in #258
- SingleColumn: Demote arrays that are not used at all in the body by @mlange05 in #259
- Scheduler: Fix handling of external module procedures by @reuterbal in #263
Full Changelog: v0.2.0...v0.2.1