Improve performance of pragma-region attach/detach by using transformers #286

mlange05 · 2024-04-12T03:55:44Z

A subtle but significant performance bug meant that attaching/detaching pragma-regions was becoming very expensive for large code regions, e.g. when finding marked regions to remove (PR #276). This PR fixes this by re-implementing the matching and resolution of PragmaRegion objects with custom in-place transformers that perform the matching at tuple level.
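To illustrate what matching at tuple level means, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Loki's classes: `Pragma`, `Region` and `attach_regions` are hypothetical stand-ins, and pragma content is simplified to `"<marker>"` / `"end <marker>"`. It pairs start and end pragmas that sit in the same tuple in a single pass and warns about an end pragma that has no partner in that tuple.

```python
from dataclasses import dataclass
import warnings


@dataclass(frozen=True)
class Pragma:
    """Hypothetical stand-in for a pragma node: "remove" opens, "end remove" closes."""
    content: str


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a pragma region wrapping a body tuple."""
    pragma: Pragma
    body: tuple
    pragma_post: Pragma


def _collapse(frames, depth):
    """Fold every frame above `depth` back into its parent as plain nodes."""
    while len(frames) > depth + 1:
        start, collected = frames.pop()
        frames[-1][1].extend([start, *collected])


def attach_regions(nodes):
    """Pair start/end pragmas found in one tuple; nodes are visited once."""
    frames = [(None, [])]  # frames[0] collects the top-level output
    for node in nodes:
        if isinstance(node, Pragma) and node.content.startswith('end '):
            marker = node.content[4:].strip()
            # Look for the innermost open start pragma with the same marker
            match = next(
                (i for i in range(len(frames) - 1, 0, -1)
                 if frames[i][0].content == marker),
                None,
            )
            if match is None:
                # No partner in this tuple: warn and keep the pragma as-is
                warnings.warn(f'Unmatched end pragma: {node.content}')
                frames[-1][1].append(node)
            else:
                # Open pragmas above the match never closed: keep them plain
                _collapse(frames, match)
                start, body = frames.pop()
                frames[-1][1].append(Region(start, tuple(body), node))
        elif isinstance(node, Pragma):
            frames.append((node, []))  # tentatively open a region
        else:
            frames[-1][1].append(node)
    # Start pragmas that never found an end stay ordinary pragmas
    _collapse(frames, 0)
    return tuple(frames[0][1])
```

In this sketch a start pragma without a matching end is left in place as an ordinary node, so standalone pragmas are unaffected.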

One important consideration here is that we now only match pragma pairs at tuple level, and thus have an implicit check for misplaced `!$loki end <marker>` pragmas, which will now trigger warnings. We also need to be very careful about tuple insertion when resolving the region objects during detach, as we do not want nested tuples in our IR tree, but cannot use flatten on tuples generically. This is marked via comments in the respective places.
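The tuple-insertion concern during detach can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `Region` and `detach_regions` are hypothetical names, and nodes are plain strings. The point is that a resolved region is spliced into its parent tuple by exactly one level, so no nested tuple is introduced, while entries that happen to be tuples themselves are left untouched, which a generic recursive flatten would not guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a pragma region (not Loki's class)."""
    pragma: str
    body: tuple
    pragma_post: str


def detach_regions(nodes):
    """Replace each Region by (pragma, *body, pragma_post), spliced in place."""
    out = []
    for node in nodes:
        if isinstance(node, Region):
            # Splice the region's parts directly into the parent sequence;
            # nested regions in the body are resolved the same way first.
            out.append(node.pragma)
            out.extend(detach_regions(node.body))
            out.append(node.pragma_post)
        else:
            # Anything else, including tuple-valued entries, is kept verbatim
            out.append(node)
    return tuple(out)
```

Appending the resolved parts with `extend` rather than `append` is what avoids the nested tuple; no post-hoc flattening step is needed.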

And finally, some anecdotal speed-up evidence using an experimental ec-physics inlined control flow routine:
Before:

(loki_env) $ loki-ecphys-gen.py inline --source ./input  --build .
[Loki::Sourcefile] Constructed from input/ec_phys_drv.F90 in 0.89s
[Loki::Sourcefile] Constructed from input/ec_phys.F90 in 1.37s
[Loki::Sourcefile] Constructed from input/callpar.F90 in 2.56s
[Loki] Inlined EC_PHYS in 2.78s
[Loki] Inlined CALLPAR in 4.76s
[Loki] Remove marked regions in 136.73s   <= this uses `with pragma_regions_attached`

After:

(loki_env) $ loki-ecphys-gen.py inline --source ./input  --build .
[Loki::Sourcefile] Constructed from input/ec_phys_drv.F90 in 0.88s
[Loki::Sourcefile] Constructed from input/ec_phys.F90 in 1.48s
[Loki::Sourcefile] Constructed from input/callpar.F90 in 2.53s
[Loki::EC-Physics] Inlined EC_PHYS in 2.78s
[Loki::EC-Physics] Inlined CALLPAR in 4.90s
[Loki::EC-Physics] Remove marked regions in 2.62s

github-actions · 2024-04-12T03:58:27Z

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/286/index.html

codecov · 2024-04-12T04:38:37Z

Codecov Report

Attention: Patch coverage is 94.11765%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 92.87%. Comparing base (a594b44) to head (024018b).

| Files | Patch % | Lines |
|---|---|---|
| loki/pragma_utils.py | 94.11% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
- Coverage   92.88%   92.87%   -0.01%     
==========================================
  Files         102      102              
  Lines       18253    18269      +16     
==========================================
+ Hits        16954    16968      +14     
- Misses       1299     1301       +2

| Flag | Coverage Δ | |
|---|---|---|
| lint_rules | `96.39% <ø> (ø)` | |
| loki | `92.84% <94.11%> (-0.01%)` | ⬇️ |
| transformations | `92.22% <ø> (ø)` | |


reuterbal

Many thanks! Another long-standing utility done right 😄
Implementation looks good to me and the speed-up is extremely impressive!

mlange05 requested a review from reuterbal April 12, 2024 03:55

mlange05 added 2 commits April 12, 2024 04:09

IR: Improved implementation of attach_region_pragma via Transformer

a008854

IR: Improved implementation of detach_pragma_region via Transformer

024018b

mlange05 force-pushed the naml-improve-pragma-region-attached branch from 462695c to 024018b Compare April 12, 2024 04:10

reuterbal approved these changes Apr 12, 2024


reuterbal added the ready to merge This PR has been approved and is ready to be merged label Apr 12, 2024

reuterbal merged commit b5c5791 into main Apr 12, 2024
12 checks passed

reuterbal deleted the naml-improve-pragma-region-attached branch April 12, 2024 16:08
