Improve performance of pragma-region attach/detach by using transformers #286

Merged
merged 2 commits into from
Apr 12, 2024

Conversation

mlange05
Collaborator

A subtle but significant performance bug made attaching and detaching pragma regions very expensive for large code regions, e.g. when finding marked regions to remove (PR #276). This PR fixes this by re-implementing the matching and resolution of PragmaRegion objects with custom in-place transformers that perform the matching at tuple level.

One important consideration here is that we now only match pragma pairs at tuple level. This gives us an implicit check for misplaced !$loki end <marker> pragmas, which will now trigger warnings. We also need to be very careful about tuple insertion when resolving the region objects during detach: we do not want nested tuples in our IR tree, but cannot use flatten on tuples generically. This is marked via comments in the respective places.
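
To illustrate the idea of tuple-level matching and flat re-insertion, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Loki's IR or transformer classes: `Pragma`, `Stmt`, `Region`, `attach_regions` and `detach_regions` are hypothetical stand-ins, and the sketch assumes that an end pragma has content of the form `end <marker>` and that a real transformer would apply the same logic to every body tuple in the tree.

```python
import warnings
from dataclasses import dataclass


# Hypothetical toy IR nodes (stand-ins, not Loki's classes)
@dataclass(frozen=True)
class Pragma:
    keyword: str
    content: str


@dataclass(frozen=True)
class Stmt:
    text: str


@dataclass(frozen=True)
class Region:
    pragma: Pragma
    body: tuple
    pragma_post: Pragma


def _is_end(node):
    return isinstance(node, Pragma) and node.content.lower().startswith('end ')


def attach_regions(nodes):
    """Pair start/end pragmas found in the same tuple, in a single pass."""
    open_starts = []  # pragmas that may still turn out to open a region
    bodies = [[]]     # bodies[i + 1] collects the nodes after open_starts[i]
    for node in nodes:
        if _is_end(node):
            marker = node.content.partition(' ')[2].strip()
            idx = next((i for i in reversed(range(len(open_starts)))
                        if open_starts[i].keyword == node.keyword
                        and open_starts[i].content == marker), None)
            if idx is None:
                # No start pragma in this tuple: report it and keep the node
                warnings.warn(f'Misplaced end pragma: {node.content}')
                bodies[-1].append(node)
                continue
            # Pragmas opened after the match were plain pragmas: splice back
            while len(open_starts) > idx + 1:
                inner = bodies.pop()
                bodies[-1].extend([open_starts.pop(), *inner])
            body = tuple(bodies.pop())
            bodies[-1].append(Region(open_starts.pop(), body, node))
        elif isinstance(node, Pragma):
            open_starts.append(node)
            bodies.append([])
        else:
            bodies[-1].append(node)
    # Start candidates that never got closed stay ordinary pragmas
    while open_starts:
        inner = bodies.pop()
        bodies[-1].extend([open_starts.pop(), *inner])
    return tuple(bodies[0])


def detach_regions(nodes):
    """Resolve regions back into a flat tuple (no nested tuples)."""
    out = []
    for node in nodes:
        if isinstance(node, Region):
            # extend() splices the members in place; append() would nest
            out.extend((node.pragma, *detach_regions(node.body),
                        node.pragma_post))
        else:
            out.append(node)
    return tuple(out)


code = (
    Stmt('a = 1'),
    Pragma('loki', 'remove'),
    Stmt('b = 2'),
    Pragma('acc', 'loop'),
    Stmt('c = 3'),
    Pragma('loki', 'end remove'),
    Stmt('d = 4'),
)
attached = attach_regions(code)
restored = detach_regions(attached)
```

In this sketch each tuple is traversed once, and detaching splices the region's pragmas and body directly into the parent sequence rather than inserting the body tuple as a single element, which is the nesting problem described above.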

And finally, some anecdotal speed-up evidence using an experimental ec-physics inlined control-flow routine, where the region-removal step drops from 136.73s to 2.62s (roughly 52x):
Before:

(loki_env) $ loki-ecphys-gen.py inline --source ./input  --build .
[Loki::Sourcefile] Constructed from input/ec_phys_drv.F90 in 0.89s
[Loki::Sourcefile] Constructed from input/ec_phys.F90 in 1.37s
[Loki::Sourcefile] Constructed from input/callpar.F90 in 2.56s
[Loki] Inlined EC_PHYS in 2.78s
[Loki] Inlined CALLPAR in 4.76s
[Loki] Remove marked regions in 136.73s   <= this uses `with pragma_regions_attached`

After:

(loki_env) $ loki-ecphys-gen.py inline --source ./input  --build .
[Loki::Sourcefile] Constructed from input/ec_phys_drv.F90 in 0.88s
[Loki::Sourcefile] Constructed from input/ec_phys.F90 in 1.48s
[Loki::Sourcefile] Constructed from input/callpar.F90 in 2.53s
[Loki::EC-Physics] Inlined EC_PHYS in 2.78s
[Loki::EC-Physics] Inlined CALLPAR in 4.90s
[Loki::EC-Physics] Remove marked regions in 2.62s

@mlange05 mlange05 requested a review from reuterbal April 12, 2024 03:55

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/286/index.html

@mlange05 mlange05 force-pushed the naml-improve-pragma-region-attached branch from 462695c to 024018b Compare April 12, 2024 04:10

codecov bot commented Apr 12, 2024

Codecov Report

Attention: Patch coverage is 94.11765%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 92.87%. Comparing base (a594b44) to head (024018b).

| Files | Patch % | Lines |
|---|---|---|
| loki/pragma_utils.py | 94.11% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
- Coverage   92.88%   92.87%   -0.01%     
==========================================
  Files         102      102              
  Lines       18253    18269      +16     
==========================================
+ Hits        16954    16968      +14     
- Misses       1299     1301       +2     
| Flag | Coverage Δ |
|---|---|
| lint_rules | 96.39% <ø> (ø) |
| loki | 92.84% <94.11%> (-0.01%) ⬇️ |
| transformations | 92.22% <ø> (ø) |


Collaborator

@reuterbal reuterbal left a comment


Many thanks! Another long-standing utility done right 😄
Implementation looks good to me and the speed-up is extremely impressive!

@reuterbal reuterbal added the ready to merge This PR has been approved and is ready to be merged label Apr 12, 2024
@reuterbal reuterbal merged commit b5c5791 into main Apr 12, 2024
12 checks passed
@reuterbal reuterbal deleted the naml-improve-pragma-region-attached branch April 12, 2024 16:08
Labels
ready to merge This PR has been approved and is ready to be merged

2 participants