Improve `parse_expr` and use in `process_dimension_pragmas` #292

MichaelSt98 · 2024-04-18T09:59:04Z

Improve parse_expr and use in process_dimension_pragmas.

Add correct parsing of ':' to RangeIndex((None, None))
Support DerivedTypes

github-actions · 2024-04-18T10:01:41Z

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/292/index.html

codecov · 2024-04-18T12:11:09Z

Codecov Report

Attention: Patch coverage is 98.21429%, with 5 lines in your changes missing coverage. Please review.

Project coverage is 94.96%. Comparing base (44bccfd) to head (2b8c5cb).
Report is 33 commits behind head on main.

Files	Patch %	Lines
loki/expression/parser.py	94.59%	4 Missing ⚠️
loki/ir/pragma_utils.py	98.11%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #292      +/-   ##
==========================================
+ Coverage   94.94%   94.96%   +0.01%     
==========================================
  Files         153      153              
  Lines       32008    32427     +419     
==========================================
+ Hits        30391    30795     +404     
- Misses       1617     1632      +15

Flag	Coverage Δ
lint_rules	`96.39% <ø> (ø)`
loki	`95.13% <98.21%> (+0.01%)`	⬆️
transformations	`92.17% <ø> (+0.06%)`	⬆️


MichaelSt98 · 2024-04-18T12:20:57Z

It's probably possible to combine the logic of

_get_pragma_parameters_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.+?)\))?')

and

_get_pragma_dim_parameter_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.*)\))?')

However, I think it makes more sense to introduce the pattern argument in get_pragma_parameters() ...
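
To illustrate why the two patterns are not interchangeable, here is a minimal sketch. It assumes both patterns use escaped literal parentheses (`\(` and `\)`) around the `arg` group, as in the diff to loki/ir/pragma_utils.py; the helper `first_arg` and the sample string are made up for illustration. The only difference is the lazy `.+?` versus the greedy `.*` quantifier:

```python
import re

# Assumed forms of the two patterns, with literal parentheses escaped
lazy_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.+?)\))?')
greedy_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.*)\))?')


def first_arg(pattern, text):
    """Return the (command, arg) pair of the first match in `text`."""
    match = pattern.search(text)
    return match.group('command'), match.group('arg')


nested = 'dimension((n+1)/2)'
lazy_result = first_arg(lazy_re, nested)
greedy_result = first_arg(greedy_re, nested)

print(lazy_result)    # ('dimension', '(n+1')
print(greedy_result)  # ('dimension', '(n+1)/2')
```

The lazy variant stops at the first closing parenthesis and truncates the nested argument, while the greedy variant runs to the last closing parenthesis in the string.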

MichaelSt98 · 2024-04-19T11:46:55Z

Some improvements to the expression parser included in this PR:

Executing:

from loki import parse_expr

class Foo:
    val3 = 1
    arr = [[1, 2], [3, 4]]
    def __init__(self, _val1, _val2):
        self.val1 = _val1
        self.val2 = _val2
    def some_func(self, a, b):
        return a + b
    @staticmethod
    def static_func(a):
        return 2*a

context = {'foo': Foo(2, 3)}
test_str = 'foo%val1 + foo%val2 + foo%val3'
parsed = parse_expr(f'{test_str}')
print(f"parsing: {test_str} with context: '{context}'\n  results in '{parsed}'")
parsed = parse_expr(f'{test_str}', evaluate=True, context=context)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n  results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func_2(3)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n  results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func(3) + foo%arr(1, 2)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context, strict=True)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n  results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, b=2) + foo%static_func(a=3) + foo%arr(1, 2)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context, strict=True)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n  results in '{parsed}'")

gives

parsing: foo%val1 + foo%val2 + foo%val3 with context: '{'foo': <__main__.Foo object at 0x1494ff19c3d0>}'
  results in 'foo%val1 + foo%val2 + foo%val3'
parsing and evaluating: foo%val1 + foo%val2 + foo%val3 with context: '{'foo': <__main__.Foo object at 0x1494ff19c3d0>}'
  results in '6'
parsing and evaluating: foo%val1 + foo%some_func(1, 2) + foo%static_func_2(3) with context: '{'foo': <__main__.Foo object at 0x1494ff19c3d0>}'
  results in '5 + foo%static_func_2(3)'
parsing and evaluating: foo%val1 + foo%some_func(1, 2) + foo%static_func(3) + foo%arr(1, 2) with context: '{'foo': <__main__.Foo object at 0x1494ff19c3d0>}'
  results in '13'
parsing and evaluating: foo%val1 + foo%some_func(1, b=2) + foo%static_func(a=3) + foo%arr(1, 2) with context: '{'foo': <__main__.Foo object at 0x1494ff19c3d0>}'
  results in '13'

reuterbal

Many thanks, the improvements to parse_expr and how it can deal with derived types are absolutely fantastic! I'm really impressed by the degree of functionality you have created in a short amount of time.

Just out of curiosity but not necessarily something we need: would the evaluation also work for nested derived types? Say something like

class Bar:
    baz = 4

class Foo:
    bar = Bar()

parse_expr('1 + foo%bar%baz', context={'foo': Foo()}, evaluate=True)

I am, however, not convinced by the integration in process_dimension_pragmas. The workaround with the expanded regex pattern breaks down as soon as there's more than one set of nested parentheses, and it's fundamentally a problem that cannot be solved by regex matching.

Therefore, I would propose to re-implement the get_pragma_parameters utility to provide this functionality. But we can also do this as a separate PR and leave out the support for parentheses in this PR.

reuterbal · 2024-04-18T12:03:24Z

loki/expression/parser.py


    map_range = map_slice
    map_range_index = map_slice
    map_loop_range = map_slice

    def map_variable(self, expr, *args, **kwargs):
-        return sym.Variable(name=expr.name)
+        parent = kwargs.pop('parent', None)
+        return sym.Variable(name=expr.name, parent=parent) # , **kwargs)


Suggested change

-        return sym.Variable(name=expr.name, parent=parent) # , **kwargs)
+        return sym.Variable(name=expr.name, parent=parent)

reuterbal · 2024-04-22T08:00:17Z

loki/expression/parser.py

+                kwargs = {
+                    k: self.rec(v)
+                    for k, v in expr.name.kw_parameters.items()}
+                kwargs = CaseInsensitiveDict(kwargs)


You could build the CaseInsensitiveDict directly like

Suggested change

-                kwargs = {
-                    k: self.rec(v)
-                    for k, v in expr.name.kw_parameters.items()}
-                kwargs = CaseInsensitiveDict(kwargs)
+                kwargs = CaseInsensitiveDict(
+                    (k, self.rec(v))
+                    for k, v in expr.name.kw_parameters.items()
+                )
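
As a rough illustration of the suggestion, the sketch below uses a hypothetical stand-in class, `LowerKeyDict`, rather than Loki's actual `CaseInsensitiveDict`, and a placeholder `rec` function in place of `self.rec`. It only assumes that the dict type accepts the same constructor arguments as the built-in `dict`, which includes an iterable of (key, value) pairs:

```python
class LowerKeyDict(dict):
    """Hypothetical stand-in for a case-insensitive dict: keys are lower-cased."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route every initial item through __setitem__ so keys get normalised
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())


def rec(value):
    """Placeholder for the recursive mapper call `self.rec(v)`."""
    return value * 2


kw_parameters = {'A': 1, 'b': 2}

# Two-step construction: intermediate plain dict, then conversion
two_step = LowerKeyDict({k: rec(v) for k, v in kw_parameters.items()})

# Direct construction from a generator of (key, value) pairs
direct = LowerKeyDict((k, rec(v)) for k, v in kw_parameters.items())

print(two_step == direct)        # True
print(direct['a'], direct['B'])  # 2 4
```

Both forms yield the same mapping; the direct form simply avoids the intermediate dict.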

reuterbal · 2024-04-22T08:53:08Z

loki/expression/tests/test_expression.py

+    context = {'arr': [[1, 2], [3, 4]]}
+    test_str = '1 + arr(1, 2)'
+    parsed = parse_expr(convert_to_case(f'{test_str}', mode=case), evaluate=True, context=context)
+    assert parsed == 3
+


Not sure my brain is fully awake, but do we not have a row-major vs. column-major conflict here? From the definition, arr should look like this to me:

1 2 3 4

And I would expect arr(1, 2) to be interpreted as column-major, which should yield 3.
So, the evaluated result should be 4, unless I'm misunderstanding/missing something of course.
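
The two readings of `arr(1, 2)` can be spelled out in plain Python. This is only a sketch of the index arithmetic, not of what `parse_expr` does internally; both helper functions are made up for illustration:

```python
arr = [[1, 2], [3, 4]]


def lookup_row_major(array, i, j):
    """Python-style nesting: arr(i, j) -> array[i-1][j-1] (1-based indices)."""
    return array[i - 1][j - 1]


def lookup_column_major(array, i, j):
    """Fortran-style layout: the first index varies fastest in memory."""
    flat = [value for inner in array for value in inner]  # 1 2 3 4
    n_first = len(array[0])  # extent of the first (fastest) dimension
    return flat[(i - 1) + (j - 1) * n_first]


print(1 + lookup_row_major(arr, 1, 2))     # 3, the value asserted in the test
print(1 + lookup_column_major(arr, 1, 2))  # 4, the column-major expectation
```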

As discussed offline ...

reuterbal · 2024-04-22T08:54:13Z

loki/expression/tests/test_expression.py

+    test_str = 'foo%val1 + foo%val2 + foo%val3'
+    parsed = parse_expr(convert_to_case(f'{test_str}', mode=case))
+    assert str(parsed).lower().replace(' ', '') == 'foo%val1+foo%val2+foo%val3'
+    with pytest.raises(Exception):


Could we narrow down the expected exception type here?

reuterbal · 2024-04-22T08:54:22Z

loki/expression/tests/test_expression.py

+    test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func_2(3)'
+    parsed = parse_expr(convert_to_case(f'{test_str}', mode=case), evaluate=True, context=context)
+    assert str(parsed).lower().replace(' ', '') == '5+foo%static_func_2(3)'
+    with pytest.raises(Exception):


Same comment about exception type

reuterbal · 2024-04-22T11:31:25Z

loki/ir/pragma_utils.py

            parameters[match.group('command')].append(match.group('arg'))
    parameters = {k: v if len(v) > 1 else v[0] for k, v in parameters.items()}
    return parameters


-def process_dimension_pragmas(ir):
+_get_pragma_dim_parameter_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.*)\))?')


This won't work reliably, unfortunately. Counterexample:

!$loki something key1((val1 + 1)/2) key2((var1+1/(var2-5))/(2-3))

Generally, regexes break down on recursive tasks like matching parentheses.

Using a REGEX for this utility was a lazy way of solving the problem at the time, but it's likely no longer the correct way of dealing with this. Instead, we should scan the string, break the chunks down but match parentheses in the process.

A similar utility exists in the REGEX frontend to match parentheses in strings:
https://github.com/ecmwf-ifs/loki/blob/main/loki/frontend/regex.py#L280-L344
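
A minimal sketch of such a scan is shown below. It is not the utility from the REGEX frontend nor the implementation that ended up in this PR; the function name `split_pragma_parameters` is hypothetical, and the input is assumed to be the pragma content with the `!$loki` prefix already stripped. It tracks the nesting depth to find the closing parenthesis that belongs to each key:

```python
def split_pragma_parameters(content):
    """Split 'key1(arg1) key2(arg2) flag' into a {key: arg} dict, matching parentheses."""
    parameters = {}
    pos, length = 0, len(content)
    while pos < length:
        # Skip whitespace between parameters
        if content[pos].isspace():
            pos += 1
            continue
        # Read the key up to whitespace or an opening parenthesis
        start = pos
        while pos < length and not content[pos].isspace() and content[pos] != '(':
            pos += 1
        key = content[start:pos]
        arg = None
        if pos < length and content[pos] == '(':
            # Scan to the matching closing parenthesis by tracking nesting depth
            depth, arg_start = 1, pos + 1
            pos += 1
            while pos < length and depth > 0:
                if content[pos] == '(':
                    depth += 1
                elif content[pos] == ')':
                    depth -= 1
                pos += 1
            if depth != 0:
                raise ValueError(f'Unbalanced parentheses in {content!r}')
            arg = content[arg_start:pos - 1]
        parameters[key] = arg
    return parameters


result = split_pragma_parameters(
    'something key1((val1 + 1)/2) key2((var1+1/(var2-5))/(2-3))'
)
print(result)
# {'something': None, 'key1': '(val1 + 1)/2', 'key2': '(var1+1/(var2-5))/(2-3)'}
```

Unlike the greedy pattern, which would swallow everything up to the last closing parenthesis of the line, the depth counter ends each argument at its own matching parenthesis.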

I agree. I did what you suggested based on the link you provided.

reuterbal · 2024-04-22T11:32:29Z

loki/ir/pragma_utils.py

@@ -14,9 +14,8 @@
 from loki.ir.find import FindNodes
 from loki.ir.transformer import Transformer
 from loki.ir.visitor import Visitor
-
+# from loki.expression.parser import parse_expr


Suggested change

-# from loki.expression.parser import parse_expr

MichaelSt98 · 2024-04-24T09:20:31Z

Regarding your nested derived types question:
It was not possible, but I've added a commit to allow nested derived types.

With this, executing:

from loki import parse_expr

class Bar:
    baz = 4

class Foo:
    bar = Bar()

ir = parse_expr('1 + foo%bar%baz', context={'foo': Foo()}, evaluate=True)
print(ir)

yields 5.

There are also more sophisticated tests for nested derived types in expression/tests/test_expression.py
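
For intuition only, resolving a nested member chain against a context can be pictured as a chain of attribute lookups. The sketch below is not the code added to the parser; `resolve_member` is a hypothetical helper, and it ignores Fortran's case insensitivity:

```python
from functools import reduce


class Bar:
    baz = 4


class Foo:
    bar = Bar()


def resolve_member(name, context):
    """Resolve a Fortran-style 'a%b%c' member chain against a context dict."""
    root, *members = name.split('%')
    # Walk the chain one attribute at a time, starting from the context entry
    return reduce(getattr, members, context[root])


value = resolve_member('foo%bar%baz', {'foo': Foo()})
print(1 + value)  # 5
```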

reuterbal

Many thanks!
The much improved get_pragma_parameters is excellent! And the support for nested derived types was purely out of curiosity; the fact that this works now with a small set of changes really emphasizes how incredibly great this is!

Just a final request for a small docstring, and I'm a little puzzled by the NotImplementedError in the evaluator.

reuterbal · 2024-04-25T09:51:43Z

loki/ir/pragma_utils.py

    pragma = as_tuple(pragma)
    parameters = defaultdict(list)
    for p in pragma:
+        parameter = None


This is redundant, no?

reuterbal · 2024-04-25T09:53:11Z

loki/ir/pragma_utils.py

-
-def get_pragma_parameters(pragma, starts_with=None, only_loki_pragmas=True,
-        pattern=_get_pragma_parameters_re):
+class PragmaParameters:


Fantastic, many thanks!
Would you mind adding just a two-line docstring explaining what it's for to both the class and the find method?

reuterbal · 2024-04-25T09:54:44Z

loki/expression/parser.py

+        if self.strict:
+            raise NotImplementedError


Not sure I understand the control-flow here. What's not implemented when we run through cleanly without an exception?

We can discuss that. However, my thinking was that with strict=True one expects an evaluated expression, which would not be the case here, as the expression itself would be returned unchanged. What do you think?

Sorry, you're right of course. I missed the fact that we won't reach this point if it's one of the known "lookup" nodes. Thanks for clarifying!
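
The control flow being discussed can be sketched in a few lines. This is a simplified, hypothetical evaluator working on plain string names, not the mapper in loki/expression/parser.py; it only shows that known lookups return early, so the `strict` check is reached solely by nodes that could not be evaluated:

```python
def evaluate_node(node, context, strict=False):
    """Hypothetical evaluator for a single node given as a plain string name."""
    # Known "lookup" case: the name is found in the context, so return early
    if node in context:
        return context[node]
    # Only unresolved nodes reach this point
    if strict:
        raise NotImplementedError(f'Cannot evaluate {node!r}')
    # Non-strict mode: hand back the expression unchanged
    return node


context = {'a': 2}
resolved = evaluate_node('a', context, strict=True)
unchanged = evaluate_node('b', context)
try:
    evaluate_node('b', context, strict=True)
    raised = False
except NotImplementedError:
    raised = True

print(resolved, unchanged, raised)  # 2 b True
```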

reuterbal

Excellent, many thanks!

MichaelSt98 added 4 commits April 18, 2024 07:18

loki expression parser: correct parsing of ':' to 'RangeIndex((None, …

14f4788

…None))'

better/simplified parametrization of test

0947987

Loki expr parser, support DerivedTypes, do not cast floats (to keep n…

914c492

…otation original notation), add parse_expr to (missing) tests

use 'parse_expr' in 'process_dimension_pragmas'

df11a85

MichaelSt98 added 3 commits April 18, 2024 11:08

fix cyclic import

9900b3a

Fix test, as pragma string 'x+y' no is now properly parsed as Sum

dec1ad7

introduce 'pattern' argument for 'get_pragma_parameters' defaulting t…

47e435d

…o the original behaviour allowing necessary pass of different pattern to 'process_dimension_pragmas'

MichaelSt98 requested a review from reuterbal April 18, 2024 12:21

parse_expr: proper handling of derived type(s) variables includig eva…

ef5b5cc

…luation and evaluation for arrays

reuterbal requested changes Apr 22, 2024

MichaelSt98 added 2 commits April 23, 2024 15:19

parse_expr: allow for nested derived types

a60479e

Adapted 'get pragma parameters' utilities, switching from pure REGEX …

c834d5d

…to scanning the string and breaking down the chunks to match parentheses

reuterbal requested changes Apr 25, 2024

add missing docstring and remove some unnecessary comments

2b8c5cb

reuterbal approved these changes Apr 25, 2024

reuterbal added the "ready to merge" label Apr 25, 2024

reuterbal merged commit 5673795 into main Apr 25, 2024
12 checks passed

reuterbal deleted the nams_process_dimension_pragmas_parse_expr branch April 25, 2024 14:50
