-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve parse_expr
and use in process_dimension_pragmas
#292
Conversation
…otation original notation), add parse_expr to (missing) tests
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/292/index.html |
…o the original behaviour allowing necessary pass of different pattern to 'process_dimension_pragmas'
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #292 +/- ##
==========================================
+ Coverage 94.94% 94.96% +0.01%
==========================================
Files 153 153
Lines 32008 32427 +419
==========================================
+ Hits 30391 30795 +404
- Misses 1617 1632 +15
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
It's probably possible to combine the logic of
and
However, I think it makes more sense to introduce the |
…luation and evaluation for arrays
Some improvements to the expression parser included in this PR: Executing: from loki import parse_expr
class Foo:
val3 = 1
arr = [[1, 2], [3, 4]]
def __init__(self, _val1, _val2):
self.val1 = _val1
self.val2 = _val2
def some_func(self, a, b):
return a + b
@staticmethod
def static_func(a):
return 2*a
context = {'foo': Foo(2, 3)}
test_str = 'foo%val1 + foo%val2 + foo%val3'
parsed = parse_expr(f'{test_str}')
print(f"parsing: {test_str} with context: '{context}'\n results in '{parsed}'")
parsed = parse_expr(f'{test_str}', evaluate=True, context=context)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func_2(3)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func(3) + foo%arr(1, 2)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context, strict=True)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n results in '{parsed}'")
test_str = 'foo%val1 + foo%some_func(1, b=2) + foo%static_func(a=3) + foo%arr(1, 2)'
parsed = parse_expr(f'{test_str}', evaluate=True, context=context, strict=True)
print(f"parsing and evaluating: {test_str} with context: '{context}'\n results in '{parsed}'") gives
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks, the improvements to parse_expr
and how it can deal with derived types is absolutely fantastic! I'm really impressed by the degree of functionality you have created in a short amount of time.
Just out of curiosity but not necessarily something we need: would the evaluation also work for nested derived types? Say something like
class Bar:
baz = 4
class Foo:
bar = Bar()
parse_expr('1 + foo%bar%baz', context={'foo': Foo()}, evaluate=True)
I am, however, not convinced by the integration in process_dimension_pragmas
. The workaround with the expanded regex pattern breaks down as soon as there's more than one set of nested parentheses, and it's fundamentally a problem that cannot be solved by regex matching.
Therefore, I would propose to re-implement the get_pragma_parameters
utility to provide this functionality. But we can also do this as a separate PR and leave out the support for parentheses in this PR.
loki/expression/parser.py
Outdated
|
||
map_range = map_slice | ||
map_range_index = map_slice | ||
map_loop_range = map_slice | ||
|
||
def map_variable(self, expr, *args, **kwargs): | ||
return sym.Variable(name=expr.name) | ||
parent = kwargs.pop('parent', None) | ||
return sym.Variable(name=expr.name, parent=parent) # , **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return sym.Variable(name=expr.name, parent=parent) # , **kwargs) | |
return sym.Variable(name=expr.name, parent=parent) |
loki/expression/parser.py
Outdated
kwargs = { | ||
k: self.rec(v) | ||
for k, v in expr.name.kw_parameters.items()} | ||
kwargs = CaseInsensitiveDict(kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could build the CaseInsensitiveDict directly like
kwargs = { | |
k: self.rec(v) | |
for k, v in expr.name.kw_parameters.items()} | |
kwargs = CaseInsensitiveDict(kwargs) | |
kwargs = CaseInsensitiveDict( | |
(k, self.rec(v)) | |
for k, v in expr.name.kw_parameters.items() | |
) |
context = {'arr': [[1, 2], [3, 4]]} | ||
test_str = '1 + arr(1, 2)' | ||
parsed = parse_expr(convert_to_case(f'{test_str}', mode=case), evaluate=True, context=context) | ||
assert parsed == 3 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure my brain is fully awake, but do we not have a row-major vs. column-major conflict here? From the definition, arr
should look like this to me:
1 2
3 4
And I would expect arr(1, 2)
to be interpreted as column-major, which should yield 3
.
So, the evaluated result should be 4
, unless I'm misunderstanding/missing something of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline ...
test_str = 'foo%val1 + foo%val2 + foo%val3' | ||
parsed = parse_expr(convert_to_case(f'{test_str}', mode=case)) | ||
assert str(parsed).lower().replace(' ', '') == 'foo%val1+foo%val2+foo%val3' | ||
with pytest.raises(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we narrow down the expected exception type here?
test_str = 'foo%val1 + foo%some_func(1, 2) + foo%static_func_2(3)' | ||
parsed = parse_expr(convert_to_case(f'{test_str}', mode=case), evaluate=True, context=context) | ||
assert str(parsed).lower().replace(' ', '') == '5+foo%static_func_2(3)' | ||
with pytest.raises(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about exception type
loki/ir/pragma_utils.py
Outdated
parameters[match.group('command')].append(match.group('arg')) | ||
parameters = {k: v if len(v) > 1 else v[0] for k, v in parameters.items()} | ||
return parameters | ||
|
||
|
||
def process_dimension_pragmas(ir): | ||
_get_pragma_dim_parameter_re = re.compile(r'(?P<command>[\w-]+)\s*(?:\((?P<arg>.*)\))?') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work reliably, unfortunately. Counterexample:
!$loki something key1((val1 + 1)/2) key2((var1+1/(var2-5))/(2-3))
Generally, regex break down on recursive tasks like matching parentheses.
Using a REGEX for this utility was a lazy way of solving the problem at the time, but it's likely no longer the correct way of dealing with this. Instead, we should scan the string, break the chunks down but match parentheses in the process.
A similar utility exists in the REGEX frontend to match parentheses in strings:
https://github.com/ecmwf-ifs/loki/blob/main/loki/frontend/regex.py#L280-L344
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I did what you suggested based on the link you provided.
loki/ir/pragma_utils.py
Outdated
@@ -14,9 +14,8 @@ | |||
from loki.ir.find import FindNodes | |||
from loki.ir.transformer import Transformer | |||
from loki.ir.visitor import Visitor | |||
|
|||
# from loki.expression.parser import parse_expr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# from loki.expression.parser import parse_expr |
…to scanning the string and breaking down the chunks to match parentheses
Regarding your nested derived types question: With this, executing: from loki import parse_expr
class Bar:
baz = 4
class Foo:
bar = Bar()
ir = parse_expr('1 + foo%bar%baz', context={'foo': Foo()}, evaluate=True)
print(ir) yields There are also more sophisticated tests for nested derived types in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks!
The much improved get_pragma_parameters is excellent! And the support for nested derived types was purely curiousity, the fact that this works now with a small set of changes really emphasizes how incredibly great this is!
Just a final request for a small docstring, and I'm a little puzzled by the NotImplementedError
in the evaluator.
loki/ir/pragma_utils.py
Outdated
pragma = as_tuple(pragma) | ||
parameters = defaultdict(list) | ||
for p in pragma: | ||
parameter = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant, no?
|
||
def get_pragma_parameters(pragma, starts_with=None, only_loki_pragmas=True, | ||
pattern=_get_pragma_parameters_re): | ||
class PragmaParameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic, many thanks!
Would you mind adding just a two-line docstring explaining what it's for to both the class and the find
method?
if self.strict: | ||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the control-flow here. What's not implemented when we run through cleanly without an exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can discuss about that. However, I thought if strict = True
one wants to have an evaluated expression which in this case is not true as it would return the unchanged expression itself?! What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, you're right of course. I missed the fact that we won't reach this point if it's one of the known "lookup" nodes. Thanks for clarifying!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, many thanks!
Improve
parse_expr
and use inprocess_dimension_pragmas
.:
toRangeIndex((None, None))
DerivedType
s