-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimal operator priority pretty-printing in translators #69
Comments
It turns out that things are not as easy as they look like at the first glance. I've tried a naïve approach, introducing a "priority" integer that gets compared to that's inside. Literals get priority of 0, exponentiation gets 20, multiplication gets 40, addition gets 50, etc. Various translators for various target languages may output different values, as some languages have slightly different ideas on operator priorities. A very simple example
However, things get ugly fast, as there's also operator's associativity. For example:
Current priority-only based algorithm would translate all these examples to Moreover, I'm not totally sure that it's a good idea to omit parenthesis in |
I would not omit parentheses as they can make the generated code more understandable when the author uses them to separate logically different parts. |
I've started a comparison table on priorities in different languages: https://docs.google.com/spreadsheets/d/13Cs3SidXHJVDCqydHCBwnk3uQ2FjGmHsjZa6SHkrE34/edit?usp=sharing |
Unfortunately, it's not completely possible. From the AST point of view:
are translated to exactly the same BinOp(
BinOp(
IntNum(1),
Add,
IntNum(2)
),
Add,
IntNum(3)
) After we've got AST, it's not possible to distinguish them. |
They could be
BinOp(
Paren(
BinOp(
IntNum(1),
Add,
IntNum(2)
)
),
Add,
IntNum(3)
)
BinOp(
BinOp(
IntNum(1),
Add,
IntNum(2)
),
Add,
IntNum(3)
)
Paren(
Paren(
BinOp(
Paren(
Paren(
BinOp(
IntNum(1),
Add,
IntNum(2)
)
)
)
Add,
IntNum(3)
)
)
) But I don't know whether this is the proper way to do it, or is it worth it at first place... |
Well, it's generally not how AST parsing works. More or less, the general idea is to eliminate parenthesis and all these operator application priority steps, and that's actually what we need, as we can't rely on particular operator priorities (and parenthesis application) of a single language, as they can be different for different targets... |
Yes, but we can interpret parentheses in ksy expressions as "force parenthesis here" operators. I cannot think of any case where these "forced" parentheses would change the meaning of the expression in any of the target languages. Of course sometimes we need to put additional parentheses into the target language which won't be in the AST where the target language's operator precedence does not match with the .ksy expression precedence. So the AST would still describe the same expression, but would add support for additional used-only-for-clarity-parentheses. Or this could be even an optional compiler option: keep original parentheses from ksy or not. |
…ions (as discussed in kaitai-io/kaitai_struct#69): * `translate()` family recursive functions now pass along not a single string, but string + priority integer * There is a `translate(v, outerPriority)` wrapper that analyzes inner priority vs outer priority and adds parenthesis * Fixed many `translate(...)` invocations to adhere to new pattern * Ideally, `translate()` should be always called aware of the context the expression will be used as * Introduced `CTernaryOperator` that carries the same logic of C-like ternary operator common to many languages * Modified some TranslatorSpec tests Work in progress, still has problems with associative operators.
I've pushed what I have now into distinct branch optimal_expressions, so anyone can track my progress. |
kaitai-io/kaitai_struct_compiler#277 was merged, so I would continue using this issue to track remaining problems, this is roughly the plan:
I've started playing with comparisons, and immediately figured out that it's not so simple even with very basic set of comparison operators. Some languages have all of comparison operators on same level of precedence (e.g. Python), some languages have A simple test:
An also interesting bit is that modern gcc issues a warning on this:
So, it looks like at least for Python (and likely for C++ too to avoid the warning) we need to modify generation to parenthesize. |
A few other data points:
So, looks like we'll have to build that distinction in the custom per-language precedence tables. |
While experimenting with expression, I've whipped up a tool all-expression that allows me to quickly test some ideas in many languages we support:
|
Right now, generation of target code by translators yield more or less spontaneous parenthesis generation. In some cases, it proves to be unoptimal (i.e. extra parenthesis are added when there's no need for them) and sometimes it yields wrong results, i.e.:
A few examples that should probably provide more optimal translation:
1 + 2
(1 + 2)
1 + 2
1 + 2 + 5
((1 + 2) + 5)
1 + 2 + 5
(1 + 2) / (7 * 8)
((1 + 2) / (7 * 8))
(1 + 2) / (7 * 8)
I propose to:
Hopefully, it would also provide an ultimate answer to all these bugs above as well.
The text was updated successfully, but these errors were encountered: