-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement br_table; drop tableswitch #249
Conversation
From the point of view of being readable, this seems neutral, perhaps even helpful. A source viewer will just need to analyse the patterns and present such patterns in familiar switch statements etc if they fit well, and removing the cases might even make such pattern matching simpler. One significant difference is that the key expression is evaluate in the scope of all these blocks, whereas a switch key expression is evaluated outside the scope of the cases, but even this is probably just another detail in pattern matching. The default falling through seems ok at first sight. Having the labels first seems consistent with them being immediate arguments. Could the br_table be a multi_block which has been discussed before, so that it heads all the blocks. Then the key expression could be evaluated outside their scope. In other words, just add one more immediate integer argument giving the number of blocks to start, and some opcode to end each block - it might even look like |
Great, I was wanting to also propose fallthrough-on-default for the same symmetry reason you gave. To check my understanding, an s-expr parser would know it has reached the end of the |
Yes, the s-expr syntax is unambiguous. |
An explicit default gives producers more flexibility than fallthrough. Practical implementations will need a separate bounds-check branch in any case; fallthrough semantics restrict this branch to a fixed location. An explicit default means lets producers decide where they want it to go without creating extra branch-to-branch constructs. If we're going to have the default, I propose it be explicit. Another option is to make an out-of-bounds index in the table consistent with an out-of-bounds index in linear memory. The analogy to |
@sunfishcode What about the br_table have an implicit unreachable target, say target zero, so the default break could use this target to implement the trapping case, and otherwise directly break to a target. This unreachable target might be usable by in the break table too, for cases in which some are unreachable. |
In support of an explicit default operand: Compilers for languages like C/C++ benefit from flexibility in where the default arm goes, because their default can appear in any order with respect to the cases, and can fall through or be reached from fall through from cases. Also, all optimizing implementations will need a bounds-check branch anyway, so it's not much extra burden to let producers specify where they want that branch to go. |
@sunfishcode, not sure how the default ordering is affected, that can just as well be done with a separated br? The tests in this PR even contain examples of that. |
@rossberg-chromium It's true; one can always add extra |
I support a default target, since that's actually how tableswitch's table On Thu, Feb 25, 2016 at 11:25 AM, Dan Gohman [email protected]
|
What would the result of the Might there be some merit in making the unreachable target implicit so that:
The case in which the runtime compiler can prove that the bounds check in unnecessary may be common. For example, reading an int8 and dispatching to a table of 256 entries. An implicit unreachable block would avoid the producer making it explicit, and the decoder might be able to detect some of the cases and optimize away some of these bounds checks even if the runtime compiler does not have great DCE. |
Oops, I hadn't considered the jump-to-jump implied by having default fallthrough. An explicit default target makes sense since this is basically the target of the bounds-check-branch in the machine code. So +1 to adding a |
@@ -82,9 +82,9 @@ and expr' = | |||
| Block of expr list (* execute in sequence *) | |||
| Loop of expr (* loop header *) | |||
| Break of var * expr option (* break to n-th surrounding label *) | |||
| Br_if of var * expr option * expr (* conditional break *) | |||
| Break_if of var * expr option * expr (* conditional break *) | |||
| Break_table of var list * expr option * expr (* indexed break *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming these Break_if
and Break_table
, when the ast.ml names are Br_if
and Br_table
, makes it less obvious that kernel.ml is intended to be a subset of ast.ml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that this is abstract syntax, not concrete, so it does not need to reflect naming 1:1.
But you are right that the names weren't particularly consistent -- I switched to proper camel casing, like the other constructors use. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a pre-existing nit: but they're not breaks; breaks only branch forward and these branch forwards and backwards. How about expanding to "br" to "Branch"? That way noone will be superficially confused by lack of "Continue" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, but there are arguments both ways on that; not sure we should revisit the br
naming debate (WebAssembly/design#445) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a text-format discussion and I think, in that proposal, the user would see both "break" and "continue" (therefore going the correct direction) so that's a separate topic altogether; here we have a single node that is jumping both forward and backward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is many of the arguments there are also applicable here, both ways, and we couldn't resolve things there.
Okay, added a default branch. Yet I'm not convinced that that's the best design. The jump-to-jump argument still sounds like a premature nano optimisation to me, I doubt it has any practical relevance. And other than that, it is not clear what the more useful design is, if you look at broader use cases than just encoding C-style switch. Beyond C producers, the following pattern will probably be fairly common:
With the default branch, you now have to introduce a cumbersome extra block to express that. |
This one case may not matter a bunch (smart compilers can fold anything, etc), but avoiding logical jumps-to-jumps is (now) a consistent theme overall for control flow operators so I think it's nice to stay consistent here. |
@@ -235,15 +229,17 @@ expr1 : | |||
| BR var expr_opt { fun c -> Br ($2 c label, $3 c) } | |||
| BR_IF var expr { fun c -> Br_if ($2 c label, None, $3 c) } | |||
| BR_IF var expr expr { fun c -> Br_if ($2 c label, Some ($3 c), $4 c) } | |||
| BR_TABLE var var_list expr | |||
{ fun c -> let xs, x = Lib.List.split_last ($2 c label :: $3 c label) in | |||
Br_table (xs, x, None, $4 c) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused why the grammar uses var var_list
, when the code prepends the first label to to the list and then splits a label off the end of the list. Would var_list var
work, and avoid the prepending and splitting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LR(1) generators like Yacc cannot deal with var_list var
easily, will cause shift/reduce conflicts. So annoyingly, you have to turn it around.
lgtm |
@@ -20,10 +17,10 @@ and expr' = | |||
| Loop of expr list | |||
| Br of var * expr option | |||
| Br_if of var * expr option * expr | |||
| Br_table of var list * var * expr option * expr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the binary encoding, I expect the default's var
would go before the var list
(that's what we've done in SM). So since it's that way in both text and binary, perhaps the AST should match them so that the AST->binary mapping is not only regular w.r.t types but also order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, interesting. It's a bit surprising, I suppose, since it's the label to apply to all larger indices. (Also, I still have some sympathies for the clamping interpretation of the indexing, which would suggest the default, i.e., max to go last as well. :) ) But ultimately I don't mind. @titzer, not what V8 does, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, putting the default after the var list
makes monotonic sense. The only reason to put it before is the vague preference for putting lists at the end, but I think both would work so I'd also be fine putting it after matching the AST here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
He he, I like the word "monotonic sense". If only the world made monotonic sense.
lgtm |
Has the use case for the I see some merit in the result expression, some code patterns that could be expressed as an expression avoiding use of a local variable, but I don't see how they can be expressed in familiar code patterns without using a local. It seems similar to the issues with the result expression of `br_if - in a familiar source code the pattern would use a local variable but there is no native support for these in wasm anyway. If there is an expression result then this bumps into the result 'arity' issue that still seems unresolved. i.e. Would it be accepting a list of expressions that are the arguments to an implicit tuple constructor or a single expression that may have multiple result values? Perhaps the result expression should just be removed from |
Implement br_table; drop tableswitch
The "i32x4" case looks trivial now, but we will be adding i64x2 later.
...as decided yesterday.
The syntax I propose here is trying to maximise simplicity and symmetry with
br_if
:The first expression is the (optional) argument transferred to the label, the second is the index.
To maximise simplicity, I propose that there is no default. Analogous to
br_if
, the operator simply falls through if the index does not apply (i.e., is out of bounds). A default jump can trivially be implemented by a consecutivebr
.WDYT?