Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement br_table; drop tableswitch #249

Merged
merged 4 commits into from
Mar 7, 2016
Merged

Implement br_table; drop tableswitch #249

merged 4 commits into from
Mar 7, 2016

Conversation

rossberg
Copy link
Member

...as decided yesterday.

The syntax I propose here is trying to maximise simplicity and symmetry with br_if:

(br_table <var>* <expr>? <expr>)

The first expression is the (optional) argument transferred to the label, the second is the index.

To maximise simplicity, I propose that there is no default. Analogous to br_if, the operator simply falls through if the index does not apply (i.e., is out of bounds). A default jump can trivially be implemented by a consecutive br.

WDYT?

@ghost
Copy link

ghost commented Feb 24, 2016

From the point of view of being readable, this seems neutral, perhaps even helpful. A source viewer will just need to analyse the patterns and present such patterns in familiar switch statements etc if they fit well, and removing the cases might even make such pattern matching simpler. One significant difference is that the key expression is evaluate in the scope of all these blocks, whereas a switch key expression is evaluated outside the scope of the cases, but even this is probably just another detail in pattern matching.

The default falling through seems ok at first sight.

Having the labels first seems consistent with them being immediate arguments.

Could the br_table be a multi_block which has been discussed before, so that it heads all the blocks. Then the key expression could be evaluated outside their scope. In other words, just add one more immediate integer argument giving the number of blocks to start, and some opcode to end each block - it might even look like case!?

@lukewagner
Copy link
Member

Great, I was wanting to also propose fallthrough-on-default for the same symmetry reason you gave. To check my understanding, an s-expr parser would know it has reached the end of the <var>s when it sees the first ( of an , right? I think so, in which case lgtm.

@rossberg
Copy link
Member Author

Yes, the s-expr syntax is unambiguous.

@sunfishcode
Copy link
Member

An explicit default gives producers more flexibility than fallthrough. Practical implementations will need a separate bounds-check branch in any case; fallthrough semantics restrict this branch to a fixed location. An explicit default means lets producers decide where they want it to go without creating extra branch-to-branch constructs. If we're going to have the default, I propose it be explicit.

Another option is to make an out-of-bounds index in the table consistent with an out-of-bounds index in linear memory. The analogy to br_if is appreciated, however br_if's relationship with fallthrough behavior in hardware doesn't occur in br_table, and br_if's role in general-purpose code isn't the same as br_table's. Also, br_if is a boolean operation whose entire purpose is to take a boolean value and select between a boolean set of semantics. br_table has one behavior for a range of its inputs and must have an abrupt semantics discontinuity if it is to have any other semantics for the remaining range. Multiple-personality instructions are often the source of surprising behavior, and, when practical, wasm has avoided them by trapping when an instruction is asked to do something outside its primary purpose.

@ghost
Copy link

ghost commented Feb 24, 2016

@sunfishcode What about the br_table have an implicit unreachable target, say target zero, so the default break could use this target to implement the trapping case, and otherwise directly break to a target. This unreachable target might be usable by in the break table too, for cases in which some are unreachable.

@sunfishcode
Copy link
Member

In support of an explicit default operand: Compilers for languages like C/C++ benefit from flexibility in where the default arm goes, because their default can appear in any order with respect to the cases, and can fall through or be reached from fall through from cases. Also, all optimizing implementations will need a bounds-check branch anyway, so it's not much extra burden to let producers specify where they want that branch to go.

@rossberg
Copy link
Member Author

@sunfishcode, not sure how the default ordering is affected, that can just as well be done with a separated br? The tests in this PR even contain examples of that.

@sunfishcode
Copy link
Member

@rossberg-chromium It's true; one can always add extra brs to achieve the desired semantics, however the wasm engine is typically going to have to end up folding these branches into other branches anyway. An explicit default gives producers the flexibility to describe the control flow they want more directly.

@titzer
Copy link
Contributor

titzer commented Feb 25, 2016

I support a default target, since that's actually how tableswitch's table
worked.

On Thu, Feb 25, 2016 at 11:25 AM, Dan Gohman [email protected]
wrote:

@rossberg-chromium https://github.com/rossberg-chromium It's true; one
can always add extra brs to achieve the desired semantics, however the
wasm engine is typically going to have to end up folding these branches
into other branches anyway. An explicit default gives producers the
flexibility to describe the control flow they want more directly.


Reply to this email directly or view it on GitHub
#249 (comment).

@ghost
Copy link

ghost commented Feb 26, 2016

What would the result of the br_table operator be if there were a default target and no fall-through? Would it be unreachable?

Might there be some merit in making the unreachable target implicit so that:

(br_table $unreachable ...) =>
(block $unreachable
  (%br_table ...)
  (unreachable))

The case in which the runtime compiler can prove that the bounds check in unnecessary may be common. For example, reading an int8 and dispatching to a table of 256 entries. An implicit unreachable block would avoid the producer making it explicit, and the decoder might be able to detect some of the cases and optimize away some of these bounds checks even if the runtime compiler does not have great DCE.

@lukewagner
Copy link
Member

Oops, I hadn't considered the jump-to-jump implied by having default fallthrough. An explicit default target makes sense since this is basically the target of the bounds-check-branch in the machine code. So +1 to adding a var preceding the var list in Br_table.

@@ -82,9 +82,9 @@ and expr' =
| Block of expr list (* execute in sequence *)
| Loop of expr (* loop header *)
| Break of var * expr option (* break to n-th surrounding label *)
| Br_if of var * expr option * expr (* conditional break *)
| Break_if of var * expr option * expr (* conditional break *)
| Break_table of var list * expr option * expr (* indexed break *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming these Break_if and Break_table, when the ast.ml names are Br_if and Br_table, makes it less obvious that kernel.ml is intended to be a subset of ast.ml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that this is abstract syntax, not concrete, so it does not need to reflect naming 1:1.

But you are right that the names weren't particularly consistent -- I switched to proper camel casing, like the other constructors use. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a pre-existing nit: but they're not breaks; breaks only branch forward and these branch forwards and backwards. How about expanding to "br" to "Branch"? That way noone will be superficially confused by lack of "Continue" :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, but there are arguments both ways on that; not sure we should revisit the br naming debate (WebAssembly/design#445) here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a text-format discussion and I think, in that proposal, the user would see both "break" and "continue" (therefore going the correct direction) so that's a separate topic altogether; here we have a single node that is jumping both forward and backward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is many of the arguments there are also applicable here, both ways, and we couldn't resolve things there.

@rossberg
Copy link
Member Author

rossberg commented Mar 2, 2016

Okay, added a default branch.

Yet I'm not convinced that that's the best design. The jump-to-jump argument still sounds like a premature nano optimisation to me, I doubt it has any practical relevance. And other than that, it is not clear what the more useful design is, if you look at broader use cases than just encoding C-style switch. Beyond C producers, the following pattern will probably be fairly common:

(br_table ... (index))
(handle exceptional error case)

With the default branch, you now have to introduce a cumbersome extra block to express that.

@lukewagner
Copy link
Member

The jump-to-jump argument still sounds like a premature nano optimisation to me, I doubt it has any
practical relevance.

This one case may not matter a bunch (smart compilers can fold anything, etc), but avoiding logical jumps-to-jumps is (now) a consistent theme overall for control flow operators so I think it's nice to stay consistent here.

@@ -235,15 +229,17 @@ expr1 :
| BR var expr_opt { fun c -> Br ($2 c label, $3 c) }
| BR_IF var expr { fun c -> Br_if ($2 c label, None, $3 c) }
| BR_IF var expr expr { fun c -> Br_if ($2 c label, Some ($3 c), $4 c) }
| BR_TABLE var var_list expr
{ fun c -> let xs, x = Lib.List.split_last ($2 c label :: $3 c label) in
Br_table (xs, x, None, $4 c) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused why the grammar uses var var_list, when the code prepends the first label to to the list and then splits a label off the end of the list. Would var_list var work, and avoid the prepending and splitting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LR(1) generators like Yacc cannot deal with var_list var easily, will cause shift/reduce conflicts. So annoyingly, you have to turn it around.

@sunfishcode
Copy link
Member

lgtm

@@ -20,10 +17,10 @@ and expr' =
| Loop of expr list
| Br of var * expr option
| Br_if of var * expr option * expr
| Br_table of var list * var * expr option * expr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the binary encoding, I expect the default's var would go before the var list (that's what we've done in SM). So since it's that way in both text and binary, perhaps the AST should match them so that the AST->binary mapping is not only regular w.r.t types but also order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, interesting. It's a bit surprising, I suppose, since it's the label to apply to all larger indices. (Also, I still have some sympathies for the clamping interpretation of the indexing, which would suggest the default, i.e., max to go last as well. :) ) But ultimately I don't mind. @titzer, not what V8 does, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, putting the default after the var list makes monotonic sense. The only reason to put it before is the vague preference for putting lists at the end, but I think both would work so I'd also be fine putting it after matching the AST here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He he, I like the word "monotonic sense". If only the world made monotonic sense.

@lukewagner
Copy link
Member

lgtm

@ghost
Copy link

ghost commented Mar 7, 2016

Has the use case for the br_table result expression been considered? I note that the current v8 implementation does not seem to implement this and it was not included in the prior tableswitch - although I might be misreading this.

I see some merit in the result expression, some code patterns that could be expressed as an expression avoiding use of a local variable, but I don't see how they can be expressed in familiar code patterns without using a local. It seems similar to the issues with the result expression of `br_if - in a familiar source code the pattern would use a local variable but there is no native support for these in wasm anyway.

If there is an expression result then this bumps into the result 'arity' issue that still seems unresolved. i.e. Would it be accepting a list of expressions that are the arguments to an implicit tuple constructor or a single expression that may have multiple result values?

Perhaps the result expression should just be removed from br_table and perhaps also from br_if and br, and just depend on the use of local variables to pass values in these cases. What do people think?

rossberg added a commit that referenced this pull request Mar 7, 2016
Implement br_table; drop tableswitch
@rossberg rossberg merged commit 2a186db into master Mar 7, 2016
@rossberg rossberg deleted the br_table branch March 7, 2016 18:39
ngzhian added a commit to ngzhian/spec that referenced this pull request Nov 4, 2021
The "i32x4" case looks trivial now, but we will be adding i64x2 later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants