Flexible control-flow operators. #427

sunfishcode · 2015-10-23T22:02:18Z

This introduces br, br_if, and so on, while also preserving the
high-level if and if_else operators. It also converts switch into
tableswitch and generalizes it to support labels in enclosing scopes.

This introduces `br`, `br_if`, and so on, while also preserving the high-level `if` and `if_else` operators. It also converts `switch` into `tableswitch` and generalizes it to support labels in enclosing scopes.

sunfishcode · 2015-10-23T22:10:32Z

Updated PR to be independent of statements=expressions.

rossberg · 2015-10-23T22:54:07Z

I'd like to understand the motivation for tableswitch better. Isn't that simply a more convoluted way of writing a C-style switch? As far as I can tell, it is completely equivalent, the only differences being stylistic. Namely, (1) all cases have to be written with an indirection through an extra label instead of using case directly as the label, (2) it forces to list all cases between 0 and the max label explicitly instead of having omitted ones go to default implicitly, and (3) it can jump to outside labels directly instead of using the existing branch construct for that.

(1) and (2) are effectively just changes in syntax, not semantics. Of course, this representation makes sense on the encoding level, but on the AST level, it just makes the construct (significantly) more verbose and more complicated to describe and spec. (3) on the other hand is just syntactic sugar, analogous to br_if.

So in terms of AST semantics and spec factorisation, tableswitch is strictly inferior to a C-style switch, while being functionally equivalent. I think we should treat the table representation as an encoding concern, not a semantics concern.

sunfishcode · 2015-10-23T23:18:49Z

This tableswitch has one area of flexibility beyond what a C-style switch has: it can directly branch to labels outside its own body.

It also has a notable restriction in that it requires the index space to be zero-based and dense. This makes it more directly correspond to an actual jump table, as reflected by the rename to tableswitch.

rossberg · 2015-10-23T23:24:10Z

Yes, I understand, but as I tried to point out above, that seems more like an encoding consideration, and not like something that is necessarily relevant on the semantic abstraction level of the AST.

sunfishcode · 2015-10-23T23:28:21Z

I don't see how the semantics of where a tableswitch can branch to is an encoding consideration. Similarly, the index space determines how tableswitch interprets its operand, so it's also part of the semantics.

rossberg · 2015-10-24T01:06:49Z

As mentioned above, the ability to jump to an external label is merely syntactic sugar over a break (and limited one, because there is no way you can make it transfer a value). As such, it is primarily an encoding consideration as well.

Can you explain what you are referring to as index space, and how it affects the operand?

qwertie · 2015-10-24T03:42:36Z

We discussed tableswitch earlier in #322. It is advantageous over the old switch because

tableswitch is able to represent the vast majority of C switch statements, unlike the old switch (IIUC, some C switch statements such as duff's device are still not allowed)
Using a dense rather than sparse jumptable allows a Wasm JIT or interpreter to be very simple, which ~~maximizes~~ minimizes startup time. The optimization problem of how to implement the switch statement (as a collection of jump tables, gotos (breaks) and if-elses) is shifted from JIT-time to compile-time.

sunfishcode · 2015-10-24T03:50:39Z

@rossberg-chromium One can express any reasonable switch-like construct as syntax sugar on top of any other plus extra nodes. As such, I don't see why this is a reason to prefer any one over any other.

Also, Statements == Expressions is an orthogonal concern. This tableswitch can still transfer values, just as br can, if we choose to do that.

The index space of tableswitch is the integer sequence starting at 0 and continuing for some finite length. A tableswitch operator defines a mapping from values in its sequence to labels, and it uses this mapping to compute a label, given its operand.

rossberg · 2015-10-24T07:47:08Z

@qwertie, yes, I was comparing with a C-style switch, which I'm happy moving towards. Density is an independent consideration.

@sunfishcode, re index space, that's what I guessed, but then I still don't understand your earlier remark on that.

It may help to point out that we are actually discussing syntax here rather than semantics. I am fine with the goal of increasing the flexibility of the switch operator. But I am puzzled how the explicit syntactic indirection through extra labels helps at the AST level.

Let's be concrete. Consider a switch as described in this proposal:

$label: (switch (operand)
  ($case0 $case1 $case2 $label $case4 $case5) $default
  (case $case2 (a))
  (case $case1 (b))
  (case $case5 (c))
  (case $case4)
  (case $default (d))
  (case $case0 (e))
)

Here is how I would design the concrete S-expr syntax:

$label: (switch (operand)
  (case 3 br $label)
  (case 2 (a))
  (case 1 (b))
  (case 5 (c))
  (case 4)
  (default (d))
  (case 0 (e))
)

That is expressing exactly the same structure but with much less repetition. You can still represent it as a table of labels in the binary encoding, but as a textual or AST-level representation it is superior in just about every way I can imagine, for both producers and "consumers", including the spec.

Note that the density requirement is independent from this choice of syntax. I don't see a particularly strong reason to require listing the (case 4) explicitly in the latter form, but there is nothing preventing us from requiring it if we want to emphasise the table nature.

Note also that I even turned the break to the outer label into special syntax for case, that's different from using a separate br operator (which would be parenthesised). It obviously is redundant to introduce that, but there you go.

So, again, this is really about syntax, not semantics. AFAICT, the latter way of structuring the operator syntax is more convenient on all levels. I'd like to understand what advantage you see in the former notation. Or would you be fine with the latter as well?

sunfishcode · 2015-10-24T16:53:32Z

I've now added a patch to this PR converting it to what I think you're asking for. Is this what you had in mind?

In my opinion, this patch makes the prose harder to read for humans, makes it harder for me to convince myself that the semantics aren't incomplete or inconsistent, and it puts greater distance between the design-document AST representation and what I might imagine we'll want in the binary encoding.

rossberg · 2015-10-26T14:41:23Z

@sunfishcode, cool, thanks! And for your pleasure, I already implemented it: WebAssembly/spec#153 :)

sunfishcode · 2015-10-26T14:44:20Z

I decided I didn't like it, so I removed it now.

rossberg · 2015-10-26T15:47:07Z

I see. :( Well, please have a look at the spec PR. AFAICS, with this structure, it is as simple as it will get. Tests also remain more readable. I hear what you are saying re prose and distance to binary encoding, but I think those are far outweighed by familiarity and structural simplicity, respectively.

titzer · 2015-10-26T17:49:49Z

AstSemantics.md

+References to labels must occur within an *enclosing construct* that defined
+the label. This means that references to an AST node's label can only happen
+within descendents of the node in the tree. For example, references to a
+`block`'s label can only happen from within the `block`'s body. In practice,


"happen from within" => "occur within"

titzer · 2015-10-26T17:53:24Z

This mostly looks good modulo some wordsmithing.

titzer · 2015-10-26T17:55:20Z

@rossberg-chromium You've mentioned a C-style switch as a basis for the tableswitch design. Here we are being a little bit more explicit with labels, which pushes a bit of work up to producers, at the gain of being explicit for backends and engines. Is this a reasonable tradeoff in your opinion? Or are you proposing that the syntax of the AST is C-like and the binary encoding is table-switch like?

kg · 2015-10-26T18:36:37Z

@sunfishcode Have you done much research into how this would impact pre-compression and post-compression file size (in the asmjs packer, for example?) The significant changes to ASTs this implies could mean a big file size regression (or a big file size win, in which case that alone is a compelling reason to make your changes!)

sunfishcode · 2015-10-26T19:34:29Z

@kg Looking at some asm.js code of some miscellaneous applications, if statements are over 6x more frequent than loops, and over 300x more frequent than switches. This iteration of the proposal still has if and if_else. And, I expect no difference in the size of loops in the common case of optimized code. Consequently, I'm assuming this doesn't introduce a big file size regression or win.

sunfishcode · 2015-10-26T19:51:13Z

Wordsmithing changes applied.

kg · 2015-10-26T21:01:54Z

The question is about the binary encoding. Are you saying if and if_else would be opcodes in the binary representation? I thought they were syntax sugar?

ghost · 2015-10-26T21:25:49Z

@kg I think this has been answer elsewhere - syntax sugar is a spec matter and there are still opcodes for the sugar.

rossberg · 2015-10-27T12:59:52Z

On 26 October 2015 at 18:55, titzer [email protected] wrote:

@rossberg-chromium https://github.com/rossberg-chromium You've
mentioned a C-style switch as a basis for the tableswitch design. Here we
are being a little bit more explicit with labels, which pushes a bit of
work up to producers, at the gain of being explicit for backends and
engines. Is this a reasonable tradeoff in your opinion? Or are you
proposing that the syntax of the AST is C-like and the binary encoding is
table-switch like?

Yes, the latter. There is no real benefit in introducing explicit labels as
jump targets on the AST level -- quite the opposite, in fact. A
C/JavaScript-style switch notation is much more manageable when formulating
semantics, or writing/reading examples.

rossberg · 2015-10-28T16:42:34Z

WebAssembly/spec#153 now matches everything described by this PR (I think), modulo the notation for tableswitch.

I did some more pondering over the latter, esp looking into how the proposed notation would affect the spec. Honestly, it would be quite ugly. :( In addition to tracking the extra indirection, it would require distinguishing two different kinds of (raw) labels in the AST (since tableswitch addresses its inner targets differently from external targets) as well as different scoping rules for those labels' symbolic representation.

In other words, my assessment still stands that this notational factorisation for tableswitch is suboptimal. Unless we have strong technical reasons to do it that way, I highly recommend more streamlined notation. @sunfishcode, do you think you can still warm up to that?

sunfishcode · 2015-10-29T13:04:02Z

My attempt above at implementing what I thought you were asking for made the prose about twice as long, and I've now discovered, as I feared, that it did have a bug hidden in the extra complexity: it didn't address whether case and default could fall through to case_br and default_br.

Right now, I am very confused as to the intended relationship between AstSemantics.md, ml-proto's s-expression syntax, what we might expect to call the binary format AST, and what we might expect in the actual binary format. This confusion is making it difficult for me to figure out what you're asking for here, and what it means for other parts of the system.

I propose we merge this PR as a starting point, and then you can submit a new PR after it proposing the specific changes what you would like to see. I think that will help me understand better how you want this to look, rather than having me try to write up what I think you want. Does that make sense?

rossberg · 2015-10-29T16:25:18Z

@sunfishcode, okay, sounds good to me.

sunfishcode · 2015-10-30T15:33:27Z

Ok. Merging.

Flexible control-flow operators.

Flexible control-flow operators.

2e1ccb1

This introduces `br`, `br_if`, and so on, while also preserving the high-level `if` and `if_else` operators. It also converts `switch` into `tableswitch` and generalizes it to support labels in enclosing scopes.

sunfishcode force-pushed the new-control-flow branch from 5fde46e to 2e1ccb1 Compare October 23, 2015 22:09

sunfishcode force-pushed the new-control-flow branch from 6b70f43 to 2e1ccb1 Compare October 26, 2015 14:43

titzer reviewed Oct 26, 2015
View reviewed changes

"happen from within" => "occur within"

74c21c8

sunfishcode force-pushed the new-control-flow branch from 74029cc to 74c21c8 Compare October 26, 2015 19:47

rossberg mentioned this pull request Oct 28, 2015

Implement tableswitch and flexible control flow WebAssembly/spec#153

Closed

sunfishcode closed this Oct 29, 2015

sunfishcode reopened this Oct 29, 2015

lukewagner mentioned this pull request Oct 29, 2015

Define the behavior of "switch" #322

Closed

sunfishcode added a commit that referenced this pull request Oct 30, 2015

Merge pull request #427 from WebAssembly/new-control-flow

ca9fc75

Flexible control-flow operators.

sunfishcode merged commit ca9fc75 into master Oct 30, 2015

sunfishcode deleted the new-control-flow branch October 30, 2015 15:33

This was referenced Oct 30, 2015

Added else child to the if statement. #400

Closed

merging loop constructs #261

Closed

rossberg mentioned this pull request Nov 3, 2015

Streamline tableswitch notation #443

Closed

sunfishcode mentioned this pull request Nov 6, 2015

Break / Branch? #445

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Flexible control-flow operators. #427

Flexible control-flow operators. #427

sunfishcode commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 24, 2015

qwertie commented Oct 24, 2015

sunfishcode commented Oct 24, 2015

rossberg commented Oct 24, 2015

sunfishcode commented Oct 24, 2015

rossberg commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

rossberg commented Oct 26, 2015

titzer Oct 26, 2015

titzer commented Oct 26, 2015

titzer commented Oct 26, 2015

kg commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

kg commented Oct 26, 2015

ghost commented Oct 26, 2015

rossberg commented Oct 27, 2015

rossberg commented Oct 28, 2015

sunfishcode commented Oct 29, 2015

rossberg commented Oct 29, 2015

sunfishcode commented Oct 30, 2015

Flexible control-flow operators. #427

Flexible control-flow operators. #427

Conversation

sunfishcode commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 23, 2015

sunfishcode commented Oct 23, 2015

rossberg commented Oct 24, 2015

qwertie commented Oct 24, 2015

sunfishcode commented Oct 24, 2015

rossberg commented Oct 24, 2015

sunfishcode commented Oct 24, 2015

rossberg commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

rossberg commented Oct 26, 2015

titzer Oct 26, 2015

Choose a reason for hiding this comment

titzer commented Oct 26, 2015

titzer commented Oct 26, 2015

kg commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

sunfishcode commented Oct 26, 2015

kg commented Oct 26, 2015

ghost commented Oct 26, 2015

rossberg commented Oct 27, 2015

rossberg commented Oct 28, 2015

sunfishcode commented Oct 29, 2015

rossberg commented Oct 29, 2015

sunfishcode commented Oct 30, 2015