Define the behavior of "switch" #322
Comments
sgtm, it encodes both C-style switches and (some) functional-language ones adequately. Hm, it feels like there ought to be a compact binary encoding for common instances where none of the |
+1 to not requiring the default to be last, and to allowing it to fall through to other cases. I'm not tied to a specific AST representation of this, but we want this functionality. And +1 to giving switch a label, making cases fall through by default, and making the exits from a switch be breaks. Making switch cases be non-fallthrough by default is on many people's list of "things to fix if I ever write a new language", and for human-written languages I agree, but for WebAssembly, switch is the N-way dispatch construct, and it's nice to limit it to that and not also bundle in additional implicit branching. We're going to be using breaks to exit from switches to outer nesting levels anyway, so it's consistent to also use breaks to exit to the immediate nesting level too.
We have to make sure we're not getting into too much of a tangle for the [...]
I think we should avoid making switches into a swiss army knife of control [...]
So unless the implementation is going to generate a more efficient [...]
On Tue, Sep 1, 2015 at 5:03 PM, Dan Gohman wrote:
This proposal does exactly as your first two paragraphs suggest: it makes switch less of a tangle for the engine implementation, and makes it less of a swiss army knife of control flow constructs. I agree that there's little value in a construct which is just shorthand for an if-cascade, so I don't see a strong motivation for a lookupswitch construct. Producers can emit their own if-trees if they know they want a binary search tree. The switch proposed here is similar to table-switch. The value it adds to wasm is that it's a jump table. Having fallthrough just means that it doesn't also have implicit branching tacked on, so it's simpler. C, asm.js, and others can put a default case in the middle and it can fall through to other cases, so if WebAssembly can't do that, it'll force us to produce more awkward control flow.
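To make the jump-table versus if-tree contrast concrete, here is a minimal Python sketch. It is not WebAssembly and not taken from any spec; `table_switch` and `if_tree_switch` are hypothetical names used only for illustration. The first does one bounds check and an indexed lookup, the second does the binary search that a producer-emitted if-tree would perform over sparse keys.

```python
from bisect import bisect_left


def table_switch(index, table, default):
    """Jump-table dispatch: one bounds check, then a direct indexed lookup."""
    if 0 <= index < len(table):
        return table[index]
    return default


def if_tree_switch(key, sorted_keys, targets, default):
    """Sparse dispatch via binary search, i.e. what an if-tree computes.

    `sorted_keys` must be sorted ascending; `targets[i]` belongs to `sorted_keys[i]`.
    """
    pos = bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return targets[pos]
    return default
```

The table form only pays off when the keys are dense; for sparse keys the producer can choose the search tree itself, which is the argument against a built-in lookupswitch.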
LLVM's switch is actually even more expressive than this. Instead of requiring cases be contained within the body of a switch, a switch statement can directly branch anywhere. In the spirit of #299, the most expressive form of switch would be one where each arm is just a label of a block/statement to break from or loop to continue to.
Two downsides to LLVM's switch:
One alternative is to add a CFG node that is a bit like the loop+switch construct that Relooper generates for irreducible control flow. It would have an initial label, a label to exit the CFG, and some labeled basic blocks. It could easily compile to either a loop+switch in ASM.JS or basic blocks+branches in a native backend. LLVM's switch would be expressible as a switch within the initial expression or label of a CFG node, with each arm simply branching to one of the CFG labels. e.g.
in ASM.JS:
A more extreme possibility is: just use a CFG, but include Relooper classifications of the basic blocks so the backend can easily turn it into structured control flow if necessary. Or even just make the ASM.JS backend do its own Relooper analysis.
It would actually be fairly straightforward to add a return value to a #299 style switch. Such a switch is a break statement that, instead of taking a single immediate that specifies which block to break to, takes N immediates that specify N blocks, and an integer to select which one to break to. All we'd need to do is add a result value operand just like break's. LLVM's switch is actually even more general than that and can go "backward", and #299 has a distinction between forward and backward branching (viz. break and continue). One option is to ignore that generality; another would be to eliminate that distinction too and just use break for both forward and backward branching (and presumably break's result value would just be ignored when it's backward). Either way, lowering to asm.js switch is not complex; one can use a switch with cases that immediately do a break or continue with the appropriate labels. A notable difference between these #299 style approaches and generalized CFG approaches is that #299 manages to retain some of the advantages of well-structured control flow -- it can still be structured and analyzed hierarchically, loops are identified and guaranteed to be single-entry, etc.
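A rough Python sketch of the "switch as an N-way break" idea, under stated assumptions: labeled blocks are simulated with an exception, only forward branches (breaks) are modeled, and the names `Break`, `block`, `switch_break`, and `demo` are hypothetical, not part of #299 or any spec.

```python
class Break(Exception):
    """Simulated branch: unwinds to the enclosing block named `label`."""

    def __init__(self, label, value=None):
        super().__init__(label)
        self.label = label
        self.value = value


def block(label, body):
    """Run `body`; a Break aimed at `label` ends the block and yields its value."""
    try:
        return body()
    except Break as b:
        if b.label != label:
            raise  # aimed at an outer block; keep unwinding
        return b.value


def switch_break(selector, labels, default_label, value=None):
    """N-way break: pick a target label by index, then break to it with `value`."""
    if 0 <= selector < len(labels):
        target = labels[selector]
    else:
        target = default_label
    raise Break(target, value)


def demo(selector):
    def outer_body():
        def inner_body():
            switch_break(selector, ["inner", "outer"], "outer", value=selector * 10)

        inner_result = block("inner", inner_body)
        # Only reached when the switch targeted the inner block.
        return ("after inner", inner_result)

    return block("outer", outer_body)
```

With selector 0 the switch exits only the inner block and execution continues after it; any other selector exits both blocks at once, carrying the result value to the outer block just as a single break would.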
I meant the other way: that a value can't be returned out of such a switch. But you can do that by adding a label to the switch statement that's used as a target to "return" from the switch, I guess.
My example goes backward. The CFG node's basic blocks are in scope in the nested switch, so it both branches to basic blocks that precede it in the CFG, as well as the CFG's exit label.
I'm in favor of getting rid of the distinction between branching forward and backward (see #310 (comment)). The value operand in branch should match the type defined by the label being branched to. In
Can you elaborate on this? I don't understand how you'd do this.
The CFG expression I described is meant to work in conjunction with all the existing structured control flow operations. The idea is that a compiler backend uses something like Relooper to produce structured control flow, and for the irreducible case it falls back to a CFG expression rather than a var+loop+switch. On ASM.JS the CFG expression turns into the var+loop+switch, but for probably every other runtime it would map to something lower level.
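For readers unfamiliar with the var+loop+switch shape, here is a minimal Python sketch of that lowering. It is an illustration only: `run_cfg`, `block_a`, `block_b`, and the `"exit"` label are hypothetical names, and a dictionary lookup stands in for the switch.

```python
def run_cfg(blocks, entry, exit_label, state):
    """Lower a CFG to var+loop+switch: a label variable, a loop, and a dispatch."""
    label = entry                     # the "var": which basic block runs next
    while label != exit_label:        # the "loop"
        label = blocks[label](state)  # the "switch": dispatch on the label
    return state


# Two blocks that branch to each other. Because the cycle can be entered at
# either block, the control flow is irreducible.
def block_a(state):
    state["trace"].append("a")
    state["n"] -= 1
    return "b" if state["n"] > 0 else "exit"


def block_b(state):
    state["trace"].append("b")
    state["n"] -= 1
    return "a" if state["n"] > 0 else "exit"


BLOCKS = {"a": block_a, "b": block_b}
```

Each basic block returns the label of its successor, so a native backend could replace the dispatch with direct branches, which is the lower-level mapping described above.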
I mean that a #299 style switch can return a value: such a switch would just be an N-way break, so if break can return a value, then switch can too, in the same way.
It does, though it does so at the cost of introducing a generalized CFG even in cases where control flow could otherwise remain explicitly well-structured.
This is now a switch where the cases break to any label. Replace break with continue for backward branches, or eliminate the break/continue distinction to taste :-). The spirit of #299 is to collapse this idiom into a simple switch node that just directly branches places instead of cases that contain branches to places. I do think the idea of an embedded CFG-like construct is worth considering. The advantage of #299 is that it can always preserve well-structured control flow without falling back to a CFG-like construct when it isn't necessary.
When I say that switch can return a value, I mean to the enclosing expression, not to the labels it branches to. So you're right that you could define a switch as an N-way break that may pass values to the branch targets, but it would be pretty weird to be able to use that switch construct as an expression:
I see, so you mean that the switch would just be a table-based branch, but that it could still only target labels that are in-scope according to JavaScript-like structured control flow semantics. So a C-style switch statement might be translated like this:
That works, and it's appealing that it distills switch to just a table-based branch, but it's certainly confusing to read. I think the problem is that label defines a branch target at its end. You could redefine the label node to create a branch target to its beginning scoped to the expressions preceding it in a block. That would make the above example:
I like that. Much closer to the LLVM-style switch while still being easy to transform to JavaScript. It does complicate the scoping rules for the label's branch target, though. Then the CFG node I mentioned above might look like this:
So it would just use normal label nodes with more permissive scoping rules for their branch targets.
This should be resolved by #427.
The design is vague about how switches work. The spec interpreter does this:
More relevant context about the spec interpreter:
This issue is not about whether there should be a C-like concept of statements, which seems to be an open issue. If the spec ends up making switch a statement, this is irrelevant. But if not:
As it's defined in the spec interpreter, there isn't a direct translation to and from C-style switches: the default case must be the equivalent of the final case, and so cannot fall through to other cases. I believe the rationale for this design is to ensure a switch always yields a value of the type expected by its context.
Another way to achieve the same goal would be:
Instead of each non-fallthru case being an exit from the switch, the exits from this switch would be explicit breaks and the final case. This makes switch equivalent in flexibility to C-style switches, but allows it to still be used as an expression.
For example:
This would map 0=>100.0, 1=>100.0, 3=>300.0, and any other value =>200.0. Any value other than 0|1|2|3 would also set %b to 1. To implement this with the current spec interpreter's switch, you'd need to duplicate case 2's expression into the default expression.
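The proposed semantics can be simulated in a few lines of Python. This is a sketch under assumptions, not the spec interpreter: the case ordering below is one ordering consistent with the mapping just described, and `run_switch`, `brk`, `DEFAULT`, and `evaluate` are hypothetical names. It also assumes a default case is always present.

```python
class SwitchBreak(Exception):
    """Explicit exit from the switch, carrying the switch's result value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def brk(value):
    raise SwitchBreak(value)


DEFAULT = object()  # sentinel key marking the default case


def run_switch(selector, cases, env):
    """Start at the matching case (or the default) and fall through in order.

    The switch exits on an explicit break, or after the final case, whose
    value becomes the result.
    """
    keys = [key for key, _ in cases]
    start = keys.index(selector) if selector in keys else keys.index(DEFAULT)
    result = None
    try:
        for _, body in cases[start:]:
            result = body(env)
    except SwitchBreak as b:
        return b.value
    return result


def set_b(env):
    env["b"] = 1


CASES = [
    (0, lambda env: None),        # no break: falls through to case 1
    (1, lambda env: brk(100.0)),
    (DEFAULT, set_b),             # default in the middle: falls through to case 2
    (2, lambda env: brk(200.0)),
    (3, lambda env: 300.0),       # final case: its value is the switch's value
]


def evaluate(selector):
    env = {"b": 0}
    return run_switch(selector, CASES, env), env["b"]
```

Because the default sits before case 2 and falls into it, the expression producing 200.0 appears only once, which is the duplication a default-must-be-last rule would force.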