can block labels optionally go on end? #372
Comments
Hm, I’d much rather not. In Wasm’s structured control, labels index block constructs, not positions in the instruction stream. They are JavaScript-style labels, not C-style labels. To pretend otherwise is misleading IMO and would just invite misunderstanding and confusion. Because it would match neither the binary format nor the abstract syntax. It also does not fit the rest of the concrete syntax at all, where every other symbolic name is bound by placing it directly after the keyword introducing the thing it names. And it doesn’t mesh with the placing of block signatures, which determine the type of the label. (Also, there is no To address specific examples like the one you give (which hopefully won’t be written by hand much), wouldn’t it be sufficient to just repeat the label at the end, in a comment?
I would hope that browsers showing code would do something along these lines anyway, not just for |
Yes, but a block is composed of two lexical parts and this just changes which part the label is on; it's still putting the label "on the block".
I don't think that's a relevant consideration for this textual language that is distinctly neither.
Labels don't exist in the binary format, they are present only for readability; I think putting the label where your eye naturally wants to jump significantly enhances readability.
But that's not what we're rendering in browsers so it doesn't really matter.
I'm not considering the writing-by-hand use case, I'm talking about fixing the very real problem you can see today that, if you open any non-toy wasm in devtools, your code goes diagonally off the screen if the function happens to contain a non-trivial switch statement. It's very jarring and it's nothing you'd expect from either a high-level language or a low-level assembly. I am sure that as soon as anyone starts doing any debugging this will be one of the first walls they hit. I think we must fix this problem. |
But isn't that mostly a question of layout, which is a separate issue? Why wouldn't the comment solution work just as well in combination with a suitable layout heuristic? It would require such a heuristic either way.
Well, as I argued in the other paragraph, it would still be at odds with everything else in the syntax, and separates it weirdly from the associated type annotation.
Hold on, labels stand for concrete indices. Having some labels appear in line at the top (loops) and others out of line at the bottom doesn't square with how the indexing works. Mapping the textual form to the underlying structure it is supposed to represent would become quite confusing.
Sure, I was just saying that to point out an analogy, not because I consider looking like JavaScript relevant as such. |
I could go either way on this, but Luke's proposal does have the nice
|
Yes, technically one could do the same layout if the labels were on the blocks, but good luck reading that. Really, have you tried to read a switch statement with more than a few cases? It's a nightmare.
That would be better than nothing, but if we all do the layout I'm suggesting above (or some variation thereof) and add the label in comments (can you really imagine not doing so, given the unreadability?), then it will be silly that we can't just express the label directly without using comments and naming every label twice.
That's a good point, we should allow the type to go on the
Labels stand for concrete indices at the uses, but the label on the block represents no index immediate so putting it at the beginning or end makes no difference w.r.t the serialization and from an AST POV it's the same node and the same label.
I can't see how this is confusing. Is there perhaps some underlying issue, like something that requires more work in spec/interpreter or the formalism, that is the real root cause for your objection? @titzer Glad you think it would be a readability improvement too. Due to forward references of functions and other named things, we already require multiple passes when converting text to binary. E.g., in SM's text-to-binary, we have an intermediate "resolve names" pass that this would drop into easily. |
How about this: in syntactic analogy to branches, we (optionally) allow The only thing this doesn't give you is avoiding repetition of the label, but arguably the repetition improves readability even more in many cases (various langs require that kind of name repetition). WDYT? Some more comments below.
Oh no! It also is the block signature. Moving that around inconsistently is worse, and will make little sense in the light of desirable generalisations, such as allowing block function signatures.
Not so, it would immediately affect
No binding of a symbolic identifier is an actual immediate, label or otherwise. Yet they all resolve to raw numbers in order of textual appearance (for labels, that order is reversed, but it's still in order). Moving some label binders out of line would break that.
The formalism is unaffected since it doesn't have symbolic labels, and it should be easy in the interpreter as well. No, I just think that concrete syntax should be regular and match abstract syntax. Especially in Wasm, where the abstract syntax materialises quite concretely in the binary format, which the text format is supposed to represent. |
I mean, that'd be better than comments, but, as you say, it still seems like unnecessary noise when blocks are stacked, as I showed in my original post. I can actually see putting labels on both ends enhancing usability for non-switch use cases (like a really long block), though, so that seems fine to allow. I mean, given the many other syntactic sugars, can't one consider
Yes, by "Blocks" of course I meant "Blocky things". Label-on-
That's just an optimization of the more intuitive two-step process of attaching names to block nodes in the AST and then computing depths on the AST; I don't think that should override usability. |
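To make the "two-step process" concrete, here is a minimal sketch in Python. It is not code from SpiderMonkey, the spec interpreter, or any other tool mentioned in the thread; the `Block` and `Br` classes and the `resolve` function are hypothetical names invented for illustration. The point it tries to show is that once a label name is attached to the block node, the depth computation no longer depends on whether the text spelled the name at the top or at the end.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Br:
    """A branch that names its target label symbolically."""
    label: str


@dataclass
class Block:
    """A block node. `label` is the same field whether a parser read the name
    after `block` or (hypothetically, per the proposal) after `end`."""
    label: Optional[str]
    body: List[Union["Block", Br]] = field(default_factory=list)


def resolve(node, enclosing=()):
    """Return the relative depth of every branch, in textual order."""
    if isinstance(node, Br):
        # Depth 0 is the innermost enclosing block, so search from the inside out.
        for depth, name in enumerate(reversed(enclosing)):
            if name == node.label:
                return [depth]
        raise KeyError(f"unbound label {node.label}")
    depths = []
    for child in node.body:
        depths += resolve(child, enclosing + (node.label,))
    return depths


# Two nested blocks; the inner one branches to $outer, then to $inner.
tree = Block("$outer", [Block("$inner", [Br("$outer"), Br("$inner")])])
print(resolve(tree))  # [1, 0]
```

The counter-argument in the thread still applies: a single-pass resolver that binds names in textual order cannot work this way if the binder only shows up at the end of the block.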
I agree that it is more readable to put the label at the end, except for That said, I dislike the idea of making the location of the label arbitrary (i.e. if it were sugar for
or this:
Putting the label in both places fixes this, but it's pretty ugly IMO. I suppose it's nicer than the comment in that it can be validated. OTOH, if we are talking about generating this for viewing in the browser, we can potentially do better than comments or labels: you can highlight the branch location on mouse-over, use arrows, add color, etc. As for block signatures, I've found that people have been confused by |
@lukewagner, is your only concern with it verbosity, then? I mean, there is a lot of verbosity in the text format, avoiding it has not been a priority so far, AFAICT. And here it would even be optional. Sugar or no sugar, I'd really prefer not to mess up binding order and natural scoping rules (keep in mind that the label name scopes over the body only, which would also seem non-obvious if its binder was somewhere else). @binji, is it really that ugly to reference the label at Just trying to find something that's a reasonable compromise... |
@rossberg-chromium I mostly just want wasm's text language to not be terribly worse than a traditional linear assembly language and the diagonal-of-doom was the first thing to stick out. So I've been thinking about this in terms of experienced users who understand the wasm block/loop rules, but @binji makes a good point that we actually have an opportunity to remove one significant bump in the wasm learning curve by regularly putting labels where control flow goes (the top of loops and the |
I think a benefit to @rossberg-chromium's option of having the labels at both the top and the bottom is that it helps answer the question of "how are these nested?", which is separate from "where does it jump to?":
Without the labels on top and bottom, we'd have
and then if someone wants to find where |
For that purpose, I'd agree with @binji that what you'd want is your text editor to highlight the other one, as we are accustomed to today with parens and curlies. FWIW, I was only proposing adding labels in the linear format, where there is a separate |
@rossberg-chromium OK, it's not that ugly. But the only languages that are coming to mind are CMake and BASIC, so... not that pretty either. :) TBH, I don't have a strong opinion about this. In terms of the options we've discussed: 1) keeping it the same 2) putting labels at the branch destination 3) labels at the top, or at the top and bottom 4) putting labels anywhere, I would rank them as 1 or 2, then 3, then a very distant 4. |
@binji: Modula, Ada, Oberon, Dylan, AppleScript, Clu... :) Of the options you enumerate, I'd be equally fine with either 1 or 3, but
|
@binji Perhaps then we could reconsider option 4 so that @rossberg-chromium can put labels where they have the most Purity of Essence while browser devtools can render what is most efficiently readable. That would at least allow a given tool to be consistent (always-top or always-at-destination). |
For a linear assembler code I see merit in having the labels at the bottom, at the control flow target. But from a data flow perspective, where the block may well be embedded within an expression, it seems much easier to read having the label at the start of the block scope so that the reader can see where the data goes. How about for the linear text format presenting the label at the end when the block returns no values - this would fit common I see no harm having different presentation patterns for the same effective code if that helps the reader! |
@lukewagner, option 3 seems even more readable and less unpopular. :) |
@rossberg-chromium Yes, I would be in favor of 3 were it not for the fact that it will undoubtedly cause otherwise-unnecessary confusion with readers because this one branch has two apparent targets. I think we shouldn't inhibit what will be the ideal presentation for a large set of users who won't have any sympathy for "yes, but wasm is different you see because..."; putting a label where control flow jumps isn't a radical idea. |
@lukewagner The reader will also want to see clearly the scope of the block, where it unwinds to on exit, and the label at the start will help the reader see this. I think you are being a little too critical of having a label at the start of the block, as if there is no value for the reader at all, and this is not the case. Consider when the |
@lukewagner, I am all for avoiding confusion, but with the exact opposite conclusion. It behaves quite differently from the usual jump(*), so if it also looks accordingly different then that is a Good Thing -- we should prevent misconception about what is really going on, not promote it. That's on top of all the technical arguments. Labelling the block you exit is not a radical idea either, and the closer precedent. (*) In particular, a branch manipulates the operand stack, and in a way that depends on the block entry point -- you cannot infer that without looking there. Just considering the special case of a branch with no arguments and an empty stack is misleading. |
@rossberg-chromium (Setting aside whether using labels to label where they jump to in an assembly language is the closer precedent) If block signatures are always at the top (which makes sense since that does reflect their actual binary encoding order), that is a fair point; users may need to jump to the top on occasion. I was also thinking that we can help along new readers by having the syntax highlighting applied by devtools to the displayed wasm text use a more subdued style for |
@lukewagner, I like the idea of using different syntax highlighting to distinguish where control is going. Anyway, #378 :) Btw, what I meant re the entry point is that it is relevant even if the block signature is placed elsewhere (or empty) -- the signature only determines what values are kept on the stack, but you still need to know the stack at the entry point to determine how much is being discarded. |
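The entry-point argument can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not the spec's formal semantics: the function name `branch`, the list-based stack, and the `entry_height` and `arity` parameters are all invented here. It assumes the label's type fixes how many values are kept (`arity`) and that everything else pushed since the block was entered is dropped.

```python
def branch(stack, entry_height, arity):
    """Model a branch out of a block: keep the top `arity` values and drop
    everything else pushed since the block was entered.

    `entry_height` is the operand-stack height when the block was entered.
    It is not written at the branch site, so a reader has to recover it
    from the start of the block.
    """
    if arity < 0 or len(stack) - entry_height < arity:
        raise ValueError("not enough operands for the branch")
    kept = stack[len(stack) - arity:]
    discarded = len(stack) - entry_height - arity
    return stack[:entry_height] + kept, discarded


stack = [10, 20, 30, 40]
# Same stack, same arity, different entry points: different amounts discarded.
print(branch(stack, entry_height=1, arity=1))  # ([10, 40], 2)
print(branch(stack, entry_height=3, arity=1))  # ([10, 20, 30, 40], 0)
```

In this model the same branch discards two values or none depending only on `entry_height`, which is the information found at the top of the block rather than at its end.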
Indeed, there are multiple reasons one might care to visit the beginning of a block. |
... and to finish that thought: if I'm at a branch and want to go to either, it's nice to be able to |
Co-authored-by: Thomas Lively <[email protected]>
Fix various places in the spec text and interpreter where binary encodings were inconsistent with those we decided on in WebAssembly#372.
Being able to put a named label on end would, I think, make wasm textual representation for switches like those in switch.wast a lot more readable. With this change and some whitespace heuristics, switch could look a lot nicer
instead of the current:
(imagine a switch with hundreds of cases; the diagonal!)
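As a rough illustration of the diagonal, the Python sketch below prints a toy switch with one nested block per case and naive one-step-per-level indentation. It is not the output of any real disassembler or devtools view: the `nested_switch` function and the `$caseN` label names are made up, the `br_table` operand is omitted, and the last target simply stands in for the default.

```python
def nested_switch(cases, indent="  "):
    """Emit a toy flat-syntax switch: one block per case, with each nesting
    level indented one step further than the last."""
    lines = []
    # Open one block per case; the outermost block belongs to the last case.
    for depth, i in enumerate(reversed(range(cases))):
        lines.append(f"{indent * depth}block $case{i}")
    # The dispatch sits at the deepest nesting level.
    targets = " ".join(f"$case{i}" for i in range(cases))
    lines.append(f"{indent * cases}br_table {targets}")
    # Close the blocks from the inside out; code for case i follows its `end`.
    for i in range(cases):
        depth = cases - 1 - i
        lines.append(f"{indent * depth}end ;; $case{i}")
        lines.append(f"{indent * depth};; code for case {i}")
    return lines


print("\n".join(nested_switch(3)))
```

With this layout the dispatch line is indented by one step per case, so a switch with hundreds of cases starts hundreds of levels to the right. The trailing `;; $caseN` comments correspond to the "repeat the label in a comment" workaround discussed in the comments.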