Clarify well-structured control flow #475
from outside it. This restriction ensures all control flow graphs are well-structured
in the exact sense as in high-level languages like Java and JavaScript. To
further see the parallel, note that a `br` to a `block`'s label is functionally
equivalent to a labeled `break` in high-level languages, that is, a `br` on a
"in high-level languages, thus a br simply breaks out of a block"
wdyt?
I'm not quite sure which part of the sentence you mean to edit to that, I can see more than one way to do it. I pushed an update which rewords part of it (which I agree makes it shorter and clearer). How is it now?
lgtm other than minor wordsmithing on the last sentence. |
3cf3436
to
abb1961
Compare
from outside it. This restriction ensures all control flow graphs are well-structured
in the exact sense as in high-level languages like Java and JavaScript. To
further see the parallel, note that a `br` to a `block`'s label is functionally
equivalent to a labeled `break` in high-level languages, that is, a `br` simply
This might read a little smoother if we change ", that is, " to "in that". That avoids the comma pauses. Otherwise lgtm.
Cool, done.
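For readers following the thread, here is a minimal JavaScript sketch of the parallel the patch text draws: a labeled `break` leaves the enclosing labeled block in one step, much as a `br` to a `block`'s label would. The function name `firstNegative` and the label `search` are made up for illustration and do not come from the design documents.

```javascript
// Hypothetical helper: returns the first negative number in a nested array,
// or null if there is none.
function firstNegative(rows) {
  let found = null;
  // `search` labels a plain block; `break search` exits it directly,
  // no matter how many loops are nested inside.
  search: {
    for (const row of rows) {
      for (const value of row) {
        if (value < 0) {
          found = value;
          break search; // analogous to a `br` targeting the block's label
        }
      }
    }
  }
  return found;
}
```

The jump only ever goes outward to the end of an enclosing block, which is the sense in which the control flow stays well-structured.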
abb1961
to
bdcc63a
Compare
lgtm. While we're clarifying, would you want to add examples of what's not possible? Or do you think that's too much? Maybe such a deep-dive should be in Rationale.md?
Rationale might be a good place to go into more detail, yeah. But it's actually not easy! :) Since what's not possible is jumping into a loop, but really, even that is possible if you use a control flow threading variable, so what we really mean is "jump into a loop without additional overhead" - but even that isn't strictly true, since the threading variable would likely be eliminated, as was mentioned back when we discussed this in more detail. So we kind of mean "not possible to jump into a loop using just Sorry for the lengthy paragraph, I've actually been trying to find a good way to say this - just like you, re-reading this section made me want to add something. But I can't find a way that is not too detailed while remaining correct. I think it might actually be best to stay at the more intuitive level, which is what we'll have after this pull: (1) we mention that entering the middle of a loop is not possible, which indeed in some sense it isn't, and (2) we mention the connection to high-level languages, since our limitation is completely identical to theirs, so all readers should understand what we mean easily.
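To make the "control flow threading variable" idea concrete, here is a rough JavaScript sketch under assumed names. It models a hypothetical two-entry loop (entry goes to either A or B, A goes to B, B goes back to A or exits) by recording the next block to run in a variable called `next`. The function `runTwoEntryLoop` and its parameters are illustrative only and do not appear in the design documents.

```javascript
// Hypothetical control-flow graph with two loop entries:
//   entry -> A or B,  A -> B,  B -> A (repeat) or exit
function runTwoEntryLoop(startAtB, iterations) {
  const trace = [];
  // Threading variable: names the block that should run next.
  let next = startAtB ? 'B' : 'A';
  let remaining = iterations;
  while (true) {
    if (next === 'A') {
      trace.push('A');
      next = 'B';
    } else {
      trace.push('B');
      if (remaining === 0) break; // edge B -> exit
      remaining -= 1;
      next = 'A';                 // edge B -> A
    }
  }
  return trace;
}
```

The loop itself has a single entry at the top, and the variable does the work of "entering in the middle", which is why the restriction costs expressiveness only in the literal sense.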
I'd be fine if tricky things like this were succinct in the main design, with a link to Rationale.md for more details. It sounds like what you're suggesting? p.s.: I'm amused at jumping into the middle of Rationale.md from another part of the design. It's meta. |
Basically I'm saying I don't know how to write this in the Rationale without going into that entire huge paragraph :) But if you think details like that make sense for Rationale, happy to add them, plus a link from here. |
Yes, I think it would be helpful: the current document isn't clear on how we got to the well-structured control flow design we have. The few of us involved in getting there have the context, but external folks don't. |
Cool, will start a followup pull request for that now. |
I propose AstSemantics.md just say "can't jump into the middle of a loop" and mean it in the literal sense, rather than trying to mean it in the sense which includes all things which are semantically equivalent to it :-). If we want to write a separate compiler-writers guide, that'd be a good place to explain the various options for lowering a loop with multiple entries. Also, ironically, the original text here was intended to try to ease the fears of compiler writers who aren't aware of the full power of labeled break, for whom emphasizing that "it's just like JS" is actually more confusing than enlightening. |
Created followup pull in #479. @sunfishcode: Is your proposed change in the last comment for this pull, or for the followup? |
I was addressing this patch. I was mainly agreeing that the larger paragraph above should go somewhere else besides AstSemantics.md :-). And concerning my other comment, from my perspective, drawing parallels to high-level language constructs in AstSemantics.md is a distraction, but I realize that perspectives will vary. |
Cool, yes, the content in that big paragraph is intended for Rationale.md, as suggested by @jfbastien. It's in #479 if you want to take a look. I think I got it better than that rambling massive paragraph here ;) but it's still not easy to summarize this stuff. I get the point that more intuitions might be a distraction for some readers. But I think those readers likely already would understand the topic, from the rest of the spec? Whereas the parallel to high-level languages would help a large class of other compiler hackers. |
Waiting for feedback from @sunfishcode. |
My feedback is that I personally think it's more confusing than enlightening. My impression talking to even some people who know JS well is that it's not widely known just how theoretically powerful labeled break is, because its full power is almost never used in hand-written code (for good reason, to be sure), so drawing a parallel to JS doesn't seem to convey the right idea. |
Yes, I agree not all JavaScript devs know about labeled break. Certainly many casual devs might not. But still quite a significant number of JavaScript (and Java) developers do; in particular, the ones writing compilers to and from JavaScript (and Java) would be very likely to. And as discussed earlier I think that's a very important audience for us.
Perhaps some of the rationale could be explained by the goal to optimize parsing and analysis and even runtime performance for simpler consumers? For example: 'Control structures that lead to more efficient parsing and control flow analysis are clearly identified and separated from those needed to support general control flow. While the general control flow operators could be used to specify all control flow, implementations would be expected to be slower parsing them and may well not optimize them well, so runtime performance may also be slower.'
@kripken Many people know how to exit from an inner loop of a nest using labeled break. However, many people I've talked to recently were not aware that labeled breaks can build arbitrary control-flow DAGs (if we treat |
Yes, it sounds like those people are in an intermediary stage between not knowing about labeled break, and fully grokking all the theoretical implications of what it implies. But the second part of the sentence confuses me?
I am quite curious to hear more about those people and their background, and what would help them better understand things. But this is starting to sound somewhat philosophical, and this pull request is just a small no-semantic-changes text clarification with several lgtms and no other objections - is it ok if I merge it (with your reservations as already noted, of course), and we'll continue the discussion separately? |
The original text here is talking about the ability to create any DAG, because that's one of the things that is possible to do without a helper variable. That's one of the things the original text is trying to point out, specifically to (briefly) assuage fears that wasm's structured control flow is too restricted by high-level language sensibilities. Immediately following this with a sentence likening wasm to high-level languages weakens what the original text is trying to convey. Another concern is that WebAssembly is a low-level language in general, but its AST structure has already led some people to think of it in high-level language terms in other areas, and it isn't a very good high-level language. The more we encourage thinking about WebAssembly literally in terms of JS or "high level languages" in the spec, the more we risk diluting wasm with conflicting purposes, potentially representing no purpose well.
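As an illustration of the DAG point, the following JavaScript sketch expresses a small hypothetical graph (A to B or C, B to C or D, C to D) using only nested labeled blocks and `break`, with no helper variable and no duplicated code. The function `walkDag`, its parameters, and the labels are assumptions chosen for this example.

```javascript
// Hypothetical DAG: A -> B or C,  B -> C or D,  C -> D
function walkDag(takeB, skipC) {
  const visited = ['A'];
  toD: {
    toC: {
      if (!takeB) break toC;  // edge A -> C
      visited.push('B');
      if (skipC) break toD;   // edge B -> D, jumping past C
      // falling out of this block is edge B -> C
    }
    visited.push('C');
    // falling out of this block is edge C -> D
  }
  visited.push('D');
  return visited;
}
```

Each forward edge is either a fall-through or a `break` to the end of an enclosing block, and nesting one block per merge point is what lets this pattern extend to larger DAGs.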
I certainly don't want to weaken what the original text is trying to convey. But I don't see how it does - looks the opposite to me - so I don't have any idea how to fix it. Do you have a concrete suggestion for how I can improve this pull? Happy to iterate on that with you. Regarding the second point, the addition here uses a specific analogy (and a 100% precise one) to explain a specific feature. It's not saying "wasm is high-level". But, if you feel we should clarify that wasm is low-level (I don't think we need to, but also I don't see the harm) then perhaps draft a separate pull request with that addition (for the FAQ maybe?) instead of opposing this one on a side issue? It also sounds like that side issue is a very big deal for you, so let's address it seriously and with the proper focus, on its own? |
I, for one, do think this PR provides a useful clarification. |
I have seen this list mentioned a few times around this project as something we should focus on. I think this says a lot about perspectives. I propose that a better list to think about is this list. AstSemantics does not define or explain itself in terms of JS, and I think this is an important invariant, to encourage us to think of WebAssembly as a new language. |
I have been convinced to stop opposing this PR. One comment I would add then is that labeled break is also a feature of somewhat less high-level languages such as Rust. |
bdcc63a
to
a181137
Compare
Sounds good, I added Rust and Go which you found have labeled break as well. Any other languages worth mentioning? |
The more we add, the more this should be in Rationale.md. AstSemantics.md is for "this is what things are", whereas Rationale.md is for "and here's why it's this way and what that implies". Maybe move all of this text (and the preceding paragraph) to Rationale.md? |
@jfbastien Personally I like to see rationale and 'implementation notes' in an annotated specification near where the issue is specified, but don't hold anything up on this account. @kripken Common Lisp has |
@jfbastien: not opposed, but I literally added two words and a comma :) I was hoping not to open a new discussion in this already-too-long-issue... |
@jfbastien: How about if I merge this and start a followup to move parts into Rationale+links to them? Or do you prefer I do that in this pull? |
sgtm |
Ok, merging this now, and will start on followup. |
…-flow Clarify well-structured control flow
I think it's helpful to add some clarification to what we mean by "well-structured", since that term - and similarly "irreducible control flow" and so forth - are not universally familiar, and actually have some different definitions (e.g. they can be relative to specific constructs).
In addition, this added text provides an intuition to help people more familiar with high-level languages understand our control flow. With this addition, I hope the text will be more accessible to a wider range of compiler hackers. In particular, it could help the large community of hackers currently compiling down to JavaScript, whom we would love to get compiling to WebAssembly eventually.
I believe this would also address most of my concerns on the break/branch topic (#445).