Local definition clarification #499

Closed

@jcbeyler commented Dec 9, 2015

Clarify that locals are virtual registers and are not required to be translated into stack slots in the generated code.

jcbeyler commented Dec 9, 2015

This was not immediately clear to me, so I am adding a clarification.

@@ -225,6 +226,9 @@ The details of index space for local variables and their types will be further c
e.g. whether locals with type `i32` and `i64` must be contiguous and separate from
others, etc.

Finally, because local variables can be seen as virtual registers, `get_local` and

This is a lot of low-level detail that is probably only interesting to engine implementors and compiler people, so I don't think that it is necessary here.

@jfbastien
Member

Maybe put this in Rationale.md? I think we need to clarify why we have expression trees (which use the latest computed value) as well as get/set local and load/store, but I don't think this belongs in AST semantics. This could then add details interesting to implementations such as register coloring.

I think "virtual register" is exactly the right wording, considering our usage of "virtual ISA" to describe WebAssembly.

ghost commented Dec 11, 2015

@jfbastien The term 'local variable' fits just as well if not better than 'virtual register'. The ISA reference was a replacement for IR and neither seems appropriate to me except from the perspective of the use case of source-to-source compilation. I don't think it is constructive to be promoting language in the specification that fortifies a virtual-machine-code interpretation of the language when no one has justified this position versus a clear source code story for the web.

The key issue here is that we can have a clear source code story for the web, have a binary blob that is a one-to-one lossless encoding of a text source code, without compromising the use case of being a target for source-to-source compilation. No one has been able to dispute this technical point, so in fairness I ask the group to accept this as plausible even though not demonstrated yet (we don't have a text source format).

With this accepted, the language is not an 'assembly language'. I refer to the description at https://en.wikipedia.org/wiki/Assembly_language 'Assembly language is converted into executable machine code by a utility program referred to as an assembler; the conversion process is referred to as assembly, or assembling the code.' Since the deployed blob can reasonably be considered equivalent to the text source code, it does not fit this generally understood definition. I understand that there have been many assemblers that have higher level language features, but they target a machine code and this distinction is absent in the binary blob being deployed. It is the client runtime that compiles the deployed code 'into executable machine code' not the encoder that packs the binary source code. Sorry it just does not fit.

With this accepted, the continued interpretation of wasm as an ISA can only be seen as an extreme position, and one that I do not believe is in the health of the web. Perhaps you have reached the current position by pooling ideas from asm.js which practically constrains the language to using expressions and from llvm's IR which suggested an object code, plus adopting rather unfitting analogies to byte-code from other related technology. However what I am seeing is a complete unwillingness from yourself and a core subgroup to recognize that the development can be reconciled with a clear source code position for the web, as evidenced by their silence and comments such as yours above on this patch.

I believe the group can expect Chairs to engage in reaching consensus on technical matters and would choose Chairs that have the technical skill to do so, so a continued silence from the Chairs makes their positions less and less tenable. You are a Chair, have been silent on the matter, and yet are promoting the fortification of a disputed position - can you see that this is not particularly constructive.

ncbray commented Dec 11, 2015

@JSStats

I get that you are frustrated. I don't think anyone wants to ignore you; I think people don't really understand what point you're trying to make. Being understood is a struggle for pretty much everyone. Communication is difficult. I will say that assuming a lack of comprehension (by everyone involved, including myself) rather than malice makes me a happier person.

It is possible to view the same thing in two different ways. Portable native code is not really source code and is also not really machine code. Asserting that WASM is not an assembly language sort of misses the point because it is simultaneously right and wrong. Different folks will use different analogies to understand what portable native code actually is. Trying to change people's mental models directly pretty much never works. Discussing implications, use cases, and outcomes is much more effective. "unfitting" implies there is an undesirable outcome that emerges, somewhere.

I suspect the use case you're differing on is that textual source with arbitrary identifier names and comments can be converted to binary and back, and the names and comments recovered. This can be rationalized both through the language analogy and the machine code analogy. Currently, WASM has fixed register names and no comments. If I am understanding what you're advocating, from the machine code perspective you're asking for nameable registers and associating opaque blobs with arbitrary locations in the program. Make a case for that?

ghost commented Dec 12, 2015

@ncbray It does not seem that difficult a technical matter to communicate. To help me understand where the message is being lost could I ask if you understand the technical question: Can we have a clear source code story for the web, have a deployed binary blob that is a one-to-one lossless encoding of a text source code, without compromising the use case of being a target for source-to-source compilation?

Can you answer this question at this point? Would you need more time (perhaps time you don't have)? Would you like more information? Would you consider it plausible but reasonably ask for a demonstration? Do you consider it completely implausible?

Would you concur that a compressed text source code file is still the same source code, yet would be a binary?

Would you concur that just being a binary does not reasonably make it a virtual-machine-code or the compression or encoding process Assembly?

Would you concur that the deployed blob being developed here could have this same property?

Please note my above feedback on this patch was that it would basically lgtm if only it qualified the wording as being from the perspective of source-to-source compilation - is that really all that unreasonable?

Describing the current local-variables as 'registers' is really not a good fit. In the target ISAs (x86 etc) registers are global thread-local storage, whereas the local-variables have lexical scope within a function. I do understand that an ISA could have function local registers or even have no registers etc - but these are not the target ISAs for this project.

I do support the use case of source-to-source compilation and I am happy for the specification to be well annotated with such use notes but it could be done by qualifying interpretations of local variables as registers for this use case etc.

The focus on the deployed binary being a virtual-machine-code and the encoding being Assembly, excludes the clear source code story for the web. I am here making the case for a clear source code story for the web, and explaining the position and not ignoring people, but I don't think it is fair that the case for the opposing view need not be made?

Can you make the case for the virtual-machine-code story for the web, and for excluding the clear source code story for the web in the process?

ncbray commented Dec 12, 2015

@JSStats - an implicit point I was making was that this thread really isn't the place to discuss it. I believe you're fighting a proxy war, here, about a core concept of the project. :) Calling it a virtual register isn't the core issue.

Dealing with that, first. "virtual register" is an analogy to compiler IR that is not yet target specific. Which is what WASM is, in a way. An ISA-specific "register" (machine register?) is not a solid analogy. virtual registers are pretty much always function scoped. This is one of those ways that "portable native code" differs from "machine code". When threading is added, we'll likely need to add the concept of thread-scoped virtual registers for implementing stack pointers, etc. So, while I can see "register" as objectionable when equated to "machine register", if you say "virtual register" to a C compiler person, I believe they'll know what you're talking about. If I wanted to be a jerk, I'd claim that "local" implies it lives in memory and can be aliased. Or that it lives in a dictionary because I am thinking of Python locals. Or is someone who lives in a particular city. Any term is unreasonable if you bring the correct baggage with you. ;)

And for the rest of it. Just advice and background because I don't think this thread is the right place to hash things out.

I believe that when you say "source code" this is a fairly loaded term that you need to break down into smaller concepts. It holds a lot of meaning to you, and I am pretty sure it means different things to me. "lossless" is also loaded. I don't think that calling WASM "source", "machine code", or "assembly" particularly matters. WASM is specifying the behavior of a program, call it what you will. Binary vs. text is just an encoding issue. Binary <=> text should be lossless, but that does not necessarily imply that the text has all the degrees of freedom you would want from a human-oriented language, such as being able to name variables. (Imagine if "go fmt" stripped comments and renamed variables based on the order they were declared. Is this no longer "source"?) WASM is designed to be generated by machines and consumed by machines, which means it may look quite different than other "source". Call it source-to-source or source-to-binary, I think the question is really about how many degrees of freedom are available when specifying the program?
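
To make the "lossless" versus "degrees of freedom" distinction concrete, here is a minimal sketch in Python. It is a toy model and not the WebAssembly binary or text format: the `encode` and `decode` functions, the `$l0`-style canonical names, and the sample instruction list are all hypothetical, chosen only for illustration. The encoding keeps which local each instruction touches but drops the author's chosen names, much like the "go fmt that renames variables" thought experiment.

```python
# Toy illustration only: NOT the real WebAssembly binary or text format.
# All names here (encode, decode, "$l0") are hypothetical.

def encode(instrs):
    """Replace each local's name with an index, assigned in order of first use.

    The author's chosen names are not stored anywhere in the result.
    """
    index_of = {}
    encoded = []
    for op, name in instrs:
        if name not in index_of:
            index_of[name] = len(index_of)
        encoded.append((op, index_of[name]))
    return encoded


def decode(encoded):
    """Rebuild a text form, inventing a canonical name from each index."""
    return [(op, f"$l{index}") for op, index in encoded]


# A hand-written "source" with descriptive local names.
source = [
    ("get_local", "$width"),
    ("get_local", "$height"),
    ("set_local", "$width"),
]

binary = encode(source)
roundtrip = decode(binary)

# The program structure survives, but the original names do not.
names_preserved = roundtrip == source
# Starting from the canonical text, the round trip is exact.
canonical_stable = decode(encode(roundtrip)) == roundtrip
```

In this sketch the mapping between the encoded form and the canonical text is one-to-one, yet text written with arbitrary names does not come back unchanged. Whether that counts as "lossless" depends on whether names are considered part of the program, which is the degrees-of-freedom question raised above.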

ghost commented Dec 12, 2015

@ncbray Don't let a lack of an appropriate issue stop our discussion and you are welcome to continue in #483 and I look forward to seeing your answers there.

Calling it a virtual register is advancing a disputed virtual-machine-code focus. JF claimed 'virtual register' was appropriate because WebAssembly is a 'virtual ISA'.

I dispute that wasm be targeted at only 'C compiler' people and the one-eyed focus that 'WASM is designed to be generated by machines and consumed by machines', and I would also like some consideration be given to it being readable and writeable, and programmers are not dealing with compiler IR or 'portable native code' and it's simply a local variable in source code to them.

I don't deny you using 'virtual register' in specification annotations if it is qualified as from the perspective of source-to-source compilation.

'lossless and one-to-one' is the most precise technical concept I have found so far to communicate a technical aspect of the dispute and it does have a precise technical test. Obviously this is not sufficient as a linear virtual machine code could have a lossless encoding, and a text format with no support for comments etc might also be lossless, but neither would be in the spirit of giving some consideration to the source code being readable and writeable.

It might surprise you that 'Binary <=> text' is not currently specified to be lossless, rather just isomorphic. The model is one of lossy Assembly of the annotated text source code to an isomorphic encoding described as an ISA. It forces all users to deploy stripped code; can you make the case for this limitation?

Another concept that might help is the programming principle of encapsulation which might be generally accepted as making code more readable, and might support the use of expressions, and perhaps could also support some pipeline optimizations.

If it's all the same to you then can we just call it 'source code' and have the specification language and pipeline reflect this?

@jfbastien
Member

@JSStats:

> I believe the group can expect Chairs to engage in reaching consensus on technical matters and would choose Chairs that have the technical skill to do so, so a continued silence from the Chairs makes their positions less and less tenable. You are a Chair, have been silent on the matter, and yet are promoting the fortification of a disputed position - can you see that this is not particularly constructive.

The role of the W3C CG chairs is to ensure that the group meets its goals, follows the W3C CG procedures, as well as to act as contact points for the code of conduct and general procedures. All participants, chairs included, have a responsibility to moderate discussion and keep the group focused on relevant topics.

When I chime in with technical opinions I do so as an individual contributor, not as chair. Chairs currently do not have any special technical veto rights: we operate under an unspecified operational agreement revolving around rough consensus-building where chairs have no special rights. The chairs can adopt a formal operational agreement, but I believe that if we did so we would continue to put the technical decision capabilities in the hands of CG members.

The "disputed position" you refer to seems to be disputed by yourself only. As @ncbray pointed out this is a communication problem. I do not understand the position you're trying to take, and you're disrupting PRs by asking for unrelated changes to well-established wording. Yes I'm fortifying this position, but that's because it's well-established within the group. You may be right, maybe that position should be changed, but derailing each PR you dislike isn't a productive way to engage.

In this case, my silence was also motivated by traveling back to the US from Germany, as well as it being the weekend. I'd like a bit more appreciation from you for the amount of time we invest in trying to interact productively with you: it takes a tremendous amount of time, doesn't move us forward technically in a direction I'm interested in, and only leads to you being even more upset and acting in a manner that I find disruptive and offensive. That's probably not your intent, but it's the outcome you're generating. I'm nonetheless trying to be fair and inclusive with you.

I suggest you file new issues that are concise and to the point, and avoid inflammatory wording. At the same time you have to embrace the focus most folks involved currently have: we're trying to build technical artifacts that people can try out. We're currently past philosophical questions about what we're trying to build. We want these artifacts because they'll allow us to revisit these early decisions with data instead of going on a hunch of what we intuitively thought was right. You may be getting little traction not because people disagree with you, but because they're not interested at this point in time.

@sunfishcode
Member

I agree with @titzer; the text proposed for AstSemantics.md feels a little out of place with the way the rest of the surrounding text is written. But, the text proposed for Rationale.md lgtm.

ghost commented Jan 6, 2016

I withdraw my objection. Using optional source-code metadata sections, people can re-purpose the format from other perspectives, and I will explore doing so elsewhere to avoid any conflict or confusion.

@sunfishcode sunfishcode modified the milestone: MVP Jul 12, 2016
@sunfishcode sunfishcode self-assigned this Jul 25, 2016
@sunfishcode
Member

This was assigned to me; however, reading this again now, my impression is that the current text in Rationale.md is already fairly clear about locals being register-allocated. If anyone feels the current wording is still unclear, feel free to make a new proposal.
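
For readers less familiar with what "register-allocated" means for locals, the following is a minimal sketch in Python, under heavy assumptions: straight-line code only, a local treated as live from its first to its last mention, and a hypothetical pool of machine register names such as `r0` and `r1`. The helper names `live_intervals` and `allocate` are invented for this illustration and do not come from any engine. Real engines use proper liveness analysis and more sophisticated allocators; the point here is only that several locals can share one register and that a stack slot is a fallback rather than a requirement.

```python
# Simplified sketch: straight-line code, no control flow.
# Register names and helper names are hypothetical.

def live_intervals(instrs):
    """Return {local: (first_mention, last_mention)} by instruction position."""
    intervals = {}
    for pos, (_op, local) in enumerate(instrs):
        first, _last = intervals.get(local, (pos, pos))
        intervals[local] = (first, pos)
    return intervals


def allocate(intervals, registers):
    """Greedy interval allocation; locals that do not fit get a stack slot."""
    free = list(registers)
    active = []  # entries are (end_position, register)
    assignment = {}
    for local, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers held by locals that are no longer live.
        still_active = []
        for active_end, reg in active:
            if active_end < start:
                free.append(reg)
            else:
                still_active.append((active_end, reg))
        active = still_active

        if free:
            reg = free.pop(0)
            assignment[local] = reg
            active.append((end, reg))
        else:
            # Only now does a local need memory.
            assignment[local] = f"stack[{local}]"
    return assignment


instrs = [
    ("set_local", 0), ("get_local", 0),
    ("set_local", 1), ("get_local", 1),
    ("set_local", 2), ("get_local", 1), ("get_local", 2),
]

intervals = live_intervals(instrs)
two_regs = allocate(intervals, ["r0", "r1"])
one_reg = allocate(intervals, ["r0"])
```

With two registers available, all three locals in this example fit in registers because local 0 is dead before local 2 becomes live. With a single register, only local 2 falls back to a stack slot.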
