Concern with un-decoded bits / alternative for SIMD #11
Thanks for laying this out in detail. While I understand the concern, I'm skeptical that any approach to this problem that doesn't involve validating un-decoded bits will be sufficiently useful to meet the needs of today's SIMD developers.

Take one important use-case: feature detection based on differences between versions of a single engine. Because the approach described here will only allow high-fidelity feature detection of newly-standardized features on engines which have knowledge of those features, it will never be useful in such a case (since the old versions of the engine couldn't possibly know what opcodes will be standardized in the future), and thus developers will continue to make use of host-provided mechanisms like the JS-based techniques used today. But it's the pain of doing that (building, maintaining, and serving multiple binaries, maintaining & running detection code, etc.) that I see as the central target for @tlively's work here. The same reasoning applies for feature detection across different engines, since it's unlikely that different engines with different release schedules & priorities will happen to align their releases of a fully-functional version of a new proposal with a "dummy" version in other engines.

Of course, none of what I've written above implies that Wasm must add some mechanism for validating un-decoded bits. But I think these use-cases should be met by anything we do add. Or to put the requirement another way: if we're going to add a feature detection mechanism to Wasm for use-cases like that of SIMD additions over time, it must be powerful enough to supersede most (possibly all) of the uses of host-based feature detection (at least once the mechanism itself is standardized and implemented across the ecosystem).
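For concreteness, the "JS-based techniques used today" typically probe the engine with the standard `WebAssembly.validate` API. Here is a minimal sketch; the probe byte arrays are illustrative (just the 8-byte module header and a corrupted variant), not the real SIMD probe modules that toolchains ship:

```javascript
// Sketch of host-based feature detection via WebAssembly.validate.
// A real SIMD probe would be a tiny module containing a v128 instruction;
// those bytes are toolchain-specific and elided here.

// A minimal valid module: "\0asm" magic followed by version 1.
const mvpProbe = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
// Same magic, but a bogus version field, so no engine accepts it.
const badProbe = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0xff, 0xff, 0xff, 0xff]);

// validate() answers "would this engine accept these bytes?" without
// compiling or instantiating, which is what makes it usable for sniffing.
function engineAccepts(probe) {
  return WebAssembly.validate(probe);
}

console.log(engineAccepts(mvpProbe)); // true on any conforming engine
console.log(engineAccepts(badProbe)); // false: bad version field
```

The pain described above comes from what surrounds this snippet in practice: building and serving a separate binary per feature combination and keeping the probe modules in sync with the toolchain.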
I guess there are two distinct-but-related use cases here:

1. detecting features that a given engine or its underlying hardware will never support (e.g., SIMD on hardware without SIMD), and
2. detecting newly-standardized features that an engine simply hasn't implemented yet,

and what I'm proposing in this issue only addresses (1), but not (2). (1) seems like a problem that will only increase over time as wasm gets deployed more widely, so I expect it's something we'll eventually need to have an answer for. In contrast, (2) seems like a problem that will diminish over time, as there is a (hopefully) finite set of SIMD features with finite implementation windows before we start reaching diminishing returns from adding new SIMD features. The specific coefficients matter a lot here, of course, but I think this is something to weigh against the permanence of the above concerns.

Also, practically speaking: how bad would it be in a Web context to deal with lagging browser implementations using a
I recently implemented decoding and validation of SIMD in Wizard (but not yet execution). I think the difference between decoding the actual instructions and validating their types is actually very small here (e.g. there are no instructions with "interesting signatures", lending them all to be listable in a table: https://github.com/titzer/wizard-engine/blob/master/src/engine/Opcodes.v3#L287). Keep in mind that decoding requires validating immediates like alignment and lane indexes, and it is incrementally not much more work to check a signature from the table (https://github.com/titzer/wizard-engine/blob/master/src/engine/CodeValidator.v3#L779). An engine has to decode the value types and constant initializers as well. I think we wouldn't be saving an engine much actual work by omitting type validation but still requiring decoding and validation of the V128 type and instructions, because there are so many places they can occur (tables, globals, signatures, locals, and in function bodies).

For SIMD as it is, I see four possible levels of engine support:

1. No support: SIMD types and instructions are rejected as invalid bytes.
2. Decoding and validation, but no execution.
3. Decoding, validation, and slow execution (e.g. by scalar emulation).
4. Full support, with optimized SIMD code generation.
I don't see the option in-between 1. and 2. where an engine decodes but doesn't validate SIMD types.

As for the broader discussion, I am somewhat tempted to agree in general that we want engines to align on what bits are acceptable and want to avoid "circumventing the standardization process" and pushing out non-standard bits that work (perhaps underhandedly) on other engines by virtue of skipping decoding. I would almost be tempted to advocate that all engines support all the features in their frontends, but this will break down for complex features like GC, which are a lot of work, and very subtle, in the front end. Thus, in the case of GC, I think we'll want engines that just refuse to decode these features, treating them as invalid binary bytes.
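For illustration, the table-driven check Ben describes might look like the following sketch. The table entries and type-stack logic here are a hypothetical toy, not Wizard's actual code, and the second opcode/signature pairing is an assumption:

```javascript
// Sketch of table-driven SIMD validation: once the decoder must read an
// opcode and its immediates anyway, checking the instruction's type
// signature is one extra table lookup plus a stack check.
const SIMD_SIGS = new Map([
  // opcode -> { params, results }; a hypothetical two-entry subset
  [0x0c, { params: [], results: ["v128"] }],               // v128.const
  [0x51, { params: ["v128", "v128"], results: ["v128"] }], // illustrative binary op
]);

function validateSimdOp(opcode, typeStack) {
  const sig = SIMD_SIGS.get(opcode);
  if (sig === undefined) {
    throw new Error(`unknown SIMD opcode 0x${opcode.toString(16)}`);
  }
  // Pop expected parameter types in reverse order, then push results.
  for (const expected of [...sig.params].reverse()) {
    const actual = typeStack.pop();
    if (actual !== expected) {
      throw new Error(`expected ${expected}, got ${actual}`);
    }
  }
  typeStack.push(...sig.results);
  return typeStack;
}

console.log(validateSimdOp(0x51, ["v128", "v128"])); // [ 'v128' ]
```

The point of the sketch is the shape of the work: the lookup and the loop are trivially small next to the decoding the engine has to do regardless.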
@titzer, on WebAssembly/design#1444 you asked that we find a holistic answer to versioning, and I agree! But my impression is that the discussion so far is going in the opposite direction, focussing very narrowly on singular use cases and considering ad-hoc solutions to them. At the same time, I'm also very sympathetic to @lukewagner's original observation that a generic ability to accept unparsed code paths is a risky proposition. It could make the need for feature detection a self-fulfilling prophecy. IMO, we should look for something that works
For example, GC is an example of a feature that affects many parts of the language, where parsing is relatively easy, but validation is not (and execution might be impossible). So it might indeed be desirable to have something in-between 1 and 2, where engines have to decode it (avoiding Luke's problem) but not to validate it (avoiding unwanted engine complexity). And of course, that would then also work for SIMD.

So abstracting away from SIMD, we can recast Ben's spectrum of "optional support" as follows:

A. Does not decode the feature and rejects it.
B. Decodes the feature, but rejects it without validating.
C. Decodes and validates the feature, but rejects it for execution.
D. Decodes, validates, and executes the feature, but only slowly (e.g., by emulation).
E. Fully supports the feature.

My assumption would be that some environments or engine makers will probably prefer either A or E for optional features, and we should allow either, uniformly. To support feature detection, engines would have to be allowed to use one of B or C as well. Feature detection could then be supported in the form of alternative sections, but those are only accepted if all their alternatives decode (B) or validate (C), respectively. (To be clear, B vs C would be a choice in the spec, not by an implementation – and B seems preferable, since it works for more features.) (I'm not convinced yet that D is a worthwhile option, but of course it can be introduced as well.)

As @ajklein points out, this does not address some of the current SIMD use cases. But I agree with Luke's reply.
The SIMD case is really two cases: engines that don't support SIMD at all and don't know about v128 ("new types + new instructions"), and engines that know about v128 but may not be aware of all instructions ("new instructions").

Consider the latter case, new instructions. It is not limited to new SIMD instructions, and while it may be "diminishing over time", it is not a given that we'll just stop adding instructions at some point. Instructions that have been proposed but not spec'd yet that fall into this simple category include carry propagation instructions, byte swap instructions, memory control instructions, and thread management instructions. We will have acquire-release atomic instructions. Looking to some common instruction sets for things missing from Wasm, we find bitfield insert/extract/clear instructions and bit reverse. Looking to the past, we added some instructions post-MVP that could have benefitted from a simple guard: sign extension, non-trapping float-to-int, and arguably the bulk memory and bulk table instructions (even if there were non-instruction additions in the latter proposal).

Common to a number of the messages above is that they present solutions that do not address this problem, which seems important enough and long-term enough to want to be solved. The "only new instructions" use case is met simply using the proposed

The problem with wanting to avoid undecoded bits is that there can be what amounts to undecoded bits in the instruction stream already, at least if it's a matter of an implementation wanting to use private instructions.
I don't think
I don't assume we'll stop adding features over time; we definitely will. The question is whether the delay between standardization and implementation will become so bad that it becomes necessary to support undecoded bits in the manner of

In contrast, iiuc, in the two browsers that do currently natively implement SIMD, there are hardware configurations that don't support SIMD, so what I'm proposing above could be adopted in the short-term as a solution for these present-day problems in a minimally-invasive manner.
Would this allow an implementation to sneak in custom extensions without the code being rejected on conforming engines, though? That possibility is the main worry Luke points out wrt feature_block – it would open the gates to enabling freely shipping code containing "vendor extensions" and all the bads that can follow from that for the future of the standard and its ecosystem.
@rossberg Unlike, say,
Whether this is sneaky "enough" is another matter, maybe; I don't see how it can easily encode control flow, new types of block structure, or operate on reference types, but perhaps there are ways. (A suitable shuffle operation that carries what is interpreted as an index into a custom section could act as a reference to private code to be inlined in the compiled program, I guess, but it still feels pretty limited.)

More generally, I feel that the core of the use case under discussion is that "there are going to be instructions that not all engines know what to do with at every point in time". It's already possible to work around that with sniffing and supplying multiple versions of a module, or with dynamic linking; it's just not pleasant to do so. A solution that directly addresses the use case will almost certainly need to allow for some bits to be uninterpreted. If the committee can't come to agreement on that, we'll have to live with existing workarounds (or new ones, though they're not likely to be much better than the ones we have already). It's not the end of the world; it just adds friction.
@lars-t-hansen, yes, but Luke's argument is that it would be bad to allow (or in fact, require) engines to accept any code that they do not understand, because that would create a backdoor for proprietary vendor extensions that could undermine the standard. And that applies to non-standard opcodes and non-standard immediates alike. Maybe I'm misreading your reply, but I'm trying to understand what difference you see between the two. Right now, no conforming engine would accept unspecified shuffle patterns either, so they are not observably different from unspecified opcodes, AFAICS. And feature blocks would of course allow both.
@rossberg, perhaps we're talking about slightly different things. The shuffle patterns I'm talking about are valid (every lane is less than 32), they're just nonsensical in practical programs. So they would validate, and they have a meaning, but since no useful program would need to express that meaning, they can be assigned a different meaning in an implementation at extremely low risk. The risk would then be further mitigated by gating the use of the "private opcode" on some type of implementation sniffing.
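As a sketch of the scenario being described: the specific mask and the notion of a "private marker" below are hypothetical, and a conforming engine would simply validate the mask and execute the shuffle with its ordinary (useless) meaning.

```javascript
// A 16-lane shuffle mask whose lanes are all valid (< 32) but which no
// practical program would emit. A deliberately non-conforming engine could
// assign it a private meaning; every conforming engine just validates it
// and runs the standard shuffle.
const PRIVATE_MASK = [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 31];

function isValidShuffleMask(mask) {
  // This is all a conforming validator checks: 16 lanes, each index < 32.
  return (
    mask.length === 16 &&
    mask.every((lane) => Number.isInteger(lane) && lane >= 0 && lane < 32)
  );
}

function isPrivateMarker(mask) {
  // What a hypothetical non-conforming engine might do on top of validation.
  return isValidShuffleMask(mask) && mask.join(",") === PRIVATE_MASK.join(",");
}

console.log(isValidShuffleMask(PRIVATE_MASK)); // true: every engine accepts it
console.log(isPrivateMarker([...Array(16).keys()])); // false: identity shuffle is ordinary
```

This is why the mask is "not observably different" from the outside: the standard-level check cannot distinguish a nonsensical-but-valid mask from a meaningful one.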
I strongly agree with @lars-t-hansen and @ajklein that gracefully handling completely unknown SIMD extensions from the future is the raison d'être of this proposal. If we can't find consensus that this problem is worth solving in the first place, then so be it, but my SIMD users will continue to be frustrated with the relatively poor developer experience we offer for working around this problem.
@lars-t-hansen, ah, now I see what you mean, thanks! Interesting. But of course, a deliberately non-conforming VM would clearly not be blessed by the standard, unlike the (ab)use of feature blocks for a similar purpose. So the amount of "criminal energy" required to exploit that would be on a whole different level.
Just to note, in the latest iteration of that proposal, the idea was that any validation rules not dependent on popping types from the type stack during validation would still be applied in dead code. There was a general sense that we didn't want to relax the type system any more than strictly necessary, and that we wanted to limit how wacky one could get with smuggling invalid instructions/random bytes into dead code. EDIT: and I think more importantly, there was a concern that we haven't previously exposed any distinction between decode-time and validation-time errors, and the spec isn't currently factored in a way that would make this easy |
Re: @lukewagner's comment earlier:
I'd just like to note that this seems to kind of sidestep a common "in-between" state: shipping new features quickly before dropping support for all versions of all engines that have not implemented them. Many applications wish to support several versions of common browsers and operating systems, and anything targeting embedded use-cases (smart TVs, etc.) will need to support several-year-old Chromium and WebKit at a minimum. Runtime compatibility checks don't just mean "you can start using new features before Safari ships them"; for many, they mean "you can start using new features before making the hard decision to drop support for devices manufactured the year the feature was introduced". |
I'd like to lay out a concern regarding feature-detection along with a way to address this concern while still addressing the needs of SIMD (which as @tlively was saying in his last presentation, seems to be our only short-term need for feature-detection).
The concern is that, despite our best intentions in the short-term, if we allow wasm modules to validate with completely un-decoded bits, then over the long-term (as wasm gets more toolchains and engines and business contexts), we'll end up with a lot of wasm modules containing non-standard bits.

While it is certainly possible today for particular toolchains and engines to agree to produce/consume non-standard/pre-standard bits (which is actually an essential part of the process of standardization), there is a natural check-and-balance in that these non-standard/pre-standard bits will only run on those particular engines and fail to validate on the rest. If it becomes possible to produce modules with non-standard bits that validate on all wasm engines, then it becomes much easier to push out non-standard bits and circumvent the standardization process in general. As with most vague concerns, maybe there comes a time when this is a risk we want to take, but I think it's a good idea to hold off on crossing this Rubicon for as long as possible.
If we zoom in on the SIMD use case, I think a reasonable alternative solution makes sense (which is slightly different from what I've suggested before) based on the observation that the engineering-effort and engine-size required to decode, but not validate or compile, SIMD (or any other future wasm feature for that matter) is minuscule compared to the overall effort required to implement MVP wasm 1.0, and thus it's not worth trying to optimize away SIMD decoding on engines that don't support SIMD (classic Amdahl's Law).
Based on this observation, I think we can derive a reasonable compromise:

- All engines are required to decode SIMD, even if they don't otherwise support it.
- Wasm code can query at runtime whether the engine supports SIMD (e.g., via an instruction returning an `i32` value).
- On engines that don't support SIMD, SIMD instructions have the same typing as the `unreachable` instruction, and thus they are decoded as dead code.
- Such engines never need to represent `v128` with an actual 128-bit value -- they could use nothing or a dummy.

Some other nice technical benefits from this approach are:
With respect to the other major concepts of "version" and "profile":
At least, that's the best I've been able to come up with; happy to discuss!
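To make the decode-as-dead-code idea in the compromise above concrete, here is a toy sketch. It assumes a drastically simplified one-byte instruction format (real decoding handles LEB128 immediates, multi-byte opcodes, block structure, etc.), with `0xfd` standing in for the SIMD prefix:

```javascript
// An engine without SIMD support still decodes every byte, but the first
// SIMD instruction reached in live code types like `unreachable`, so
// everything after it is checked only as dead code and no v128 values
// ever need an actual 128-bit representation.
const SIMD_PREFIX = 0xfd;

function decodeWithoutSimd(bytes) {
  let dead = false;
  const decoded = [];
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === SIMD_PREFIX) {
      const opcode = bytes[++i]; // immediates would still be decoded here too
      decoded.push({ kind: "simd", opcode, dead });
      dead = true; // types like `unreachable` from this point on
    } else {
      decoded.push({ kind: "mvp", opcode: bytes[i], dead });
    }
  }
  return decoded;
}

const result = decodeWithoutSimd([0x20, 0xfd, 0x51, 0x20]);
console.log(result.map((d) => d.dead)); // [ false, false, true ]
```

The key property is that every byte still passes through the decoder, so there are no un-decoded bits for non-standard extensions to hide in, yet the engine never has to implement SIMD semantics.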