This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

Allocator function considerations #25

Closed
jgravelle-google opened this issue Apr 17, 2019 · 21 comments

Comments

@jgravelle-google
Contributor

Some of the incoming binding expressions reference an allocator function (namely, alloc-utf8-str and alloc-copy), and there are a few considerations for the design.

First: how should this be polyfilled? Because the allocator function is referenced by index, we can't access it directly from JS.
Given that a polyfill can be assumed to be part of the instantiation code, in theory it could modify the incoming wasm bytes to export the given function. However, that would mean the polyfill rewrites the underlying wasm module, exporting the allocator for the world to see and use.

Second: in the non-polyfill case, how odd is it that the embedder can call a non-exported function? Even though it's opt-in (if you don't want the function to be called, don't specify it as an allocator function in the webidl-binding section), it's still unusual to have a non-exported function being called from the outside world.

To address both, I think a reasonable approach is to have the binding section reference allocator functions as exports, either by name or by index.

Thoughts? Other considerations I'm missing?
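For concreteness, the export-by-name route could be polyfilled along these lines. This is a minimal sketch, not anything in the proposal: the function and export names (`takeString`, `alloc`, `memory`) are illustrative, and it assumes the bindings metadata names the allocator by export name.

```javascript
// Minimal polyfill sketch: given a core instance's exports, wrap an
// exported function so a JS string argument is copied into linear memory
// through the exported allocator before the core call happens.
// All names here are illustrative, not from the proposal.
function bindStringArg(exports, funcName, allocName) {
  const fn = exports[funcName];
  const alloc = exports[allocName];
  const memory = exports.memory;
  const encoder = new TextEncoder();
  return (str, ...rest) => {
    const bytes = encoder.encode(str);
    const ptr = alloc(bytes.length);               // call the wasm allocator
    new Uint8Array(memory.buffer).set(bytes, ptr); // write into linear memory
    return fn(ptr, bytes.length, ...rest);
  };
}
```

Note this only works because the allocator (and the memory) is reachable through the ordinary exports object; a bare index into the function space is invisible to JS.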

@rossberg
Member

This is an important consideration, and I agree with your observation.

As a general rule, no access to a module should ever be defined in terms of indices. Indices are internal names inside a module. They are an implementation detail with no meaning outside. Furthermore, they cannot be expected to be stable, e.g., when a module is recompiled by a producer or extended.

In a similar vein, no assumptions should be made about the availability of internal entities that are not exported. The Wasm module system provides proper encapsulation, and an engine is free to take advantage of this property as it sees fit, including completely optimising away internal entities from a module instance.

Thus, all module access must only happen through export names.

These principles should be followed by a bindings specification as well. Otherwise, as you observe, it would require magic abilities that can neither be defined nor implemented by reduction to regular client code -- but I would expect that webidl-bindings are merely a convenience mechanism to generate code that you could equivalently write by hand.

The easiest fix would be replacing the alloc-func-idx in the current explainer with an export name.

@lukewagner
Member

Good points; "can I write a polyfill that doesn't need to rewrite the .wasm?" seems like a good proxy for "can web idl bindings be semantically layered on top of core wasm?" which seems like a desirable property in general.

To wit, beyond just functions, with GC types and type imports, binding operators (e.g., as) would need to reference typedefs in the type section via type index, so I suppose they'd need to be made exports too.

The original problem I saw with requiring the allocator function to be exported is that it forces the allocator function into the "public" (post-binding) interface of the module, which also seems to break encapsulation. As a new idea: what do you think about allowing bindings to "hide" an export (simply removing it from the actual exports)?

@fgmccabe
Contributor

I think the route around this conundrum is clear if you accept one condition: that managed memory has already been implemented. In that scenario one would simply not expose any kind of allocator. (Personally, I do not much like the idea of hidden exports)

@jgravelle-google
Contributor Author

The thing about forcing the allocator to be public post-binding is: 1) morally, how different is it to be visible to just the embedder vs. visible to everyone? And 2) even if you don't export your allocator, you do need to share your memory (as in import or export it, not necessarily threaded shared memory) so the incoming value can actually be written to the allocated buffer.

The other suggestion I'd had in mind was to say "we could punt on this and not offer these for now", but a module author / toolchain can always implement that policy by not using any of these bindings, if they care more about hiding their memory/allocator. As a single data point, Emscripten today imports its memory buffers from JS, and exports malloc to be called from JS library code.
I believe managed memory, specifically memory slices, allows for not sharing memory or allocators. I think we'd want to allow for both types of bindings.

I'm ambivalent about the idea of hidden exports. It's polyfillable (delete the exported property after grabbing a handle to it in the wrapping code), and it limits the scope of the allocator in a non-magic sort of way. But it doesn't let us encapsulate memory, which is probably even more important.

Are there benefits to having hidden allocator but public memory that I'm missing?

But yeah, export-by-name sounds like the way to go here.

@fgmccabe
Contributor

fgmccabe commented Apr 17, 2019 via email

@jgravelle-google
Contributor Author

To wit, beyond just functions, with GC types and type imports, binding operators (e.g., as) would need to reference typedefs in the type section via type index, so I suppose they'd need to be made exports too.

Missed this bit

Do they though? My understanding was that the wasm-type-taking binding expressions referred to types as immediates, e.g. i32=0x7f. Thus they don't reference the type section by index, thus they can't really be exports.

Though looking more closely, the type section today is only for functypes, meaning either A) we'll need to rethink what a wasm-type immediate looks like in the presence of user-defined types, or B) only allow inline, structural type definitions, preventing us from using nominal typing.
Although, now that I remember it, primitive types (e.g. i32) are encoded as negative values precisely to distinguish them from user-specified types, so a type index of 0x03 would be a reference to an index in the type section. Ah. Yeah, that's almost surely a problem then.

@fitzgen
Contributor

fitzgen commented Apr 17, 2019

FWIW, wasm-bindgen also exposes the allocator when bindings require it.

Are we concerned about the code size footprint export string names have over indices?

@jgravelle-google
Contributor Author

@fgmccabe but surely exposing the entire contents of one's memory can lead to a strict superset of those kinds of trouble? Regardless of how it allocates the buffer, even if we invert the control or even preallocate space, the binding needs to write the resulting bytes into the buffer. This means the buffer needs to be writable externally, which means the outside world can break any of those constraints arbitrarily at any time.
Now full managed memory allows the external world to return a result as a memory slice, which makes the allocation and writing orthogonal to what's happening internal to the module, but absent that I don't see how having allocator+memory exported is worse than just memory exported.

@fitzgen In principle we can dedupe these locally, by having a string table subsection in the bindings section. We should have only a single allocator function, so having to repeat "my_awesome_allocator" N times is pretty obviously wasteful.

@lukewagner
Member

@jgravelle-google Actually, yeah, I suppose if one is trying to present a minimal/idiomatic interface to JS from wasm (in which case one would want to hide the allocator), one could always wrap the wasm module with a JS module that simply re-exports everything except the allocator function. Because of the magic of how ESM exports work, the re-exports will be the actual (Web IDL-bound) wasm exported functions, and thus the wrapper would have no call overhead. So maybe that's good enough, and I'll retract the "export hiding" idea.
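In plain JS-object terms, the wrapper amounts to the following sketch. A real ESM wrapper would use static `export { ... } from` declarations; this dynamic version just illustrates the shape, and all names are made up.

```javascript
// Sketch of the wrapping idea: expose everything from the core instance's
// exports except the names we want to keep private. A real ESM wrapper
// would do this with static re-export declarations; this dynamic version
// only illustrates the shape.
function publicInterface(exports, hidden) {
  const out = {};
  for (const name of Object.keys(exports)) {
    if (!hidden.includes(name)) out[name] = exports[name];
  }
  return Object.freeze(out);
}
```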

@fgmccabe
Contributor

fgmccabe commented Apr 17, 2019 via email

@titzer

titzer commented Apr 25, 2019

I too feel uncomfortable with exposing an allocator, regardless of how it's done. I think we end up with an unfortunate dependency on the memory for doing basic things. For managed languages implemented on top of Wasm, almost certainly the first thing they would do is allocate their own (perhaps internal) string object and copy the bytes into that.

It seems cleaner and more future compatible to have such APIs continue to return typed, managed, opaque strings (either anyref or an imported string). Then I think there should be a pair of imports str_len(s: string) -> u32 and str_copy(s: string, offset: u32) which get the length of the string and copy the string into a memory offset, respectively. The strlen + my_allocator + str_copy dance could be done in wrapper code inside the module, so the allocator is not exposed to the outside.
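Host-side, those two imports might look like the following sketch. It assumes strings arrive as ordinary JS strings held behind opaque references, that the lengths are UTF-8 byte lengths, and that `getMemory` stands in for however the host reaches the instance's memory; none of that is specified here.

```javascript
// Hypothetical host-side implementations of the proposed imports.
// str_len returns the UTF-8 byte length; str_copy writes the bytes at
// `offset` in the module's memory and returns the byte count.
function makeStringImports(getMemory) {
  const encoder = new TextEncoder();
  return {
    str_len: (s) => encoder.encode(s).length,
    str_copy: (s, offset) => {
      const bytes = encoder.encode(s);
      new Uint8Array(getMemory().buffer).set(bytes, offset);
      return bytes.length;
    },
  };
}
```

The module can then do the str_len + my_allocator + str_copy dance internally, without exporting the allocator.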

This also has the advantage that in a hypothetical "all managed" world, memory and memory allocation would not be necessary.

@lukewagner
Member

So there is already (in the explainer, using the as binding operator) the ability to return strings via opaque reference type to wasm and then call encodeInto to copy the string into linear memory so that allocation is completely controlled by the wasm code. And yes, a managed language compiling to wasm-gc wouldn't even need to copy into linear memory.
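Concretely, the encodeInto path looks roughly like this sketch, where `allocInWasm` stands in for whatever allocator the module chooses to expose to its own glue; the 3-bytes-per-code-unit bound is the standard worst case for encoding UTF-16 as UTF-8.

```javascript
// Sketch of the reference-type + encodeInto path: the glue holds the
// string as an opaque JS value, allocates in the module's own memory,
// and encodes directly into that buffer, so allocation stays under
// wasm's control. `allocInWasm` is illustrative.
function writeStringToMemory(str, memory, allocInWasm) {
  const encoder = new TextEncoder();
  const maxBytes = str.length * 3; // UTF-8 needs at most 3 bytes per UTF-16 code unit
  const ptr = allocInWasm(maxBytes);
  const view = new Uint8Array(memory.buffer, ptr, maxBytes);
  const { written } = encoder.encodeInto(str, view);
  return { ptr, len: written };
}
```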

The only reason to even consider the alloc-utf8-str binding operator is to optimize the case where the destination is linear memory and the source isn't already a GC string and so there is a wholly unnecessary GC allocation and copy. Thus, I expect something roughly like alloc-utf8-str will be necessary to hit performance goals in the limit. But alloc-utf8-str almost certainly is not part of the (to be defined) MVP subset so I don't think we need to worry too much about the particulars of its design at the moment; I only included it in the explainer to give an impression of how we might solve this performance problem in the future (since inquiring minds want to know!).

@jgravelle-google
Contributor Author

After some offline discussion, the thing that got me to think we should avoid specifying allocators at all in the MVP version of WebIDL bindings is the problem of failure. What do we do in the event of a failed allocation? How does the wasm allocator function indicate to the host that it has failed? What should the host do with that? What does the host return to the wasm call? A way to avoid that is to trap on failure to allocate. Does that mean we should depend on the exception handling proposal? Should those traps be catchable?

The part where it really really falls over is in the presence of compound types. Say we want to return an array of 10 strings, and we fail to allocate the 4th string. How can we free the allocated strings? In what order? Etc.

I don't think most of those have obvious answers, so I think deferring until after we launch an initial version of webidl bindings is really reasonable here.

@fitzgen
Contributor

fitzgen commented Apr 30, 2019

I'm very sympathetic to carving out small kernels of functionality we can ship sooner rather than later.

But I'd argue that avoiding JS string allocations is a fundamental goal of the webidl bindings, and that it is a must-have and not a nice-to-have to be added later. That is, an "MVP" implementation of webidl bindings that doesn't include support for passing and receiving strings would not satisfy the definition of MVP since it would not prove that it can achieve its main goal.

The very first sentence of the explainer lays out this goal:

The proposal describes adding a new mechanism to WebAssembly for reliably avoiding unnecessary overhead when calling, or being called, through a Web IDL interface.

I've been doing quite a bit of Wasm-and-DOM-interaction profiling recently, and even with interning and caching common strings sent from Wasm to JS, I'm seeing overheads on the order of 10% of total time dedicated just to creating JS strings from Wasm that are then passed directly into DOM methods.

I know many others have seen similar overheads in their applications as well.


What do we do in the event of a failed allocation? How does the wasm allocator function indicate to the host that it has failed? What should the host do with that? What does the host return to the wasm call?

We could use -1 as a sentinel for OOM, the same as a failed memory.grow. And I could imagine some sort of binding operator that returns 0 or 1 as an additional return value back to the wasm caller based on whether anything OOM'd.

But I think trapping is simpler, and is good enough to satisfy an MVP.
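The sentinel convention would put the check in the binding glue, along the lines of this sketch (whether -1 or an extra return value is the right convention is exactly what's being discussed; the names are illustrative):

```javascript
// Sketch of the -1 sentinel convention (mirroring a failed memory.grow):
// the binding glue checks the allocator's return value and traps (throws)
// on failure rather than handing the binding a bogus pointer.
function checkedAlloc(alloc, size) {
  const ptr = alloc(size);
  if (ptr === -1) throw new RangeError("allocator reported out-of-memory");
  return ptr;
}
```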

A way to avoid that is to trap on failure to allocate. Does that mean we should depend on the exception handling proposal? Should those traps be catchable?

Not in the initial MVP.

Once exception handling ships and a webidl bindings MVP ships, then let's add catchability.

The part where it really really falls over is in the presence of compound types. Say we want to return an array of 10 strings, and we fail to allocate the 4th string. How can we free the allocated strings? In what order? Etc.

FWIW, freeing them in reverse order of allocation would work out nicely for LIFO allocators.

But I think maybe that strings in compound types could be deferred as a post-MVP feature so long as sending and receiving bare strings was supported in the MVP.

@lukewagner
Member

Another realization: to implement a polyfill, the memory also needs to be exported, not just the allocator function. (I think we've not noticed this since lld apparently always exports the memory with a canonical name.)

This runs against an independent goal I've had which is that, with Web IDL Bindings, it should be possible to have 2 modules, each with their own encapsulated memory, using just Web IDL Bindings (and no GC allocation) to copy a sequence of bytes from one module to the other. ("Encapsulated" here means the memory is neither imported nor exported.)

Given the directly-conflicting (and also attractive) goal that it should be possible to implement Web IDL Bindings in a purely layered manner (semantically and also literally, as a polyfill), this makes me go back to the earlier idea that, just as Web IDL Bindings would be able to wrap the core instance exports to produce new, bound, exports, Web IDL Bindings should also be able to hide exports entirely. If we view bindings as just a layer around a core wasm module, this doesn't seem unnatural.
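For reference, the copy a JS polyfill can perform today has to be written against reachable memories, roughly as in this sketch (plain JS API only; names illustrative). This is exactly what conflicts with the both-memories-encapsulated goal.

```javascript
// What a JS polyfill must do to move bytes between two modules today:
// both memories have to be reachable (i.e. exported), because the copy
// goes through ArrayBuffer views over each memory's buffer.
function copyBetweenMemories(srcMem, srcPtr, dstMem, dstPtr, len) {
  const src = new Uint8Array(srcMem.buffer, srcPtr, len);
  new Uint8Array(dstMem.buffer).set(src, dstPtr);
}
```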

@littledan
Contributor

@lukewagner Would it make sense to think of a "polyfill" (in a broad sense) including generated JS glue code and some simple manipulations of the Wasm module?

@rossberg
Member

@littledan, if you mean the incoming Wasm module then that would make me very uneasy, because it would create a slippery slope towards arbitrary manipulation.

I think what @lukewagner has in mind is merely "wrapping" the resulting JS export object, which could mean producing an object with fewer properties.

@littledan
Contributor

@rossberg Well, if the change from here is to require the memory that the allocator references be exported, that sounds fine to me.

@lukewagner
Member

Right: I'm imagining a pure "wrapping" JS polyfill (of both imports and exports) which does no .wasm binary manipulations, and uses only existing wasm JS API functions.

@jgravelle-google
Contributor Author

@rossberg Well, if the change from here is to require the memory that the allocator references be exported, that sounds fine to me.

I've been assuming that for the purposes of polyfilling, though I think it should be modeled in the bindings section explicitly.

Two other ways to skin this cat:

  1. Have webidl-bindings take a hard dependency on multi-memory and first-class memory objects in particular.
  2. Allow the polyfillable-MVP to be a subset of MVP, i.e. we can spec allocators but not polyfillably.

Orthogonally, we can simplify the handling of free by requiring the host to only invoke alloc once. Under this scheme, an array-of-strings would need to precompute the space required for all the strings plus the array, then alloc that space in one contiguous chunk. Therefore if the allocation fails, it must do so atomically.
Are there cases where we fundamentally can't merge the allocations? There's awkwardness around structures that expect to be heterogeneous, e.g. a C-style char** needs to be an array of pointers followed by N strings, but that should be solvable.
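The single-allocation scheme for an array-of-strings could be precomputed along these lines. This is a sketch only: 32-bit pointers and NUL-terminated strings are my assumptions about the C-style layout.

```javascript
// Precompute one contiguous allocation for a char**-style result:
// N 4-byte pointers up front, followed by the NUL-terminated string bytes.
// If the single alloc of `total` bytes fails, nothing was partially written,
// so failure is atomic and there is nothing to free piecemeal.
function layoutStringArray(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map((s) => encoder.encode(s));
  const ptrTableSize = strings.length * 4;
  let total = ptrTableSize;
  const offsets = [];
  for (const bytes of encoded) {
    offsets.push(total);       // offset of this string within the block
    total += bytes.length + 1; // +1 for the NUL terminator
  }
  return { total, offsets };
}
```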

@pchickey
Collaborator

Closing as out-of-date: the proposal has undergone significant revision in the direction that @lukewagner outlined, and most of the discussion here is no longer relevant.
