How is linear memory allocated #227

jfbastien · 2015-06-24T22:49:36Z

All the talk of memory reminds me that we haven't discussed how WebAssembly modules get their memory.

Current implementations:

asm.js is passed an array buffer at creation time, and that buffer isn't growable. There was discussion of allowing resizing, but that's not efficient on all engines so isn't supported.
NaCl and PNaCl live in a separate process and pre-allocate a huge amount of virtual memory at process creation time, and lazily allocate physical memory when used.

WebAssembly, at least initially, shares its virtual memory space with other parts of the browser, which means that over-allocation will lead to fragmentation and potentially virtual memory exhaustion. This is a problem on, e.g., 32-bit Windows XP systems, which are still a pretty big use case.

Allocating physical memory lazily also means that an application can fault at runtime for any read/write that touches memory that was never used and now needs to be allocated. I think it's a desirable feature, but without signal handling it's kind of hard to handle!

Also, what kind of alignment and power-of-two size guarantees do we make, if any?

I think it would be great to support mmap, and on _start just allocate the heap with some restricted flags. We can decide to restrict what can be done initially (don't be lazy, allocate all physical memory, don't allow reallocation), and loosen these restrictions later. This is similar to passing in memory from the embedder, but can be made more powerful later while still being polyfillable (the polyfill can behave the same as asm.js does).

MikeHolman · 2015-06-24T23:27:57Z

asm.js is passed an array buffer at creation time, and that buffer isn't growable. There was discussion of allowing resizing, but that's not efficient on all engines so isn't supported.

The two engines which implement asm.js support this. And I just added ArrayBuffer.transfer, which Chrome/FF already have, so if Safari joins the mix (@pizlonator) it seems like this should be fine (and it was pretty easy to add, so I don't see much motivation against it).

In any case, I was envisioning that a wasm module requests a CommitHeapSize and a ReserveHeapSize, which would be a contiguous address space (i.e. VirtualAlloc(null, ReserveHeapSize, MEM_RESERVE); VirtualAlloc(baseAddress, CommitHeapSize, MEM_COMMIT); ). I think being a page-size multiple is a useful feature, and 64k is apparently an option for page size on arm64, so that might be what we want (though it seems a little big to me).
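The reserve/commit split described above can be sketched with POSIX calls instead of VirtualAlloc. This is only an illustration under stated assumptions: the `linear_memory` type and both function names are hypothetical, sizes are assumed to be page-size multiples, and `MAP_PRIVATE` is an added assumption because `mmap` needs a sharing mode.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical descriptor for one module's linear memory. */
typedef struct {
    unsigned char *base; /* start of the reserved range, NULL on failure */
    size_t reserved;     /* bytes of address space reserved */
    size_t committed;    /* bytes currently readable and writable */
} linear_memory;

/* Reserve `reserve_size` bytes of contiguous address space and make the
 * first `commit_size` bytes usable. Both sizes are assumed to be
 * multiples of the page size. */
linear_memory linear_memory_create(size_t reserve_size, size_t commit_size)
{
    linear_memory m = { NULL, 0, 0 };
    if (commit_size > reserve_size)
        return m;

    /* PROT_NONE reserves addresses without making them accessible,
     * roughly the role MEM_RESERVE plays on Windows. */
    void *p = mmap(NULL, reserve_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return m;

    /* Enabling access stands in for MEM_COMMIT. */
    if (commit_size != 0 &&
        mprotect(p, commit_size, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, reserve_size);
        return m;
    }

    m.base = p;
    m.reserved = reserve_size;
    m.committed = commit_size;
    return m;
}

/* Extend the usable prefix to `new_committed` bytes without moving the
 * base address. Returns 0 on success, -1 on failure. */
int linear_memory_grow(linear_memory *m, size_t new_committed)
{
    if (m->base == NULL || new_committed < m->committed ||
        new_committed > m->reserved)
        return -1;
    if (new_committed == m->committed)
        return 0;
    if (mprotect(m->base + m->committed, new_committed - m->committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    m->committed = new_committed;
    return 0;
}
```

Because the whole range is reserved up front, growing never moves the base address. The cost is the address-space pressure discussed earlier in the thread when the reservation is large.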

lukewagner · 2015-06-25T18:43:52Z

What's currently in the design [1] is that a heap is private state of a module, created when the module is first loaded. Not explicitly documented yet is that the "memory initialization section" [2] would be the place to declare the initial size (and initial state, analogous to what native binaries do with .data, .bss etc).

The two problems I see with mmap (as opposed to sbrk) are that (1) it would be a major source of non-determinism if we allow mmap to choose the base address (and if not, what's the point), and (2) I don't see an efficient impl that allows a browser to address the contiguous-region-fragmentation problem: if you put wasm memory in two separate regions, you still have one index that needs to somehow disallow access to the intervening region and VirtualProtect/mprotecting on every call/return doesn't seem viable.

I don't really understand the proposal to let the module allocate its own memory since it would mean starting execution w/o a heap (so no load/store) which seems like a weird special mode that we'd have to implement. I also don't see the increase in expressiveness compared to just declaring an initial heap size and allowing sbrk at runtime.

kripken · 2015-07-27T23:13:52Z

It looks like the current docs mention sbrk as being present in the MVP. Should we define more precisely how that would work? I assume the idea is a posix-style "get a delta, return the previous absolute"? I can open a PR if that sounds ok.
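For concreteness, the "get a delta, return the previous absolute" behaviour could look like the sketch below. Everything here is an assumption for illustration: the `heap_state` type, the `heap_sbrk` name, the 32-bit offsets, and the `SBRK_FAILED` sentinel are not taken from the design docs.

```c
#include <stdint.h>

/* Hypothetical failure value, mirroring POSIX sbrk returning (void *)-1. */
#define SBRK_FAILED ((uint32_t)-1)

/* Hypothetical per-module heap bookkeeping, in linear-memory offsets. */
typedef struct {
    uint32_t brk;   /* current break: first offset past the usable heap */
    uint32_t limit; /* maximum break allowed; assumed below UINT32_MAX */
} heap_state;

/* Move the break by a signed delta and return the previous break.
 * On failure the break is left unchanged and SBRK_FAILED is returned. */
uint32_t heap_sbrk(heap_state *h, int32_t delta)
{
    /* 64-bit arithmetic so neither growth nor shrinkage can wrap. */
    int64_t next = (int64_t)h->brk + delta;
    if (next < 0 || next > (int64_t)h->limit)
        return SBRK_FAILED;

    uint32_t prev = h->brk;
    h->brk = (uint32_t)next;
    return prev;
}
```

With this shape, `heap_sbrk(&h, 0)` queries the current break without changing it, which is how existing allocators typically probe the heap end.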

Also not clear to me in the docs is where sbrk would come from. Would it be imported from somewhere, or would it be an opcode?

jfbastien · 2015-07-27T23:23:07Z

Yes, I think it would be good to clarify after reaching agreement here.

I'd lean towards a limited developer-side mmap-ish API that toolchains happen to implement by default (and where implementation may round allocations up).

Fixed size at load time seems limiting, and so does MEM_RESERVE.

Developers don't choose where the heap is, they just choose its size (since wasm implies a hidden base).

kripken · 2015-07-27T23:34:57Z

What functionality are you proposing it would have that is more powerful than sbrk? (I'm curious where you fall in between sbrk and full POSIX mmap.)

For anything non-trivial, perhaps it makes sense to stick to sbrk for the MVP, and leave more sophisticated things for the future?

jfbastien · 2015-07-28T01:00:09Z

I'm not sure I'd initially give it that many more capabilities:

addr should be NULL, though that's a bit moot since we're not necessarily exposing the actual virtual address.
length could be restricted (min/max values) and have to be page-sized (see sysconf below).
prot other than PROT_READ | PROT_WRITE can be rejected for now, although PROT_NONE makes sense too (see below).
flags should probably be MAP_ANONYMOUS. I'd like to allow specifying MAP_NORESERVE or MAP_POPULATE, since they're useful in different applications. I'm not sure we should accept anything else.
fd should be -1.
offset should be 0.
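The restrictions listed above amount to an argument check in front of a real mmap. A minimal sketch follows, assuming a hypothetical function name, a caller-supplied page size, and `MAP_PRIVATE` as a required flag (the list only names `MAP_ANONYMOUS`, but POSIX mmap needs a sharing mode). `MAP_NORESERVE` and `MAP_POPULATE` are not available on every platform, so they are guarded.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Returns true when the arguments fit the restricted profile:
 * NULL address, page-multiple length, RW or NONE protection,
 * anonymous mapping, fd of -1, offset of 0. */
bool restricted_mmap_args_ok(const void *addr, size_t length, int prot,
                             int flags, int fd, off_t offset,
                             size_t page_size)
{
    const int required = MAP_ANONYMOUS | MAP_PRIVATE; /* MAP_PRIVATE assumed */
    int optional = 0;
#ifdef MAP_NORESERVE
    optional |= MAP_NORESERVE;
#endif
#ifdef MAP_POPULATE
    optional |= MAP_POPULATE;
#endif

    if (addr != NULL)
        return false;
    if (page_size == 0 || length == 0 || length % page_size != 0)
        return false;
    if (prot != (PROT_READ | PROT_WRITE) && prot != PROT_NONE)
        return false;
    if ((flags & required) != required)
        return false; /* a required flag is missing */
    if ((flags & ~(required | optional)) != 0)
        return false; /* a flag outside the allowed set is present */
    if (fd != -1)
        return false;
    if (offset != 0)
        return false;
    return true;
}
```

Minimum and maximum length limits, which the list mentions as a possibility, would be two more comparisons in the length check.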

munmap should probably not be available for MVP, and added later.

We should probably offer sysconf(_SC_PAGE_SIZE) or something analogous.

It would however be nice if the low-address page could be mapped as PROT_NONE from user-side code, so there's no magical "you can't address these low bits" implied in the format. That would mean that addr can be passed in, but has to be the previous mmap location plus its length.
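As an illustration of user-side code setting up the low page itself, the sketch below maps a heap and then marks its first page `PROT_NONE`. The function names are hypothetical, the page size comes from `sysconf(_SC_PAGESIZE)` (the POSIX spelling of `_SC_PAGE_SIZE`), and the page size is assumed to be a power of two.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round `length` up to a multiple of `page`, which must be a power of
 * two. Wraps to a small value if `length` is within a page of SIZE_MAX. */
size_t round_up_to_page(size_t length, size_t page)
{
    return (length + page - 1) & ~(page - 1);
}

/* Map a heap of at least `length` bytes and make its lowest page
 * inaccessible, so accesses near offset 0 fault. Stores the mapped
 * size in *mapped_out. Returns NULL on failure. */
unsigned char *heap_with_low_guard(size_t length, size_t *mapped_out)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;
    size_t page = (size_t)ps;

    size_t total = round_up_to_page(length, page);
    if (total < length || total == 0)
        return NULL; /* rounding wrapped, or nothing to map */

    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* The guard is set by ordinary code after mapping, not by the VM. */
    if (mprotect(p, page, PROT_NONE) != 0) {
        munmap(p, total);
        return NULL;
    }

    *mapped_out = total;
    return p;
}
```

Offsets from one page upward stay readable and writable, while any load or store inside the first page raises a fault the host can report.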

That allows us to:

Expose the same capabilities as sbrk.
Remove some magic from WebAssembly's memory management.
Extend the API in the future.
Support existing code more easily.

kripken · 2015-07-28T18:03:46Z

As far as I can tell, the additional power you are proposing over sbrk is the allocation types (MAP_NORESERVE, etc.)?

I didn't follow the part about removing magic. Are you saying you see a problem with a wasm application accessing HEAP[0] and you want that to be avoidable by wasm content itself?

kg · 2015-07-28T18:07:03Z

As soon as we start sharing heaps between modules or doing similar things, we'll want the ability to use mmap to specify how the address spaces interact.

Creating deadzones with PROT_NONE is a pretty compelling feature for debugging and robustness as well. In the long run people may want copy-on-write or read-only pages so having mmap exposed at some level early is valuable in the long run.

jfbastien · 2015-07-28T18:15:55Z

@kripken: "magic" refers to the PROT_NONE section (currently ill-defined), as well as how the heap gets mapped / what its alignment and size are, ...

kripken · 2015-07-28T18:17:33Z

That seems to add magic - I still don't follow what you mean by "remove some magic" earlier? What is the current magic that this removes?

jfbastien · 2015-07-28T18:44:09Z

The current suggestion automagically sets some PROT_NONE memory area, unspecified what size it is and how that happens. mmap allows developer code to PROT_NONE anything they want, no magic.

kripken · 2015-07-28T18:48:13Z

I see, thanks, that's the part at paragraph 3 here, I now see.

Do we agree that sbrk is enough for the MVP? Or are you also proposing that these features (that I agree we want eventually) are urgent? They don't seem polyfillable to JS so I was assuming they were non-MVP.

kg · 2015-07-28T19:36:49Z

I think sbrk might be enough but I kind of fundamentally object to it as our baseline memory model. I think mmap is the right foundation for address/memory management. Worse is better, though, so maybe sbrk is the right kind of worse :-)

kripken · 2015-07-28T20:59:13Z

Ok, it looks like most of us agree that mmap is the better option, so perhaps there just isn't a reason to do sbrk as a short-term thing. This suggests that we

Drop sbrk from the MVP; wasm only supports a fixed memory size in the first early iteration.
Add mmap to PostMVP, with similar features as @jfbastien suggested.
FutureFeatures already has more advanced mmap capabilities (of files) as well as madvise.

How does that sound?

jfbastien · 2015-07-28T21:05:27Z

@kripken sgtm :-)
mmap can be made restrictive enough to be the same as sbrk, while removing some magic and making things simpler once we do expand mmap further.

kripken · 2015-07-28T21:32:55Z

Opened #285.

pizlonator · 2015-07-28T21:56:17Z

Sorry to be late to the party.

On Jun 24, 2015, at 4:28 PM, Michael Holman wrote:

asm.js is passed an array buffer at creation time, and that buffer isn't growable. There was discussion of allowing resizing, but that's not efficient on all engines so isn't supported.

The two engines which implement asm.js support this. And I just added ArrayBuffer.transfer, which Chrome/FF already have http://kangax.github.io/compat-table/es7/#ArrayBuffer.transfer, so if Safari joins the mix (@pizlonator https://github.com/pizlonator) it seems like this should be fine (and it was pretty easy to add, so I don't see much motivation against it).

I object to WebKit doing this. We don’t currently have a plan to “support” asm.js in the sense of recognizing “use asm”, and there is no good way to reconcile our plain-JS singleton-based constant inference and reassigning the HEAP variables.

In any case, I was envisioning that a wasm module requests a CommitHeapSize and a ReserveHeapSize, which would be a contiguous address space. I think being a page-size multiple is a useful feature, and 64k is apparently an option for page size on arm64, so that might be what we want (though it seems a little big to me).

lukewagner · 2015-10-29T16:20:10Z

This issue has been resolved by a number of PRs refining linear memory [1][2][3].

jfbastien added the question label Jun 24, 2015

jfbastien added this to the MVP milestone Jun 24, 2015

kripken mentioned this issue Jul 28, 2015

Remove sbrk from MVP, add mmap&friends to AstSemantics #285

Closed

jfbastien mentioned this issue Aug 20, 2015

Delegate address space allocation to the VM #306

Closed

jfbastien mentioned this issue Sep 8, 2015

Negotiated heap size and methods of resizing the heap. #331

Closed

lukewagner closed this as completed Oct 29, 2015

rossberg mentioned this issue Jun 23, 2016

Type checking and unreachable blocks #707

Closed

shelby3 mentioned this issue Jun 25, 2020

[reconstructed] WD-40 (for reducing Rust with the next mainstream language) #35 keean/zenscript#50

Closed

NodixBlockchain mentioned this issue Jun 26, 2020

new wd-40 test keean/zenscript#51

Open
