Linking.md: Use multiple data and code sections #138

sbc100 · 2020-02-21T00:04:55Z

I'd like to propose that we move towards using multiple data and code sections in the object format.

This matches llvm's internal ideas about what a section is. Today if you iterate through the sections in an object you will only see a single code section, even though we default to -ffunction-sections. This means the linker is then forced to break down the monolithic code and data sections into sub-sections.
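To make the "single code section" point concrete, here is a minimal sketch in Python (not taken from any llvm tool) that walks the top-level sections of a wasm binary. The framing (1-byte id, LEB128 size, payload) and the section ids follow the core wasm binary format; the helper names `read_u32_leb` and `list_sections` and the hand-built `EXAMPLE` module are made up for illustration.

```python
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "datacount",
}


def read_u32_leb(buf, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return (section name, payload size) for each top-level section."""
    if module[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    sections = []
    while pos < len(module):
        sec_id = module[pos]
        size, pos = read_u32_leb(module, pos + 1)
        sections.append((SECTION_NAMES.get(sec_id, "unknown"), size))
        pos += size  # jump over the payload to the next section header
    return sections


# Hand-built module (illustrative): two empty functions of type () -> ().
EXAMPLE = (
    b"\0asm\x01\x00\x00\x00"
    b"\x01\x04\x01\x60\x00\x00"              # type section: one functype
    b"\x03\x03\x02\x00\x00"                  # function section: two functions
    b"\x0a\x07\x02\x02\x00\x0b\x02\x00\x0b"  # code section: two bodies
)

for name, size in list_sections(EXAMPLE):
    print(name, size)
```

Even with two functions, only one `code` entry is listed, which is the mismatch with -ffunction-sections described above.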

There are bugs popping up due to the fact that we don't currently map llvm's concept of a section onto a wasm section: https://reviews.llvm.org/D74531

There is a wasm proposal out to make repeated sections a valid thing: https://github.com/WebAssembly/conditional-sections.

The fact that we can currently validate wasm object files with tools like wasm-validate is a feature I don't want to lose, so such tools would need to learn about conditional sections (or at least the multi-section part of it) before we would want to enable this by default.
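As a rough illustration of why existing validators would reject such files: the core wasm binary format allows each non-custom section to appear at most once. The sketch below is a hypothetical helper (`repeated_sections` is not part of wasm-validate or any other tool) that flags section ids which would trip that rule, given the sequence of section ids found in a module.

```python
CUSTOM_SECTION_ID = 0  # custom sections may repeat freely


def repeated_sections(section_ids):
    """Return non-custom section ids that occur more than once, in first-repeat order."""
    seen = set()
    repeated = []
    for sec_id in section_ids:
        if sec_id == CUSTOM_SECTION_ID:
            continue
        if sec_id in seen and sec_id not in repeated:
            repeated.append(sec_id)
        seen.add(sec_id)
    return repeated


# Today's layout: one code section (10) and one data section (11).
print(repeated_sections([1, 3, 10, 11]))
# Proposed layout: several code and data sections in one object file.
print(repeated_sections([1, 3, 10, 10, 10, 11, 11]))
```

A tool that learned the multi-section part of the proposal would, under this assumption, simply stop treating the second case as an error.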


dschuff · 2020-03-09T17:23:09Z

I think this makes sense. IIUC the current state is that object files can't be loaded without being relocated (i.e. they can't run correctly) but they do validate, right? We could preserve that property by just declaring that they use the conditional-sections proposal, and that any tool that wants to process them has to support that proposal (and of course those tools would still maintain "mvp" object file support). I think that also means we can do it as soon as the proposal is stable enough and supported by tools; we don't necessarily have to wait until all the browsers support it (as long as we're comfortable with "shipping" before stage 4, at risk of having to break compatibility or maintain extra hacks if things change).

aardappel · 2020-03-09T18:32:48Z

There are currently a lot of tools that will let you look at the contents of a .o even though they don't understand that it is different in some way from a regular .wasm; it'd be a shame to have all those stop working. So we'd have to make an effort to fix all of them. We're not the authors of all of them :)

Also, I am not following what information is gained by putting a function in a code section by itself, since a code section carries no information other than.. its size? Seems to me the linking data referring to segments of a code section or to a whole code section would be entirely equivalent, what am I missing?

sbc100 · 2020-03-17T17:03:51Z

You are correct, it's very useful that many tools can inspect object files. Requiring those tools to be aware of the multi-sections thing is (as far as I can tell) the main/only downside to this change.

But I think it's worth it. Aside from binaryen and wabt, how many other object inspection tools are there out there? If it's only one or two then I'm certainly prepared to do the work on them too.

The benefits are mostly for consistency and simplicity of internal representation within llvm. There are two primary places I'm thinking about:

1. Any tool that uses llvm's libObjectFile to iterate through sections. We expect each function to be in its own section since wasm is always -ffunction-sections. If I have 3 functions I expect to see 3 code sections in the objdump output.
2. The linker works on the granularity of sections. We currently subdivide the data and code sections into subsections (that we call "chunks" in the current wasm-ld code) in order to work around this.
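The subdivision in the second point can be sketched as follows. This is not wasm-ld's actual code, just an illustration in Python of splitting a monolithic code section payload into per-function pieces, relying only on the core binary format (a function count followed by size-prefixed bodies). The names `read_u32_leb` and `split_code_section` are hypothetical.

```python
def read_u32_leb(buf, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def split_code_section(payload):
    """Split a code section payload into one byte string per function body."""
    count, pos = read_u32_leb(payload, 0)
    chunks = []
    for _ in range(count):
        size, pos = read_u32_leb(payload, pos)
        chunks.append(payload[pos:pos + size])
        pos += size
    if pos != len(payload):
        raise ValueError("trailing bytes after the last function body")
    return chunks


# Two bodies: an empty function, and one containing a single nop.
CODE_PAYLOAD = b"\x02" + b"\x02\x00\x0b" + b"\x03\x00\x01\x0b"

for index, chunk in enumerate(split_code_section(CODE_PAYLOAD)):
    print(index, chunk.hex())
```

With one code section per function, this extra pass would not be needed, because the section boundaries themselves would carry the same information.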

Also the motivating issue: https://reviews.llvm.org/D74531. Here clang is expecting the AST to live in its own "section", but in the current model data sections are not modeled as sections at all but as segments (sub-sections of the data section which llvm tools don't know about).

tlively · 2020-03-17T22:02:22Z

There is precedent for requiring tools to implement stage 3 proposals to read object files: all object files currently contain a data count section whether or not bulk memory is enabled for their contents. So I think requiring tools to implement a proposal to continue reading object files is acceptable, as long as that proposal is reasonably stable and we are confident that it will eventually be standardized. I would not say the conditional sections proposal is quite there yet.

This is enough to make it work up until llvm-cov tries to read the named data sections in the binary and can't find them. For this final part to work we probably need to switch the object format to using multiple code and data sections: WebAssembly/tool-conventions#138. Not sure if it's worth submitting this part in isolation without a fully working solution? See #13046

Emit __clangast in a custom section instead of a named data segment so that it can be found while iterating sections. This could be avoided if all data segments (in the wasm sense) were represented as their own sections (in the llvm sense), which can be resolved by WebAssembly/tool-conventions#138. The on-disk hashtable in clangast also needs to be aligned to 4 bytes, so add padding to the name length field in the custom section header. The length of the clangast section name can be represented in 1 byte of LEB128, and the maximum possible padding is 3 bytes, so the section name length field won't become invalid in theory. Fixes https://bugs.llvm.org/show_bug.cgi?id=35928 Differential Revision: https://reviews.llvm.org/D74531
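The padding trick described in that change can be illustrated with a small Python sketch. It is not the code from D74531; it only shows the general idea that a LEB128 value may be written with redundant continuation bytes, so the name length field can absorb up to 3 bytes of padding. The function names, the `start_offset` parameter, and the assumption that the payload starts right after the name are all hypothetical.

```python
def encode_u32_leb_padded(value, width):
    """Encode value as unsigned LEB128 using exactly `width` bytes (1 to 5)."""
    if not 1 <= width <= 5:
        raise ValueError("a u32 LEB128 takes between 1 and 5 bytes")
    out = bytearray()
    for i in range(width):
        byte = value & 0x7F
        value >>= 7
        if i < width - 1:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    if value != 0:
        raise ValueError("value does not fit in the requested width")
    return bytes(out)


def padded_name_field(name, start_offset, align=4):
    """Return length field + name bytes so that the data after them is aligned.

    start_offset is the (assumed) file offset at which the length field begins.
    """
    raw = name.encode("utf-8")
    if len(raw) >= 128:
        raise ValueError("sketch assumes the length fits in one LEB128 byte")
    unpadded_end = start_offset + 1 + len(raw)
    pad = (-unpadded_end) % align  # 0 to align - 1 extra bytes
    return encode_u32_leb_padded(len(raw), 1 + pad) + raw


print(encode_u32_leb_padded(10, 1).hex())  # 0a
print(encode_u32_leb_padded(10, 4).hex())  # 8a808000
print(padded_name_field("__clangast", 10).hex())
```

Both encodings of 10 decode to the same value, which is why a reader that accepts non-minimal LEB128 still sees a valid name length.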

sbc100 mentioned this issue Feb 21, 2020

Using repeated sections in the wasm object file format. WebAssembly/conditional-sections#17

Open

sbc100 mentioned this issue Dec 18, 2020

Add initial support -fcoverage-mapping support emscripten-core/emscripten#13072

Open

kateinoigakukun mentioned this issue Oct 20, 2021

🍒 [WebAssembly] Emit clangast in custom section aligned by 4 bytes swiftlang/llvm-project#3451

Merged

sbc100 mentioned this issue Mar 25, 2022

[WebAssembly] Use WebAssembly custom sections in C/C++ source files llvm/llvm-project#54552

Closed
