-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mark JsValue idx as NonZeroU32 #155
Conversation
I noticed in the generated wasm for examples/dom that `JsStatic` takes two words of storage (for its `Option<JsValue>`) when it would seem to only need one, if we mandate that `0` is not a valid heap index. Changing the `.idx` type to `NonZeroU32` fixes this. However, LLVM doesn't know that these values are non-zero yet (rust-lang/rust#49572) which means in the implementation of Option's `get_or_insert_with` (https://doc.rust-lang.org/nightly/src/core/option.rs.html#774) LLVM is no longer able to prove that the `unreachable!()` branch is never hit and thus JsStatic ends up pulling in all the panic infrastructure where it previously didn't. This results in the examples/dom output being much larger, rather than smaller as I had hoped. Probably makes sense to wait for the upstream issue to be fixed so that this can be landed without regressing anything.
I wanted to post this for posterity but don't think it makes sense to merge this now so I'll close. |
Hm an interesting idea! While poking around this myself I saw a few opportunities for improvement: 4436c0e and 65acc3b, but unfortunately neither really helps here. I think enough inlining happens that we're not able to inform LLVM that the return value of the JS shims are nonzero (or inlining loses that information). As a result LLVM can't optimize the |
Interesting. It seems like your second commit actually generates worse (bigger) code if I'm reading right. Before: (block $B0
(if $I1
(i32.load
(i32.const 1024))
(then
(set_local $l0
(i32.load
(i32.const 1028)))
(br $B0)))
(i32.store
(i32.const 1028)
(tee_local $l0
(call $./dom.__wbg_static_accessor_document_document)))
(i32.store
(i32.const 1024)
(i32.const 1))) After: (block $B0
(if $I1
(i32.eq
(i32.load
(i32.const 1024))
(i32.const 1))
(then
(set_local $l0
(i32.load
(i32.const 1028)))
(br $B0)))
(i64.store
(i32.const 1024)
(i64.or
(i64.shl
(i64.extend_u/i32
(tee_local $l0
(call $./dom.__wbg_static_accessor_document_document)))
(i64.const 32))
(i64.const 1)))) Another thing I noticed is that in examples/dom, the two accesses to the |
I'm not actually sure what subpar codegen you had observed. |
Hm curious! I was actually inspecting LLVM IR rather than the wasm, but looking at the dom example the goal of the second commit was to remove an otherwise dead call to That being said LLVM's code generation for wasm I believe wasn't at the same level of quality you'd expect for x86/x86_64 at the time of LLVM 6 (what rustc is using). I believe the situation has improved in upstream LLVM (now on version 7), though. It may just be that there's subpar codegen here. I'm curious though that your snippets don't include a call to |
I'm just looking at your examples/dom in release mode with LTO and -Oz, the same 1508 bytes before wasm-opt and 632 after wasm-opt (which mostly just removes \00 in data) that you had mentioned. I don't recall seeing __wbindgen_object_drop_ref except in the .js (though it is curious that it's there yet not in the .wasm). rustc 1.27.0-nightly (48fa6f963 2018-04-05). |
Sorry, I misread – __wbindgen_object_drop_ref is called at the end of the file to drop document.body, but not anywhere else. |
Whoa ok now this is weird. So it turns out these are producing two wasm files:
When using I'm tempted though to leave it as this is pretty touchy based on LLVM settings and I think otherwise removing the extraneous call that can't be optimized away in some circumstances is probably best. I suspect that LLVM's backend will improve over time in micro-cases like this, but if it becomes a larger problem we can surely continue to investigate. |
I noticed in the generated wasm for examples/dom that
JsStatic
takes two words of storage (for itsOption<JsValue>
) when it would seem to only need one, if we mandate that0
is not a valid heap index. Changing the.idx
type toNonZeroU32
fixes this.However, LLVM doesn't know that these values are non-zero yet (rust-lang/rust#49572) which means in the implementation of Option's
get_or_insert_with
(https://doc.rust-lang.org/nightly/src/core/option.rs.html#774) LLVM is no longer able to prove that theunreachable!()
branch is never hit and thus JsStatic ends up pulling in all the panic infrastructure where it previously didn't. This results in the examples/dom output being much larger, rather than smaller as I had hoped. Probably makes sense to wait for the upstream issue to be fixed so that this can be landed without regressing anything.