Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark JsValue idx as NonZeroU32 #155

Closed
wants to merge 1 commit into from

Conversation

sophiebits
Copy link
Contributor

I noticed in the generated wasm for examples/dom that JsStatic takes two words of storage (for its Option<JsValue>) when it would seem to only need one, if we mandate that 0 is not a valid heap index. Changing the .idx type to NonZeroU32 fixes this.

However, LLVM doesn't know that these values are non-zero yet (rust-lang/rust#49572) which means in the implementation of Option's get_or_insert_with (https://doc.rust-lang.org/nightly/src/core/option.rs.html#774) LLVM is no longer able to prove that the unreachable!() branch is never hit and thus JsStatic ends up pulling in all the panic infrastructure where it previously didn't. This results in the examples/dom output being much larger, rather than smaller as I had hoped. Probably makes sense to wait for the upstream issue to be fixed so that this can be landed without regressing anything.

I noticed in the generated wasm for examples/dom that `JsStatic` takes two words of storage (for its `Option<JsValue>`) when it would seem to only need one, if we mandate that `0` is not a valid heap index. Changing the `.idx` type to `NonZeroU32` fixes this.

However, LLVM doesn't know that these values are non-zero yet (rust-lang/rust#49572) which means in the implementation of Option's `get_or_insert_with` (https://doc.rust-lang.org/nightly/src/core/option.rs.html#774) LLVM is no longer able to prove that the `unreachable!()` branch is never hit and thus JsStatic ends up pulling in all the panic infrastructure where it previously didn't. This results in the examples/dom output being much larger, rather than smaller as I had hoped. Probably makes sense to wait for the upstream issue to be fixed so that this can be landed without regressing anything.
@sophiebits
Copy link
Contributor Author

I wanted to post this for posterity but don't think it makes sense to merge this now so I'll close.

@sophiebits sophiebits closed this Apr 21, 2018
@alexcrichton
Copy link
Contributor

Hm an interesting idea! While poking around this myself I saw a few opportunities for improvement: 4436c0e and 65acc3b, but unfortunately neither really helps here. I think enough inlining happens that we're not able to inform LLVM that the return value of the JS shims are nonzero (or inlining loses that information). As a result LLVM can't optimize the as_ref().unwrap() because the imported JS functions in theory could return 0, but I'd also be fine inserting some unsafe code to get rid of the unwrap

@sophiebits
Copy link
Contributor Author

Interesting. It seems like your second commit actually generates worse (bigger) code if I'm reading right. Before:

    (block $B0
      (if $I1
        (i32.load
          (i32.const 1024))
        (then
          (set_local $l0
            (i32.load
              (i32.const 1028)))
          (br $B0)))
      (i32.store
        (i32.const 1028)
        (tee_local $l0
          (call $./dom.__wbg_static_accessor_document_document)))
      (i32.store
        (i32.const 1024)
        (i32.const 1)))

After:

    (block $B0
      (if $I1
        (i32.eq
          (i32.load
            (i32.const 1024))
          (i32.const 1))
        (then
          (set_local $l0
            (i32.load
              (i32.const 1028)))
          (br $B0)))
      (i64.store
        (i32.const 1024)
        (i64.or
          (i64.shl
            (i64.extend_u/i32
              (tee_local $l0
                (call $./dom.__wbg_static_accessor_document_document)))
            (i64.const 32))
          (i64.const 1))))

Another thing I noticed is that in examples/dom, the two accesses to the document static both check if it has been initialized, whereas the second check should be unnecessary. Not sure if there's a way to convince LLVM of that? At least in theory it seems like it could know that nothing else assigns to that memory.

@sophiebits
Copy link
Contributor Author

I'm not actually sure what subpar codegen you had observed.

@alexcrichton
Copy link
Contributor

Hm curious! I was actually inspecting LLVM IR rather than the wasm, but looking at the dom example the goal of the second commit was to remove an otherwise dead call to __wbindgen_object_drop_ref. That function should never be needed in the deref impl but LLVM couldn't convince itself of that (needing some unsafe code).

That being said LLVM's code generation for wasm I believe wasn't at the same level of quality you'd expect for x86/x86_64 at the time of LLVM 6 (what rustc is using). I believe the situation has improved in upstream LLVM (now on version 7), though. It may just be that there's subpar codegen here.

I'm curious though that your snippets don't include a call to __wbindgen_object_drop_ref which should have been eliminated after the second commit. What benchmark are you using to generate those snippets? I'll try to poke around as well and see if we can coerce rustc/llvm to do a better job.

@sophiebits
Copy link
Contributor Author

sophiebits commented Apr 22, 2018

I'm just looking at your examples/dom in release mode with LTO and -Oz, the same 1508 bytes before wasm-opt and 632 after wasm-opt (which mostly just removes \00 in data) that you had mentioned. I don't recall seeing __wbindgen_object_drop_ref except in the .js (though it is curious that it's there yet not in the .wasm). rustc 1.27.0-nightly (48fa6f963 2018-04-05).

@sophiebits
Copy link
Contributor Author

Sorry, I misread – __wbindgen_object_drop_ref is called at the end of the file to drop document.body, but not anywhere else.

@alexcrichton
Copy link
Contributor

Whoa ok now this is weird. So it turns out these are producing two wasm files:

$ cargo build --release --target wasm32-unknown-unknown
$ cargo rustc --release --target wasm32-unknown-unknown -- --emit llvm-ir,link

When using cargo build there is indeed no extraneous calls to drop_ref, but when using cargo rustc there's an extraneous call to that because deref isn't inlined. Sure enough though I can reproduce the slight tweaks in codegen (a few more instructions) after the commit rather than before the commit.

I'm tempted though to leave it as this is pretty touchy based on LLVM settings and I think otherwise removing the extraneous call that can't be optimized away in some circumstances is probably best. I suspect that LLVM's backend will improve over time in micro-cases like this, but if it becomes a larger problem we can surely continue to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants