RFC: ArrayBuffer Views #5
… drawbacks and alternatives sections
Is it safe to provide access to See how See rust-lang/rust#39271 (comment) for discussion. I wonder how JavaScript itself deals with sNaNs here? |
@matklad Right you are, and this is exactly the kind of worry I had about aliasing -- thank you so much for helping to zero in on the issues! So one thing I think we could do here would be to create a custom Rust type that overloads |
It's not immediately obvious to me why we can't do the same kind of trick Rust's Unless there's something about signaling NaNs we like and want to keep for some reason, though I've never dealt with that before.
Right, we can--that's what I'm suggesting about using a custom type that overloads indexing (instead of the native
We definitely don't want them--signalling NaNs can raise a SIGFPE signal and abort the process. |
Ha, and of course we already aren’t using native slice types, we’re using the |
(I’ll revise the RFC with these ideas and see what people think.) |
My understanding is that the problem with sNaN is a bit more nuanced. On x86/ARM, sNaNs don't cause SIGFPE in practice (the relevant bit in the control register is off), so the CPU is not the problem. But in LLVM, sNaNs are not produced by any instruction except for transmute, so LLVM assumes that having an f64 value with an sNaN bit pattern is UB. When LLVM does optimizations, it assumes that any code that observes an sNaN is dead, and so on.
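For readers who have not looked at NaN encodings before, here is a minimal sketch of what "an f64 value with an sNaN bit pattern" means. It assumes the common IEEE 754 binary64 convention used on x86 and ARM, where the most significant mantissa bit is the "quiet" bit. The helper names are made up for illustration and are not part of Neon, Rust's standard library, or LLVM.

```rust
/// Mask for the 11-bit exponent field of an IEEE 754 binary64 value.
const EXP_MASK: u64 = 0x7FF0_0000_0000_0000;
/// Mask for the 52-bit mantissa (fraction) field.
const MANT_MASK: u64 = 0x000F_FFFF_FFFF_FFFF;
/// Most significant mantissa bit; assumed here to be the "quiet" bit (x86/ARM convention).
const QUIET_BIT: u64 = 0x0008_0000_0000_0000;

/// Returns true if `bits` encodes any NaN: exponent all ones, mantissa nonzero.
pub fn is_nan_bits(bits: u64) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// Returns true if `bits` encodes a signaling NaN: a NaN with the quiet bit clear.
pub fn is_signaling_nan_bits(bits: u64) -> bool {
    is_nan_bits(bits) && (bits & QUIET_BIT) == 0
}
```

Because the checks work on the raw `u64`, they never materialize a floating-point value, so they sidestep the question of what the compiler is allowed to assume about an `f64` holding such a pattern.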
It would be great to see a prototype here: I am afraid that Rust Index/IndexMut are not flexible enough for this, because you have to return |
Ah, I see. Thank you for the explanation! So
Yep, I just realized this. Argh! Back to brainstorming... |
Hm, I now think that I am confusing this with another IEEE754 issue: rust-lang/rust#10184. :) Better to read the thread on rust-lang/rust#39271, I guess. :) The consensus seems to be that LLVM, not the CPU, is the problem, and that we actually just don't know what LLVM's guarantees are.
So what does JavaScript itself do about this problem? Is it possible to get an sNaN into a JavaScript variable because of this?
Since typed arrays give you the ability to write any bit patterns you like, you can generate the bit pattern of an sNaN in the buffer. But the only way for JavaScript to interpret those bits as a floating-point number is to use a But the spec says nothing specifically about sNaNs; it only says the implementation can choose any NaN bit pattern it wants. The spec says that a JS engine has to always produce the exact same NaN bit pattern whenever writing a NaN to a typed array. I'm not sure if JS engines actually obey that in the real world--there's been a lot of debate about what the spec should claim here, and the spec may not reflect reality (in which case the spec is probably wrong, not reality--JS engine vendors kind of get to call the shots on this issue). But if that rule is in fact legitimate, then it demonstrates that there's no such thing as an sNaN value in JS, since it basically means you can never observe differences between NaN values. That's at the level of semantics, though. At the level of implementation, I don't know if any JS engines would allow an sNaN bit pattern to get passed around as a JS value. I would guess not, but I don't know for sure. |
So maybe the best we can do here for floating point types is just to use an API that requires explicit methods for indexing into the slice, instead of being able to use

```rust
let f = slice.get(i);
slice.set(j, f);
```
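To make the explicit `get`/`set` idea a little more concrete, here is a rough sketch of what such a wrapper could look like. Everything in it is an assumption for illustration: `F64View` is a hypothetical name, not a Neon type, and canonicalizing every NaN to `f64::NAN` is just one possible policy for keeping sNaN bit patterns from reaching Rust code as values.

```rust
/// Hypothetical view over raw buffer bytes that only exposes f64 values
/// through explicit accessor methods (not a real Neon type).
pub struct F64View<'a> {
    bytes: &'a mut [u8],
}

impl<'a> F64View<'a> {
    pub fn new(bytes: &'a mut [u8]) -> Self {
        F64View { bytes }
    }

    /// Number of whole f64 elements that fit in the underlying bytes.
    pub fn len(&self) -> usize {
        self.bytes.len() / 8
    }

    /// Reads element `i` in native byte order, replacing any NaN with `f64::NAN`.
    pub fn get(&self, i: usize) -> f64 {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.bytes[i * 8..i * 8 + 8]);
        let f = f64::from_ne_bytes(raw);
        if f.is_nan() { f64::NAN } else { f }
    }

    /// Writes element `i` in native byte order, replacing any NaN with `f64::NAN`.
    pub fn set(&mut self, i: usize, f: f64) {
        let f = if f.is_nan() { f64::NAN } else { f };
        self.bytes[i * 8..i * 8 + 8].copy_from_slice(&f.to_ne_bytes());
    }
}
```

The cost of this shape is that callers never get a `&[f64]`, so code written against native slices cannot be used on the view directly.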
I'm realizing there's a similar potential issue with integers, too, since it's at least theoretically possible that at some point we'll need to always canonicalize them to be little-endian, even on big-endian targets. Years ago, I advocated for the spec mandating this but didn't succeed. Still, if the web actually comes to rely on little-endian, which I feel pretty confident it will, any big-endian architectures that ever come along and want to implement JavaScript engines are going to have to normalize integers in typed arrays to little-endian in order to be compatible with the ecosystem of JS content in the world. It's also possible that big-endian is just dead and we don't have to worry about it. The ergonomic impact of |
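As a sketch of what "always canonicalize to little-endian" could mean for an integer accessor, the helpers below read and write `u32` elements through explicit byte conversions. The function names are hypothetical and not part of any proposed Neon API; on little-endian targets the conversions are no-ops, and on a big-endian target they would swap bytes.

```rust
/// Reads the `i`-th u32 element from `bytes`, interpreting it as little-endian
/// regardless of the host byte order. Hypothetical helper for illustration.
pub fn get_u32_le(bytes: &[u8], i: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&bytes[i * 4..i * 4 + 4]);
    u32::from_le_bytes(raw)
}

/// Writes `value` as the `i`-th u32 element of `bytes` in little-endian order.
pub fn set_u32_le(bytes: &mut [u8], i: usize, value: u32) {
    bytes[i * 4..i * 4 + 4].copy_from_slice(&value.to_le_bytes());
}
```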
I think the biggest problem is not the ergonomics, but the fact that we must expose @BurntSushi, did you have a chance to dig into the question of LLVM handling of signaling NaNs? I think you were going to: rust-lang/rust#39271 (comment) :) |
Good point. There’s no way to prevent a typed array from having sNaN bit patterns in it, so for this to have fully defined behavior either Rust has to be able to support sNaN or we can’t expose native slices. Definitely interested in the results of @BurntSushi’s research! :) |
We can, before giving access to |
Hm, this comment hints that maybe there are no problems after all here? servo/webrender#2027 (comment) |
Even if that weren’t expensive it wouldn’t be an option, unfortunately — typed arrays are used to hold heterogeneous information, so even if you’re looking at a section of it as float data other parts might be integers. So changing their bits would break programs. But I think I’m starting to convince myself that we may be able to use actual slices for all types after all, with no canonicalization. I’m basing this on a few arguments:
If we can get away with this, the benefit is getting to interop with all the Rust ecosystem that uses slices, as you say, and maybe also compiler optimizations around slices (I’m not sure). |
OK, yeah, there's no reason why we needed cslice in the first place. We use its ABI in two (morally identical) functions, https://github.com/neon-bindings/neon/blob/master/crates/neon-runtime/src/neon.cc#L202-L206 and instead of a single out-pointer for a pair, we can simply pass two out-pointers for the two fields, as stack-local variables: https://github.com/neon-bindings/neon/blob/master/src/js/binary.rs#L42-L47 And then we can call |
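A rough sketch of the two-out-pointer shape, for anyone following along without the linked sources open. It assumes the two fields are a base pointer and a length. `RawBuffer` and `buffer_data` are hypothetical stand-ins written in Rust so the example is self-contained (the real function lives on the C++ side), and using `slice::from_raw_parts_mut` to build the slice is this sketch's choice, not a quote from the thread.

```rust
use std::ptr;
use std::slice;

/// Hypothetical stand-in for a native buffer handle.
#[repr(C)]
pub struct RawBuffer {
    data: *mut u8,
    len: usize,
}

/// Hypothetical stand-in for the native function: instead of filling in one
/// out-pointer to a (pointer, length) pair, it writes each field through its
/// own out-pointer.
unsafe extern "C" fn buffer_data(
    buf: *const RawBuffer,
    base_out: *mut *mut u8,
    len_out: *mut usize,
) {
    unsafe {
        *base_out = (*buf).data;
        *len_out = (*buf).len;
    }
}

/// Calls the native-style function with two stack-local out variables and
/// builds a Rust slice from the results.
pub fn as_mut_slice(storage: &mut Vec<u8>) -> &mut [u8] {
    let handle = RawBuffer { data: storage.as_mut_ptr(), len: storage.len() };
    let mut base: *mut u8 = ptr::null_mut();
    let mut len: usize = 0;
    unsafe {
        buffer_data(&handle, &mut base, &mut len);
        // The returned lifetime is tied to `storage` by the function signature.
        slice::from_raw_parts_mut(base, len)
    }
}
```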
…ce` and `CMutSlice`
…iew" is the generic name used in the specs and docs for typed array types) in favor of `ArrayBufferData`, which is the name used in the actual JS spec
@matklad I've updated this RFC according to our discussion. I feel good about the state of it, so I'll probably tag it with "final comment period" soon. |
text/0000-array-buffer-views.md
While it requires `unsafe` code, this design allows users to define their own `ViewTypes` for compound types such as tuples or structs.

## Convenience methods
Is this a typical pattern in Rust? I haven't seen it before. What's the benefit over turbofish, just simpler syntax for new users?
let a = buf.as_u32_slice()[0];
let b = buf.as_slice::<u32>()[1];
Yeah I figured turbofish is nice for people who are used to it, but in my experience it's been confusing to teach to new Rust programmers. So I figured the conveniences would be nice for tutorials, teaching material, slide decks, etc.
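To show how the two spellings relate, here is a small self-contained sketch in which the convenience method is a thin wrapper over the generic one. The `Buffer` type, its `Vec<u64>` backing store (chosen only to guarantee alignment in the example), and the `ViewType` impls are assumptions made for illustration; they mirror the method names from this thread but are not the RFC's actual definitions.

```rust
use std::mem::{align_of, size_of};
use std::slice;

/// Hypothetical marker trait for element types where every bit pattern is valid.
/// Implementing it is `unsafe` because the implementor vouches for that property.
pub unsafe trait ViewType: Copy {}
unsafe impl ViewType for u8 {}
unsafe impl ViewType for u32 {}

/// Hypothetical buffer backed by u64 words so the storage is suitably aligned.
pub struct Buffer {
    storage: Vec<u64>,
}

impl Buffer {
    pub fn new(storage: Vec<u64>) -> Self {
        Buffer { storage }
    }

    /// Generic form, called with turbofish: `buf.as_slice::<u32>()`.
    pub fn as_slice<T: ViewType>(&self) -> &[T] {
        assert!(size_of::<T>() > 0 && align_of::<T>() <= align_of::<u64>());
        let byte_len = self.storage.len() * size_of::<u64>();
        // Sound under the ViewType contract plus the size/alignment check above.
        unsafe {
            slice::from_raw_parts(self.storage.as_ptr() as *const T, byte_len / size_of::<T>())
        }
    }

    /// Convenience form that hides the type parameter: `buf.as_u32_slice()`.
    pub fn as_u32_slice(&self) -> &[u32] {
        self.as_slice::<u32>()
    }
}
```

With this shape, the convenience methods add no new capability; they only give tutorials a spelling that avoids introducing turbofish syntax early.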
OK, one more question for anyone who may have an opinion:
I think the first two are my favorites. The latter is nice but it could confuse people into thinking it's only for So I'm leaning towards |
…wType` so they make sense for both `JsArrayBuffer` and `JsBuffer`
Pushed the name change. |
This RFC proposes modifying the `JsArrayBuffer` API to allow viewing the underlying buffer data with different binary formats, similar to how typed arrays work in JS. This gives safe Rust more expressive control to manipulate the data.