Implement type casts (non-transmute & transmute) #116
Comments
We currently have conversions to and from arrays, and you can do the rest from there in a pinch. There didn't seem to be much hardware support for casts on Intel and ARM, so we pushed it to the back of the schedule. |
The only casts that seem broadly supported are between floats and integers of the same width, which we do have implemented. |
Arrays are more ergonomic if used with the array map method. |
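A minimal sketch of the array route, assuming plain arrays stand in for SIMD vectors (the function names `widen_lanes` and `narrow_lanes` are hypothetical and not part of core_simd). It only illustrates a lane-wise cast through the array `map` method; it makes no claim about the code the compiler generates for it.

```rust
/// Widen every lane from i8 to i16, preserving the numeric value
/// (sign extension). A plain array stands in for a SIMD vector.
pub fn widen_lanes<const LANES: usize>(lanes: [i8; LANES]) -> [i16; LANES] {
    lanes.map(i16::from)
}

/// Narrow every lane from i16 to i8. `as` keeps only the low 8 bits,
/// so out-of-range values wrap instead of saturating.
pub fn narrow_lanes<const LANES: usize>(lanes: [i16; LANES]) -> [i8; LANES] {
    lanes.map(|x| x as i8)
}
```

With a real vector type, the same pattern would be: convert to an array, `map`, convert back.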
My case is going between |
So you want two i8 at a time to combine into i16? or do you want to take in one integer register and produce two registers of data by sign extending each individual i8? |
The former. I'm parsing ints (quickly) and trying to see if I can use core_simd for the most part to do this: The other thing I might be missing is a nice way to switch to and from |
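The two readings in the question above differ in both lane count and result. A small sketch on plain arrays, with hypothetical function names, assuming 8 input lanes:

```rust
/// Reading 1 ("the former"): reinterpret each pair of adjacent i8 lanes
/// as one i16 lane. Total byte size is preserved and the lane count
/// halves; the numeric result depends on the target's endianness.
pub fn combine_pairs(bytes: [i8; 8]) -> [i16; 4] {
    let mut out = [0i16; 4];
    for (dst, pair) in out.iter_mut().zip(bytes.chunks_exact(2)) {
        *dst = i16::from_ne_bytes([pair[0] as u8, pair[1] as u8]);
    }
    out
}

/// Reading 2: sign-extend each i8 lane into its own i16 lane.
/// The lane count is preserved and the total byte size doubles,
/// which on real hardware means two output registers.
pub fn sign_extend(bytes: [i8; 8]) -> [i16; 8] {
    bytes.map(i16::from)
}
```

The first is a reinterpretation (a transmute in spirit), the second is a value-preserving cast.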
I can't look too closely right now, but that transmute_copy is suspicious. But yeah, you'll probably have some transmutes in there for a while until we fill in the edges of the API. For a fix that works immediately, you could convert this to the safe_arch crate, which has bytemuck support; then you're at least doing pretty much all safe code. |
Ah I see I can do |
Thanks. Suspicious yes, but I have ensured that we never look at memory
that isn't ours. I.e. the slice is always at least long enough when
transmuted. I had a look at bytemuck but it seemed I would need to turn it
into a fixed size array before using it as a Pod.
|
Is there a principled reason for this considering that both |
We haven't really prioritized conversions yet, but I think our general stance has been "this is a job for transmute, and safe transmute will make that easier". Converting between the std::arch types is a little different since |
Yeah if you deliberately want to change lane count like this while preserving register byte size, just go to the arch type and then from the arch type in your new lane count type using stdsimd. It's a hair verbose, but works totally fine. |
I hope the timeline for that feature works out well for SIMD purposes.
Going via the arch type isn't portable and adds an unnecessary intermediate step. |
Yeah, it doesn't look like it, honestly. That project seems like it has all but stalled. Even if not, I also think it might be worth including explicit ways to perform conversions, given how common they are in SIMD code. |
There's some prior art here in |
I think that those are just "transmute that we're willing to bless because it's so common". So, safe transmute makes them in some sense pointless. However, I do sympathize that safe transmute will take a while to come around. I think "just use transmute for now" kinda has to be the answer here though, unless we want to implement our own mini-safe-transmute API within the portable simd API. We could maybe do that, but I don't think we should stabilize that mini API in the long run, and I'm hesitant to suggest that anyone write down a bunch of code (probably in the form of weird macro stuff) that we know we'll never keep long term. |
Is there something more to this than defining the obvious I think for me, the thing that would push something like this over the edge is if this were a common source of I suspect the case is the former, although I'm not certain. If so, then yes, I think these APIs should absolutely be added unless there's some complexity here that I don't understand. |
Well, the approximate breakdown is:
So if the request, as given in the example code in the first post, is a general "cast" operation that "just works" then it feels like something that is unfortunately just slightly awkwardly out of scope. I admit that it's useful, but I don't think it best fits in this sub-project of rust in the long term. |
The implementation is trivial, the reason we've been avoiding it is because it's not clear what semantics I do think we can implement |
That might be a nice compromise assuming the codegen is the same. (And I assume it would be.) @hsivonen What do you think? |
Indeed, lane reinterpretation is currently a source of (Masks should zero-cost convert to integer lanes but not vice versa.) |
I took a quick crack at this for five minutes and am just commenting here so that it's staring me down when I return to it: in order to make this sound, this requires either reworking the way we currently apply our "can actually be implemented" lane limit, or else accepting the extremely awkward return signature of
    pub fn to_ne_bytes(self) -> [[u8; N]; LANES];
    pub fn from_ne_bytes([[u8; N]; LANES]) -> Self;
Frankly, I don't think that signature is acceptable for all the uses everyone has been wanting this for, but I do have ideas for how to grind the lane limit under my heel and make this right. |
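To make the shape of that signature concrete, here is a sketch on a plain `[u32; 4]` (so N = 4, LANES = 4). The function names are hypothetical. The nested form returns one byte array per lane, while the flat form is what byte-oriented callers usually want.

```rust
/// Nested form: one `[u8; 4]` per lane, i.e. `[[u8; N]; LANES]`.
pub fn lanes_to_ne_bytes(lanes: [u32; 4]) -> [[u8; 4]; 4] {
    lanes.map(u32::to_ne_bytes)
}

/// Inverse of `lanes_to_ne_bytes`.
pub fn lanes_from_ne_bytes(bytes: [[u8; 4]; 4]) -> [u32; 4] {
    bytes.map(u32::from_ne_bytes)
}

/// Flat form: the same 16 bytes in lane order, as `[u8; N * LANES]`.
pub fn lanes_to_flat_bytes(lanes: [u32; 4]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (chunk, lane) in out.chunks_exact_mut(4).zip(lanes.iter()) {
        chunk.copy_from_slice(&lane.to_ne_bytes());
    }
    out
}
```

Writing the flat return type generically needs the product `N * LANES` in a type position, which stable const generics cannot express; that is presumably part of why the nested signature came up at all.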
That only applies to full-width masks, bitmasks will zero-cost convert to a different type ( |
I agree. (I meant SIMD register mask types in my previous remark.) |
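A rough model of the mask distinction in the last few comments, using `[bool; 4]` as a stand-in for a mask type. All names are hypothetical, and the choice of lane 0 as the least significant bit of the bitmask is an assumption of this sketch.

```rust
/// Full-width mask to integer lanes: each lane becomes all ones (-1)
/// or all zeros. Every mask maps to a valid integer vector.
pub fn mask_to_int_lanes(mask: [bool; 4]) -> [i32; 4] {
    mask.map(|m| if m { -1 } else { 0 })
}

/// Integer lanes to mask: not every integer vector is a valid mask,
/// so this direction needs a check (or an unsafe unchecked variant).
pub fn int_lanes_to_mask(lanes: [i32; 4]) -> Option<[bool; 4]> {
    if lanes.iter().all(|&x| x == 0 || x == -1) {
        Some(lanes.map(|x| x == -1))
    } else {
        None
    }
}

/// Bitmask form: one bit per lane, packed into a small integer.
/// Assumption: lane 0 is the least significant bit.
pub fn mask_to_bitmask(mask: [bool; 4]) -> u8 {
    mask.iter()
        .enumerate()
        .fold(0, |acc, (i, &m)| acc | ((m as u8) << i))
}
```

The `Option` return is what "but not vice versa" amounts to in this model.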
@workingjubilee I tried my hand at this and I think the way to go is:
    trait ToBytes {
        type Bytes;
        #[doc(hidden)]
        fn to_bytes_impl(self) -> Self::Bytes {
            unsafe { core::mem::transmute(self) }
        }
        #[doc(hidden)]
        fn from_bytes_impl(bytes: Self::Bytes) -> Self {
            unsafe { core::mem::transmute(bytes) }
        }
    }
    ...
    pub fn to_ne_bytes(self) -> Self::Bytes { self.to_bytes_impl() }
    pub fn from_ne_bytes(bytes: Self::Bytes) -> Self { Self::from_bytes_impl(bytes) }
    ...
If we didn't have |
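As far as I can tell, the default-method bodies in the sketch above would not compile as written: `transmute` rejects types whose sizes are not known at that point, and `Self` is generic inside a trait. One way around it, shown here purely as an illustration on plain arrays (the trait, the macro, and the chosen type list are assumptions, not the code from the linked PR), is to generate a concrete impl per type so each `transmute` sees fixed sizes:

```rust
/// Hypothetical trait modeled on the sketch above.
pub trait ToBytes: Sized {
    type Bytes;
    fn to_ne_bytes(self) -> Self::Bytes;
    fn from_ne_bytes(bytes: Self::Bytes) -> Self;
}

// Emit one concrete impl per (element type, lane count, byte count).
macro_rules! impl_to_bytes {
    ($($elem:ty, $lanes:literal, $bytes:literal;)*) => {$(
        impl ToBytes for [$elem; $lanes] {
            type Bytes = [u8; $bytes];
            fn to_ne_bytes(self) -> Self::Bytes {
                // SAFETY: both sides are padding-free arrays of the same
                // size; transmute refuses to compile if the sizes differ.
                unsafe { core::mem::transmute(self) }
            }
            fn from_ne_bytes(bytes: Self::Bytes) -> Self {
                // SAFETY: every bit pattern is valid for the integer and
                // float element types listed below.
                unsafe { core::mem::transmute(bytes) }
            }
        }
    )*};
}

impl_to_bytes! {
    u16, 8, 16;
    u32, 4, 16;
    f32, 4, 16;
}
```

Because every implementor shares `[u8; 16]` as its byte type here, going through bytes also gives a lane reinterpretation, e.g. from `[u32; 4]` to `[u16; 8]`.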
Based on the limitations of the implementation in the linked PR, using |
changed the issue to also include transmutes (which weren't originally intended to be covered by this issue, but whatever...) |
rustc's |
By the way - wonder what's the suggested way to convert between lane types of different size but with the same number of lanes? (e.g. |
I wasn't able to find type casts, we need them:
This is different than transmuting (which is something we also need; we're planning on mostly relying on safe-transmute for this).
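To illustrate the distinction drawn here, a sketch on plain arrays with hypothetical function names: a cast converts each lane's value, while a transmute keeps each lane's bits and changes only the type.

```rust
/// Type cast: convert each lane's numeric value. Rust's `as` truncates
/// toward zero, saturates out-of-range values, and maps NaN to 0.
pub fn cast_lanes(v: [f32; 4]) -> [i32; 4] {
    v.map(|x| x as i32)
}

/// Transmute-style reinterpretation: keep each lane's bit pattern.
pub fn reinterpret_lanes(v: [f32; 4]) -> [i32; 4] {
    v.map(|x| x.to_bits() as i32)
}
```

For the lane value `1.0`, the cast yields `1`, while the reinterpretation yields `0x3f80_0000`, the IEEE 754 bit pattern of `1.0f32`.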