Implement type casts (non-transmute & transmute) #116

programmerjake · 2021-05-05T08:36:18Z

I wasn't able to find type casts, and we need them:

fn f(v: f32x16) -> i8x16 {
    v.cast() // doesn't currently work
}

This is different from transmuting (which is something we also need; we're planning on mostly relying on safe-transmute for this).

Lokathor · 2021-05-05T12:47:47Z

We currently have conversions to and from arrays, and the rest you can do from there in a pinch.

It didn't seem that there was much hardware support for casts within Intel and ARM so we pushed it to the back of the schedule.

calebzulawski · 2021-05-05T19:35:50Z

The only casts that seem broadly supported are those between floats and integers of the same width, which we do have implemented.

workingjubilee · 2021-05-07T13:28:07Z

Arrays are more ergonomic if used with the array map method.
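
The array route can be sketched with plain arrays standing in for the vector types. This is only an illustration: `cast_lanes` is a hypothetical helper, not something the crate provides, and it inherits the semantics of Rust's scalar `as` (truncate toward zero, saturate when out of range, NaN becomes 0), which is not necessarily what a hardware cast would do.

```rust
/// Hypothetical helper: lane-wise value cast from f32 lanes to i8 lanes,
/// with `[f32; 16]` / `[i8; 16]` standing in for `f32x16` / `i8x16`.
fn cast_lanes(v: [f32; 16]) -> [i8; 16] {
    // Scalar `as` truncates toward zero, saturates out-of-range values,
    // and maps NaN to 0.
    v.map(|lane| lane as i8)
}
```

With the real vector types, the same closure would sit between the existing to-array and from-array conversions.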

gilescope · 2021-05-08T17:53:42Z

My case is going between i8x16 <--> i16x8 and the like. I have an impressive amount of transmutes... there's probably a nicer way?

Lokathor · 2021-05-08T17:59:55Z

So you want two i8 at a time to combine into i16?

Or do you want to take in one integer register and produce two registers of data by sign extending each individual i8?
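
To make the two readings concrete, here is a rough sketch using plain arrays in place of `i8x16` and `i16x8`. Both function names are made up for illustration, and the first one depends on the platform's byte order.

```rust
/// Reading 1 (reinterpretation): each adjacent pair of bytes becomes one
/// i16 lane in native byte order, so 16 x i8 turns into 8 x i16 with the same bits.
fn reinterpret_pairs(v: [i8; 16]) -> [i16; 8] {
    let mut out = [0i16; 8];
    for (lane, pair) in out.iter_mut().zip(v.chunks_exact(2)) {
        *lane = i16::from_ne_bytes([pair[0] as u8, pair[1] as u8]);
    }
    out
}

/// Reading 2 (widening): every i8 lane is sign-extended to i16, so one
/// input register's worth of lanes fills two output registers.
fn sign_extend(v: [i8; 16]) -> ([i16; 8], [i16; 8]) {
    let mut lo = [0i16; 8];
    let mut hi = [0i16; 8];
    for i in 0..8 {
        lo[i] = i16::from(v[i]);
        hi[i] = i16::from(v[i + 8]);
    }
    (lo, hi)
}
```

The first keeps the register size and changes the lane count; the second keeps the lane values and doubles the storage.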

gilescope · 2021-05-08T18:50:49Z

The former. I'm parsing ints (quickly) and trying to see if I can use core_simd for the most part to do this:

https://github.com/pickfire/parseint/blob/b75bf5e47a3b2fdc1934bab046e677e033173e18/src/lib.rs#L151

The other thing I might be missing is a nice way to switch to and from __m128i for unsupported intrinsics.

Lokathor · 2021-05-08T18:57:41Z

I can't look too closely right now, but that transmute_copy is suspicious.

But yeah, you'll probably have some transmutes in there for a while until we fill in the edges of the API.

For a fix that works immediately you could convert this to the safe_arch crate, which has bytemuck support, then you're at least doing pretty much all safe code.

gilescope · 2021-05-08T18:57:49Z

Ah, I see I can do i8x16::from(__m128i). Excellent, that removes a fair few transmutes.

gilescope · 2021-05-10T10:15:13Z

Thanks. Suspicious yes, but I have ensured that we never look at memory that isn't ours. I.e. the slice is always at least long enough when transmuted. I had a look at bytemuck but it seemed I would need to turn it into a fixed size array before using it as a Pod.

hsivonen · 2021-05-23T14:14:04Z

the trait `From<SimdU8<16_usize>>` is not implemented for `SimdU16<8_usize>`

Is there a principled reason for this considering that both u8x16 and u16x8 convert to/from __m128i using from()/into()? Or just not done yet? (I want to do a zero-cost endianness-dependent re-interpretation of the register.)

calebzulawski · 2021-05-23T14:49:54Z

We haven't really prioritized conversions yet, but I think our general stance has been "this is a job for transmute, and safe transmute will make that easier". Converting between the std::arch types is a little different since __m128i doesn't encode integer type information, but maybe that should just be a transmute as well?

Lokathor · 2021-05-23T14:59:34Z

Yeah if you deliberately want to change lane count like this while preserving register byte size, just go to the arch type and then from the arch type in your new lane count type using stdsimd. It's a hair verbose, but works totally fine.

hsivonen · 2021-05-23T18:25:24Z

> We haven't really prioritized conversions yet, but I think our general stance has been "this is a job for transmute, and safe transmute will make that easier".

I hope the timeline for that feature works out well for SIMD purposes.

> Yeah if you deliberately want to change lane count like this while preserving register byte size, just go to the arch type and then from the arch type in your new lane count type using stdsimd. It's a hair verbose, but works totally fine.

Going via the arch type isn't portable and adds an unnecessary intermediate step.

thomcc · 2021-05-23T18:33:12Z

> I hope the timeline for that feature works out well for SIMD purposes.

Yeah, it doesn't look like it, honestly. That project seems like it has all but stalled.

Even if not, I also think it might be worth including explicit ways to perform conversions, given how common they are in SIMD code.

thomcc · 2021-05-23T18:34:58Z

There's some prior art here in fN::{from,to}_bits, uN::{to/from}_ne_bytes, etc.
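
For reference, the scalar prior art looks like this in stable Rust. The two wrapper functions are only there to show the round trips; the std methods they call are the real ones.

```rust
/// Reinterpret an f32's bits as a u32 and back. No numeric conversion happens.
fn f32_round_trip(x: f32) -> f32 {
    let bits: u32 = x.to_bits();
    f32::from_bits(bits)
}

/// Split a u16 into native-endian bytes and reassemble it.
fn u16_round_trip(x: u16) -> u16 {
    let bytes: [u8; 2] = x.to_ne_bytes();
    u16::from_ne_bytes(bytes)
}
```

Both pairs are safe functions, which is the property being asked for on the vector types.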

Lokathor · 2021-05-23T18:56:02Z

I think that those are just "transmute that we're willing to bless because it's so common". So, safe transmute makes them in some sense pointless. However, I do sympathize that safe transmute will take a while to come around.

I think "just use transmute for now" kinda has to be the answer here though, unless we want to implement our own mini-safe-transmute API within the portable simd API. We could maybe do that, but I don't think we should stabilize that mini API in the long run, and I'm hesitant to suggest that anyone write down a bunch of code (probably in the form of weird macro stuff) that we know we'll never keep long term.

BurntSushi · 2021-05-23T19:02:23Z

> unless we want to implement our own mini-safe-transmute API

Is there something more to this than defining the obvious From impls and implementing them with transmute?

I think for me, the thing that would push something like this over the edge is if this were a common source of unsafe in otherwise safe portable SIMD code. That would be pretty unfortunate and a compelling reason IMO for exposing safe bitcast APIs (just like std does for integers<->bytes). If you are otherwise writing unsafe everywhere, then a transmute or a pointer cast seems like less of a big deal.

I suspect the case is the former, although I'm not certain. If so, then yes, I think these APIs should absolutely be added unless there's some complexity here that I don't understand.

Lokathor · 2021-05-23T19:27:47Z

Well, the approximate breakdown is:

It's probably the former, that otherwise pure safe code would have a transmute in it.
We could just put a bunch of From impls but then that would fly in the face of how From currently works with other number types (which might be acceptable overall?).
If we don't use From we probably don't want like one method per destination type on each source type. Because that's a lot of methods kinda clogging up the docs and the auto-complete lists and so on. Possible, but kinda a poor solution.
Which would mean some sort of alternative trait for just these special transmutes we want to support. And that's where we'd unfortunately have a mini safe transmute API sneaking into the SIMD stuff.
The eventual Safe Transmute API would absolutely without a doubt cover everything we put in as well as valid transmutes we maybe forget to put in.

So if the request, as given in the example code in the first post, is a general "cast" operation that "just works" then it feels like something that is unfortunately just slightly awkwardly out of scope. I admit that it's useful, but I don't think it best fits in this sub-project of rust in the long term.

calebzulawski · 2021-05-23T19:27:51Z

> Is there something more to this than defining the obvious From impls and implementing them with transmute?

The implementation is trivial; the reason we've been avoiding it is that it's not clear what semantics From should carry. Should i16 implement From<[u8; 2]>? Probably not. I think transmute carries the meaning we want; the downside is the unsafe.

I do think we can implement {to,from}_bits for all vectors, which would allow a safe "transmute" with the slight inconvenience of two function calls instead of one.

BurntSushi · 2021-05-23T19:56:02Z

> I do think we can implement {to,from}_bits for all vectors, which would allow a safe "transmute" with the slight inconvenience of two function calls instead of one.

That might be a nice compromise assuming the codegen is the same. (And I assume it would be.) @hsivonen What do you think?

hsivonen · 2021-05-24T05:15:08Z

Indeed, lane reinterpretation is currently a source of unsafe in otherwise-safe code. to_bits/from_bits would work for me.

(Masks should zero-cost convert to integer lanes but not vice versa.)

workingjubilee · 2021-05-24T06:07:59Z

I took a quick crack at this for five minutes and am just commenting here so that it's staring me down when I return to it: in order to make this sound, this requires either reworking the way we currently apply our "can actually be implemented" lane limit, or else accepting the extremely awkward return signature of

pub fn to_ne_bytes(self) -> [[u8; N]; LANES];
pub fn from_ne_bytes(bytes: [[u8; N]; LANES]) -> Self;

Frankly, I don't think that signature is acceptable for all the uses everyone has been wanting this for, but I do have ideas for how to grind the lane limit under my heel and make this right.
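
To show what that nested shape looks like in practice, here is a small sketch on stable Rust. `I32Lanes` is a hypothetical stand-in for a vector of i32 lanes (so N is fixed at 4), not a type from the crate.

```rust
/// Hypothetical stand-in for a vector of `LANES` i32 lanes.
#[derive(Clone, Copy, Debug, PartialEq)]
struct I32Lanes<const LANES: usize>([i32; LANES]);

impl<const LANES: usize> I32Lanes<LANES> {
    /// Returns one 4-byte array per lane rather than one flat byte array.
    fn to_ne_bytes(self) -> [[u8; 4]; LANES] {
        self.0.map(i32::to_ne_bytes)
    }

    /// Rebuilds the lanes from one 4-byte array per lane.
    fn from_ne_bytes(bytes: [[u8; 4]; LANES]) -> Self {
        Self(bytes.map(i32::from_ne_bytes))
    }
}
```

A caller who wants to view four i32 lanes as sixteen u8 lanes still has to flatten the `[[u8; 4]; 4]` by hand, which is the awkward part.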

programmerjake · 2021-05-24T06:16:30Z

> (Masks should zero-cost convert to integer lanes but not vice versa.)

That only applies to full-width masks, bitmasks will zero-cost convert to a different type (u64 or similar?).

hsivonen · 2021-05-24T06:37:41Z

> > (Masks should zero-cost convert to integer lanes but not vice versa.)
>
> That only applies to full-width masks, bitmasks will zero-cost convert to a different type (u64 or similar?).

I agree. (I meant SIMD register mask types in my previous remark.)
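
The asymmetry can be sketched as follows, assuming a full-width mask is stored as integer lanes that are either all zeros or all ones. `Mask32x4` and its methods are hypothetical names used for illustration only.

```rust
/// Hypothetical full-width mask. Invariant: every lane is 0 (false) or -1 (true).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Mask32x4([i32; 4]);

impl Mask32x4 {
    fn from_bools(lanes: [bool; 4]) -> Self {
        Self(lanes.map(|set| if set { -1 } else { 0 }))
    }

    /// Mask -> integer lanes: always valid, it only exposes the stored bits.
    fn to_int(self) -> [i32; 4] {
        self.0
    }

    /// Integer lanes -> mask: needs a check, because arbitrary integers
    /// may violate the 0 / -1 invariant.
    fn from_int(lanes: [i32; 4]) -> Option<Self> {
        if lanes.iter().all(|&x| x == 0 || x == -1) {
            Some(Self(lanes))
        } else {
            None
        }
    }

    /// Bitmask form: one bit per lane, packed into a small integer.
    fn to_bitmask(self) -> u8 {
        self.0
            .iter()
            .enumerate()
            .fold(0u8, |acc, (i, &x)| acc | (((x != 0) as u8) << i))
    }
}
```

Only `to_int` is a pure reinterpretation here; `from_int` has to validate, and `to_bitmask` produces a different representation altogether.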

calebzulawski · 2021-05-24T12:17:21Z

@workingjubilee I tried my hand at this and I think the way to go is:

trait ToBytes: Sized {
    // Byte vector with the same size as `Self`.
    type Bytes;
    #[doc(hidden)]
    fn to_bytes_impl(self) -> Self::Bytes {
        // `transmute` is rejected on generic types of unknown size, so copy the bits.
        unsafe { core::mem::transmute_copy(&self) }
    }
    #[doc(hidden)]
    fn from_bytes_impl(bytes: Self::Bytes) -> Self {
        unsafe { core::mem::transmute_copy(&bytes) }
    }
}

...
    pub fn to_ne_bytes(self) -> Self::Bytes { self.to_bytes_impl() }
    pub fn from_ne_bytes(bytes: Self::Bytes) -> Self { Self::from_bytes_impl(bytes) }
...

If we didn't have LanesAtMost32 and used const_evaluatable_checked this would work fine, but due to the lane limits, implementing this on SimdI64<8> isn't possible (since it needs SimdU8<64> bytes).
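
A compilable variant of this idea on stable Rust might look like the sketch below, with plain arrays standing in for the vector types. It is not the code from the linked PR. It uses `transmute_copy`, since `transmute` cannot be applied to generic types whose sizes the compiler cannot compare, and it marks the trait `unsafe` so that each impl takes responsibility for choosing a matching `Bytes` type. The helper `i16x8_to_f32x4` is a made-up name.

```rust
use std::mem::{size_of, transmute_copy};

/// Hypothetical byte-conversion trait (arrays stand in for vectors).
///
/// Safety: implementors must choose a `Bytes` type with the same size as
/// `Self`, and both types must be valid for every bit pattern.
unsafe trait ToBytes: Copy {
    type Bytes: Copy;

    fn to_ne_bytes(self) -> Self::Bytes {
        assert_eq!(size_of::<Self>(), size_of::<Self::Bytes>());
        // SAFETY: sizes match (checked above); bit validity is promised by the impl.
        unsafe { transmute_copy(&self) }
    }

    fn from_ne_bytes(bytes: Self::Bytes) -> Self {
        assert_eq!(size_of::<Self>(), size_of::<Self::Bytes>());
        // SAFETY: same argument as in `to_ne_bytes`.
        unsafe { transmute_copy(&bytes) }
    }
}

unsafe impl ToBytes for [i16; 8] {
    type Bytes = [u8; 16];
}

unsafe impl ToBytes for [f32; 4] {
    type Bytes = [u8; 16];
}

/// Safe "transmute" in two calls: i16 lanes -> bytes -> f32 lanes.
fn i16x8_to_f32x4(v: [i16; 8]) -> [f32; 4] {
    <[f32; 4]>::from_ne_bytes(v.to_ne_bytes())
}
```

The caller never writes `unsafe`; the cost is going through the shared byte type in two steps.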

calebzulawski · 2021-05-24T12:44:02Z

Based on the limitations of the implementation in the linked PR, using transmute directly is usually the best solution for now, but if you absolutely must have no unsafe, you can use byte conversions as long as the type is not more than 32 bytes long. Hopefully in the future that limitation can be lifted, but right now that's blocked on #90.

programmerjake · 2021-05-24T14:35:30Z

Changed the issue to also include transmutes (which weren't originally intended to be covered by this issue, but whatever...)

programmerjake · 2021-10-26T07:44:08Z

rustc's simd_cast intrinsic needs to gain support for usize/isize:
https://github.com/rust-lang/rust/blob/17e13b549f5f83cd9ffca9a540090754eb95115c/compiler/rustc_codegen_llvm/src/intrinsic.rs#L1675-L1676

came up here:
https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd/topic/Coersing.20LLVM.20to.20emit.20mulhi/near/259060275

aldanor · 2021-12-03T16:28:29Z

By the way, I wonder what the suggested way is to convert between lane types of different sizes but with the same number of lanes (e.g. u8x16 to u16x16)?

programmerjake added the C-feature-request Category: a feature request, i.e. not implemented / a PR label May 5, 2021

calebzulawski mentioned this issue May 24, 2021

Add byte conversions #121

Merged

programmerjake changed the title ~~Implement type casts~~ Implement type casts (non-transmute & transmute) May 24, 2021

calebzulawski mentioned this issue Nov 3, 2021

Consider enabling other indices than usize #166

Open

okaneco mentioned this issue Oct 6, 2023

Using std::simd to speed-up unfilter for Paeth for bpp=3 and bpp=6 image-rs/image-png#414

Merged
