Use simd-lite #39
@Licenser it's awesome that this works! The developer opinions aren't a deal-breaker, I think we can fix that in future revisions anyway. |
@Licenser was just looking at the test failures, I think you might want to use this impl of vtstq. I did an integration earlier against your intrinsics work, and this is what I needed (which is why I was complaining about the private fields):
|
/// Compare bitwise test bits nonzero
#[inline]
#[target_feature(enable = "neon")]
#[cfg_attr(target_arch = "arm", target_feature(enable = "v7"))]
#[cfg_attr(test, assert_instr(cmtst))]
// even gcc compiles this to ldr: https://clang.godbolt.org/z/1bvH2x
// #[cfg_attr(test, assert_instr(ld1))]
pub unsafe fn vtstq_u8(a: uint8x16_t, b: uint8x16_t) -> uint8x16_t {
    vcgtq_u8(vandq_u8(a, b), vdupq_n_u8(0))
}
Yes, I think that should be the same as
sadly not even clang compiles |
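For readers without an ARM target at hand, here is a minimal scalar sketch of what the `vtstq_u8` body above computes, lane by lane. The function name `vtstq_u8_model` and the use of plain `[u8; 16]` arrays are assumptions made for illustration; this is not the intrinsic itself, only a portable model of `vcgtq_u8(vandq_u8(a, b), vdupq_n_u8(0))`.

```rust
/// Portable scalar model (hypothetical helper, not part of simd-lite or stdarch)
/// of a 16-lane "test bits" comparison.
///
/// For each lane: AND the inputs, then compare the result against zero.
/// An unsigned "greater than zero" is the same as "nonzero", so the lane
/// becomes all ones (0xFF) when any tested bit is set, and 0x00 otherwise.
pub fn vtstq_u8_model(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        let anded = a[i] & b[i]; // models vandq_u8
        out[i] = if anded > 0 { 0xFF } else { 0x00 }; // models vcgtq_u8 against vdupq_n_u8(0)
    }
    out
}
```

A lane holding `0b1010` tested against `0b0010` yields `0xFF`, while `0b0101` tested against `0b1010` yields `0x00`.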
I know it's ugly :( but it's more correct than mem::transmute; it's the load intrinsic that's made to, well... populate a register. |
Ideally we want a NEON equivalent of |
hmm I got an idea! We could use a trait to add a ::new() method how about that? |
Yes, I love that idea! Only problem is that we can't add it from outside the crate? |
If we control the trait we don't need to control the type :D and it compiles to a vld1 instruction one way or the other, so ¯_(ツ)_/¯ doesn't hurt to do the |
Ahhh damn, we can't have a trait with different arities for its functions, but we could pass in the slice; that's mildly better I think. |
crazily enough I think mem::transmute made the most sense 😂 |
interesting looks like they do compile down to the same: https://rust.godbolt.org/z/0ORCzh |
:D yay! I think the ::new is a lot nicer. |
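The trait idea discussed above can be sketched roughly as follows. Everything here is an assumption for illustration: the trait name `VectorInit`, its associated type, and the use of plain arrays as stand-ins for the NEON vector types. The thread only says that a trait adds a `::new()` that takes a slice; the real `NeonInit` signature is not shown in it. The point of the sketch is that a locally defined trait can be implemented for a type from another crate, and that taking a slice sidesteps the differing lane counts.

```rust
/// Hypothetical constructor trait. Because the trait is defined locally, it
/// can be implemented for types that live in another crate (here: array
/// types from `core`, standing in for vector types such as uint8x16_t).
pub trait VectorInit {
    /// Lane element type.
    type Elem;

    /// Build a vector from the first N lanes of `lanes`.
    /// Panics if `lanes` holds fewer elements than the vector has lanes.
    fn new(lanes: &[Self::Elem]) -> Self;
}

impl VectorInit for [u8; 16] {
    type Elem = u8;

    fn new(lanes: &[u8]) -> Self {
        let mut out = [0u8; 16];
        // On a real NEON target this is where a load intrinsic would go.
        out.copy_from_slice(&lanes[..16]);
        out
    }
}

impl VectorInit for [u16; 8] {
    type Elem = u16;

    fn new(lanes: &[u16]) -> Self {
        let mut out = [0u16; 8];
        out.copy_from_slice(&lanes[..8]);
        out
    }
}
```

With the trait in scope, a call such as `<[u8; 16] as VectorInit>::new(&bytes)` reads much like an inherent constructor, and a single method signature covers the 16-lane and 8-lane cases.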
sorry I left this on the wrong PR, oops! @Licenser did you find the bug? I think I might have it: 8875cff#diff-e3cc34fc9e09211fc01983167aaf86d7R47 (signed versus unsigned subtraction in utf8check, we had an error in the previous intrinsics) PS. ignore all the intrinsics changes, that's just me bisecting the problem |
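To illustrate why mixing up signed and unsigned subtraction matters, here is a small scalar sketch. It assumes the subtraction in question is a saturating one, which the `vqsubq_u8` entries in the commit list suggest but the thread does not spell out; the helper names are hypothetical and this is not the utf8check code itself.

```rust
/// Saturating subtraction with both operands treated as unsigned bytes
/// (the per-lane behaviour an unsigned saturating subtract would have).
pub fn sat_sub_unsigned(a: u8, b: u8) -> u8 {
    a.saturating_sub(b)
}

/// The same bit patterns, but interpreted as signed bytes before the
/// saturating subtraction, then reinterpreted as raw bits again.
pub fn sat_sub_signed_bits(a: u8, b: u8) -> u8 {
    (a as i8).saturating_sub(b as i8) as u8
}
```

For the inputs `0x80` and `0x01`, the unsigned version gives `0x7F`, while the signed version saturates at `i8::MIN` and gives `0x80`. For `0x01` and `0x02`, the unsigned version clamps to `0x00`, while the signed version gives `0xFF` (that is, -1). Bytes at or above `0x80` are exactly where UTF-8 lead and continuation bytes live, so the two interpretations can diverge on real input.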
Wow, well done! I think that did it!!! Once this all works we should give it a sweep to see if we can simplify some of those intrinsics, but most importantly it seems to actually work :D good bye |
@Licenser this looks good to me! Are there big changes you'd want to make before releasing the simd-lite crate, or are we getting close? |
I would really, really like to see if we can get rid of the copied code in simd-lite wherever possible before releasing it. And where we can't, we need to make sure it's properly attributed and includes the respective copyrights. |
Merging, beautiful work @Licenser ! |
* Put something in the readme so we can have a PR
* Add drone file
* update build status
* unguard for sse4.2 to allow rust to polyfill on older platforms
* Add more simd tests
* RFC: Neon support (pretty much working) (#35)
* feat: neon support
* feat: temp stub replacements for neon intrinsics (pending rust-lang/stdarch#792)
* fix: drone CI rustup nightly
* feat: fix guards, use rust stdlib for bit count operations
* fix: remove double semicolon
* feat: fancy generic generator functions, thanks @Licenser
* Update extq intrinsics
* Use simd-lite (#39)
* Use simd-lite
* Update badge
* Update badge
* Get rid of transmutes
* Use NeonInit trait
* vqsubq_u8 fix
* vqsubq_u8 fix pt. 2
* use reexprted values from simd-lite
* add simd-lite real version
Use simd-lite for ARM support.