Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sat math #100

Merged
merged 5 commits into from
Apr 26, 2021
Merged

Fix sat math #100

merged 5 commits into from
Apr 26, 2021

Conversation

calebzulawski
Copy link
Member

@calebzulawski calebzulawski commented Apr 20, 2021

Looks good to me, just want to merge it in

@workingjubilee
Copy link
Member

Looks like PPC now emits an LLVM Error on saturating math on SimdI128. How do we want to proceed?

@programmerjake
Copy link
Member

iirc, current PPC (until SimpleV comes around) only has 128-bit SIMD: how about just leave off #[repr(simd)] for 128-bit integers?

@programmerjake
Copy link
Member

or, just get rustc to use the fall-back scalar code for that case for the simd intrinsics

@calebzulawski
Copy link
Member Author

calebzulawski commented Apr 21, 2021

There's no reason you can't do 128 bit saturating math in a 128 bit vector. Definitely an LLVM issue. We should probably report it and use a scalar fallback for now.

Also worth noting this would fail on master, and only shows up because of the fixed tests.

@workingjubilee
Copy link
Member

Something was sus about the last merge, so rerunning CI.

@programmerjake
Copy link
Member

There's no reason you can't do 128 bit saturating math in a 128 bit vector. Definitely an LLVM issue. We should probably report it and use a scalar fallback for now.

True. I guess my point was that since PPC only supports 128-bit vectors and since a 128-bit vector of i128 should be equivalent to scalar i128, having #[repr(simd)] shouldn't give any performance advantage.

@workingjubilee
Copy link
Member

error: failed to unpack package console_error_panic_hook v0.1.6

Caused by:
failed to iterate over archive

well that wasn't what I was expecting.
Going to assume the merge errors were spurious.

@workingjubilee
Copy link
Member

There's no reason you can't do 128 bit saturating math in a 128 bit vector. Definitely an LLVM issue. We should probably report it and use a scalar fallback for now.

True. I guess my point was that since PPC only supports 128-bit vectors and since a 128-bit vector of i128 should be equivalent to scalar i128, having #[repr(simd)] shouldn't give any performance advantage.

#[repr(simd)] is about defining SIMD compatible types from our perspective, and it must be platform agnostic. I would be willing to drop SimdI128 and SimdU128 if we discern however that even AVX2 implements u128x2 / i128x2 "in software" rather than having any special instructions for them and that such is a disqualifying metric.

@programmerjake
Copy link
Member

#[repr(simd)] is about defining SIMD compatible types from our perspective, and it must be platform agnostic. I would be willing to drop SimdI128 and SimdU128 if we discern however that even AVX2 implements u128x2 / i128x2 "in software" rather than having any special instructions for them and that such is a disqualifying metric.

Guess we're dropping i128 vectors then:
https://gcc.godbolt.org/z/o1bGT859a

even bitwise or is scalarized when avx512dq is enabled -- for both i128x2 and i128x4

@programmerjake
Copy link
Member

programmerjake commented Apr 21, 2021

#[repr(simd)] is about defining SIMD compatible types from our perspective, and it must be platform agnostic. I would be willing to drop SimdI128 and SimdU128 if we discern however that even AVX2 implements u128x2 / i128x2 "in software" rather than having any special instructions for them and that such is a disqualifying metric.

Guess we're dropping i128 vectors then:
https://gcc.godbolt.org/z/o1bGT859a

even bitwise or is scalarized when avx512dq is enabled -- for both i128x2 and i128x4

(Edit: misread assembly, clang and gcc do have matching abis) gcc and clang don't even agree on how to pass i128x2 and i128x4 -- gcc passes them in avx2 and avx512 registers, clang passes them in memory.

Both gcc and clang scalarize a bitwise or:
https://gcc.godbolt.org/z/cTKTsfvrv

@calebzulawski
Copy link
Member Author

I'd be more interested in implementing these in rust with smaller integers than dropping them entirely.

@calebzulawski
Copy link
Member Author

@workingjubilee I opened an LLVM bug and stdsimd bug (#104)

@workingjubilee
Copy link
Member

I'd be more interested in implementing these in rust with smaller integers than dropping them entirely.

Well, if we're going to write a 2-limb vectorized BigInt why not just add full-featured vectorized BigInts?

@calebzulawski
Copy link
Member Author

Because rust only has i128 right now :)

@workingjubilee
Copy link
Member

workingjubilee commented Apr 25, 2021

Mm.

Even if we resolve the kind of quasi-philosophical objection here, I am somewhat concerned about adding operations we don't actually have reasonably efficient implementations for, unless we intend to bugfix them immediately, or if we are confident that they are useful if you enable higher-level SIMD features. We have actually managed to take things out and then (ahem) "circle back" and readd them once we solved the problem on some level, too, so I think we should do that again here.

It's not much of a SimdU128 if it's not actually doing any SIMD operations even with all the bells and whistles turned on, is it?

@calebzulawski
Copy link
Member Author

Well, we can transmute to another type for bit ops, and then use an explicit non-vector implementation for arithmetic with the intention of adding better arithmetic in the future.

/// let unsat = x.abs();
/// let sat = x.saturating_abs();
#[doc = concat!("assert_eq!(unsat, ", stringify!($name), "::from_array([MIN, 2, 0, 3]);")]
#[doc = concat!("let xs = ", stringify!($name), "::from_array([MIN, -2, 0, 3]);")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm why xs?

Copy link
Member

@workingjubilee workingjubilee Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mild case of whim / there's a common pattern of for x in xs
thus, implied plurality (a vector)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would just go with x I think, it is a vector but we're not really concerned about the particular lanes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I realized this is my own PR so I can't "approve" anything, but everything looks good to me, don't let this comment hold up a merge

@workingjubilee workingjubilee merged commit a9a1c9d into master Apr 26, 2021
@calebzulawski calebzulawski deleted the fix-sat-math branch August 7, 2021 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants