Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some avx512 to const generics #1042

Merged
merged 126 commits into from
Mar 5, 2021
Merged

some avx512 to const generics #1042

merged 126 commits into from
Mar 5, 2021

Conversation

minybot
Copy link
Contributor

@minybot minybot commented Mar 3, 2021

cvt_roundps_epi32,epu32
cvt_roundepi32,u32_ps
cvt_roundpd_ps,epi32,epu32
mm_scalef_round_ss,sd
mm_fmadd_round_ss,sd
mm_fmsub_round_ss,sd
mm_fnmadd_round_ss,sd
mm_fnmsub_round_ss,sd
mm_maskz_cvt_roundsd_ss
mm_cvt_roundss_si32,i32,u32; mm_cvt_roundsd_si32,i32,u32
mm_cvt_roundi32_ss
mm_cvt_roundu32_ss
mm_cvt_roundsd_ss
mm_add,sub,mul,div_round_ss,sd
mm_sqrt_round_ss,sd
cvt_roundps_pd,ph
cvt_roundph_ps
cvtps_ph
cvtt_roundps,pd_epi32,epu32
mm_max,min,getexp_round_ss,sd
mm_cvt_roundss_sd
cvt_roundss_si32,i32,u32
mm_cvtt_roundsd_si32
shuffle_epi32
shuffle_i32x4,f32x4,i64x2,f64x2
mm_cvtt_roundss,sd_u64,i64,si64
mm_cvt_roundss,sd_u64,i64,si64
mm_cvt_roundu64,i64,si64_ss,sd
shldi_epi64,epi32,epi16
shrdi_epi64,epi32,epi16
ror_epi32,epi64; rol_epi32,epi64;
srai_epi32

minybot added 2 commits March 3, 2021 01:28
…mm_add,sub,mul,div_round_ss,sd; mm_sqrt_round_ss,sd; mm_scalf_round_ss,sd; mm_fmadd,fmsub,fnmadd,fnmsub_round_ss,sd; mm_cvt_roundss_i32,u32; mm_cvt_roundsd_i32,u32; mm_cvt_roundi32,u32_ss; mm_cvt_roundsd_ss
…s,pd_epi32,epu32; mm_max,min_round_ss,sd; mm_getexp_ss,sd; mm_cvt_roundss_sd; cvt_roundss_si32,i32,u32; mm_cvtt_roundsd_si32,i32,u32
@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

@minybot
Copy link
Contributor Author

minybot commented Mar 3, 2021

_mm256_srai_epi32 (__m256i a, int imm8)
_mm256_mask_srai_epi32 (__m256i src, __mmask8 k, __m256i a, unsigned int imm8)
When doing _mm256_mask_srai_epi32, I usually call _mm256_srai_epi32.
But
---- verify_all_signatures stdout ----
failed to verify _mm256_mask_srai_epi32

  • failed to equate: unsigned int and PrimSigned(32) for _mm256_mask_srai_epi32

Solution 1. change stdarch-verify/x86-intel.xml to make u32 to i32
Do you think this is a good idea?

@Amanieu
Copy link
Member

Amanieu commented Mar 3, 2021

Can you cast the constant to i32? If that is not possible then just copy the implementation of _mm256_srai_epi32 into _mm256_mask_srai_epi32.

I would prefer not changing the intrinsics XML since it is our source of reference.

@minybot
Copy link
Contributor Author

minybot commented Mar 3, 2021

Can you cast the constant to i32?
I think it might be able to cast to i32

@minybot
Copy link
Contributor Author

minybot commented Mar 3, 2021

For mm_mask_srai_epi32:

  1. copy #[link_name = "llvm.x86.sse2.psrai.d"] to avx512f.rs
  2. cast from u32 to i32

_ => $expand!(15),
}
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add this macro?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake when training to solve merge conflict. I remove it now.

pub(crate) const VALID: () = {
let _ = 1 / ((IMM == 4 || IMM == 8 || IMM == 9 || IMM == 10 || IMM == 11) as usize);
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reuse the macros from x86/macros.rs here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I use #[macro_export] in x86/macros.rs?

@Amanieu
Copy link
Member

Amanieu commented Mar 4, 2021

By the way you need to rebase to address the merge conflicts.

@minybot
Copy link
Contributor Author

minybot commented Mar 5, 2021

The imm8 range for mm512_srai_epi32 is "i32". I tested it with clang and gcc.
Should we still use static_assert_imm8!(IMM8);
or something static_assert_i32!(IMM8) for the compatibility?

@Amanieu
Copy link
Member

Amanieu commented Mar 5, 2021

According to the intrinsics guide the instruction only reads the low 8 bits of imm8. Therefore we should enforce that the value fits in 8 bits.

}

#[allow(unused)]
macro_rules! static_assert_imm8u {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1047 just got merged which includes a static_assert_imm_u8 macro.

lqd and others added 27 commits March 5, 2021 20:03
The LLVM12 upgrade in rustc may be causing issues
@Amanieu Amanieu merged commit c3960ea into rust-lang:master Mar 5, 2021
Amanieu added a commit that referenced this pull request Mar 5, 2021
@minybot minybot deleted the avx512 branch March 5, 2021 21:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants