
Support @fastmath min #103

Closed · Seelengrab opened this issue Jan 13, 2023 · 2 comments · Fixed by #104


Seelengrab commented Jan 13, 2023

Now that we have @fastmath min in Base (JuliaLang/julia#47972), how do I make it work with SIMD.jl as well? Just calling @fastmath min(a, b) with a and b being Vecs of Float32 complains about a missing method:

julia> using SIMD, Base.Cartesian

julia> t = Vec(@ntuple 8 _ -> -Inf32)
<8 x Float32>[-Inf, -Inf, -Inf, -Inf, -Inf, -Inf, -Inf, -Inf]

julia> @fastmath min(t,t)
ERROR: MethodError: no method matching minnum(::NTuple{8, VecElement{Float32}}, ::NTuple{8, VecElement{Float32}}, ::SIMD.Intrinsics.FastMathFlags{128})

Closest candidates are:
  minnum(::T, ::T) where T<:(Union{Tuple{Vararg{VecElement{var"#s6"}, var"#s1"}} where var"#s1", var"#s6"} where var"#s6"<:Union{Float32, Float64})
   @ SIMD ~/.julia/packages/SIMD/Ls1Up/src/LLVM_intrinsics.jl:265

Stacktrace:
 [1] min_fast(x::Vec{8, Float32}, y::Vec{8, Float32})
   @ SIMD ~/.julia/packages/SIMD/Ls1Up/src/simdvec.jl:259
 [2] top-level scope
   @ REPL[5]:1

This should ideally map directly to vminps on x86 with AVX, but right now the non-@fastmath version also emits an additional vcmpunordps:

julia> @code_native debuginfo=:none syntax=:intel min(t,t)
	.text
	.file	"min"
	.globl	julia_min_346                   # -- Begin function julia_min_346
	.p2align	4, 0x90
	.type	julia_min_346,@function
julia_min_346:                          # @julia_min_346
	.cfi_startproc
# %bb.0:                                # %top
	#APP
	mov	rcx, qword ptr fs:[0]
	#NO_APP
	mov	rcx, qword ptr [rcx - 8]
	mov	rax, rdi
	mov	rcx, qword ptr [rcx + 16]
	mov	rcx, qword ptr [rcx + 16]
	#MEMBARRIER
	mov	rcx, qword ptr [rcx]
	#MEMBARRIER
	vmovups	ymm0, ymmword ptr [rsi]
	vmovups	ymm1, ymmword ptr [rdx]
	vminps	ymm2, ymm1, ymm0
	vcmpunordps	k1, ymm0, ymm0
	vmovaps	ymm2 {k1}, ymm1
	vmovaps	ymmword ptr [rdi], ymm2
	vzeroupper
	ret
.Lfunc_end0:
	.size	julia_min_346, .Lfunc_end0-julia_min_346
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits
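The extra compare-and-blend is the NaN handling: plain min must propagate NaN, while vminps on its own returns its second operand whenever either input is NaN, so the compiler checks for unordered lanes and blends the other input back in. A quick scalar check, just for illustration:

julia> min(NaN32, 1f0)   # Base min returns NaN if either argument is NaN
NaN32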

Compare to this workaround for length 8:

julia> my_min(a, b) = Vec(@ntuple 8 i -> @fastmath min(a[i], b[i]))
my_min (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel my_min(t,t)
	.text
	.file	"my_min"
	.globl	julia_my_min_295                # -- Begin function julia_my_min_295
	.p2align	4, 0x90
	.type	julia_my_min_295,@function
julia_my_min_295:                       # @julia_my_min_295
	.cfi_startproc
# %bb.0:                                # %top
	#APP
	mov	rcx, qword ptr fs:[0]
	#NO_APP
	mov	rcx, qword ptr [rcx - 8]
	mov	rax, rdi
	mov	rcx, qword ptr [rcx + 16]
	mov	rcx, qword ptr [rcx + 16]
	#MEMBARRIER
	mov	rcx, qword ptr [rcx]
	#MEMBARRIER
	vmovups	ymm0, ymmword ptr [rsi]
	vminps	ymm0, ymm0, ymmword ptr [rdx]
	vmovaps	ymmword ptr [rdi], ymm0
	vzeroupper
	ret
.Lfunc_end0:
	.size	julia_my_min_295, .Lfunc_end0-julia_my_min_295
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

KristofferC commented Jan 13, 2023

Did you try:

function my_min(a, b)
    mask = @fastmath a < b   # elementwise fastmath comparison; yields a Vec of Bool
    vifelse(mask, a, b)      # lanewise select: take a where mask is true, b otherwise
end

?

We could make that the fastmath min in SIMD.jl.
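Roughly, a sketch of what that method could look like (the Union{Float32,Float64} restriction just mirrors the existing minnum signature, and the actual method in #104 may well differ; this only spells out the vifelse idea above):

using SIMD

# Sketch: lower fastmath min/max for Vec to a plain compare-and-select.
# Under fastmath there is no NaN-propagation requirement, so LLVM can
# fold vifelse(a < b, a, b) into a single vminps/vmaxps.
@inline Base.FastMath.min_fast(a::Vec{N,T}, b::Vec{N,T}) where {N, T<:Union{Float32,Float64}} =
    vifelse(a < b, a, b)

@inline Base.FastMath.max_fast(a::Vec{N,T}, b::Vec{N,T}) where {N, T<:Union{Float32,Float64}} =
    vifelse(b < a, a, b)

With something like that in place, @code_native on @fastmath min(t, t) should show the same single vminps as the my_min listing above.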

Seelengrab (author) commented

I have not - I'm too used to @ntuple all the things :)
