Performance regression in broadcasting with CartesianIndices on v1.6.0-DEV #38086
julia> xu8d1 = rand(UInt8, 1000 * 1000); yu8d1 = rand(UInt8, 1000 * 1000);
julia> @btime $xu8d1 .+ $yu8d1;
  65.900 μs (2 allocations: 976.70 KiB)
Edit: Perhaps this is not a problem with the LLVM backend, as shown by
BTW, I've learned a bit about Julia's language features and x86 native code, but I'm not familiar with how the LLVM IR is generated. So I would appreciate it if someone could help me analyze the cause and fix it.
FWIW, this fixes it for me:
@inline function Base.copyto!(dest::AbstractArray, bc::Base.Broadcast.Broadcasted{Nothing})
    axes(dest) == axes(bc) || Base.Broadcast.throwdm(axes(dest), axes(bc))
    # Performance optimization: broadcast!(identity, dest, A) is equivalent to copyto!(dest, A) if indices match
    if bc.f === identity && bc.args isa Tuple{AbstractArray} # only a single input argument to broadcast!
        A = bc.args[1]
        if axes(dest) == axes(A)
            return copyto!(dest, A)
        end
    end
    bc′ = Base.Broadcast.preprocess(dest, bc)
    @inbounds @simd for I in eachindex(bc′)
        @inbounds dest[I] = bc′[I]
    end
    return dest
end
the diff is adding an
julia> xu8 = rand(UInt8, 1000, 1000); yu8 = rand(UInt8, 1000, 1000);
julia> @btime $xu8 .+ $yu8;
829.650 μs (2 allocations: 976.70 KiB)
julia> @inline function Base.copyto!(dest::AbstractArray, bc::Base.Broadcast.Broadcasted{Nothing})
           axes(dest) == axes(bc) || Base.Broadcast.throwdm(axes(dest), axes(bc))
           # Performance optimization: broadcast!(identity, dest, A) is equivalent to copyto!(dest, A) if indices match
           if bc.f === identity && bc.args isa Tuple{AbstractArray} # only a single input argument to broadcast!
               A = bc.args[1]
               if axes(dest) == axes(A)
                   return copyto!(dest, A)
               end
           end
           bc′ = Base.Broadcast.preprocess(dest, bc)
           @inbounds @simd for I in eachindex(bc′)
               @inbounds dest[I] = bc′[I]
           end
           return dest
       end
julia> @btime $xu8 .+ $yu8;
162.681 μs (2 allocations: 976.70 KiB)
julia> @btime $xu8 .+ $yu8;
  145.871 μs (2 allocations: 976.70 KiB)
Maybe the check hits a heuristic limit and decides not to emit all the different SIMD loop versions (for different possible broadcasted dimensions), instead doing none at all?
I'm slogging on the next minor version release of
After all this time I have checked the details of @chriselrod's workaround.
julia/base/multidimensional.jl Lines 454 to 456 in 6813340
Looking at #37829 (comment), it seems that the default boundary check is intentional. cc: @johnnychen94, @timholy
BTW, I am wondering if
I think it is fine to just move out the
I grep'd in this repository and found that For this reason, I think I would like to submit a PR and discuss it there.
I agree, but let's split it into two parts. One is the immediate fix here (moving the
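For readers following along, here is a minimal sketch of what moving `@inbounds` outside the loop body looks like. The names `add_inner!` and `add_outer!` are hypothetical and this is not the actual Base code. It assumes that placing `@inbounds` on the whole `@simd for` also covers the iteration over `CartesianIndices`, not only the indexing expression in the body.

```julia
# Hypothetical helpers for illustration only; not part of Base.

# `@inbounds` inside the loop body: only the indexing expression is annotated.
function add_inner!(dest, x, y)
    axes(dest) == axes(x) == axes(y) || throw(DimensionMismatch("axes must match"))
    @simd for I in CartesianIndices(dest)
        @inbounds dest[I] = x[I] + y[I]
    end
    return dest
end

# `@inbounds` outside the loop body: the whole loop, including the
# iteration over the indices, is annotated.
function add_outer!(dest, x, y)
    axes(dest) == axes(x) == axes(y) || throw(DimensionMismatch("axes must match"))
    @inbounds @simd for I in CartesianIndices(dest)
        dest[I] = x[I] + y[I]
    end
    return dest
end

x = rand(UInt8, 1000, 1000)
y = rand(UInt8, 1000, 1000)
# Both forms compute the same result; only the generated code may differ.
add_inner!(similar(x), x, y) == add_outer!(similar(x), x, y)
```

Both functions check the axes up front, so skipping bounds checks inside the loop is safe here. Timing the two with `@btime`, as in the comments above, is one way to see whether the placement matters on a given Julia build.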
move `@inbounds` outside the loop body. see JuliaLang#38086
revert changes in reshapedarray.jl use Iterators.rest Update broadcast.jl move `@inbounds` outside the loop body. see JuliaLang#38086
move `@inbounds` outside the loop body. see JuliaLang#38086 Update multidimensional.jl Update multidimensional.jl Update multidimensional.jl Update multidimensional.jl
Although I haven't identified the cause, I've noticed a ~10x slowdown in simple broadcasting operations on nightly.
`@code_native` result for "1-D" arrays (It's somewhat misleading. See comments below.)
The most noticeable difference is the LLVM version (i.e. 10 vs 11), but I have no evidence that LLVM 11 is the cause at the moment.
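One rough way to check whether a loop like this was vectorized is to look for vector types in the optimized LLVM IR instead of reading the native code by eye. The sketch below uses a hand-written stand-in kernel. `add_kernel!` and `llvm_ir` are hypothetical names, and the kernel is not the Base broadcast implementation. The outcome depends on the Julia and LLVM versions, so no expected output is shown.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical stand-in kernel, written in the form reported as slow:
# `@inbounds` only on the loop body, iterating over CartesianIndices.
function add_kernel!(dest, x, y)
    axes(dest) == axes(x) == axes(y) || throw(DimensionMismatch("axes must match"))
    @simd for I in CartesianIndices(dest)
        @inbounds dest[I] = x[I] + y[I]
    end
    return dest
end

# Capture the optimized LLVM IR for `f` with argument types `types` as a String.
function llvm_ir(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    return String(take!(io))
end

ir = llvm_ir(add_kernel!, Tuple{Matrix{UInt8}, Matrix{UInt8}, Matrix{UInt8}})

# Vector types such as `<16 x i8>` in the IR suggest the loop was vectorized.
println(occursin(r"<\d+ x i8>", ir) ? "vector ops found" : "no vector ops found")
```

Running the same check on two Julia builds gives a quick yes/no comparison, though it says nothing about why vectorization did or did not happen.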