Incompatible with MKL.jl #1683

Closed
danielwe opened this issue Jul 27, 2024 · 24 comments

Comments

@danielwe
Contributor

danielwe commented Jul 27, 2024

After loading MKL, taking a gradient through a BLAS-invoking function prints Error: no BLAS/LAPACK library loaded! and returns a zero gradient. Before loading MKL it works as expected. MWE:

julia> using Enzyme, LinearAlgebra

julia> f(x) = dot(x, x)
f (generic function with 1 method)

julia> gradient(Reverse, f, ones(1))
1-element Vector{Float64}:
 2.0

julia> using MKL

julia> gradient(Reverse, f, ones(1))
Error: no BLAS/LAPACK library loaded!
Error: no BLAS/LAPACK library loaded!
Error: no BLAS/LAPACK library loaded!
Error: no BLAS/LAPACK library loaded!
1-element Vector{Float64}:
 0.0

This is in a clean environment with current Enzyme, 0.12.25.

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
@danielwe
Contributor Author

It works with MKL.jl version 0.5.0 (current is 0.7.0), so this is probably the culprit: JuliaLinearAlgebra/MKL.jl#104

@danielwe
Contributor Author

Enzyme also works with JuliaLinearAlgebra/MKL.jl#164, which was reverted and never part of a release. Paging @ViralBShah and @amontoison: this LP64/ILP64 stuff seems a bit inscrutable.

@ViralBShah
Contributor

Basically, with INTERFACE_LP64, you get dgemm for 32-bit ints and dgemm_64 for 64-bit ints. This is what Julia expects.

With INTERFACE_ILP64, dgemm will use 64-bit ints which means you cannot get the routines with 32-bit ints.

@ViralBShah
Contributor

Out of curiosity, can you try some other linear algebra function, like a matvec?

@ViralBShah
Contributor

@staticfloat When we get this error with a missing forward, is it possible to print which symbol is problematic?

@danielwe
Contributor Author

danielwe commented Jul 28, 2024

> can you try some other linear algebra function, like a matvec?

Plain old gemv seems to work.

enzymemkl.jl:

using Enzyme, LinearAlgebra

N = 3
const A = rand(1, N)
@show A

f(x) = only(A * x)

x = ones(N)
@show f(x)
@show gradient(Reverse, f, x)

using MKL

@show f(x)
@show gradient(Reverse, f, x)

user@host:~$ julia --startup-file=no --project=@Enzyme enzymemkl.jl
A = [0.6367164769913026 0.2198472504406055 0.7289319277231969]
f(x) = 1.5854956551551052
┌ Warning: Using fallback BLAS replacements for (["dsymv_64_"]), performance may be degraded
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
gradient(Reverse, f, x) = [0.6367164769913026, 0.2198472504406055, 0.7289319277231969]
f(x) = 1.5854956551551052
gradient(Reverse, f, x) = [0.6367164769913026, 0.2198472504406055, 0.7289319277231969]

@ViralBShah
Contributor

ViralBShah commented Jul 29, 2024

It is using the fallback BLAS replacement - which suggests that MKL is not being used. What do you get with BLAS.lbt_get_config()?
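
A minimal sketch of that check, using only the LinearAlgebra standard library. The helper name `describe_blas_backends` is hypothetical; `BLAS.lbt_get_config()` and the `loaded_libs`, `libname`, and `interface` fields are assumed to behave as in Julia 1.10 and may differ in other versions.

```julia
using LinearAlgebra

# Summarize every backend library that libblastrampoline (LBT) currently
# forwards to, together with its integer interface (:lp64 or :ilp64).
function describe_blas_backends()
    config = BLAS.lbt_get_config()
    return [(name = basename(lib.libname), interface = lib.interface)
            for lib in config.loaded_libs]
end

foreach(println, describe_blas_backends())
```

Running this before and after `using MKL` should show whether the list of backends actually changes, which is the information the question above is after.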

@wsmoses
Member

wsmoses commented Jul 29, 2024

No, most likely the fallback is not being used here. There's probably a conditional check on whether the matrix is symmetric that picks either symv or gemv, so the symv fallback would be used only in that case, with gemv used here (decided by the runtime value).

@ViralBShah
Contributor

ViralBShah commented Jul 29, 2024

Something while loading Enzyme must be resetting the forwarding tables within LBT. @wsmoses Does Enzyme do anything special with BLAS in Julia? Does it use libblastrampoline?

@danielwe
Contributor Author

danielwe commented Jul 29, 2024

Note that regular BLAS calls keep working regardless of package combination. Enzyme doesn't break anything outside itself. It's just Enzyme's own gradient that stops working when MKL is loaded, for certain BLAS functions. Could it be that Enzyme's BLAS rules use the symbols exposed by OpenBLAS instead of those from LBT?

@ViralBShah
Contributor

The symbol names in openblas and MKL are the same (took a long time to get those changes into MKL and Apple). The error you are seeing is actually coming from LBT - which happens when some LBT function does not point to an appropriate BLAS function.

@danielwe
Contributor Author

Confirming that Enzyme works if JuliaLinearAlgebra/MKL.jl#164 is reinstated but everything else from the current release of MKL.jl is kept. In particular, if I dev MKL and then do git revert --no-commit 45256f1 (reverting the commit that reverted the PR), the MWE from the OP works again.

@danielwe
Contributor Author

danielwe commented Jul 29, 2024

I used the hack from MKL.jl's test suite to dump a stack trace instead of the no BLAS/LAPACK error:

function debug_missing_function()
    println("Missing BLAS/LAPACK function!")
    display(stacktrace())
end
BLAS.lbt_set_default_func(@cfunction(debug_missing_function, Cvoid, ()))

Here's the stack trace I get from the MWE:

Missing BLAS/LAPACK function!
20-element Vector{Base.StackTraces.StackFrame}:
 debug_missing_function() at enzymemkl.jl:5
 dot at blas.jl:345 [inlined]
 dot at blas.jl:395 [inlined]
 dot at matmul.jl:15 [inlined]
 f at enzymemkl.jl:16 [inlined]
 diffejulia_f_757wrap at enzymemkl.jl:0
 macro expansion at compiler.jl:6819 [inlined]
 enzyme_call at compiler.jl:6419 [inlined]
 CombinedAdjointThunk at compiler.jl:6296 [inlined]
 autodiff at Enzyme.jl:314 [inlined]
 autodiff at Enzyme.jl:326 [inlined]
 gradient(rm::ReverseMode{false, FFIABI, false}, f::typeof(f), x::Vector{Float64}) at Enzyme.jl:1031
 macro expansion at show.jl:1181 [inlined]
 top-level scope at enzymemkl.jl:25
 eval at boot.jl:385 [inlined]
 include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String) at loading.jl:2076
 _include(mapexpr::Function, mod::Module, _path::String) at loading.jl:2136
 include(mod::Module, _path::String) at Base.jl:495
 exec_options(opts::Base.JLOptions) at client.jl:318
 _start() at client.jl:552

For convenience, here's the implicated line in the revision I'm running: https://github.com/JuliaLang/julia/blob/v1.10.4/stdlib/LinearAlgebra/src/blas.jl#L345

As far as I can tell, the primal call goes through the same line, so it seems mysterious that it only errors when called from inside Enzyme.autodiff. I've confirmed that f(x) still works after the gradient has errored, which again shows that Enzyme isn't messing up LBT forwarding for other callers.

@danielwe
Contributor Author

danielwe commented Jul 30, 2024

OK, I'm beginning to see what's happening here.

  • Julia mostly uses Fortran BLAS symbols of the form :<t><func>_64_, such as :dgemv_64_. The exception is dot: for reasons related to calling/return conventions for complex numbers, BLAS.dot forwards to CBLAS symbols of the form :cblas_<t>dot64_. See the discussion in JuliaLinearAlgebra/libblastrampoline#56 (Missing dot(ComplexF32, ComplexF32) for LBT4 + MKL).
  • Enzyme.jl uses CBLAS calls when taking gradients of CBLAS calls, and Fortran calls when taking gradients of Fortran calls. Thus, when differentiating Julia's dot and hitting cblas_ddot64_, it inserts calls to cblas_dcopy64_ and cblas_daxpy64_. In contrast, for the working gemv example, when encountering dgemv_64_ it inserts calls to dgemv_64_ and dscal_64_. (Thanks so much for Enzyme.API.printall!!)
  • Whether cblas_dcopy64_ and cblas_daxpy64_ exist depends on which BLAS library is loaded. With OpenBLAS they do, but with MKL they are missing. Missing CBLAS symbols in MKL are a well-known issue. LBT provides adapters for the variants of dot, but not for other functions such as copy or axpy; see https://github.com/JuliaLinearAlgebra/libblastrampoline/blob/main/src/cblas_adapters.c. Hence the errors.
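
To see why differentiating dot pulls in axpy at all, the reverse pass can be sketched by hand: for r = dot(x, y), the shadow of x accumulates dr * y and the shadow of y accumulates dr * x, and both updates are axpy operations. This is only an illustration of the arithmetic, not Enzyme's generated code. `dot_pullback!` is a hypothetical name, and the sketch goes through Julia's `BLAS.axpy!` wrapper instead of the cblas_daxpy64_ symbol mentioned above.

```julia
using LinearAlgebra

# Hand-written reverse pass for r = dot(x, y), where dr is the incoming
# adjoint of r. BLAS.axpy!(a, X, Y) overwrites Y with a*X + Y.
function dot_pullback!(dx, dy, x, y, dr)
    BLAS.axpy!(dr, y, dx)  # dx .+= dr .* y
    BLAS.axpy!(dr, x, dy)  # dy .+= dr .* x
    return dx, dy
end

# f(x) = dot(x, x): both arguments are x, so both updates land in dx.
x = ones(3)
dx = zeros(3)
dot_pullback!(dx, dx, x, x, 1.0)
# dx is now [2.0, 2.0, 2.0], matching the gradient from the MWE
```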

The bug only affects dot, since that's the only function for which Julia uses the CBLAS interface.

It's not clear to me what the appropriate fix is. You can't really blame Enzyme for being consistent, although I guess a Julia-specific custom rule could be added to Enzyme.jl. Another solution would be for LBT to expand its collection of adapters to at least cover copy and axpy (but then there are second-order derivatives, which sounds like a game of whack-a-mole). Or maybe LBT could implement adapters in the other direction, allowing Julia to use Fortran symbols across the board?

@wsmoses
Member

wsmoses commented Jul 30, 2024

@danielwe the rule for axpy can only call axpy and dot, so recursive explosion isn't much of a concern: https://github.com/EnzymeAD/Enzyme/blob/3d97f1742790e0d03977dab1f15108b4b3fd12da/enzyme/Enzyme/BlasDerivatives.td#L219

@wsmoses
Member

wsmoses commented Jul 30, 2024

copy uses scal and dot, so the worst-case set is just copy/axpy/scal (in addition to dot): https://github.com/EnzymeAD/Enzyme/blob/3d97f1742790e0d03977dab1f15108b4b3fd12da/enzyme/Enzyme/BlasDerivatives.td#L253
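
The worst-case set can be double-checked with a small reachability sketch over the calls stated in this thread. The table `RULE_CALLS` and the function `reachable_routines` are hypothetical names. The thread does not say what the rule for scal calls, so its entry is left empty here as an assumption, not a statement about Enzyme.

```julia
# Routines each derivative rule may call, as stated in this thread.
# The entry for scal is an assumption: its rule is not described here.
const RULE_CALLS = Dict(
    "dot"  => ["copy", "axpy"],
    "axpy" => ["axpy", "dot"],
    "copy" => ["scal", "dot"],
    "scal" => String[],
)

# Collect every routine reachable from `root` by following rule calls.
function reachable_routines(root::String)
    seen = Set{String}()
    stack = [root]
    while !isempty(stack)
        name = pop!(stack)
        name in seen && continue
        push!(seen, name)
        append!(stack, RULE_CALLS[name])
    end
    return sort!(collect(seen))
end

println(reachable_routines("dot"))
```

Under these assumptions the set closes after four routines, so a finite list of CBLAS adapters would cover higher-order derivatives of dot as well.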

@danielwe
Contributor Author

That's good news!

What I find confusing is that both copy and axpy take an integer argument for the number of elements to act on, so MKL should be exporting 64-suffixed names under the LP64 interface. From the user guide:

On 64-bit platforms, selected domains provide API extensions with the _64 suffix (for example, SGEMM_64) for supporting large data arrays in the LP64 library, which enables the mixing of data types in one application. The selected domains and APIs include the following:

  • BLAS: Fortran-style APIs for C applications and CBLAS APIs with integer arguments
  • ...

https://www.intel.com/content/www/us/en/docs/onemkl/developer-guide-linux/2024-2/using-the-ilp64-interface-vs-lp64-interface.html

@staticfloat

staticfloat commented Jul 30, 2024

I just want you guys to know that I've spent the day looking into this (and building better tooling for LBT to make this easier to discover). The issue is that the name-mangling rules they chose for the CBLAS symbols are different enough from those for their Fortran symbols that LBT doesn't find them correctly. I'll figure out the right way to fix this soon.

@ViralBShah
Contributor

ViralBShah commented Jul 30, 2024

As @danielwe noted, dot, scal etc. are the only handful of BLAS functions that return scalar values creating issues with the fortran calling conventions, and hence we use CBLAS for those.

@staticfloat Will Apple Accelerate have similar issues too?

@staticfloat

Alright, I think I fixed the issue. If it is convenient, please try out JuliaLinearAlgebra/libblastrampoline#137 by building it and replacing the libblastrampoline.so.5 in your Julia's lib directory. It should be a drop-in replacement that "just works".

> Will Apple Accelerate have similar issues too?

I already solved the analogous issues with Accelerate.

@danielwe
Contributor Author

Confirming that the patched libblastrampoline.so.5 fixes both the MWE and the larger computation where I encountered this.

@staticfloat

Fantastic. I'll try to push a new LBT out soon.

@danielwe
Contributor Author

@wsmoses Looks like nothing needs to be done on the Enzyme end, so I'm closing this

staticfloat added a commit to JuliaLang/julia that referenced this issue Jul 31, 2024
This includes support to properly forward MKL v2024's ILP64 CBLAS
symbols, which fixes this Enzyme issue [0].

[0]: EnzymeAD/Enzyme.jl#1683
@staticfloat

Julia PR: JuliaLang/julia#55330

staticfloat added a commit to JuliaLang/julia that referenced this issue Aug 1, 2024
This includes support to properly forward MKL v2024's ILP64 CBLAS
symbols, which fixes this [Enzyme
issue](EnzymeAD/Enzyme.jl#1683)
KristofferC pushed a commit to JuliaLang/julia that referenced this issue Aug 2, 2024
This includes support to properly forward MKL v2024's ILP64 CBLAS
symbols, which fixes this [Enzyme
issue](EnzymeAD/Enzyme.jl#1683)

(cherry picked from commit 602b582)
KristofferC pushed a commit to JuliaLang/julia that referenced this issue Aug 8, 2024
This includes support to properly forward MKL v2024's ILP64 CBLAS
symbols, which fixes this [Enzyme
issue](EnzymeAD/Enzyme.jl#1683)

(cherry picked from commit 602b582)
lazarusA pushed a commit to lazarusA/julia that referenced this issue Aug 17, 2024
This includes support to properly forward MKL v2024's ILP64 CBLAS
symbols, which fixes this [Enzyme
issue](EnzymeAD/Enzyme.jl#1683)

4 participants