Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add factorize!() method for BunchKaufman #38

Open
MichelJuillard opened this issue Jan 21, 2024 · 8 comments
Open

Add factorize!() method for BunchKaufman #38

MichelJuillard opened this issue Jan 21, 2024 · 8 comments

Comments

@MichelJuillard
Copy link
Member

@RoyiAvital
Copy link

RoyiAvital commented Jan 21, 2024

Appreciate the support.
Will it support using MKL.jl and Accelerate.jl?

@MichelJuillard
Copy link
Member Author

I will test for MKL. I don't have the hardware to test for Accelerate. @RoyiAvital could you do that?

@RoyiAvital
Copy link

I will. Let me know when it is ready.

@MichelJuillard
Copy link
Member Author

@RoyiAvital Sorry, I got confused with the actual syntax

  1. factorize!() returns a tuple with sytrf!() output, not the factorization returned by BunchKaufman(). It is therefor stillnecessary to call BunchKaufman() after factorize!()
  2. The shorter syntax factorize!(ws, Symmetrical(x)) allocates more than `factorize!(ws, 'U', x)
  3. A the end, for Bunch Kaufman, there is little difference between using factorize!() or the lower level direct call to LAPACK.sytr!()
  4. Here is a working example with both appoaches
using FastLapackInterface
using LinearAlgebra

function loop_1!(vXs, mCs, ws)
    for mC in mCs    
        # factorization
        F1 = factorize!(ws, 'U', mC)
        F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))    
        # solving linear systems
        for vX in vXs
            ldiv!(F, vX)
        end 
    end
end

function approach1( order, iterations)
    # create workspace
    ws = Workspace(LAPACK.sytrf!, mCs[1])
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    loop_1!(vXs_1, mCs_1, ws)
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    @time loop_1!(vXs_1, mCs_1, ws)
end    

function loop_2!(vXs, mCs, ws)
    for mC in mCs    
        # factorization    
        A, ipiv, info = LAPACK.sytrf!(ws, 'U', mC)        
        F = BunchKaufman(mC, ipiv, 'U', true, false, BLAS.BlasInt(0))
        # solving linear systems
        for vX in vXs
            ldiv!(F, vX)
        end 
    end
end

function approach2(order, iterations)
    # create workspace
    ws = BunchKaufmanWs(mCs[1])
    
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    loop_2!(vXs_1, mCs_1, ws)
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    @time loop_2!(vXs_1, mCs_1, ws)
end    

order = 100
iterations = 10

mCs = []
vXs = []
for i = 1:iterations
    x = randn(order, order)
    mC = hermitianpart!(randn(n, n)).data
    push!(mCs, mC)
    push!(vXs, randn(order))
end

approach1(mCs, vXs)
approach2(mCs, vXs)
  1. It works with MKL (but seems slower than OpenBlas). Could you please try it with Accelerate?

@RoyiAvital
Copy link

Do these lines allocate?

F1 = factorize!(ws, 'U', mC)
F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))  
ldiv!(F, vX)

If not, this is perfect.

I will test on Accelerate.jl and report, no problem.

@MichelJuillard
Copy link
Member Author

It still allocates for a reason that I don't understand but very little. It doesn't depend on the size of the matrix.

@RoyiAvital
Copy link

I assume F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0)) is the allocating line, right?

@MichelJuillard
Copy link
Member Author

F1 = factorize!(ws, 'U', mC)

allocates 64 bytes per iteration

F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))  

allocate 48 bytes per iteration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants