Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle large integer exponents in matrix powers #55431

Closed
wants to merge 1 commit into from

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Aug 9, 2024

The idea here is that A^p is evaluated as A^Int(p) if isinteger(p), but Int(p) may error if p is large. In such cases, we may express the exponent as p = m*q + r, where m and r are small enough so that Int(m) and Int(r) succeed, and the operation becomes (A^m)^q * A^r. This is evaluated recursively by repeatedly dividing the exponent.

After this, the following works:

julia> [1 1e-10; 0 1] ^ (1e20)
2×2 Matrix{Float64}:
 1.0  1.0e10
 0.0  1.0

Fixes JuliaLang/LinearAlgebra.jl#1082, but I have not carried out any error analysis to check the accuracy of results in general. Suggestions are welcome. In particular, I'm unsure how the rounding in divrem impacts the results if the exponents are huge.

@jishnub jishnub added linear algebra Linear algebra arrays [a, r, r, a, y, s] labels Aug 9, 2024
@jishnub jishnub changed the title Handle large integer exponents in matrix exponentials Handle large integer exponents in matrix powers Aug 9, 2024
@jishnub jishnub marked this pull request as draft August 9, 2024 14:32
Copy link

@developfast developfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm, just suggested modifying some cases

# We split the exponentiation into parts for which the exponent can be converted to an Int
if p < 0
return _integerpow(inv(A), -p)
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps explicitly adding a case for p==0 to return the identity matrix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is handled by the A^0 call, which would also ensure type-stability.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm ok fair

# and we raise to the power of q by carrying out the decomposition recursively
A2 = A^Int(m)
q, r = divrem(p, m)
if q > 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this one can directly be q>0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the ==1 case is a no-op, so nicer to skip it.

if p < 0
return _integerpow(inv(A), -p)
end
m = typemax(Int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
m = typemax(Int)
m = prevpow(2, typemax(UInt))

typemax(Int) will not be accurate here (although arguably is not that far off). Note the inaccuracy of divrem(2.0^64, typemax(Int)) == (2.0, 0.0) due to the promotion Int -> Float64. Also, I believe the cost of power_by_squaring scales partly with the count_ones in the power.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the correct value here is floatintmax(p)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right that it needs to be <=maxintfloat(typeof(p)), otherwise the rem could be wrong. So I think it needs to be UInt(min(prevpow(2, typemax(UInt)), maxintfloat(typeof(p)))).

@developfast
Copy link

For the divrem issue, suppose p = 5*10^18 and m = typemax(Int) (which, for a 64-bit system, is 2^63 - 1 or approximately 9.22 × 10^18).

The result of divrem(p, m) would yield q = 0 and r = 5*10^18 (since p < m).
The computation effectively becomes A^(5*10^18), which is not broken down further because r still represents a large exponent. We could choose a smaller m but the tradeoff is using more recursion calls.

return _integerpow(inv(A), -p)
end
m = typemax(Int)
if p <= m
Copy link
Contributor

@mikmoore mikmoore Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow of all of this seems a little more complicated than necessary (maybe I'm wrong). What about just computing q, r = divrem(p, m) initially and branching on q > 0 rather than p <= m? It seems that could remove at least one if block.

Comment on lines +520 to +521
_integerpow(A::AbstractMatrix, p) = A^Integer(p)
function _integerpow(A::AbstractMatrix, p::Union{Float32, Float64})
Copy link
Contributor

@mikmoore mikmoore Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_integerpow(A::AbstractMatrix, p) = A^Integer(p)
function _integerpow(A::AbstractMatrix, p::Union{Float32, Float64})
_integerpow(A::AbstractMatrix, p::Integer) = A^p
function _integerpow(A::AbstractMatrix, p)

It seems that the long definition here is the more general? Apart from Float16 (although even there the overhead should be tiny), I'd feel safer with generic types avoiding the assumed-valid Integer conversion since that's how we got here in the first place.

EDIT: probably my proposed _integerpow(A::AbstractMatrix, p::Integer) is redundant altogether, if not directly overwriting another definition. I haven't traced the whole dispatch logic here.

@stevengj
Copy link
Member

Wouldn't a much simpler solution just be to replace isinteger(p) && return integerpow(A, p) with

isinteger(p) && typemin(Int)  p  typemax(Int) && return integerpow(A, p)

? i.e. just treat p the same as any floating-point exponent if it's too big?

@mikmoore
Copy link
Contributor

just treat p the same as any floating-point exponent if it's too big?

What about if the matrix is the negative identity, a permutation matrix, or a similarly "simple" matrix? Using the floating point exponentiation algorithm might cause a poor result. Although I can't say for sure because I haven't looked at the generic algorithm. Is it eigendecomposition based? I guess that would probably work for matrices with trivial diagonalizations (like my examples), but perhaps would cause some undesirable roundoff and error accumulation in others.

@stevengj
Copy link
Member

stevengj commented Nov 29, 2024

Eigendecompositions are only used for Hermitian matrices; for other matrices it's too dangerous because they might be close to defective.

But nevermind, even schurpow needs to call integerpow, so you can't replace the latter with the former.

Note that this PR needs to move to https://github.com/JuliaLang/LinearAlgebra.jl, I think?

@DilumAluthge
Copy link
Member

We have moved the LinearAlgebra stdlib to an external repo: https://github.com/JuliaLang/LinearAlgebra.jl

@jishnub If you think that this PR is still relevant, please open a new PR on the LinearAlgebra.jl repo.

@DilumAluthge DilumAluthge deleted the jishnub/integerpow branch January 12, 2025 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error on large matrix powers
6 participants