-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle large integer exponents in matrix powers #55431
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall lgtm, just suggested modifying some cases
# We split the exponentiation into parts for which the exponent can be converted to an Int | ||
if p < 0 | ||
return _integerpow(inv(A), -p) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps explicitly adding a case for p==0
to return the identity matrix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is handled by the A^0
call, which would also ensure type-stability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm ok fair
# and we raise to the power of q by carrying out the decomposition recursively | ||
A2 = A^Int(m) | ||
q, r = divrem(p, m) | ||
if q > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this one can directly be q>0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the ==1
case is a no-op, so nicer to skip it.
if p < 0 | ||
return _integerpow(inv(A), -p) | ||
end | ||
m = typemax(Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m = typemax(Int) | |
m = prevpow(2, typemax(UInt)) |
typemax(Int)
will not be accurate here (although arguably is not that far off). Note the inaccuracy of divrem(2.0^64, typemax(Int)) == (2.0, 0.0)
due to the promotion Int
-> Float64
. Also, I believe the cost of power_by_squaring
scales partly with the count_ones
in the power.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the correct value here is floatintmax(p)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right that it needs to be <=maxintfloat(typeof(p))
, otherwise the rem
could be wrong. So I think it needs to be UInt(min(prevpow(2, typemax(UInt)), maxintfloat(typeof(p))))
.
For the The result of |
return _integerpow(inv(A), -p) | ||
end | ||
m = typemax(Int) | ||
if p <= m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flow of all of this seems a little more complicated than necessary (maybe I'm wrong). What about just computing q, r = divrem(p, m)
initially and branching on q > 0
rather than p <= m
? It seems that could remove at least one if
block.
_integerpow(A::AbstractMatrix, p) = A^Integer(p) | ||
function _integerpow(A::AbstractMatrix, p::Union{Float32, Float64}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_integerpow(A::AbstractMatrix, p) = A^Integer(p) | |
function _integerpow(A::AbstractMatrix, p::Union{Float32, Float64}) | |
_integerpow(A::AbstractMatrix, p::Integer) = A^p | |
function _integerpow(A::AbstractMatrix, p) |
It seems that the long definition here is the more general? Apart from Float16
(although even there the overhead should be tiny), I'd feel safer with generic types avoiding the assumed-valid Integer
conversion since that's how we got here in the first place.
EDIT: probably my proposed _integerpow(A::AbstractMatrix, p::Integer)
is redundant altogether, if not directly overwriting another definition. I haven't traced the whole dispatch logic here.
Wouldn't a much simpler solution just be to replace isinteger(p) && typemin(Int) ≤ p ≤ typemax(Int) && return integerpow(A, p) ? i.e. just treat |
What about if the matrix is the negative identity, a permutation matrix, or a similarly "simple" matrix? Using the floating point exponentiation algorithm might cause a poor result. Although I can't say for sure because I haven't looked at the generic algorithm. Is it eigendecomposition based? I guess that would probably work for matrices with trivial diagonalizations (like my examples), but perhaps would cause some undesirable roundoff and error accumulation in others. |
Eigendecompositions are only used for Hermitian matrices; for other matrices it's too dangerous because they might be close to defective. But nevermind, even Note that this PR needs to move to https://github.com/JuliaLang/LinearAlgebra.jl, I think? |
We have moved the LinearAlgebra stdlib to an external repo: https://github.com/JuliaLang/LinearAlgebra.jl @jishnub If you think that this PR is still relevant, please open a new PR on the LinearAlgebra.jl repo. |
The idea here is that
A^p
is evaluated asA^Int(p)
ifisinteger(p)
, butInt(p)
may error ifp
is large. In such cases, we may express the exponent asp = m*q + r
, wherem
andr
are small enough so thatInt(m)
andInt(r)
succeed, and the operation becomes(A^m)^q * A^r
. This is evaluated recursively by repeatedly dividing the exponent.After this, the following works:
Fixes JuliaLang/LinearAlgebra.jl#1082, but I have not carried out any error analysis to check the accuracy of results in general. Suggestions are welcome. In particular, I'm unsure how the rounding in
divrem
impacts the results if the exponents are huge.