Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scale!!(::SparseVector, ::SparseVector, x) results in StackOverflowError #19

Closed
abraemer opened this issue Oct 21, 2024 · 7 comments
Closed

Comments

@abraemer
Copy link

abraemer commented Oct 21, 2024

By chance I found that trying to use KrylovKit.exponentiate where the vector is a SparseArrays.SparseVector results in a StackOverflowError due to an endless recursive call. I tracked the error to a call of scale!!(v1, v2, x) where both v1 and v2 are SparseArrays.SparseVectors. It appears to be some unfortunate combination of VectorInterface.jl and broadcasting rules in SparseArrays.jl that leads to the infinite recursion. I am totally unfamiliar with VectorInterface.jl so I didn't attempt to search for a fix so far.

MWE:

using VectorInterface, SparseArrays
scale!!(spzeros(5), spzeros(5), 1) # StackOverflowError

Stacktrace:

      [1] broadcast!
        @ ./broadcast.jl:844 [inlined]
      [2] copyto!
        @ ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/SparseArrays/src/higherorderfns.jl:1175 [inlined]
      [3] materialize!
        @ ./broadcast.jl:878 [inlined]
      [4] materialize!(dest::SparseVector{…}, bc::Base.Broadcast.Broadcasted{…})
        @ Base.Broadcast ./broadcast.jl:875
--- the above 4 lines are repeated 43644 more times ---
 [174581] broadcast!
        @ ./broadcast.jl:844 [inlined]
 [174582] copyto!
        @ ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/SparseArrays/src/higherorderfns.jl:1175 [inlined]
 [174583] materialize!
        @ ./broadcast.jl:878 [inlined]
 [174584] materialize!
        @ ./broadcast.jl:875 [inlined]
 [174585] scale!
        @ ~/.julia/packages/VectorInterface/1L754/src/abstractarray.jl:46 [inlined]
 [174586] scale!!(y::SparseVector{Float64, Int64}, x::SparseVector{Float64, Int64}, α::Int64)
        @ VectorInterface ~/.julia/packages/VectorInterface/1L754/src/abstractarray.jl:60
versioninfo

Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 4800H with Radeon Graphics
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver2)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)

Pkg.status()

(jl_MdwVUn) pkg> st
Status `/tmp/jl_MdwVUn/Project.toml`
  [409d34a3] VectorInterface v0.4.6
  [2f01184e] SparseArrays v1.11.0

@abraemer
Copy link
Author

abraemer commented Oct 21, 2024

Did some more experimentation. As can be seen from the error above the issue originates within scale!:

scale!(spzeros(5), spzeros(5), 1)  # StackOverflowError

Looking at its code, it simply amounts to:

v1 = spzeros(5); v2 = spzeros(5);
v1 .= scale!!.(v1,v2, (1,))

Now I am not sure why (1,) is used here but my guess is to "wrap" the scalar in order to broadcast it across the vectors. Canonically in Julia one would use a Ref for this I think. Interestingly, this

v1 .= scale!!.(v1,v2, Ref(1))

Just works without error.

So it might be that the somewhat non-standard wrapping of scalars into Tuple for broadcasting is causing this error. Perhaps we should consider changing all places using one-element tuples to Ref?

EDIT:
I tried to find sources for my claim that "Canonically in Julia one would use a Ref". However I don't find anything authoritative:

Sometimes, you want a container (like an array) that would normally participate in broadcast to be "protected" from broadcast's behavior of iterating over all of its elements. By placing it inside another container (like a single element Tuple) broadcast will treat it as a single value.

  • I did find recommendations to use Ref in the Julia discourse e.g. [1], [2]
  • Also the docstring of Ref tells:

Ref is sometimes used in broadcasting in order to treat the referenced values as a scalar.

So perhaps one could interpret this StackOverflowError as a bug in SparseArrays?

@Jutho
Copy link
Owner

Jutho commented Oct 21, 2024

I do think that if it works with Ref but not with a one-element tuple, this is indeed a bug in SparseArrays. I wouldn't object switching to Ref if this does not imply any loss in performance, but I am always confused by this conflation of intentions for Ref. It is supposed to make objects be referencable and have e.g. a pointer. This seems to imply memory allocation, in my mind at least, which I definitely do not want here. I assume this is not actually true for stack allocated variables, but still, I do find this very confusing, which is why I prefer one-element tuples.

@abraemer
Copy link
Author

I think Ref has some special casing in the context of broadcasting. IIRC there is some special care taken that each Ref is optimized out and never actually performed. Unfortunately I can't find an authoritative source for this.

I managed to reproduce the error without VectorInterface:

using SparseArrays
v1 = spzeros(5); v2 = spzeros(5);
v1 .= .*(v1, (1,))
v1 .= .*(v2, (1,)) # works
v1 .= .*(v1, v2, Ref(1)) # works
v1 .= .*(v1, v2, (1,)) # ERROR: StackOverflowError:

So I will open an issue with SparseArrays.jl.

@Jutho
Copy link
Owner

Jutho commented Oct 21, 2024

As an aside, I am always surprised seeing the use of SparseVector in this use case, where you have a sparse linear operator, but then you want to find something like "eigenvector", "solution of linear system", "matrix exponential on vector" (e.g. the things that KrylovKit.jl computes). Typically, the sparsity of the operator does not translate to its eigenvectors, solutions of linear systems, … . For the matrix exponential, it is maybe ok, if you take very small time steps, but I would also expect that it fills up rather quickly.

@abraemer
Copy link
Author

Yes I agree. It came up for during some exploration with small vectors where I didn't pay attention to the concrete vector types. So no it is not a pressing issue for me by any means :) But it was quite confusing and did cost me a couple of minutes to find the error source. That's why I wanted to report it.

Nonetheless, it might be beneficial to switch to Refs for broadcasting in the long run if that is a (more) standard way of doing so. Perhaps I will make a post on discourse to see whether we can get a better answer to that question.

@Jutho
Copy link
Owner

Jutho commented Nov 12, 2024

Despite the error being in SparseArrays.jl, I think this problem should at least be fixed (circumvented) in VectorInterface 0.4.9 (just released).

@Jutho
Copy link
Owner

Jutho commented Nov 12, 2024

Feel free to reopen if the issue reappears.

@Jutho Jutho closed this as completed Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants