
RFC/WIP: make setindex! not remove zeros from sparsity pattern #15568

Closed

Conversation

KristofferC
Member

Since sparse now keeps structural zeros, it makes sense for setindex! to do the same.

I also removed spdelete! since it is no longer used, but maybe it is preferable to keep it anyway?

Previous discussion: #9906

@@ -2676,7 +2610,7 @@ function setindex!{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, x, I::AbstractMatrix{Bool})
(xidx > n) && break
end # for col in 1:A.n

if (nadd != 0) || (ndel != 0)
if (nadd != 0)
n = length(nzvalB)
if n > (bidx-1)
Member Author


TODO: Can this happen anymore?

@KristofferC force-pushed the kc/setindex_keep_zero branch from bcbb39b to adc7508 on March 20, 2016 at 20:49
@KristofferC
Member Author

Right now I went with the hardcore option: always do the same thing no matter what the value is. The middle ground would be to avoid introducing new stored elements in the matrix when possible, e.g.:

A = spzeros(2,2)
A[1,1] = 0
nnz(A) == 0

@StefanKarpinski
Member

Right now I went with the hardcore option. Basically, always do the same no matter what the value is.

+1 this seems like the sanest approach coupled with explicit removal of zeros with dropzeros!.
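
For reference, a sketch of the pairing described here, using the SparseArrays names from later Julia versions (at the time of this thread these functions lived in Base):

```julia
using SparseArrays

A = sparse([1, 2], [1, 2], [1.0, 2.0])
A[1, 1] = 0.0        # under the proposal: value changes, entry stays stored
@assert nnz(A) == 2
dropzeros!(A)        # explicit, opt-in removal of stored zeros
@assert nnz(A) == 1
```

This keeps value semantics and structure management separate: assignment never touches the sparsity pattern, and dropzeros! is the single explicit way to compact it.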

@tkelman
Contributor

tkelman commented Mar 21, 2016

What does this do for e.g. A[:, j] = v for a SparseVector v? Or setting a subset of rows/cols equal to a smaller sparse matrix?

@KristofferC changed the title from "RFC: make setindex! not remove zeros from sparsity pattern" to "RFC/WIP: make setindex! not remove zeros from sparsity pattern" on Mar 21, 2016
@KristofferC
Member Author

Indexing with a sparse vector promotes it to a sparse matrix, and right now that removes stored elements in the matrix. Indexing with a dense matrix actually converts it to a sparse matrix and calls the same method.

That should probably be changed for consistency. I have some more changes for SparseVector as well, but they are not done yet.

@simonster
Member

I think people might find it weird that A[:, j] = 0 fills the column with stored zeros.

@tkelman
Contributor

tkelman commented Mar 21, 2016

Yeah, there's a bit of a distinction here between the scalar case where a stored zero makes sense, vs the vector/submatrix case where you probably do want two different syntaxes for either storing zeros from the input or not.

@KristofferC
Member Author

Would it be better to just start off with making A[i,j] and v[i] preserve zeros when i and j are integers?

@Sacha0
Member

Sacha0 commented Jun 16, 2016

Does closing #9906 require anything beyond this pull request and #16947?

This PR appears blocked on clarification of desired setindex! behavior. Two fundamental aspects of setindex! behavior seem at issue:

  1. Whether setindex! should purge allocated entries on zero assignment. (Concretely: Where A[i,j] is an allocated entry, should A[i,j] = 0 de-allocate A[i,j]?)
  2. Whether setindex! should introduce new allocated entries on zero assignment. (Concretely: Where A[i,j] is a not-allocated entry, should A[i,j] = 0 allocate entry A[i,j]?).

(1) appears uncontroversial. Consensus that setindex! should not purge allocated entries on zero assignment (that where A[i,j] is an allocated entry, A[i,j] = 0 should not de-allocate it) grew out of the various related discussions.

(2) appears to be the sticking point.

Concrete proposal for discussion:

Re. (1), setindex! should not purge allocated entries on zero assignment (existing consensus). Re. (2), setindex! should not introduce new allocated entries on zero assignment.
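
A sketch of the proposed semantics (hedged: this shows the behavior the proposal asks for, written with the later SparseArrays module name; it is not what Base did at the time of this PR):

```julia
using SparseArrays

A = sparse([1], [1], [1.0], 2, 2)   # one stored entry, at (1, 1)
A[1, 1] = 0.0    # (1): zero assignment does not purge the stored entry
@assert nnz(A) == 1
A[2, 2] = 0.0    # (2): zero assignment does not allocate a new entry
@assert nnz(A) == 1
A[2, 2] = 5.0    # nonzero assignment allocates as usual
@assert nnz(A) == 2
```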

Arguments:

As long as these aspects of behavior are clearly conceptually separated, this proposal is simple and consistent; that these two aspects of behavior have not always been clearly separated in preceding discussion seems to have been part of the bottleneck.

Having setindex! introduce new allocated entries on zero assignment seems like a nightmare for generic programming. For example, that behavior turns otherwise-innocuous statements like A[:,:] = 0 into OOM / performance traps.

Having setindex! introduce new allocated entries on zero assignment seems unintuitive in common cases, e.g. A[:,j] = v with scalar j and SparseVector v intuitively should set A[v.nzind, j] = v.nzval without filling the remainder of column j with stored numerical zeros.

Easy incremental injection of zeros does not require having setindex! introduce new allocated entries on zero assignment. On that front I favor @tkelman's position (see #9906 (comment) and #9906 (comment)): add a simple interface for adding and removing allocated entries, e.g. storezero!(A, i, j) and dropentry!(A, i, j), complementing but orthogonal to the setindex! interface.
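
The helper names storezero! and dropentry! come from the discussion above; the implementations below are only a sketch of how such an interface could look, not Base API (storezero! relies on zero assignment not purging entries, per point (1)):

```julia
using SparseArrays

# Hypothetical: allocate the (i, j) entry and leave a stored zero there.
function storezero!(A::SparseMatrixCSC{Tv}, i::Integer, j::Integer) where Tv
    A[i, j] = one(Tv)    # force allocation of the entry
    A[i, j] = zero(Tv)   # zero assignment keeps the entry stored
    return A
end

# Hypothetical: remove a single stored entry by splicing it out of the
# CSC arrays and shifting the later column pointers down by one.
function dropentry!(A::SparseMatrixCSC, i::Integer, j::Integer)
    r = nzrange(A, j)
    isempty(r) && return A
    k = searchsortedfirst(rowvals(A), i, first(r), last(r), Base.Order.Forward)
    if k <= last(r) && rowvals(A)[k] == i
        deleteat!(rowvals(A), k)
        deleteat!(nonzeros(A), k)
        A.colptr[(j + 1):end] .-= 1
    end
    return A
end

B = sparse([1, 2], [1, 2], [1.0, 2.0])
storezero!(B, 2, 1)   # B now has three stored entries
dropentry!(B, 1, 1)   # back to two stored entries
```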

Thoughts? Thanks, and best!

@KristofferC
Member Author

So, in short, setindex! should always try its best to leave the number of allocated entries the same? I agree with that, and with adding new methods for changing the internal structure of the matrix.

@Sacha0
Member

Sacha0 commented Jun 17, 2016

So, in short, setindex! should always try its best to leave the number of allocated entries the same?

Awesome rephrase! :)

@tkelman added the sparse (Sparse arrays) label on Jun 17, 2016
@JaredCrean2
Contributor

+1 for a clear summary and position statement. Re. 1, I agree, that's definitely the right behavior. Re. 2, I disagree. Instead, I argue the behavior of setindex! should be independent of the floating point values it is operating on. Otherwise, a calculation that sometimes produces an exactly zero value but other times produces an almost zero value (because of floating point error) would result in an inconsistent sparsity structure in the matrix. Deciding the non-zero structure of the matrix is a logical decision, and logic shouldn't be based on floating point values.

I also support creating an AbstractSparseMatrixCSC and having a number of different implementations that make different choices about setindex!/getindex, as others have mentioned in various issues.

@mauro3
Contributor

mauro3 commented Jun 17, 2016

Hitting zero exactly is not that uncommon: multiplying by 0.0 does it. However, in sums hitting 0.0 is indeed not guaranteed. If the sparsity structure should depend on sum==0.0, then it is the coder's responsibility to set those values to 0.0 if close enough. So, I'm +1 for @Sacha0's proposal.

@nalimilan
Member

@JaredCrean2 What's the problem with "an inconsistent sparsity structure"? I don't see any drawback to storing fewer zero entries when possible. Any correct algorithm will return the same result regardless of the sparsity structure, only faster if we store fewer zeros.

@KristofferC
Member Author

One "problem" is that the time complexity to assign a nonzero to an index then depends on the floating point value you previously assigned to that index.

@nalimilan
Member

Yes, but always choosing the slowest path doesn't sound worth it...

@JaredCrean2
Contributor

The other problem is memory consumption. Adding new non-zeros requires dynamically growing the vectors occasionally. The inconsistent sparsity structure leads to the matrix sometimes requiring x Gb of memory, but sometimes requiring 1.5x (or whatever the growth factor is). The difference becomes more pronounced as the matrix gets larger too.

@KristofferC
Member Author

How often do you really introduce a new stored value through indexing, though? Isn't the most common pattern to use sparse(I, J, V) to create the sparsity structure once and then just use that matrix over and over? A dynamically changing sparsity pattern should be a quite rare use case, and then it is probably best to just rebuild the matrix from scratch.

@JaredCrean2
Contributor

I agree this should be quite rare, and my preferred implementation of AbstractSparseMatrixCSC would be one that doesn't allow changing the sparsity structure, but I think if dynamically changing the sparsity structure is allowed, the result should be more deterministic.

Either way, this PR is definitely a step in the right direction, and once AbstractSparseMatrixCSC exists there can be different implementations.

@Sacha0
Member

Sacha0 commented Jun 20, 2016

Re. 2, I disagree. Instead, I argue the behavior of setindex! should be independent of the floating point values it is operating on. Otherwise, a calculation that sometimes produces an exactly zero value but other times produces an almost zero value (because of floating point error) would result in an inconsistent sparsity structure in the matrix. Deciding the non-zero structure of the matrix is a logical decision, and logic shouldn't be based on floating point values.

I appreciate and sympathize with the principle underlying this argument. Practical considerations should temper principle, however, and when weighing the tradeoffs the scales fall heavily in favor of (2):

Relative to the 'hardcore' option, under (2) assignment values impact SparseMatrixCSCs' underlying structure in two cases insofar as I can see.

  1. (A) When assigning a logical zero to a not-allocated (and ergo also logical) zero entry, (2) preserves that logical zero / the SparseMatrixCSC's underlying structure. This behavior is what enables generic programming over SparseMatrixCSCs; without this behavior, a large fraction of generically-written operations would inappropriately densify SparseMatrixCSCs. In other words, application of much of Julia's power to SparseMatrixCSCs, and consequently much SparseMatrixCSC functionality existing and future, hinges upon Julia retaining (2).
  2. (B) An initially-not-allocated (and ergo logical) zero entry will remain not-allocated only when every assignment an operation makes to it (other than logical-zero assignments) assigns a 'miraculous' numerical zero. The sparse-direct-methods community calls such zeros 'miraculous' because they are so rare that in practice they are not worth considering, which says something. Here a difference in outcome from an operation would often (typically?) require multiple such miracles in series.

(To differentiate the preceding two cases from behaviors (1) and (2) at issue, I will call these (A) and (B).)

Moreover, consider that in the rare case where (B) does lead to a difference in a SparseMatrixCSC's underlying structure, that difference in underlying structure should be marginal, chances are the operation performed was generic to begin with, and, though the underlying structure may be different, the apparent result should not differ. In other words, the difference is not a matter of correctness.

Additionally, consider that the SparseMatrixCSC setindex!/getindex interface primarily serves generic programming and high-level convenience. Most critical operations on SparseMatrixCSCs are not and should not be implemented via the setindex!/getindex interface. Hence concern over (B) does not reach such operations. For generic operations, concern over (B) should be dwarfed by the potential impact of other implementation details in a given generic operation.

One should also flip the structural consistency argument on its head: In the generic operations where these concerns are relevant, (2) should rarely lead to deviation from strict logical structure (and marginal deviation then), whereas the 'hardcore' option should almost always lead to loss of underlying structure (and frequently complete loss).

In summary, on one side of the tradeoff scales we have a rare edge case in which results remain correct but a marginal and largely invisible underlying structure difference may exist. On the other side we lose generic programming, lose both time and memory efficiency, operations that should work turn into slow death by disk thrashing on accidental densification of large SparseMatrixCSCs, etcetera. The imbalance is overwhelming.

Concerning specific points:

Hitting zero exactly is not that uncommon: multiplying by 0.0 does it. However, in sums hitting 0.0 is indeed not guaranteed.

The former case decomposes into two subcases: (1) the typical case of multiplying by a logical zero, which you would hope to preserve (as behavior (2) does); and (2) the rare case of multiplying by a miraculous zero, rare by nature of only occurring when miraculous cancellation occurs upstream in the calculation, itself rare. So perhaps a more precise statement would be: Hitting zero exactly when computing with logical zeros is common and the resulting zeros are logical and should be preserved, while hitting zero exactly outside of that case is rare.

One "problem" is that the time complexity to assign a nonzero to an index then depends on the floating point value you previously assigned to that index.

Echoing @nalimilan, always choosing the slowest path doesn't help.

The other problem is memory consumption. Adding new non-zeros requires dynamically growing the vectors occasionally. The inconsistent sparsity structure leads to the matrix sometimes requiring x Gb of memory, but sometimes requiring 1.5x (or whatever the growth factor is). The difference becomes more pronounced as the matrix gets larger too.

@nalimilan's response applies here as well: The alternative is to always default to an upper bound, where in practice that upper bound is often complete densification. I would rather operations typically be efficient and, once in a blue moon, require marginal memory overhead, than regularly lock my system trying to densify a large SparseMatrixCSC.

Thanks for reading! :) Best!

@JaredCrean2
Contributor

JaredCrean2 commented Jun 20, 2016

Interesting. Of the arguments (and they are all quite good), the one I find most compelling is the impact on generic programming. The purpose of sparse matrices, broadly speaking, is to take operations that are O(N^2) or slower and try to make them O(nonzeros). The purpose of generic programming, in the particular case of sparse and dense matrices, is to facilitate applying algorithms with the wrong time complexity. I would rather have sparse matrices with a more limited set of operations defined on them, all of which have the right time complexity, than have more operations, some of which take an order of magnitude more time than they should. Generic programming is valuable for providing a unified interface to sets of similar objects, and Julia is a particularly good language for doing so, but unifying the interfaces to dissimilar objects decreases usability.

However, if supporting generic programming between dense arrays and sparse arrays is the goal, then your original proposal for 2 is necessary to avoid densification.

most critical operations on SparseMatrixCSCs are not and should not be implemented via the setindex!/getindex interface

Ah, this gets to the other half of the question. I've always thought that setindex!/getindex are the right interface functions for sparse matrices, and that the user's code should avoid calling them on non-stored entries. What do you think the right interface to a sparse matrix is? I think directly manipulating colptr and nzvals is a bad idea because it is easy to leave the data structure in an inconsistent state. Is there a third option?

Edit: fix bad grammar

@mauro3
Contributor

mauro3 commented Jun 21, 2016

Even if only indexing actual non-zeros, setindex!/getindex is slower than straight access of the non-zeros using nzrange/nonzeros, because setindex!/getindex has to perform a binary search on the rowval. But I agree that nzrange/nonzeros doesn't feel quite right yet. Could Cartesian indexing maybe be adapted to sparse matrices?
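
A minimal sketch of the nzrange/nonzeros access pattern mentioned here (the function name colsum is illustrative, not an API):

```julia
using SparseArrays

# Sum the stored entries of column j without any per-element binary search:
# nzrange gives a contiguous index range into rowval/nzval for that column.
function colsum(A::SparseMatrixCSC, j::Integer)
    rows = rowvals(A)
    vals = nonzeros(A)
    s = zero(eltype(A))
    for k in nzrange(A, j)
        s += vals[k]     # rows[k] gives the row index, if needed
    end
    return s
end

A = sparse([1, 3], [1, 1], [2.0, 3.0], 3, 2)
```

Compare colsum(A, 1) here, which touches only the two stored entries, with A[1,1] + A[2,1] + A[3,1], which performs a binary search per access.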

@KristofferC
Member Author

@Sacha0

Additionally, consider that the SparseMatrixCSC setindex!/getindex interface primarily serves generic programming and high-level convenience. Most critical operations on SparseMatrixCSCs are not and should not be implemented via the setindex!/getindex interface.

I disagree; assembly into a global stiffness matrix from local smaller matrices is typically done with normal getindex + setindex! (although see #15630). I don't see any way around that unless you want to cache the lookup keys, which might work for small systems but becomes intractable for large ones.

@tkelman
Contributor

tkelman commented Jun 21, 2016

You aren't doing assembly in terms of index lists?

@KristofferC
Member Author

I remember the previously computed linear index so I can halve the search range when I am in the same column (probably the same as what doing it with an index list does).

@Sacha0
Member

Sacha0 commented Jun 21, 2016

I disagree; assembly into a global stiffness matrix from local smaller matrices is typically done with normal getindex + setindex! (although see #15630). I don't see any way around that unless you want to cache the lookup keys, which might work for small systems but becomes intractable for large ones.

This use falls under 'high-level convenience': You could implement the same assembly approach via the low-level interface, doing so would only require more development time. Moreover, a specialized implementation via the low-level interface could achieve higher efficiency via elimination or streamlining of some of the logic in getindex/setindex! (at the least). You've wisely chosen to sacrifice some machine time to preserve your time, by exercising a high-level interface.

<digression>
Tangentially, insofar as I am aware, the traditional assembly procedures are

  1. Storing element matrices in coordinate representation, concatenating those element matrices into a global has-duplicates, unordered coordinate representation, and transforming that global coordinate representation into a duplicate-free, ordered or unordered, compressed column or compressed row representation in one efficient shot via a routine like sparse!.
  2. Storing element matrices in clique representation (as small dense matrices accompanied by index vectors), and representing the global matrix implicitly as a collection of clique matrices. This approach is part of the origin of frontal techniques and some efficient sparse matvec techniques, and is particularly relevant when working with higher order elements.

See e.g. Duff, Erisman, and Reid's book on sparse direct methods, or skim the HSL catalogue and archive for methods relevant to finite element matrices. I am only aware of the assembly approach you describe being used in high-level languages / where assembly is not a computational bottleneck --- which it usually isn't, so trading some computational efficiency for development time is indeed the right approach there. That statement provides a nice segue to...
</digression>
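
Approach 1 above can be sketched as follows (a toy 1-D mesh with two 2-node "elements" and a stand-in element matrix; the real sparse! routine avoids the intermediate allocations that sparse makes):

```julia
using SparseArrays

# COO-style assembly: push every element contribution, duplicates
# included, then let sparse() compress and sum duplicates in one shot.
I = Int[]; J = Int[]; V = Float64[]
for dofs in ([1, 2], [2, 3])
    Ke = [1.0 -1.0; -1.0 1.0]          # stand-in element matrix
    for a in 1:2, b in 1:2
        push!(I, dofs[a]); push!(J, dofs[b]); push!(V, Ke[a, b])
    end
end
K = sparse(I, J, V, 3, 3)   # the duplicate contributions at (2, 2) are summed
```

No setindex! on the sparse matrix is involved: the sparsity structure is built once, at the end, from the full triplet list.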

The purpose of sparse matrices, broadly speaking, is to take operations that are O(N^2) or slower and try to make them O(nonzeros). The purpose of generic programming, in the particular case of sparse and dense matrices, is to facilitate applying algorithms with the wrong time complexity. I would rather have sparse matrices with a more limited set of operations defined on them, all of which have the right time complexity, than have more operations, some of which take an order of magnitude more time than they should.

This is really nicely stated! I appreciate and sympathize with the principle underlying this argument as well. Here I would again argue practical considerations motivate temperance:

Suppose you need to perform an operation on a SparseMatrixCSC, and a generic implementation of that operation exists but no specialized implementation. Additionally suppose that without performing this operation you cannot move forward with your work, you need to perform that operation a handful of times at best (perhaps only once), and you are not otherwise interested in that operation. Would you rather: (A) wait ten seconds for the operation to complete via the generic code, an order of magnitude more time than the one second a specialized implementation would require (or a minute versus ten seconds, or an hour versus a few minutes, etc.); or (B) spend the next several minutes (or hours, or days, or weeks) seeking out and learning how to call a specialized implementation in an external library, or implementing a specialized implementation if none exists, or even developing a sparse algorithm for the operation if none exists?

In situations like the above, your time is more valuable than the machine's time. The operation need not execute with optimal complexity and efficiency, it just needs to work. Were analogs of the above situation not common in practice, languages like Julia, Python, Mathematica, MATLAB, etcetera would not have the appeal they do.

Concern with optimal complexity and efficiency (in preference to human time) should be overriding only where suboptimal complexity and inefficiency make a task computationally intractable or otherwise prohibitively expensive. Impeding the language's ability to work with SparseMatrixCSCs outside of the latter case would be unfortunate.

I've always though that setindex!/getindex are the right interface functions for sparse matrices, and that the user's could should avoid calling them on non-stored entries. What do you think the right interface to a sparse matrix is? I think directly manipulating colptr and nzvals is a bad idea because it is easy to leave the data structure in an inconsistent state. Is there a third option?


Even if only indexing actual non-zeros, setindex!/getindex is slower than straight access of the non-zeros using nzrange/nonzeros because setindex!/getindex has to perform a binary search on the rowval. But I agree that the nzrange/nonzeros doesn't feel quite right, yet. Could maybe the Cartesian indexing be adapted to sparse matrices?

I share these questions, and hope that the Julia community will develop answers to them. Re. cartesian indexing adaptation, @timholy recently showcased exciting work in this direction with ArrayIteration.jl (see also #15459).

Thanks again for reading! Best!

Edits: Improved poor sentence structure that led to unintended hyperbole. Removed some off topic bits. Made more concise.

@JaredCrean2
Contributor

The operation need not execute with optimal complexity and efficiency, it just needs to work. Were analogs of the above situation not common in practice, languages like Julia, Python, Mathematica, MATLAB, etcetera would not have the appeal they do.

Here is where I distinguish between slow languages like Python and Matlab and fast languages like Julia and C. In the slow languages, there is a (well founded) assumption that speed is not important to users. For fast languages, this is not so, and that puts much more stringent requirements on the design of the standard library. For a performance analysis, I see two main axes: expense of calculating the entries and sparsity of the matrix. Consider:

  1. expensive and very sparse: must avoid calculating numerically zero entries, any dense operations on matrix would be significant
  2. expensive and not very sparse: computing entries is dominating cost, doing sparse solve or not becomes negligible
  3. inexpensive and very sparse: doing any dense operations on the sparse matrix would be dominating cost.
  4. inexpensive and not very sparse: the overhead of assembling the system is significant, and the benefit of doing sparse operations is not that great (because the system is not very sparse), and a dense matrix would be better overall.

On this basis, I reject the

The operation need not execute with optimal complexity and efficiency, it just needs to work

argument. In your original post (pre-edits), you make the argument that premature optimization is the root of all evil, and you are quite correct. Using sparse matrices is the optimization. For users who want things to just work, using dense matrices is the better option. Getting the benefits of sparsity requires taking advantage of sparsity at every stage of the computation. For the 4 cases there are only 2 outcomes: either sparsity is essential and a dense operation would be a big problem, or the benefits of sparsity are marginal at best and the user would be better off with a dense matrix. This is why I don't think the interface design should prioritize generic algorithms between sparse and dense matrices, because sparsity should only be employed when optimal time complexity is required.

@tkelman
Contributor

tkelman commented Jun 28, 2016

Generality matters. Julia is both a fast language and a high-level convenient one, and we shouldn't be designing APIs that look exactly like C. Often a generic algorithm written as if all matrices are dense will be correct but slow for sparse matrices, and that's not ideal, but we can always specialize on the implementation to fix the performance in the sparse case. Correctness is easier to do first, though. Writing generic algorithms that can also automatically be sparsity-aware will hopefully come with time.

Exact equality with zero and approximate equality with zero are more noticeably different with sparse matrices than with dense ones when it comes to performance. I think that comes with the territory. I'm in favor of the "avoid changing the sparsity structure if you don't absolutely have to" criterion here.

@JaredCrean2
Contributor

It seems the majority opinion is in favor of @Sacha0's original criteria, so let's merge the PR.

@KristofferC
Member Author

I am on vacation (currently outside the Grand Canyon) and won't be at a computer for 10 days, so the easiest way to get this through is if someone can rebase / take over this PR.

@Sacha0
Member

Sacha0 commented Jul 8, 2016

@tkelman, I assume this change would arrive too late to join the other indexing-semantics changes in 0.5? If not too late, I should be able to rework this PR in the next few days.

I am on vacation (currently outside Grand Canyon) and wont be at a computer for 10 days

Enjoy!

@tkelman
Contributor

tkelman commented Jul 8, 2016

If you're quick, I don't think this would be disruptive enough that it would absolutely have to wait until after we branch, as long as it doesn't hurt performance (gut feeling says it should only make things better).

@KristofferC
Member Author

Superseded by #17404

@Sacha0
Member

Sacha0 commented Oct 16, 2016

Using sparse matrices is the optimization. For users who want things to just work, using dense matrices is the better option. Getting the benefits of sparsity requires taking advantage of sparsity at every stage of the computation. For the 4 cases there are only 2 outcomes, either sparsity is essential and a dense operation would be big problem, or the benefits of sparsity are marginal at best and the user would be better off with a dense matrix.

Brief response for posterity: Reducing runtime is not the only motivation for using sparse data structures; reducing memory use is another, equally important one. If forced to use a dense data structure for some dataset, you may run out of memory well before the runtime of non-intensive generic operations becomes prohibitive. Best!
