TODO

- Implement efficient coercions from dgTMatrix to COO_SparseMatrix and
  SVT_SparseMatrix. 'as(<dgTMatrix>, "SparseMatrix")' and
  SparseArray(<dgTMatrix>) should do the former (this is the natural one
  for dgTMatrix objects).

- Implement readMatrixMarket(), similar to Matrix::readMM() but should
  return a COO_SparseMatrix object instead of a dgTMatrix object. Note that
  the easy/lazy implementation could simply be

      as(Matrix::readMM(.), COO_SparseMatrix)

  However it wouldn't be really justified to introduce a new function just
  for that. So hopefully a native implementation will improve efficiency
  enough to be worth it and justify a dedicated function.
  See https://math.nist.gov/MatrixMarket/formats.html

- To write a SparseMatrix object to a Matrix Market file, do we need a
  dedicated writeMatrixMarket() function or should we just define a
  Matrix::writeMM() method for SparseMatrix objects? That method could
  simply do

      Matrix::writeMM(as(x, "CsparseMatrix"))

- Support names() getter and setter on a 1D SparseArray array as a shortcut
  for 'dimnames()[[1L]]' and 'dimnames()[[1L]] <- value' respectively.
  This is to mimic the R base array API.

- Implement fast nzvals() getter/setter for SVT_SparseMatrix objects.
  (Default methods work but are not as fast as they could be.)

- Add is.nonzero() generic with methods for SparseArray objects and other
  sparseMatrix derivatives from the Matrix package ([d|l|n]g[C|R]Matrix
  objects).
  is.nonzero(<SVT_SparseArray>) simply needs to strip off the "nzvals"
  component from the non-empty leaves and return the result as a logical
  SVT_SparseArray. This means that the returned object will be fully
  lacunar (i.e. all its leaves will be lacunar), like with is.na().
  is.nonzero(<[d|l|n]g[C|R]Matrix>) should return an ng[C|R]Matrix object.

- Add nzvals() methods for COO_SparseArray and SVT_SparseArray objects.
  Uncomment nzvals() examples in vignette and SparseArray-class.Rd

- Add unit tests for nzwhich() and nzvals() methods for COO_SparseArray
  and SVT_SparseArray objects.

- Parallelize more operations (with OpenMP). Right now only %*%, crossprod(),
  tcrossprod(), and the matrixStats operations are parallelized.

- Implement coercion from Hits to SVT_SparseMatrix. The returned object
  should be an integer SVT_SparseMatrix with only zeros and ones that is
  the adjacency matrix of the bipartite graph represented by the Hits object.
  It will be fully lacunar.
  Note that in the case of a SelfHits object the result will be a square
  SVT_SparseMatrix.
  Question: should multiple edges between the same two nodes produce values
  > 1 in the adjacency matrix? (Google this.) This means that the resulting
  SVT_SparseMatrix won't necessarily be fully lacunar.

- Use "sparseness" instead of "sparsity" in the doc when referring to the
  quality of being structurally sparse. Use "sparsity" to refer to the
  number (>= 0 and <= 1) that measures how sparse an object is.
  sparsity = 1 - density

- Subassignments like this need to work:
      svt[ , , 1] <- svt[ , , 3, drop=FALSE]
      svt[ , , 1] <- svt[ , , 3]

- Implement C_subassign_SVT_with_Rarray() and C_subassign_SVT_with_SVT().

- Speed up row selection: x[row_idx, ]
  THIS in particular is VERY slow on a SVT_SparseArray object:
      set.seed(123)
      svt2 <- 0.5 * poissonSparseMatrix(170000, 5800, density=0.1)
      dgcm2 <- as(svt2, "dgCMatrix")
      system.time(svt2[-2, ])    # arghhh!!
      #   user  system elapsed
      #  4.606   0.180   4.804
      system.time(dgcm2[-2, ])
      #   user  system elapsed
      #  0.267   0.244   0.513
  Good news is that this is very fast:
      system.time(svt2[2, ])
      #   user  system elapsed
      #  0.001   0.000   0.002
      system.time(dgcm2[2, ])
      #   user  system elapsed
      #  0.165   0.000   0.165

- Support subsetting by a character matrix in .subset_SVT_by_Mindex().

- Support subsetting by a character vector in .subset_SVT_by_Lindex() in
  the 1D case.

- Implement double-bracket subsetting of SVT_SparseArray objects. Both,
  the 1D-style form (e.g. 'svt[[8]]') and the N-dimensional form (e.g.
  'svt[[2,5,1]]' for a 3D object).

- Implement table() method for SVT_SparseArray objects of type logical,
  integer, or raw (should it go in R/SparseArray-summarization.R?)

- Implement function for loading a TENxMatrixSeed subset as an
  SVT_SparseMatrix. Test it on the 1.3 Million Brain Cell Dataset.

- Improve readSparseCSV() functionality by adding a few read.table-like args
  to it. See https://github.com/Bioconductor/SparseArray/issues/5 for the
  details.

- We can't use base::apply() or base::asplit() on a SparseArray derivative
  because they turn it into an ordinary (i.e. dense) array.
  So turn apply() and asplit() into generic functions and implement methods
  for SparseArray objects. These methods could either:
  1. Mimic the behavior of base::apply() and base::asplit() but preverve
     sparseness. However, it should be clarified in the man page for the
     matrixStats methods that using apply() to compute stats on an object
     with > 2 dimensions is not going to be as efficient as using
     the 'dims' argument (almost all matrixStats methods support it). The
     latter should be 10x or 100x faster when used on big objects.
  2. Just fail with a friendly error message.

- Maybe implement svd() for SVT_SparseMatrix objects?

- Maybe implement chol() for symmetric positive-definite square
  SVT_SparseMatrix objects?

- More SBT ("Sparse Buffer Tree") use cases:

  1. Implement C helper _push_vals_to_SBT_by_Mindex(), and modify coercion
     from COO_SparseArray to SVT_SparseArray to use that instead of
     C_subassign_SVT_by_Mindex(). This will probably lead to a cleaner/simpler
     implementation. But is it faster too?

  2. Revisit implementation of C_subassign_SVT_by_Mindex() and
     C_subassign_SVT_by_Lindex(): Can they use an SBT instead of the "extended
     leaves" approach? E.g. they would use _push_vals_to_SBT_by_Mindex()
     and _push_vals_to_SBT_by_Lindex(), respectively, then "merge" the SBT
     with the original SVT. This will probably lead to a cleaner/simpler
     implementation. But is it faster too?

  3. Revisit implementation of C_readSparseCSV_as_SVT_SparseMatrix(): Try to
     use an SBT instead of an ExtendableJaggedArray. Performance should not
     be impacted. Then we can get rid of the ExtendableJaggedArray thing.

- To help implement the Kronecker product (see below): Introduce
  arep_times(x, times) and arep_each(x, each) generics. The 'times'
  and 'each' args must be integer vectors with the same length as 'dim(x)'.
  They are multidimensional versions of 'rep(, times=t)' and 'rep(, each=e)'
  that perform the replications along each dimension of the array.

  For example, with x = matrix(letters[1:6], ncol=2):

      > x
           [,1] [,2]
      [1,] "a"  "d" 
      [2,] "b"  "e" 
      [3,] "c"  "f" 

   1. arep_times(x, times=c(2, 4)) returns:

      > do.call(cbind, rep(list(do.call(rbind, rep(list(x), 2))), 4))
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
      [1,] "a"  "d"  "a"  "d"  "a"  "d"  "a"  "d" 
      [2,] "b"  "e"  "b"  "e"  "b"  "e"  "b"  "e" 
      [3,] "c"  "f"  "c"  "f"  "c"  "f"  "c"  "f" 
      [4,] "a"  "d"  "a"  "d"  "a"  "d"  "a"  "d" 
      [5,] "b"  "e"  "b"  "e"  "b"  "e"  "b"  "e" 
      [6,] "c"  "f"  "c"  "f"  "c"  "f"  "c"  "f" 

   2. arep_each(x, each=c(2, 4)) returns:

     > matrix(rep(t(matrix(rep(x, each=2), ncol=ncol(x))), each=4),
              byrow=TRUE, ncol=ncol(x)*4)
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
      [1,] "a"  "a"  "a"  "a"  "d"  "d"  "d"  "d" 
      [2,] "a"  "a"  "a"  "a"  "d"  "d"  "d"  "d" 
      [3,] "b"  "b"  "b"  "b"  "e"  "e"  "e"  "e" 
      [4,] "b"  "b"  "b"  "b"  "e"  "e"  "e"  "e" 
      [5,] "c"  "c"  "c"  "c"  "f"  "f"  "f"  "f" 
      [6,] "c"  "c"  "c"  "c"  "f"  "f"  "f"  "f" 

  Note that the arep_times() and arep_each() generics really belong to
  the S4Arrays package (with the methods for ordinary arrays defined there
  too).

  Then kronecker(X, Y) (see below) can simply be obtained with:

    arep_each(X, each=dim(Y)) * arep_times(Y, times=dim(X))

  And this should be made the kronecker() method for Array objects.

- Implement kronecker() method (Kronecker product) for SVT_SparseArray objects.
  A quick scan of BioC 3.18 software packages reveals that 25+ packages call
  the kronecker() function. Would be interesting to know how many of them need
  to do this on sparse objects?

  Note that one way to go is to simply implement arep_times() and arep_each()
  methods for SVT_SparseArray objects. This will give us kronecker() for
  free (via the kronecker() method for Array objects defined in S4Arrays,
  see above).

  Good things to test (on SVT_SparseArray objects) once this is implemented:

  1. First mixed-product property (mixing with element-wise array
     multiplication a.k.a. "Hadamard product"):
       With 4 arrays A, B, C, D, all with the same number of dimensions,
       with A and C conformable, with B and D conformable, then:
         kronecker(A, B) * kronecker(C, D)
       must be the same as:
         kronecker(A * C, B * D)
       For example:
         A <- array(1:60, dim=5:3)
         C <- array(runif(60), dim=5:3)
         B <- array(101:180, dim=c(2,10,4))
         D <- array(runif(80), dim=c(2,10,4))
         stopifnot(all.equal(kronecker(A, B) * kronecker(C, D),
                             kronecker(A * C, B * D)))
         stopifnot(all.equal(kronecker(B, A) * kronecker(D, C),
                             kronecker(B * D, A * C)))

  2. Second mixed-product property (mixing with matrix multiplication):
       With 4 matrices A, B, C, D, with dimensions that allow one to do
       A %*% C and B %*% D, then:
         kronecker(A, B) %*% kronecker(C, D)
       must be the same as:
         kronecker(A %*% C, B %*% D)
       For example:
         A <- matrix(1:12, ncol=3)
         C <- matrix(runif(18), nrow=3)
         B <- matrix(101:120, ncol=5)
         D <- matrix(runif(10), nrow=5)
         stopifnot(all.equal(kronecker(A, B) %*% kronecker(C, D),
                             kronecker(A %*% C, B %*% D)))
         stopifnot(all.equal(kronecker(B, A) %*% kronecker(D, C),
                             kronecker(B %*% D, A %*% C)))

  See https://en.wikipedia.org/wiki/Kronecker_product for other properties
  to test.

- Try to speed up SVT_SparseArray transposition by implementing a one-pass
  approach that uses ExtendableJaggedArray intermediate buffers (offss, valss).
  See src/readSparseCSV.c where this approach is already used.
  Note that this will require that ExtendableJaggedArray structs are able
  to support other types of columns (only support int at the moment).

- Support 'match(svt, table)' where 'svt' is an SVT_SparseArray object
  and 'table' an atomic vector. This will give us 'svt %in% table' for free.

- Implement more matrixStats methods for SVT_SparseMatrix objects. Those
  that are still missing and are actually used in Bioconductor are:
  rowMeans2, rowSums2, rowRanks, rowQuantiles, rowMads, rowIQRs, rowAlls,
  rowCumsums, rowWeightedMeans, rowAnyNAs) + corresponding col* methods.

- Implement more summarization methods for SVT_SparseArray objects.
  See R/SparseArray-summarization.R

- Add unit tests for the SVT_SparseArray stuff.

- Go after dgCMatrix objects in ExperimentHub (query(eh, "dgCMatrix")),
  convert them to SVT_SparseMatrix objects and try to do the things that
  are usually done on them.

- Convert 8322787x1098 dgTMatrix (ExperimentHub resource EH5453) to
  SVT_SparseMatrix and try to do the things that the curatedMetagenomicData
  folks usually do on it.