diff --git a/DESCRIPTION b/DESCRIPTION
index ca0b632..261cba3 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -42,5 +42,5 @@ Description: Collective matrix factorization (a.k.a. multi-view or multi-way fac
 License: MIT + file LICENSE
 Suggests: Matrix, MatrixExtra, RhpcBLASctl, recosystem (>= 0.5), recommenderlab (>= 0.2-7), MASS, knitr, rmarkdown, kableExtra
 VignetteBuilder: knitr
-RoxygenNote: 7.2.2
+RoxygenNote: 7.2.3
 NeedsCompilation: yes
diff --git a/R/fit.R b/R/fit.R
index 620f0d7..497cf10 100644
--- a/R/fit.R
+++ b/R/fit.R
@@ -48,7 +48,8 @@ NULL
 #' and the objective is to minimize squared error over the non-missing entries, in the
 #' implicit-feedback variants the matrix `X` is assumed to be binary (all entries are zero
 #' or one, with no unknown values), with the positive entries (those which are not
-#' missing in the data) having a weight determined by `X`.
+#' missing in the data) having a weight determined by `X`, and without including any
+#' user/item biases or centering for the 'X' matrix.
 #'
 #' `CMF` is intended for explicit feedback data (e.g. movie ratings, which contain both
 #' likes and dislikes), whereas `CMF_implicit` is intended for implicit feedback data
diff --git a/cmfrec/__init__.py b/cmfrec/__init__.py
index 9eac264..311d457 100644
--- a/cmfrec/__init__.py
+++ b/cmfrec/__init__.py
@@ -4412,6 +4412,14 @@ class CMF_implicit(_CMF):
 
     :math:`\mathbf{I} \sim \mathbf{B} \mathbf{D}^T`
 
+    Compared to the ``CMF`` class, here the interactions matrix 'X' treats missing
+    entries as zeros and non-missing entries as ones, while the values supplied for
+    interactions are applied as weights over this binarized matrix 'X' (see references
+    for more details). Roughly speaking, it is a more efficient version of `CMF` with
+    hard-coded arguments ``NA_as_zero=True``, ``center=False``, ``user_bias=False``,
+    ``item_bias=False``, ``scale_lam=False``, plus a different initialization of factor
+    matrices, and 'X' converted to a weighted binary matrix as explained earlier.
+
     Note
     ----
     The default hyperparameters in this software are very different from others.
diff --git a/man/fit.Rd b/man/fit.Rd
index c4cf01a..7bf6a51 100644
--- a/man/fit.Rd
+++ b/man/fit.Rd
@@ -780,7 +780,8 @@ case, based on reference [3]. While in `CMF` the values of `X` are taken at face
 and the objective is to minimize squared error over the non-missing entries, in the
 implicit-feedback variants the matrix `X` is assumed to be binary (all entries are zero
 or one, with no unknown values), with the positive entries (those which are not
-missing in the data) having a weight determined by `X`.
+missing in the data) having a weight determined by `X`, and without including any
+user/item biases or centering for the 'X' matrix.
 
 `CMF` is intended for explicit feedback data (e.g. movie ratings, which contain both
 likes and dislikes), whereas `CMF_implicit` is intended for implicit feedback data
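
Reviewer note: the docstring text added in this patch describes 'X' as a weighted binary matrix fitted without biases or centering. A minimal sketch of that idea, in plain Python, might look like the following. It is not cmfrec's implementation: the function names `binarize_with_weights` and `weighted_squared_error` are hypothetical, and giving missing entries a weight of 1 is an assumption made only for illustration (the patch does not state how missing entries are weighted).

```python
# Illustrative sketch only; NOT cmfrec's internal code.
# Hypothetical helpers showing "missing -> 0, observed -> 1, values -> weights".

def binarize_with_weights(x, missing_weight=1.0):
    """Split an interactions matrix into a binary target and per-entry weights.

    `x` is a list of rows, with None marking a missing entry.
    Observed entries become 1.0 in the target and keep their value as weight.
    Missing entries become 0.0; their weight (`missing_weight`) is an assumption.
    """
    binary = [[0.0 if v is None else 1.0 for v in row] for row in x]
    weights = [[missing_weight if v is None else float(v) for v in row] for row in x]
    return binary, weights


def weighted_squared_error(binary, weights, a, b):
    """Weighted squared error of the low-rank approximation a @ b^T.

    No centering and no user/item biases are applied: the prediction for
    entry (i, j) is just the dot product of row i of `a` and row j of `b`.
    """
    total = 0.0
    for i, row in enumerate(binary):
        for j, target in enumerate(row):
            pred = sum(ai * bj for ai, bj in zip(a[i], b[j]))
            total += weights[i][j] * (target - pred) ** 2
    return total


if __name__ == "__main__":
    x = [[5.0, None],
         [None, 2.0]]
    binary, weights = binarize_with_weights(x)
    a = [[1.0], [0.0]]  # toy user factors (rank 1)
    b = [[1.0], [0.0]]  # toy item factors (rank 1)
    print(binary, weights, weighted_squared_error(binary, weights, a, b))
```

The sketch covers only the loss on 'X'. Regularization, side information, and the factor initialization mentioned in the docstring are left out.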