Nearest Neighbours Recommendations #14

benfred · 2016-12-27T07:05:42Z

This adds a fast and memory-efficient implementation of Item-Item KNN recommendation models.

The similarity matrix calculation is based on the algorithm described in the
paper 'Sparse Matrix Multiplication Package (SMMP)'
(www.i2m.univ-amu.fr/~bradji/multp_sparse.pdf), but modified so that only the
top K entries of each row are kept, using a heap. This means that we can
calculate the similarity matrix even when the full similarity matrix wouldn't
fit in available memory. This calculation is also parallelized, unlike the
sparse matrix multiply in scipy.
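
To illustrate the idea, here is a minimal pure-Python sketch of a row-by-row sparse product that keeps only the top K scores per row. It is not the code from this PR: the names `transpose_rows` and `top_k_item_similarity` are hypothetical, rows are stored as plain dicts rather than CSR arrays, the scores are unnormalized dot products, and nothing here is parallelized.

```python
import heapq
from collections import defaultdict


def transpose_rows(rows, n_cols):
    """Transpose a sparse matrix stored as a list of {column: value} dicts."""
    out = [dict() for _ in range(n_cols)]
    for r, row in enumerate(rows):
        for c, v in row.items():
            out[c][r] = v
    return out


def top_k_item_similarity(item_users, n_users, k):
    """For each item, return its k highest dot-product neighbours.

    item_users: list of {user_id: weight} dicts, one per item.
    Only k (item, score) pairs are retained per item, so the full
    items x items product is never held in memory at once.
    """
    user_items = transpose_rows(item_users, n_users)
    neighbours = []
    for row in item_users:
        # Accumulate a single output row of item_users * item_users^T.
        sums = defaultdict(float)
        for user, weight in row.items():
            for other_item, other_weight in user_items[user].items():
                sums[other_item] += weight * other_weight
        # Keep only the k largest scores from this row, highest first.
        best = heapq.nlargest(k, sums.items(), key=lambda kv: kv[1])
        neighbours.append(best)
    return neighbours


# Three items, three users (hypothetical toy data).
item_users = [{0: 1.0, 1: 1.0}, {0: 1.0}, {1: 3.0, 2: 1.0}]
result = top_k_item_similarity(item_users, n_users=3, k=2)
```

Each output row is built and discarded before the next one starts, so peak memory is one accumulated row plus K entries per item, instead of the whole product.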

Also switch to using C++ instead of C for Cython, run flake8 on the Cython code,
add an isort check and cpplint check, and fix some issues with the ALS unittest
intermittently failing.

benfred · 2016-12-27T07:10:08Z

Still todo:

- parallelize calculation
- add scorer class
- example usage
- add save/load to scorer


chapleau · 2017-02-14T18:47:00Z

Thanks for providing this very neat package.
I was just wondering whether, from a performance point of view, moving from C to C++ for Cython makes a significant difference? Are the APIs/functions backward compatible?
Thanks!

benfred · 2017-02-14T18:50:14Z

Performance should be identical between C++ and C.

The APIs and functions are also compatible from Python. I changed to C++ mainly to use the heap functions provided with the STL: https://github.com/benfred/implicit/blob/master/implicit/nearest_neighbours.h#L21
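
For readers unfamiliar with the pattern, a bounded min-heap is one common way to keep the K best candidates from a stream. The sketch below uses Python's `heapq` as a stand-in for the STL heap functions (`std::push_heap`, `std::pop_heap`); the `TopK` class and its methods are hypothetical names for illustration, and this is not a transcription of the linked header.

```python
import heapq


class TopK:
    """Keep the k largest (score, index) pairs seen so far."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap: smallest retained score sits at position 0

    def add(self, index, score):
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (score, index))
        elif score > self._heap[0][0]:
            # Evict the current minimum and insert the new entry in one step.
            heapq.heapreplace(self._heap, (score, index))

    def results(self):
        """Return the retained pairs ordered from highest to lowest score."""
        return sorted(self._heap, reverse=True)


top = TopK(2)
for index, score in enumerate([0.1, 0.9, 0.5, 0.7]):
    top.add(index, score)
best = top.results()
```

Because the heap never grows past K entries, each insertion costs O(log K) regardless of how many candidates are scanned.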

benfred added 2 commits February 5, 2017 20:35

Merge branch 'master' into nearest_neighbours

bc548a0

benfred force-pushed the nearest_neighbours branch from 2230bc6 to bc548a0 Compare February 6, 2017 04:47

benfred changed the title ~~first draft nearest neighbours code~~ Nearest Neighbours Recommendations Feb 6, 2017

benfred merged commit f5a3cdc into master Feb 12, 2017

benfred deleted the nearest_neighbours branch February 12, 2017 17:51

benfred mentioned this pull request Mar 30, 2017

ValueError: negative row index found on input #4

Closed

MariosGr mentioned this pull request Mar 17, 2018

model fit crushes with more than 2^31 positive interactions #86

Closed
