
How to evaluate the recommender (e.g. P@k) with Implicit #30

Closed
kyowill opened this issue May 16, 2017 · 15 comments

Comments

@kyowill

kyowill commented May 16, 2017

No description provided.

@benfred
Owner

benfred commented May 17, 2017

Unfortunately, there isn't currently any support for evaluating models built into this library. It shouldn't be too hard to add - and is something I'm looking at doing.

@Akarshit

@benfred Hey, I am planning to use this library in production, but before that I want to evaluate the performance of the algorithm. I am using ALS for recommendations.
Could you point me in the right direction for testing this?

@vhfmag

vhfmag commented Jun 22, 2017

How should the evaluation go? I'm also planning to use the lib in production, but it would be great to have a metric of its accuracy before that. Is help needed?

@snexus

snexus commented Jun 23, 2017

Hey @benfred, support for evaluation would be great in order to use this library in production.

Meanwhile, I was thinking about the following evaluation procedure:

  1. Split the data into train and test sets.
  2. Fit the model on the train set.
  3. For every entry in the test set, randomly hide one item.
  4. Use the model to provide recommendations for every entry in the test set, with the hidden item excluded.
  5. If the hidden item is in the top N recommended items, count a true positive.
  6. Calculate some metric, e.g. recall.
  7. Bootstrap steps 3-6 many times to estimate the distribution of the metric of interest.

Do you think this is a feasible approach, or could some simpler evaluation be implemented?
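
A rough sketch of the leave-one-out recall evaluation described above, assuming a scipy.sparse user-item CSR matrix and the implicit >= 0.5 API (where fit() and recommend() take user-item matrices); the function name and defaults are illustrative, not part of the library:

```python
import numpy as np
from implicit.als import AlternatingLeastSquares

def leave_one_out_recall(user_items, factors=64, N=10, seed=42):
    """Hide one item per user, refit, and measure how often it comes back in the top N."""
    rng = np.random.default_rng(seed)
    user_items = user_items.tocsr()
    train = user_items.tolil(copy=True)
    held_out = {}
    for u in range(user_items.shape[0]):
        items = user_items[u].indices
        if len(items) < 2:            # keep at least one item for training
            continue
        hidden = int(rng.choice(items))
        train[u, hidden] = 0
        held_out[u] = hidden
    train = train.tocsr()
    train.eliminate_zeros()

    model = AlternatingLeastSquares(factors=factors)
    model.fit(train)

    hits = 0
    for u, hidden in held_out.items():
        # recommend() filters items already seen in training by default
        ids, _ = model.recommend(u, train[u], N=N)
        hits += int(hidden in ids)
    return hits / len(held_out)
```

Bootstrapping, as in step 7, would amount to repeating this hide-and-score loop with different seeds and looking at the spread of the returned values.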

@Akarshit

@snexus I did the same to evaluate the system.

@snexus

snexus commented Jun 28, 2017

FYI - https://jessesw.com/Rec-System/ contains a good approach for validation. It is easy to adapt to your needs.

@jbochi
Contributor

jbochi commented Jul 8, 2017

I've managed to do grid search and cross-validation using scikit-learn, even though it does not have built-in support for recommenders: scikit-learn/scikit-learn#6142

I had to create a few custom classes:

  • ALSEstimator, which wraps AlternatingLeastSquares and turns it into a scikit-learn Estimator.
  • A cross-validation splitter wrapping PredefinedSplit that holds out p items for each user in every split.
  • A custom scorer that calculates NDCG.

The code is in this gist: https://gist.github.com/jbochi/2e8ddcc5939e70e5368326aa034a144e#file-evaluation-ipynb

Do you guys have any suggestions to improve it?

Would it make sense to add some of this to scikit-learn?
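
For illustration, the first wrapper looks roughly like this (the real code is in the gist above; the class body here is an approximation, not the gist's code). Exposing the hyperparameters in __init__ is what lets GridSearchCV clone and tune the estimator:

```python
from sklearn.base import BaseEstimator
from implicit.als import AlternatingLeastSquares

class ALSEstimator(BaseEstimator):
    """Expose AlternatingLeastSquares hyperparameters so scikit-learn can clone and tune it."""

    def __init__(self, factors=64, regularization=0.01, iterations=15):
        self.factors = factors
        self.regularization = regularization
        self.iterations = iterations

    def fit(self, X, y=None):
        # X is a sparse user-item interaction matrix (implicit >= 0.5 convention)
        self.model_ = AlternatingLeastSquares(
            factors=self.factors,
            regularization=self.regularization,
            iterations=self.iterations,
        )
        self.model_.fit(X)
        return self

    def predict(self, X):
        # Dense score for every user-item pair; a custom NDCG scorer can rank these per user
        return self.model_.user_factors @ self.model_.item_factors.T
```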

@antonioalegria

@jbochi your code is great, but it doesn't seem to work with very sparse datasets where the train and test sets can end up with different users/items (it throws index-out-of-bounds errors). Any ideas here?

@benfred do you expect support for this kind of evaluation soon?

@benfred
Owner

benfred commented Jun 14, 2018

@antonioalegria I've added some basic support for map@k and p@k that you can use in the latest version - there is an example of how to call it here: #108 (comment)

I'm leaving this issue open until I get around to writing some documentation on this =)
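
For reference, calling these metrics looks roughly like this with the current API (implicit >= 0.5, where the matrices are user-item; user_items below stands in for a scipy.sparse interaction matrix):

```python
from implicit.als import AlternatingLeastSquares
from implicit.evaluation import (
    train_test_split,
    precision_at_k,
    mean_average_precision_at_k,
)

# Randomly split the interactions; the returned matrices keep the input's shape
train, test = train_test_split(user_items, train_percentage=0.8)

model = AlternatingLeastSquares(factors=64)
model.fit(train)

print("p@10:  ", precision_at_k(model, train, test, K=10))
print("map@10:", mean_average_precision_at_k(model, train, test, K=10))
```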

@antonioalegria

Thanks @benfred. Does the train/test split deal well with the two sets ending up with different users and items?

@benfred
Owner

benfred commented Jun 19, 2018

@antonioalegria the train_test_split function should handle that (the returned matrices should have the same dimensions as the input - so there shouldn't be any out of bounds errors).
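
In other words, assuming the same user_items matrix as in the example above, the split preserves the full dimensions:

```python
train, test = train_test_split(user_items, train_percentage=0.8)
assert train.shape == test.shape == user_items.shape  # no users or items are dropped
```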

@oliviernguyenquoc

Any plan for a Recall@k metric?

It shouldn't be a lot of work, but I can't understand Cython myself :(

@benfred
Owner

benfred commented Sep 1, 2018

I wasn't planning on adding a recall@k metric - but it shouldn't be difficult I guess (I think it's just replacing this line https://github.com/benfred/implicit/blob/master/implicit/evaluation.pyx#L114 with total += likes.size() ?).
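
A pure-Python sketch of what that change amounts to (assuming the implicit >= 0.5 recommend() signature; this only illustrates the denominator, it is not the library's Cython implementation):

```python
def recall_at_k(model, train_user_items, test_user_items, K=10):
    hits, total = 0, 0
    train = train_user_items.tocsr()
    test = test_user_items.tocsr()
    for u in range(test.shape[0]):
        likes = set(test[u].indices)      # held-out items for this user
        if not likes:
            continue
        ids, _ = model.recommend(u, train[u], N=K)
        hits += len(likes.intersection(ids))
        # p@k divides by min(K, len(likes)); recall@k divides by all held-out items
        total += len(likes)
    return hits / total
```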

@oliviernguyenquoc

@benfred If I understand correctly, that should be it.
If you then divide the results by the size of the test set, it works (but again, Cython is difficult for me to read).

A quick win ;)

Thanks for everything. This library rocks.

thisisjl added a commit to thisisjl/implicit that referenced this issue Feb 15, 2019
@yvonnerahnfeld

> @antonioalegria I've added some basic support for map@k and p@k that you can use in the latest version - there is an example of how to call it here: #108 (comment)
>
> I'm leaving this issue open until I get around to writing some documentation on this =)

Hi, whenever I try the example in #108 I get this error:
index 3953 is out of bounds for axis 0 with size 3953

Is there possibly an error in the example code?
And is there documentation available?
