Getting poor performance with partial_fit_items #682
I found out the problem. The matrix I was passing was a CSC matrix instead of a CSR matrix.
I also created an example notebook with Google Colab showing that the test I proposed works as expected with the MovieLens dataset. It might be of help to anyone trying to use `partial_fit_items`.
We were seeing some poor results on partial_fit_*, which ended up being caused by passing a CSC instead of a CSR matrix (#682). Add the same `check_csr` code to the partial_fit methods that is already used in the fit method - this will warn if a non-CSR matrix is passed, and automatically convert it.
Passing a CSC would explain the poor results! I'm glad you figured it out. I've added some checks in #683 that will detect if a non-csr matrix is passed to the partial_fit methods - so hopefully other people won't hit this same issue.
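The guard described here is easy to picture: a small sketch of a `check_csr`-style helper, assuming nothing beyond `scipy.sparse` (the library's actual helper may differ in details):

```python
import warnings

import numpy as np
from scipy import sparse


def check_csr(matrix):
    """Warn and convert if the input is not a CSR matrix.

    Illustrative re-implementation of the kind of guard discussed in
    this thread, not the library's actual code.
    """
    if not sparse.issparse(matrix):
        raise ValueError("expected a scipy sparse matrix")
    if not sparse.isspmatrix_csr(matrix):
        warnings.warn(
            "Expected a CSR matrix: converting automatically. "
            "Pass a CSR matrix up front to avoid this copy.",
            RuntimeWarning,
        )
        return matrix.tocsr()
    return matrix


# A CSC matrix holds the same values as its CSR counterpart, but its
# internal indptr/indices arrays describe columns, not rows - code that
# reads those arrays assuming row layout silently computes garbage,
# which is why the results looked "poor" rather than failing loudly.
m = sparse.random(5, 4, density=0.5, format="csc", random_state=0)
converted = check_csr(m)  # warns, returns a CSR copy with identical values
assert sparse.isspmatrix_csr(converted)
assert np.allclose(m.toarray(), converted.toarray())
```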
Thanks for creating that notebook! I do have a couple questions about this though:
Having the distance values be incredibly large seems like a bug =(. I wonder if this is because of issues with the item norms, which are cached between `similar_items` calls. Can you try clearing the cached item norms before calling `similar_items` again?
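The staleness problem suggested here is generic to any cached-norm scheme, not specific to this library. A minimal numpy sketch (all names invented) of why a norm cache goes wrong once a partial fit rewrites factor rows:

```python
import numpy as np

rng = np.random.default_rng(0)
item_factors = rng.normal(size=(100, 16))

# A similarity routine might cache per-item norms on first use:
stale_norms = np.linalg.norm(item_factors, axis=1)

# A partial fit then rewrites some factor rows in place...
item_factors[7] = rng.normal(size=16) * 50.0

# ...so the cached norm for row 7 no longer matches the factors, and
# any cosine similarity computed with it is wrong for that item:
fresh_norms = np.linalg.norm(item_factors, axis=1)
assert not np.isclose(stale_norms[7], fresh_norms[7])
# Every untouched row is still fine - only updated rows are affected.
assert np.allclose(np.delete(stale_norms, 7), np.delete(fresh_norms, 7))

# Recomputing (or simply discarding) the cache after the factors
# change restores correct distances.
```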
Do you know what version of the CUDA driver/runtime is on this machine? (I work at NVIDIA, so I feel like this really is something that I should get working =) )
I tried clearing the cache as you proposed and it worked wonderfully! The distances (or similarities) are very close across the tests in the notebook. Regarding the CUDA error, did you get it when you tried to use the GPU on the Google Colab instance? I did a quick test here, just by changing the runtime to use a T4 GPU, and used the GPU version of ALS, and the code ran a lot faster! Also, the CUDA version is:
Thanks @franzoni315 - I've added a fix in #685 to automatically clear the norms; it will be in the next release, which I will push out later today. For the CUDA issue, I just tried it out on Colab - and it looks like the warning message
Hi! I am using this library to get user and item embeddings, which will be used in a clustering algorithm to find clusters of users according to their preferences. I already have some good results with these clusters, and now I am trying to make this work with new items and users.
From what I researched, `partial_fit_items` (or `partial_fit_users`) are functions designed to update the factors only for a subset of items or users. This is great, because I want to keep the factors frozen for all items except the new ones, since this will guarantee the stability of my embeddings over time.

The problem that I am dealing with is that the `partial_fit_items` function is not positioning the new item in a meaningful location inside the embedding space. I did the following test to verify this:

1. Train a model on the full dataset -> (`model_1`)
2. Remove one item from the dataset
3. Train a model on the reduced dataset -> (`model_2`)
4. Call `partial_fit_items` with the item that was previously removed -> (`updated_model_2`)
)I developed a helper function that can update the factors of new items and new users, which is used in step 4:
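(The helper itself is not reproduced above. As an illustrative sketch of the underlying computation — a one-sided ALS update that solves the regularized least-squares problem for one new item while every user factor stays frozen — with all names and the regularization value invented, and the zero-confidence terms of the full implicit-ALS objective omitted for brevity:)

```python
import numpy as np
from scipy import sparse


def solve_new_item_factor(user_factors, item_users_row, regularization=0.01):
    """Compute a factor vector for one new item, keeping user factors frozen.

    Solves the standard ALS normal equations
        y = (X^T C X + lambda * I)^-1 X^T C p
    where X are the (fixed) user factors, C holds per-user confidences for
    this item, and p = 1 for observed interactions. Sketch only: the real
    objective also includes confidence-1, preference-0 terms for every
    user with no interaction, which are dropped here.
    """
    n_factors = user_factors.shape[1]
    users = item_users_row.indices            # users who interacted with the item
    confidences = 1.0 + item_users_row.data   # e.g. C = 1 + alpha * r, alpha folded in
    X = user_factors[users]                   # (n_interactions, n_factors)

    A = X.T @ (X * confidences[:, None]) + regularization * np.eye(n_factors)
    b = X.T @ confidences                     # X^T C p with p = 1 for observed users
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
user_factors = rng.normal(size=(50, 8))
# One CSR row of interactions for the new item (shape: 1 x n_users).
row = sparse.random(1, 50, density=0.2, format="csr", random_state=1)

y = solve_new_item_factor(user_factors, row)
assert y.shape == (8,)
assert np.isfinite(y).all()
```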
I expected that the factors produced by `model_1` and `updated_model_2` for the selected item would place it close to the same items. For instance, imagine that I have data on movies, and there is a subset with the Rocky Balboa movies. With `model_1` all the Rocky movies are close to each other, but with `updated_model_2`, if Rocky I was the movie selected for the test, it falls in a weird position in the embedding space, and the other Rocky movies are no longer close to it.

Can someone help me understand whether my test procedure makes sense? It would be amazing to make this work and put it in production! Thanks!