Query top N recommended items #24

NumbaCruncha · 2017-04-05T01:29:49Z

Hi Ben,

I'm using implicit to predict a top-7 list of recommendations from a sparse matrix of aggregated customer purchases, built from 7101 customer purchases across 24 products.

The issue I'm having is that I'm a little confused by the output of .recommend, which produces a list of N tuples:

[(845, 1.0136324354312989), (1150, 1.0028331824506354), (51, 1.0027650376439357), (2411, 1.0024685562873292), (1810, 1.0019960930254448), (1211, 1.0018685279069661), (775, 1.0018545578136604)]

I would have expected the first value in each tuple to be an index into the product list, but I suspect I'm looking at indices for the latent factor vectors instead. If you could give me a steer on the process for extracting the product identities, it would be very much appreciated.

Kind regards,
Michael.

import pandas as pd
import scipy.sparse as sparse
import numpy as np
import implicit

# Import data and add column names
data = pd.read_csv(r'D:\santander\train_sample_small.csv', names=['cust_id', 'product', 'rating'])
# transform dataset to sum by activity
grouped_data = data.groupby(['cust_id', 'product']).sum().reset_index()
grouped_data.head()

# Only get customers where purchase totals were positive
grouped_purchased = grouped_data.query('rating > 0')
print(grouped_purchased.head())

# Get our unique customers
customers = list(np.sort(grouped_purchased.cust_id.unique()))

# Get our unique products that were purchased
products = list(grouped_purchased['product'].unique())

# All of our purchases
rating = list(grouped_purchased.rating)

# Get the associated row/column indices
rows = grouped_purchased['cust_id'].astype(pd.CategoricalDtype(categories=customers)).cat.codes
cols = grouped_purchased['product'].astype(pd.CategoricalDtype(categories=products)).cat.codes

# create sparse matrix from data
purchases_sparse = sparse.csr_matrix((rating, (rows, cols)), shape=(len(customers), len(products)), dtype=np.float64)

# Build, fit model and recommend top 7 products for first user
model = implicit.als.AlternatingLeastSquares(factors=50, regularization=0.1, iterations=50)
model.fit(item_users=purchases_sparse)
recom = model.recommend(userid=0, user_items=purchases_sparse.T, N=7)

benfred · 2017-05-03T03:09:28Z

It's the index into the matrix you passed to the 'fit' function. You'll need to map from the category id in your 'rows' back to the category. The example file shows how to do this here: https://github.com/benfred/implicit/blob/master/examples/lastfm.py#L122-L128

Also, the userid in the 'recommend' method is the column id in the item_users matrix.
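As a rough illustration of that mapping, here is a minimal sketch that does not call implicit at all. The toy DataFrame, the `recommended` list, and the variable names are assumptions made up for the example; only the idea of going from category codes back to categories comes from the reply above. It also shows the two matrix orientations side by side, since the question's `purchases_sparse` is customers by products while `fit` in this thread takes an `item_users` matrix, so the two calls in the question may have their matrices swapped.

```python
import pandas as pd
import scipy.sparse as sparse

# Hypothetical toy data standing in for the purchases file.
data = pd.DataFrame({
    "cust_id": [101, 101, 205, 309, 309, 412],
    "product": ["loan", "card", "card", "savings", "loan", "card"],
    "rating": [1.0, 2.0, 1.0, 3.0, 1.0, 1.0],
})

# Categorical columns keep the code <-> label mapping in one place.
cust_cat = data["cust_id"].astype("category")
prod_cat = data["product"].astype("category")

# Rows are customers, columns are products.
user_items = sparse.csr_matrix(
    (data["rating"].to_numpy(),
     (cust_cat.cat.codes.to_numpy(), prod_cat.cat.codes.to_numpy())),
    shape=(len(cust_cat.cat.categories), len(prod_cat.cat.categories)),
)

# The transpose has products as rows and customers as columns,
# which is the layout the name `item_users` suggests.
item_users = user_items.T.tocsr()

# Made-up (index, score) pairs shaped like the output of `recommend`.
recommended = [(2, 0.91), (0, 0.42)]

# Column indices of user_items are product category codes,
# so the categories index turns them back into product labels.
products = prod_cat.cat.categories
labelled = [(products[idx], score) for idx, score in recommended]
print(labelled)
```

If the returned indices run well past the number of products (as with 845 or 2411 against 24 products), that is a hint they refer to the other axis of the matrix.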

igorkf · 2020-11-25T20:59:36Z

You can create a mapping like this:

user2idx = dict(zip(pivot_table['user_id'].cat.categories[pivot_table['user_id'].cat.codes].tolist(),
                    pivot_table['user_id'].cat.codes.tolist()))
idx2user = {idx: user for user, idx in user2idx.items()}

The first will create a dictionary where each key is a user_id, and each corresponding value is the user_index of the sparse matrix:

{user_id_0: user_index_0, user_id_1: user_index_1, ...}

The second is just the reverse mapping:

{user_index_0: user_id_0, user_index_1: user_id_1, ...}

So in the recommend() method you need to pass the user_index, not the user_id.
After this, the method will return the item_indexes, so you need a mapping idx2item too!
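The same idea extends to items. Below is a small sketch in plain Python, assuming the category lists are already available (in the snippet above they would come from something like `pivot_table['user_id'].cat.categories` and the equivalent item column). The names `user_categories`, `item_categories`, `to_item_ids`, and the sample recommendations are hypothetical. It relies on the fact that a pandas category code is the position of the value in the categories list.

```python
# Hypothetical category lists; position in the list equals the
# row/column index used when the sparse matrix was built.
user_categories = ["alice", "bob", "carol"]
item_categories = ["card", "loan", "savings"]

# Index -> id and id -> index mappings for both axes.
idx2user = dict(enumerate(user_categories))
user2idx = {user: idx for idx, user in idx2user.items()}
idx2item = dict(enumerate(item_categories))
item2idx = {item: idx for idx, item in idx2item.items()}


def to_item_ids(recommendations, idx2item):
    """Translate (item_index, score) pairs into (item_id, score) pairs."""
    return [(idx2item[int(idx)], float(score)) for idx, score in recommendations]


# Look up the index to pass to recommend() for a given user id.
user_index = user2idx["bob"]

# Made-up output shaped like what recommend() returns.
recommendations = [(2, 0.87), (1, 0.35)]
result = to_item_ids(recommendations, idx2item)
print(user_index, result)
```

Casting with `int()` and `float()` keeps the lookup working if the indices and scores arrive as NumPy scalars rather than Python numbers.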

benfred closed this as completed May 3, 2017

MariosGr mentioned this issue Mar 17, 2018

model fit crushes with more than 2^31 positive interactions #86

Closed
