
Recommender Systems-Chapter 4

This document discusses collaborative filtering techniques for recommender systems. It covers memory-based user-item and item-item collaborative filtering, and compares the two approaches. It also discusses model-based collaborative filtering using matrix factorization.

Recommender Systems

Chapter 4

Sara Qassimi

L2IS Laboratory , FST Marrakech, Cadi Ayyad University

Collaborative filtering
Memory based: Item-based CF
Item-Item Collaborative Filtering: “Users who liked this item also liked …”
Item-item filtering takes an item, finds users who liked that item, and finds
other items that those users or similar users also liked. It takes items as
input and outputs other items as recommendations.

If John, Robert and Jenny all rated the sci-fi books Fahrenheit 451 and The
Time Machine highly (for example, 5 stars each), then when Tom buys
Fahrenheit 451, The Time Machine is also recommended to him, because the
system identified the two books as similar based on user ratings.

Pr. Sara Qassimi 2


FST- UCA
Item-Item Collaborative Filtering

● Two ways of looking at the problem:
○ The first is less intuitive (thinking about it from scratch)
○ The second is that it is no different from user-user CF

Recap
● For user-user CF, for a user x, we want to find “users like user x”
● The items that those users have rated, and that user x has not rated,
become the recommendations for user x
● It is intuitive that if they are “like user x”, user x would like the items
that they have rated highly

Recap (continued)
● user-user CF looks row-wise
● each row is a vector
● 2 users are similar if their row vectors have small distance between
them

Item-Item Collaborative Filtering
● what if we look column-wise instead?
● let’s find 2 items that are similar
● they are similar if their column vectors’ distance is small

Example
● The correlation between the two column vectors is high
● If a user likes Power Rangers, this user will probably also like Transformers
● We see that two movies are similar because users have given them similar
ratings
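
The column-wise comparison above can be sketched in a few lines. This is a minimal illustration, assuming Pearson correlation as the similarity measure and `np.nan` as the marker for a missing rating; the function name `item_correlation` and the toy ratings are made up for the example.

```python
import numpy as np

def item_correlation(ratings, j, k):
    """Pearson correlation between item columns j and k.

    `ratings` is a users x items array with np.nan for missing ratings
    (an assumed encoding). Only users who rated both items are used.
    """
    col_j, col_k = ratings[:, j], ratings[:, k]
    both = ~np.isnan(col_j) & ~np.isnan(col_k)
    if both.sum() < 2:
        return 0.0  # not enough common users to compare
    # Deviations from each item's mean, computed over the common users.
    dj = col_j[both] - col_j[both].mean()
    dk = col_k[both] - col_k[both].mean()
    denom = np.sqrt((dj ** 2).sum() * (dk ** 2).sum())
    return float((dj * dk).sum() / denom) if denom > 0 else 0.0

# Toy data: rows are users, columns are
# [Power Rangers, Transformers, Titanic]; the values are made up.
ratings = np.array([
    [5.0, 5.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
    [2.0, np.nan, 4.0],
])
sim_pr_tf = item_correlation(ratings, 0, 1)  # high: rated alike
sim_pr_ti = item_correlation(ratings, 0, 2)  # negative: rated oppositely
print(sim_pr_tf, sim_pr_ti)
```

In this toy matrix the first two columns move together across users, so their correlation is close to 1, while the third column moves in the opposite direction.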

Item Correlation

Item score

● Deviation: how much user i likes item j’, compared to how much
everyone else likes j’ (not as intuitive as user-user CF)
● If user i really likes j’ (more than other users do) and j is similar to j’
(the item-item weight is high), then user i probably likes j too
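
A possible way to turn the two bullets above into a score is sketched below. It assumes the usual weighted-deviation form (item mean plus a weighted average of the user's deviations on the items they rated); the function name `item_based_score`, the `weights` matrix and the toy values are illustrative assumptions, not the formula shown on the original slide.

```python
import numpy as np

def item_based_score(ratings, weights, i, j):
    """Predict user i's rating of item j from the items i has already rated."""
    item_means = np.nanmean(ratings, axis=0)       # average rating of each item
    rated = np.flatnonzero(~np.isnan(ratings[i]))  # items j' rated by user i
    rated = rated[rated != j]
    if rated.size == 0:
        return float(item_means[j])                # nothing to go on: use the item mean
    w = weights[j, rated]                          # item-item weights between j and each j'
    deviations = ratings[i, rated] - item_means[rated]
    denom = np.abs(w).sum()
    if denom == 0:
        return float(item_means[j])
    return float(item_means[j] + (w * deviations).sum() / denom)

# Made-up ratings (np.nan = not rated) and made-up item-item weights.
ratings = np.array([
    [5.0, np.nan, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
])
weights = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, -0.8],
    [-0.9, -0.8, 1.0],
])
score = item_based_score(ratings, weights, 0, 1)
print(score)
```

User 0 likes item 0 more than average and dislikes item 2 more than average; since item 1 is positively weighted with item 0 and negatively with item 2, the score lands above item 1's mean. The raw score can exceed the rating scale, so clipping it to the valid range is a reasonable extra step.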

Comparison
● User-User CF: choose items for a user, because those items have been
liked by similar users

● Item-Item CF: choose items for a user, because this user has liked
similar items in the past

Another perspective

● Pretend items are people, so they have feelings
● Flip the user-item matrix sideways
● Each entry tells me “how much item j likes user i”
● To choose a user to recommend to item j, I can look at other items j’ who
liked the same users as item j
● If items j and j’ are similar, then they like the same users
● User-based and item-based CF are mathematically identical
Practical Differences
● When comparing two items, there is more data than when comparing
two users
○ Each user: up to ~20k items to look at
○ Each item: up to ~100k users to look at
○ Thus, for item-based CF, weights are calculated from more data

● Item-based CF is faster
○ Given a user, calculate scores for each item: $O(M^2 N)$
■ There are $M^2$ item-item weights, and each vector is of length N
○ For user-based CF: $O(N^2 M)$
○ N >> M, so $N^2$ compared to $M^2$ is even worse
● Like user-based CF, limit the neighborhood (e.g. to 20 neighbors)
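
Limiting the neighborhood can be sketched as keeping only the K strongest weights for each item. The helper below is an illustration under assumptions: `top_k_neighbors` is a made-up name, the weights are toy values, and ranking by absolute weight (so that strongly negative correlations are kept too) is one possible design choice.

```python
import numpy as np

def top_k_neighbors(weights, j, k=20):
    """Indices of the k items with the largest absolute weight to item j."""
    w = np.abs(weights[j]).astype(float)
    w[j] = -np.inf                 # never pick the item itself
    k = min(k, len(w) - 1)
    order = np.argsort(-w)         # strongest weights first
    return order[:k]

# Made-up item-item weight matrix for 4 items.
weights = np.array([
    [1.0, 0.9, -0.7, 0.1],
    [0.9, 1.0, -0.2, 0.3],
    [-0.7, -0.2, 1.0, 0.0],
    [0.1, 0.3, 0.0, 1.0],
])
neighbors = top_k_neighbors(weights, 0, k=2)
print(neighbors)
```

With a neighborhood of size K, scoring one item only touches K weights instead of all M, which is where most of the practical speed-up comes from.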

Exercise:
Item-based CF for recommending movies
● Use the user-user CF script as a base to implement item-item CF
● Outline:
○ Split the dataset into train and test sets
○ Calculate the weights using the train set
○ Write a predict function, e.g. predict(i, j), that returns the score
○ Output the MSE for the train and test sets
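
The first and last steps of the outline (splitting the data and reporting MSE) might look like the sketch below. It is not the course's script: the ratings are generated, the helper names are assumptions, and `predict` is only a placeholder (the item's mean rating) that an item-item predict(i, j) would replace.

```python
import random

def train_test_split(ratings, test_fraction=0.2, seed=0):
    """Split a list of (user, item, rating) triples into train and test lists."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def mse(predict, triples):
    """Mean squared error of predict(i, j) over (user, item, rating) triples."""
    return sum((predict(i, j) - r) ** 2 for i, j, r in triples) / len(triples)

# Made-up ratings: (user id, item id, rating between 1 and 5).
ratings = [(u, m, float((u + m) % 5 + 1)) for u in range(10) for m in range(5)]
train, test = train_test_split(ratings)

# Placeholder predictor: the item's mean rating in the train set.
sums, counts = {}, {}
for _, j, r in train:
    sums[j] = sums.get(j, 0.0) + r
    counts[j] = counts.get(j, 0) + 1
global_mean = sum(r for _, _, r in train) / len(train)

def predict(i, j):
    # Fall back to the global mean for items unseen in the train set.
    return sums[j] / counts[j] if j in counts else global_mean

print("train MSE:", mse(predict, train))
print("test MSE:", mse(predict, test))
```

Computing the weights and item means from the train set only, as here, keeps the test MSE an honest estimate of performance on unseen ratings.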

Tutorial
Implementation : Item based CF

Summary of Collaborative Filtering
● Previously, we considered s(j) only: a single score for each item,
regardless of the user profile
● In the CF section, the score s(i,j) depends on both user i and item j
● Problem with the average rating:
○ Not all ratings should be treated equally
○ Users with whom the target user agrees should be weighted higher
○ Users with whom the target user disagrees should be weighted lower
● Pearson correlation as weights
● Score as deviation: linear regression
○ How much more a user likes an item compared to
how they normally rate
○ Deviations of users who are like the target user
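
For reference, the "score as deviation" idea is commonly written as below for user-user CF. The notation is an assumption for illustration: $\bar{r}_i$ is user i's average rating, $w_{ii'}$ is the Pearson weight between users i and i', and $\Omega_j$ is the set of users who rated item j.

```latex
s(i, j) = \bar{r}_i
  + \frac{\sum_{i' \in \Omega_j} w_{ii'} \, \left( r_{i'j} - \bar{r}_{i'} \right)}
         {\sum_{i' \in \Omega_j} \left| w_{ii'} \right|}
```

Each neighbor contributes how far above or below their own average they rated item j, and neighbors who agree more with the target user count more.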

Summary of Collaborative Filtering
User-User and Item-Item
● By flipping the ratings matrix sideways, we can convert the user-user CF
algorithm into an item-item CF algorithm
● Item-based CF is more accurate: it calculates its correlation
weights across a large number of users, which gives it more data to work
with.

Summary of Collaborative Filtering
Do we care about accuracy?
● Not necessarily
● Item-based CF may be “too” accurate
○ It only recommends “similar items” that tend to be very obvious and expected
○ Lack of diversity in the recommendations: the “YouTube Problem”
● While user-based CF gives a worse mean squared error (MSE), it might
actually be more desirable in practice.

Summary of Collaborative Filtering
Not necessarily movies / ratings
● The user-item matrix does not have to be ratings at all!
● Explicit feedback is sparse
● Number of times a user viewed an item
● Did they purchase?
● Did they hit the like button?
● Did they share on social media?
● Context-Aware Recommender System
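
As an illustration of a user-item matrix built from signals other than ratings, the sketch below aggregates an event log into one number per (user, item) pair. The event log, the event names and the per-event weights are all hypothetical; a real system would choose and tune its own.

```python
import numpy as np

# Hypothetical event log: (user id, item id, event type).
events = [
    (0, 0, "view"), (0, 0, "view"), (0, 1, "purchase"),
    (1, 1, "like"), (1, 2, "view"), (2, 0, "share"),
]
# Assumed weight per event type (stronger signals count more).
event_weight = {"view": 1.0, "like": 2.0, "share": 3.0, "purchase": 5.0}

n_users = 1 + max(u for u, _, _ in events)
n_items = 1 + max(i for _, i, _ in events)
matrix = np.zeros((n_users, n_items))
for user, item, kind in events:
    matrix[user, item] += event_weight[kind]   # accumulate signal strength
print(matrix)
```

The resulting matrix can be fed to the same user-user or item-item machinery as a ratings matrix, with zeros meaning "no interaction observed" rather than "disliked".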

Collaborative filtering
Memory-based shortcomings:

● They are called memory-based because the algorithm is not
complicated, but requires a lot of memory to keep track of the results.
● Computationally expensive: the user-item or item-item matrix is loaded
in memory for similarity calculations
● Cold-start problem: fails to make recommendations for first-time users
and new items

Personalized RS : Model based Collaborative Filtering

Collaborative Filtering
● Memory based
○ User based
○ Item based
● Model based
○ Matrix Factorization
○ Deep learning
○ Clustering

Personalized RS : Model based Collaborative filtering
● The model is typically trained on a large dataset of user-item interactions, and
various machine learning techniques can be used to learn the relationships
between users and items.
● This approach can handle the cold-start problem, where there is insufficient
information about a new user or item to make accurate recommendations.
● Matrix Factorization is a popular technique that involves decomposing the
user-item interaction matrix into two lower-dimensional matrices that capture the
latent factors that influence user preferences.

Personalized RS : Model based Collaborative filtering

The idea behind Matrix Factorization is that we want to express the matrix R
as a product of two smaller matrices, W and U.
MF reduces the dimensionality of R in order to learn the most important
features required to generate R.

$R \approx W U^T$, where R is N x M, W is N x K and U is M x K.
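
A quick shape check can make the dimensionality reduction concrete. The sizes and random values below are made up; the only thing taken from the slide is that W is N x K and U is M x K, so their product has to be formed as W times the transpose of U.

```python
import numpy as np

N, M, K = 1000, 200, 10          # users, items, latent features (made-up sizes)
rng = np.random.default_rng(0)
W = rng.normal(size=(N, K))      # one K-dimensional row per user
U = rng.normal(size=(M, K))      # one K-dimensional row per item

R_hat = W @ U.T                  # N x M matrix of predicted ratings
print(R_hat.shape)
print(N * M, "entries in R vs", N * K + M * K, "parameters in W and U")
```

With these sizes, W and U together hold far fewer numbers than R has entries, which is the sense in which the factorization compresses the ratings matrix.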

Computationally expensive

$w_i$ is the i-th row of the N x K matrix W. By convention it is not treated as
a 1 x K matrix: vectors are column vectors (i.e., K x 1 matrices) (see: Linear
Regression; design matrix).

$u_j$ is the j-th row of the M x K matrix U. Likewise, it is not treated as a
1 x K matrix but as a column vector (i.e., a K x 1 matrix).

Interchanging the order of the multiplication and the transpose operation
gives $(w_i^T u_j)^T = u_j^T w_i$; the result is a scalar.
There's a strong negative correlation between the
features of the film Titanic and user preferences.

MF extracts the “features” automatically using only ratings,
by looking at the patterns between users, items and their ratings.
MF is a way of reducing the dimensionality of the ratings matrix.
The model learns the most important features required to recreate R.

R (N x M) and its approximation $W U^T$: we would like them to be close together.

The scalar $w_i^T u_j$ is the predicted rating of user i for item j.
It represents the correlation between the features of the item and the user's preferences.
Predicted Rating
Actual Rating
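
One common way to compare the predicted and actual ratings is a squared-error loss over the ratings that exist. The form below is an assumption consistent with the MSE used elsewhere in these slides; $\Omega$ (the set of (i, j) pairs with a known rating) and $\hat{r}_{ij}$ are notation introduced here for illustration.

```latex
J = \sum_{(i,j) \in \Omega} \left( r_{ij} - \hat{r}_{ij} \right)^2
  = \sum_{(i,j) \in \Omega} \left( r_{ij} - w_i^T u_j \right)^2
```

Only observed ratings enter the sum; missing entries of R contribute nothing to the loss.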


The gradient: how much each parameter should be adjusted
in order to decrease the loss.

During training, the goal is to update the model parameters in the direction of the negative
gradient of the loss function so as to minimize the loss. When we set the gradient to zero, we
are finding the critical points of the loss function, i.e., the points where the slope
of the function is zero.
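
As a worked example of setting the gradient to zero, assume a plain squared-error loss with no bias or regularization terms. The symbols $\Omega$ (all observed (i, j) pairs) and $\Psi_i$ (the items rated by user i) are assumptions introduced for this sketch.

```latex
J = \sum_{(i,j) \in \Omega} \left( r_{ij} - w_i^T u_j \right)^2

\frac{\partial J}{\partial w_i}
  = -2 \sum_{j \in \Psi_i} \left( r_{ij} - w_i^T u_j \right) u_j = 0

\sum_{j \in \Psi_i} u_j \left( u_j^T w_i \right) = \sum_{j \in \Psi_i} r_{ij} \, u_j
  \qquad \text{(using } w_i^T u_j = u_j^T w_i \text{)}

w_i = \left( \sum_{j \in \Psi_i} u_j u_j^T \right)^{-1} \sum_{j \in \Psi_i} r_{ij} \, u_j
```

The last line holds when the K x K matrix in parentheses is invertible, which requires user i to have rated enough items.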

Dot product is commutative

We have two parameters w and u and therefore they both need to be updated.
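
One way to update both parameters is to alternate: solve for every w with U fixed, then for every u with W fixed. The sketch below assumes a plain squared-error loss without bias or regularization, and uses `np.linalg.lstsq` for each small least-squares problem instead of an explicit matrix inverse. The names `als` and `squared_error` and the toy data are made up; this is not the implementation shown in the lecture.

```python
import numpy as np

def als(R, mask, K=2, epochs=20, seed=0):
    """Alternating least squares for R ~ W @ U.T on observed entries only.

    R is N x M; mask is a boolean N x M array marking observed ratings.
    """
    rng = np.random.default_rng(seed)
    N, M = R.shape
    W = rng.normal(size=(N, K))
    U = rng.normal(size=(M, K))
    for _ in range(epochs):
        # Update each user vector w_i with U held fixed.
        for i in range(N):
            Uj = U[mask[i]]                      # vectors of items rated by user i
            if len(Uj):
                W[i] = np.linalg.lstsq(Uj, R[i, mask[i]], rcond=None)[0]
        # Update each item vector u_j with W held fixed.
        for j in range(M):
            Wi = W[mask[:, j]]                   # vectors of users who rated item j
            if len(Wi):
                U[j] = np.linalg.lstsq(Wi, R[mask[:, j], j], rcond=None)[0]
    return W, U

def squared_error(R, mask, W, U):
    """Sum of squared errors over the observed entries."""
    return float((((W @ U.T) - R)[mask] ** 2).sum())

rng = np.random.default_rng(1)
R = rng.normal(size=(6, 2)) @ rng.normal(size=(5, 2)).T  # made-up rank-2 "ratings"
mask = np.ones(R.shape, dtype=bool)
mask[0, 1] = mask[3, 4] = False                          # pretend two ratings are missing
W0, U0 = als(R, mask, epochs=0)                          # random initial parameters
W, U = als(R, mask, epochs=20)
initial_error = squared_error(R, mask, W0, U0)
final_error = squared_error(R, mask, W, U)
print(initial_error, "->", final_error)
```

Each inner solve minimizes the loss for one row with everything else fixed, so the overall error cannot go up from one half-step to the next.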

It has been proven that each step brings us closer to a local minimum.

A local minimum is so called because the value of the loss function is
minimal at that point within a local region.

Update w, then update u, and repeat.

The y-intercept b controls how far the line is shifted up or down along
the y-axis.

If b is not included, the line is always forced to pass through the origin,
which may not actually fit the dataset.
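
The effect of leaving out b can be checked on a tiny example. The data below are made-up points lying exactly on y = 2x + 3; the fit is done with NumPy's least-squares solver, once without and once with a column of ones for the intercept.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 3.0                       # points on the line y = 2x + 3

# Without b: fit y ~ a * x, so the line must pass through the origin.
a_only = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

# With b: append a column of ones so the intercept is learned too.
X = np.column_stack([x, np.ones_like(x)])
a, b = np.linalg.lstsq(X, y, rcond=None)[0]

print("no intercept: slope", a_only)
print("with intercept: slope", a, "intercept", b)
```

Without the intercept, the slope is distorted to compensate for the missing offset; with it, the true slope and offset are recovered. Bias terms in matrix factorization play the same role for ratings.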
Bias Terms
Bias terms help to improve the accuracy of the predicted rating of a user for a movie.

● User bias: users might be very optimistic or very pessimistic. The bias of
a user adjusts the rating based on the user's general tendency to rate movies
higher or lower than average.
● Movie bias: the bias of a movie describes how well this movie is rated
compared to the average across all movies. It adjusts the rating based on the
movie's general tendency to be rated higher or lower than average.
● Global average, for censoring the dataset: censoring in a dataset refers to a
situation where some of the values are not recorded accurately, and are
instead replaced with a known or estimated value.
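
Putting the three terms together, a prediction with biases is often written as the latent-factor dot product plus a user bias, a movie bias and the global average. The sketch below assumes that additive form; the names `b` (user biases), `c` (movie biases), `mu` (global average) and all numeric values are illustrative assumptions.

```python
import numpy as np

def predict_with_bias(W, U, b, c, mu, i, j):
    """Predicted rating = interaction term + user bias + movie bias + global average."""
    return float(W[i] @ U[j] + b[i] + c[j] + mu)

# Made-up parameters for 2 users, 2 movies, K = 2 latent features.
W = np.array([[0.5, 0.1], [-0.2, 0.4]])
U = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([0.3, -0.5])     # user 0 rates generously, user 1 harshly
c = np.array([0.2, -0.1])     # movie 0 is rated above average, movie 1 below
mu = 3.5                      # global average rating

p00 = predict_with_bias(W, U, b, c, mu, 0, 0)
p11 = predict_with_bias(W, U, b, c, mu, 1, 1)
print(p00, p11)
```

Because the biases absorb the general tendencies of users and movies, the dot product only has to model what is specific to the pairing of this user with this movie.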
Since the equations for w and u are symmetric, we can apply the same logic to determine u

Set of movies that user i has rated
Set of users who have rated movie j
Add the squared magnitude of each parameter, multiplied by the regularization constant, to the loss.
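
Written out for the factor matrices only (bias terms are left out to keep the sketch short), the regularized loss and the resulting update for $w_i$ could look as follows. The symbols $\lambda$ (regularization constant), $\Omega$ (observed (i, j) pairs), $\Psi_i$ (items rated by user i) and $\|\cdot\|_F$ (Frobenius norm) are assumptions introduced here.

```latex
J = \sum_{(i,j) \in \Omega} \left( r_{ij} - w_i^T u_j \right)^2
    + \lambda \left( \| W \|_F^2 + \| U \|_F^2 \right)

\frac{\partial J}{\partial w_i}
  = -2 \sum_{j \in \Psi_i} \left( r_{ij} - w_i^T u_j \right) u_j + 2 \lambda w_i = 0

w_i = \left( \sum_{j \in \Psi_i} u_j u_j^T + \lambda I \right)^{-1}
      \sum_{j \in \Psi_i} r_{ij} \, u_j
```

The added $\lambda I$ keeps the parameters small and also makes the matrix invertible even when user i has rated very few items; the update for $u_j$ is symmetric.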

Regularization

Implementation : Matrix Factorization

The training and test losses are calculated after
each epoch using the get_loss function.
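
The body of get_loss is not visible in the extracted slides. The version below is a hypothetical stand-in, assuming ratings stored as a dictionary keyed by (user, item), a prediction with bias terms, and mean squared error as the loss; the parameter names `b`, `c` and `mu` and the toy data are assumptions.

```python
import numpy as np

def get_loss(ratings, W, U, b, c, mu):
    """Mean squared error over a dict {(i, j): rating}.

    Hypothetical signature: the original function is not shown in the source.
    """
    total = 0.0
    for (i, j), r in ratings.items():
        prediction = W[i] @ U[j] + b[i] + c[j] + mu
        total += (prediction - r) ** 2
    return total / len(ratings)

# Made-up parameters and data: all factors and biases are zero,
# so every prediction equals the global average mu.
W = np.zeros((2, 2))
U = np.zeros((2, 2))
b = np.zeros(2)
c = np.zeros(2)
mu = 3.0
train = {(0, 0): 4.0, (0, 1): 2.0}
test = {(1, 0): 3.0}

train_loss = get_loss(train, W, U, b, c, mu)
test_loss = get_loss(test, W, U, b, c, mu)
print(train_loss, test_loss)
```

In a training loop, calling such a function on both sets after every epoch and appending the results to two lists is enough to plot the train and test curves.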

Both the training and test losses are low and close to
each other. This indicates that the model is
performing well on the training data and is also
generalizing well to new data.

The model is learning to fit the training data better.
