RecSys - Final (Solution)
Spring 2023
School of Computing
Final Examination [Solution]
27th May 2023, 09:00 am – 12:00 noon
Model-based Systems: Learn a model from the data once, during training, and then make predictions without needing the raw data again.
Memory-based Systems: Use the entire ratings data every time a new prediction is to be made.
h) How well does a content-based system perform in terms of scalability?
Content-based systems scale well with the number of users, since adding a new user only requires constructing that user's profile.
i) Name two major advantages of Naïve Bayes Filtering over regular Collaborative Filtering.
i. It provides more accurate recommendations compared to vanilla collaborative
filtering.
ii. It can provide ranking of predicted items more easily.
j) Will Pearson Correlation be an appropriate similarity measure to use if the underlying
distribution of ratings is uniform? Justify your answer with a reason.
No. When the underlying distribution of ratings is uniform, the correlation coefficient flattens out (tends to zero), so it can no longer discriminate between users.
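As a quick illustration of this point, a toy simulation (with made-up data) of two users whose ratings are drawn uniformly at random from 1-5:

import numpy as np

rng = np.random.default_rng(42)
u = rng.integers(1, 6, size=10_000)   # user 1: ratings drawn uniformly from {1, ..., 5}
v = rng.integers(1, 6, size=10_000)   # user 2: same

# With uniformly distributed ratings the Pearson correlation carries no signal
# and sits near zero.
print(round(np.corrcoef(u, v)[0, 1], 3))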
         Item 1   Wall-E   Item 3   Item 4
Rohail     5        1        ?        2
Saif       1        5        2        5
Ahsan      2        ?        3        4
Qasim      4        3        5        ?

(Wall-E is the column with Ahsan's missing rating; the other item names are not given, so generic labels are used.)
Use item-based collaborative filtering (with adj. cosine) to predict R(Ahsan, Wall-E).
Solution:
For k = 1:
Mean(Rohail) = 2.67, Mean(Saif) = 3.25, Mean(Ahsan) = 3, Mean(Qasim) = 4

Mean-centred ratings (each user's mean subtracted from their row):

         Item 1   Wall-E   Item 3   Item 4
Rohail    2.33    -1.67       ?     -0.67
Saif     -2.25     1.75    -1.25     1.75
Ahsan    -1.00       ?      0.00     1.00
Qasim     0.00    -1.00     1.00       ?

The item most similar to Wall-E under adjusted cosine is Item 4 (similarity ≈ 0.92), which Ahsan has rated 4, so for k = 1 the prediction is R(Ahsan, Wall-E) = 4.
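The remaining steps can be sketched in Python: compute the adjusted cosine similarity between Wall-E and each other item over the users who rated both, then predict from the single most similar item (k = 1). The item labels other than Wall-E are placeholders, since the table does not name them.

import numpy as np
import pandas as pd

ratings = pd.DataFrame(
    [[5, 1, np.nan, 2],
     [1, 5, 2, 5],
     [2, np.nan, 3, 4],
     [4, 3, 5, np.nan]],
    index=['Rohail', 'Saif', 'Ahsan', 'Qasim'],
    columns=['Item 1', 'Wall-E', 'Item 3', 'Item 4'],
)

# Adjusted cosine: subtract each USER's mean rating, then compare item columns.
centred = ratings.sub(ratings.mean(axis=1), axis=0)

def adj_cosine(i, j):
    a, b = centred[i], centred[j]
    common = a.notna() & b.notna()            # users who rated both items
    num = (a[common] * b[common]).sum()
    den = np.sqrt((a[common] ** 2).sum()) * np.sqrt((b[common] ** 2).sum())
    return num / den if den else 0.0

target = 'Wall-E'
sims = {item: adj_cosine(target, item)
        for item in ratings.columns
        if item != target and pd.notna(ratings.loc['Ahsan', item])}
print(sims)

# k = 1: weighted average over the single most similar item, sum(sim*r) / sum(|sim|).
best = max(sims, key=sims.get)
pred = sims[best] * ratings.loc['Ahsan', best] / abs(sims[best])
print(best, round(pred, 2))                   # prediction for R(Ahsan, Wall-E)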
b) For the interaction matrix presented in Part (a), write Python code snippet to perform the
following operations:
I. Read the matrix from ratings.csv into a dataframe named “inter_matrix”.
import pandas as pd
inter_matrix = pd.read_csv('ratings.csv')   # load the interaction matrix from Part (a)
II. Print all the ratings given by the user “Saif”.
print(inter_matrix.iloc[1].values)   # Saif is the second row of the matrix
c) For the matrix in Part (a), find the user that is most similar to the active user (Ahsan) using
Pearson Correlation.
Corr(Ahsan, Rohail) = -0.87
Corr(Ahsan, Saif) = 0.960
Corr(Ahsan, Qasim) = 0
Most similar user is Saif.
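These values can be cross-checked with a short script. Here each user is centred by their mean over all of their rated items and the correlation is taken over the co-rated items; the exact figure for Saif shifts slightly if the mean is computed over co-rated items only, but the ranking (Saif as the most similar user) is unchanged.

import numpy as np
import pandas as pd

ratings = pd.DataFrame(
    [[5, 1, np.nan, 2],
     [1, 5, 2, 5],
     [2, np.nan, 3, 4],
     [4, 3, 5, np.nan]],
    index=['Rohail', 'Saif', 'Ahsan', 'Qasim'],
)

def pearson(u, v):
    # Centre each user by their own mean rating, then take the cosine
    # over the items both users have rated.
    du, dv = u - u.mean(), v - v.mean()
    common = du.notna() & dv.notna()
    num = (du[common] * dv[common]).sum()
    den = np.sqrt((du[common] ** 2).sum()) * np.sqrt((dv[common] ** 2).sum())
    return num / den if den else 0.0

active = ratings.loc['Ahsan']
sims = {user: pearson(active, ratings.loc[user])
        for user in ratings.index if user != 'Ahsan'}
print(sims)
print(max(sims, key=sims.get))   # Saif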
a) Consider the following matrix, which holds the performance ratings given by auditors to each of the branches. The rating scale is {1, 2, 3}.
            Branch 1   Branch 2   Branch 3
Auditor 1       1          1          2
Auditor 2       2          ?          1
Auditor 3       3          1          3
Find the missing rating using Naïve Bayes Collaborative Filtering based on users.
Solution:
(Assuming the smoothing constants alpha and beta to be 0.05)

Priors, from the ratings the other auditors gave to Branch 2:
P(ri2 = 1) = (2 + 0.05) / (2 + 0.05) = 1
P(ri2 = 2) = (0 + 0.05) / (2 + 0.05) ≈ 0.024
P(ri2 = 3) = (0 + 0.05) / (2 + 0.05) ≈ 0.024

Likelihoods, where X = Auditor 2's observed ratings on Branches 1 and 3:
P(X | ri2 = 1) = ((0 + 0.05) / (2 + 0.05)) x ((0 + 0.05) / (2 + 0.05)) ≈ 0
P(X | ri2 = 2) = ((0 + 0.05) / (0 + 0.05)) x ((0 + 0.05) / (0 + 0.05)) = 1
P(X | ri2 = 3) = ((0 + 0.05) / (0 + 0.05)) x ((0 + 0.05) / (0 + 0.05)) = 1

Posteriors (prior x likelihood, unnormalised):
P(ri2 = 1 | X) = 1 x 0 ≈ 0
P(ri2 = 2 | X) = 0.024 x 1 = 0.024
P(ri2 = 3 | X) = 0.024 x 1 = 0.024

Ratings 2 and 3 tie for the highest posterior and both beat rating 1, so the missing rating is predicted as 2 or 3 (tie broken arbitrarily).
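A small script reproducing this computation, using the same (count + alpha) / (total + alpha) smoothing convention as above:

import numpy as np

# Performance ratings (rows = auditors, columns = branches); np.nan marks the missing value.
R = np.array([
    [1, 1, 2],
    [2, np.nan, 1],
    [3, 1, 3],
], dtype=float)

ALPHA = 0.05           # smoothing constant, as assumed above
VALUES = [1, 2, 3]     # rating scale
user, item = 1, 1      # Auditor 2, Branch 2 (0-based indices)

posteriors = {}
for v in VALUES:
    # Prior: smoothed share of the other auditors' ratings of Branch 2 that equal v.
    col = np.delete(R[:, item], user)
    col = col[~np.isnan(col)]
    prior = (np.sum(col == v) + ALPHA) / (len(col) + ALPHA)

    # Likelihood of Auditor 2's other ratings given that their Branch 2 rating is v,
    # estimated from the auditors who gave Branch 2 the rating v.
    peers = [u for u in range(R.shape[0]) if u != user and R[u, item] == v]
    likelihood = 1.0
    for j in range(R.shape[1]):
        if j == item or np.isnan(R[user, j]):
            continue
        matches = sum(R[u, j] == R[user, j] for u in peers)
        likelihood *= (matches + ALPHA) / (len(peers) + ALPHA)

    posteriors[v] = prior * likelihood

print(posteriors)                            # ratings 2 and 3 tie; rating 1 is effectively ruled out
print(max(posteriors, key=posteriors.get))   # argmax prediction (tie broken by order)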
b) Explain the role of serendipity in probabilistic collaborative filtering approach.
Unlike vanilla collaborative filtering, probabilistic techniques do not expose a hyperparameter that directly controls serendipity. One way to influence it is to filter the users taken into account, for example by their significance or by the total number of items they have rated, though even this does not give full control over the degree of serendipity.
Consider the following matrix for anime recommendation where 1 and -1 represent like and dislike,
respectively. Answer the given questions:
           Anime 1   Anime 2   Anime 3   Anime 4
User 1        1         ?         1        -1
User 2       -1         1         1         1
User 3        ?         ?        -1        -1
User 4       -1         1         1         ?
a) Is it possible to make more than three factors of this matrix? Defend your answer with a proof.
Yes. Suppose we break the matrix into four factors such that:
M = F1 * F2 * F3 * F4
One possible choice of shapes is:
F1 is a 4x2 matrix
F2 is a 2x2 matrix
F3 is a 2x1 matrix
F4 is a 1x4 matrix
The product of these shapes, (4x2)(2x2)(2x1)(1x4), is again a 4x4 matrix, matching the dimensions of the interaction matrix, so more than three factors are possible.
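A one-line check that these shapes are compatible (random factor values are used, since only the dimensions matter here):

import numpy as np

rng = np.random.default_rng(0)
F1 = rng.standard_normal((4, 2))
F2 = rng.standard_normal((2, 2))
F3 = rng.standard_normal((2, 1))
F4 = rng.standard_normal((1, 4))

print((F1 @ F2 @ F3 @ F4).shape)   # (4, 4) -- same dimensions as the interaction matrix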
b) For what values of m, k, l and n, will the three factors (m x k), (k x l) and (l x n) consume less
memory in total than the original interaction matrix?
In general, the three factors occupy m*k + k*l + l*n cells against m*n for the original matrix, so they save memory whenever m*k + k*l + l*n < m*n. For example, with m = 4, k = 1, l = 2, n = 4, the factors (4x1), (1x2) and (2x4) take 4 + 2 + 8 = 14 cells of storage compared to 16 for the original matrix.
c) We used interactions instead of ratings in the given matrix. Can this approach help gradient
descent converge quicker?
Yes. Using binary interactions instead of ratings can help the algorithm converge faster, because the feasible solution space is much smaller: each reconstructed entry only has to approximate a like/dislike value rather than a full rating scale.
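For illustration, a minimal sketch of matrix factorization trained by stochastic gradient descent on the observed interactions of the matrix above; the latent dimension, learning rate and regularisation constant are arbitrary choices, not values given in the question.

import numpy as np

# Interaction matrix from the question (1 = like, -1 = dislike, np.nan = unknown).
M = np.array([
    [1, np.nan, 1, -1],
    [-1, 1, 1, 1],
    [np.nan, np.nan, -1, -1],
    [-1, 1, 1, np.nan],
], dtype=float)

rng = np.random.default_rng(0)
k = 2                                     # latent dimension (arbitrary)
P = 0.1 * rng.standard_normal((4, k))     # user factors (m x k)
Q = 0.1 * rng.standard_normal((k, 4))     # item factors (k x n)

lr, reg = 0.05, 0.01                      # learning rate and regularisation (arbitrary)
rows, cols = np.where(~np.isnan(M))       # train only on the observed interactions

for _ in range(500):
    for u, i in zip(rows, cols):
        pu, qi = P[u].copy(), Q[:, i].copy()
        err = M[u, i] - pu @ qi
        P[u] += lr * (err * qi - reg * pu)
        Q[:, i] += lr * (err * pu - reg * qi)

# Reconstruction: entries at the np.nan positions are the model's predictions.
print(np.round(P @ Q, 2))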
d) Does matrix factorization suffer from cold-start problem as severely as Collaborative
Filtering? Justify your answer.
Matrix factorization does suffer from the cold-start problem, and in many cases it is even more severe than in collaborative filtering: to learn meaningful factors, matrix factorization needs the interaction matrix to be as dense (as little sparse) as possible.
Question 5: Content-based Filtering (CLO: 2) 20 points (15 + 5)
a) Consider the following interaction table for news articles and keywords where “Article 2” is
the current (active) item. The table is populated with pre-computed TF-IDF values. Perform
the given tasks.
b) For the interaction table presented in Part (a), write Python code snippet to save it to a
dataframe named “news” and then calculate the pairwise cosine similarity on it.
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
news = pd.DataFrame(interaction_matrix)    # interaction_matrix = the TF-IDF table from Part (a)
similarities = cosine_similarity(news)     # pairwise cosine similarity between articles (rows)
Question 6: Neural Recommendations (CLO: 4) 10 points (5 + 5)
a) Explain why the following ratings data cannot be properly split into training and testing
samples for neural recommendations:
I1 I2 I3 I4 I5 I6 I7 I8 I9 I10
U1 2 1 3 5 4 2 1 2
U2
U3 4 5 3 2 4 5 1
U4 1 4 3 4 2 1 1 5
U5 4 2 1 3 2 1 2 3
U6 3 1 2 3 2 2
Note: Each row represents a unique user and each column is a unique item.
The dataset contains a user (U2) with no ratings at all and an item (I7) that no user has rated. Rows and columns with no interactions cannot be meaningfully divided between training and testing samples, so the data cannot be split as is; the empty user and item must first be handled (e.g. removed or imputed) during preprocessing.
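One way to surface this during preprocessing is to flag empty users and items before attempting any split; the file name and layout below are assumptions for illustration.

import pandas as pd

# Assumes the table above is stored in 'neural_ratings.csv' (hypothetical name)
# with users as the index and items I1..I10 as columns.
ratings = pd.read_csv('neural_ratings.csv', index_col=0)

empty_users = ratings.index[ratings.isna().all(axis=1)].tolist()     # expected: ['U2']
empty_items = ratings.columns[ratings.isna().all(axis=0)].tolist()   # expected: ['I7']

print('Users with no ratings:', empty_users)
print('Items with no ratings:', empty_items)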
b) Explain why we are usually interested in interactions more than ratings when working with
recommendations generated by a GAN.
Training a GAN to convergence is resource-intensive, and using full ratings instead of binary interactions makes the resource requirements significantly higher. Moreover, in many cases we care more about whether the user would interact with a particular item at all than about the exact rating they would give.