Recommender Systems (Shortened)
Tho Quan
[email protected]
Recommender Systems
Application areas
In the Social Web
Even more …
Personalized search
"Computational advertising"
Agenda
Why use Recommender Systems?
Problem domain
RS are software agents that elicit the interests and preferences of individual
consumers […] and make recommendations accordingly.
They have the potential to support and improve the quality of the
decisions consumers make while searching for and selecting products online.
[Xiao & Benbasat, MISQ, 2007]
Paradigms of recommender systems
Personalized recommendations
Recommender systems: basic techniques
Collaborative
– Pros: no knowledge-engineering effort, serendipity of results, learns market segments
– Cons: requires some form of rating feedback, cold start for new users and new items

Content-based
– Pros: no community required, comparison between items possible
– Cons: content descriptions necessary, cold start for new users, no surprises
Collaborative Filtering (CF)
1992: Using collaborative filtering to weave an information
tapestry, D. Goldberg et al., Communications of the ACM
Basic idea: "Eager readers read all docs immediately, casual readers wait
for the eager readers to annotate"
Experimental mail system at Xerox PARC that records the reactions of users when reading a mail
Users are provided with personalized mailing list filters instead of being
forced to subscribe
– Content-based filters (topics, from/to/subject…)
– Collaborative filters
E.g. mails to [all] that were replied to by [John Doe] and that received positive ratings from [X] and [Y].
1994: GroupLens: an open architecture for collaborative filtering of
netnews, P. Resnick et al., ACM CSCW
The Tapestry system does not aggregate ratings and requires that users know each other
Basic idea: "People who agreed in their subjective evaluations in the
past are likely to agree again in the future"
Builds on newsgroup browsers with rating functionality
User-based nearest-neighbor collaborative filtering (1)
Measuring user similarity
– $a, b$: users
– $r_{a,p}$: rating of user $a$ for item $p$
– $P$: set of items rated by both $a$ and $b$
– Possible similarity values lie between -1 and 1; $\bar{r}_a$, $\bar{r}_b$ are the users' average ratings
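Written out with the notation above, this is the Pearson correlation coefficient (the measure the later complexity discussion also refers to):

\[
sim(a,b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)\,(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2}\;\sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}
\]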
[Figure: bar chart of the ratings (scale 0 to 6) that Alice, User1, and User4 gave to Item1 through Item4]
Making predictions
Calculate whether the neighbors' ratings for the unseen item i are higher or lower than their average
Combine the rating differences, using the similarity as a weight
Add/subtract the neighbors' weighted bias to/from the active user's average and use this as the prediction (see the formula below)
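These steps correspond to the common textbook prediction formula, with $N$ the set of nearest neighbors of $a$ that have rated item $p$:

\[
pred(a,p) = \bar{r}_a + \frac{\sum_{b \in N} sim(a,b)\,(r_{b,p} - \bar{r}_b)}{\sum_{b \in N} sim(a,b)}
\]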
Making recommendations
Improving the metrics / prediction function
Memory-based and model-based approaches
2001: Item-based collaborative filtering recommendation algorithms, B.
Sarwar et al., WWW 2001
Scalability issues arise with user-to-user (U2U) CF if there are many more users than items
(m >> n, m = |users|, n = |items|)
– e.g. Amazon.com
– Space complexity O(m²) when similarities are pre-computed
– Time complexity for computing Pearson correlations: O(m²n)
Item-based collaborative filtering
Basic idea:
– Use the similarity between items (and not users) to make predictions
Example:
– Look for items that are similar to Item5
– Take Alice's ratings for these items to predict the rating for Item5
The cosine similarity measure
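The slide body did not survive extraction; the standard measure treats two items' rating vectors $\vec{a}$ and $\vec{b}$ as vectors in user space and computes the cosine of the angle between them (item-based CF commonly uses the "adjusted" variant, which first subtracts each user's average rating):

\[
sim(\vec{a},\vec{b}) = \frac{\vec{a} \cdot \vec{b}}{\lVert\vec{a}\rVert \, \lVert\vec{b}\rVert}
\]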
Pre-processing for item-based filtering
More on ratings
Implicit ratings
– clicks, page views, time spent on some page, demo downloads …
– Can be used in addition to explicit ratings; the open question is whether they are interpreted correctly
Data sparsity problems
Example algorithms for sparse datasets
Recursive CF
– Assume there is a very close neighbor n of u who, however, has not yet rated the target item i
– Idea (see the sketch below):
Apply the CF method recursively and predict a rating for item i for the neighbor
Use this predicted rating instead of the rating of a more distant direct neighbor
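A minimal sketch of this idea, assuming ratings stored as nested dicts and a user–user similarity function `sim`; the function names, the 0.7 closeness threshold, and the recursion depth are all hypothetical:

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def predict(user, item, ratings, sim, k=10, close=0.7, depth=1):
    """User-based CF prediction; falls back to a recursively predicted
    rating when a very close neighbor has not rated the item yet."""
    neighbors = sorted((n for n in ratings if n != user),
                       key=lambda n: sim(user, n), reverse=True)[:k]
    terms = []
    for n in neighbors:
        if item in ratings[n]:
            r = ratings[n][item]                 # real rating available
        elif depth > 0 and sim(user, n) >= close:
            r = predict(n, item, ratings, sim, k, close, depth - 1)
            if r is None:
                continue                         # recursion found nothing
        else:
            continue                             # distant neighbor without rating
        terms.append((sim(user, n), r - mean(ratings[n].values())))
    if not terms:
        return None
    denom = sum(abs(s) for s, _ in terms)
    if denom == 0:
        return mean(ratings[user].values())
    return mean(ratings[user].values()) + sum(s * d for s, d in terms) / denom
```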
More model-based approaches
2000: Application of Dimensionality Reduction in
Recommender System, B. Sarwar et al., WebKDD Workshop
Basic idea: Trade more complex offline model building for faster online
prediction generation
Singular Value Decomposition for dimensionality reduction of rating
matrices
– Captures important factors/aspects and their weights in the data
– Factors can be genre or actors, but also non-interpretable ones
– Assumption: k dimensions capture the signal and filter out the noise (k = 20 to 100); see the numpy sketch below
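A minimal numpy sketch of the idea on a toy, fully observed rating matrix (in practice, missing ratings would first be imputed, e.g. with item averages, before applying plain SVD):

```python
import numpy as np

# Toy user x item rating matrix (5 users, 5 items)
M = np.array([[5, 3, 4, 4, 2],
              [3, 1, 2, 3, 3],
              [4, 3, 4, 3, 5],
              [3, 3, 1, 5, 4],
              [1, 5, 5, 2, 1]], dtype=float)

k = 2                                         # number of latent factors
U, s, Vt = np.linalg.svd(M, full_matrices=False)
M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation U_k Sigma_k V_k^T

print(np.round(M_k, 2))                       # smoothed/"denoised" ratings
```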
A picture says …
[Figure: users Alice, Bob, Mary, and Sue plotted in a two-dimensional latent space, both axes ranging from -1 to 1]
Matrix factorization
– SVD: $M_k = U_k \Sigma_k V_k^T$
Association rule mining
Probabilistic methods
2008: Factorization meets the neighborhood: a multifaceted collaborative
filtering model, Y. Koren, ACM SIGKDD
\[
\text{RMSE} = \sqrt{\frac{\sum_{(u,i) \in K} (\hat{r}_{ui} - r_{ui})^2}{|K|}}
\]
Neighborhood models
– good at detecting strong relationships between close items
Combination of both in a single prediction function
– Local search methods such as stochastic gradient descent determine the parameters
– A penalty on high parameter values is added to avoid over-fitting
\[
\hat{r}_{ui} = b_u + b_i + p_u^T q_i
\]
\[
\min_{p_*,\,q_*,\,b_*} \sum_{(u,i) \in K} \left( r_{ui} - b_u - b_i - p_u^T q_i \right)^2 + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 + b_u^2 + b_i^2 \right)
\]
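A minimal SGD sketch of the bias-plus-latent-factor objective above; the hyperparameter values are illustrative, and the paper's full model additionally includes a global mean and neighborhood terms:

```python
import numpy as np

def train_biased_mf(triples, n_users, n_items, k=20, lr=0.005,
                    lam=0.02, epochs=20, seed=0):
    """SGD for r_hat = b_u + b_i + p_u . q_i with L2 penalty lambda."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors p_u
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors q_i
    bu, bi = np.zeros(n_users), np.zeros(n_items)
    for _ in range(epochs):
        for u, i, r in triples:                    # observed ratings (u, i) in K
            e = r - (bu[u] + bi[i] + P[u] @ Q[i])  # prediction error
            bu[u] += lr * (e - lam * bu[u])        # gradient steps with
            bi[i] += lr * (e - lam * bi[i])        # shrinkage toward 0
            P[u], Q[i] = (P[u] + lr * (e * Q[i] - lam * P[u]),
                          Q[i] + lr * (e * P[u] - lam * Q[i]))
    return P, Q, bu, bi
```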
Summarizing recent methods
Collaborative Filtering Issues
Pros:
– well-understood, works well in some domains, no knowledge engineering required
Cons:
– requires user community, sparsity problems, no integration of other knowledge sources,
no explanation of results
Recommender Systems in e-Commerce
"We hope you will also buy …"
"These have been in stock for quite a while now …"
What is a good recommendation?
Purpose and success criteria (1)
Different perspectives/aspects
– Depends on domain and purpose
– No holistic evaluation scenario exists
Retrieval perspective
– Reduce search costs
– Provide "correct" proposals
– Assumption: Users know in advance what they want
Recommendation perspective
– Serendipity – identify items from the Long Tail
– Users did not know about existence
When does a RS do its job well?
"Recommend widely unknown items that users might actually like!"
– Recommend items from the long tail
– 20% of items accumulate 74% of all positive ratings
Purpose and success criteria (2)
Prediction perspective
– Predict to what degree users like an item
– Most popular evaluation scenario in research
Interaction perspective
– Give users a "good feeling"
– Educate users about the product domain
– Convince/persuade users - explain
How do we as researchers know?
Empirical research
Characterizing dimensions:
– Who is the subject that is in the focus of research?
– What research methods are applied?
– In which setting does the research take place?
Research methods
Experiment designs
Evaluation in information retrieval (IR)
             Actually good          Actually bad
Rated good   True Positive (tp)     False Positive (fp)
Rated bad    False Negative (fn)    True Negative (tn)
Metrics: Precision and Recall
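In terms of the contingency table above, the standard definitions are:

\[
\text{Precision} = \frac{tp}{tp + fp}, \qquad \text{Recall} = \frac{tp}{tp + fn}
\]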
Dilemma of IR measures in RS
Metrics: Rank Score – position matters
For a user, Rank Score extends recall and precision to take the positions of correct items in a ranked list into account
– Particularly important in recommender systems, as lower-ranked items may be overlooked by users
– Learning-to-rank: optimize models for such measures (e.g., AUC)
Accuracy measures
Offline experimentation example
Netflix competition
– Web-based movie rental service
– Prize of $1,000,000 for a 10% accuracy improvement (RMSE) over Netflix's own Cinematch system
Historical dataset
– ~480K users rated ~18K movies on a scale of 1 to 5 (~100M ratings)
– The last 9 ratings of each user were withheld:
Probe set – given to teams for offline evaluation
Quiz set – evaluates teams' submissions for the leaderboard
Test set – used by Netflix to determine the winner
Today
– Rating prediction is seen as only one additional input into the recommendation process
Content-based recommendation
Collaborative filtering does NOT require any information about the items.
However, it might be reasonable to exploit such information,
e.g. to recommend fantasy novels to people who liked fantasy novels in the past.
What do we need:
Some information about the available items such as the genre ("content")
Some sort of user profile describing what the user likes (the preferences)
The task:
Learn user preferences
Locate/recommend items that are "similar" to the user preferences
Paradigms of recommender systems
What is the "content"?
Here:
– Classical IR-based methods based on keywords
– No expert recommendation knowledge involved
– The user profile (preferences) is learned rather than explicitly elicited
Content representation and item similarities
Simple approach
– Compute the similarity of an unseen item with the user profile based on the
keyword overlap (e.g. using the Dice coefficient)
\[
sim(b_i, b_j) = \frac{2 \cdot |keywords(b_i) \cap keywords(b_j)|}{|keywords(b_i)| + |keywords(b_j)|}
\]
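A tiny runnable version of the Dice coefficient above (the keyword sets are hypothetical examples):

```python
def dice(keywords_i, keywords_j):
    """Dice coefficient of two keyword sets (0 = disjoint, 1 = identical)."""
    if not keywords_i and not keywords_j:
        return 0.0
    return 2 * len(keywords_i & keywords_j) / (len(keywords_i) + len(keywords_j))

print(dice({"fantasy", "dragons"}, {"fantasy", "vampires"}))  # 0.5
```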
Term-Frequency - Inverse Document Frequency (TF-IDF)
TF-IDF
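The slide content is missing here; one common formulation, with $N$ the number of documents and $df(t)$ the number of documents containing term $t$, is:

\[
\text{TF-IDF}(t, d) = tf(t, d) \cdot \log \frac{N}{df(t)}
\]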
Example TF-IDF representation
More on the vector space model
Recommending items
Rocchio details
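The slide details are missing; the classic Rocchio relevance-feedback update that the heading refers to is:

\[
Q' = \alpha \, Q + \beta \, \frac{1}{|D^+|} \sum_{d \in D^+} d \;-\; \gamma \, \frac{1}{|D^-|} \sum_{d \in D^-} d
\]

where $D^+$ and $D^-$ are the sets of relevant and non-relevant documents, and $\alpha, \beta, \gamma$ weight the current profile against the positive and negative feedback.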
Probabilistic methods
Remember Bayes' theorem:
\[
P(\text{Label}=1 \mid X) = k \cdot P(X \mid \text{Label}=1) \cdot P(\text{Label}=1)
\]
where $k = 1/P(X)$ is a normalization constant.
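A minimal naive Bayes "like/dislike" sketch over keyword features, illustrating how the formula is applied in content-based recommendation (Laplace smoothing; all training data hypothetical):

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (keyword_set, label) pairs with label in {0, 1}."""
    prior = Counter(label for _, label in docs)
    counts = {0: Counter(), 1: Counter()}
    vocab = set()
    for kws, label in docs:
        counts[label].update(kws)
        vocab |= kws
    return prior, counts, vocab, len(docs)

def log_score(kws, label, model):
    prior, counts, vocab, n = model
    logp = math.log(prior[label] / n)         # log P(Label)
    total = sum(counts[label].values()) + len(vocab)
    for w in kws:                             # log P(X | Label), assuming
        logp += math.log((counts[label][w] + 1) / total)  # independent keywords
    return logp

model = train([({"fantasy", "dragons"}, 1), ({"politics"}, 0),
               ({"fantasy", "magic"}, 1)])
print(log_score({"fantasy"}, 1, model) > log_score({"fantasy"}, 0, model))  # True
```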
Improvements
Limitations of content-based recommendation methods
Overspecialization
– Algorithms tend to propose "more of the same"
– E.g. news items that are too similar to ones already seen
Why do we need knowledge-based recommendation?
Knowledge-based recommendation
Knowledge-based recommendation I
Knowledge-Based Recommendation II
– Similarity functions
Determine matching degree between query and item (case-based RS)
– Utility-based RS
E.g. MAUT – Multi-attribute utility theory (see the sketch below)
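A minimal sketch of a MAUT-style score: an item's overall utility for a customer is the weighted sum of per-attribute utilities (the weights and value functions below are hypothetical):

```python
def utility(item, weights, value_fns):
    """item: attribute -> raw value; weights sum to 1;
    value_fns map raw attribute values to utilities in [0, 1]."""
    return sum(w * value_fns[attr](item[attr]) for attr, w in weights.items())

weights = {"price": 0.6, "quality": 0.4}                # customer-specific
value_fns = {"price": lambda p: 1 - min(p, 500) / 500,  # cheaper = better
             "quality": lambda q: q / 5}                # 0..5 rating scale
print(utility({"price": 200, "quality": 4}, weights, value_fns))  # ~0.68
# Items can then be ranked by utility(...) in descending order.
```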
Constraint-based recommendation I
Item ranking
Customer-specific item utilities with MAUT
Constraint-based recommendation II
Example: find minimal relaxations (minimal diagnoses)
Constraint-based recommendation III
– Bundling of recommendations
Find item bundles that fit together according to some knowledge
– E.g. travel packages, skin-care treatments, or financial portfolios
– One RS per item category; a constraint satisfaction problem (CSP) restricts how bundles can be configured
Conversational strategies
Example: adaptive strategy selection
Limitations of knowledge-based recommendation methods