Deep Learning for Recommender Systems @ TDC SP 2019

Deep Learning for
Recommender Systems
Gabriel Moreira
TDC 2019
Lead Data Scientist

About me
Gabriel Moreira
@gspmoreira
Lead Data Scientist
DSc. candidate

DRIVEN BY
IMPACT
We are digital transformation agents
for the most valuable brands in the
world, generating business impact for
all projects we lead.

Investing in Machine
Learning since 2012
Recognized Expertise
Google ML Specialized Partner
TensorFlow.org Reference
ciandt.com
Cognitive
Solutions
End-to-End
Machine Learning
Capabilities

“We are leaving the
Information Age and
entering the
Recommendation Age.”
Chris Anderson, "The Long Tail"

Recommendations are responsible for...
38% of sales
75% of watched content
39% of top news visualization

What can a Recommender System do?
1 - Recommendation
Given a user, produce an ordered list of items matching the
user's needs
2 - Prediction
Given an item, what is its relevance for
each user?

Recommender System Methods
Recommender System
● Content-based filtering
● Collaborative filtering
○ Memory-based filtering: User-based, Item-based
○ Model-based filtering (ML-based): Clustering, Association Rules,
Matrix Factorization, Neural Networks
● Hybrid filtering (Content-based + Collaborative)
● Most popular

User-Based Collaborative Filtering
Similar interests
Likes
Recommends

Item-Based Collaborative Filtering
Likes Recommends
Who likes A also likes B
Likes
Likes

Collaborative Filtering based on Matrix Factorization
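A minimal sketch of the idea, assuming plain SGD on the observed ratings of a small toy matrix (the function name `factorize`, the hyperparameters and the toy ratings are illustrative assumptions, not taken from the talk). Each user and each item gets a short latent vector, and a rating is approximated by their dot product:

```python
import numpy as np

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Factorize a user-item matrix with SGD; zeros are treated as missing."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    user_factors = rng.normal(scale=0.1, size=(n_users, n_factors))
    item_factors = rng.normal(scale=0.1, size=(n_items, n_factors))
    observed = np.argwhere(ratings > 0)  # (user, item) pairs with a rating
    for _ in range(n_epochs):
        for u, i in observed:
            error = ratings[u, i] - user_factors[u] @ item_factors[i]
            # Compute both steps before applying them, with L2 regularization.
            user_step = lr * (error * item_factors[i] - reg * user_factors[u])
            item_step = lr * (error * user_factors[u] - reg * item_factors[i])
            user_factors[u] += user_step
            item_factors[i] += item_step
    return user_factors, item_factors

# Toy ratings (hypothetical): rows are users, columns are items, 0 = not rated.
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

user_factors, item_factors = factorize(ratings)
predicted = user_factors @ item_factors.T  # also fills in the missing cells
mask = ratings > 0
rmse = np.sqrt(np.mean((ratings[mask] - predicted[mask]) ** 2))
```

The cells that were 0 in `ratings` now hold predicted scores in `predicted`, which is what gets ranked to produce recommendations.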

Collaborative Filtering
Advantages
● Works for any kind of item (ignores item attributes)
Drawbacks
● Usually recommends more popular items
● Cold-start
○ Cannot recommend items not already
rated/consumed
○ Needs a minimum number of users to match
similar users

Frameworks - Recommender Systems
Python
Python / Scala
Java
.NET
Java

Content-Based Filtering
Similar content (e.g. actor)
Likes
Recommends

Content-Based Filtering
Advantages
● Does not depend upon other users
● May recommend new and unpopular items
● Recommendations can be easily explained
Drawbacks
● Overspecialization
● May be difficult to extract attributes from audio,
movies or images

Hybrid Recommender Systems
Some approaches...
Composite
Iterates through a chain of algorithms, aggregating
recommendations.
Weighted
Each algorithm has a weight, and the final
recommendations are defined by weighted averages.
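The weighted approach can be sketched as follows, assuming each algorithm returns a score per item on a comparable scale (the function name `weighted_hybrid`, the algorithm names and the scores are hypothetical). An item missing from one algorithm's output simply contributes 0 for that algorithm:

```python
def weighted_hybrid(scores_by_algorithm, weights, top_n=3):
    """Combine per-algorithm item scores with a weighted average."""
    total_weight = sum(weights[name] for name in scores_by_algorithm)
    combined = {}
    for name, scores in scores_by_algorithm.items():
        for item, score in scores.items():
            contribution = weights[name] * score / total_weight
            combined[item] = combined.get(item, 0.0) + contribution
    # Sort by descending score; break ties by item id for determinism.
    ranked = sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]

# Hypothetical scores from two recommenders for items A, B and C.
scores = {
    "collaborative": {"A": 0.9, "B": 0.2, "C": 0.5},
    "content_based": {"A": 0.1, "B": 0.8, "C": 0.5},
}
weights = {"collaborative": 0.75, "content_based": 0.25}
hybrid_ranking = weighted_hybrid(scores, weights)
```

With these weights the collaborative scores dominate, so item A ranks first even though the content-based recommender prefers B.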

Why does Deep Learning have potential for RecSys?
1. Feature extraction directly from the content (e.g., image, text, audio)
Images
● CNN
Text
● CNN
● RNNs
● Weighted word embeddings
Audio/Music
● CNN
● RNN

Why does Deep Learning have potential for RecSys?
2. Heterogeneous data handled easily
3. Dynamic behaviour modeling with RNNs
4. More accurate representation learning of users and items
○ Natural extensions of CF
5. RecSys is a complex domain
○ Deep learning worked well in other complex domains

The Deep Learning era of RecSys
2007: Deep Boltzmann Machines for rating prediction
(calm before the storm)
2015: A few seminal papers
2016: First DLRS workshop and papers on RecSys, KDD, SIGIR
2017-2019: Continued increase
Advances in DL-RecSys
And their combinations...

News Recommender Systems
The majority of web traffic (TREVISIOL et al., 2014b)

Challenges
1. Streaming clicks and news articles
2. Most users are anonymous
3. Users’ preferences shift
4. Accelerated relevance decay
Percentile of clicks    Article age
10%                     up to 4 hours
25%                     up to 5 hours
50% (Median)            up to 8 hours
75%                     up to 14 hours
90%                     up to 26 hours

Factors affecting news relevance
News article
● News static properties: Topics, Entities, Publisher
● News dynamic properties: Recency, Popularity
User
● User current context: Time, Location, Device, Referrer
● User interests: Long-term interests, Short-term interests
Global factors
● Seasonality, Breaking events, Popular Topics

News session-based recommender overview
[Diagram: User session clicks (C1, C2, C3, C4) → Next-click prediction model → Ranked articles (Article B, Article A, Article C, Article D, ...)]
Candidates: Recommendable articles (sample)

CHAMELEON: A Deep Learning
Meta-Architecture for News
Recommendation

CHAMELEON Meta-Architecture for News RS
When a news article is published...
Article Content Representation (ACR) module
● Input: News Article Text as a word embeddings sequence (e.g. "New York is a multicultural city, ...")
● Sub-modules: Textual Features Representation (TFR), Metadata Prediction (MP)
● Article Metadata Attributes: Category, Tags, Entities
● Output: Article Content Embeddings
When a user reads a news article...
Next-Article Recommendation (NAR) module
● Inputs: active user session (active article, candidate next articles (positive and neg.)), Article Content Embedding, Article context (Popularity, Recency), User context (Time, Location, Device), User interaction, Users Past Sessions (past read articles)
● Sub-modules: Contextual Article Representation (CAR), Session Representation (SR), Recommendations Ranking (RR)
● Embeddings: User-Personalized Contextual Article Embedding, Predicted Next-Article Embedding
● Data repositories: Article Content Embeddings, Article Context, Active Sessions, Users Past Sessions
● Output: Recommended articles
Legend: Module, Sub-Module, Embedding, Input, Output, Data repository, Attributes

CHAMELEON - ACR module
Input: News Article Text (Word Embeddings)
Article Metadata Attributes: Category, Tags, Entities
Output: Article Content Embeddings
[Figure: t-SNE visualization of article embeddings colored by category (e.g. “Sports”, “Politics”), with similar articles highlighted]

CHAMELEON - NAR module
[Diagram: NAR module inputs (Article Content Embeddings, Article context, User context, User interaction, Users Past Sessions, Active Sessions) and output (Recommended articles)]
Sessions in a batch
Session 1: I1,1 I1,2 I1,3 I1,4 I1,5
Session 2: I2,1 I2,2
Session 3: I3,1 I3,2 I3,3
Input (session 1): I1,1 I1,2 I1,3 I1,4
Expected Output (labels): I1,2 I1,3 I1,4 I1,5
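The input/label layout above amounts to shifting each session by one click. A minimal sketch (the helper name `session_to_training_pairs` is hypothetical, and padding of sessions to a common length is left out):

```python
def session_to_training_pairs(session_clicks):
    """Return (inputs, labels): each input click is paired with the next click."""
    inputs = session_clicks[:-1]   # every click except the last one
    labels = session_clicks[1:]    # the click that followed each input
    return inputs, labels

# The three sessions shown in the batch.
batch = [
    ["I1,1", "I1,2", "I1,3", "I1,4", "I1,5"],
    ["I2,1", "I2,2"],
    ["I3,1", "I3,2", "I3,3"],
]
pairs = [session_to_training_pairs(session) for session in batch]
```

A session with n clicks yields n - 1 training steps, so the two-click session contributes a single (input, label) pair.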

Sampling strategy
Samples
● Positive sample: Next article read by the user in their session
● Negative samples: Articles read by other users in the last hour
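A minimal sketch of this sampling strategy, assuming clicks are available as (timestamp in seconds, user id, article id) tuples. The function name `sample_negatives`, the log layout and the exclusion of articles already in the user's session are assumptions made for illustration:

```python
import random

ONE_HOUR = 3600  # seconds

def sample_negatives(click_log, user_id, session_articles, now, n_samples, rng):
    """Sample articles read by other users within the last hour."""
    window_start = now - ONE_HOUR
    candidates = sorted({
        article_id
        for timestamp, other_user, article_id in click_log
        if window_start <= timestamp <= now
        and other_user != user_id
        # Assumption: never use an article from the user's own session as a negative.
        and article_id not in session_articles
    })
    return rng.sample(candidates, min(n_samples, len(candidates)))

# Hypothetical click log: (timestamp, user, article).
click_log = [
    (200, "u2", "old"),   # older than one hour, ignored
    (1000, "u1", "a"),
    (2000, "u2", "b"),
    (3000, "u3", "c"),
    (3500, "u1", "d"),
    (3900, "u4", "a"),    # already in the active user's session, ignored
]
negatives = sample_negatives(
    click_log, user_id="u1", session_articles={"a", "d"},
    now=4000, n_samples=5, rng=random.Random(42),
)
```

The positive sample needs no sampling: it is the label produced by shifting the session by one click.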

Recommendations Ranking (RR) sub-module
Relevance Score of an item for a user session
Output: Recommended articles

CHAMELEON - Ranking loss function
Cosine similarity-based loss function implemented in TensorFlow
Components: Cosine similarity; Softmax over Relevance Score; Loss function
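The three components can be sketched in NumPy (not the TensorFlow code from the talk). This assumes the relevance score is the cosine similarity between the predicted next-article embedding and each candidate embedding, and that the loss is the negative log of the softmax probability of the positive candidate. The scaling factor `gamma` and all function names are assumptions:

```python
import numpy as np

def cosine_similarity(session_vec, article_vecs):
    """Cosine similarity between one vector and each row of a matrix."""
    session_unit = session_vec / np.linalg.norm(session_vec)
    article_units = article_vecs / np.linalg.norm(article_vecs, axis=1, keepdims=True)
    return article_units @ session_unit

def ranking_loss(predicted_embedding, candidate_embeddings, positive_index=0, gamma=1.0):
    """Negative log-softmax of the positive candidate's relevance score."""
    scores = gamma * cosine_similarity(predicted_embedding, candidate_embeddings)
    scores = scores - scores.max()                      # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())   # log-softmax
    return -log_probs[positive_index]

predicted = np.array([1.0, 0.0])
candidates = np.array([
    [1.0, 0.0],    # same direction as the prediction (cosine = 1)
    [0.0, 1.0],    # orthogonal (cosine = 0)
    [-1.0, 0.0],   # opposite direction (cosine = -1)
])
loss_if_first_is_positive = ranking_loss(predicted, candidates, positive_index=0)
loss_if_last_is_positive = ranking_loss(predicted, candidates, positive_index=2)
```

The loss is low when the positive article is the candidate closest to the predicted embedding and grows as the negatives score higher than it.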

CHAMELEON
Architecture Instantiations

An architecture instantiation of CHAMELEON (1D CNN and LSTM)
Inputs
● News Article Text (Word Embeddings)
● Target Article Metadata Attributes: Category
● Active user session: active article, candidate articles (positive and neg.)
● Article Content Embedding
● Article context: Popularity, Recency
● User context: Platform, Device Type
● User interaction: past read articles
Layers
● Convolutional Neural Network (CNN): conv-3 (128), conv-4 (128) and conv-5 (128), each followed by max-pooling
● Fully Connected (4 layers shown)
● LSTM
Output: Recommended articles

CHAMELEON Instantiation - Implementation
● CHAMELEON’s instantiations are implemented using TensorFlow
https://github.com/gabrielspmoreira/chameleon_recsys
● Training and evaluation were performed on Google Cloud Platform ML Engine

Experiments - Dataset
● Provided by Globo.com (G1), the most popular news portal in Brazil
● Sample from Oct. 1 to 16, 2017, with over 3 M clicks, distributed in 1.2 M
sessions from 330 K users, who read over 50 K unique news articles
https://www.kaggle.com/gspmoreira/news-portal-user-interactions-by-globocom

ACR module training
Trained on a dataset with 364 K articles from 461 categories to generate the
Article Content Embeddings (vectors with 250 dimensions)
[Figure: t-SNE visualization of trained Article Content Embeddings (from top 15 categories)]
[Figure: Distribution of articles by the top 200 categories]

Recommendation evaluation
Task: For each item within a session, predict the next-clicked item from a set
composed of the positive sample (correct article) and 50 negative samples.
Accuracy Metrics:
● HitRate@10 - Checks whether the positive item is among the top-10 ranked items
● MRR@10 - Ranking metric that assigns higher scores when the positive item appears at top ranks
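Both metrics can be computed from the rank of the positive item among the 51 candidates. A minimal sketch using the standard definitions (the function names are hypothetical, and ties are resolved in favor of the positive item, which is an assumption):

```python
def rank_of_positive(scores, positive_index=0):
    """1-based rank of the positive item when sorted by descending score."""
    positive_score = scores[positive_index]
    return 1 + sum(1 for score in scores if score > positive_score)

def hit_rate_at_k(ranks, k=10):
    """Fraction of predictions where the positive item is in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)

def mrr_at_k(ranks, k=10):
    """Mean reciprocal rank, counting 0 when the positive item is below rank k."""
    return sum(1.0 / rank if rank <= k else 0.0 for rank in ranks) / len(ranks)

# Hypothetical ranks of the positive item in four next-click predictions.
ranks = [1, 2, 11, 5]
hit_rate = hit_rate_at_k(ranks)   # 3 of 4 positives are in the top 10
mrr = mrr_at_k(ranks)             # (1 + 1/2 + 0 + 1/5) / 4
example_rank = rank_of_positive([0.9, 0.5, 0.95], positive_index=0)
```

HitRate@10 only checks membership in the top 10, while MRR@10 also rewards placing the positive item closer to the top.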

Benchmark methods for session-based recommendations:
Frequent patterns methods
1. Co-occurrent - Recommends articles commonly viewed together with the last read article in
other user sessions (a simplified version of the association rules technique, with a maximum rule
size of two) (Jugovac, 2018) (Ludewig, 2018)
2. Sequential Rules (SR) - A more sophisticated version of association rules, which considers the
sequence of clicked items within the session. A rule is created when an item q appears after an
item p in a session, even when other items were viewed between p and q. The rules are
weighted by the distance x (number of steps) between p and q in the session, using a linear
weighting function (Ludewig, 2018)
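A minimal sketch of Sequential Rules as described above. The exact linear weighting function is not given in the slides, so `linear_weight` below (weight 1.0 at distance 1, decreasing by 1/`max_distance` per extra step) is an assumption, as are the function names and toy sessions:

```python
from collections import defaultdict

def linear_weight(distance, max_distance=10):
    """Assumed linear decay: 1.0 at distance 1, reaching 0 beyond max_distance."""
    return max(0.0, 1.0 - (distance - 1) / max_distance)

def build_sequential_rules(sessions, max_distance=10):
    """rules[p][q] accumulates the weight of 'q was clicked after p'."""
    rules = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for q_pos, q in enumerate(session):
            for p_pos in range(max(0, q_pos - max_distance), q_pos):
                p = session[p_pos]
                if p != q:
                    rules[p][q] += linear_weight(q_pos - p_pos, max_distance)
    return rules

def recommend(rules, last_item, top_n=3):
    """Rank items by accumulated rule weight for the last clicked item."""
    ranked = sorted(rules[last_item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical sessions of clicked article ids.
sessions = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
rules = build_sequential_rules(sessions)
after_a = recommend(rules, "a")
```

Here the rule a → c receives weight 0.9 from the first session (one item in between) and 1.0 from each of the other two, so c outranks b after a click on a.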

Baseline methods for session-based recommendations:
KNN methods
4. Item-kNN - Returns the items most similar to the last read article, in terms of the cosine similarity
between their session vectors, i.e. the number of co-occurrences of two items in sessions
divided by the square root of the product of the numbers of sessions in which the individual items
occur.
5. Vector Multiplication Session-Based kNN (V-SkNN) - Compares the entire active session with
past sessions and finds items to be recommended. The comparison emphasizes items more recently
clicked within the session when computing the similarities with past sessions (Jannach, 2017)
(Jugovac, 2018) (Ludewig, 2018)
Other baselines
6. Recently Popular - Recommends the most viewed articles from the last N clicks buffer
7. Content-Based - For each article read by the user, recommends similar articles based on the
cosine similarity of their Article Content Embeddings, from the last N clicks buffer.
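The Item-kNN similarity described above can be sketched directly from its definition: co-occurrence count divided by the square root of the product of the per-item session counts. Function names and toy sessions are hypothetical:

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def item_knn_similarities(sessions):
    """Cosine similarity between items, based on the sessions they appear in."""
    session_counts = defaultdict(int)   # number of sessions containing each item
    cooccurrences = defaultdict(int)    # number of sessions containing both items
    for session in sessions:
        items = sorted(set(session))    # count each item once per session
        for item in items:
            session_counts[item] += 1
        for a, b in combinations(items, 2):
            cooccurrences[(a, b)] += 1
    similarities = defaultdict(dict)
    for (a, b), count in cooccurrences.items():
        similarity = count / sqrt(session_counts[a] * session_counts[b])
        similarities[a][b] = similarity
        similarities[b][a] = similarity
    return similarities

def most_similar(similarities, last_item, top_n=3):
    """Items most similar to the last read article."""
    neighbors = similarities.get(last_item, {})
    ranked = sorted(neighbors.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical sessions of read article ids.
sessions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "d"]]
similarities = item_knn_similarities(sessions)
neighbors_of_a = most_similar(similarities, "a")
```

Items a and b co-occur in 2 sessions and each appears in 3, giving 2 / sqrt(3 * 3) = 2/3, while a and c give 2 / sqrt(3 * 2), so c is the nearest neighbor of a.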

Continuous training and evaluation over 16 days (Oct. 1-16, 2017)
[Figure: Average MRR@10 by hour (evaluation every 5 hours)]

Other recommendation quality factors
Item Coverage Novelty Diversity

Balancing conflicting objectives

References
https://arxiv.org/abs/1904.10367
https://arxiv.org/abs/1808.00076

References
https://www.infoq.com/br/presentations/deep-recommender-systems/

CI&T Deskdrop dataset on Kaggle!
https://www.kaggle.com/gspmoreira/articles-sharing-reading-from-cit-deskdrop
● 12 months of logs
(Mar. 2016 - Feb. 2017)
● ~73k logged user interactions
● ~3k public articles shared on
the platform.
Recommender Systems in Python 101
https://www.kaggle.com/gspmoreira/recommender-systems-in-python-101

Questions?
TDC 2019
Gabriel Moreira
Lead Data Scientist
gabrielpm@ciandt.com
@gspmoreira
