An Open Framework for
Multi-source, Cross-domain Personalisation with
Semantic Interest Graphs
Benjamin Heitmann
Ph.D. Viva
Monday, 28 July 2014
Personalisation has become a commodity
• Personalisation has become an expected feature:
  • 75% of consumers prefer personalised E-Commerce retailers
  • 94% of companies view personalisation as critical to business performance
• Examples: Amazon, Last.fm, Facebook
2 Motivation
Architecture of recommender systems:
closed versus open inventory
Main research problem:
• How to enable cross-domain personalisation without proprietary and closed infrastructure and algorithms?
State-of-the-art limitations:
Collaborative Filtering
Research problems:
• Provide cross-domain recommendations without overlap?
• Cold-start problem?
Definitions
• Definition of a domain: any set of recommendable items + set of users + preferences between users and items
• Source domain: a domain with non-empty preferences
• Target domain: a domain with recommendable items
• Cross-domain personalisation task:
  • Using preferences in source domain to provide recommendations in different target domain
  • No overlap between source and target domain
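The definitions above can be written down as a small data structure. This is a minimal sketch, not the framework's actual model: the class name `Domain`, the helper `is_cross_domain_task`, and the reading of "no overlap" as "no shared items" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A domain: recommendable items, users, and user-item preferences."""
    name: str
    items: set = field(default_factory=set)
    users: set = field(default_factory=set)
    # Maps (user, item) to a rating or implicit preference weight.
    preferences: dict = field(default_factory=dict)

    def is_source(self) -> bool:
        # A source domain has non-empty preferences.
        return len(self.preferences) > 0

    def is_target(self) -> bool:
        # A target domain has recommendable items.
        return len(self.items) > 0


def is_cross_domain_task(source: Domain, target: Domain) -> bool:
    """Check the task definition, assuming 'no overlap' means no shared items."""
    return (
        source.is_source()
        and target.is_target()
        and source.items.isdisjoint(target.items)
    )


# Hypothetical example data, loosely following the music/travel illustration.
music = Domain(
    "music",
    items={"Johnny Cash", "Metallica"},
    users={"alice"},
    preferences={("alice", "Johnny Cash"): 5},
)
travel = Domain("travel", items={"Kyoto", "New York"})

print(is_cross_domain_task(music, travel))
```

Note that the target domain needs no preferences at all here, which matches the research question about recommending without ratings in the target domain.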
State-of-the-art limitations:
Content-based filtering
General requirement:
Data with links between different domains?
Research question:
Cross-domain recommendations without ratings in target domain?
[Figure: content-based filtering within single domains. Music: Garth Brooks, Johnny Cash, Iron Maiden, Metallica, connected by "similar" links. Books: Catch 22, Harry Potter 1. Travel: Kyoto, New York. Links between the domains are marked "?".]
Enabling technology for cross-domain
personalisation: Linked Open Data (LOD)
LOD can enable cross-domain personalisation:
1. Provides re-usable concept identifiers
2. Cross-domain links for many different domains
3. Standard for interoperable graph data
Research question:
Best practices for LOD recommender systems?
[Figure: Linked Open Data cloud diagram, as of September 2011. Legend categories: Media, Geographic, Publications, Government, Cross-domain, Life sciences, User-generated content.]
Overview of approach
• Open framework for cross-domain personalisation
  1. Conceptual architecture for recommender systems using Linked Open Data
  2. Cross-domain personalisation approach using RDF and Linked Data
• Prototype implementation based on the framework
[Figure: multi-source user profiles with preferences from multiple domains; the cross-domain recommendation algorithm (SemStim) uses DBpedia as background knowledge; recommendations for target domains. Example domains shown: travel destinations, movies.]
Conceptual Architecture for LOD recommender systems: Methodology
• Goal:
  • Identify best practices
  • List most common components
  • Enable recommender systems to use Linked Data
• Methodology with strong empirical grounding:
  1. Survey of 124 RDF-based applications (2003 to 2009)
     • 15 questions
     • Original authors were contacted to verify or correct our assessment
  2. Architectural analysis to identify common components
  3. Extend proposed architecture for recommender systems
9 Conceptual architecture for LOD recommender systems
Conceptual Architecture for LOD
recommender systems
Cross-domain algorithm: SemStim
• Requirements:
  • Graph algorithm
  • Graph search between two domains
• SemStim extends Spreading Activation:
  • Adds targeted activation
  • Adds constraints for algorithm duration
[Figure: example of spreading activation over DBpedia, from the user profile towards recommendable items. Annotations: user profile, recommendable items, start of spreading activation. Nodes: The Hitchhiker's Guide to the Galaxy (novel), The Restaurant at the End of the Universe, Douglas Adams, Kurt Vonnegut, Richard Dawkins, Atheism Activists, Cambridge, United Kingdom, Macmillan. Edge labels: dc:subject, author, subsequentWork, influencedBy, publisher, birthplace, subdivisionName, country.]
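To make the idea of spreading activation with a target domain and a bounded duration more concrete, here is a minimal sketch. It is not the SemStim implementation: the function name, the parameters `threshold`, `max_waves` and `wanted`, the unit activation per edge, and the toy graph (node names borrowed from the figure, edges invented for illustration) are all assumptions.

```python
from collections import defaultdict


def spread_activation(graph, seeds, targets, threshold=1.0, max_waves=5, wanted=2):
    """Spread activation from seed nodes until enough target nodes have fired.

    graph:   dict mapping a node to a list of neighbouring nodes
    seeds:   nodes taken from the user profile (source domain)
    targets: recommendable nodes (target domain)
    """
    activation = defaultdict(float)
    fired = set(seeds)        # nodes that have already passed activation on
    frontier = list(seeds)
    recommended = []

    for _ in range(max_waves):            # duration constraint: limit on waves
        next_frontier = []
        for node in frontier:
            for neighbour in graph.get(node, []):
                if neighbour in fired:
                    continue
                activation[neighbour] += 1.0   # assumed: one unit per edge
                if activation[neighbour] >= threshold:
                    fired.add(neighbour)
                    next_frontier.append(neighbour)
                    if neighbour in targets:   # targeted activation
                        recommended.append(neighbour)
        # Duration constraint: stop once enough targets are activated.
        if len(recommended) >= wanted or not next_frontier:
            break
        frontier = next_frontier
    return recommended[:wanted]


# Toy graph; the edges are illustrative and not taken from DBpedia.
graph = {
    "Hitchhiker's Guide": ["Douglas Adams", "Restaurant at the End of the Universe"],
    "Douglas Adams": ["Hitchhiker's Guide", "Restaurant at the End of the Universe",
                      "Kurt Vonnegut", "Richard Dawkins", "Cambridge"],
    "Restaurant at the End of the Universe": ["Hitchhiker's Guide", "Douglas Adams"],
    "Kurt Vonnegut": ["Douglas Adams"],
    "Richard Dawkins": ["Douglas Adams"],
    "Cambridge": ["Douglas Adams", "United Kingdom"],
    "United Kingdom": ["Cambridge"],
}
targets = {"Restaurant at the End of the Universe", "Kurt Vonnegut"}

result = spread_activation(graph, ["Hitchhiker's Guide"], targets)
print(result)
```

In this sketch a higher `threshold` means a node must be reached from several already-active neighbours before it fires, so fewer nodes are activated per wave.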
SemStim evaluation
Evaluation: Objectives
1. Can SemStim provide single-domain recommendations?
2. Can SemStim provide cross-domain recommendations?
3. How diverse are the SemStim recommendations?
4. Is there a connection between accuracy and diversity?
Evaluation: comparison algorithms
• Algorithms for comparison:
  • k-nn Collaborative Filtering
  • SVD++ Collaborative Filtering
  • Random selection
  • Linked Data Semantic Distance (LDSD)
  • Set-based breadth first search (SetBFS)
• Background knowledge: DBpedia 3.8 (67m edges, 11m vertices)
Single-domain accuracy experiment protocol
• Data set: MovieLens 100k
• Metrics: precision, recall, F1-score
• Experiment protocol:
  • Adapted from Cremonesi
  • Top-k recommendation task
  • 90%/10% train/probe split
  • Test profile: highly rated items in probe set plus random items
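The three metrics named above can be computed for a top-k list as follows. This is a generic sketch of precision, recall and F1 at k; the function name and the example data are hypothetical, and the exact evaluation code used for the experiments is not given in the slides.

```python
def precision_recall_f1(recommended, relevant, k):
    """Precision, recall and F1-score for the first k recommendations.

    recommended: ranked list of item identifiers
    relevant:    set of items the user actually rated highly
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ranked list and relevant set for one user.
ranked = ["a", "b", "c", "d"]
liked = {"a", "c", "e"}
p, r, f1 = precision_recall_f1(ranked, liked, k=2)
print(p, r, f1)
```

Averaging these values over all test users for each k gives curves of the kind plotted in the results slides.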
Single-domain accuracy experiment: results
[Figure: F1-score (0.00 to 0.15) against number of recommendations (0 to 20) for CFknn, LDSD, Random, SemStim, SetBFS and SVD++.]
Cross-domain accuracy experiment protocol
• Data set: Amazon SNAP
  • Ratings from users with at least 20 ratings in two domains
• Metrics: precision, recall, F1-score
• Experiment protocol:
  • Source domain provides train profile
  • Target domain provides test profile
• CF algorithms unsuitable due to high sparsity
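The protocol above can be sketched as a filtering step over rating records. This is an illustration only: the tuple layout `(user, item, domain)`, the function name, and the reading of the selection rule as "at least 20 ratings in each of the two domains" are assumptions.

```python
from collections import defaultdict


def cross_domain_profiles(ratings, source, target, min_ratings=20):
    """Build train/test profiles for users active in both domains.

    ratings: iterable of (user, item, domain) tuples
    Returns a dict: user -> {"train": source items, "test": target items}
    """
    per_user = defaultdict(lambda: defaultdict(set))
    for user, item, domain in ratings:
        per_user[user][domain].add(item)

    profiles = {}
    for user, by_domain in per_user.items():
        src = by_domain.get(source, set())
        tgt = by_domain.get(target, set())
        # Keep only users with enough ratings in both domains.
        if len(src) >= min_ratings and len(tgt) >= min_ratings:
            profiles[user] = {"train": src, "test": tgt}
    return profiles


# Hypothetical toy data, with a lowered threshold so the example stays short.
ratings = [
    ("u1", "dvd1", "DVD"), ("u1", "dvd2", "DVD"),
    ("u1", "cd1", "Music"), ("u1", "cd2", "Music"),
    ("u2", "dvd1", "DVD"), ("u2", "dvd3", "DVD"),
    ("u2", "cd1", "Music"),
]
profiles = cross_domain_profiles(ratings, source="DVD", target="Music", min_ratings=2)
print(sorted(profiles))
```

The recommender only sees the "train" items from the source domain; the "test" items from the target domain are used to score its recommendations.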
Cross-domain accuracy experiment: DVDs >> Music
[Figure: F1-score (0.000 to 0.015) against number of recommendations (0 to 30) for LDSD, Random, SemStim and SetBFS.]
Cross-domain accuracy experiment: Music >> DVDs
[Figure: F1-score (0.00 to 0.02) against number of recommendations (0 to 30) for LDSD, Random, SemStim and SetBFS.]
Single-domain diversity experiment
• Data set: MovieLens 100k
• Experiment protocol:
  • 95%/5% train/test split
• Results:
  • Diversity can be tuned
  • Requires using all preferences (incl. negative)
[Figure: diversity (0.00 to 1.00) for CFknn50, SVD++, Random, LDSD and SetBFS, and for SemStim at activation thresholds 0.1 to 1.5. Annotation: from less diverse to more diverse with increasing activation threshold.]
LOD-RecSys challenge at ESWC 2014: Diversity recommendation task
• Data set: DBbook
• Metrics:
  • F1-score @20 & Inter-List Diversity @20
  • Ranking based on average rank for both metrics
• Diversity rec. task:
  • Recommend top-20 of all unrated items for each user
• Implementation challenge:
  • Real-time result submission
  • Hidden ground-truth
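Ranking by the average of two per-metric ranks can be sketched as below. The team names and scores are invented for illustration, higher is assumed to be better for both metrics, and tie handling is left out; the challenge's actual scoring code is not shown in the slides.

```python
def average_rank(scores):
    """Order teams by the mean of their F1 rank and their diversity rank.

    scores: dict mapping team -> (f1_at_20, diversity_at_20)
    Returns a list of (average_rank, team), best first.
    """
    def ranks(metric_index):
        ordered = sorted(scores, key=lambda t: scores[t][metric_index], reverse=True)
        return {team: position for position, team in enumerate(ordered, start=1)}

    f1_rank = ranks(0)
    diversity_rank = ranks(1)
    return sorted(((f1_rank[t] + diversity_rank[t]) / 2, t) for t in scores)


# Hypothetical scores for three teams.
scores = {
    "A": (0.06, 0.46),
    "B": (0.05, 0.48),
    "C": (0.04, 0.47),
}
leaderboard = average_rank(scores)
print(leaderboard)
```

In this toy example the most accurate team does not win, because it is also the least diverse; a system that balances both metrics ends up ahead.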
LOD-RecSys challenge at ESWC 2014: Diversity recommendation task
• Results:
  • 3rd place out of 12 teams
  • Competitive performance
  • SemStim unbiased
  • Can balance accuracy and diversity
[Figure: two panels, F1-score@20 (0.03 to 0.06) and InterListDiversity@20 (0.465 to 0.480), each against activation threshold (0.3 to 1.0), with the best rank marked in each panel.]
ADVANSSE prototype
• Outcome of collaboration project with CISCO Galway
• Goals:
  • Show relevance to real-world, industry use case
  • Implement cross-domain personalisation framework
  • Instantiate conceptual architecture for LOD recommender systems
  • Provide distributed and open ecosystem for cross-domain personalisation
22 ADVANSSE prototype
ADVANSSE use case
Functional requirements:
1. Filtering of subscriptions
2. Recommendation of posts
3. Updating of interests and recommendations
[Figure: use case illustration showing the departments Marketing, Development and R & D.]
ADVANSSE distributed social platform
[Figure: architecture diagram. ADVANSSE server: RDF store, XMPP server, personalisation component. ADVANSSE connected social platforms (1) and (2): XMPP client, application logic, RDF store, structured data authoring interface, user interface. Further labels: data homogenisation service, graph query language service, XMPP connections, and the users Andrew, Bob and Cecilia.]
ADVANSSE prototype: user interface
Summary of contributions
• Conceptual architecture:
  • Describes best practices for leveraging LOD for recommender systems
  • List of high-level components
  • Strong empirical grounding
• Cross-domain recommendation approach using SemStim
  • Can provide single-domain and cross-domain recommendations
  • No overlap between source & target domain required
  • No ratings in target domain required
  • Competitive performance
  • Diversity of recommendations can be tuned
• ADVANSSE prototype:
  • Based on real-world use case
  • Shows how to use LOD to enable an ecosystem for cross-domain personalisation
26 Conclusion
Future work
• Investigate connection between performance of SemStim and choice of target and source domains
• Learning of weights for different edge types
• Improving the quality of linkage data
Dissemination
• In top-3 for Diversity task at the LOD-RecSys challenge, ESWC 2014
• Publications:
  • 2 book chapters
  • 1 journal paper
  • 2 conference papers
  • 2 workshop papers
  • 1 conference poster
• ADVANSSE web site: http://advansse.deri.ie
Extra graphs and data / details
Motivation: New requirements for recommender systems
• Architecture of real-world RecSys has changed:
  • Shift from closed to open inventories
  • Emergence of ecosystems to share user preference data
• New requirements for recommender systems:
  • Multi-source profiles
  • Domain-neutral preferences
  • Cross-domain personalisation
• Existing infrastructure and algorithms are proprietary and closed
Diversity test for Amazon SNAP DVD data
[Figure: F1-score (0.0 to 0.3) against value of topK (0 to 20) for CFknn50, LDSD, Random, SemStim02, SemStim03, SemStim04, SetBFS and SVDpp.]
Examples of cross-domain
recommendations
ADVANSSE prototype: Implementing domain-neutral user profiles
• Domain-neutral user profiles implemented using CISCO ERT schema
• Content extracted from 3 sites on StackExchange
• Integrated with DBpedia background knowledge
• Data storage:
  • Jena TDB
  • HDT triple store
[Figure: RDF graph of the ADVANSSE data model. Resources: advansse:Question1 (rdf:type sioc:Post and advansse:Question), advansse:Answer1 (rdf:type sioc:Post and advansse:Answer), advansse:User1 (rdf:type sioc:UserAccount), advansse:Tag1 (rdf:type ctag:Tag). Properties: dc:title (Title), dc:description (Post body, Answer body), dc:creator, sioc:name (Display Name), ert:hasTopic, ert:interestedIn, ctag:label (Tag String), ctag:means (http://dbpedia.org/resource/Entity), advansse:hasAnswer, advansse:hasQuestion.]
Namespaces:
sioc - http://rdfs.org/sioc/ns#
ert - http://www.cisco.com/ert/
advansse - http://advansse.uimr.deri.ie/demo#
rdf - http://www.w3.org/1999/02/22-rdf-syntax-ns#
dc - http://purl.org/dc/elements/1.1/
ctag - http://commontag.org/ns#
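As a rough illustration of how such a domain-neutral profile links a user to DBpedia concepts, the sketch below stores a few triples as plain Python tuples and follows two properties. The subjects and objects of `ert:hasTopic` and `ert:interestedIn` are assumptions about the diagram, the helper names are hypothetical, and the prototype itself used an RDF store rather than an in-memory list.

```python
DBPEDIA = "http://dbpedia.org/resource/"

# (subject, predicate, object) triples using the prefixes from the slide.
# Which resources each property connects is assumed for this example.
triples = [
    ("advansse:Question1", "rdf:type", "sioc:Post"),
    ("advansse:Question1", "rdf:type", "advansse:Question"),
    ("advansse:Question1", "dc:creator", "advansse:User1"),
    ("advansse:Question1", "ert:hasTopic", "advansse:Tag1"),
    ("advansse:User1", "rdf:type", "sioc:UserAccount"),
    ("advansse:User1", "ert:interestedIn", "advansse:Tag1"),
    ("advansse:Tag1", "rdf:type", "ctag:Tag"),
    ("advansse:Tag1", "ctag:label", "Tag String"),
    ("advansse:Tag1", "ctag:means", DBPEDIA + "Entity"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def interest_concepts(user):
    """Follow ert:interestedIn, then ctag:means, to reach DBpedia URIs."""
    return [
        uri
        for tag in objects(user, "ert:interestedIn")
        for uri in objects(tag, "ctag:means")
    ]


concepts = interest_concepts("advansse:User1")
print(concepts)
```

Because the interests resolve to DBpedia URIs rather than to site-specific tags, the same profile could seed a graph-based recommender in any domain that DBpedia links to.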
Ad

Recommended

A new direction for recommender systems: balancing privacy and personalisation
A new direction for recommender systems: balancing privacy and personalisation
Benjamin Heitmann
 
Data Mining and Recommendation Systems
Data Mining and Recommendation Systems
Salil Navgire
 
Increasing transparency in Medical Education through Open Data
Increasing transparency in Medical Education through Open Data
Rebecca Grant
 
Research in the time of Covid: Surveying impacts on Early Career Researchers
Research in the time of Covid: Surveying impacts on Early Career Researchers
Rebecca Grant
 
Do Open data badges influence author behaviour? A case study at Springer Nature
Do Open data badges influence author behaviour? A case study at Springer Nature
Rebecca Grant
 
Customer to Customer recommendation system
Customer to Customer recommendation system
sksaif95
 
Recommender system
Recommender system
Nilotpal Pramanik
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}([email protected]
HABIB FIGA GUYE {BULE HORA UNIVERSITY}([email protected]
HABIB FIGA GUYE
 
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC
 
Managing Ireland's Research Data - 3 Research Methods
Managing Ireland's Research Data - 3 Research Methods
Rebecca Grant
 
Collaborative filtering
Collaborative filtering
Kishor Datta Gupta
 
Retail products - machine learning recommendation engine
Retail products - machine learning recommendation engine
hkbhadraa
 
Overview of recommender system
Overview of recommender system
Stanley Wang
 
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Krishnaram Kenthapadi
 
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
Cataldo Musto
 
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Krishnaram Kenthapadi
 
Personalizing the web building effective recommender systems
Personalizing the web building effective recommender systems
Aravindharamanan S
 
Recommender system and big data (design a smartphone recommender system based...
Recommender system and big data (design a smartphone recommender system based...
Siwar Abidi
 
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
Micah Altman
 
Multi Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - Overview
Davide Giannico
 
Recommender Systems and Linked Open Data
Recommender Systems and Linked Open Data
Polytechnic University of Bari
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010
Arjen de Vries
 
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Matthew Rowe
 
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2 Creating Knowledge out of Interlinked Data
 
Reasesrty djhjan S - explanation required.pptx
Reasesrty djhjan S - explanation required.pptx
AnkitaVerma776806
 
Recsys 2016
Recsys 2016
Mindaugas Zickus
 
Content based recommendation systems
Content based recommendation systems
Aravindharamanan S
 
Personalised Access to Linked Data
Personalised Access to Linked Data
Milan Dojchinovski
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
Ioan Toma
 
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
Lisa Roth, PMP
 

More Related Content

What's hot (12)

PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC
 
Managing Ireland's Research Data - 3 Research Methods
Managing Ireland's Research Data - 3 Research Methods
Rebecca Grant
 
Collaborative filtering
Collaborative filtering
Kishor Datta Gupta
 
Retail products - machine learning recommendation engine
Retail products - machine learning recommendation engine
hkbhadraa
 
Overview of recommender system
Overview of recommender system
Stanley Wang
 
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Krishnaram Kenthapadi
 
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
Cataldo Musto
 
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Krishnaram Kenthapadi
 
Personalizing the web building effective recommender systems
Personalizing the web building effective recommender systems
Aravindharamanan S
 
Recommender system and big data (design a smartphone recommender system based...
Recommender system and big data (design a smartphone recommender system based...
Siwar Abidi
 
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
Micah Altman
 
Multi Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - Overview
Davide Giannico
 
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC
 
Managing Ireland's Research Data - 3 Research Methods
Managing Ireland's Research Data - 3 Research Methods
Rebecca Grant
 
Retail products - machine learning recommendation engine
Retail products - machine learning recommendation engine
hkbhadraa
 
Overview of recommender system
Overview of recommender system
Stanley Wang
 
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)
Krishnaram Kenthapadi
 
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
A Multi-Criteria Recommender System Exploiting Aspect-based Sentiment Analysi...
Cataldo Musto
 
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Privacy-preserving Data Mining in Industry: Practical Challenges and Lessons ...
Krishnaram Kenthapadi
 
Personalizing the web building effective recommender systems
Personalizing the web building effective recommender systems
Aravindharamanan S
 
Recommender system and big data (design a smartphone recommender system based...
Recommender system and big data (design a smartphone recommender system based...
Siwar Abidi
 
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
Micah Altman
 
Multi Criteria Recommender Systems - Overview
Multi Criteria Recommender Systems - Overview
Davide Giannico
 

Similar to Benjamin Heitmann, PhD defence talk: An Open Framework for Multi-source, Cross-domain Personalisation with Semantic Interest Graphs (20)

Recommender Systems and Linked Open Data
Recommender Systems and Linked Open Data
Polytechnic University of Bari
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010
Arjen de Vries
 
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Matthew Rowe
 
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2 Creating Knowledge out of Interlinked Data
 
Reasesrty djhjan S - explanation required.pptx
Reasesrty djhjan S - explanation required.pptx
AnkitaVerma776806
 
Recsys 2016
Recsys 2016
Mindaugas Zickus
 
Content based recommendation systems
Content based recommendation systems
Aravindharamanan S
 
Personalised Access to Linked Data
Personalised Access to Linked Data
Milan Dojchinovski
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
Ioan Toma
 
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
Lisa Roth, PMP
 
Master in Big Data Analytics and Social Mining 20015
Master in Big Data Analytics and Social Mining 20015
Andrea Gigli
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
Mohamed BEN ELLEFI
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptx
dongchangim30
 
WISE2017 - Factorization Machines Leveraging Lightweight Linked Open Data-ena...
WISE2017 - Factorization Machines Leveraging Lightweight Linked Open Data-ena...
GUANGYUAN PIAO
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro Analytics
Navisro Analytics
 
Recommender Systems @ Scale, Big Data Europe Conference 2019
Recommender Systems @ Scale, Big Data Europe Conference 2019
Sonya Liberman
 
The Web of data and web data commons
The Web of data and web data commons
Jesse Wang
 
Keynote @iSWAG2015
Keynote @iSWAG2015
Michele Trevisiol
 
Discovery Hub: on-the-fly linked data exploratory search
Discovery Hub: on-the-fly linked data exploratory search
Fabien Gandon
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010
Arjen de Vries
 
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Transferring Semantic Categories with Vertex Kernels: Recommendations with Se...
Matthew Rowe
 
Reasesrty djhjan S - explanation required.pptx
Reasesrty djhjan S - explanation required.pptx
AnkitaVerma776806
 
Content based recommendation systems
Content based recommendation systems
Aravindharamanan S
 
Personalised Access to Linked Data
Personalised Access to Linked Data
Milan Dojchinovski
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
Ioan Toma
 
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
MongoDB .local London 2019: Using AWS to Transform Customer Data in MongoDB i...
Lisa Roth, PMP
 
Master in Big Data Analytics and Social Mining 20015
Master in Big Data Analytics and Social Mining 20015
Andrea Gigli
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
Mohamed BEN ELLEFI
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptx
dongchangim30
 
WISE2017 - Factorization Machines Leveraging Lightweight Linked Open Data-ena...
WISE2017 - Factorization Machines Leveraging Lightweight Linked Open Data-ena...
GUANGYUAN PIAO
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro Analytics
Navisro Analytics
 
Recommender Systems @ Scale, Big Data Europe Conference 2019
Recommender Systems @ Scale, Big Data Europe Conference 2019
Sonya Liberman
 
The Web of data and web data commons
The Web of data and web data commons
Jesse Wang
 
Discovery Hub: on-the-fly linked data exploratory search
Discovery Hub: on-the-fly linked data exploratory search
Fabien Gandon
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
Ad

More from Benjamin Heitmann (13)

Lessons and requirements from a decade of deployed Semantic Web apps
Lessons and requirements from a decade of deployed Semantic Web apps
Benjamin Heitmann
 
An architecture for privacy-enabled user profile portability on the Web of Data
An architecture for privacy-enabled user profile portability on the Web of Data
Benjamin Heitmann
 
What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...
Benjamin Heitmann
 
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
Benjamin Heitmann
 
Implementing Semantic Web applications: reference architecture and challenges
Implementing Semantic Web applications: reference architecture and challenges
Benjamin Heitmann
 
Representing discourse and argumentation as an application of Web Science
Representing discourse and argumentation as an application of Web Science
Benjamin Heitmann
 
Web Science: Motivation, Goals and Contributions
Web Science: Motivation, Goals and Contributions
Benjamin Heitmann
 
Presentation of current research: distributed architecture for recommendation...
Presentation of current research: distributed architecture for recommendation...
Benjamin Heitmann
 
Lessons learned from Futures Studies: Towards a method for Web Science
Lessons learned from Futures Studies: Towards a method for Web Science
Benjamin Heitmann
 
RDFa: putting RDF on the Web
RDFa: putting RDF on the Web
Benjamin Heitmann
 
Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...
Benjamin Heitmann
 
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Benjamin Heitmann
 
Applying the scientific method in Software Evaluation
Applying the scientific method in Software Evaluation
Benjamin Heitmann
 
Lessons and requirements from a decade of deployed Semantic Web apps
Lessons and requirements from a decade of deployed Semantic Web apps
Benjamin Heitmann
 
An architecture for privacy-enabled user profile portability on the Web of Data
An architecture for privacy-enabled user profile portability on the Web of Data
Benjamin Heitmann
 
What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...
Benjamin Heitmann
 
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
Benjamin Heitmann
 
Implementing Semantic Web applications: reference architecture and challenges
Implementing Semantic Web applications: reference architecture and challenges
Benjamin Heitmann
 
Representing discourse and argumentation as an application of Web Science
Representing discourse and argumentation as an application of Web Science
Benjamin Heitmann
 
Web Science: Motivation, Goals and Contributions
Web Science: Motivation, Goals and Contributions
Benjamin Heitmann
 
Presentation of current research: distributed architecture for recommendation...
Presentation of current research: distributed architecture for recommendation...
Benjamin Heitmann
 
Lessons learned from Futures Studies: Towards a method for Web Science
Lessons learned from Futures Studies: Towards a method for Web Science
Benjamin Heitmann
 
RDFa: putting RDF on the Web
RDFa: putting RDF on the Web
Benjamin Heitmann
 
Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...
Benjamin Heitmann
 
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Benjamin Heitmann
 
Applying the scientific method in Software Evaluation
Applying the scientific method in Software Evaluation
Benjamin Heitmann
 

Benjamin Heitmann, PhD defence talk: An Open Framework for Multi-source, Cross-domain Personalisation with Semantic Interest Graphs

  • 1. An Open Framework for Multi-source, Cross-domain Personalisation with Semantic Interest Graphs. Benjamin Heitmann, Ph.D. Viva, Monday, 28 July 2014
  • 2. Personalisation has become a commodity
      Section: Motivation
      - Personalisation has become an expected feature:
          - 75% of consumers prefer personalised E-Commerce retailers
          - 94% of companies view personalisation as critical to business performance
      - Examples: Amazon, Last.fm, Facebook
  • 3. Architecture of recommender systems: closed versus open inventory
      - Main research problem: How to enable cross-domain personalisation without proprietary and closed infrastructure and algorithms?
  • 4. State-of-the-art limitations: Collaborative Filtering
      - Research problems:
          - Provide cross-domain recommendations without overlap?
          - Cold-start problem?
  • 5. Definitions
      - Definition of a domain: any set of recommendable items, plus a set of users, plus preferences between users and items
      - Source domain: a domain with non-empty preferences
      - Target domain: a domain with recommendable items
      - Cross-domain personalisation task:
          - Using preferences in the source domain to provide recommendations in a different target domain
          - No overlap between source and target domain
  • 6. State-of-the-art limitations: Content-based filtering
      - General requirement: Data with links between different domains?
      - Research question: Cross-domain recommendations without ratings in target domain?
      - [Figure, two panels: items grouped by domain. Music: Garth Brooks, Johnny Cash, Iron Maiden, Metallica, with "similar" links. Books: Catch 22, Harry Potter 1. Travel: Kyoto, New York. The second panel adds three question marks.]
  • 7. Enabling technology for cross-domain personalisation: Linked Open Data (LOD)
      - LOD can enable cross-domain personalisation:
          1. Provides re-usable concept identifiers
          2. Cross-domain links for many different domains
          3. Standard for interoperable graph data
      - Research question: Best practices for LOD recommender systems?
      - [Figure: Linked Open Data cloud diagram, as of September 2011. Dataset categories in the legend: Media, Geographic, Publications, Government, Cross-domain, Life sciences, User-generated content.]
  • 8. Overview of approach
      - Open framework for cross-domain personalisation:
          1. Conceptual architecture for recommender systems using Linked Open Data
          2. Cross-domain personalisation approach using RDF and Linked Data
      - Prototype implementation based on the framework
      - [Figure labels: "Travel destinations", "Movies"; multi-source user profiles with preferences from multiple domains; cross-domain recommendation algorithm (SemStim) uses DBpedia as background knowledge; recommendations for target domains.]
  • 9. Conceptual Architecture for LOD recommender systems: Methodology
      Section: Conceptual architecture for LOD recommender systems
      - Goal:
          - Identify best practices
          - List most common components
          - Enable recommender systems to use Linked Data
      - Methodology with strong empirical grounding:
          1. Survey of 124 RDF-based applications (2003 to 2009): 15 questions; original authors were contacted to verify or correct our assessment
          2. Architectural analysis to identify common components
          3. Extend proposed architecture for recommender systems
  • 10. Conceptual Architecture for LOD recommender systems
  • 11. Cross-domain algorithm: SemStim
      Section: SemStim evaluation
      - Requirements:
          - Graph algorithm
          - Graph search between two domains
      - SemStim extends Spreading Activation:
          - Adds targeted activation
          - Adds constraints for algorithm duration
      - [Figure: spreading activation on DBpedia, starting from the user profile and reaching recommendable items. Nodes: The Hitchhikers Guide to the Galaxy (novel), Douglas Adams, Atheism Activists, Cambridge, United Kingdom, Macmillan, Restaurant at the end of the universe, Kurt Vonnegut, Richard Dawkins. Edge labels: dc:subject, author, subsequentWork, influencedBy, publisher, birthplace, subdivisionName, country.]
  • 12. Evaluation: Objectives
      1. Can SemStim provide single-domain recommendations?
      2. Can SemStim provide cross-domain recommendations?
      3. How diverse are the SemStim recommendations?
      4. Is there a connection between accuracy and diversity?
  • 13. Evaluation: comparison algorithms
      - Algorithms for comparison:
          - k-NN Collaborative Filtering
          - SVD++ Collaborative Filtering
          - Random selection
          - Linked Data Semantic Distance (LDSD)
          - Set-based breadth-first search (SetBFS)
      - Background knowledge: DBpedia 3.8 (67m edges, 11m vertices)
  • 14. Single-domain accuracy experiment protocol
      - Data set: MovieLens 100k
      - Metrics: precision, recall, F1-score
      - Experiment protocol:
          - Adapted from Cremonesi
          - Top-k recommendation task
          - 90%/10% train/probe split
          - Test profile: highly rated items in probe set plus random items
  • 15. Single-domain accuracy experiment: results
      - [Plot: F1-score (0.00 to 0.15) against number of recommendations (0 to 20) for CFknn, LDSD, Random, SemStim, SetBFS and SVD++; the SemStim curve is labelled.]
  • 16. Cross-domain accuracy experiment protocol
      - Data set: Amazon SNAP
          - Ratings from users with at least 20 ratings in two domains
      - Metrics: precision, recall, F1-score
      - Experiment protocol:
          - Source domain provides train profile
          - Target domain provides test profile
          - CF algorithms unsuitable due to high sparsity
  • 17. Cross-domain accuracy experiment: DVDs >> Music
      - [Plot: F1-score (0.000 to 0.015) against number of recommendations (0 to 30) for LDSD, Random, SemStim and SetBFS; the SemStim curve is labelled.]
  • 18. Cross-domain accuracy experiment: Music >> DVDs
      - [Plot: F1-score (0.00 to 0.02) against number of recommendations (0 to 30) for LDSD, Random, SemStim and SetBFS; the SemStim curve is labelled.]
  • 19. Single-domain diversity experiment
      - Data set: MovieLens 100k
      - Experiment protocol: 95%/5% train/test split
      - Results:
          - Diversity can be tuned
          - Requires using all preferences (incl. negative)
      - [Plot: diversity (0.00 to 1.00) per algorithm (CFknn50, SVD++, Random, LDSD, SetBFS) and per SemStim activation threshold (0.1 to 1.5). Annotations: "Less diverse", "More diverse", "Increasing activation threshold".]
  • 20. LOD-RecSys challenge at ESWC 2014: Diversity recommendation task
      - Data set: DBbook
      - Metrics:
          - F1-score @20 and Inter-List Diversity @20
          - Ranking based on average rank for both metrics
      - Diversity recommendation task: recommend top-20 of all unrated items for each user
      - Implementation challenge:
          - Real-time result submission
          - Hidden ground truth
  • 21. LOD-RecSys challenge at ESWC 2014: Diversity recommendation task
      - Results:
          - 3rd place out of 12 teams
          - Competitive performance
          - SemStim unbiased
          - Can balance accuracy and diversity
      - [Plots: F1-score@20 (0.03 to 0.06) and InterListDiversity@20 (0.465 to 0.480), each against activation threshold (0.3 to 1.0); each plot marks the "Best rank".]
  • 22. ADVANSSE prototype
      Section: ADVANSSE prototype
      - Outcome of collaboration project with CISCO Galway
      - Goals:
          - Show relevance to real-world, industry use case
          - Implement cross-domain personalisation framework
          - Instantiate conceptual architecture for LOD recommender systems
          - Provide distributed and open ecosystem for cross-domain personalisation
  • 23. ADVANSSE use case
      - Functional requirements:
          1. Filtering of subscriptions
          2. Recommendation of posts
          3. Updating of interests and recommendations
      - [Figure labels: Marketing, Development, R & D.]
  • 24. ADVANSSE distributed social platform
      - [Figure labels: users Andrew, Bob and Cecilia; ADVANSSE server (RDF store, XMPP server, personalisation component, data homogenisation service, graph query language service); ADVANSSE connected social platforms (1) and (2), each with XMPP client, application logic, RDF store, structured data authoring interface and user interface; links labelled XMPP.]
  • 25. ADVANSSE prototype: user interface
  • 26. Summary of contributions
      Section: Conclusion
      - Conceptual architecture:
          - Describes best practices for leveraging LOD for recommender systems
          - List of high-level components
          - Strong empirical grounding
      - Cross-domain recommendation approach using SemStim:
          - Can provide single-domain and cross-domain recommendations
          - No overlap between source and target domain required
          - No ratings in target domain required
          - Competitive performance
          - Diversity of recommendations can be tuned
      - ADVANSSE prototype:
          - Based on real-world use case
          - Shows how to use LOD to enable an ecosystem for cross-domain personalisation
  • 27. Future work
      - Investigate connection between performance of SemStim and choice of target and source domains
      - Learning of weights for different edge types
      - Improving the quality of linkage data
  • 28. Dissemination
      - In top-3 for Diversity task at the LOD-RecSys challenge, ESWC 2014
      - Publications:
          - 2 book chapters
          - 1 journal paper
          - 2 conference papers
          - 2 workshop papers
          - 1 conference poster
      - ADVANSSE web site: http://advansse.deri.ie
  • 29. Extra graphs and data / details
  • 30. Motivation: New requirements for recommender systems
      - Architecture of real-world RecSys has changed:
          - Shift from closed to open inventories
          - Emergence of ecosystems to share user preference data
      - New requirements for recommender systems:
          - Multi-source profiles
          - Domain-neutral preferences
          - Cross-domain personalisation
      - Existing infrastructure and algorithms are proprietary and closed
  • 31. Diversity test for Amazon SNAP DVD data
      - [Plot: F1-score (0.0 to 0.3) against value of topK (0 to 20) for CFknn50, LDSD, Random, SemStim02, SemStim03, SemStim04, SetBFS and SVDpp.]
  • 33. ADVANSSE prototype: Implementing domain-neutral user profiles
      - Domain-neutral user profiles implemented using CISCO ERT schema
      - Content extracted from 3 sites on StackExchange
      - Integrated with DBpedia background knowledge
      - Data storage:
          - Jena TDB
          - HDT triple store
      - [Figure: RDF data model. advansse:Question1 (rdf:type sioc:Post and advansse:Question; dc:title "Title"; dc:description "Post body"); advansse:User1 (rdf:type sioc:UserAccount; sioc:name "Display Name"); advansse:Tag1 (rdf:type ctag:Tag; ctag:label "Tag String"; ctag:means http://dbpedia.org/resource/Entity); advansse:Answer1 (rdf:type sioc:Post and advansse:Answer; dc:description "Answer body"). Further edge labels: dc:creator, ert:hasTopic, ert:interestedIn, advansse:hasAnswer, advansse:hasQuestion.]
      - Namespaces:
          - sioc: http://rdfs.org/sioc/ns#
          - ert: http://www.cisco.com/ert/
          - advansse: http://advansse.uimr.deri.ie/demo#
          - rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
          - dc: http://purl.org/dc/elements/1.1/
          - ctag: http://commontag.org/ns#
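
The data model on slide 33 can be read as a small set of RDF triples. The following is a minimal sketch in plain Python tuples rather than a real triple store such as Jena TDB. The prefixed names are taken from the slide; the edge directions are one reading of the flattened figure, the assumption that `ert:interestedIn` points from the user to a DBpedia entity is mine, and the helper names `objects`, `interests_of` and `posts_matching_interests` are hypothetical.

```python
# Triples as (subject, predicate, object), using the prefixed names from slide 33.
DBPEDIA_ENTITY = "http://dbpedia.org/resource/Entity"  # placeholder URI shown on the slide

triples = [
    ("advansse:Question1", "rdf:type", "sioc:Post"),
    ("advansse:Question1", "rdf:type", "advansse:Question"),
    ("advansse:Question1", "dc:creator", "advansse:User1"),
    ("advansse:Question1", "ert:hasTopic", "advansse:Tag1"),
    ("advansse:Tag1", "rdf:type", "ctag:Tag"),
    ("advansse:Tag1", "ctag:means", DBPEDIA_ENTITY),
    ("advansse:User1", "rdf:type", "sioc:UserAccount"),
    # Assumption: the interest edge links the user to the DBpedia entity.
    ("advansse:User1", "ert:interestedIn", DBPEDIA_ENTITY),
]


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def interests_of(user):
    """Domain-neutral interests: DBpedia URIs the user is interested in."""
    return objects(user, "ert:interestedIn")


def posts_matching_interests(user):
    """Posts tagged with a tag whose meaning is one of the user's interests."""
    wanted = set(interests_of(user))
    matches = []
    for post, predicate, tag in triples:
        if predicate == "ert:hasTopic" and wanted & set(objects(tag, "ctag:means")):
            matches.append(post)
    return matches
```

Because both tags and interests resolve to the same DBpedia identifiers, a profile expressed this way is not tied to any single site or domain, which is the point of the domain-neutral design.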

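Slide 11 describes SemStim as an extension of spreading activation with targeted activation and constraints on algorithm duration. The sketch below is a generic spreading-activation loop with two additions of that kind: it stops once enough target-domain nodes have been activated, and it caps the number of waves. It is not the thesis implementation. The function name `spread`, the parameters `threshold`, `max_waves` and `wanted`, the firing rule, and the toy graph (which borrows node names from the slide but invents the edges) are all assumptions for illustration. Uniform edge weights follow the editor's note on future work.

```python
from collections import defaultdict


def spread(graph, profile, targets, threshold=1.0, max_waves=5, wanted=2):
    """Generic spreading activation over an undirected graph.

    graph:   dict mapping a node to a list of neighbour nodes (uniform weights)
    profile: start nodes, i.e. user preferences in the source domain
    targets: set of recommendable nodes in the target domain
    Stops when `wanted` targets are activated or after `max_waves` waves.
    """
    activation = defaultdict(float)
    fired = set()
    frontier = set(profile)
    activated_targets = []
    for _ in range(max_waves):
        if not frontier or len(activated_targets) >= wanted:
            break
        # Every firing node sends one unit of activation to each neighbour.
        for node in frontier:
            for neighbour in graph.get(node, []):
                activation[neighbour] += 1.0
        fired |= frontier
        # Nodes at or above the threshold fire in the next wave, once only.
        frontier = {n for n, a in activation.items()
                    if a >= threshold and n not in fired}
        for node in sorted(frontier):
            if node in targets and node not in activated_targets:
                activated_targets.append(node)
    return activated_targets[:wanted]


# Toy graph; the edges are illustrative, not taken from DBpedia.
graph = {
    "Hitchhikers Guide": ["Douglas Adams"],
    "Douglas Adams": ["Hitchhikers Guide", "Restaurant at the End of the Universe",
                      "Kurt Vonnegut", "Cambridge"],
    "Restaurant at the End of the Universe": ["Douglas Adams"],
    "Kurt Vonnegut": ["Douglas Adams"],
    "Cambridge": ["Douglas Adams", "United Kingdom"],
    "United Kingdom": ["Cambridge"],
}
targets = {"Restaurant at the End of the Universe", "United Kingdom"}
recommended = spread(graph, ["Hitchhikers Guide"], targets)
```

Raising `threshold` makes nodes harder to activate, which is the knob the diversity slides vary; in this toy graph a threshold of 2.0 stops the spread after the first wave.
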
Editor's Notes

  • #3: Personalisation has become an expected feature, but real-world recommender systems have also changed fundamentally, which motivates my PhD research.
  • #4: To introduce my main research problem, I first need to show you a quick comparison of the two different recommender system architectures. A closed inventory is used by e.g. Amazon, in contrast to Facebook, which uses an open inventory. An open inventory gets preference data from multiple sources.
  • #5: Multi-source user profiles contain preference data about multiple domains, which is extremely hard to use for personalisation.
  • #9: EMPHASIS: ADVANSSE shows that it works in the real world, and ties everything together by implementing both parts of the framework. ADVANSSE research questions: Alternative, open ecosystem for cross-domain recommendations? Data structure for domain-neutral user profiles?
  • #11: Data discovery service: aggregates distributed profiles. Data homogenisation service: integrates user profiles. RDF store and graph access layer: store multi-source and domain-neutral profiles. Personalisation component: runs the cross-domain algorithm. User interface: shows recommendations to the user.
  • #12: Now explain what happens in the personalisation component (uses RDF store and user interface). Emphasis: graph algorithm on a semantic network. Idea: start with one domain/set of nodes, end in another. Graph search: find a path between the two domains. Spreading Activation (SA) has been described by Crestani; we extend it.
  • #13: 5th goal: SemStim can use cross-domain recs. to mitigate cold-start problem for new users
  • #15: The Cremonesi experiment protocol is very demanding, as the two classes are very unbalanced.
  • #20: We had to come up with our own diversity metric, which is based on estimating the number of clusters.
  • #21: Main features of challenge: 1.) competition between teams 2.) real-time result submission 3.) secret ground truth for the test data
  • #22: Main features of challenge: 1.) competition between teams 2.) real-time result submission 3.) secret ground truth for the test data
  • #23: Target: 20 min
  • #24: Based
  • #25: The XMPP Pub/Sub protocol enables distribution. SPARQL Update is used for data synchronisation. Instantiates the conceptual architecture. Shows how to support an open ecosystem for personalisation.
  • #28: SemStim performs differently for different pairings of domains. SemStim currently uses uniform weights for all edges. Currently naïve baselines are used, which shows that the algorithm is quite robust; more sophisticated approaches could improve results.
  • #34: 3 StackExchange Sites: security, web apps, bicycles
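
The accuracy experiments on slides 14 and 16 report precision, recall and F1-score for a top-k recommendation task. A minimal sketch of one common way to compute these for a single user follows. The function name `precision_recall_f1` and the choice to divide precision by the number of items actually returned are assumptions for illustration, not details taken from the talk or from the Cremonesi protocol.

```python
def precision_recall_f1(recommended, relevant, k):
    """Top-k precision, recall and F1 for one user.

    recommended: ranked list of item ids produced by the recommender
    relevant:    set of item ids in the user's test profile
    k:           number of recommendations to evaluate
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean; it is zero when there are no hits.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

Averaging these per-user values over all test users and plotting F1 against k would give curves of the kind shown on slides 15, 17 and 18.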