RDF conjunctive query cardinality estimation
Owner: Stamatis/Damian
Presenter: Soudip

Overview
• This project provides methods to estimate the cardinality of the result of a conjunctive query.
• It requires a summary with statistics, which can be supplied either as a serialized summary or through a database connection to the statistics and dictionary tables.
• The project also includes methods to generate, in a database, a dictionary-encoded version of the triples table and the triples table statistics, for both the plain and the dictionary-encoded triples tables.
• It is a refactored extraction (for code reusability) of the cardinality estimator of the RDFViewSelection project.

Few Details
• Online repository:
– https://scm.gforge.inria.fr/svn/distriples/RDFOptim
• Code size (Java)
– 4041 LoC, 15 packages
• Contributors
– Present: Stamatis, Damian, Ioana
– Past: Julien Leblay
• Current owner of the code (OAK member): Stamatis/Damian
• Who is using the code now
– Fragmented Query Execution (Damian)
– CliqueSquare (Stamatis)
– Optimizer (https://scm.gforge.inria.fr/svn/distriples/trunk/Optimizer) (Stamatis/Zoi)

Functional Architecture
[Architecture diagram; recoverable content:]
• Inputs: RDF, Conjunctive Query
• RDF Loader: Parser, Importer
• CQ Parser
• DB Interfaces
• Main Modules: Cardinality Estimator
• Output: Cardinality Info.
• Data flow labels: "Triples, Summary, Dictionary" and "CQ, Summary, Dictionary"
• Data Store (PostgreSQL DB)
– Data Table: S, P, O
– Dictionary Table: Key, Value
– Summary Table (6), one per key:
   S: Count(*), Count(P), Count(O), Min(P), Max(P), Min(O), Max(O)
   P: Count(*), Count(S), Count(O), Min(S), Max(S), Min(O), Max(O)
   O: Count(*), Count(S), Count(P), Min(S), Max(S), Min(P), Max(P)
   S P: Count(*), Count(O), Min(O), Max(O)
   P O: Count(*), Count(S), Min(S), Max(S)
   S O: Count(*), Count(P), Min(P), Max(P)
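
As a rough illustration of what one of the six summaries holds, the sketch below builds an S-keyed summary in memory from dictionary-encoded triples. It rests on assumptions the slides do not state: that Count(P) and Count(O) are distinct-value counts and that Min/Max are taken over integer codes. The names `SubjectStats` and `SubjectSummary` are hypothetical and are not the project's API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical row of the S-keyed summary. */
final class SubjectStats {
    long count;                                  // Count(*): triples with this subject
    final Set<Integer> props = new HashSet<>();  // assumed meaning of Count(P): distinct properties
    final Set<Integer> objs = new HashSet<>();   // assumed meaning of Count(O): distinct objects
    int minP = Integer.MAX_VALUE, maxP = Integer.MIN_VALUE;
    int minO = Integer.MAX_VALUE, maxO = Integer.MIN_VALUE;

    /** Folds one (property, object) pair into the running statistics. */
    void add(int p, int o) {
        count++;
        props.add(p);
        objs.add(o);
        minP = Math.min(minP, p);
        maxP = Math.max(maxP, p);
        minO = Math.min(minO, o);
        maxO = Math.max(maxO, o);
    }
}

final class SubjectSummary {
    /** Builds the S-keyed summary from triples given as int[]{s, p, o}. */
    static Map<Integer, SubjectStats> build(List<int[]> triples) {
        Map<Integer, SubjectStats> summary = new HashMap<>();
        for (int[] t : triples) {
            summary.computeIfAbsent(t[0], k -> new SubjectStats()).add(t[1], t[2]);
        }
        return summary;
    }
}
```

The other five summaries would follow the same pattern with a different grouping key (P, O, or a pair of columns).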

RDF Loader
– Parses the content of the input RDF files
– Extracts <subject, property, object> triples
– Loads the triples into the DB
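
The parse-and-extract steps can be sketched as follows. The slides do not say which RDF serialization the loader reads, so this example assumes simple N-Triples-style lines (`<s> <p> <o> .`) and uses hypothetical names (`Triple`, `SimpleTripleParser`). It is a simplified illustration, not the project's parser; a production loader would normally rely on an RDF library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical <subject, property, object> holder. */
final class Triple {
    final String subject, property, object;

    Triple(String subject, String property, String object) {
        this.subject = subject;
        this.property = property;
        this.object = object;
    }
}

final class SimpleTripleParser {
    // subject: IRI or blank node; property: IRI; object: everything up to the final " ."
    private static final Pattern LINE =
        Pattern.compile("^\\s*(<[^>]*>|_:\\S+)\\s+(<[^>]*>)\\s+(.+?)\\s*\\.\\s*$");

    /** Parses N-Triples-style lines, skipping blank lines and '#' comments. */
    static List<Triple> parse(List<String> lines) {
        List<Triple> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Not a triple: " + line);
            }
            out.add(new Triple(m.group(1), m.group(2), m.group(3)));
        }
        return out;
    }
}
```

The resulting list is what an importer would then write to the data table, for example with batched inserts.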

Data Store
– Stores triples in the DB and creates 3 different kinds of tables
– Data Table
• Stores the basic triples
– Summary Table
• Stores different summaries of the triples
• 6 different summary tables
– Dictionary Table
• Stores the integer value corresponding to each entry in the data table
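
To make the role of the dictionary table concrete, here is a minimal in-memory sketch of dictionary encoding. It assumes codes are assigned sequentially from 0 in order of first appearance; the slides do not specify the numbering scheme, and the class name `TripleDictionary` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory counterpart of the dictionary table. */
final class TripleDictionary {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the existing code for a value, or assigns the next free integer. */
    int encode(String value) {
        Integer code = codes.get(value);
        if (code == null) {
            code = values.size();
            codes.put(value, code);
            values.add(value);
        }
        return code;
    }

    /** Maps a code back to the original string. */
    String decode(int code) {
        return values.get(code);
    }

    /** Encodes a whole triple into the {s, p, o} integer form of the encoded triples table. */
    int[] encodeTriple(String s, String p, String o) {
        return new int[] { encode(s), encode(p), encode(o) };
    }
}
```

With integer codes, the Min/Max columns of the summary tables become simple integer comparisons.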

Cardinality Estimator Module
– Takes as input
• A conjunctive query
• Data from the Summary and Dictionary tables
– Outputs cardinality information for the input query

CQ Parser
– Taken as is from the Conjunctive Query project

DB Interfaces
– Contains two interfaces that load data from the DB into memory
• Summary Interface: gets data from the Summary table
• Dictionary Interface: gets data from the Dictionary table
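
One way to picture the two interfaces is sketched below. All names and method signatures here are hypothetical; the slides only say that the interfaces load Summary and Dictionary data into memory. The in-memory implementations stand in for code that would read the PostgreSQL tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;
import java.util.OptionalLong;

/** Hypothetical read-only view of the Dictionary table. */
interface DictionaryInterface {
    OptionalInt codeOf(String value);
}

/** Hypothetical read-only view of the P-keyed Summary table. */
interface SummaryInterface {
    OptionalLong tripleCount(int propertyCode);
}

/** Holds dictionary rows after they have been loaded into memory. */
final class InMemoryDictionary implements DictionaryInterface {
    private final Map<String, Integer> codes = new HashMap<>();

    void put(String value, int code) {
        codes.put(value, code);
    }

    @Override
    public OptionalInt codeOf(String value) {
        Integer c = codes.get(value);
        return c == null ? OptionalInt.empty() : OptionalInt.of(c);
    }
}

/** Holds Count(*) per property after the summary rows have been loaded into memory. */
final class InMemorySummary implements SummaryInterface {
    private final Map<Integer, Long> countByProperty = new HashMap<>();

    void put(int propertyCode, long count) {
        countByProperty.put(propertyCode, count);
    }

    @Override
    public OptionalLong tripleCount(int propertyCode) {
        Long c = countByProperty.get(propertyCode);
        return c == null ? OptionalLong.empty() : OptionalLong.of(c);
    }
}
```

Keeping the estimator dependent only on such interfaces would also accommodate the serialized-summary input mentioned in the overview, since a deserialized summary could implement the same contract.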

Cardinality Estimator
– Takes a CQ and Summary data as input and produces the cardinality info for the input CQ as output
– It uses 3 algorithms (two use a static summary and one uses a MaxMin summary) for calculating the cardinality (https://scm.gforge.inria.fr/svn/distriples/RDFOptim/)
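
The slides do not describe the three algorithms, so the sketch below is not one of them. It only illustrates how summary statistics can feed an estimate, using the textbook equi-join formula |A| * |B| / max(V(A), V(B)), where V is the number of distinct values of the join variable. It assumes Count(S) in the P-keyed summary is a distinct count; `PropertyStats` and `JoinEstimate` are hypothetical names.

```java
/** Hypothetical row of the P-keyed summary: Count(*), Count(S), Count(O). */
final class PropertyStats {
    final long tripleCount;       // Count(*)
    final long distinctSubjects;  // Count(S), assumed to be a distinct count
    final long distinctObjects;   // Count(O), assumed to be a distinct count

    PropertyStats(long tripleCount, long distinctSubjects, long distinctObjects) {
        this.tripleCount = tripleCount;
        this.distinctSubjects = distinctSubjects;
        this.distinctObjects = distinctObjects;
    }
}

final class JoinEstimate {
    /**
     * Estimates the result size of (?x p1 ?y) JOIN (?x p2 ?z), joined on the subject,
     * with the textbook formula |A| * |B| / max(V(A), V(B)).
     */
    static double subjectJoin(PropertyStats a, PropertyStats b) {
        long maxDistinct = Math.max(a.distinctSubjects, b.distinctSubjects);
        if (a.tripleCount == 0 || b.tripleCount == 0 || maxDistinct == 0) {
            return 0.0;
        }
        return (double) a.tripleCount * b.tripleCount / maxDistinct;
    }
}
```

For example, with 1000 triples over 200 distinct subjects for one property and 500 triples over 500 distinct subjects for the other, the estimate is 1000 * 500 / 500 = 1000 result tuples.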

Thank you!!
