RDF conjunctive query cardinality estimation
Owner: Stamatis/Damian
Presenter: Soudip
Overview
• This project provides methods to estimate the cardinality of the result of a conjunctive query.
• It requires a summary with statistics, which can be supplied either as a serialized summary or through a connection to a database containing the statistics and dictionary tables.
• The project also includes methods to generate, in a database, a dictionary-encoded version of the triples table and the triples table statistics, for both the plain and the dictionary-encoded triples tables.
• It is a refactored extraction (for code reusability) of the cardinality estimator of the RDFViewSelection project.
A Few Details
• Online repository:
– https://scm.gforge.inria.fr/svn/distriples/RDFOptim
• Code size (Java)
– 4041 LoC, 15 packages
• Contributors
– Present: Stamatis, Damian, Ioana
– Past: Julien Leblay
• Current owner of the code (OAK member): Stamatis/Damian
• Who is using the code now
– Fragmented Query Execution (Damian)
– CliqueSquare (Stamatis)
– Optimizer (https://scm.gforge.inria.fr/svn/distriples/trunk/Optimizer) (Stamatis/Zoi)
Functional Architecture
[Diagram: functional architecture]
• Inputs: RDF, Conjunctive Query
• Components (Main Modules): RDF Loader (Parser, Importer), CQ Parser, DB Interfaces, Cardinality Estimator
• Output: Cardinality Info.
• Labels on the diagram: "Triples Summary, Dictionary" and "CQ, Summary, Dictionary"
• Data Store (PostgreSQL DB)
– Data Table: columns S, P, O
– Dictionary Table: columns Key, Value
– Summary Table (6), one for each of S, P, O, S P, P O, S O:
  S: Count(*), Count(P), Count(O), Min(P), Max(P), Min(O), Max(O)
  P: Count(*), Count(S), Count(O), Min(S), Max(S), Min(O), Max(O)
  O: Count(*), Count(S), Count(P), Min(S), Max(S), Min(P), Max(P)
  S P: Count(*), Count(O), Min(O), Max(O)
  P O: Count(*), Count(S), Min(S), Max(S)
  S O: Count(*), Count(P), Min(P), Max(P)
RDF Loader
– Parses the content of the input RDF files
– Extracts <subject, property, object> triples
– Loads the triples into the DB
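The parse and extract steps can be sketched as follows. This is only an illustration in Python (the project itself is written in Java), and it assumes the input is in a simplified N-Triples-like form with one triple per line; the slides do not say which RDF syntaxes the loader accepts. The names `TRIPLE_RE` and `parse_triples` are hypothetical.

```python
import re

# Simplified N-Triples-like line: subject property object .
# Assumption: illustrative pattern only, not a complete N-Triples grammar.
TRIPLE_RE = re.compile(r'^\s*(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def parse_triples(lines):
    """Yield (subject, property, object) tuples, skipping blank lines and comments."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        match = TRIPLE_RE.match(stripped)
        if match is None:
            raise ValueError(f"not a triple: {line!r}")
        yield match.groups()


sample = [
    '<http://ex.org/alice> <http://ex.org/knows> <http://ex.org/bob> .',
    '# a comment',
    '<http://ex.org/alice> <http://ex.org/name> "Alice Smith" .',
]
triples = list(parse_triples(sample))
for triple in triples:
    print(triple)
```

A real importer would then write each tuple into the data table of the store.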
Data Store
– Stores triples in the DB and creates 3 kinds of tables
– Data Table
• Stores the basic triples
– Summary Table
• Stores different summaries of the triples
• 6 different summary tables
– Dictionary Table
• Stores the integer value corresponding to each entry in the data table
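To make the three kinds of tables concrete, here is a minimal sketch in Python (the project itself is Java). It uses an in-memory SQLite database as a stand-in for PostgreSQL, and it assumes that Count(P) and Count(O) in the summary mean distinct counts; the slides do not say so explicitly. All table and column names are hypothetical.

```python
import sqlite3

triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "knows", "carol"),
]

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL data store

# Dictionary table: one integer key per distinct value.
values = sorted({value for triple in triples for value in triple})
encode = {value: key for key, value in enumerate(values, start=1)}
conn.execute("CREATE TABLE dictionary (key INTEGER PRIMARY KEY, value TEXT UNIQUE)")
conn.executemany("INSERT INTO dictionary (key, value) VALUES (?, ?)",
                 [(key, value) for value, key in encode.items()])

# Data table: the dictionary-encoded triples.
conn.execute("CREATE TABLE data (s INTEGER, p INTEGER, o INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                 [(encode[s], encode[p], encode[o]) for s, p, o in triples])

# One of the six summary tables: statistics grouped by subject.
# Assumption: Count(P) and Count(O) are distinct counts.
conn.execute("""
    CREATE TABLE summary_s AS
    SELECT s,
           COUNT(*)          AS cnt,
           COUNT(DISTINCT p) AS cnt_p,
           COUNT(DISTINCT o) AS cnt_o,
           MIN(p) AS min_p, MAX(p) AS max_p,
           MIN(o) AS min_o, MAX(o) AS max_o
    FROM data
    GROUP BY s
""")

alice_row = conn.execute(
    "SELECT cnt, cnt_p, cnt_o FROM summary_s WHERE s = ?", (encode["alice"],)
).fetchone()
print(alice_row)
```

The other five summaries would follow the same pattern with a different GROUP BY key (p, o, or a pair of columns).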
Cardinality Estimator Module
– Takes as input
• A conjunctive query
• Data from the Summary and Dictionary tables
– Outputs cardinality information for the input query
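For a query made of a single triple pattern, the Count(*) column of the matching summary already gives the result size. The sketch below illustrates that lookup in Python (the project itself is Java). It keeps the six Count(*) summaries in memory and works on plain strings rather than dictionary keys; `build_counts` and `pattern_cardinality` are hypothetical names, not the project's API.

```python
from collections import Counter

triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "knows", "carol"),
]


def build_counts(triples):
    """Count(*) per grouping key, one Counter per summary (s, p, o, sp, po, so)."""
    counts = {group: Counter() for group in ("s", "p", "o", "sp", "po", "so")}
    for triple in triples:
        term = dict(zip("spo", triple))
        for group, counter in counts.items():
            counter[tuple(term[pos] for pos in group)] += 1
    return counts


def pattern_cardinality(pattern, counts, total):
    """Cardinality of one triple pattern; None marks a variable position."""
    term = dict(zip("spo", pattern))
    group = "".join(pos for pos in "spo" if term[pos] is not None)
    if group == "":
        return total  # no constant: every triple matches
    if group == "spo":
        # Not covered by the six summaries: return an upper bound of 0 or 1,
        # assuming the data table holds no duplicate triples.
        return min(1,
                   counts["sp"][(term["s"], term["p"])],
                   counts["po"][(term["p"], term["o"])],
                   counts["so"][(term["s"], term["o"])])
    return counts[group][tuple(term[pos] for pos in group)]


counts = build_counts(triples)
print(pattern_cardinality(("alice", "knows", None), counts, len(triples)))
print(pattern_cardinality((None, "knows", None), counts, len(triples)))
```

Queries with several patterns need an estimate for the joins between them, which the summaries alone do not determine exactly.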
CQ Parser
– Taken as is from the Conjunctive Query project
DB Interfaces
– Contains two interfaces to load data from the DB into memory
• Summary Interface: gets data from the Summary table
• Dictionary Interface: gets data from the Dictionary table
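A rough picture of what two such loaders might look like, written in Python with SQLite standing in for PostgreSQL (the project itself is Java). The class names, method names, and table names below are hypothetical; the slides only state that one interface reads the Summary table and the other the Dictionary table.

```python
import sqlite3

# Hypothetical summary table names, one per grouping key.
SUMMARY_TABLES = {"summary_s", "summary_p", "summary_o",
                  "summary_sp", "summary_po", "summary_so"}


class DictionaryInterface:
    """Loads the dictionary table into memory as {key: value}."""

    def __init__(self, conn):
        self.conn = conn

    def load(self):
        return dict(self.conn.execute("SELECT key, value FROM dictionary"))


class SummaryInterface:
    """Loads one summary table into memory as a list of row dictionaries."""

    def __init__(self, conn):
        self.conn = conn

    def load(self, table):
        # Table names cannot be bound as SQL parameters, so check a whitelist.
        if table not in SUMMARY_TABLES:
            raise ValueError(f"unknown summary table: {table}")
        cursor = self.conn.execute(f"SELECT * FROM {table}")
        columns = [description[0] for description in cursor.description]
        return [dict(zip(columns, row)) for row in cursor]


# Tiny in-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (key INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO dictionary VALUES (?, ?)",
                 [(1, "alice"), (2, "knows"), (3, "bob")])
conn.execute("CREATE TABLE summary_p (p INTEGER, cnt INTEGER)")
conn.execute("INSERT INTO summary_p VALUES (2, 1)")

dictionary = DictionaryInterface(conn).load()
summary_p = SummaryInterface(conn).load("summary_p")
print(dictionary, summary_p)
```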
Cardinality Estimator
– Takes a CQ and Summary data as input and produces the cardinality info for the input CQ as output
– It uses 3 algorithms (two use a static summary and one uses the MaxMin summary) to calculate the cardinality (https://scm.gforge.inria.fr/svn/distriples/RDFOptim/)
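The slides do not describe the three algorithms, so the following is not one of them. It is a textbook-style join estimate, sketched in Python (the project itself is Java), to show how the Count and Min/Max columns of a summary could feed an estimate for two triple patterns that share a variable. It assumes uniformly distributed values and dictionary-encoded integer keys; the function name and parameters are hypothetical.

```python
def join_estimate(card_a, distinct_a, range_a, card_b, distinct_b, range_b):
    """Estimate the size of the join of two triple patterns on a shared variable.

    card_*     : Count(*) of each pattern
    distinct_* : number of distinct values of the join variable in each pattern
    range_*    : (min, max) of the encoded join values in each pattern
    """
    if card_a == 0 or card_b == 0:
        return 0.0
    # Min/Max check: disjoint value ranges mean the join must be empty.
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    if low > high:
        return 0.0
    # Classic uniformity assumption: |A| * |B| / max(V(A, x), V(B, x)).
    return card_a * card_b / max(distinct_a, distinct_b)


# Overlapping ranges: fall back to the uniformity estimate.
print(join_estimate(1000, 100, (1, 500), 200, 50, (100, 900)))
# Disjoint ranges: the Min/Max columns alone show the join is empty.
print(join_estimate(3, 2, (3, 4), 1, 1, (2, 2)))
```

The range check can only prove emptiness; when ranges overlap, the estimate still rests entirely on the uniformity assumption.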
Thank you!!

VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
Thingyan is now a global treasure! See how people around the world are search...
Thingyan is now a global treasure! See how people around the world are search...Thingyan is now a global treasure! See how people around the world are search...
Thingyan is now a global treasure! See how people around the world are search...
Pixellion
 
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
Molecular methods diagnostic and monitoring of infection  -  Repaired.pptxMolecular methods diagnostic and monitoring of infection  -  Repaired.pptx
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
7tzn7x5kky
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Principles of information security Chapter 5.ppt
Principles of information security Chapter 5.pptPrinciples of information security Chapter 5.ppt
Principles of information security Chapter 5.ppt
EstherBaguma
 
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptxPerencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
PareaRusan
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdfIAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
mcgardenlevi9
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Flip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptxFlip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptx
mubashirkhan45461
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 

RDF conjunctive query selectivity estimation

  • 1. RDF conjunctive query cardinality estimation
      – Owner: Stamatis/Damian
      – Presenter: Soudip
  • 2. Overview
      • This project provides methods to estimate the cardinality of the result of a conjunctive query
      • It requires a summary with statistics, which can be supplied either as a serialized summary or through a database connection that contains the statistics and dictionary tables
      • This project also includes methods to generate a dictionary-encoded version of the triples table, and to generate the triples table statistics in a database, for both the plain and the dictionary-encoded triples tables
      • It is a refactored extraction (for code reusability) of the cardinality estimator of the RDFViewSelection project
  • 3. Few Details
      • Online repository:
          – https://scm.gforge.inria.fr/svn/distriples/RDFOptim
      • Code size (Java):
          – 4041 LoC, 15 packages
      • List of contributors:
          – Present: Stamatis, Damian, Ioana
          – Past: Julien Leblay
      • Current owner of the code (OAK member): Stamatis/Damian
      • Who is using the code now:
          – Fragmented Query Execution (Damian)
          – CliqueSquare (Stamatis)
          – Optimizer (https://scm.gforge.inria.fr/svn/distriples/trunk/Optimizer) (Stamatis/Zoi)
  • 4. Functional Architecture (diagram)
      • Inputs: RDF, Conjunctive Query
      • Main modules: RDF Loader (Parser, Importer), CQ Parser, DB Interfaces (Triples Summary, Dictionary), Cardinality Estimator (CQ, Summary, Dictionary)
      • Output: Cardinality Info.
      • Data Store (PostgreSQL DB):
          – Data Table: columns S, P, O
          – Dictionary Table: columns Key, Value
          – Summary Table (6), one per grouping:
              S     Count(*), Count(P), Count(O), Min(P), Max(P), Min(O), Max(O)
              P     Count(*), Count(S), Count(O), Min(S), Max(S), Min(O), Max(O)
              O     Count(*), Count(S), Count(P), Min(S), Max(S), Min(P), Max(P)
              S P   Count(*), Count(O), Min(O), Max(O)
              P O   Count(*), Count(S), Min(S), Max(S)
              S O   Count(*), Count(P), Min(P), Max(P)
  • 5. RDF Loader
      – Parses the content of the input RDF files
      – Extracts <subject, property, object> triples
      – Loads the triples into the DB
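The parsing and extraction steps of the loader could look roughly like the sketch below. It assumes the input is in an N-Triples-like form (one triple per line, terminated by " ."), which the slides do not state; the class `SimpleTripleParser` and its methods are hypothetical and are not taken from the project. Loading into the DB is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical helper, not part of the project's code base. */
class SimpleTripleParser {

    /**
     * Splits one N-Triples-style line into {subject, property, object}.
     * Returns null for blank lines and comment lines.
     */
    static String[] parseLine(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) {
            return null;
        }
        // Drop the terminating dot of the statement, if present.
        if (s.endsWith(".")) {
            s = s.substring(0, s.length() - 1).trim();
        }
        // Subject and property contain no whitespace; the object may (literals),
        // so split into at most three parts.
        String[] parts = s.split("\\s+", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a triple: " + line);
        }
        return parts;
    }

    /** Parses all lines and keeps only those that contain a triple. */
    static List<String[]> parseAll(List<String> lines) {
        List<String[]> triples = new ArrayList<>();
        for (String line : lines) {
            String[] triple = parseLine(line);
            if (triple != null) {
                triples.add(triple);
            }
        }
        return triples;
    }
}
```

A real loader would more likely rely on an RDF library to handle the other RDF syntaxes and escaping rules; this sketch only shows the shape of the triple extraction.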
  • 6. Data Store
      – Stores the triples in the DB and creates 3 different kinds of tables
      – Data Table
          • Stores the basic triples
      – Summary Table
          • Stores different summaries of the triples
          • 6 different summary tables
      – Dictionary Table
          • Stores the integer value corresponding to each entry in the data table
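To make the three kinds of tables concrete, here is a minimal in-memory sketch of dictionary encoding and of one of the six summaries (the one grouped by P). It is an assumption-laden illustration: the project keeps these tables in PostgreSQL, whereas the sketch uses plain Java collections, and the names `SummarySketch`, `Dictionary`, and `PropertyStats` are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: class and method names are hypothetical, not taken from the project. */
class SummarySketch {

    /** Stand-in for the dictionary table: maps each distinct value to an integer code. */
    static final class Dictionary {
        private final Map<String, Integer> codes = new HashMap<>();

        int encode(String value) {
            Integer code = codes.get(value);
            if (code == null) {
                code = codes.size() + 1; // codes are assigned in order of first appearance
                codes.put(value, code);
            }
            return code;
        }
    }

    /** One row of the summary grouped by P: Count(*), Count(S), Count(O), Min/Max of S and O. */
    static final class PropertyStats {
        long count;
        final Set<Integer> subjects = new HashSet<>(); // size() gives Count(S)
        final Set<Integer> objects = new HashSet<>();  // size() gives Count(O)
        int minS = Integer.MAX_VALUE, maxS = Integer.MIN_VALUE;
        int minO = Integer.MAX_VALUE, maxO = Integer.MIN_VALUE;

        void add(int s, int o) {
            count++;
            subjects.add(s);
            objects.add(o);
            minS = Math.min(minS, s);
            maxS = Math.max(maxS, s);
            minO = Math.min(minO, o);
            maxO = Math.max(maxO, o);
        }
    }

    /** Builds the dictionary-encoded version of the triples (S, P, O per row). */
    static int[][] encodeAll(String[][] triples, Dictionary dict) {
        int[][] encoded = new int[triples.length][3];
        for (int i = 0; i < triples.length; i++) {
            for (int j = 0; j < 3; j++) {
                encoded[i][j] = dict.encode(triples[i][j]);
            }
        }
        return encoded;
    }

    /** Computes the summary grouped by property over the encoded triples. */
    static Map<Integer, PropertyStats> summarizeByProperty(int[][] encoded) {
        Map<Integer, PropertyStats> summary = new HashMap<>();
        for (int[] t : encoded) {
            PropertyStats stats = summary.get(t[1]);
            if (stats == null) {
                stats = new PropertyStats();
                summary.put(t[1], stats);
            }
            stats.add(t[0], t[2]);
        }
        return summary;
    }
}
```

The other five summaries (grouped by S, O, S P, P O, and S O) would follow the same pattern with a different grouping key and a different set of aggregated columns.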
  • 7. Cardinality Estimator Module
      – Takes as input
          • A conjunctive query
          • Data from the Summary and Dictionary tables
      – Outputs cardinality information for the input query
  • 8. CQ Parser
      – Taken as is from the Conjunctive Query project
  • 9. DB Interfaces
      – Contains two interfaces that load data from the DB into memory
          • Summary Interface: gets data from the Summary table
          • Dictionary Interface: gets data from the Dictionary table
  • 10. Cardinality Estimator
      – Takes a CQ and Summary data as input and produces the cardinality info for the input CQ as output
      – It uses 3 algorithms (two use a static summary and one uses the MaxMin summary) to calculate the cardinality (https://scm.gforge.inria.fr/svn/distriples/RDFOptim/)
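The slides do not describe the three algorithms, so the sketch below is not one of them. It only illustrates how Count(*) and distinct-count statistics of the kind stored in the summary tables can feed a textbook join estimate, |R ⋈ S| ≈ |R| · |S| / max(V(R, a), V(S, a)), which assumes uniformly distributed and independent values. The class `JoinCardinalitySketch` and its method are hypothetical.

```java
/** Illustrative only: a textbook estimate, not one of the project's three algorithms. */
class JoinCardinalitySketch {

    /**
     * Estimates the size of the join of two triple patterns on one shared variable.
     *
     * @param countLeft     Count(*) of the triples matching the left pattern
     * @param distinctLeft  number of distinct values of the join variable on the left
     * @param countRight    Count(*) of the triples matching the right pattern
     * @param distinctRight number of distinct values of the join variable on the right
     */
    static double estimateJoin(long countLeft, long distinctLeft,
                               long countRight, long distinctRight) {
        if (countLeft == 0 || countRight == 0) {
            return 0.0; // an empty input yields an empty join
        }
        long maxDistinct = Math.max(distinctLeft, distinctRight);
        if (maxDistinct <= 0) {
            throw new IllegalArgumentException("Distinct counts must be positive for non-empty inputs");
        }
        return (double) countLeft * countRight / maxDistinct;
    }
}
```

For example, for a query that joins t(x, knows, y) with t(y, name, z) on y, where the P summary reports Count(*) = 3 and Count(O) = 2 for knows and Count(*) = 1 and Count(S) = 1 for name, `estimateJoin(3, 2, 1, 1)` gives 3 · 1 / max(2, 1) = 1.5.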