Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1
Outline
• Introduction
• Background
• Distributed Database Design
• Database Integration
• Semantic Data Control
• Distributed Query Processing
➡ Overview
➡ Query decomposition and localization
➡ Distributed query optimization
• Multidatabase Query Processing
• Distributed Transaction Management
• Data Replication
• Parallel Database Systems
• Distributed Object DBMS
• Peer-to-Peer Data Management
• Web Data Management
• Current Issues
Query Processing in a DDBMS
high-level user query → query processor → low-level data manipulation commands for D-DBMS
Query Processing Components
• Query language that is used
➡ SQL: “intergalactic dataspeak”
• Query execution methodology
➡ The steps that one goes through in executing high-level (declarative) user queries
• Query optimization
➡ How do we determine the “best” execution plan?
• We assume a homogeneous D-DBMS
Selecting Alternatives
SELECT ENAME
FROM EMP, ASG
WHERE EMP.ENO = ASG.ENO
AND RESP = "Manager"
Strategy 1
Π_ENAME(σ_{RESP=“Manager” ∧ EMP.ENO=ASG.ENO}(EMP × ASG))
Strategy 2
Π_ENAME(EMP ⋈_ENO (σ_{RESP=“Manager”}(ASG)))
Strategy 2 avoids the Cartesian product, so it may be “better”
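The two strategies can be compared on a toy instance. The sketch below is only an illustration: the tuples in EMP and ASG are made up, and only the attributes used by the query are kept. Both functions return the same ENAME values, together with the size of the intermediate relation each one builds.

```python
from itertools import product

# Hypothetical toy instances (not from the slides).
EMP = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]   # (ENO, ENAME)
ASG = [("E1", "Manager"), ("E2", "Analyst"), ("E3", "Manager")]  # (ENO, RESP)


def strategy_1(emp, asg):
    """Cartesian product first, then selection, then projection on ENAME."""
    pairs = list(product(emp, asg))  # EMP x ASG
    selected = [(e, a) for e, a in pairs
                if a[1] == "Manager" and e[0] == a[0]]
    names = sorted({e[1] for e, _ in selected})
    return names, len(pairs)


def strategy_2(emp, asg):
    """Selection on ASG first, then join with EMP on ENO, then projection."""
    managers = [a for a in asg if a[1] == "Manager"]
    by_eno = {}
    for a in managers:
        by_eno.setdefault(a[0], []).append(a)
    joined = [(e, a) for e in emp for a in by_eno.get(e[0], [])]
    names = sorted({e[1] for e, _ in joined})
    return names, len(managers)
```

With three tuples per relation, Strategy 1 materializes 9 pairs before filtering, while Strategy 2 carries only the 2 selected ASG tuples into the join.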
What is the Problem?
Fragments and their sites
Site 1: ASG1 = σ_{ENO≤“E3”}(ASG)
Site 2: ASG2 = σ_{ENO>“E3”}(ASG)
Site 3: EMP1 = σ_{ENO≤“E3”}(EMP)
Site 4: EMP2 = σ_{ENO>“E3”}(EMP)
Site 5: Result
Strategy 1
Site 1: ASG1' = σ_{RESP=“Manager”}(ASG1), sent to Site 3
Site 2: ASG2' = σ_{RESP=“Manager”}(ASG2), sent to Site 4
Site 3: EMP1' = EMP1 ⋈_ENO ASG1', sent to Site 5
Site 4: EMP2' = EMP2 ⋈_ENO ASG2', sent to Site 5
Site 5: result = EMP1' ∪ EMP2'
Strategy 2
Sites 1 to 4 send ASG1, ASG2, EMP1, and EMP2 to Site 5
Site 5: result = (EMP1 ∪ EMP2) ⋈_ENO σ_{RESP=“Manager”}(ASG1 ∪ ASG2)
Cost of Alternatives
• Assume
➡ size(EMP) = 400, size(ASG) = 1000
➡ tuple access cost = 1 unit; tuple transfer cost = 10 units
➡ (10+10) counts the tuples of ASG1' and ASG2', i.e. 10 “Manager” tuples in each ASG fragment
• Strategy 1
➡ produce ASG': (10+10) * tuple access cost = 20
➡ transfer ASG' to the sites of EMP: (10+10) * tuple transfer cost = 200
➡ produce EMP': (10+10) * tuple access cost * 2 = 40
➡ transfer EMP' to result site: (10+10) * tuple transfer cost = 200
Total Cost = 460
• Strategy 2
➡ transfer EMP to site 5: 400 * tuple transfer cost = 4,000
➡ transfer ASG to site 5: 1000 * tuple transfer cost = 10,000
➡ produce ASG': 1000 * tuple access cost = 1,000
➡ join EMP and ASG': 400 * 20 * tuple access cost = 8,000
Total Cost = 23,000
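The arithmetic above can be replayed in a few lines. This is a minimal sketch that hard-codes the slide's unit costs and sizes; the constant and function names are illustrative only.

```python
# Unit costs and sizes as stated on the slide.
TUPLE_ACCESS = 1
TUPLE_TRANSFER = 10
SIZE_EMP = 400
SIZE_ASG = 1000
MANAGERS = 10 + 10  # tuples in ASG1' plus tuples in ASG2'


def strategy_1_cost() -> int:
    """Select and join at the fragment sites, ship only small results."""
    produce_asg = MANAGERS * TUPLE_ACCESS
    transfer_asg = MANAGERS * TUPLE_TRANSFER
    produce_emp = MANAGERS * TUPLE_ACCESS * 2
    transfer_emp = MANAGERS * TUPLE_TRANSFER
    return produce_asg + transfer_asg + produce_emp + transfer_emp


def strategy_2_cost() -> int:
    """Ship both relations to site 5 and do all the work there."""
    transfer_emp = SIZE_EMP * TUPLE_TRANSFER
    transfer_asg = SIZE_ASG * TUPLE_TRANSFER
    produce_asg = SIZE_ASG * TUPLE_ACCESS
    join = SIZE_EMP * MANAGERS * TUPLE_ACCESS
    return transfer_emp + transfer_asg + produce_asg + join
```

Calling `strategy_1_cost()` and `strategy_2_cost()` reproduces the totals 460 and 23,000, so under these assumptions Strategy 2 costs 50 times as much as Strategy 1.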
Query Optimization Objectives
• Minimize a cost function
I/O cost + CPU cost + communication cost
These might have different weights in different distributed environments
• Wide area networks
➡ communication cost may dominate or vary widely, depending on
✦ bandwidth
✦ speed
✦ high protocol overhead
• Local area networks
➡ communication cost not that dominant
➡ total cost function should be considered
• Can also maximize throughput
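A weighted cost function of this kind might be sketched as follows. The weights and the two candidate plans are hypothetical numbers chosen only to show that the cheapest plan can change with the environment; they are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostWeights:
    io: float
    cpu: float
    comm: float


def total_cost(io_units, cpu_units, comm_units, weights):
    """Weighted sum: I/O cost + CPU cost + communication cost."""
    return (weights.io * io_units
            + weights.cpu * cpu_units
            + weights.comm * comm_units)


# Hypothetical weights: communication is assumed 10x more expensive on a WAN.
WAN = CostWeights(io=1.0, cpu=1.0, comm=10.0)
LAN = CostWeights(io=1.0, cpu=1.0, comm=1.0)

# Hypothetical plans as (io_units, cpu_units, comm_units).
PLAN_A = (100, 50, 20)  # more local work, fewer messages
PLAN_B = (40, 20, 40)   # less local work, more messages


def best_plan(plans, weights):
    """Return the plan with the smallest weighted total cost."""
    return min(plans, key=lambda plan: total_cost(*plan, weights))
```

With these numbers `best_plan([PLAN_A, PLAN_B], WAN)` picks `PLAN_A`, while the same call with `LAN` picks `PLAN_B`.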
Complexity of Relational Operations
• Assume
➡ relations of cardinality n
➡ sequential scan
Operation | Complexity
Select; Project (without duplicate elimination) | O(n)
Project (with duplicate elimination); Group | O(n log n)
Join; Semi-join; Division; Set Operators | O(n log n)
Cartesian Product | O(n²)
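To make the complexity classes concrete, the sketch below contrasts three of the operations on in-memory lists of tuples. Sorting is assumed as the duplicate-elimination method here; the slides do not say which method they have in mind, and the function names are illustrative.

```python
def project(rows, columns):
    """Projection without duplicate elimination: one sequential pass, O(n)."""
    return [tuple(row[c] for c in columns) for row in rows]


def project_distinct(rows, columns):
    """Projection with duplicate elimination via sorting: O(n log n)."""
    projected = sorted(project(rows, columns))
    result = []
    for t in projected:
        # After sorting, duplicates are adjacent, so one pass removes them.
        if not result or result[-1] != t:
            result.append(t)
    return result


def cartesian_product(left, right):
    """Every pairing of tuples; O(n^2) for two relations of cardinality n."""
    return [l + r for l in left for r in right]
```

For a relation of three `(ENO, RESP)` tuples, `cartesian_product` with itself returns 9 tuples, while `project_distinct` on the RESP column returns at most 3.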
Query Optimization Issues – Types of Optimizers
• Exhaustive search
➡ Cost-based
➡ Optimal
➡ Combinatorial complexity in the number of relations
• Heuristics
➡ Not optimal
➡ Regroup common sub-expressions
➡ Perform selection, projection first
➡ Replace a join by a series of semijoins
➡ Reorder operations to reduce intermediate relation size
➡ Optimize individual operations
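The combinatorial growth of exhaustive search can be seen in a small sketch that enumerates every left-deep join order. The cost model (sum of the sizes of the results produced by each join), the cardinalities, the selectivities, and the PROJ relation are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical statistics (not from the slides).
CARD = {"EMP": 400, "ASG": 1000, "PROJ": 100}
SEL = {
    frozenset(("EMP", "ASG")): 1 / 400,
    frozenset(("ASG", "PROJ")): 1 / 100,
}


def plan_cost(order, card, sel):
    """Sum of the result sizes produced by each join of a left-deep order."""
    joined = [order[0]]
    size = float(card[order[0]])
    cost = 0.0
    for rel in order[1:]:
        size *= card[rel]
        for prev in joined:
            # A pair without a join predicate behaves like a Cartesian product.
            size *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        cost += size
    return cost


def exhaustive_search(card, sel):
    """Enumerate all n! orders and keep the cheapest one."""
    return min(permutations(card), key=lambda order: plan_cost(order, card, sel))
```

Three relations give 3! = 6 candidate orders and ten relations already give 3,628,800, which is why heuristics such as "avoid Cartesian products" are used to prune the space. In this example the orders that join EMP with PROJ first are far more expensive than the rest.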
Query Optimization Issues – Optimization Granularity
• Single query at a time
➡ Cannot use common intermediate results
• Multiple queries at a time
➡ Efficient if many similar queries
➡ Decision space is much larger
Query Optimization Issues – Optimization Timing
• Static
➡ Compilation ⇒ optimize prior to execution
➡ Difficult to estimate the size of the intermediate results ⇒ error propagation
➡ Can amortize over many executions
➡ R*
• Dynamic
➡ Run time optimization
➡ Exact information on the intermediate relation sizes
➡ Have to reoptimize for multiple executions
➡ Distributed INGRES
• Hybrid
➡ Compile using a static algorithm
➡ If the error in estimated sizes > threshold, reoptimize at run time
➡ Mermaid
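The hybrid rule can be sketched as a simple check performed after each intermediate result is produced. Measuring the error as a relative error is an assumption made here, and the function names are hypothetical; the slides only state that reoptimization is triggered when the error exceeds a threshold.

```python
def needs_reoptimization(estimated, actual, threshold):
    """True when the relative error of a size estimate exceeds the threshold."""
    if estimated <= 0:
        # Assumption: a non-positive estimate is wrong as soon as any tuple appears.
        return actual > 0
    return abs(actual - estimated) / estimated > threshold


def first_reoptimization_point(estimates, actuals, threshold):
    """Index of the first plan step whose estimate is too far off, or None."""
    for step, (est, act) in enumerate(zip(estimates, actuals)):
        if needs_reoptimization(est, act, threshold):
            return step
    return None
```

For example, with a threshold of 0.5, estimates `[20, 400]` and observed sizes `[22, 1500]` trigger reoptimization at step 1, whereas observed sizes `[22, 410]` let the statically compiled plan run to completion.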
Query Optimization Issues – Statistics
• Relation
➡ Cardinality
➡ Size of a tuple
➡ Fraction of tuples participating in a join with another relation
• Attribute
➡ Cardinality of domain
➡ Actual number of distinct values
• Common assumptions
➡ Independence between different attribute values
➡ Uniform distribution of attribute values within their domain
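These two assumptions are what make simple size estimates possible. The sketch below shows one common way to use them: under uniformity an equality predicate on an attribute selects 1/(number of distinct values) of the tuples, and under independence the selectivities of a conjunction multiply. The class name and the example statistics are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeStats:
    distinct_values: int  # actual number of distinct values of the attribute


def equality_selectivity(attr):
    """Uniformity assumption: each distinct value is equally frequent."""
    return 1 / attr.distinct_values


def conjunction_selectivity(selectivities):
    """Independence assumption: selectivities of AND-ed predicates multiply."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def estimated_cardinality(relation_cardinality, selectivity):
    """Expected number of tuples that satisfy the predicate."""
    return relation_cardinality * selectivity
```

For instance, if ASG has 1000 tuples and RESP is assumed to have 50 distinct values, the estimate for RESP = "Manager" is 1000 / 50 = 20 tuples. When the real data are skewed or the attributes are correlated, such estimates can be far off.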
Query Optimization Issues – Decision Sites
• Centralized
➡ Single site determines the “best” schedule
➡ Simple
➡ Need knowledge about the entire distributed database
• Distributed
➡ Cooperation among sites to determine the schedule
➡ Need only local information
➡ Cost of cooperation
• Hybrid
➡ One site determines the global schedule
➡ Each site optimizes the local subqueries
Query Optimization Issues – Network Topology
• Wide area networks (WAN) – point-to-point
➡ Characteristics
✦ Low bandwidth
✦ Low speed
✦ High protocol overhead
➡ Communication cost will dominate; ignore all other cost factors
➡ Global schedule to minimize communication cost
➡ Local schedules according to centralized query optimization
• Local area networks (LAN)
➡ Communication cost not that dominant
➡ Total cost function should be considered
➡ Broadcasting can be exploited (joins)
➡ Special algorithms exist for star networks
Distributed Query Processing Methodology
CONTROL SITE
Calculus Query on Distributed Relations
➡ Query Decomposition (uses GLOBAL SCHEMA)
Algebraic Query on Distributed Relations
➡ Data Localization (uses FRAGMENT SCHEMA)
Fragment Query
➡ Global Optimization (uses STATS ON FRAGMENTS)
Optimized Fragment Query with Communication Operations
LOCAL SITES
➡ Local Optimization (uses LOCAL SCHEMAS)
Optimized Local Queries
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 

Database, 6 Query Introduction

  • 1. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1
    Outline
    • Introduction
    • Background
    • Distributed Database Design
    • Database Integration
    • Semantic Data Control
    • Distributed Query Processing
      ➡ Overview
      ➡ Query decomposition and localization
      ➡ Distributed query optimization
    • Multidatabase Query Processing
    • Distributed Transaction Management
    • Data Replication
    • Parallel Database Systems
    • Distributed Object DBMS
    • Peer-to-Peer Data Management
    • Web Data Management
    • Current Issues
  • 2. Query Processing in a DDBMS
    [Figure: high-level user query → query processor → low-level data manipulation commands for D-DBMS]
  • 3. Query Processing Components
    • Query language that is used
      ➡ SQL: “intergalactic dataspeak”
    • Query execution methodology
      ➡ The steps that one goes through in executing high-level (declarative) user queries.
    • Query optimization
      ➡ How do we determine the “best” execution plan?
    • We assume a homogeneous D-DBMS
  • 4. Selecting Alternatives
    SELECT ENAME
    FROM   EMP, ASG
    WHERE  EMP.ENO = ASG.ENO
    AND    RESP = "Manager"
    Strategy 1
      Π_ENAME(σ_RESP=“Manager” ∧ EMP.ENO=ASG.ENO(EMP × ASG))
    Strategy 2
      Π_ENAME(EMP ⋈_ENO (σ_RESP=“Manager”(ASG)))
    Strategy 2 avoids the Cartesian product, so it may be “better”
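To make the difference between the two strategies concrete, here is a minimal Python sketch that evaluates both on a tiny in-memory example. The tuples, the function names, and the choice to report the size of the first intermediate result are all assumptions made for illustration; they do not come from the slides.

```python
# Toy relations; every tuple below is made-up illustration data.
EMP = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]    # (ENO, ENAME)
ASG = [("E1", "Manager"), ("E2", "Analyst"), ("E3", "Manager")]   # (ENO, RESP)


def strategy_1(emp, asg):
    """Cartesian product first, then selection, then projection on ENAME."""
    product = [(e, a) for e in emp for a in asg]
    selected = [(e, a) for e, a in product
                if a[1] == "Manager" and e[0] == a[0]]
    names = {e[1] for e, _ in selected}   # projection as a set of names
    return names, len(product)            # result, first intermediate size


def strategy_2(emp, asg):
    """Selection on ASG first, then join on ENO, then projection on ENAME."""
    managers = [a for a in asg if a[1] == "Manager"]
    emp_by_eno = {}
    for e in emp:
        emp_by_eno.setdefault(e[0], []).append(e)
    joined = [(e, a) for a in managers for e in emp_by_eno.get(a[0], [])]
    names = {e[1] for e, _ in joined}
    return names, len(managers)           # result, first intermediate size


result_1, intermediate_1 = strategy_1(EMP, ASG)
result_2, intermediate_2 = strategy_2(EMP, ASG)
print(result_1 == result_2, intermediate_1, intermediate_2)
```

Both functions return the same set of names, but Strategy 1 materializes every EMP/ASG pair before filtering, while Strategy 2 only carries the selected ASG tuples into the join.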
  • 5. What is the Problem?
    Fragments and sites
      Site 1: ASG1 = σ_ENO≤“E3”(ASG)
      Site 2: ASG2 = σ_ENO>“E3”(ASG)
      Site 3: EMP1 = σ_ENO≤“E3”(EMP)
      Site 4: EMP2 = σ_ENO>“E3”(EMP)
      Site 5: Result
    Alternative that processes the fragments where they are stored
      Sites 1 and 2: ASG'1 = σ_RESP=“Manager”(ASG1), ASG'2 = σ_RESP=“Manager”(ASG2)
      Sites 3 and 4: EMP'1 = EMP1 ⋈_ENO ASG'1, EMP'2 = EMP2 ⋈_ENO ASG'2
      Site 5: result = EMP'1 ∪ EMP'2
    Alternative that ships ASG1, ASG2, EMP1 and EMP2 to Site 5
      Site 5: result = (EMP1 ∪ EMP2) ⋈_ENO σ_RESP=“Manager”(ASG1 ∪ ASG2)
  • 6. Cost of Alternatives
    • Assume
      ➡ size(EMP) = 400, size(ASG) = 1000
      ➡ tuple access cost = 1 unit; tuple transfer cost = 10 units
    • Strategy 1
      ➡ produce ASG': (10+10) × tuple access cost = 20
      ➡ transfer ASG' to the sites of EMP: (10+10) × tuple transfer cost = 200
      ➡ produce EMP': (10+10) × tuple access cost × 2 = 40
      ➡ transfer EMP' to result site: (10+10) × tuple transfer cost = 200
      Total Cost: 460
    • Strategy 2
      ➡ transfer EMP to site 5: 400 × tuple transfer cost = 4,000
      ➡ transfer ASG to site 5: 1000 × tuple transfer cost = 10,000
      ➡ produce ASG': 1000 × tuple access cost = 1,000
      ➡ join EMP and ASG': 400 × 20 × tuple access cost = 8,000
      Total Cost: 23,000
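The arithmetic on this slide can be replayed in a few lines of Python. The slide does not say where the (10+10) terms come from; the sketch below assumes they stand for 10 qualifying ("Manager") ASG tuples at each of the two ASG sites, 20 in total. The constant and function names are hypothetical.

```python
TUPLE_ACCESS = 1       # cost units per tuple access (from the slide)
TUPLE_TRANSFER = 10    # cost units per tuple transfer (from the slide)
SIZE_EMP = 400
SIZE_ASG = 1000
# Assumption: 10 manager tuples in each ASG fragment, read off the (10+10) terms.
MANAGERS_PER_FRAGMENT = 10
MANAGERS = 2 * MANAGERS_PER_FRAGMENT


def strategy_1_cost():
    """Select at the ASG sites, join at the EMP sites, ship results to site 5."""
    produce_asg = MANAGERS * TUPLE_ACCESS          # 20
    transfer_asg = MANAGERS * TUPLE_TRANSFER       # 200
    produce_emp = MANAGERS * TUPLE_ACCESS * 2      # 40
    transfer_emp = MANAGERS * TUPLE_TRANSFER       # 200
    return produce_asg + transfer_asg + produce_emp + transfer_emp


def strategy_2_cost():
    """Ship EMP and ASG to site 5, then select and join there."""
    transfer_emp = SIZE_EMP * TUPLE_TRANSFER               # 4,000
    transfer_asg = SIZE_ASG * TUPLE_TRANSFER               # 10,000
    produce_asg = SIZE_ASG * TUPLE_ACCESS                  # 1,000
    join_cost = SIZE_EMP * MANAGERS * TUPLE_ACCESS         # 8,000
    return transfer_emp + transfer_asg + produce_asg + join_cost


print(strategy_1_cost(), strategy_2_cost())
```

Under these figures the strategy that filters and joins close to the data is about 50 times cheaper, almost entirely because it transfers 40 tuples instead of 1,400.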
  • 7. Query Optimization Objectives
    • Minimize a cost function
        I/O cost + CPU cost + communication cost
      These might have different weights in different distributed environments
    • Wide area networks
      ➡ communication cost may dominate or vary much
        ✦ bandwidth
        ✦ speed
        ✦ high protocol overhead
    • Local area networks
      ➡ communication cost not that dominant
      ➡ total cost function should be considered
    • Can also maximize throughput
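One way to read "different weights in different distributed environments" is as a weighted sum over the three cost components. The sketch below is only an illustration of that idea: the plan names, the component values, and the weights are invented, and a real optimizer would derive them from system measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCost:
    io: float
    cpu: float
    comm: float


def total_cost(plan, w_io=1.0, w_cpu=1.0, w_comm=1.0):
    """Weighted sum of I/O, CPU and communication cost."""
    return w_io * plan.io + w_cpu * plan.cpu + w_comm * plan.comm


def cheapest(plans, **weights):
    """Name of the plan with the lowest weighted cost."""
    return min(plans, key=lambda name: total_cost(plans[name], **weights))


# Hypothetical plans: one works mostly locally, one ships more data.
plans = {
    "local_heavy": PlanCost(io=100, cpu=50, comm=5),
    "ship_data": PlanCost(io=20, cpu=10, comm=40),
}

lan_choice = cheapest(plans)               # all weights equal
wan_choice = cheapest(plans, w_comm=10.0)  # communication weighted 10x
print(lan_choice, wan_choice)
```

With equal weights the data-shipping plan wins (70 versus 155 units); once communication is weighted ten times higher, the locally heavy plan becomes cheaper (200 versus 430 units).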
  • 8. Complexity of Relational Operations
    • Assume
      ➡ relations of cardinality n
      ➡ sequential scan
    Operation                                        Complexity
    Select, Project (without duplicate elimination)  O(n)
    Project (with duplicate elimination), Group      O(n log n)
    Join, Semi-join, Division, Set Operators         O(n log n)
    Cartesian Product                                O(n²)
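As an illustration of where an O(n log n) bound can come from, the following sketch implements projection with duplicate elimination by sorting the projected tuples and dropping adjacent repeats. This is one common technique, chosen here as an assumption; the slide does not say which algorithm it has in mind, and the function name and sample rows are hypothetical.

```python
def project_distinct(rows, columns):
    """Project rows onto the given column positions and remove duplicates.

    Sorting costs O(n log n); the single pass that drops adjacent
    duplicates costs O(n).
    """
    projected = sorted(tuple(row[c] for c in columns) for row in rows)
    result = []
    for t in projected:
        if not result or result[-1] != t:
            result.append(t)
    return result


# Made-up ASG tuples: (ENO, RESP)
asg = [("E1", "Manager"), ("E2", "Analyst"), ("E3", "Manager")]
print(project_distinct(asg, [1]))
```

Without duplicate elimination the same projection is a single O(n) scan, which is the distinction the table draws between its first two rows.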
  • 9. Query Optimization Issues – Types Of Optimizers
    • Exhaustive search
      ➡ Cost-based
      ➡ Optimal
      ➡ Combinatorial complexity in the number of relations
    • Heuristics
      ➡ Not optimal
      ➡ Regroup common sub-expressions
      ➡ Perform selection, projection first
      ➡ Replace a join by a series of semijoins
      ➡ Reorder operations to reduce intermediate relation size
      ➡ Optimize individual operations
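The "combinatorial complexity" of exhaustive search can be seen in a small sketch that enumerates every left-deep join order and keeps the cheapest. Everything here is a simplifying assumption: the cost model (sum of intermediate result sizes), the PROJ cardinality, and the join selectivities are invented for illustration and are not taken from the slides.

```python
from itertools import permutations


def plan_cost(order, card, sel):
    """Sum of intermediate result sizes for a left-deep join order.

    sel maps a frozenset of two relation names to a join selectivity;
    a missing pair means no join predicate (a Cartesian product).
    """
    joined = [order[0]]
    size = float(card[order[0]])
    total = 0.0
    for rel in order[1:]:
        factor = 1.0
        for prev in joined:
            factor *= sel.get(frozenset((prev, rel)), 1.0)
        size = size * card[rel] * factor
        total += size
        joined.append(rel)
    return total


def exhaustive_search(card, sel):
    """Try all n! orders and return the cheapest one."""
    return min(permutations(card), key=lambda order: plan_cost(order, card, sel))


# Hypothetical statistics.
card = {"EMP": 400, "ASG": 1000, "PROJ": 100}
sel = {
    frozenset(("EMP", "ASG")): 1 / 400,
    frozenset(("ASG", "PROJ")): 1 / 100,
}

best = exhaustive_search(card, sel)
print(best, plan_cost(best, card, sel))
```

Three relations give only 3! = 6 orders, but ten relations already give 3,628,800, which is why the slide lists heuristics such as performing selections first as the alternative.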
  • 10. Query Optimization Issues – Optimization Granularity
    • Single query at a time
      ➡ Cannot use common intermediate results
    • Multiple queries at a time
      ➡ Efficient if many similar queries
      ➡ Decision space is much larger
  • 11. Query Optimization Issues – Optimization Timing
    • Static
      ➡ Compilation ⇒ optimize prior to the execution
      ➡ Difficult to estimate the size of the intermediate results ⇒ error propagation
      ➡ Can amortize over many executions
      ➡ R*
    • Dynamic
      ➡ Run time optimization
      ➡ Exact information on the intermediate relation sizes
      ➡ Have to reoptimize for multiple executions
      ➡ Distributed INGRES
    • Hybrid
      ➡ Compile using a static algorithm
      ➡ If the error in estimated sizes > threshold, reoptimize at run time
      ➡ Mermaid
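The hybrid rule ("if the error in estimated sizes > threshold, reoptimize at run time") can be sketched as a simple relative-error check. This is not the test used by Mermaid or any other system named on the slide; the function name, the relative-error definition, and the default threshold of 0.5 are assumptions for illustration.

```python
def should_reoptimize(estimated_size, actual_size, threshold=0.5):
    """Return True when the relative estimation error exceeds the threshold.

    estimated_size comes from the compile-time plan, actual_size is
    observed at run time. Both are tuple counts.
    """
    if estimated_size <= 0:
        # No usable estimate: treat this as a reason to reoptimize.
        return True
    relative_error = abs(actual_size - estimated_size) / estimated_size
    return relative_error > threshold


print(should_reoptimize(100, 120), should_reoptimize(100, 400))
```

A plan compiled with an estimate of 100 tuples keeps running when 120 arrive, but is reoptimized when 400 arrive, so the static plan is reused in the common case and the run-time cost is paid only when the estimate was badly off.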
  • 12. Query Optimization Issues – Statistics
    • Relation
      ➡ Cardinality
      ➡ Size of a tuple
      ➡ Fraction of tuples participating in a join with another relation
    • Attribute
      ➡ Cardinality of domain
      ➡ Actual number of distinct values
    • Common assumptions
      ➡ Independence between different attribute values
      ➡ Uniform distribution of attribute values within their domain
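The two common assumptions translate directly into simple estimation rules: under uniformity, an equality predicate on an attribute selects about 1/(number of distinct values) of the tuples, and under independence the selectivities of several predicates multiply. The sketch below applies those rules; the function names and the distinct-value counts (50 and 10) are hypothetical.

```python
def equality_selectivity(distinct_values):
    """Selectivity of attribute = constant, assuming a uniform distribution."""
    return 1.0 / distinct_values


def estimate_selection(cardinality, *selectivities):
    """Estimated result size, assuming the predicates are independent."""
    estimate = float(cardinality)
    for s in selectivities:
        estimate *= s
    return estimate


# Hypothetical statistics: ASG has 1000 tuples and 50 distinct RESP values.
managers = estimate_selection(1000, equality_selectivity(50))

# A second, independent predicate on an attribute with 10 distinct values.
both = estimate_selection(1000, equality_selectivity(50), equality_selectivity(10))
print(managers, both)
```

When the real data is skewed or the attributes are correlated, these estimates can be far off, which is the error-propagation problem that static optimization has to live with.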
  • 13. Query Optimization Issues – Decision Sites
    • Centralized
      ➡ Single site determines the “best” schedule
      ➡ Simple
      ➡ Need knowledge about the entire distributed database
    • Distributed
      ➡ Cooperation among sites to determine the schedule
      ➡ Need only local information
      ➡ Cost of cooperation
    • Hybrid
      ➡ One site determines the global schedule
      ➡ Each site optimizes the local subqueries
  • 14. Query Optimization Issues – Network Topology
    • Wide area networks (WAN) – point-to-point
      ➡ Characteristics
        ✦ Low bandwidth
        ✦ Low speed
        ✦ High protocol overhead
      ➡ Communication cost will dominate; ignore all other cost factors
      ➡ Global schedule to minimize communication cost
      ➡ Local schedules according to centralized query optimization
    • Local area networks (LAN)
      ➡ Communication cost not that dominant
      ➡ Total cost function should be considered
      ➡ Broadcasting can be exploited (joins)
      ➡ Special algorithms exist for star networks
  • 15. Distributed Query Processing Methodology
    Input: Calculus Query on Distributed Relations
    CONTROL SITE
      ➡ Query Decomposition (uses GLOBAL SCHEMA)
        Output: Algebraic Query on Distributed Relations
      ➡ Data Localization (uses FRAGMENT SCHEMA)
        Output: Fragment Query
      ➡ Global Optimization (uses STATS ON FRAGMENTS)
        Output: Optimized Fragment Query with Communication Operations
    LOCAL SITES
      ➡ Local Optimization (uses LOCAL SCHEMAS)
        Output: Optimized Local Queries