1 Introduction
• Introduction
● What is a distributed DBMS
● Distributed DBMS Architecture
• Background
• Distributed Database Design
• Database Integration
• Semantic Data Control
• Distributed Query Processing
• Multidatabase Query Processing
• Distributed Transaction Management
• Data Replication
• Parallel Database Systems
• Distributed Object DBMS
• Peer-to-Peer Data Management
• Web Data Management
• Current Issues
Distributed © M. T. Özsu & P.
Ch.1/1
File Systems
History of Distributed DBMS
[Figure: program 1, program 2, and program 3 each carry their own data description (1, 2, 3) and access their own file (File 1, File 2, File 3).]
Before the advent of database systems in the 1960s
Database Management
[Figure: application programs 1, 2, and 3 (each with data semantics) access a shared database through the DBMS, which handles description, manipulation, and control.]
Motivation
[Figure: database technology (integration) and computer networks (distribution) combine into distributed database systems (integration).]
integration ≠ centralization
Distributed Computing
• A number of autonomous processing elements (not necessarily
homogeneous) that are interconnected by a computer network
and that cooperate in performing their assigned tasks.
• What is being distributed?
● Processing logic
● Function (various pieces of hardware or software)
● Data
● Control (execution of various tasks)
[Figure: a central database on a network; Sites 1 to 5 are connected by a communication network, with the database held at a single site.]
[Figure: a DDBMS environment; Sites 1 to 5 are connected by a communication network.]
❸ Improved performance
● Replication transparency
● Fragmentation transparency
● Horizontal fragmentation: selection
● Vertical fragmentation: projection
● Hybrid
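The mapping of horizontal fragmentation to selection and vertical fragmentation to projection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EMP relation, its attribute names, and all helper functions are hypothetical and do not come from the slides.

```python
# Hypothetical relation; attribute names and values are illustrative only.
EMP = [
    {"eno": "E1", "name": "A", "title": "Engineer", "site": "Paris"},
    {"eno": "E2", "name": "B", "title": "Analyst", "site": "Boston"},
    {"eno": "E3", "name": "C", "title": "Engineer", "site": "Paris"},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole tuples that satisfy the predicate."""
    return [dict(t) for t in relation if predicate(t)]


def vertical_fragment(relation, key, attributes):
    """Projection: keep the key plus the chosen attributes of every tuple."""
    columns = [key] + [a for a in attributes if a != key]
    return [{c: t[c] for c in columns} for t in relation]


def reconstruct_horizontal(*fragments):
    """Horizontal fragments are recombined by union."""
    return [t for fragment in fragments for t in fragment]


def reconstruct_vertical(left, right, key):
    """Vertical fragments are recombined by joining on the shared key."""
    index = {t[key]: t for t in right}
    return [{**t, **index[t[key]]} for t in left]


# Horizontal: one fragment per group of sites.
emp_paris = horizontal_fragment(EMP, lambda t: t["site"] == "Paris")
emp_other = horizontal_fragment(EMP, lambda t: t["site"] != "Paris")

# Vertical: the key "eno" is repeated in both fragments so they can be joined.
emp_names = vertical_fragment(EMP, "eno", ["name"])
emp_jobs = vertical_fragment(EMP, "eno", ["title", "site"])

# Hybrid: a selection followed by a projection.
emp_paris_names = vertical_fragment(emp_paris, "eno", ["name"])
```

Keeping the key in every vertical fragment is what makes the join-based reconstruction possible; fragmentation transparency means the user queries EMP and the system performs this reconstruction behind the scenes.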
• Parallelism in execution
● Inter-query parallelism
● Intra-query parallelism
● Interoperator parallelism
✓ Pipeline parallelism
✓ Independent parallelism
● Intraoperator parallelism
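As a rough illustration of intraoperator parallelism, the sketch below splits a relation into partitions and runs the same selection operator on every partition concurrently before merging the results. The function names and the round-robin partitioning are assumptions made for this example. Python threads stand in for processing sites here, so the sketch shows the structure of the technique and not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Round-robin split of rows into n partitions (an assumed strategy)."""
    return [rows[i::n] for i in range(n)]


def select(rows, predicate):
    """The operator being parallelized: a plain selection."""
    return [r for r in rows if predicate(r)]


def parallel_select(rows, predicate, workers=4):
    """Run the same selection on each partition concurrently, then merge."""
    parts = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda part: select(part, predicate), parts))
    return [r for part in results for r in part]


# Usage: the merged result holds the same rows as a sequential selection,
# although the order follows the partitions and not the input.
evens = parallel_select(list(range(20)), lambda x: x % 2 == 0)
```

Interoperator parallelism would instead run different operators of one query at the same time, either in a pipeline or independently; inter-query parallelism runs separate queries concurrently.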
● Full replication
● Mutual consistency
● Freshness of copies
[Figure: diagram relating query processing, distribution design, reliability, concurrency control, and deadlock management.]
Architecture
• Defines the structure of the system
● components identified
● functions of each component defined
● interrelationships and interactions between components defined
[Figure: users at the top, above the conceptual schema (conceptual view) and the internal schema (internal view).]
Generic DBMS Architecture
Advantages:
✔ Single focus on data management
✔ Overall performance of database management can be significantly enhanced
✔ Database servers can also exploit advanced hardware
Costs:
Additional overhead introduced by another layer of communication between the application and the data servers
Advantages of Client-Server Architectures
• More efficient division of labor
• Horizontal and vertical scaling of resources
• Better price/performance on client machines
• Ability to use familiar tools on client machines
• Client access to remote data (via standards)
• Full DBMS functionality provided to client workstations
• Overall better system price/performance