Distributed Databases
Vera Goebel
Department of Informatics
University of Oslo
2011
Contents
Review: Layered DBMS Architecture
Distributed DBMS Architectures
- DDBMS Taxonomy
- Client/Server Models
Key Problems of Distributed DBMS
- Distributed data modeling
- Distributed query processing & optimization
- Distributed transaction management
  - Concurrency Control
  - Recovery
Functional Layers of a DBMS
[Figure: layered architecture, from Applications at the top down to the Database: Interface, Data Model / View Management, Transaction Management / Consistency, Data Access, Buffer Management]
Central Components
[Figure: dependencies among central DBMS components - Sorting component, Lock component, Log component (with savepoint mgmt), and System Buffer Mgmt; arrows indicate dependencies]
Centralized DBS
- logically integrated
- physically centralized
[Figure: terminals T1-T5 access a single DBS over a network]

Distributed DBS
- Data logically integrated (i.e., access based on one schema)
- Data physically distributed among multiple database nodes
- Processing is distributed among multiple database nodes
[Figure: terminals T1-T3 access nodes DBS1-DBS3 over a network]
DDBMS Taxonomy
[Figure: classification cube with three axes - Distribution, Heterogeneity, and Autonomy. Its corners combine the axis values: Centralized Homogeneous DBMS, Centralized Heterogeneous DBMS, Distributed Homogeneous DBMS, Distributed Heterogeneous DBMS, Client/Server Distribution, and the centralized/distributed, homogeneous/heterogeneous variants of Federated DBMS and Multi-DBMS]
Federated DBMS vs. Multi-DBMS
[Figure: two ways of coupling independent DBMSs D1-D4 across nodes N1-N4. Federated: the independent DBMSs are aware of the other DBSs in the federation and implement some cooperation functions themselves: transaction mgmt, schema mapping. Multi: a global transaction manager & schema manager (GTM & SM) coordinates the otherwise independent DBMSs]
Shared Nothing
[Figure: each node couples its own processor, memory, and disk (Proc1/Mem1/Disk 1 ... ProcN/MemN/Disk N), and nodes communicate only via a fast interconnection network; contrasted with a shared architecture where processors Proc1...ProcN, memories Mem1...MemM, and disks Disk 1...Disk p all attach to the interconnection network]
Distributed DBMS
Advantages:
- Improved performance
- Efficiency
- Extensibility (addition of new nodes)
- Transparency of distribution
  - Storage of data
  - Query execution
Problems (treated in the remainder of these slides):
- Distributed data modeling
- Distributed query processing & optimization
- Distributed transaction management (concurrency control, recovery)
Client/Server Environments
data (object) server + n smart clients (workstations)
[Figure: n clients access a Data Server (interface + database functions) over a local communication network; the server stores the database on Disk 1...Disk n]
[Figure: m Data Servers, each with its own interface and a distributed DBMS layer, cooperating over the network; each stores part of the database (Disk 1...Disk n / Disk 1...Disk p)]
[Figure: clients (user interface) connect to an Application Server (query parsing, data server interface), which talks over a communication channel to a Data Server storing the database on Disk 1...Disk n]
Client/Server Architectures
[Figure: Relational - the client process (Application) ships SQL to the server process (DBMS), with cursor management at the client/server boundary. Object-Oriented - the client process (Application) exchanges objects, pages, or files with the server process (object/page cache management) of the DBMS]
Object Server Architecture
[Figure: client process - Application and Object Manager with an Object Cache; server process - Object Manager with Object Cache, File/Index Manager, Page Cache Manager with Page Cache, Log/Lock Manager, and Storage Allocation and I/O over the Database. Client and server exchange objects, object references, queries, method calls, locks, and log records]
Disadvantages/problems:
- remote procedure calls (RPC) for object references
- complex server design
- client cache consistency problems
- page-level locking; objects get copied multiple times; large objects
Page Server Architecture
[Figure: client process - Object Manager, File/Index Manager, and Page Cache Manager with a Page Cache; server process - Page Cache Manager with Page Cache, Log/Lock Manager handling locks and log records, and Storage Allocation and I/O over the database. Client and server exchange pages and page references]
File Server Architecture
[Figure: client process - Object Manager with Object Cache, File/Index Manager, and Page Cache Manager with a Page Cache; server process - Log/Lock Manager handling locks and log records, plus Space Allocation. The client reads and writes pages via page references through NFS to the database]
Client Cache Consistency
[Figure: taxonomy of cache consistency algorithms - Avoidance-Based Algorithms vs. Detection-Based Algorithms, with variants such as Asynchronous and Deferred]
Object Server
- Complex server design
- Relatively simple client design
- Fine-grained concurrency control
- Reduces data movement; relatively insensitive to clustering
- Sensitive to client buffer pool size

Conclusions:
- No clear winner
- Depends on object size and the application's object access pattern
- File server ruled out by poor NFS performance
Key Problems of Distributed DBMS
[Figure: mutual influences among distributed DB design, query processing, transaction management, concurrency control (lock), reliability (log), and deadlock management]
Evaluation Criteria
- Cost metrics for: network traffic, query processing, transaction mgmt
- A system-wide goal: maximize throughput or minimize latency
(One concrete formulation of such a cost metric follows below.)
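One common way to make these cost metrics concrete is a weighted-sum cost function over the resources a plan consumes. The formulation below follows the classic distributed-database textbook style; the weight symbols are illustrative and not taken from the slides:

```latex
% Total cost of executing a distributed query/transaction plan:
% weighted sum of CPU work, disk I/O, messages sent, and bytes shipped.
T_{total} = T_{CPU}\cdot\#insts + T_{I/O}\cdot\#ios + T_{MSG}\cdot\#msgs + T_{TR}\cdot\#bytes
```

Minimizing T_total trades local processing off against network traffic; a latency-oriented goal would instead weight only the resources on the plan's critical path.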
Approaches:
- Store objects in relations and use RDBMS strategies to distribute data
  - All objects are stored in binary relations [OID, attr-value] (see the sketch below)
  - One relation per class attribute
- Use nested relations to store complex (multi-class) objects
Evaluation Criteria:
- Affinity metric (maximize)
- Cost metric (minimize)
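To make the binary-relation decomposition concrete, here is a minimal sketch (the class and attribute names are invented for illustration): every attribute of a class becomes its own [OID, attr-value] relation, and each such relation can then be fragmented and placed independently.

```python
# Minimal sketch: decompose objects into binary relations [OID, attr-value],
# one relation per class attribute. Names here are illustrative only.
from collections import defaultdict

def decompose(objects):
    """objects: dict OID -> {attribute name: value}.
    Returns one binary relation (list of (OID, value) pairs) per attribute."""
    relations = defaultdict(list)
    for oid, attrs in objects.items():
        for attr, value in attrs.items():
            relations[attr].append((oid, value))
    return relations

employees = {
    1: {"name": "Ann", "dept": "Sales"},
    2: {"name": "Bob", "dept": "R&D"},
}
for attr, rel in decompose(employees).items():
    print(attr, rel)   # e.g. name [(1, 'Ann'), (2, 'Bob')]
```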
[Figure: fragmentation of a class hierarchy - Class C is placed at Node2, its subclass X at Node1, and subclass Y at Node2, yielding Fragment #1, Fragment #2, and Fragment #3]
Replication Options
- Objects
- Classes (collections of objects)
- Methods (including remote methods)
- Class/Type specifications
[Figure: replicated vs. zoned data placement]
Processing Timelines
[Figure: a distributed plan in which Node 1 evaluates Select A.x > 100, Node 2 provides Rel B, and the two nodes join A and B on y; in a three-node variant, Node 3 computes the Union of partial results. The timelines for Nodes 1-3 show their work overlapping in time]
(A sketch of such overlapped fragment execution follows below.)
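As a rough illustration of why the timelines overlap, here is a minimal sketch (relation contents and the per-node functions are invented) that runs the two fragment scans in parallel threads before joining, mimicking Node 1 and Node 2 working concurrently:

```python
# Minimal sketch: overlap per-node work with threads, then join the results.
# Relations and predicates are invented for illustration.
from concurrent.futures import ThreadPoolExecutor

A = [{"x": 150, "y": 1}, {"x": 50, "y": 2}, {"x": 200, "y": 2}]
B = [{"y": 1, "z": "p"}, {"y": 2, "z": "q"}]

def node1_select():              # Node 1: Select A.x > 100
    return [t for t in A if t["x"] > 100]

def node2_scan():                # Node 2: produce Rel B
    return B

with ThreadPoolExecutor() as pool:
    fa, fb = pool.submit(node1_select), pool.submit(node2_scan)
    a, b = fa.result(), fb.result()

# Join A and B on y, at whichever node receives both fragments
joined = [{**ta, **tb} for ta in a for tb in b if ta["y"] == tb["y"]]
print(joined)
```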
Distributed Query Processing
[Figure: at the control site, query decomposition against the Global schema reduces the query to logical algebraic expressions; mapping against the Fragment schema yields a fragment query; global optimization uses the Statistics on fragments; the resulting plans then run at the local sites against each Local schema]

Cost function
- Should consider object size, structure, location, indexes, etc.
- Breaks object encapsulation to obtain this info?
- Objects can reflect their access cost estimate (see the sketch below)
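One reading of the last point, sketched below with invented classes and numbers: instead of the optimizer breaking encapsulation to inspect an object's layout, each object (or class) exposes a cost-estimate method that the global optimizer can call.

```python
# Minimal sketch: objects expose their own access-cost estimate so the
# global optimizer need not break encapsulation. Numbers are invented.

class StoredObject:
    def __init__(self, size_pages, remote, indexed):
        self.size_pages = size_pages
        self.remote = remote          # stored at another node?
        self.indexed = indexed        # attribute index available?

    def access_cost(self):
        """Reflect an access cost estimate: I/O plus network transfer."""
        io = 1 if self.indexed else self.size_pages   # index probe vs. scan
        net = self.size_pages * 10 if self.remote else 0
        return io + net

candidates = [StoredObject(40, remote=False, indexed=False),
              StoredObject(40, remote=True, indexed=True)]
best = min(candidates, key=lambda o: o.access_cost())
print(best.access_cost())   # optimizer picks the cheaper access path
```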
TM in DDBS
The Transaction Manager achieves the transaction ACID properties by using:
- Concurrency Control (Isolation)
- Recovery Protocol (Durability), built on the Log Manager and Buffer Manager
- Commit/Abort Protocol (Atomicity)
There are strong algorithmic dependencies among these components.
Taxonomy of concurrency control algorithms:
- Pessimistic
  - Locking: Centralized, Primary Copy, Distributed
  - Timestamp Ordering: Basic, MultiVersion, Conservative
  - Hybrid
- Optimistic
  - Locking
  - Timestamp Ordering
Phases of pessimistic transaction execution: validate -> read -> compute -> write
Phases of optimistic transaction execution: read -> compute -> validate -> write
(A sketch of the optimistic validate step follows below.)
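To illustrate where the validate phase sits, here is a minimal sketch of backward validation, one standard way to implement the optimistic validate step (the data structures are simplified and invented): a committing transaction passes only if its read set does not overlap the write sets of transactions that committed while it was running.

```python
# Minimal sketch of optimistic execution with backward validation.
# A transaction reads and computes freely, then validates before writing.

class OptimisticTxn:
    def __init__(self, start_ts):
        self.start_ts = start_ts
        self.read_set, self.write_buffer = set(), {}

    def read(self, db, key):
        self.read_set.add(key)
        return self.write_buffer.get(key, db.get(key))

    def write(self, key, value):          # buffered until commit
        self.write_buffer[key] = value

def validate_and_commit(txn, db, committed_log, now_ts):
    """committed_log: list of (commit_ts, write_set) of committed txns."""
    for commit_ts, wset in committed_log:
        if commit_ts > txn.start_ts and txn.read_set & wset:
            return False                  # conflict: abort and retry
    db.update(txn.write_buffer)           # write phase
    committed_log.append((now_ts, set(txn.write_buffer)))
    return True

db, log = {"x": 1}, []
t = OptimisticTxn(start_ts=0)
t.write("x", t.read(db, "x") + 1)
print(validate_and_commit(t, db, log, now_ts=1))  # True; db["x"] is now 2
```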
Isolation
Two-Phase Locking (2PL)
[Figure: number of locks held over the transaction duration. Ordinary 2PL: locks are obtained during the growing phase up to the LOCK POINT, then released during the shrinking phase between the lock point and END. Strict 2PL: each lock is held from the start of the data item's use, and all locks are released only at END]
(A minimal lock-table sketch follows below.)
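The defining rule is that all lock acquisitions precede the first release. A minimal sketch (the lock table is a toy in-memory dictionary, not a real lock manager):

```python
# Minimal sketch of two-phase locking inside one transaction:
# every lock is obtained before the first unlock (the lock point).

class TwoPhaseTxn:
    def __init__(self, lock_table):
        self.lock_table = lock_table      # shared dict: item -> owning txn
        self.held, self.shrinking = set(), False

    def lock(self, item):
        assert not self.shrinking, "2PL violated: lock after first unlock"
        while self.lock_table.setdefault(item, self) is not self:
            pass                          # toy spin-wait; real systems queue/block
        self.held.add(item)

    def unlock_all(self):                 # shrinking phase starts here
        self.shrinking = True             # strict 2PL: call only at commit/abort
        for item in self.held:
            del self.lock_table[item]
        self.held.clear()

locks = {}
t = TwoPhaseTxn(locks)
t.lock("a"); t.lock("b")                  # growing phase
# ... read/write items a and b ...
t.unlock_all()                            # shrinking phase (at END for strict 2PL)
```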
Centralized 2PL
Communication structure (Coordinating TM and Central Site LM):
1. Lock Request
2. Lock Granted
3. Operation
4. End of Operation
5. Release Locks
Distributed 2PL
Communication structure (Coordinating TM and Participating LMs):
1. Lock Request
2. Operation
3. End of Operation
4. Release Locks
Isolation
Distributed deadlock example:
- Site X: T2_x holds lock Lb and waits for T1_x to release Lc; T1_x holds lock Lc, needs item a, and therefore waits for T1's agent on site Y to complete.
- Site Y: T1_y holds lock La and waits for T2_y to release Ld; T2_y holds lock Ld, needs item b, and therefore waits for T2's agent on site X to complete.
The global wait-for cycle T2_x -> T1_x -> T1_y -> T2_y -> T2_x spans both sites, so neither site can detect the deadlock from its local wait-for graph alone (see the cycle-detection sketch below).
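Detecting such a deadlock requires combining the local wait-for edges into a global wait-for graph and searching it for cycles. A minimal sketch, with the edges taken from the example above:

```python
# Minimal sketch: global deadlock detection by finding a cycle in the
# union of the sites' wait-for graphs. Edges are from the example above.

def find_cycle(wait_for):
    """wait_for: dict mapping a waiting txn to the txns it waits for."""
    def dfs(node, path, visited):
        visited.add(node)
        path.append(node)
        for nxt in wait_for.get(node, ()):
            if nxt in path:                      # back edge -> cycle found
                return path[path.index(nxt):] + [nxt]
            if nxt not in visited:
                cycle = dfs(nxt, path, visited)
                if cycle:
                    return cycle
        path.pop()
        return None

    visited = set()
    for node in wait_for:
        if node not in visited:
            cycle = dfs(node, [], visited)
            if cycle:
                return cycle
    return None

global_wfg = {"T2x": ["T1x"], "T1x": ["T1y"],
              "T1y": ["T2y"], "T2y": ["T2x"]}
print(find_cycle(global_wfg))   # ['T2x', 'T1x', 'T1y', 'T2y', 'T2x']
```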
Atomicity
2-Phase Commit (2PC)
Communication structure (Coordinator and Participating Sites; who participates depends on the CC alg.):
1. Prepare (coordinator -> participants)
2. Vote-Commit or Vote-Abort (participants -> coordinator)
3. Global-Abort or Global-Commit (coordinator writes the log entry, then informs the participants)
4. ACK (participants -> coordinator)
Other communication structures are possible: linear, distributed.
(A coordinator-side sketch follows below.)
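A minimal sketch of the coordinator side (participants are modeled as plain objects; real participants would force-write their own log records before voting):

```python
# Minimal sketch of the coordinator side of 2-Phase Commit.
# Participants are modeled as objects with prepare/commit/abort/ack methods.

def two_phase_commit(participants, log):
    # Phase 1: prepare
    votes = [p.prepare() for p in participants]      # "commit" or "abort"

    # Coordinator logs the global decision BEFORE telling anyone
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    log.append("global-" + decision)

    # Phase 2: broadcast the decision and collect ACKs
    for p in participants:
        (p.commit if decision == "commit" else p.abort)()
    acks = [p.ack() for p in participants]
    return decision

class Participant:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
    def prepare(self):
        # A real participant force-writes a ready/abort record here
        return "commit" if self.can_commit else "abort"
    def commit(self): pass
    def abort(self): pass
    def ack(self): return True

print(two_phase_commit([Participant(), Participant()], log=[]))  # commit
```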
2PC State Transitions
Atomicity
[Figure: coordinator states Initial -> Wait (after sending 1. Prepare) -> Commit or Abort (after collecting 2. Vote-Commit or Vote-Abort and sending 3. Global-Commit or Global-Abort); participant states Initial -> Ready (after voting) -> Commit or Abort (after the global decision, answered with 4. ACK)]

Advantages:
- Preserves atomicity
- All processes are synchronous within one state transition

Disadvantages:
- Many, many messages
- If failure occurs, the 2PC protocol blocks!

Attempted solutions for the blocking problem:
1) Termination Protocol
2) 3-Phase Commit
3) Quorum 3P Commit
Failures in distributed DBS:
- Transaction failure
- Node failure
- Media failure
- Network failure (partitions, each containing 1 or more sites)

Issues to be addressed, and who addresses each problem:
- How to continue service: Termination Protocols
- How to maintain ACID properties while providing continued service: Modified Concurrency Control & Commit/Abort Protocols
- How to ensure ACID properties after recovery from the failure(s): Recovery Protocols, Termination Protocols, & Replica Control Protocols
Termination Protocols
Atomicity, Consistency
[Figure: the 2PC state diagram (Coordinating TM: Initial -> Wait -> Commit/Abort; Participating Sites: Initial -> Ready -> Commit/Abort, with messages 1. Prepare, 2. Vote-Commit or Vote-Abort, 3. Global-Abort or Global-Commit, 4. ACK), annotated with the timeout states: wherever a site can time out waiting for a message, the termination protocol must decide how to finish the transaction]
Consistency
[Figure: number of locks held over the transaction duration when updating replicas. STRICT UPDATE: obtain locks after BEGIN, run 2-Phase Commit at the COMMIT POINT, release locks at END. LAZY UPDATE: the transaction commits first, and the remaining replica updates are propagated after the commit point]
Consistency
Read-One-Write-All (ROWA)
- Part of the Concurrency Control Protocol and the 2-Phase Commit Protocol
- CC locks all copies
- 2PC propagates the updated values with the 2PC messages (or an update propagation phase is inserted between the wait and commit states for those nodes holding an updateable value)
(See the sketch below.)
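A minimal sketch of the ROWA idea (replicas, locking, and 2PC are reduced to their effect on in-memory copies): a read can be served by any single copy, while a write must reach every copy, which is why the update can ride along with the 2PC messages.

```python
# Minimal sketch of Read-One-Write-All over in-memory "replicas".

replicas = [{"x": 1}, {"x": 1}, {"x": 1}]   # one copy per node

def read_one(key, preferred=0):
    # Any single (e.g., local) copy suffices for a read
    return replicas[preferred][key]

def write_all(key, value):
    # CC locks all copies; the new value reaches every node
    # (in a real system, carried by the 2PC messages)
    for copy in replicas:
        copy[key] = value

write_all("x", 42)
assert all(copy["x"] == 42 for copy in replicas)
print(read_one("x"))   # 42, served from a single copy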
Consistency
Atomicity, Durability
Network partitions:
[Figure: nodes N1-N8 on a network split by a failure, e.g. Partition #1 (N1, N3, N4, N8) and Partition #3 (N5, N6, N7)]
Issues:
- Termination of interrupted transactions
- Partition integration upon recovery from a network failure
- Data availability while the failure is ongoing
Quorums
ACID
Advantages:
- More transactions can be executed during site failure and network failure (and still retain ACID properties)
Disadvantages:
- Many messages are required to establish a quorum
- Necessity for a read-quorum slows down read operations
- Not quite sufficient (failures are not clean)
Isolation
Simple example: N = 8 copies, read quorum Nr = 4, write quorum Nw = 5, so Nr + Nw > N and Nw + Nw > N: every read quorum intersects every write quorum, and two write quorums always intersect.
ACID
Commit/abort quorums: with N = 7, either Nc = 4 and Na = 4, or Nc = 5 and Na = 3; in both cases Nc + Na > N, so a commit quorum and an abort quorum always intersect and the system can never decide both ways (checked in the sketch below).
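The intersection constraints stated above are easy to verify mechanically. A minimal sketch, using the slide's own numbers:

```python
# Minimal sketch: check the standard quorum intersection constraints
# used by quorum-based replica control and quorum commit.

def valid_rw_quorums(n, nr, nw):
    # A read quorum must intersect every write quorum,
    # and two write quorums must intersect (no conflicting writes).
    return nr + nw > n and 2 * nw > n

def valid_commit_abort_quorums(n, nc, na):
    # A commit quorum and an abort quorum must intersect, so the
    # system can never decide both commit and abort.
    return nc + na > n

print(valid_rw_quorums(8, 4, 5))            # True: the slide's example
print(valid_commit_abort_quorums(7, 4, 4))  # True
print(valid_commit_abort_quorums(7, 5, 3))  # True
print(valid_rw_quorums(8, 4, 4))            # False: two write quorums may miss each other
```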
Conclusions
Nearly all commercial relational database systems offer some form of distribution
- Client/server at a minimum