23 Distributed OLTP Databases
ADMINISTRIVIA
LAST CLASS
System Architectures
→ Shared-Memory, Shared-Disk, Shared-Nothing
Partitioning/Sharding
→ Hash, Range, Round Robin
Transaction Coordination
→ Centralized vs. Decentralized
OLTP VS. OLAP
DECENTRALIZED COORDINATOR
[Figure: the application server sends the begin request to a home partition, which acts as the coordinator; queries are sent directly to the partitions P1–P4 that own the data; on the commit request, the home partition asks the other partitions whether it is safe to commit.]
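A minimal sketch of this flow from the application server's point of view (the Partition API here is hypothetical, not from the slides):

# Decentralized coordination (sketch; Partition API is hypothetical).
# The partition that receives the begin request acts as the coordinator.
def run_txn(home, partitions, queries):
    txn = home.begin()                              # begin request goes to the home partition
    for q in queries:
        partitions[q.partition_id].execute(txn, q)  # queries go straight to the owning partitions
    # Commit request: the home partition asks the other partitions
    # whether it is safe to commit before answering the application.
    return home.commit_request(txn)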
OBSERVATION
IMPORTANT ASSUMPTION
TODAY'S AGENDA
ATOMIC COMMIT PROTOCOL
Examples:
→ Two-Phase Commit
→ Three-Phase Commit (not used)
→ Paxos
→ Raft
→ ZAB (Apache ZooKeeper)
→ Viewstamped Replication
TWO-PHASE COMMIT (SUCCESS)
[Figure: the application server sends the commit request to the coordinator (Node 1). Phase 1 (Prepare): the coordinator sends Prepare to the participants (Node 2, Node 3), and each replies OK. Phase 2 (Commit): the coordinator sends Commit, each participant replies OK, and the coordinator returns Success! to the application server.]
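A minimal sketch of the coordinator side of this exchange, assuming hypothetical participant stubs with prepare()/commit()/abort() methods; a real implementation would also log each decision before sending Phase 2 messages.

# Two-phase commit, coordinator side (sketch; participant API is hypothetical).
def two_phase_commit(participants, txn_id):
    # Phase 1: Prepare -- every participant must vote OK.
    votes = [p.prepare(txn_id) for p in participants]
    if all(v == "OK" for v in votes):
        # Phase 2: Commit -- participants acknowledge, then report success.
        for p in participants:
            p.commit(txn_id)
        return "SUCCESS"
    # Any non-OK vote aborts the transaction everywhere.
    for p in participants:
        p.abort(txn_id)
    return "ABORTED"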
TWO-PHASE COMMIT (ABORT)
[Figure: the application server sends the commit request to the coordinator (Node 1). Phase 1 (Prepare): the coordinator sends Prepare to the participants, but Node 3 replies ABORT!. The coordinator immediately returns Aborted to the application server and then sends Phase 2 (Abort) to the participants, which acknowledge with OK.]
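The participant side, again as a hedged sketch (storage and logging calls are hypothetical): a participant may vote ABORT in Phase 1, as Node 3 does above, and must make its OK vote durable so it can honor the coordinator's Phase 2 decision.

# Two-phase commit, participant side (sketch; storage API is hypothetical).
class Participant:
    def __init__(self, storage):
        self.storage = storage
        self.prepared = set()

    def prepare(self, txn_id):
        # Vote ABORT if the local work cannot be committed (e.g., a conflict).
        if not self.storage.can_commit(txn_id):
            return "ABORT"
        self.storage.flush_log(txn_id)   # make the vote durable before replying
        self.prepared.add(txn_id)
        return "OK"

    def commit(self, txn_id):
        self.storage.apply(txn_id)       # Phase 2: make the changes visible
        self.prepared.discard(txn_id)
        return "OK"

    def abort(self, txn_id):
        self.storage.undo(txn_id)        # Phase 2: roll back local changes
        self.prepared.discard(txn_id)
        return "OK"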
2PC OPTIMIZATIONS
EARLY ACKNOWLEDGEMENT
[Figure: same exchange as the success case, except that the coordinator returns Success! to the application server as soon as every participant replies OK to Phase 1 (Prepare), before sending the Phase 2 (Commit) messages.]
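In code terms the optimization only moves where the client is told "Success!"; a hedged variant of the earlier coordinator sketch, assuming a hypothetical reply_to_client callback:

# Early acknowledgement (sketch): reply to the client once all Phase 1 votes
# are OK; the Phase 2 Commit messages are sent afterwards, since the outcome
# is already decided at that point.
def two_phase_commit_early_ack(participants, txn_id, reply_to_client):
    if all(p.prepare(txn_id) == "OK" for p in participants):
        reply_to_client("SUCCESS")        # acknowledge before Phase 2
        for p in participants:
            p.commit(txn_id)
    else:
        for p in participants:
            p.abort(txn_id)
        reply_to_client("ABORTED")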
TWO-PHASE COMMIT
PAXOS
[Figure: the application server sends the commit request to the proposer (Node 1). The proposer sends Propose to the acceptors (Nodes 2–4); one acceptor does not respond (X), but a majority of the acceptors reply Agree. The proposer then sends Commit, the majority replies Accept, and the proposer returns Success! to the application server.]
PAXOS
[Figure: time diagram with two proposers and a set of acceptors. The first proposer sends Propose(n) and the acceptors reply Agree(n). A second proposer then sends Propose(n+1). When the first proposer later sends Commit(n), the acceptors reply Reject(n, n+1) because they have already seen the higher proposal number n+1. The acceptors reply Agree(n+1) to the second proposer, whose Commit(n+1) is answered with Accept(n+1).]
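A minimal, hedged sketch of the acceptor logic behind this timeline (single value, heavily simplified; the names are illustrative, not a full Paxos implementation): an acceptor agrees only to the highest proposal number it has seen, which is why Commit(n) is rejected once Propose(n+1) has been agreed to.

# Simplified Paxos acceptor (sketch): tracks the highest proposal number seen.
class Acceptor:
    def __init__(self):
        self.promised_n = -1      # highest proposal number agreed to
        self.accepted = None      # (n, value) accepted so far, if any

    def on_propose(self, n):
        if n > self.promised_n:
            self.promised_n = n
            return ("AGREE", n, self.accepted)
        return ("REJECT", n, self.promised_n)      # e.g., Reject(n, n+1)

    def on_commit(self, n, value):
        if n >= self.promised_n:                   # only the newest proposal wins
            self.promised_n = n
            self.accepted = (n, value)
            return ("ACCEPT", n)
        return ("REJECT", n, self.promised_n)

The proposer treats the value as chosen only after a majority of acceptors answer ACCEPT.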
MULTI-PAXOS
2PC VS. PAXOS
Two-Phase Commit
→ Blocks if the coordinator fails after the prepare message is sent, until the coordinator recovers.
Paxos
→ Non-blocking as long as a majority of the participants are alive, provided there is a sufficiently long period without further failures.
REPLICATION
Design Decisions:
→ Replica Configuration
→ Propagation Scheme
→ Propagation Timing
→ Update Method
REPLICA CONFIGURATIONS
[Figure: two configurations. Master-Replica: all writes for partition P1 go to the master (Node 1), which propagates the changes to its replicas; reads may be served by the replicas. Multi-Master: each node is a master for its own copy of P1, so reads and writes may go to either Node 1 or Node 2, and the masters synchronize with each other.]
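A small routing sketch for these two configurations, with hypothetical node handles; it only shows where reads and writes are allowed to go, not how the copies are synchronized.

# Request routing (sketch). Master-Replica: writes must go to the master,
# reads may go to any copy. Multi-Master: any master accepts both.
import random

def route_master_replica(op, master, replicas):
    if op.is_write:
        return master
    return random.choice([master] + replicas)

def route_multi_master(op, masters):
    return random.choice(masters)      # every node accepts reads and writes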
K-SAFETY
PROPAGATION SCHEME
Propagation levels:
→ Synchronous (Strong Consistency)
→ Asynchronous (Eventual Consistency)
[Figure: Approach #1: Synchronous. The master sends the transaction's changes to the replica and asks it to flush them ("Commit? Flush?"); only after the replica answers "Flush!" does the master acknowledge the commit.]
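A hedged sketch of the two propagation levels on the master, assuming hypothetical replica handles; the synchronous path waits for the replica's flush acknowledgement before answering the client, the asynchronous path does not.

# Propagation scheme (sketch; replica API is hypothetical).
def commit_synchronous(txn, replicas):
    for r in replicas:
        r.send_changes(txn)
    for r in replicas:
        r.wait_for_flush_ack(txn)      # strong consistency: replica flushed first
    return "ACK"                       # only now acknowledge the client

def commit_asynchronous(txn, replicas):
    for r in replicas:
        r.send_changes_async(txn)      # eventual consistency: do not wait
    return "ACK"                       # acknowledge immediately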
PROPAGATION TIMING
ACTIVE VS. PASSIVE
CAP THEOREM
Proposed by Eric Brewer.
[Figure: the three properties, with providing all three at once marked Impossible:
→ Consistency: linearizability.
→ Availability: all up nodes can satisfy all requests.
→ Partition Tolerance: still operate correctly despite message loss.]
CAP CONSISTENCY
If the master says that the txn committed, then it should be immediately visible on replicas.
[Figure: an application server sends Set A=2 to the master (holding A=1, B=8). The master applies A=2 and propagates the change over the network to the replica; the replica applies A=2, and only then does the master send ACK back to the application server.]
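The property in the caption can be stated as a tiny check, assuming hypothetical get/set handles for the master and replica.

# Linearizable read-after-write (sketch): once the master acknowledges the
# write, a read anywhere must observe it.
def check_consistency(master, replica):
    master.set("A", 2)          # returns only after the master acknowledges
    assert replica.get("A") == 2, "replica returned a stale value"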
CAP AVAILABILITY
[Figure: the master is unreachable because of a network failure (X), but the application servers' requests are still answered by the replica: Read B returns B=8 and Read A returns A=1. All nodes that are still up continue to satisfy requests.]
CAP PARTITION TOLERANCE
[Figure: a network partition separates the master and the replica. The replica promotes itself to master, so there are now two masters, each accepting writes independently: one side sets A=2 while the other sets A=3. When the network partition heals, the two masters hold conflicting values for A.]
CAP FOR OLTP DBMSs
Traditional/NewSQL DBMSs
→ Stop allowing updates until a majority of nodes are
reconnected.
NoSQL DBMSs
→ Provide mechanisms to resolve conflicts after nodes are
reconnected.
OBSERVATION
FEDERATED DATABASES
FEDERATED DATABASE EXAMPLE
[Figure: the application server sends query requests to a middleware layer that uses connectors to reach the back-end DBMSs. In the second variant, the application server queries a single DBMS that reaches the other back-end DBMSs through foreign data wrappers (connectors).]
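A hedged sketch of the middleware approach in the first figure, with hypothetical connector objects; it simply forwards each piece of the query to the back-end DBMS that owns the data and combines the results.

# Federated query middleware (sketch): one connector per back-end DBMS.
class Middleware:
    def __init__(self, connectors):
        self.connectors = connectors   # e.g., {"relational": ..., "document": ...}

    def execute(self, subqueries):
        # Each sub-query names the back-end it must run on; the middleware
        # gathers the partial results before returning them to the application.
        results = []
        for backend, query in subqueries:
            results.append(self.connectors[backend].run(query))
        return results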
CONCLUSION
NEXT CLASS