Distributed Systems: Replication and Consistency: Fall 2013
Jussi Kangasharju
Chapter Outline
- Replication
- Consistency models
- Distribution protocols
- Consistency protocols
[Figure labels: user A, user C, object]
- Dependability requirements
  - availability
    - at least some server somewhere
    - wireless connections => a local cache
  - reliability (correctness of data)
    - fault tolerance against data corruption
    - fault tolerance against faulty operations
- Performance
  - response time, throughput
  - scalability
    - increasing workload
    - geographic expansion
  - mobile workstations => a local cache
- Price to be paid: consistency maintenance
  - performance vs. required level of consistency
    (updates need not always be immediately visible)
[Figure: (a) A sequentially consistent data store. (b) A data store that is not sequentially consistent.]
Process P1      Process P2      Process P3
x = 1;          y = 1;          z = 1;
print(y, z);    print(x, z);    print(x, y);
Initial values: x = y = z = 0
All statements are assumed to be indivisible.
Execution sequences
- 720 possible execution sequences (several of which violate program order)
- 90 valid execution sequences
(a)             (b)             (c)             (d)
x = 1;          x = 1;          y = 1;          y = 1;
print(y, z);    y = 1;          z = 1;          x = 1;
y = 1;          print(x, z);    print(x, y);    z = 1;
print(x, z);    print(y, z);    print(x, z);    print(x, z);
z = 1;          z = 1;          x = 1;          print(y, z);
print(x, y);    print(x, y);    print(y, z);    print(x, y);
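The counts of 720 and 90 can be checked by brute force. The sketch below is only an illustration, not part of the original slides: it assumes the three two-statement processes P1 (x = 1; print(y, z)), P2 (y = 1; print(x, z)) and P3 (z = 1; print(x, y)), and the names `PROGRAMS`, `run` and `respects_program_order` are hypothetical helpers.

```python
from itertools import permutations

# Each process runs two indivisible statements in program order.
PROGRAMS = {
    "P1": [("write", "x"), ("print", ("y", "z"))],
    "P2": [("write", "y"), ("print", ("x", "z"))],
    "P3": [("write", "z"), ("print", ("x", "y"))],
}


def run(schedule):
    """Execute one interleaving (a list of process names) and return the printed bits."""
    store = {"x": 0, "y": 0, "z": 0}
    next_stmt = {p: 0 for p in PROGRAMS}
    printed = []
    for p in schedule:
        kind, arg = PROGRAMS[p][next_stmt[p]]
        next_stmt[p] += 1
        if kind == "write":
            store[arg] = 1
        else:
            printed.append("".join(str(store[v]) for v in arg))
    return "".join(printed)


def respects_program_order(order):
    """True if every process's first statement precedes its second."""
    return all(order.index((p, 0)) < order.index((p, 1)) for p in PROGRAMS)


statements = [(p, i) for p in PROGRAMS for i in range(2)]
all_orders = list(permutations(statements))          # 6! orderings
valid = [o for o in all_orders if respects_program_order(o)]

print(len(all_orders), len(valid))                   # 720 90
print(run(["P1", "P1", "P2", "P2", "P3", "P3"]))     # 001011
```

The last call replays an interleaving in which each process finishes before the next one starts; the output is the concatenation of the printed values in time order.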
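Necessary condition (causal consistency): writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.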
[Figure: (a) A violation of a causally-consistent store. (b) A correct sequence of events in a causally-consistent store.]
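Necessary condition (FIFO consistency): writes done by a single process are seen by all other processes in the order in which they were performed; writes from different processes may be seen in a different order by different processes.
Guarantee:
- writes from a single source must arrive in order
- no other guarantees
Easy to implement!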
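One common way to obtain the "writes from a single source arrive in order" guarantee is to tag each write with a per-sender sequence number and hold back anything that arrives early. The following is a minimal sketch under that assumption; the class `FifoReplica` and its fields are hypothetical and not taken from the slides.

```python
from collections import defaultdict


class FifoReplica:
    """Applies writes from each sender in the order that sender issued them."""

    def __init__(self):
        self.expected = defaultdict(int)  # next sequence number per sender
        self.held = defaultdict(dict)     # early (out-of-order) writes per sender
        self.applied = []                 # (sender, seq) in the order applied
        self.data = {}

    def receive(self, sender, seq, key, value):
        self.held[sender][seq] = (key, value)
        # Apply every consecutive write from this sender that is now available.
        while self.expected[sender] in self.held[sender]:
            n = self.expected[sender]
            k, v = self.held[sender].pop(n)
            self.data[k] = v
            self.applied.append((sender, n))
            self.expected[sender] = n + 1


replica = FifoReplica()
replica.receive("P1", 1, "x", "b")   # arrives early, held back
replica.receive("P2", 0, "x", "c")   # other senders are not delayed
replica.receive("P1", 0, "x", "a")   # releases P1's writes 0 and 1 in order
print(replica.applied)
```

Nothing orders writes from different senders relative to each other, which matches the "no other guarantees" remark: another replica could apply P2's write after both of P1's and end up with a different final value.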
- Implementation method
  - control variable
    - synchronization / locking
  - operation
    - synchronize
    - lock/unlock and synchronize
P1: Acq(Lx) W(x)a Acq(Ly) W(y)b Rel(Lx) Rel(Ly)
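The event sequence above can be replayed against a toy model in which every shared variable is guarded by its own lock and a write only becomes visible to others when that lock is released. This is a simplified sketch for illustration: `GuardedStore` and its methods are hypothetical, and unlike a real lock-based consistency protocol, readers here do not have to acquire the lock themselves.

```python
class GuardedStore:
    """Toy shared store: variable v is guarded by its own lock Lv."""

    def __init__(self):
        self.published = {}  # values visible to every process
        self.owner = {}      # variable -> process holding its lock (or None)
        self.pending = {}    # (process, variable) -> value not yet released

    def acquire(self, proc, var):
        if self.owner.get(var) is not None:
            raise RuntimeError(f"L{var} is already held by {self.owner[var]}")
        self.owner[var] = proc

    def write(self, proc, var, value):
        if self.owner.get(var) != proc:
            raise RuntimeError(f"{proc} must acquire L{var} before writing {var}")
        self.pending[(proc, var)] = value

    def release(self, proc, var):
        if self.owner.get(var) != proc:
            raise RuntimeError(f"{proc} does not hold L{var}")
        # Releasing the lock is the synchronization point: publish the write.
        if (proc, var) in self.pending:
            self.published[var] = self.pending.pop((proc, var))
        self.owner[var] = None

    def read(self, var):
        return self.published.get(var)  # None until a writer has released


store = GuardedStore()
store.acquire("P1", "x")
store.write("P1", "x", "a")
store.acquire("P1", "y")
store.write("P1", "y", "b")
store.release("P1", "x")
after_rel_x = (store.read("x"), store.read("y"))
store.release("P1", "y")
after_rel_y = (store.read("x"), store.read("y"))
print(after_rel_x, after_rel_y)
```

After Rel(Lx) only x is guaranteed to be up to date; y becomes visible once Ly is released as well.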
Consistency      Description
Strict           Absolute time ordering of all shared accesses matters.
Linearizability  All processes see all shared accesses in the same order.
                 Accesses are furthermore ordered according to a
                 (nonunique) global timestamp.
Sequential       All processes see all shared accesses in the same order.
                 Accesses are not ordered in time.
FIFO             All processes see writes from each other in the order they
                 were performed. Writes from different processes may not always
                 be seen in the same order by other processes.
Consistency      Description
Release          All shared data are made consistent when the critical
                 section is exited.
- Wanted
  - eventual consistency
  - consistency seen by one single client
[Figure: A monotonic-read consistent data store.]
[Figure: A monotonic-write consistent data store.]
[Figure: A writes-follow-reads consistent data store.]
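As an illustration of consistency seen by a single client, monotonic reads can be enforced on the client side by remembering the newest version the client has already read and refusing replicas that lag behind it. The sketch below assumes per-key integer versions; `Replica` and `Session` are hypothetical names, not part of the slides.

```python
class Replica:
    """A local copy of the data store; each key maps to (version, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, version, value):
        self.data[key] = (version, value)


class Session:
    """Client-side guard that enforces monotonic reads for one client."""

    def __init__(self):
        self.seen = {}  # key -> highest version this client has read

    def read(self, replica, key):
        version, value = replica.data.get(key, (0, None))
        if version < self.seen.get(key, 0):
            raise LookupError(
                f"{replica.name} is behind: has v{version}, client saw v{self.seen[key]}"
            )
        self.seen[key] = version
        return value


l1, l2 = Replica("L1"), Replica("L2")
l1.write("x", 2, "new")
l2.write("x", 1, "old")   # L2 has not yet received version 2

session = Session()
first = session.read(l1, "x")        # client now depends on version 2
try:
    session.read(l2, "x")            # would go back in time
    stale_read_rejected = False
except LookupError:
    stale_read_rejected = True
print(first, stale_read_rejected)
```

A real system would more likely wait for the replica to catch up or redirect the read rather than raise an error; the exception only makes the violation visible here.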
- Replica placement
- Update propagation
- Epidemic protocols
[Figure: replica placement, showing permanent replicas (mirror), server-initiated replicas (servers), and client-initiated replicas (clients)]
- Issues:
  - improve response time
  - reduce server load; reduce data communication load
    => bring files to servers placed in the proximity of clients
- Where and when should replicas be created/deleted?
- For example:
  - determine two threshold values for each (server, file): rep > del
  - #[req(S,F)] > rep => create a new replica
  - #[req(S,F)] < del => delete the file (replica)
  - otherwise: the replica is allowed to be migrated
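The threshold rule above can be written down as a small decision function. This is only a sketch: the function name `replica_action` and the returned strings are hypothetical, and `request_count` stands for #[req(S,F)].

```python
def replica_action(request_count, rep, delete):
    """Decide what server S should do with its copy of file F.

    request_count: number of requests for F seen at S, i.e. #[req(S,F)]
    rep:           replication threshold
    delete:        deletion threshold (must be below rep)
    """
    if not rep > delete:
        raise ValueError("thresholds must satisfy rep > del")
    if request_count > rep:
        return "create a new replica"
    if request_count < delete:
        return "delete the replica"
    return "migration allowed"


for count in (5, 50, 500):
    print(count, replica_action(count, rep=100, delete=10))
```

Keeping a gap between the two thresholds avoids creating and deleting the same replica repeatedly when the request rate hovers around a single cut-off.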
- Consistency: responsibility of the data store
- Push
  - a server sends updates to other replica servers
  - typically used between permanent and server-initiated replicas
- Pull
  - client asks for update / validation confirmation
  - typically used by client caches
    - client to server: {data X, timestamp t_i, OK?}
    - server to client: OK or {data X, timestamp t_(i+k)}
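The pull-style validation exchange can be sketched as follows. The classes `Server` and `ClientCache` and the reply tags "OK" / "UPDATE" are assumptions made for this example; the slides only specify what information flows in each direction.

```python
class Server:
    def __init__(self):
        self.store = {}  # name -> (timestamp, data)

    def update(self, name, timestamp, data):
        self.store[name] = (timestamp, data)

    def validate(self, name, client_timestamp):
        """Answer 'OK' if the client's copy is current, else send the newer copy."""
        timestamp, data = self.store[name]
        if client_timestamp is not None and client_timestamp >= timestamp:
            return ("OK", None, None)
        return ("UPDATE", timestamp, data)


class ClientCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}       # name -> (timestamp, data)
        self.fetches = 0      # how many times fresh data had to be transferred

    def get(self, name):
        cached_ts = self.cache[name][0] if name in self.cache else None
        status, ts, data = self.server.validate(name, cached_ts)
        if status == "UPDATE":
            self.cache[name] = (ts, data)
            self.fetches += 1
        return self.cache[name][1]


server = Server()
server.update("X", 1, "v1")
client = ClientCache(server)
a = client.get("X")          # cache miss: data is transferred
b = client.get("X")          # validated with "OK": no data transferred
server.update("X", 4, "v2")
c = client.get("X")          # stale copy: the server returns the newer version
print(a, b, c, client.fetches)
```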
Issue                     Push-based                                  Pull-based
Messages sent             Update (and possibly fetch update later)    Poll and update
Response time at client   Immediate (or fetch-update time)            Fetch-update time
- Data communication
  - LAN: push & multicasting, pull & unicasting
  - wide-area network: unicasting
- Information propagation: epidemic protocols
  - a node with an update: infective
  - a node not yet updated: susceptible
  - a node not willing to spread the update: removed
  - propagation: anti-entropy
    - P picks Q at random
    - three information exchange alternatives:
      P => Q or P <= Q or P <=> Q
  - propagation: gossiping
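The three anti-entropy exchange alternatives can be sketched with replicas modelled as dictionaries of timestamped values. The function names `pick_peer` and `anti_entropy`, and the rule "the entry with the newer timestamp wins", are assumptions made for this illustration.

```python
import random


def pick_peer(me, nodes, rng=random):
    """P picks another node Q uniformly at random."""
    return rng.choice([n for n in nodes if n != me])


def copy_newer(src, dst):
    """Copy every entry from src that dst lacks or holds in an older version."""
    for key, (ts, value) in src.items():
        if key not in dst or dst[key][0] < ts:
            dst[key] = (ts, value)


def anti_entropy(p, q, mode):
    """Exchange state between replicas p and q (dicts: key -> (timestamp, value)).

    mode 'push':      P => Q   (P sends its updates to Q)
    mode 'pull':      P <= Q   (P fetches updates from Q)
    mode 'push-pull': P <=> Q  (both directions)
    """
    if mode not in ("push", "pull", "push-pull"):
        raise ValueError(mode)
    if mode in ("push", "push-pull"):
        copy_newer(p, q)
    if mode in ("pull", "push-pull"):
        copy_newer(q, p)


p = {"a": (2, "a2"), "b": (1, "b1")}
q = {"a": (1, "a1"), "c": (3, "c3")}
anti_entropy(p, q, "push")
print(q)    # q now has the newer "a" and "b"; p is unchanged
anti_entropy(p, q, "pull")
print(p)    # p now also has "c"
```

In a full simulation each node would call `pick_peer` once per round and then `anti_entropy` with the chosen peer.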
The problem
1. server P deletes data D => all information on D is destroyed
   [server Q has not yet deleted D]
2. communication P <=> Q => P receives D (as new data)
A solution: deletion is a special update (death certificate)
- allows normal update communication
- a new problem: cleaning up of death certificates
- solution: time-to-live for the certificate
  - after the TTL has elapsed: a normal server deletes the certificate
  - some special servers maintain the historical certificates forever (for what purpose?)
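A minimal sketch of death certificates, assuming replicas are dictionaries of timestamped entries and that a deletion is recorded as a timestamped marker instead of removing the entry. The names `TOMBSTONE`, `delete`, `merge`, `live_items` and `purge` are hypothetical.

```python
TOMBSTONE = object()  # marker value standing in for a death certificate


def delete(store, key, now):
    """Record the deletion as an ordinary timestamped update."""
    store[key] = (now, TOMBSTONE)


def merge(src, dst):
    """Normal update propagation: the entry with the newer timestamp wins."""
    for key, (ts, value) in src.items():
        if key not in dst or dst[key][0] < ts:
            dst[key] = (ts, value)


def live_items(store):
    """The data as seen by users, with deleted entries hidden."""
    return {k: v for k, (ts, v) in store.items() if v is not TOMBSTONE}


def purge(store, now, ttl):
    """Drop death certificates whose time-to-live has elapsed."""
    expired = [k for k, (ts, v) in store.items() if v is TOMBSTONE and now - ts > ttl]
    for key in expired:
        del store[key]


p = {"D": (1, "data")}
q = {"D": (1, "data")}
delete(p, "D", now=2)
merge(q, p)            # Q's old copy does not resurrect D at P
merge(p, q)            # the certificate spreads to Q like any update
print(live_items(p), live_items(q))
purge(p, now=10, ttl=5)
print(p)
```

If P purges the certificate before every replica has seen it, a lagging replica can reintroduce D, which is one reason to keep certificates longer on a few servers.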
[Figure: Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.]
Sequential consistency
Read Your Writes
[Figure: The principle of the primary-backup protocol.]
Mobile workstations!
Name service overhead!
Read
- Collect a read quorum
- Read from any up-to-date replica (the newest timestamp)
Write
- Collect a write quorum
- If there are insufficient up-to-date replicas, replace non-current replicas with current replicas (WHY?)
- Update all replicas belonging to the write quorum.
Notice: each replica may have a different number of votes assigned to it.
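The read and write steps can be sketched with weighted votes. The slides do not state the quorum sizes; the checks below use the usual weighted-voting constraints (read votes + write votes > total votes, and write votes > half the total), which are brought in here as an assumption. All names (`Replica`, `check_quorums`, `collect_quorum`, `quorum_read`, `quorum_write`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    votes: int
    version: int = 0
    value: object = None


def check_quorums(replicas, read_votes, write_votes):
    total = sum(r.votes for r in replicas)
    # Any read quorum must overlap any write quorum, and two write quorums must overlap.
    return read_votes + write_votes > total and 2 * write_votes > total


def collect_quorum(replicas, needed_votes):
    """Take reachable replicas in the given order until enough votes are gathered."""
    quorum, votes = [], 0
    for r in replicas:
        quorum.append(r)
        votes += r.votes
        if votes >= needed_votes:
            return quorum
    raise RuntimeError("not enough votes available")


def quorum_read(replicas, read_votes):
    quorum = collect_quorum(replicas, read_votes)
    return max(quorum, key=lambda r: r.version).value  # newest version wins


def quorum_write(replicas, write_votes, value):
    quorum = collect_quorum(replicas, write_votes)
    new_version = max(r.version for r in quorum) + 1
    for r in quorum:  # bring every member of the write quorum up to date
        r.version, r.value = new_version, value


replicas = [Replica(votes=2), Replica(votes=1), Replica(votes=1)]
assert check_quorums(replicas, read_votes=2, write_votes=3)
quorum_write(replicas, 3, "v1")                      # updates the first two replicas
result = quorum_read(list(reversed(replicas)), 2)    # quorum: stale replica + current one
print(result, [r.version for r in replicas])
```

The read quorum in the example contains one stale replica, but the overlap with the write quorum guarantees that at least one member holds the newest version.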