Lecture 7.2 Consistency
CAP Theorem
• Proposed by Eric Brewer (late 90s)
o Later formally proved by Gilbert and Lynch (2002)
• In a distributed data store, we can satisfy at most 2 out of the 3
guarantees:
o Consistency: all nodes see same data at any time, or reads return
latest written value by any client
§ E.g., a bank transaction (a write) should be propagated to all
replicas before subsequent reads and writes are processed
o Availability: the system allows operations all the time, and
operations return quickly
§ E.g., At Amazon, each added millisecond of latency implies
a $6M yearly loss (2009)
o Partition-tolerance: the system continues to work in spite of
network partitions
§ E.g., Internet router outages
• Traditional RDBMSs
o Not replicated, so partition tolerance is not important
§ Provide strong consistency and availability
• For replicated storage, partition-tolerance is essential
o So, a replicated system has to choose between consistency and
availability
o Cassandra, Dynamo
§ Partition-tolerance, Availability, Eventual (weak) consistency
o BigTable, Spanner
§ Partition-tolerance, Consistency
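Note: To make the trade-off concrete, here is a minimal sketch (my own, not
from the lecture) of a toy replicated key-value store in which a read can favor
consistency (answer only when a quorum of replicas is reachable) or
availability (answer from whatever replica is reachable, possibly stale). The
class names, version counter, and quorum rule are all illustrative assumptions.

class PartitionError(Exception):
    """Raised when too few replicas are reachable to form a quorum."""

class Replica:
    def __init__(self):
        self.data = {}         # key -> (version, value)
        self.reachable = True  # False simulates being cut off by a partition

class ReplicatedStore:
    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]
        self.quorum = n // 2 + 1

    def write(self, key, version, value):
        # Best-effort: update every replica we can still reach.
        for r in self.replicas:
            if r.reachable:
                r.data[key] = (version, value)

    def read(self, key, favor="consistency"):
        reachable = [r for r in self.replicas if r.reachable]
        if favor == "consistency":
            # CP choice: refuse to answer unless a quorum is reachable.
            if len(reachable) < self.quorum:
                raise PartitionError("cannot reach a quorum")
            # Return the newest version found among the reachable replicas.
            return max((r.data[key] for r in reachable), key=lambda t: t[0])[1]
        # AP choice: always answer from some reachable replica, even if stale.
        return reachable[0].data[key][1]

store = ReplicatedStore(n=3)
store.write("balance", version=1, value=100)
store.replicas[1].reachable = False                  # a partition cuts off
store.replicas[2].reachable = False                  # two of the replicas
store.write("balance", version=2, value=50)          # reaches replica 0 only
print(store.read("balance", favor="availability"))   # 50 here, but a client
# that could only reach replica 1 or 2 would read the stale 100;
# read(..., favor="consistency") raises PartitionError instead of answering.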
Consistency model: a contract between a (distributed) data store and
processes, in which the data store specifies what the results of read and
write operations are in the presence of concurrency
Strict Consistency
• Any read on a data item ‘x’ returns a value corresponding to the result
of the most recent write on ‘x’ (regardless of where the write
occurred)
• All writes are instantaneously visible to all processes and absolute
global time order is maintained throughout the distributed system
o Not easy to achieve in the real world
Sequential consistency
• The result of any execution is the same as if the (read and write) operations
by all processes were executed in some sequential order, and the operations of
each individual process appear in this sequence in the order specified by its
program
Causal consistency
• Writes that are potentially causally related must be seen by all processes in
the same order; concurrent writes may be seen in a different order on different
machines
• Writes W(x)b and W(x)c are concurrent, so it is not required that all
processes see them in the same order
• W(x)b has a causal relationship with W(x)a but not with W(x)c
• W(x)a and W(x)b are causally related, so all processes must see them in the
same order
Note: Figure (b) reflects a situation that would not be acceptable for a
sequentially consistent store
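Note: A standard way to decide whether two writes are causally related or
concurrent is to compare vector clocks. The sketch below is my own
illustration; the clock values assigned to W(x)a, W(x)b, and W(x)c are
hypothetical but consistent with the scenario above (two processes P1 and P2).

def happens_before(vc_a, vc_b):
    # vc_a causally precedes vc_b if it is <= component-wise and not equal.
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

def concurrent(vc_a, vc_b):
    return not happens_before(vc_a, vc_b) and not happens_before(vc_b, vc_a)

# P1 performs W(x)a; P2 reads a and then performs W(x)b, so W(x)a -> W(x)b;
# P1 performs W(x)c without having seen b.
Wa = (1, 0)
Wb = (1, 1)
Wc = (2, 0)

print(happens_before(Wa, Wb))  # True:  all processes must see a before b
print(concurrent(Wb, Wc))      # True:  processes may see b and c in either order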
Linearizability
• "Linearizability" is the most common and intuitive definition; it
formalizes the behavior expected of a single server ("strong" consistency)
• An execution history is linearizable:
o If one can find a total order of all operations that matches real time
(for non-overlapping operations), and
o In which each read sees the value from the write preceding it in
the order
Note: A history is a record of client operations, each with arguments, return
value, invocation time, and completion time. A history is usually a trace of
what clients saw in an actual execution
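Note: As a concrete illustration (not part of the lecture), each operation in a
history can be recorded with its kind, value, invocation time, and completion
time. The Op record and the timestamps below are my own; the timestamps are
chosen to match the shape of example history 1 that follows.

from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    kind: str     # "W" for a write, "R" for a read
    value: int    # value written, or value the read returned
    start: float  # invocation time
    end: float    # completion time

# Example history 1 (below), with illustrative timestamps:
history1 = [
    Op("W", 1, start=0, end=4),
    Op("W", 2, start=6, end=10),
    Op("R", 2, start=2, end=9),   # overlaps both writes
    Op("R", 1, start=3, end=8),   # overlaps both writes
]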
Example history 1
|-W(x)1-|  |-W(x)2-|
   |-----R(x)2-----|
     |--R(x)1--|
• Constraint arrows:
o The order obeys value constraints (W -> R)
o The order obeys real-time constraints (W(x)1 -> W(x)2)
• This order satisfies the constraints:
o W(x)1 R(x)1 W(x)2 R(x)2
• So, the history is linearizable
Note:
• The definition is based on external behavior
• So, we can apply it without having to know how the service works
• Histories explicitly incorporate concurrency in the form of
overlapping operations
• Thus, good match for how distributed systems operate
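Note: The definition can be turned into a brute-force check over external
behavior only: try every total order of the operations and accept the history
if some order respects both the real-time and the value constraints. The sketch
below (my own, reusing the illustrative Op record above) tries all
permutations, so it is only suitable for tiny histories like these examples.

from itertools import permutations

def real_time_ok(order):
    # If b completed before a was even invoked, a must not precede b.
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if b.end < a.start:
                return False
    return True

def values_ok(order):
    # Each read must return the value of the most recent preceding write.
    current = None  # x starts out unwritten in these examples
    for op in order:
        if op.kind == "W":
            current = op.value
        elif op.value != current:
            return False
    return True

def linearizable(history):
    # Exponential brute force: fine only for the small lecture examples.
    return any(real_time_ok(order) and values_ok(order)
               for order in permutations(history))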
Example history 2
|-W(x)1-|  |-W(x)2-|
   |--R(x)2--|
                |-R(x)1-|
• Constraint arrows:
o W(x)1 before W(x)2 (time)
o W(x)2 before R(x)2 (value)
o R(x)2 before R(x)1 (time)
o R(x)1 before W(x)2 (value)
• There's a cycle
o So it cannot be turned into a linear order
• So, this history is not linearizable
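Note: Running the sketch above on the first two histories (with illustrative
timestamps matching the diagrams) reproduces both conclusions.

history2 = [
    Op("W", 1, start=0, end=4),
    Op("W", 2, start=6, end=10),
    Op("R", 2, start=2, end=8),    # overlaps W(x)2, so only a value constraint
    Op("R", 1, start=9, end=12),   # starts after R(x)2 completes
]

print(linearizable(history1))  # True:  W(x)1 R(x)1 W(x)2 R(x)2 works
print(linearizable(history2))  # False: the four constraints above form a cycle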
Example history 3
|--W(x)0--|  |--W(x)1--|
                |--W(x)2--|
           |-R(x)2-|  |-R(x)1-|
• Order: W(x)0 W(x)2 R(x)2 W(x)1 R(x)1
• So, the history is linearizable
• So, the service can pick either order for concurrent writes
Example history 4
|--W(x)0--|  |--W(x)1--|
                |--W(x)2--|
C1:                          |-R(x)2-| |-R(x)1-|
C2:                          |-R(x)1-| |-R(x)2-|
Constraints:
• W(x)2 then C1: R(x)2 (value)
• C1: R(x)2 then W(x)1 (value)
• W(x)1 then C2: R(x)1 (value)
• C2: R(x)1 then W(x)2 (value)
• Cycle! so not linearizable
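Note: The same check covers multiple clients, because only invocation and
completion times matter; each client's own ordering is already captured by the
timestamps. Example history 4 can be encoded as follows (illustrative
timestamps, all reads after the writes as drawn).

history4 = [
    Op("W", 0, start=0,  end=2),
    Op("W", 1, start=3,  end=6),
    Op("W", 2, start=4,  end=7),   # concurrent with W(x)1
    Op("R", 2, start=8,  end=9),   # C1's first read
    Op("R", 1, start=10, end=11),  # C1's second read
    Op("R", 1, start=8,  end=9),   # C2's first read
    Op("R", 2, start=10, end=11),  # C2's second read
]
print(linearizable(history4))  # False: C1 and C2 would have to order the
                               # concurrent writes differently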
Example history 5
|-W(x)1-|
          |-W(x)2-|
                    |-R(x)1-|
• Constraints:
o W(x)2 before R(x)1 (time)
o R(x)1 before W(x)2 (value)
o Or: time constraints mean only possible order is W(x)1 W(x)2
R(x)1
• There's a cycle; not linearizable
Example history 6
• Suppose clients re-send requests if they don't get a reply
• In case it was the response that was lost:
o Leader remembers client requests it has already seen
o If it sees a duplicate, it replies with the saved response from the
first execution
• But this may yield a value saved a long time ago
o A stale value: returning it would not be linearizable!
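Note: A small sketch of that duplicate-suppression scheme (hypothetical names,
my own code, not the lecture's), showing how replying from the duplicate table
can return a value that was only current at the first execution.

class Server:
    def __init__(self):
        self.x = 0
        self.seen = {}  # request id -> saved reply (the duplicate table)

    def handle(self, req_id, op, value=None):
        if req_id in self.seen:      # duplicate: reply with the saved response
            return self.seen[req_id]
        if op == "write":
            self.x = value
            reply = "ok"
        else:                        # op == "read"
            reply = self.x
        self.seen[req_id] = reply
        return reply

s = Server()
s.handle(1, "read")       # executes the read, saves the reply 0
s.handle(2, "write", 1)   # x is now 1
# The reply to request 1 was lost, so the client re-sends it:
s.handle(1, "read")       # returns the saved 0, a stale value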
Monotonic Reads
• If a process reads the value of a data item x, any successive read
operation on x by that process will always return that same or a more
recent value
• In our example:
o The read operations are performed by a single process P at two
different local copies of the same data store
o The local data stores are L1 and L2
(a) A monotonic-read consistent data store. (b) A data store that does not
provide monotonic reads
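Note: One way a client library can enforce monotonic reads is to remember the
highest version the process has seen and refuse (or redirect) a read from a
copy that is not yet that fresh. The sketch below is my own, assuming each
local copy stores (version, value) pairs.

class LocalCopy:
    def __init__(self):
        self.store = {}              # key -> (version, value)
    def get(self, key):
        return self.store.get(key, (0, None))

class Session:
    """Client-side session of process P, enforcing monotonic reads."""
    def __init__(self):
        self.last_seen = 0           # highest version P has observed so far

    def read(self, copy, key):
        version, value = copy.get(key)
        if version < self.last_seen:
            # This copy has not yet caught up with what P already read:
            # wait, read elsewhere, or propagate the missing writes first.
            raise RuntimeError("copy is older than a value already read")
        self.last_seen = version
        return value

L1, L2 = LocalCopy(), LocalCopy()
L1.store["x"] = (2, "b")   # L1 already holds the newer write
L2.store["x"] = (1, "a")   # L2 has only the older one

p = Session()
print(p.read(L1, "x"))     # "b"; the session now remembers version 2
try:
    p.read(L2, "x")
except RuntimeError as e:
    print("blocked:", e)   # returning "a" would violate monotonic reads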
Monotonic Writes
• A write operation by a process on a data item x is completed before any
successive write operation on x by the same process
• More formally, if we have two successive operations Wk(xi) and Wk(xj) by
process Pk, then, regardless of where Wk(xj) takes place, we also have
WS(xi;xj)
• Thus, completing a write operation means that the copy on which a
successive operation is performed reflects the effect of a previous
write operation by the same process, no matter where that operation was
initiated
o In other words, a write operation on a copy of item x is
performed only if that copy has been brought up to date by means
of any preceding write operation by that same process, which may
have taken place on other copies of x
• If need be, the new write must wait for old ones to finish
(a) A monotonic-write consistent data store. (b) A data store that does not
provide monotonic-write consistency. (c) Again, no consistency as WS(x1|x2),
and thus also WS(x1|x3). (d) Consistent as WS(x1;x3), although x1 has
apparently overwritten x2.
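Note: A sketch (my own illustration) of how a copy can enforce this: tag every
write with a per-process sequence number and apply a write only after all
earlier writes by the same process have taken effect on this copy.

class Copy:
    def __init__(self):
        self.applied = {}   # process id -> highest sequence number applied here
        self.pending = []   # writes waiting for earlier writes by the same process
        self.value = None

    def receive(self, pid, seq, value):
        self.pending.append((pid, seq, value))
        self._apply_ready()

    def _apply_ready(self):
        # Apply any pending write whose predecessor (seq - 1) from the same
        # process has already taken effect here; repeat until nothing changes.
        progress = True
        while progress:
            progress = False
            for w in list(self.pending):
                pid, seq, value = w
                if self.applied.get(pid, 0) == seq - 1:
                    self.value = value
                    self.applied[pid] = seq
                    self.pending.remove(w)
                    progress = True

c = Copy()
c.receive("P", 2, "x2")   # arrives first but must wait: W(x1) not yet applied
c.receive("P", 1, "x1")   # applying x1 releases x2
print(c.value)            # "x2": the writes took effect in program order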