Unit 1
Regulation : 2017
Logical clocks are based on capturing chronological and causal relationships of processes and
ordering events based on these relationships.
In a system of logical clocks, every process has a logical clock that is advanced using a set
of rules. Every event is assigned a timestamp and the causality relation between events can
be generally inferred from their timestamps.
The timestamps assigned to events obey the fundamental monotonicity property; that is, if
an event a causally affects an event b, then the timestamp of a is smaller than the timestamp
of b.
A system of logical clocks consists of a time domain T and a logical clock C. Elements of T form
a partially ordered set over a relation <. This relation is usually called the happened-before or
causal precedence relation.
The logical clock C is a function that maps an event e in a distributed system to an element
in the time domain T, denoted as C(e), such that for any two events ei and ej:
$$e_i \rightarrow e_j \Rightarrow C(e_i) < C(e_j)$$
This monotonicity property is called the clock consistency condition. When T and C satisfy
the following stronger condition,
$$e_i \rightarrow e_j \Leftrightarrow C(e_i) < C(e_j)$$
the system of clocks is said to be strongly consistent.
Data structures:
Each process pi maintains data structures that provide the following two capabilities:
• A local logical clock (lci), which helps process pi measure its own progress.
• A logical global clock (gci), which represents process pi’s local view of the
logical global time. It allows the process to assign consistent timestamps to its local events.
Protocol:
The protocol ensures that a process’s logical clock, and thus its view of the global time, is
managed consistently. It consists of the following two rules:
Rule 1: Governs how a process updates its local logical clock when it executes an event
(send, receive, or internal).
Rule 2: Governs how a process updates its global logical clock to update its view of the
global time and global progress. It dictates what information about the logical time is
piggybacked in a message and how the receiving process uses this information to
update its view of the global time.
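The two rules can be illustrated with a scalar (Lamport-style) clock, in which the local and global clocks are a single integer. This is a minimal sketch, not the notes' own pseudocode: the class name `ScalarClock`, its method names, and the increment `d` (assumed to be 1 by default) are all illustrative choices.

```python
class ScalarClock:
    """Illustrative scalar logical clock for one process."""

    def __init__(self, d=1):
        self.time = 0   # single integer acting as both local and global clock
        self.d = d      # increment applied before each event (assumed d > 0)

    def tick(self):
        # Rule 1: advance the clock before executing any event.
        self.time += self.d
        return self.time

    def send(self):
        # A send is an event; the returned value is piggybacked on the message.
        return self.tick()

    def receive(self, msg_time):
        # Rule 2: adopt the larger of the local and piggybacked time,
        # then apply Rule 1 for the receive event itself.
        self.time = max(self.time, msg_time)
        return self.tick()


p, q = ScalarClock(), ScalarClock()
p.tick()              # internal event at p
stamp = p.send()      # send event at p
print(q.receive(stamp))
```

With these rules the receive event always gets a larger timestamp than the matching send event, which is the monotonicity property described earlier.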
2. Total ordering: Scalar clocks can be used to totally order the events in a distributed system.
However, two or more events at different processes may have identical timestamps. Hence a
tie-breaking mechanism is needed to order such events. The tie breaking is done as follows:
Process identifiers are linearly ordered.
A tie between events with identical timestamps is broken in favor of the process with the
lower identifier value, which is given higher priority.
The timestamp of an event is denoted by the tuple (t, i), where t is its time of occurrence and i is the
identity of the process where it occurred.
The total order relation ≺ over two events x and y with timestamps (h, i) and (k, j), respectively, is given by:
$$x \prec y \Leftrightarrow (h < k \ \text{or} \ (h = k \ \text{and} \ i < j))$$
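The tie-breaking rule amounts to a lexicographic comparison of (t, i) pairs: compare the scalar times first and fall back to the process identifiers only when the times are equal. A minimal sketch follows; the function name `precedes` and the sample events are illustrative.

```python
def precedes(x, y):
    """Return True if event x is ordered before event y.

    Each event is a (time, process_id) pair.
    """
    (h, i), (k, j) = x, y
    return h < k or (h == k and i < j)


events = [(3, 2), (1, 1), (3, 1), (2, 3)]
# Python compares tuples lexicographically, which matches the rule above.
print(sorted(events))
```

Because process identifiers are unique, no two distinct events compare as equal, so the resulting order is total.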
3. Event counting
If the clock is always incremented by 1 and event e has a timestamp h, then h − 1 represents
the minimum logical duration, counted in units of events, required before producing the event e.
This is called the height of the event e. In other words, h − 1 events have been produced
sequentially before the event e, regardless of the processes that produced these events.
4. No strong consistency
Scalar clocks are not strongly consistent; that is, for two events ei and ej, C(ei) < C(ej)
does not imply that ei happened before ej. The reason is that the logical local clock and the
logical global clock of a process are squashed into one, resulting in the loss of causal
dependency information among events at different processes.
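In the system of vector clocks, the time domain is represented by a set of n-dimensional
non-negative integer vectors.
There is an isomorphism between the set of partially ordered events produced by a
distributed computation and their vector timestamps.
If the processes at which the events occurred are known, the test to compare
two timestamps can be simplified. For events x and y that occurred at processes pi and pj
and are assigned timestamps vh and vk, respectively:
$$x \rightarrow y \Leftrightarrow vh[i] \le vk[i]$$
$$x \parallel y \Leftrightarrow vh[i] > vk[i] \wedge vh[j] < vk[j]$$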
2. Strong consistency
The system of vector clocks is strongly consistent; thus, by examining the vector timestamp
of two events, we can determine if the events are causally related.
3. Event counting
If an event e has timestamp vh, vh[j] denotes the number of events executed by process pj
that causally precede e.
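The vector clock properties above can be tried out with a small sketch. The update rules used here (increment the own component before each event, take the component-wise maximum on receive) are the standard vector clock rules and are assumed rather than quoted from these notes; the names `VectorClock`, `happened_before` and `concurrent` are illustrative.

```python
class VectorClock:
    """Illustrative vector clock for process `pid` in a system of n processes."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        # Advance the own component before every event; return a copy as timestamp.
        self.v[self.pid] += 1
        return list(self.v)

    def receive(self, other):
        # Merge the piggybacked vector, then count the receive event itself.
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        return self.tick()


def happened_before(vh, vk):
    # vh < vk: every component is <= and the vectors differ.
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk


def concurrent(vh, vk):
    return vh != vk and not happened_before(vh, vk) and not happened_before(vk, vh)


p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
a = p0.tick()        # event a at p0
b = p1.tick()        # event b at p1
c = p1.receive(a)    # p1 receives a message stamped with a
print(a, b, c, happened_before(a, c), concurrent(a, b))
```

Here c[0] counts the events at p0 that causally precede c, which matches the event counting property.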
Clock synchronization is the process of ensuring that physically distributed processors have a
common notion of time.
Because clocks run at different rates, the clocks at various sites may diverge with time, so
clock synchronization must be performed periodically to correct this clock skew in
distributed systems. Clocks are synchronized to an accurate real-time standard such as UTC
(Coordinated Universal Time). Clocks that must not only be synchronized with each other
but also adhere to physical time are termed physical clocks. This degree of
synchronization additionally makes it possible to coordinate and schedule actions between
multiple computers connected to a common network.
Fig 1.30 a) Offset and delay estimation between processes from same server
Fig 1.30 b) Offset and delay estimation between processes from different servers
CS8603 DS
Let T1, T2, T3, T4 be the values of the four most recent timestamps. The clocks A and B are
assumed to be stable and running at the same speed. Let a = T1 − T3 and b = T2 − T4. If the
network delay difference from A to B and from B to A, called the differential delay, is
small, the clock offset θ and roundtrip delay δ of B relative to A at time T4 are
approximately given by the following:
$$\theta = \frac{a + b}{2}, \qquad \delta = a - b$$
Each NTP message includes the latest three timestamps T1, T2, and T3, while
T4 is determined upon arrival.
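As a worked example of the offset and delay estimate, the sketch below computes both values from the four timestamps. It assumes that T3 and T4 are read on A's clock (when A sends and when A receives the reply) and that T1 and T2 are read on B's clock (when B receives and when B replies); this reading is consistent with a = T1 − T3 and b = T2 − T4 but is an assumption, since the figure is not reproduced here. The function name and the sample numbers are illustrative.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Estimate the offset and round-trip delay of B relative to A."""
    a = t1 - t3          # offset plus the A-to-B network delay
    b = t2 - t4          # offset minus the B-to-A network delay
    offset = (a + b) / 2
    delay = a - b
    return offset, delay


# Hypothetical exchange: B's clock is 5 units ahead of A's and each
# direction takes 2 units. A sends at 100 (A's clock), B receives at 107
# and replies at 108 (B's clock), A receives at 105 (A's clock).
print(ntp_offset_delay(t1=107, t2=108, t3=100, t4=105))
```

With symmetric delays the estimate recovers the true offset exactly; any difference between the two one-way delays shows up as an error in the offset, which is why the differential delay must be small.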
Several orderings on messages have been defined:
(i) non-FIFO
(ii) FIFO
(iii) causal order
(iv) synchronous order
There is always a trade-off between concurrency and ease of use and implementation.
Asynchronous Executions
An asynchronous execution (or A-execution) is an execution (E, ≺) for which the causality relation
is a partial order.
There cannot be any causality cycles in an asynchronous execution, because a cycle would
mean that an event causes itself.
On a logical link, messages may be delivered in any order, not necessarily FIFO.
Though there is a physical link that delivers the messages sent on it in FIFO order due
to the physical properties of the medium, a logical link may be formed as a composite
of physical links and multiple paths may exist between the two end points of the
logical link.
FIFO Executions
On any logical link in a FIFO execution, messages are delivered in the order in which they are sent.
• To implement a FIFO logical channel over a non-FIFO channel, use a separate numbering scheme to
sequence the messages.
• The sender assigns and appends a (sequence number, connection id) tuple to each message.
• The receiver uses a buffer to order the incoming messages as per the sender’s sequence numbers, and
accepts only the “next” message in sequence.
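The buffering scheme in the bullets above can be sketched as follows. This is a minimal illustration for a single connection, so the connection id is omitted; the class name `FifoReceiver` and the choice to start sequence numbers at 0 are assumptions.

```python
class FifoReceiver:
    """Re-creates FIFO delivery on top of a channel that may reorder messages."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the next message to accept
        self.buffer = {}    # out-of-order arrivals, keyed by sequence number

    def on_arrival(self, seq, payload):
        """Buffer the arrival and return the messages that can now be accepted."""
        self.buffer[seq] = payload
        accepted = []
        while self.next_seq in self.buffer:
            accepted.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return accepted


r = FifoReceiver()
print(r.on_arrival(1, "b"))   # arrives early, so it is held back
print(r.on_arrival(0, "a"))   # releases both messages in send order
```

The sender only needs a counter that it increments for every message on the connection.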
If two send events s and s’ are related by causality ordering (not physical time
ordering), then a causally ordered (CO) execution requires that their corresponding receive
events r and r’ occur in the same order at all common destinations.
If s and s’ are not related by causality, then CO is vacuously (trivially) satisfied.
Causal order is useful in applications that update shared data, implement distributed shared
memory, or perform fair resource allocation.
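To implement CO, a message m that arrives at a process may have to be delayed until all
messages sent to that process causally before m have arrived.
The delayed message m is then given to the application for processing. The event of
an application processing an arrived message is referred to as a delivery event.
Equivalently, in a CO execution no message is overtaken by a chain of messages between the
same (sender, receiver) pair.
If send(m1) ≺ send(m2), then for each common destination d of messages m1 and m2,
deliver_d(m1) ≺ deliver_d(m2) must be satisfied.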
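One way to see the "delay until causally earlier messages have arrived" idea in action is a broadcast-only scheme in the style of Birman, Schiper and Stephenson, where every message carries a vector of broadcast counts. This technique is swapped in purely for illustration and is not the algorithm developed in these notes; the class name `CausalProcess` and the message layout are hypothetical.

```python
class CausalProcess:
    """Delays arrived broadcasts until delivering them respects causal order."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n      # vc[k] = broadcasts from process k delivered here
        self.pending = []      # arrived but not yet deliverable
        self.delivered = []    # payloads in delivery order

    def broadcast(self, payload):
        self.vc[self.pid] += 1
        return (self.pid, list(self.vc), payload)

    def _ready(self, sender, ts):
        # Next broadcast from the sender, and nothing it depends on is missing.
        return ts[sender] == self.vc[sender] + 1 and all(
            ts[k] <= self.vc[k] for k in range(len(ts)) if k != sender)

    def on_arrival(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:                      # keep releasing delayed messages
            progress = False
            for m in list(self.pending):
                sender, ts, payload = m
                if self._ready(sender, ts):
                    self.pending.remove(m)
                    self.vc[sender] = ts[sender]
                    self.delivered.append(payload)   # the delivery event
                    progress = True


p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
m1 = p0.broadcast("m1")
p1.on_arrival(m1)
m2 = p1.broadcast("m2")      # send(m1) causally precedes send(m2)
p2.on_arrival(m2)            # arrives first, so it is delayed
p2.on_arrival(m1)
print(p2.delivered)
```

At p2 the message m2 arrives before m1 but is held in `pending` until m1 has been delivered.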
Synchronous Execution
When all the communication between pairs of processes uses synchronous send and
receive primitives, the resulting order is the synchronous order.
Synchronous communication always involves a handshake between the receiver
and the sender; the handshake events may appear to occur instantaneously and
atomically.
The instantaneous communication property of synchronous executions requires a
modified definition of the causality relation because for each (s, r) ∈ T, the send
event is not causally ordered before the receive event.
The two events are viewed as being atomic and simultaneous, and neither event
precedes the other.
An algorithm designed to run correctly on an asynchronous system may not run correctly on a
synchronous system (for example, it may deadlock), so the two kinds of execution must be
distinguished.
An execution can be modeled to give a total order that extends the partial order
(E, ≺).
In an A-execution, the messages can be made to appear instantaneous if there exists a
linear extension of the execution such that each send event is immediately followed
by its corresponding receive event in this linear extension. Such an A-execution is
realizable with synchronous communication (RSC).
A non-separated linear extension of (E, ≺) is a linear extension of (E, ≺) such that
for each pair (s, r) ∈ T, the interval { x ∈ E | s ≺ x ≺ r } is empty.
An A-execution (E, ≺) is an RSC execution if and only if there exists a non-separated linear
extension of the partial order (E, ≺).
In the non-separated linear extension, if the adjacent send event and its corresponding
receive event are viewed atomically, then that pair of events shares a common past
and a common future with each other.
Crown
Let E be an execution. A crown of size k in E is a sequence <(si, ri), i ∈ {0, …, k−1}> of pairs of
corresponding send and receive events such that: s0 ≺ r1, s1 ≺ r2, …, sk−2 ≺ rk−1, sk−1 ≺ r0.
For example, <(s1, r1), (s2, r2)> is a crown of size 2 when s1 ≺ r2 and s2 ≺ r1. A crown
captures a cyclic dependency among messages. The crown criterion states that an A-computation is RSC, i.e., it can be
realized on a system with synchronous communication, if and only if it contains no crown.
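The crown criterion can be checked mechanically by treating each message as a node and drawing an edge from message i to message j whenever si ≺ rj (for i different from j); a crown then corresponds to a cycle in this graph. The sketch below assumes that the relation si ≺ rj has already been computed and is handed in as a set of index pairs; the function name `has_crown` is illustrative.

```python
def has_crown(num_messages, s_before_r):
    """Return True if the messages contain a crown.

    s_before_r is a set of pairs (i, j), meaning that the send event of
    message i precedes the receive event of message j.
    """
    graph = {i: [] for i in range(num_messages)}
    for i, j in s_before_r:
        if i != j:                       # s_i precedes r_i trivially; ignore it
            graph[i].append(j)

    WHITE, GREY, BLACK = 0, 1, 2         # unvisited, on current path, finished
    color = {i: WHITE for i in graph}

    def visit(u):
        color[u] = GREY
        for v in graph[u]:
            if color[v] == GREY or (color[v] == WHITE and visit(v)):
                return True              # reached a node on the current path
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and visit(u) for u in graph)


# Two messages that cross each other: s0 precedes r1 and s1 precedes r0.
print(has_crown(2, {(0, 1), (1, 0)}))
```

An execution for which this returns False is RSC by the criterion above.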
2.2.3 Simulations
The events in the RSC execution are scheduled as per some non-separated linear
extension, and adjacent (s, r) events in this linear extension are executed sequentially
in the synchronous system.
The partial order of the asynchronous execution remains unchanged.
If an A-execution is not RSC, then there is no way to schedule the events to make
them RSC, without actually altering the partial order of the given A-execution.
However, the following indirect strategy that does not alter the partial order can be
used.
Each channel Ci,j is modeled by a control process Pi,j that simulates the channel buffer.
An asynchronous communication from i to j becomes a synchronous communication
from i to Pi,j followed by a synchronous communication from Pi,j to j.
This enables the decoupling of the sender from the receiver, a feature that is essential
in asynchronous systems.
• The scheduling must satisfy the progress property (i.e., a schedule is found within a bounded
number of steps) in addition to the safety (i.e., correctness) property.
• Additional features of a good algorithm are: (i) symmetry or some form of fairness, i.e., not
favoring particular processes, and (ii) efficiency, i.e., using as few messages as possible.
• A simple algorithm by Bagrodia makes the following assumptions:
1. Receive commands are forever enabled from all processes.
2. A send command, once enabled, remains enabled until it completes.
3. To prevent deadlock, process identifiers are used to break the crowns.
4. Each process attempts to schedule only one send event at any time.
• The algorithm illustrates how crown-free message scheduling is achieved on-line.
Propagation Constraint II: it is not known that a message has been sent to d in the causal
future of Send(M), and hence it is not guaranteed using a reasoning based on transitivity that
the message M will be delivered to d in CO.
2.7 TOTAL ORDER
For each pair of processes Pi and Pj and for each pair of messages Mx and My that are delivered to
both the processes, Pi is delivered Mx before My if and only if Pj is delivered Mx before My.
Each process sends the message it wants to broadcast to a centralized process, which
relays all the messages it receives to every other process over FIFO channels.
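The centralized approach can be sketched with one FIFO queue per destination. This is a minimal illustration: the class name `Sequencer`, the helper `drain`, and the decision to relay each message to every process including its sender (so that all logs can be compared directly) are assumptions of the sketch.

```python
from collections import deque


class Sequencer:
    """Central process that relays every submitted message over FIFO channels."""

    def __init__(self, n):
        # One FIFO channel from the central process to each of the n processes.
        self.channels = [deque() for _ in range(n)]

    def submit(self, msg):
        # Messages are relayed in the order the central process receives them.
        for channel in self.channels:
            channel.append(msg)


def drain(channel):
    """Deliver everything currently on a FIFO channel, oldest first."""
    delivered = []
    while channel:
        delivered.append(channel.popleft())
    return delivered


seq = Sequencer(3)
seq.submit("x from P0")
seq.submit("y from P2")
print([drain(ch) for ch in seq.channels])
```

Every process sees the messages in the arrival order at the central process, which gives a total order at the cost of a single point of failure and a bottleneck.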
Sender side
Phase 1
In the first phase, a process multicasts the message M with a locally unique tag
and the local timestamp to the group members.
Phase 2
The sender process awaits a reply from all the group members who respond with a
tentative proposal for a revised timestamp for that message M.
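The await call is non-blocking, i.e., any other messages received in the meantime are processed.
Phase 3
The process computes the maximum of the proposed timestamps for M and multicasts it as the
final timestamp to the group.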
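Receiver side
Phase 1
The receiver receives the message M with its tag and tentative timestamp, computes a revised
(proposed) timestamp for it, and places the message entry, marked as undeliverable, in the
queue temp_Q.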
Phase 2
The receiver sends the revised timestamp back to the sender. The receiver then waits
in a non-blocking manner for the final timestamp.
Phase 3
The final timestamp is received from the multicaster. The corresponding
message entry in temp_Q is identified using the tag, and is marked as deliverable
after the revised timestamp is overwritten by the final timestamp.
The queue is then resorted using the timestamp field of the entries as the key. As the
queue is already sorted except for the modified entry for the message under
consideration, that message entry has to be placed in its sorted position in the
queue.
If the message entry is at the head of the temp_Q, that entry, and all consecutive
subsequent entries that are also marked as deliverable, are dequeued from temp_Q,
and enqueued in deliver_Q.
Complexity
This algorithm uses three phases and, to send a message to n − 1 processes, it uses 3(n − 1)
messages and incurs a delay of three message hops.
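The receiver-side bookkeeping of the three-phase algorithm can be sketched as below. Several details are assumptions of this sketch rather than statements from the notes: the `priority` variable and the rule used to compute a proposal, the use of the (globally unique) tag to break ties between equal final timestamps, and all class and function names.

```python
class Receiver:
    """Receiver side of a three-phase total order multicast (illustrative)."""

    def __init__(self):
        self.priority = 0     # highest timestamp proposed or seen so far
        self.temp_q = []      # entries of the form [timestamp, tag, deliverable]
        self.deliver_q = []   # tags in final delivery order

    def on_message(self, tag, sender_ts):
        """Phases 1 and 2: queue the message and return a proposed timestamp."""
        self.priority = max(self.priority + 1, sender_ts)
        self.temp_q.append([self.priority, tag, False])
        return self.priority

    def on_final(self, tag, final_ts):
        """Phase 3: overwrite the proposal, resort, and release the head."""
        for entry in self.temp_q:
            if entry[1] == tag:
                entry[0], entry[2] = final_ts, True
        self.priority = max(self.priority, final_ts)
        # Assumption: the tag breaks ties between equal timestamps.
        self.temp_q.sort(key=lambda e: (e[0], e[1]))
        while self.temp_q and self.temp_q[0][2]:
            self.deliver_q.append(self.temp_q.pop(0)[1])


def final_timestamp(proposals):
    """Sender side, phase 3: the final timestamp is the largest proposal."""
    return max(proposals)


r1, r2 = Receiver(), Receiver()
# Messages A and B reach the two receivers in opposite orders.
a1, b1 = r1.on_message("A", 1), r1.on_message("B", 1)
b2, a2 = r2.on_message("B", 1), r2.on_message("A", 1)
fa, fb = final_timestamp([a1, a2]), final_timestamp([b1, b2])
for r in (r1, r2):
    r.on_final("A", fa)
    r.on_final("B", fb)
print(r1.deliver_q, r2.deliver_q)
```

Although the two receivers see A and B arrive in opposite orders, both end up with the same delivery order because a deliverable entry waits behind any undeliverable entry that might still sort ahead of it.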
Law of conservation of messages: Every message mij that is recorded as sent in the local state of a
process pi must be captured in the state of the channel Cij or in the collected local state of the
receiver process pj.
In a consistent global state, every message that is recorded as received is also recorded
as sent. Such a global state captures the notion of causality that a message cannot be
received if it was not sent.
Consistent global states are meaningful global states and inconsistent global states are
not meaningful in the sense that a distributed system can never be in an
inconsistent state.
Interpretation in terms of cuts
• A cut is a zig-zag line in the space–time diagram that joins one arbitrary point on each
process line.
• A cut is a powerful graphical aid for representing and reasoning about the global states of a
computation.
• The events to the left of the cut form its PAST and the events to the right form its FUTURE.
A consistent global state corresponds to a cut in which every message received in the PAST of the cut has
been sent in the PAST of that cut. Such a cut is known as a consistent cut. Example: Cut C2 in the above
figure.
All the messages that cross the cut from the PAST to the FUTURE are captured in the corresponding
channel state.
A cut is inconsistent if a message crosses it from the FUTURE to the PAST. Example: Cut C1
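The consistency test for a cut can be written down directly. In this sketch a cut is described by the number of events each process has in the PAST, and each message is a tuple (sender, send index, receiver, receive index) with event indices counted from 1 on each process line; this encoding and the function names are assumptions made for illustration.

```python
def is_consistent(cut, messages):
    """A cut is consistent if no message is received in the PAST but sent in the FUTURE."""
    for sender, s_idx, receiver, r_idx in messages:
        sent_in_past = s_idx <= cut[sender]
        received_in_past = r_idx <= cut[receiver]
        if received_in_past and not sent_in_past:
            return False
    return True


def channel_state(cut, messages):
    """Messages that cross the cut from the PAST to the FUTURE (in transit)."""
    return [m for m in messages
            if m[1] <= cut[m[0]] and m[3] > cut[m[2]]]


# One message: sent as the 1st event of process 0, received as the 2nd event of process 1.
msgs = [(0, 1, 1, 2)]
print(is_consistent({0: 1, 1: 1}, msgs), channel_state({0: 1, 1: 1}, msgs))
print(is_consistent({0: 0, 1: 2}, msgs))
```

The first cut records the message as sent but not yet received, so it belongs to the channel state; the second cut records a receive without the matching send and is therefore inconsistent.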
A snapshot captures the local states of each process along with the state of each communication channel.
Initiating a snapshot
Process Pi initiates the snapshot.
Pi records its own state and prepares a special marker message.
Pi sends the marker message to all other processes.
Pi starts recording all incoming messages on channels Cji for j not equal to i.
Propagating a snapshot
For all processes Pj, consider a message arriving on channel Ckj.
If the marker message is seen for the first time:
Pj records its own state and marks Ckj as empty.
Pj sends the marker message to all other processes.
Pj starts recording all incoming messages on channels Clj for l not equal to j or k.
Else:
Pj stops recording on Ckj; the messages recorded on Ckj since Pj recorded its own state form
the state of that channel.
Terminating a snapshot
The snapshot terminates when:
All processes have received a marker (and so have recorded their own state).
All processes have received a marker on all the N-1 incoming channels (and so have recorded
the channel states).
A central server can then gather the partial states to build a global snapshot.
Complexity
The recording part of a single instance of the algorithm requires O(e) messages and
O(d) time, where e is the number of edges in the network and d is the diameter of the network.
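The marker rules can be exercised with a small single-threaded simulation. Everything specific in this sketch is an assumption made for illustration: the application state is an integer balance, application messages are transfers of money, channels are FIFO deques in a fully connected network of three processes, and all names are hypothetical.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, pid, peers, balance):
        self.pid, self.peers, self.balance = pid, peers, balance
        self.recorded_state = None   # local state captured by the snapshot
        self.channel_state = {}      # sender id -> messages recorded in transit
        self.recording = set()       # incoming channels still being recorded

    def record(self, net):
        """Record own state, start recording all inputs, send markers."""
        self.recorded_state = self.balance
        self.recording = set(self.peers)
        self.channel_state = {p: [] for p in self.peers}
        for p in self.peers:
            net[(self.pid, p)].append(MARKER)

    def receive(self, src, msg, net):
        if msg == MARKER:
            if self.recorded_state is None:
                self.record(net)             # first marker seen
            self.recording.discard(src)      # channel from src is now closed
        else:
            if src in self.recording:
                self.channel_state[src].append(msg)
            self.balance += msg


def send(net, procs, src, dst, amount):
    procs[src].balance -= amount
    net[(src, dst)].append(amount)


def drain(net, procs):
    """Deliver messages channel by channel until the network is empty."""
    while any(net.values()):
        for (src, dst), chan in net.items():
            if chan:
                procs[dst].receive(src, chan.popleft(), net)


def run_example():
    n = 3
    procs = [Process(i, [j for j in range(n) if j != i], 100) for i in range(n)]
    net = {(i, j): deque() for i in range(n) for j in range(n) if i != j}
    send(net, procs, 0, 1, 10)   # in transit when the snapshot starts
    procs[0].record(net)         # process 0 initiates the snapshot
    send(net, procs, 1, 0, 20)   # sent before process 1 sees any marker
    drain(net, procs)
    return procs


procs = run_example()
print([p.recorded_state for p in procs], [p.channel_state for p in procs])
```

In this run the recorded local states plus the recorded channel states add up to the 300 units that exist in the system, which is what a consistent snapshot should show.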