
International Journal of Computer Mathematics
Vol. 81, No. 6, June 2004, pp. 661–673

RELIABLE SYNCHRONIZATION IN DISTRIBUTED SYSTEMS

SEYED H. ROOSTA∗
Department of Computer Science, University of South Carolina Spartanburg, Spartanburg, SC 29303, USA

(Revised 19 December 2003; in final form 5 January 2004)

In distributed computer systems, processors often need to be synchronized to maintain correctness and consistency. Unlike shared-memory parallel systems, the lack of shared memory and a common clock considerably complicates the task of synchronization in distributed systems. The objective of this article is two-fold. (1) We present a new randomized agreement algorithm to synchronize cooperating processors in a distributed system. This algorithm achieves the desired agreement in an expected five rounds of message exchanges, tolerating the failure of up to one-fifth of the processors, and belongs to the class of broadcast-based synchronization problems. (2) We present a new self-stabilization algorithm for acyclic directed-graph-structured distributed systems. This new fault-tolerant algorithm survives all imaginable faults in distributed systems, and belongs to the classes of arbiter-based and broadcast-based synchronization problems.

Keywords: Distributed systems; Synchronization; Agreement; Stabilization

C.R. Categories: C.2.4; D.4.1

1 INTRODUCTION

In distributed computer systems, processors often need to be synchronized to maintain correctness and consistency. Unlike shared-memory parallel systems, the lack of shared memory and a common clock considerably complicates the task of synchronization in distributed systems [1–4]. Synchronization appears in various contexts and under various guises but, based on the nature of the synchronization, the various synchronization tasks can be classified into four categories: interrupt-based, arbiter-based, broadcast-based and global state-based. In this article, we explore arbiter-based and broadcast-based synchronization problems in parallel computer systems. The first problem concerns agreement among processors in a distributed system and belongs to the class of broadcast-based synchronization. The second problem concerns self-stabilization and belongs to both the arbiter-based and broadcast-based classes. We intend to present new algorithms that implement a reliable broadcast facility in a distributed system. The remainder of the article is organized as follows: in Section 2, we describe the randomized agreement problem, followed by the new algorithm; Section 3 deals with self-stabilization and a discussion of the new algorithm; finally, in Section 4 we present some concluding remarks.

∗ E-mail: [email protected]



2 RANDOMIZED AGREEMENT

In centralized systems, the presence of shared memory makes agreement immediate, whereas its accomplishment in distributed systems seems to be the hardest problem [5, 6]. The agreement problem (AP) can be informally described as follows: there are N processors in a distributed system, connected to each other by reliable communication channels. All processors start with (possibly different) private values in their local copies of the agreement variable. Each proper processor tries to obtain the values of the agreement variable from all other processors in the system. Once a processor has all the values, it evaluates a decision function on them and sets its local agreement variable to the output of the decision function. It repeats this procedure until there is sufficient evidence that all the proper processors in the system have agreed on a common value of the agreement variable. By exchanging their private values, these processors converge to a common public value of the agreement variable. This agreement among processors is quite simple and straightforward in the case of fail-safe (proper) processors [2, 7]. However, if some processors can malfunction, they may indefinitely postpone any agreement by sending different information to different processors. The faulty processors may even collude to confuse the proper processors. The problem of reaching agreement on the value of the agreement variable in the presence of malfunctioning processors is known as the AP. The following algorithm captures the essential properties of an agreement algorithm.
Agreement
Var x: agreement variable
Begin
  Repeat
    Broadcast phase: broadcast your local x;
    Collection phase: collect local copies of x from all the processors;
    Decision phase: x = decision function(x values from all processors);
  Until termination-condition
End.
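
To make the framework concrete, it can be expressed as the following Python sketch of a per-processor loop; the names broadcast, collect, decision_function and terminated are placeholders of ours that a concrete algorithm must supply:

def agreement(x, broadcast, collect, decision_function, terminated):
    # Run broadcast, collection and decision phases until the
    # termination condition holds; returns the agreed value.
    while not terminated(x):
        broadcast(x)                    # broadcast phase: send local x
        values = collect()              # collection phase: gather copies of x
        x = decision_function(values)   # decision phase: update local x
    return x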
By properly defining the decision function and the termination condition, one can fit any existing algorithm into this framework. A typical execution is b1, c1, d1, b2, c2, d2, . . . , a cyclic sequence of broadcast, collection and decision phases. By regulating the influence of earlier phases on the current phase, one can prove certain properties of the algorithm, such as correctness and convergence. An important parameter of an agreement algorithm is the ratio t/n, where t is the maximum number of processors whose failure can be tolerated. In the literature, there are many deterministic agreement algorithms. The first reference is found in Pease [8], where the task is referred to as an interactive consistency problem. The Byzantine Generals problem [9] is probably the best-known characterization of the AP; it shows that it is impossible to achieve agreement when t ≥ n/3, and for t < n/3 it provides an algorithm that achieves the desired agreement in t + 1 rounds using an exponential number of messages. Even though this exponential behavior makes the algorithm impractical for large systems, the algorithm does exhibit the optimal time complexity [10–14]. Fisher et al. [15] and Kwok and Ahmad [16] presented an impossibility result asserting that no deterministic agreement algorithm can survive even a single processor death in a totally asynchronous system consisting of asynchronous processors, asynchronous communication, and asynchronous message order. Later, Dolev et al. [17], Hasselbring [18], and Dymond and Ruzzo [19] strengthened this impossibility result by showing that it holds even when processors work in lock-step synchrony. In the wake of these impossibility results for deterministic algorithms, we present instead a random agreement algorithm for the totally asynchronous system that achieves the desired consensus among all processors in an expected five rounds of message exchanges, even when one-fifth of the processors may have stopped completely. The new agreement algorithm is very simple, yet, under the assumption of uniform message distribution, it achieves the desired agreement in fewer rounds of message exchange than all existing random algorithms [20, 21].
The word deterministic in deterministic algorithms reflects the fact that each state transition in the protocol is fixed once the input state is specified, whereas in random algorithms the outcome of some transitions may be chosen uniformly from a set of possible outcomes. Since our abstract model is applicable to both deterministic and random algorithms, we can explore the feasibility of randomizing each phase, one at a time, as follows.

• Randomizing the broadcast phase: A broadcast must be complete to ensure the maximum information exchange among all processors and consequently to expedite the convergence of the protocol. Therefore, we cannot apply randomization to this phase.
• Randomizing the collection phase: The concept of an adversary is useful for this exposition. An adversary is an external entity that can select which t processors fail, and can systematically change the contents of faulty messages. An adversary can inflict the most damage if it can force all the proper processors to accept its malicious messages. Naturally, for a speedier agreement, a proper processor must strive to contain the damage caused by a malicious adversary. If a proper processor can somehow keep the adversary guessing until the last moment before accepting any messages, the damage is reduced relative to the scenario where the adversary can force a proper processor to accept a faulty message. In this spirit, we present an optimization technique based on random sampling as an example of a random collection phase.
• Randomizing the decision phase: This is the usual way to exercise randomization. The main idea is as follows: if at any instant during the decision phase a processor discovers that with its current information it cannot decide in favor of any common value for the agreement variable, it sets its own agreement variable to a value selected at random from the set of all possible outcomes.

2.1 Proposed Randomized Agreement Algorithm

We assume synchronous processors, asynchronous communication and no message ordering. The processors can fail only in a fail-stop manner, meaning that they can malfunction only by completely stopping all their activities. Even though the communication is asynchronous, it is reliable in that messages are delivered without any alteration in transit. However, messages can be delivered in an order different from the one in which they are sent on a channel. It is worth noting that none of the algorithms presented in the literature satisfies the simultaneity property [12, 16, 22–24]; instead, they satisfy the property of coordinated agreement, which can be thought of as a relaxed form of simultaneous agreement. We define a new class of agreement algorithms by giving a parametric decision function:

If the number of messages with the majority x value ≥ c
Then x = majority x value
Else x = 0 or 1, each with probability 1/2

where (N − t)/2 < c ≤ (N − t).
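
As an illustration, this parametric decision function for a binary x admits a direct Python transcription (the function and argument names are ours):

import random

def decide(values, N, t, c):
    # values: the N - t collected copies of x (each 0 or 1).
    # Requires (N - t)/2 < c <= N - t.
    assert (N - t) / 2 < c <= N - t and len(values) == N - t
    majority = max((0, 1), key=values.count)
    if values.count(majority) >= c:
        return majority
    return random.randint(0, 1)   # 0 or 1, each with probability 1/2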



Since communication is asynchronous, messages from an earlier broadcast phase may get mixed up with messages from the current broadcast phase. To eliminate any such confusion, we introduce the concept of a round number and tag the current round number onto every message generated in the current broadcast phase. We restrict ourselves to the AP for a binary x, since Turpin [25], Perry [26], and Diniz and Rinard [21] have independently shown that agreement on a multi-valued variable can be reduced to agreement on a binary-valued variable. Our algorithm is uniform in the sense that all processors follow the same algorithm. We also use c = (N − t + 1)/2, the leanest majority in a collection of N − t messages. Our algorithm is as follows.

Algorithm Agreement
Var
  N: Integer {total number of processors}
  t: Integer {maximum number of faulty processors: t/N < 1/3}
  x: 0 or 1 {binary agreement variable}
  Plurality: Integer {number of messages with the majority x value}
  Round: Integer {current round number}
Begin
  Round = 0;
  Repeat
    Broadcast local x;
    Collect N − t messages;
    Plurality = number of messages with the majority x value;
    If Plurality ≥ (N + t + 1)/2
      Then decide on the majority x value
    Else if Plurality ≥ (N − t + 1)/2
      Then x = majority x value
    Else
      x = 0 or 1, each with probability 1/2;
    Round = Round + 1;
  Forever
End.

The salient features of the algorithm are:
• a simple and single decision function,
• all processors follow the same algorithm, and
• all processors terminate simultaneously.
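
To make the round structure concrete, a per-node round of this algorithm can be sketched in Python as follows; the helper name one_round and the list-based message representation are illustrative choices of ours, with the transport layer abstracted away:

import random

def one_round(x, messages, N, t):
    # messages: the N - t x values (0/1) collected in this round.
    # Returns (new_x, decided).
    majority = max((0, 1), key=messages.count)
    plurality = messages.count(majority)
    if plurality >= (N + t + 1) / 2:       # strong majority: decide
        return majority, True
    if plurality >= (N - t + 1) / 2:       # weak majority: adopt it
        return majority, False
    return random.randint(0, 1), False     # otherwise: fair coin flip

Each node would tag x with the current round number when broadcasting, so that messages from different rounds are never mixed.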

2.2 Example of Randomized Agreement

We claim that any agreement algorithm can be described in terms of our abstract model. As
an example, we describe Ben-Or’s consensus algorithm for fail-stop processors [22] using our
abstract model as illustrated in Figure 1.
FIGURE 1 Ben-Or's consensus algorithm for fail-stop processors.

We now describe an optimization technique that is applicable to any random algorithm. A decision function can easily be configured to decide on the basis of the ratio of faulty to proper nodes rather than on the actual number of nodes in the system. Since random sampling does not modify the ratio of faulty to proper nodes, it is quite feasible for a node to select, at random, a partial number of messages out of a possibly large number of actual messages and still decide correctly, as long as a certain minimum number of messages is received (t + 1 in the case of Ben-Or's algorithm in Figure 1). For example, suppose there are 500 nodes in the system, of which 100 are faulty. According to Ben-Or's algorithm, each node should wait for N − t = 400 messages in every collection phase. Following the new algorithm, it can instead randomly select the first t + 1 = 101 messages out of the total 400 and still decide correctly. The exact manner in which the selection is done can be based on the evaluation of a random toss at the receipt of each message (the coin must be suitably biased to turn heads up exactly 101 times out of 400 tosses). If the toss comes up heads, the message is accepted; otherwise the message is rejected. This sampling strategy reduces the processing requirement at each node by reducing the wait to only 101 messages instead of 400. In addition, the random toss removes the possibility that the adversary forces all of its faulty messages into the accepted sample. In fact, one can establish the general result that random sampling is useful in any agreement algorithm that is sensitive only to the ratio of faulty to proper nodes, and not to the actual number of nodes.
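
The biased toss described above can be realized with the classical sequential-sampling rule: accept the next message with probability (number still needed)/(number still to arrive), which accepts exactly k of the total arrivals and makes every k-subset equally likely. The following Python sketch is illustrative (the names are ours):

import random

def sample_on_arrival(stream, total, k):
    # Accept exactly k of `total` arriving messages, deciding at the
    # receipt of each one; every k-subset is equally likely.
    accepted, seen = [], 0
    for msg in stream:
        need, left = k - len(accepted), total - seen
        if random.random() < need / left:   # suitably biased toss
            accepted.append(msg)
        seen += 1
        if len(accepted) == k:
            break
    return accepted

# N = 500, t = 100: sample t + 1 = 101 of the N - t = 400 messages.
assert len(sample_on_arrival(iter(range(400)), 400, 101)) == 101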

2.3 Correctness of Randomized Agreement Algorithm

The correctness of an agreement algorithm can be defined as the conjunction of the following three properties:

Agreement: All the proper nodes decide upon the identical value v.
Non-triviality: If all proper nodes start with the identical value v, the final decision is v.
Simultaneity: All proper nodes terminate simultaneously.

No algorithm in the literature satisfies the simultaneity property [13, 16, 23, 27–29]; instead, they satisfy the property of coordinated agreement. Coordinated agreement can be thought of as a relaxed form of simultaneous agreement, meaning that if a proper node decides on v in round r, all other proper nodes decide on v in round r + 1. It seems that simultaneity, though desirable, is not necessary for some distributed applications. For example, in commit protocols, as long as all proper nodes make a unanimous choice of either commit or abort, the protocol works fine regardless of the actual time of committing. Moreover, Perry [26] and Roosta [30] suggest ways to convert a coordinated agreement into a simultaneous agreement; basically, the strategy is to delay the earlier-deciding nodes until all nodes are ready to decide. Our algorithm satisfies the coordinated agreement property. Since the coordinated agreement property implies the agreement property, we next prove the non-triviality and coordinated agreement properties for our algorithm.

THEOREM 1 (Non-triviality) Our algorithm solves the strong AP in the sense that if all proper nodes start with the identical input v, they all decide on v in one round.

Proof If all N − t proper nodes start with x = v, then each proper node will collect at least (N − t) − t = N − 2t messages carrying the value v in its collection phase, since at most t of the N − t collected messages can originate from faulty nodes. For N ≥ 5t + 1, N − 2t ≥ (N + t + 1)/2. Therefore, all nodes will decide on v in their first round. □
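
The threshold arithmetic used in this proof can be made explicit; in LaTeX form (our restatement):

% Of the N - t messages a proper node collects, at most t originate
% from faulty nodes, so at least N - 2t carry the value v; deciding
% requires (N + t + 1)/2 of them.
\[
  N - 2t \;\ge\; \frac{N + t + 1}{2}
  \iff 2N - 4t \;\ge\; N + t + 1
  \iff N \;\ge\; 5t + 1 .
\]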

THEOREM 2 (Coordinated agreement) If a proper node decides on v in round r, then all other proper nodes decide on v in round r + 1.

Proof If a proper node decides on v in round r, it must have received at least (N + t + 1)/2 messages containing x = v. Every other proper node will receive at least (N + t + 1)/2 − t of these messages (it may miss t of them if it has already accepted t messages from the faulty nodes). Since (N + t + 1)/2 − t = (N − t + 1)/2, all undecided proper nodes will set x = v, v being the majority x value. Thus, at the end of round r, all proper nodes (including the one that decided) have x = v. By Theorem 1, all the undecided nodes will decide on v in the next round, r + 1. By a similar argument, we can relax the hypothesis of Theorem 1 to read as follows. □

COROLLARY 1 If at least (N + t + 1)/2 proper nodes start with the identical input v, all proper nodes will decide on v in two rounds.

Proof As shown in the proof of Theorem 2, all proper nodes will set x = v in the first round, and therefore they all start the second round with x = v. By Theorem 1, all nodes will decide on v in the second round. Thus, our algorithm satisfies all three properties of correctness: agreement, non-triviality, and coordinated agreement. □

3 SELF-STABILIZATION ALGORITHM

A self-stabilizing distributed system can be defined as a system of finite-state processors which, when started in any initial configuration, always converges to a legitimate configuration. Such a system exhibits behavior that is very useful and desirable for any distributed computation: after a transient error, the system automatically recovers and returns to a legitimate configuration within a finite number of state transitions. Some notable references that introduced the concept of self-stabilizing algorithms are Kruijer [31], Brown et al. [32], Roosta [33] and Arpaci-Dusseau [34], in which the algorithms are asymmetric in the sense that there is at least one special processor that follows a different protocol from the rest of the processors. We present an asymmetric self-stabilizing algorithm for acyclic directed-graph structures. Our algorithm extends Kruijer's algorithm to acyclic digraph structures by reducing the digraph to a logical tree in which a processor can have all of its predecessors acting as its parents, thereby enforcing a strict layer-by-layer order on the information flow. Self-stabilizing systems, by definition, can survive transient failures, but our self-stabilization algorithm can tolerate even the permanent failure of nodes and edges as long as these failures do not disconnect any node from the root of the digraph. Consider an acyclic digraph whose nodes and edges correspond to the processors and channels of the distributed system, respectively. There is no shared memory and, apart from its local state, a node can access only the states of its adjacent neighbors. For each node, there are a number of conditions (privileges) defined in terms of its local state and the states of its neighbors only (i.e. the control is distributed). When a privilege is true for a node, we say that it is present for that node, and that node can make the corresponding state transition (move). If more than one privilege is present for a node, we assume that the node non-deterministically selects one of the privileges for the next move. However, if privileges are present for more than one node, the enabled nodes can make their moves concurrently. We have defined the privileges in such a way that no deadlock or inconsistency occurs when two neighboring nodes make concurrent moves. One can view self-stabilization as a coordination of local actions to ensure convergence to some global objective. As a result, the exact definition of a legitimate configuration is left open as long as legitimate configurations satisfy the global objective. Nonetheless, there are some consistency requirements for a legitimate state. Dijkstra [35] uses the following definition to determine a legitimate state.

1. In each legitimate state, one or more privileges will be present.
2. In each legitimate state, each possible move will bring the system again into a legitimate state.
3. Each privilege must be present in at least one legitimate state.
4. For any pair of legitimate states, there exists a sequence of moves transferring the system from one into the other.

The system is self-stabilizing if and only if:

5. Regardless of the initial state and the privilege selected each time for the next move, at least one privilege will always be present and the system is guaranteed to find itself in a legitimate state after a finite number of moves.

In what follows, we present the new self-stabilizing algorithm for acyclic digraph-structured distributed systems.

3.1 Proposed Self-stabilization Algorithm

For ease of explanation, we number the nodes in the system from 1 to N. We use i, j, k, . . . to refer to nodes and (i, j) to represent a directed channel from i to j. Each node has a finite-state machine with an even number (2K, where K is a constant ≥ 2) of states. The state of a node i is encoded by a pair of values (S[i], Eq[i]), where S[i] can take on values from 0 to K − 1 and Eq[i] is a Boolean variable. Node i maintains information about its predecessor and successor nodes in the local variables Pred[i] and Succ[i], respectively (the direction of a channel corresponds only to the structure of the system; a node can directly access the states of both its successors and its predecessors). Succ[i] and Pred[i] can be dynamically modified to reflect the current communication channels of the system, provided that the following invariant is maintained.

Invariant A: Succ[i] ∩ Pred[i] = ∅

One of the nodes in the system is designated the root and does not change during the execution of the algorithm. All other nodes are identical in every respect; the root node follows a different protocol than all other nodes.

Invariant B: There exists a directed path from the root to every other node in the system.

Since the digraph is acyclic, invariant B implies that there is no directed path from any other node to the root (otherwise the digraph would contain a cycle). Each processor i can enjoy two privileges, A and B {format: privilege ⇒ move}:
Privilege A: ¬Eq[i] ∧ Test[i] ⇒ Eq[i] := true
Privilege B (root): Eq[i] ⇒ S[i], Eq[i] := S[i] ⊕ 1, false
Privilege B (others): Eq[i] ∧ New_S[i] ⇒ S[i], Eq[i] := New, false

where ⊕ denotes modulo-K addition and the predicates Test and New_S are defined as follows:

Test[i] = ⟨∀j : (j ∈ Succ[i]) :: Eq[j] ∧ (S[j] = S[i])⟩
  {true when all the successors of i have stabilized}
New_S[i] = ⟨∃New : ∀j : (j ∈ Pred[i]) :: (S[j] = New) ∧ (S[i] ≠ New)⟩
  {true when all the predecessors of i have a common S value New ≠ S[i]}

Apart from knowing its own state (S[i], Eq[i]), node i only needs to know the states (S, Eq) of its successor nodes (to evaluate Test[i]) and the S values of its predecessor nodes (to evaluate New_S[i]). This conforms to the assumption of distributed control. The legitimate states of the system are defined to be:

1. the so-called perfect states: S[1] = S[2] = · · · = S[N] and Eq[1] = Eq[2] = · · · = Eq[N] = true, and
2. the states that arise from perfect states by the completion of one or more valid moves.
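
To fix the semantics, the two predicates and the corresponding moves can be transcribed into Python as follows; the dictionary-based representation of S, Eq, Succ and Pred (keyed by node number) is an illustrative choice of ours, not part of the algorithm:

def test(i, S, Eq, Succ):
    # Test[i]: every successor of i has stabilized on i's S value.
    return all(Eq[j] and S[j] == S[i] for j in Succ[i])

def new_s(i, S, Pred):
    # New_S[i]: all predecessors of i share an S value different from
    # S[i]; returns that value, or None if the predicate is false.
    vals = {S[j] for j in Pred[i]}
    if len(vals) == 1 and S[i] not in vals:
        return vals.pop()
    return None

def move(i, S, Eq, Succ, Pred, root, K):
    # Fire one enabled privilege for node i; return True if i moved.
    if not Eq[i] and test(i, S, Eq, Succ):          # privilege A
        Eq[i] = True
        return True
    if Eq[i]:                                       # privilege B
        if i == root:
            S[i], Eq[i] = (S[i] + 1) % K, False
            return True
        new = new_s(i, S, Pred)
        if new is not None:
            S[i], Eq[i] = new, False
            return True
    return False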

3.2 Example of Self-stabilization

Consider a system in which K = 2, i.e. S[i] can be either 0 or 1. There are five nodes in the system, numbered 1–5, and node 1 is the root. Nodes are represented by circles and channels by edges. The local values of Pred and Succ are shown beside each node. The S value is shown inside the circle, and the Eq value is given by the color of the node: we say a node is black if Eq[i] is false and white otherwise. Any privileges present are shown enclosed in boxes placed adjacent to the enabled node (Fig. 2).

FIGURE 2 Initial state (which is perfect).

In the initial state, all nodes are white and have S = 0 (a perfect state). In this state, only privilege B is present, for the root node (node 1), and consequently it makes the corresponding transition by increasing its S to 1 and turning itself black, as shown in Figure 3.

FIGURE 3 State after first move.

Now the only privilege present is B for machine 2; it makes the move, bringing the system into the state shown in Figure 4. Now the only privilege present is B for machine 3. After the move, the new state is shown in Figure 5.

FIGURE 4 State after second move.

FIGURE 5 State after third move.

In this new state, two privileges are present: B for each of machines 4 and 5. Since 4 and 5 are not neighbors, they move concurrently, bringing the system to the state shown in Figure 6. In the current state, all machines are black. The condition Test evaluates to true for machines 4 and 5, thereby making privilege A present for both machines. They move concurrently, bringing the system into the new state shown in Figure 7.

FIGURE 6 State after fifth move.

FIGURE 7 State after seventh move.

Machines 4 and 5 have become white, and the only privilege present is A for machine 3; Figure 8 shows the result of making that move. In this state, the only privilege present is A for machine 2. The corresponding move results in the state shown in Figure 9. Now the only privilege present is A for machine 1, the root. The move brings the system to the perfect state shown in Figure 10.

FIGURE 8 State after eighth move.

FIGURE 9 State after ninth move.

In this final state, all machines are white again. The root node has privilege B present and can restart a similar sequence of moves, bringing the system forward (2 mod 2 = 0) to the perfect state of Figure 2. It is worth noting that in this transition from the perfect state of Figure 2 to the perfect state of Figure 10, there is a total of 10 moves, and each machine enjoyed both privileges A and B exactly once.

FIGURE 10 State after tenth move (a perfect state).
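
The ten-move cycle of this example can be reproduced by a small self-contained simulation. The edge set below is our reconstruction from the description (it realizes the paths 1–2–3–5, 1–2–5 and 1–3–5 discussed in Section 3.3, with node 4 a second leaf under node 3); the figures may depict a different but equivalent digraph:

K, root = 2, 1
Succ = {1: [2, 3], 2: [3, 5], 3: [4, 5], 4: [], 5: []}   # assumed DAG
Pred = {i: [j for j in Succ if i in Succ[j]] for i in Succ}
S = {i: 0 for i in Succ}            # initial perfect state
Eq = {i: True for i in Succ}

def enabled(i):
    # Return 'A', 'B', or None according to the privileges of node i.
    if not Eq[i] and all(Eq[j] and S[j] == S[i] for j in Succ[i]):
        return 'A'
    if Eq[i]:
        if i == root:
            return 'B'
        vals = {S[j] for j in Pred[i]}
        if len(vals) == 1 and S[i] not in vals:
            return 'B'
    return None

moves = 0
while True:
    ready = [(i, enabled(i)) for i in Succ if enabled(i) is not None]
    for i, p in ready:               # enabled nodes move concurrently
        if p == 'A':
            Eq[i] = True
        elif i == root:
            S[i], Eq[i] = (S[i] + 1) % K, False
        else:
            S[i], Eq[i] = S[Pred[i][0]], False
        moves += 1
    if moves and all(Eq.values()) and len(set(S.values())) == 1:
        break

print(moves)    # prints 10: each node used privileges A and B once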

3.3 Correctness of Self-stabilization Algorithm

We now verify that our algorithm satisfies properties 1–5. First, we formalize the notion of the depth of a node.

DEFINITION The depth d of a node is the length of the longest path from the root to it.

For example, in Figure 2, the depth of node 5 is 3, corresponding to the path 1–2–3–5; the other two paths, 1–2–5 and 1–3–5, only have length 2. From the definition of depth, it immediately follows that there cannot be a path between two nodes at the same depth, and that a node cannot have an incoming path from nodes at greater depths.
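
As an illustration, the depth of every node in the digraph assumed in the simulation above can be computed by a longest-path recursion over predecessors:

from functools import lru_cache

Succ = {1: [2, 3], 2: [3, 5], 3: [4, 5], 4: [], 5: []}   # assumed DAG
Pred = {i: [j for j in Succ if i in Succ[j]] for i in Succ}

@lru_cache(maxsize=None)
def depth(i, root=1):
    # Depth of the root is 0; otherwise one more than the deepest
    # predecessor, i.e. the longest path from the root to i.
    return 0 if i == root else 1 + max(depth(j) for j in Pred[i])

assert depth(5) == 3   # via path 1-2-3-5; paths 1-2-5 and 1-3-5 have length 2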

THEOREM 3 In each state of the system at least one privilege is present.

Proof Assume the system is in a state in which no privilege is present. Since Test is always true for each leaf node i, the absence of privilege A ⇒ Eq[i] = true, and the absence of privilege B ⇒ New_S[i] = false. ¬New_S[i] ⇒ ∀j : (j ∈ Pred[i]) :: (S[j] = S[i]). This holds, in particular, for all leaf nodes at the deepest depth dMax in the digraph. Hence, for each node j at depth dMax − 1, Test must be true. Now we can repeat the above argument for all nodes at depth dMax − 1. By repeated application of this reasoning, we reach the root of the digraph, which is at depth 0. The absence of privilege A at the root ⇒ Eq[root] = true. But then privilege B is present at the root: a contradiction. □

THEOREM 4 Any infinite sequence of moves must necessarily involve an infinite number of moves by the root.

Proof Suppose the contrary: after a finite number of moves, the state of the root i0 (depth 0) remains constant. Then there must exist a subgraph rooted at some i1, i1 ∈ Succ(i0) with depth(i1) = 1, in which an infinite number of moves take place. It follows from the definition of depth that the subgraph rooted at i1 does not violate invariant B. But after the moment i1 reaches its final state, there are at most three moves involving i1; hence, after a finite number of moves, the state of i1 remains constant. By repeated application of the above argument, we arrive at a leaf node ileaf at depth dMax with which an infinite number of moves take place while the states of the predecessors of ileaf remain constant: a contradiction. □

THEOREM 5 Regardless of the initial state and regardless of the privilege selected each time for the next move, the system will find itself in a perfect state after a finite number of moves.

Proof Suppose that, starting at some initial state, there exists an infinite sequence of moves such that no move leaves the system in a perfect state. It follows from Theorem 4 that in this infinite sequence, the root i0 must make an infinite number of moves. Concentrate on the consecutive moves of i0 at the initiation of which Eq[i0] alternately has the values (false, true, false, true, . . .). We denote those moves by their serial numbers (m1, m2, m3, m4, . . .) in the infinite sequence. We know that only privilege B can enable i0 to make a transition from Eq[i0] = true to Eq[i0] = false and, similarly, only privilege A (Test[i0] ∧ ¬Eq[i0]) can enable i0 to make a transition from Eq[i0] = false to Eq[i0] = true.

Since m1 changes Eq[i0] from false to true, i0 must have enjoyed privilege A. Therefore Test[i0] = true at the initiation of m1. From the definition of Test, Test[i0] ⇒

S[i] = S[i0] ∧ Eq[i] = true (∗)

for all nodes at depth 1. After m1 has established Eq[i0] = true, (∗) continues to hold until the completion of m2 (privilege B) yields the situation that S[i] = S[i0] ⊖ 1 and Eq[i] = true for all nodes at depth 1 (⊖ is modulo-K subtraction) while Eq[i0] = false. At the initiation of move m3, Test[i0] is again true. In order to achieve this, each successor of i0 must have made exactly two moves in the time between m2 and m3: the first move established S[i] = S[i0] and Eq[i] = false, and the second move established Eq[i] = true at a moment when Test[i] = true, which implies that from then on (∗) continues to hold, at least until m3, for both i and its successors. Hence, at the initiation of m3, we find (∗) fulfilled by all nodes at depths 1 and 2. Repeated application of the above argument shows that (∗) becomes valid, after a finite number of moves, for all nodes i excluding i0 at the initiation of a move that establishes Eq[i0] = true and hence leaves the system in a perfect state: a contradiction. □

THEOREM 6 Let the system be in a perfect state, and denote the common value of the variables S[i] (i = 1, . . . , N) by S0. Then the next time the system finds itself in a perfect state again is after exactly 2N moves; in this new perfect state the common value of the variables S[i] (i = 1, . . . , N) equals S0 ⊕ 1, and in the sequence of moves that has established the transformation from the old perfect state into the new one, each privilege has been used exactly once for the completion of a move.

Proof The proof follows reasoning similar to that of Theorem 5 and is omitted for brevity. □

Our algorithm is thus correct, as requirements 1–5 are met:
• the definition of a legitimate state ⇒ 2;
• Theorems 5 and 6 ⇒ 5;
• 5 ⇒ 1;
• Theorem 6 ⇒ 3 and 4.

4 CONCLUSION

In this article, we presented new algorithms that belong to the class of synchronization problems. In Section 2, we suggested a new randomized agreement algorithm to synchronize cooperating processors. Our approach obtains the desired agreement in an expected five rounds of message exchanges in distributed systems, tolerating the failure of at most one-fifth of the processors. In Section 3, we presented a new self-stabilization algorithm for acyclic directed-graph structures. The algorithm is fault-tolerant in the sense that it survives all imaginable faults in the system.

References
[1] Attie, P. A. and Emerson, E. A. (2001). Synthesis of concurrent programs for an atomic read/write model of
computation. ACM Transactions on Prog. Lang. and Sys., 23(2), 187–242.
[2] Burns, J. and Pachl, J. (1989). Uniform self-stabilizing rings. ACM TOPLAS, 11(2), 330–344.
[3] Dijkstra, E. W. (1986). A belated proof of self-stabilization. Journal of Distributed Computing, 1, 5–6.
[4] Gehani, N. (1984). Broadcasting sequential processes. IEEE Transactions on Software Engineering SE-10, 4,
343–351.
[5] Lamport, L. (1978). Time, clocks, and the ordering of events in a distributed system. CACM, 21(7), 558–565.
[6] Rabin, M. O. (1983). Randomized Byzantine Generals. 24th IEEE FOCS, 403–409.
[7] Abu-Amara, H. (1988). Fault-tolerant distributed algorithm for election in complete networks. IEEE Transactions
on Computers, C-37(4), 449–453.
[8] Pease, M., Shostak, R. and Lamport, L. (1980). Reaching agreement in the presence of faults. JACM, 27(2), 228–234.
[9] Lamport, L., Shostak, R. R. and Pease, M. (1982). The Byzantine Generals problem. ACM TOPLAS, 4(3), July,
382–401.
[10] Fisher, M. J. and Lynch, N. A. (1982). A lower bound for the time to assure interactive consistency. Information
Processing Letters, 14(4), June, 183–186.
[11] Dolev, D. (1982). Polynomial algorithms for multiple processor agreement. 14th ACM Symposium on Theory of
Computing, 383–400.
[12] Perry, K. J. (1987). A framework for agreement. Proceedings of 2nd International Workshop on Distributed
Algorithms. 57–75.
[13] Perry, K. J. (1995). Randomized Byzantine agreement. IEEE Transactions of Software Engineering, 6, 539–546.
[14] Bilas, A., Jiang, D. and Singh, J. P. (2001). Accelerating shared virtual memory via general-purpose network interface support. ACM Transactions on Computer Systems, 19(1), 1–35.
[15] Fisher, M. J., Lynch, N. A. and Paterson, M. S. (1985). Impossibility of distributed consensus with one faulty
process. JACM, 32(2), 374–382.
[16] Kwok, Y.-K. and Ahmad, I. (1999). Static scheduling algorithms for allocating directed task graphs to
multiprocessors. ACM Surveys, 31(4), December, 406–471.
[17] Dolev, D., Dwork, C. and Stockmeyer, L. (1987). On the minimal synchronization needed for distributed systems.
JACM, 34(1), 77–97.
[18] Hasselbring, W. (2000). Programming languages and systems for prototyping concurrent applications. ACM
Computing Surveys, 32(1), March, 43–79.
[19] Dymond, P. W. and Ruzzo, W. L. (2000). Parallel RAMs with owned global memory and deterministic context-free
language recognition. Journal of the ACM, 47(1), January, 16–45.
[20] Feldman, P. and Micali, S. (1988). Optimal algorithms for Byzantine agreement. 20th ACM Symposium on Theory
of Computing, 148–161.
[21] Diniz, P. and Rinard, M. C. (1999). Eliminating synchronization overhead in parallelized programs using dynamic
feedback. ACM Transactions on Computer Systems, 17(2), May, 89–132.
[22] Ben-Or, M. (1983). Another advantage of free-choice: completely asynchronous agreement protocols. 2nd ACM
PODC, 27–30.
[23] Mendelson, A. and Gabbay, F. (2001). The effect of communication on multiprocessing systems. ACM Transactions on Computer Systems, 19(2), May, 252–281.
[24] Reischuk, R. (1999). A new solution for the Byzantine Generals problem. Information and Control, 64, 23–34.
[25] Turpin, R. (1984). Extending binary Byzantine agreement to multi-valued Byzantine agreement. Information
Processing Letters, 18, February, 73–76.
[26] Perry, K. J. (1985) Randomized Byzantine agreement. IEEE Trans. of Software Engineering, SE-11, 6, 539–546.
[27] Roosta, S. (2002). Implicit and explicit synchronization in parallel computers. ACM Computing Surveys,
submitted for publication.
[28] Yeung, D. (2000). MultiGrain shared memory. ACM Transactions on Computer Systems, 18(2), May, 154–196.
[29] Keleher, P. J. (2000). A high-level abstraction of shared accesses. ACM Transactions on Computer Systems,
18(1), February, 1–36.
[30] Roosta, S. (2002). Performance evaluation models for parallel computers. ACM Transactions on Computer
Systems, submitted for publication.
[31] Kruijer, H. S. (1979). Self-stabilization in tree-structured systems. Information Processing Letters, 8(2), February,
91–95.
[32] Brown, G. M., Gouda, M. G. and Wu, C. (1989). Token systems that self-stabilize. IEEE Transactions on Computers, C-38(6), June, 845–852.
[33] Roosta, S. (2000). Parallel Processing and Parallel Algorithms: Theory and Computation. Springer.
[34] Arpaci-Dusseau, A. C. (2001). Implicit coscheduling: coordinated scheduling with implicit information in distributed systems. ACM Transactions on Computer Systems, 19(3), 283–331.
[35] Dijkstra, E. W. (1974). Self-stabilizing systems in spite of distributed control. CACM, 17(11), 643–644.