[Figure 1: Transis performance: messages per second and utilization as a function of message size (bytes).]
a process can join groups and multicast messages to groups. Using Transis, messages can be addressed to the entire process group by specifying the group name (a character string selected by the user). The group membership changes when a new process joins or leaves the group, a processor containing processes belonging to the group crashes, or a network partition or re-merge occurs. Processes belonging to the group receive a configuration change notification when such an event occurs.

Transis incorporates sophisticated algorithms for membership and reliable ordered delivery of messages that tolerate message omission, processor crashes and recoveries, and network partitions and re-merges. High performance is achieved by utilizing non-reliable broadcast or multicast where possible (such as on local area networks). Transis performance can be seen in Figure 1.

The Transis application programming interface (API) provides a connection-oriented service that, in principle, extends a point-to-point service such as TCP/IP to a reliable multicast service. The API contains entries that allow a process to connect to Transis, to join and leave process groups, to multicast messages to process groups, to receive messages and to disconnect.

Transis is implemented as a daemon. The Transis daemon handles the physical multicast communication. It keeps track of the processes residing in its computer which participate in group communication, and also keeps track of the computer's membership (i.e. connectivity). The benefits of this structure are significant. The main advantages in our context are:

- Message ordering and reliability are maintained at the level of the daemons and not on a per-group basis. Therefore, the number of groups in the system has almost no influence on system performance.

- Flow control is maintained at the level of the daemons rather than at the level of the individual process group. This leads to better overall performance.

- Implementing an open group semantics is easy (i.e. processes that are not members of a group can multicast to that group).
A process is linked with a library that connects it to the Transis daemon. When the process connects to Transis, an inter-process communication handle (similar to a socket handle) is created. A process can maintain multiple connections to Transis. A process may voluntarily join specific process groups on a specific connection. A message which is received can be a regular message sent by a process, or a membership notification created by Transis regarding a membership change of one of the groups to which this process belongs. Transis service semantics is described in [2, 9].
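To make this programming model concrete, the following C sketch shows how a client of a group communication service of this kind might be structured. The function names (gc_connect, gc_join, gc_multicast, gc_receive, gc_disconnect) and the message layout are hypothetical placeholders declared as extern stubs, not the actual Transis API; they only mirror the entries described above, with regular messages and membership notifications delivered on the same connection.

    /* Hypothetical client of a Transis-like group communication service. */
    #include <stdio.h>

    typedef enum { GC_REGULAR, GC_MEMBERSHIP } gc_msg_kind;

    typedef struct {
        gc_msg_kind kind;        /* regular message or membership notification */
        char        group[32];   /* group the message was addressed to         */
        char        body[1024];  /* payload or encoded membership view         */
        int         body_len;
    } gc_message;

    /* Assumed library entry points; the real Transis signatures may differ. */
    extern int  gc_connect(const char *daemon_name);      /* returns a handle */
    extern int  gc_join(int handle, const char *group);
    extern int  gc_multicast(int handle, const char *group,
                             const void *buf, int len);
    extern int  gc_receive(int handle, gc_message *out);  /* blocks           */
    extern void gc_disconnect(int handle);

    int main(void)
    {
        int handle = gc_connect("transis");   /* one of possibly many connections */
        gc_join(handle, "Cluster");           /* membership notifications follow  */

        const char hello[] = "hello group";
        gc_multicast(handle, "Cluster", hello, (int)sizeof(hello));

        for (;;) {
            gc_message m;
            if (gc_receive(handle, &m) < 0)
                break;
            if (m.kind == GC_MEMBERSHIP)
                printf("view change in group %s\n", m.group);
            else
                printf("message from group %s: %.*s\n",
                       m.group, m.body_len, m.body);
        }
        gc_disconnect(handle);
        return 0;
    }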
Transis has been operational for almost three years now. It is used by students of the Distributed Systems course at The Hebrew University and by the members of The High Availability Lab. Several projects were implemented on top of Transis, among them a highly available mail system, two types of replication servers and several graphical demonstration programs.

Ongoing work on the Transis project focuses, among other things, on security and authentication of users, which is important for useful distributed system management tools.

4 The Architecture

The architecture is composed of two layers as depicted in Figure 2. The bottom layer is Transis, our group communication toolkit, which provides reliable multicast and membership services. The top layer is composed of a management server and a monitor. Although we use Transis as our group communication layer, other existing toolkits such as Totem [4], Horus [13] or Newtop [7] could have been used.

The management server provides two classes of services: long-term services and short-term ones. Long-term services provide consistent semantics across partitions and over time. They are used for replication of network tables (maps) such as the password database, which are maintained on secondary storage. These services implement an efficient replica control protocol that applies changes on a per-update basis.
[Figure 2: The architecture: an external application on top of Transis at each node.]
Short-term services are reliable as long as the network is not partitioned and the management server does not crash. In case of a network partition or a server crash, the monitor and the management servers receive a notification from Transis. The application may be informed and may take whatever steps necessary. Short-term services include simultaneous task execution and software installation.

The monitor provides a user interface to the services of the management server. The monitor is a process which may run on any of the nodes that run Transis. Several monitors may run simultaneously in the network.

The management server runs as a daemon on each of the participating nodes. It is an event-driven program. Events can be generated by the monitor, another server or Transis.

Each server maintains a vector of contexts, with one entry for each monitor it is currently interacting with. Each entry contains (among other things) the current working directory of the server as set by the corresponding monitor.

The long-term services are a non-intervening extension of the current standard Unix NIS. Since the hosts NIS map repositories retain their original format, applications (e.g. gethostbyname) that use RPC to retrieve information from them are not changed. The service quality is improved because the replication scheme implemented by the management server guarantees consistency and is much more efficient compared to the ad-hoc solution provided by NIS.

The management server API contains the following entries (a sketch of one possible request encoding follows the list):

- Status. Return the status of the server and its host machine.

- Chdir. Change the server's working directory which corresponds to the requesting monitor.

- Simex. Execute a command simultaneously (more or less) on a number of specified hosts. The command is executed by each of the relevant servers relative to the working directory that corresponds to the initiating monitor.

- Siminst. Install a software package on a number of specified hosts. The installation is performed relative to the working directory that corresponds to the initiating monitor.

- Update-map. Update a map while preserving consistency between replicas.

- Query-map. Retrieve information from a map.

- Exit. Terminate the management server process.
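As a rough illustration of the entries above, and of the platform-independent representation discussed next, a request to the management server could be encoded along the following lines. The type names and fields are illustrative assumptions, not the prototype's actual format.

    /* Illustrative, platform-independent encoding of a management request. */
    typedef enum {
        MS_STATUS,      /* report server and host status         */
        MS_CHDIR,       /* set the per-monitor working directory */
        MS_SIMEX,       /* execute a command simultaneously      */
        MS_SIMINST,     /* install a software package            */
        MS_UPDATE_MAP,  /* consistent update of a replicated map */
        MS_QUERY_MAP,   /* read from a replicated map            */
        MS_EXIT         /* terminate the management server       */
    } ms_request_type;

    typedef struct {
        ms_request_type type;
        char monitor_group[64];  /* private group of the initiating monitor,
                                    used for sending back ACKs and results  */
        char argument[1024];     /* directory, command line, map entry, ... */
    } ms_request;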
In practice, sites may be heterogeneous both in terms of software (e.g. operating system) and hardware. We make use of a generic platform-independent representation for management commands and for the reports of their execution. This representation is the only format used for communication between the protocol entities. The Representation Converter (see Figure 2) is responsible for converting this generic representation into a platform-specific form. This architecture enables the support of new platforms with relative ease.

A prototype of the presented architecture was implemented on top of Transis and was tested in a cluster of Unix workstations. The code, developed in the C programming language, spans approximately 6500 lines. The table management protocol (the more sophisticated part) constitutes about half of the code.

5 Simultaneous Execution

The system manager may frequently need to invoke an identical management command on several machines. Potentially, the machines may be of different types. The activation of a particular daemon or script on several machines, or the shutdown operation of several machines, are good examples. Another example is the simultaneous monitoring of CPU load, memory usage and other relevant system parameters on all or part of the machines in a cluster.
Figure 3(a): The Monitor

    Initially:
        connect to Transis;
        join private group;
        join group Cluster;

    while (true) {
        m = receive();
        switch (m.type)
            case command from a user:
                NR = M;
                multicast(command, Cluster);
                while (NR ≠ ∅)
                    m = receive();
                    switch (m.type)
                        case view change message:
                            NR = NR \ (M \ m.M);
                            M = m.M;
                        case result of execution from a server:
                            NR = NR \ {server};
                            return the result;
            case view change message:
                M = m.M;
    }

    command can be one of the following: Chdir, Status, Simex or Exit.

Figure 3(b): The Management Server

    Initially:
        connect to Transis;
        join private group;
        join group Cluster;

    while (true) {
        m = receive();
        switch (m.type)
            case Chdir(dir) from the monitor M:
                contexts[M].working_dir = dir;
                send ACK to M's private group;
            case Status from the monitor M:
                get status of my machine;
                convert status to a system-independent form;
                send status to M's private group;
            case Simex(command) from the monitor M:
                convert command to a system-specific form;
                chdir(contexts[M].working_dir);
                result = execute(command);
                convert result to a system-independent form;
                send result to M's private group;
            case Exit:
                terminate my process;
    }
Figure 3(a) and Figure 3(b) present the pseudo-code of the relevant parts of the monitor and the management server respectively. The monitor maintains two sets: M and NR. M is the most recent membership of the group Cluster as reported by Transis. NR is the set of the currently connected management servers which have not yet reported the outcome of a command execution to the monitor.
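The only subtle step in the monitor's loop of Figure 3(a) is maintaining NR across view changes: a server that disappears from the membership will never report, so it must be dropped from NR as well. The following minimal C sketch of this bookkeeping uses small bit sets over server identifiers; this representation is an illustrative choice, not the prototype's data structure.

    /* Membership (M) and not-yet-reported (NR) sets as bit masks over server ids.
       Server ids are assumed to be smaller than the number of bits in a long.   */
    typedef unsigned long server_set;

    server_set M  = 0;   /* most recent membership of group Cluster */
    server_set NR = 0;   /* servers that have not reported yet      */

    /* A command is multicast: every currently connected server owes a result. */
    void on_command_sent(void) { NR = M; }

    /* A server reported its result. */
    void on_result(int server_id) { NR &= ~(1UL << server_id); }

    /* Transis delivered a new view new_M: servers in M but not in new_M are gone,
       i.e. NR = NR \ (M \ new_M), and then M = new_M.                            */
    void on_view_change(server_set new_M)
    {
        server_set departed = M & ~new_M;
        NR &= ~departed;
        M = new_M;
    }

    /* The command is complete (all results in, or reporters departed) when NR == 0. */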
It is easy to see how other tasks are integrated with the simultaneous execution task to form our proposed architecture.
6 Software Installation

Software installation and version upgrade constitute one of the most time-consuming system management tasks. In large heterogeneous sites which comprise tens or even hundreds of machines, there are often subgroups of computers with identical (or similar) architecture running copies of the same application software and operating system. Presently, it is a common practice to perform installation or upgrade by repeating the same procedure at all locations in the subgroup separately. Installation or upgrade procedures include the transfer of the packages, the execution of installation utilities and the update of relevant configuration files. Traditionally, all the above-mentioned operations are performed using the TCP/IP protocol. This approach is wasteful in terms of both bandwidth and time, due to the point-to-point nature of TCP/IP. In addition, repeating the same procedure many times is prone to human errors resulting in inconsistent installations.

In contrast, we use Transis to disseminate the relevant files to the members of the subgroup efficiently. We use the technique presented in Section 5 to execute the installation commands simultaneously at all the involved locations. Each command is submitted only once, reducing the possibility of human errors. Using the process group paradigm, the system administrator can dynamically organize hosts with the same installation requirements into a single multicast group.

Our installation protocol proceeds in the following steps. First, the monitor multicasts a message advertising the installation of a package P, the set Rp of its installation requirements (e.g. disk space, available memory, operating system version, etc.), the installation multicast group Gp and the target list Tp. Upon reception of this message, the management server joins Gp if the system which it controls conforms to Rp and belongs to Tp. When all the management servers from Tp have either joined Gp or reported that they will not participate in the installation, the monitor begins multicasting the files comprising the package P to the group Gp. Finally, the status of the installation at every management server is reported to the monitor. The Transis membership service helps detect hosts which may not have completed the installation due to a network partition or host crash.

The same protocol may later be repeated for a more restricted multicast group G′ ⊆ Gp.
The monitor questions the members of G′ about the missing files prior to the redistribution, and only the needed files are multicast to G′ in order to save bandwidth and time.
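To make the first step of the protocol concrete, the advertisement multicast by the monitor and the server-side decision to join the installation group could look roughly as follows. The field names, sizes and the local probe functions are assumptions for illustration only.

    #include <stdbool.h>
    #include <string.h>

    /* Advertisement for installing package P: requirements Rp, group Gp, targets Tp. */
    typedef struct {
        char package[64];        /* P                                  */
        long min_disk_kb;        /* part of Rp                         */
        long min_memory_kb;      /* part of Rp                         */
        char required_os[32];    /* part of Rp                         */
        char install_group[64];  /* Gp: multicast group for the files  */
        char targets[32][64];    /* Tp: names of the intended hosts    */
        int  num_targets;
    } install_advert;

    /* Assumed local probes; their implementation is host specific. */
    extern long free_disk_kb(void);
    extern long free_memory_kb(void);
    extern const char *local_os(void);
    extern const char *local_host(void);

    /* A management server joins Gp only if it is a target and meets Rp. */
    bool should_join(const install_advert *ad)
    {
        bool targeted = false;
        for (int i = 0; i < ad->num_targets; i++)
            if (strcmp(ad->targets[i], local_host()) == 0)
                targeted = true;
        return targeted
            && free_disk_kb()   >= ad->min_disk_kb
            && free_memory_kb() >= ad->min_memory_kb
            && strcmp(local_os(), ad->required_os) == 0;
    }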
7 Table Management

This section presents the protocol for efficient and consistent management of the replicated network tables, each of which represents a service. Servers which share replicas of the same table form the same service group (SG). A service group consists of an administratively assigned primary server and a number of secondary ones. For the sake of simplicity we will consider a single SG in the following discussion.

The primary server enforces a single total order on all the update messages inside the SG. This is achieved by forwarding each new update request from a client to the SG's primary. The primary creates an update message from the request, assigns it a unique sequence number, and multicasts this update message to the SG. Each secondary server applies the update messages to the SG's table in the order consistent with the primary's. This guarantees that all the servers in the same network component remain in a consistent state. If the network partitions, at most one component (the one that includes the primary) can perform new updates. Therefore, conflicting updates are never possible.

When a membership change (network partition or merge, or server crash) is reported by the group communication layer, the connected servers exchange information and converge to the most updated consistent state known to any of them. Note that this happens even if the primary is not a member of the current membership. The information exchange is done in two stages. In the first stage, the servers exchange state messages containing a vector representing their knowledge about the last update known to each server. In the second stage, the most updated server multicasts the updates that are missed by any member of the currently connected group. (If the primary server is present in a component, it will be the one performing the retransmission; otherwise, one of the most updated secondary servers is deterministically chosen.)

Each server logs all the update messages from the primary on non-volatile storage. This log is used for restoring a server's state when the server recovers from a crash. A server discards an update from the log when it learns that all the other servers have applied this update to their table (and hence, no server will need to recover that update in the future).

Data Structures

Each management server S ∈ SG maintains the following data structures:

- my_id: a unique identifier of S.

- p_id: the identifier of SG's primary server.

- MQ: a list of the updates received by S. MQ is retained on non-volatile storage.

- Vec: a vector of sequence numbers containing one entry for each of the SG's members. If Vec[i] = n then S knows that server i has all the updates up to n. Initially, all Vec's entries are 0. Vec is retained on non-volatile storage.

- SGT: the Transis group name of SG.

- Memb: the current membership of SGT as reported by Transis. This is a structure which contains a unique identifier of the membership (memb_id) and a set of currently connected servers (set).

- ARU ("all-received-up-to"): a sequence number such that S knows that all the updates with sequence numbers no greater than ARU were received and applied to the table by all the members of SGT. Note that ARU = min_{1≤i≤|Vec|}(Vec[i]).

- min_sn, max_sn: the minimal and maximal sequence numbers of update messages that need to be retransmitted upon a membership change.

- Memb_counter: a variable that counts the State messages during the information exchange upon a membership change.

Message Types

- Req: a new request to perform an update to the table. This request is sent by a client to one of the servers. The update operation is stored in the action field of this message.

- Upd: an update message multicast by SG's primary to SGT. This message carries a unique sequence number in the sn field in addition to the fields of a Req message.

- M: a membership change notification delivered by Transis. This message contains the same two fields as the Memb structure.

- State: a state message which carries the Vec and the identifier of the sender. This message is stamped with the membership identifier of the membership it was sent in.

- StateP: similar to the State message; it is used for garbage collection when the membership contains all the members of SG.

- Qry: a query message from a client.

In addition, a type field is included with each message. A sketch of one possible concrete layout of these structures and messages follows.
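The data structures and message types above can be summarized as the following C declarations. The concrete sizes and layout are illustrative assumptions; the prototype's actual definitions may differ.

    #define MAX_SERVERS 32

    typedef struct {            /* Memb: current membership of SGT             */
        long memb_id;           /* unique identifier of the membership         */
        int  set[MAX_SERVERS];  /* ids of the currently connected servers      */
        int  set_size;
    } membership;

    typedef struct {            /* per-server replication state                */
        int        my_id, p_id;         /* own id and primary's id             */
        long       Vec[MAX_SERVERS];    /* Vec[i]: highest update known at i   */
        long       ARU;                 /* min over Vec, "all-received-up-to"  */
        long       min_sn, max_sn;      /* retransmission range                */
        int        Memb_counter;        /* State messages still expected       */
        membership Memb;
        /* MQ, the logged list of updates, is kept on non-volatile storage.    */
    } server_state;

    /* M is named M_MSG here to avoid clashing with the membership set M above. */
    typedef enum { REQ, UPD, M_MSG, STATE, STATEP, QRY } msg_type;

    typedef struct {
        msg_type type;
        int      sender;
        long     sn;                    /* sequence number (Upd only)          */
        long     memb_id;               /* stamping membership (State/StateP)  */
        long     Vec[MAX_SERVERS];      /* sender's Vec (State/StateP)         */
        char     action[256];           /* update operation (Req/Upd)          */
    } message;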
The Pseudo Code

The following subsections present the pseudo-code of the table management protocol.
Request from a client

The server which receives an update request from a client forwards it to the primary server. The primary server creates an update message from this request, applies it to the SG's table and multicasts it to the group. Procedure handle-request details these steps.

    handle-request(m)
    {
        if (my_id == p_id) then
            Vec[my_id] = Vec[my_id] + 1;
            m.sn = Vec[my_id];
            m.type = Upd;
            append m to MQ;
            sync MQ and Vec to disk;
            apply m.action to SG's table;
            multicast(m, SGT);
        else if (p_id ∈ Memb) then
            send(m, p_id);
    }

Update from a server

A secondary server which receives an update message in the correct order applies the update to the table and changes its data structures accordingly. Procedure handle-update details these steps.

    handle-update(m)
    {
        if (my_id ≠ p_id and m.sn == Vec[my_id] + 1) then
            Vec[my_id] = m.sn;
            append m to MQ;
            sync MQ and Vec to disk;
            apply m.action to SG's table;
        else
            discard m;
    }

Membership change notification from Transis

Upon a membership change, the connected servers exchange information in order to converge to the most updated state. Procedure handle-membership prepares the data structures for this recovery process and multicasts a State message. Note that the State message contains Vec, representing the local knowledge regarding other servers' states.

    handle-membership(m)
    {
        Memb.set = m.set;
        Memb.memb_id = m.memb_id;
        min_sn = max_sn = Vec[my_id];
        Memb_counter = |Memb|;
        create a State message m′;
        multicast(m′, SGT);
    }

State message from a server

When a valid State message is received, the server updates its knowledge regarding other servers' knowledge. After all the State messages have been received, the needed update messages are retransmitted by the most updated server. If the primary server is a member of the current membership, it is selected as the most updated server; otherwise the most updated secondary server with the smallest identifier is selected using the procedure most-updated-server. Procedure handle-state details these steps.

    handle-state(m)
    {
        if (m.memb_id ≠ Memb.memb_id) then
            return;
        Vec = max(Vec, m.Vec);
        if (m.Vec[m.sender] < min_sn) then
            min_sn = m.Vec[m.sender];
        if (m.Vec[m.sender] > max_sn) then
            max_sn = m.Vec[m.sender];
        Memb_counter = Memb_counter - 1;
        if (Memb_counter == 0) then
            if (most-updated-server()) then
                for each m′ ∈ MQ s.t. m′.sn > min_sn do
                    multicast(m′, SGT);
    }

The most-updated-server procedure presented below returns true if the invoking server is the most updated server with the minimal identifier, and false otherwise.

    boolean most-updated-server()
    {
        for each i ∈ Memb.set s.t. i < my_id do
            if (Vec[i] == max_sn) then
                return false;
        if (Vec[my_id] == max_sn) then
            return true;
        return false;
    }

Garbage collection

In order to discard updates which are no longer needed, procedure collect-garbage is called upon the reception of either a State message or a StateP message. The StateP message is sent periodically if the membership contains all the members of the SG. The reason for having the StateP message is to avoid maintaining large amounts of updates that are no longer needed because each member of the SG has already applied them.

    collect-garbage(m)
    {
        Vec = max(Vec, m.Vec);
        new_ARU = min_{1≤i≤|Vec|}(Vec[i]);
        if (new_ARU > ARU) then
            for each m′ ∈ MQ s.t. m′.sn ≤ new_ARU do
                remove m′ from MQ;
            ARU = new_ARU;
            sync MQ and Vec to disk;
    }
Events handling

The following is the main loop of the table management part of the management server.

    Initially:
        connect to Transis;
        join group SGT;
        initialize all the Vec entries to 0;
        bring in MQ and Vec (if present) from disk;
        ARU = min_{1≤i≤|Vec|}(Vec[i]);

    while (true) {
        m = receive();
        switch (m.type)
            case Req:
                handle-request(m);
            case Upd:
                handle-update(m);
            case Qry:
                retrieve an answer from the local table;
                send the answer to the client;
            case M:
                handle-membership(m);
            case State:
                handle-state(m);
                collect-garbage(m);
            case StateP:
                collect-garbage(m);
    }

8 Conclusion

We have presented an architecture that utilizes group communication to provide efficient and reliable distributed system management. The common management tasks of simultaneous execution, software installation and table management were addressed. The resulting services are convenient to use, consistent in the presence of failures, and complementary to the existing standard mechanisms.

References

[1] Y. Amir. The Spread toolkit, 1995. Private communication.

[2] Y. Amir. Replication Using Group Communication Over a Partitioned Network. PhD thesis, Institute of Computer Science, The Hebrew University of Jerusalem, Israel, 1995.

[3] Y. Amir, D. Dolev, S. Kramer, and D. Malki. Transis: A communication sub-system for high availability. In Proceedings of the 22nd Annual International Symposium on Fault-Tolerant Computing, pages 76-84, July 1992. The full version of this paper is available as TR CS91-13, Dept. of Comp. Sci., The Hebrew University of Jerusalem.

[4] Y. Amir, L. E. Moser, P. M. Melliar-Smith, D. A. Agarwal, and P. Ciarfella. The Totem single-ring ordering and membership protocol. ACM Transactions on Computer Systems, 13(4), November 1995.

[5] N. Amit, D. Ginat, S. Kipnis, and J. Mihaeli. Distributed SMIT: System management tool for large Unix environments. Research report, IBM Israel Science and Technology, 1995. In preparation.

[6] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems, chapter 7. Addison Wesley, 1987.

[7] P. Ezhilchelvan, R. Macedo, and S. Shrivastava. Newtop: A fault-tolerant group communication protocol. In Proceedings of the 15th International Conference on Distributed Computing Systems, May 1995.

[8] N. Huleihel. Efficient ordering of messages in wide area networks. Master's thesis, Institute of Computer Science, The Hebrew University of Jerusalem, Israel, 1996.

[9] L. E. Moser, Y. Amir, P. M. Melliar-Smith, and D. A. Agarwal. Extended virtual synchrony. In Proceedings of the 14th International Conference on Distributed Computing Systems, pages 56-65, June 1994.

[10] H. Stern. Managing NFS and NIS, chapters 2, 3, 4. O'Reilly & Associates, Inc., first edition, June 1991.

[11] Tivoli Systems Inc. Multiplexed Distribution (MDist), November 1994. Available via anonymous ftp from ftp.tivoli.com /pub/info.

[12] Tivoli Systems Inc. TME 2.0: Technology Concepts and Facilities, 1994. Technology white paper discussing Tivoli 2.0 components and capabilities. Available via anonymous ftp from ftp.tivoli.com /pub/info.

[13] R. van Renesse, K. P. Birman, R. Friedman, M. Hayden, and D. Karr. A framework for protocol composition in Horus. In Proceedings of the ACM Symposium on Principles of Distributed Computing, August 1995.