Distributed System
(CACS 352)
BCA 6th semester
Tribhuvan University (TU)
Compiled By
Ujjwal Rijal
[email protected]
[email protected] 1
Unit-1
Introduction
What is a Distributed System?
A distributed system is a collection of autonomous computing
elements that appears to its users as a single coherent system.
This definition refers to two characteristic features of
distributed systems.
The first one is that a distributed system is a collection of
computing elements each being able to behave independently of
each other.
A computing element, which we will generally refer to as a node, can
be either a hardware device or a software process.
A second feature is that users (be they people or applications)
believe they are dealing with a single system. This means that one
way or another the autonomous nodes need to collaborate.
How to establish this collaboration lies at the heart of developing
distributed systems.
[email protected] 3
Characteristics of Distributed System
There are two primary characteristic features of
distributed systems, and they are discussed below:
I. Collection of autonomous computing elements
o This illustrates that a distributed system is a collection of computing
elements each being able to behave independently of each other.
o A computing element, which we will generally refer to as a node, can be
either a hardware device or a software process.
o Modern distributed systems can, and often will consist of all kinds of
nodes, ranging from very high performance computers to small plug
computers or even smaller devices.
o A fundamental principle is that nodes can act independently from
each other, although it should be obvious that if they ignore each
other, then there is no use in putting them into the same distributed
system.
[email protected] 4
Characteristics of
Distributed System
II. Single coherent system
o It illustrates that users (be they people or applications) believe they are dealing
with a single system. This means that one way or another the autonomous nodes
need to collaborate.
o How to establish this collaboration lies at the heart of developing distributed
systems.
o As mentioned, a distributed system should appear as a single coherent system.
In some cases, researchers have even gone so far as to say that there should be
a single-system view, meaning that end users should not even notice that they
are dealing with the fact that processes, data, and control are dispersed across
a computer network.
o Achieving a single-system view is often asking too much, for which reason, in our
definition of a distributed system, we have opted for something weaker, namely
that it appears to be coherent.
o Roughly speaking, a distributed system is coherent if it behaves according to the
expectations of its users.
o More specifically, in a single coherent system, the collection of nodes as a whole
operates the same, no matter where, when, and how the interaction between a
[email protected] 5
user and the system takes place.
Design Goals of Distributed System
In this section, we discuss four important goals
that should be met to build a distributed system
worth the effort.
A distributed system should make resources easily
accessible; it should hide the fact that resources
are distributed across a network; it should be
open; and it should be scalable.
Now, they are discussed in detail as:
o Supporting Resource Sharing
o Making Distribution Transparent
o Openness
o Scalability
o Pitfalls / Fallacies [email protected] 6
Supporting Resource Sharing
o An important goal of a distributed system is to make it easy for users
and applications to access and share remote resources.
o Resources can be virtually anything, but typical examples include
peripherals, storage facilities, data, files, services, and networks, to
name just a few.
o There are many reasons for wanting to share resources. One obvious
reason is that of economics. For example, it is cheaper to have a
single high-end reliable storage facility be shared than having to buy
and maintain storage for each user separately.
o However, resource sharing in distributed systems is perhaps best
illustrated by the success of file-sharing peer-to-peer networks like
BitTorrent. These distributed systems make it extremely simple for
users to share files across the internet.
[email protected] 7
Design Goals of
Distributed System
Making Distribution Transparent
o Another important goal of a distributed system is to hide
the fact that its processes and resources are physically
distributed across multiple computers, possibly separated
by large distances.
o In other words, it tries to make the distribution of
processes and resources transparent, i.e. invisible, to end
users and applications; this is known as distribution
transparency.
[email protected] 8
Design Goals of
Distributed System
Types of Distribution Transparency
o Access transparency, Location transparency,
Relocation transparency, Migration transparency,
Replication transparency, Concurrency transparency,
Mobility transparency, Performance transparency,
Scaling transparency, and Failure transparency.
o Now, they are discussed below:
[email protected] 9
Design Goals of
Distributed System
I. Access Transparency
o It deals with hiding differences in data representation and the
way that objects can be accessed.
o At a basic level, we want to hide differences in machine
architectures, but more important is that we reach agreement on
how data is to be represented by different machines and
operating systems.
o For example, a distributed system may have computer systems
that run different operating systems, each having their own file-
naming conventions.
o Differences in naming conventions, differences in file operations,
or differences in how low-level communication with other
processes is to take place, are the examples of access issues that
should preferably be hidden from the users and the applications.
o Examples: File system operations in NFS (Network File System),
Navigation in the Web, SQL Queries, etc.
II. Location Transparency
o An important group of transparency types concerns the
location of a process or a resource.
o Location transparency refers to the fact that users cannot
tell where an object is physically located in the system.
Naming plays an important role in achieving location
transparency.
o An example of such a name is Uniform Resource Locator
(URL). A typical URL could have the
form http://www.example.com/index.html, which
indicates a protocol (http), a hostname
(www.example.com), and a file name (index.html).
o Examples: File system operations in NFS, Pages in the
Web, Tables in distributed databases, etc.
[email protected] 11
Design Goals of
Distributed System
III. Relocation Transparency
o The entire site may have been moved from one data centre
to another, yet users should not notice. The URL gives no
clue as to whether the file has always been at its current
location or was recently moved there.
o This is relocation transparency, which is
becoming increasingly important in the context of cloud
computing.
[email protected] 12
Design Goals of
Distributed System
IV. Migration Transparency
o Where relocation transparency refers to being moved by
the distributed system, migration transparency is offered
by a distributed system when it supports the mobility of
processes and resources initiated by the users, without
affecting ongoing communication and operations.
o A typical example is communication between mobile
phones; regardless of whether two people are actually
moving, mobile phones will allow them to continue their
conversation.
o Other examples could be online tracking and tracing of
goods as they are being transported from one place to
another, and teleconferencing using devices that are
equipped with mobile internet.
[email protected] 13
Design Goals of
Distributed System
V. Replication Transparency
o Replication plays an important role in distributed systems.
For example, resources may be replicated to increase
availability or to improve performance by placing a copy close
to the place where it is accessed.
o Replication transparency deals with hiding the fact that
several copies of a resource exist, or that several processes
are operating in some form of lockstep mode so that one can
take over when another fails.
o To hide replication from users, it is necessary that all
replicas have the same name. Consequently, a system that
supports replication transparency should generally support
location transparency as well, because it would otherwise be
impossible to refer to replicas at different locations.
o Examples: Distributed DBMS, Mirroring Web Pages, etc.
[email protected] 14
Design Goals of
Distributed System
VI. Concurrency Transparency
o Concurrency transparency signifies that concurrent access
to a shared resource leaves that resource in a consistent
state.
o Consistency can be achieved through locking mechanisms,
by which users are, in turn, given exclusive access to the
desired resource.
o A more refined mechanism is to make use of transactions,
but this may be difficult to implement in a distributed
system, notably when scalability is an issue.
o Examples: NFS, Automatic teller machine network,
Database Management System (DBMS), etc.
[email protected] 15
Design Goals of
Distributed System
VII. Mobility Transparency
o Allows the movement of information objects within a
system without affecting the operations of users or
application programs
o Examples: NFS, Web Pages, etc.
[email protected] 16
Design Goals of
Distributed System
[email protected] 17
Design Goals of
Distributed System
X. Failure Transparency
o Failure transparency means that a user or an application does
not notice that some piece of the system fails to work
properly, and that the system subsequently and automatically
recovers from that failure.
o Masking failures is one of the hardest issues in distributed
systems, and is even impossible when certain apparently
realistic assumptions are made.
o The main difficulty in masking and transparently recovering
from failures lies in the inability to distinguish between a dead
process and a painfully slowly responding one.
o For example, when contacting a busy web server, a browser will
eventually time out and report that the web page is
unavailable. At that point, the user cannot tell whether the
server is actually down or the network is fully congested.
[email protected] 18
Design Goals of
Distributed System
Openness
o An open distributed system is a system that offers services according to
the standard rules that describe the syntax and semantics of those
services.
o For example, in computer networks, standard rules govern the format,
contents, and meaning of messages sent and received. Such rules are
formalized in the protocols.
o Open systems are characterized by the fact that their key interfaces are
published. They are based on a uniform communication mechanism and
published interface for access to shared resources.
o These types of systems can be constructed from heterogeneous hardware
and software.
o The openness of distributed systems is determined primarily by the
degree to which new resource-sharing services can be added and be made
available for use by a variety of client programs.
o For example, Facebook and Twitter have APIs that allow developers to
build their own applications on top of them.
Scalability
o The system should remain efficient even with a significant increase in
the number of resources and users connected.
o Moreover, a system is said to be scalable if it can handle the addition of users and
resources without suffering a noticeable loss of performance or increase in
administrative complexity.
o According to Neuman (1994), the scalability of a system can be measured along at least
three different dimensions, and they are:
With respect to its size
It means that we can easily add more users and resources to the system. The problem
associated with this dimension is the problem of overloading.
With respect to geography
A geographically scalable system is one in which the users and resources may lie far
apart. The problem associated with this dimension is the problem of communication
reliability.
With respect to administration
It means that an administratively scalable system can still be easier to manage even if
it spans many independent administrative organizations. The problem associated with
this dimension is the problem of administrative mess.
[email protected] 20
Design Goals of
Distributed System
Pitfalls / Fallacies
A pitfall is a hidden or unsuspected danger or difficulty.
Distributed systems differ from traditional software because components are
dispersed across a network. Not taking this dispersion into account during design time
is what makes so many systems needlessly complex and results in mistakes that need to
be patched later on.
Peter Deutsch, then at Sun Microsystems, formulated these mistakes as the
following false assumptions that everyone makes when developing a distributed
application for the first time:
o The network is reliable.
o The network is secure.
o The network is homogeneous.
o The topology does not change.
o Latency is zero.
o Bandwidth is infinite.
o Transport cost is zero.
o There is one administrator.
These kinds of pitfalls should be strictly taken into account during the design of a
distributed application or system.
[email protected] 21
Types of Distributed System
I. Client-Server Systems
o This type of system requires the client to request a resource,
after which the server gives the requested resource. When a
client connects to a server, the server may serve multiple
clients at the same time.
o Client-Server systems are also referred to as “Tightly Coupled
Operating Systems”. This system is primarily intended for
multiprocessors and homogeneous multicomputers.
o Client-Server systems function as a centralized server since
they approve all the requests issued by the client systems.
o It is a distributed application structure that partitions tasks
or workloads between the providers of a resource or service,
called servers, and service requesters, called clients.
[email protected] 23
Types of
Distributed System
[email protected] 24
Types of
Distributed System
[email protected] 25
Types of
Distributed System
II. Peer-to-Peer Systems
o In contrast to the tightly coupled systems, the computer
networks used in these applications consist of a collection of
processors that do not share memory or a clock. Instead, each
processor has its own local memory.
o The processors communicate with one another through various
communication lines, such as high-speed buses or
telephone lines.
o These systems are usually referred to as loosely coupled
systems (or distributed systems). The aim of the peer-to-peer
architecture is to exploit the resources in a large number of
participating computers for the fulfilment of a given task or
an activity.
[email protected] 26
Types of
Distributed System
[email protected] 27
Types of
Distributed System
III. Middleware
o Middleware enables the interoperability of all
applications running on different operating systems.
Those programs are capable of transferring all data to
one another by using these services.
o Middleware is an infrastructure that appropriately
supports the development and execution of distributed
applications. It provides a buffer between the
applications and the network.
[email protected] 28
Types of
Distributed System
[email protected] 29
Types of
Distributed System
[email protected] 30
Types of
Distributed System
IV. Three Tier
o The information about the client is saved in the intermediate
tier rather than in the client, which simplifies development.
This type of architecture is most commonly used in online
applications.
o In the three-tier architecture, an intermediate process
connects the clients and servers. The intermediary can
accumulate frequently used server data to guarantee enhanced
performance and scalability.
o In database-based 3-tier client-server architecture, normally
there are three essential components; a client computer, an
application server, and a database server.
[email protected] 31
Types of
Distributed System
o The application server is the middle tier server which runs the
business application. The data is retrieved from database
server and it is forwarded to the client by the middle tier
server.
o Database-oriented middleware offers an Application
Programming Interface (API) access to a database.
o In 3-tier architecture, it is easier to modify or even replace
any tier without affecting the other tiers. The separation of
application and database functionality achieves better load
balancing.
o Moreover, necessary security guidelines can be put into effect
within the server tiers without affecting the clients.
[email protected] 32
Types of
Distributed System
[email protected] 33
Types of
Distributed System
V. N-Tier
o When a server or application has to transmit requests to other
enterprise services on the network, then n-tier systems are used.
o Multi-tier architecture is a client-server architecture in which the
functions such as presentation, application processing, and data
management are physically separated.
o By separating an application into tiers, developers obtain the option
of changing or adding a specific layer, instead of reworking the
entire application.
o It provides a model by which developers can create flexible and
reusable applications.
o The most general use of multi-tier architecture is the three-tier
architecture. A three-tier architecture is typically composed of a
presentation tier, an application processing tier, and a data storage
tier, each of which may execute on a separate processor.
[email protected] 34
Types of
Distributed System
[email protected] 35
ASSIGNMENT - 1
Interceptors
• An interceptor is nothing but a software construct that will break the usual flow of control and
allow other code to be executed.
• Interceptors are a primary means for adapting middleware to the specific needs of an application,
and they also play a vital role in achieving openness.
• The working principle of an interceptor can best be demonstrated through remote-object
invocation, as follows:
The basic idea is that an object A can call a method that belongs to an object B, while the latter
one resides on a different machine than A.
Now, a remote-object invocation is carried out in the following three steps:
(Discuss Yourself)
• The following figure shows a simplified layout of a process inside main memory:
Fig: Process layout in main memory (stack, heap, data, and text segments).
• Process vs. Program:
i. A process is a sequence of instruction execution. A program consists of a set of
instructions in a programming language.
ii. A process is a dynamic object. A program is static in nature.
iii. A process is loaded into the main memory. A program is loaded into the secondary
storage device.
iv. The time span of a process is limited, comparatively. The time span of a program is
unlimited, comparatively.
v. The process is an active entity. The program is a passive entity.
vi. The resource requirement is much higher in case of a process. A program just requires
memory for storage, so its resource requirement is lower in comparison to a process.
• When a process executes, it passes through different states. These states may differ in
different OSs, and the names of these states are also not standardized.
• The OS maintains management information about a process in a Process Control Block (PCB). A modern OS allows a
process to be divided into multiple threads of execution, which share all the process
management information except for information directly related to its execution. This
information is held in a Thread Control Block (TCB).
• Threads in a process can execute different parts of the program code at the same time.
• Threads can also execute the same parts of the code at the same time, but with
different execution state:
They have independent current instructions i.e. they have independent program
counters.
They are working with different data i.e. they are working with independent registers.
• In general, a process can have one of the following five states at a time:
• The state diagram of a process illustrates the five stages of process execution, and they
are discussed below:
I. Start
• This is the initial state when a process is created or first started.
II. Ready
• The ready state signifies that the process is waiting to be assigned to a processor for
execution.
III. Running
• The running state signifies that the process is actually using the CPU resources and the
instructions are being executed as per the working mechanism of the OS scheduler.
IV. Waiting
• The process is waiting for some event to occur, such as the completion of an I/O
operation or the availability of a resource.
V. Terminated
• The process has finished execution (or has been aborted) and is removed from main
memory.
Suspended Process
The prime characteristics of a suspended process are as follows:
• Suspended process is not immediately available for execution.
• The process may or may not be waiting on an event.
• To prevent its execution, a process may be suspended by the OS, the parent process, the
process itself, or an agent.
• Process may not be removed from the suspended state until the agent orders the removal.
Prime Reasons for Process Suspension
• Swapping
• Timing
• Interactive user request
• Parent process request
Swapping
• Swapping is used to move all of a process from main memory to disk. OS needs to release
required main memory to bring in a process that is ready to execute.
Timing
• Process may be suspended while waiting for the next time interval.
Interactive user request
• Process may be suspended for debugging purpose by the user.
Parent process request
• Process may be suspended by the parent process to modify the suspended process or to
coordinate the activity of various descendants.
• The process of creating new processes from their parent process, and again creating new
processes from the previously created ones, in a tree-like structure, is called process
hierarchy.
• Windows has no concept of process hierarchy.
• Modern general purpose operating systems permit a user to create and destroy processes. In
UNIX, this is done by the fork system call, which creates a child process, and the exit
system call, which terminates the current process.
• After a fork system call, both the parent and the child keep running and each can fork off
other processes.
• The root of the tree is a special process created by the OS during start-up.
• Fig: The concept of process hierarchy (a tree of processes rooted at the special process
created by the OS during start-up).
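• A minimal sketch of process creation in Python (POSIX systems only, since fork is a UNIX
system call; the printed labels are illustrative):

import os

# fork() returns 0 in the child and the child's PID in the parent.
pid = os.fork()
if pid == 0:
    print(f"child:  pid={os.getpid()}, parent={os.getppid()}")
    os._exit(0)            # the child terminates via the exit system call
else:
    os.waitpid(pid, 0)     # the parent waits for its child to finish
    print(f"parent: pid={os.getpid()}, created child {pid}")

• Each process created this way could itself call fork again, yielding the tree-like hierarchy
described above.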
• A process is divided into small tasks and each task is called a thread. A thread is a single
sequence stream within a process.
• Moreover, a thread is a flow of control within a process. Processes are used to group the
resources together, and threads are the entities scheduled for execution on the CPU.
• A thread uses a flow of execution through the process code with its own program counter,
register set, stack space, and life cycle.
• A thread is similar to a sequential program i.e. like a process, a thread also has its
beginning and end. But a thread itself is not a running program because it cannot run on its
own. So, it runs within a running program.
• A thread is a light-weight process because it runs within a process and it makes the use of
the resources that were actually allocated to the process.
• In a process, threads allow multiple streams of execution.
• Threads are not independent of one another like processes, as a result, threads share with
other threads their code section, data section, and OS resources.
• Threads are very useful in modern programming because one of the tasks may block, and it
is desired to allow the other tasks to proceed without blocking.
• For example: In a word processor, a background thread may check spelling and grammar
while a foreground thread processes user inputs (keystrokes), while yet a third thread loads
images from the hard drive, and fourth does periodic automatic backups of the file being
edited during manipulation, and so on.
• In short, we regard the best outcome of the threads as the concept of “Multithreading”.
• A single system call can create more than one thread (lightweight process).
• Threads share data and information.
• Threads have their own Program counter (PC), register set and stack space.
• Thread management consumes few or no system calls, as the communication between
threads can be achieved using shared memory.
• The isolation property of processes increases their overhead in terms of resource
consumption; threads avoid much of that overhead.
• Like processes, threads share CPU and only one thread is active (running) at a time.
• Like processes, threads within a process execute sequentially.
• Like processes, threads can create children.
• And, Like processes, if one thread is blocked, then another thread can run.
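• A small Python sketch of the word-processor example above; the task names and timings
are illustrative only:

import threading
import time

# Hypothetical background tasks of a word processor, each running as a thread.
def autosave():
    for _ in range(2):
        time.sleep(0.1)                 # periodic automatic backup
        print("autosave: document backed up")

def spellcheck(text):
    print(f"spellcheck: scanned {len(text.split())} words")

doc = "threads share the code, data, and resources of their process"
t1 = threading.Thread(target=autosave)
t2 = threading.Thread(target=spellcheck, args=(doc,))
t1.start(); t2.start()                  # both threads run within one process
print("main thread: handling user keystrokes")
t1.join(); t2.join()                    # wait for the background work to finish

• All three flows of control share the same data (the doc string) without any copying, which
is exactly what makes threads lighter-weight than processes.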
User-Level Thread
• In a user-level thread, all of the work of thread management is done by the application, i.e. in
the user space, and the kernel is not aware of the existence of threads.
• The thread library contains code for creating and destroying thread, for message passing
and sharing the data between the threads, for scheduling thread execution, and for saving
and restoring thread context-switching activities.
• The application (i.e. the work in the user space) begins with a single thread and starts
running that thread.
• Creating a thread, switching between threads, and synchronizing threads are done via
procedure calls i.e. no kernel involvement.
• User-level threads can be up to a hundred times faster than kernel-level threads.
• The following figure shows the structure of a user-level thread:
Fig: User-level threads (thread management is done entirely in user space).
Many-To-Many Model
• In this model, many user level threads multiplex to the kernel level threads of smaller or
equal numbers.
• The number of kernel threads may be specific to either a particular application or a
particular machine.
• Users have no restrictions on the number of threads created. Also, blocking kernel system
calls do not block the entire process.
• In this model, users can create as many user threads as necessary and the corresponding
kernel threads can run in parallel on a multi-processor platform.
• The figure below shows the many-to-many model of multithreading:
Many-To-One Model
• Many-to-one model maps many user level threads to one kernel level thread.
• Thread management is done in user space. When the thread makes a blocking system call,
then the entire process will be blocked.
• Only one thread can access the kernel at a time, so multiple threads are unable to run in
parallel on multiple processors i.e. many-to-one model does not allow individual process to
be split across multiple CPUs.
• The below figure shows the many-to-one model of multithreading:
One-To-One Model
• In this thread model, there is one to one relationship of user level threads to the kernel level
threads.
• This model provides more concurrency than the many-to-one model by allowing another
thread to run during blocking.
• This model also allows parallelism. The only overhead is that for each thread, a
corresponding kernel thread should be created.
• The below figure shows the one-to-one model of multithreading:
Types of Hypervisor
• There are two different types of hypervisor and they are as follows:
Type 1 hypervisor
• It directly runs on the physical hardware (usually a server), taking the place of the
operating system.
• Here, we typically use a separate software product to create and manipulate VMs on the
hypervisor.
Type 2 hypervisor
• It runs as an application within a host OS and usually targets single-user desktop or
notebook platforms.
• With a Type 2 hypervisor, we manually create a VM and then install a guest OS in that
VM.
Types of Virtualization
Communication
Introduction to Communication
Inter-process communication is at the heart of all distributed systems. It makes no sense
to study distributed systems without carefully examining the ways that processes on
different machines can exchange information.
Communication in distributed systems is always based on low-level message passing as
offered by the underlying network.
Regarding communication in distributed systems, we first discuss the
rules that communicating processes must adhere to, known as protocols.
Then, we discuss three widely used models for communication:
Remote Procedure Call (RPC), Message-Oriented Middleware (MOM) offered on top of the
transport layer as part of a middleware solution, and data streaming.
And finally, we discuss the general problem of sending data to multiple
receivers, called multicast communication.
Types of Communication
• Persistent communication: Message submitted for transmission stored
by communication middleware as long as it takes to deliver it to the receiver.
Neither sending application nor receiving application need to be executing.
• Transient communication: Message stored by communication system
only as long as sending and receiving application are executing.
• Asynchronous communication: Sender continues immediately after
submitting message for transmission. Message temporarily stored by
middleware on submission.
• Synchronous communication: Sender blocks until message received and
processed, and receiver returns acknowledgement.
Parameter Passing
Passing Value Parameters
• Client stub takes parameters and puts them in the message. It also puts
the name or number of the procedure to be called in the message
• When message arrives at server, server stub examines the message to
see which procedure is needed and then makes appropriate call. Stub
takes the result and packs it into a message. Message is sent back to
client stub.
• Client stub unpacks the message to extract the result and returns it to
waiting client procedure
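• The following Python sketch illustrates the stub idea; the network transfer is simulated by a
direct function call, and the procedure table is hypothetical:

import json

def client_stub(proc, *args):
    request = json.dumps({"proc": proc, "args": args})   # pack name + parameters
    reply = server_stub(request)                         # stands in for the network send
    return json.loads(reply)["result"]                   # unpack result for the caller

def server_stub(request):
    msg = json.loads(request)                            # examine the message
    procedures = {"add": lambda a, b: a + b}             # dispatch table (hypothetical)
    result = procedures[msg["proc"]](*msg["args"])       # make the appropriate call
    return json.dumps({"result": result})                # pack the result into a message

print(client_stub("add", 2, 3))   # prints 5, as if 'add' had run remotely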
Asynchronous RPC
Client continues immediately after issuing the RPC request and receiving an
acknowledgement from the server; it is not blocked.
The server immediately sends a reply or acknowledgement to the client the
moment the RPC request is received.
The server then calls the requested procedure.
One-way RPC: Client does not wait for even an acknowledgement from
server. Reliability not guaranteed as client has no acknowledgement from
server.
Deferred synchronous RPC: It is a combination of two asynchronous
RPCs, where the client polls the server periodically to see whether the results are
available yet, rather than the server calling back the client, as sketched below.
Message-Oriented Communication
Berkeley Sockets
Sockets interface introduced in 1970s in Berkeley Unix
Standardizes the interface of the transport layer to allow programmers the use of messaging
protocols through a simple set of primitives.
Another interface XTI stands for X/Open Transport Interface, also formerly called Transport
Layer Interface (TLI) developed by AT&T
Sockets and XTI similar in their model of network programming but differ in their set of
primitives
Socket forms an abstraction over the communication end point to which an application can write
data that are sent over the underlying network and from which incoming data can be read.
Servers execute the first four primitives (socket, bind, listen, accept); clients typically use
socket, connect, send, receive, and close.
When calling the socket primitive, the caller creates a new communication end point for a specific
transport protocol. Internally, the OS reserves resources to accommodate sending and receiving
messages for the specific protocol.
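• A minimal Python sketch of these primitives; the port number and the echo behaviour are
arbitrary choices for illustration:

import socket
import threading
import time

# The server executes the first four primitives: socket, bind, listen, accept.
def server():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # socket
    s.bind(("127.0.0.1", 9090))                              # bind
    s.listen(1)                                              # listen
    conn, _ = s.accept()                                     # accept (blocks)
    data = conn.recv(1024)                                   # receive
    conn.sendall(b"echo: " + data)                           # send
    conn.close()                                             # close
    s.close()

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                  # crude wait until the server reaches accept()

# The client executes: socket, connect, send, receive, close.
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
c.connect(("127.0.0.1", 9090))
c.sendall(b"hello")
print(c.recv(1024).decode())     # prints "echo: hello"
c.close()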
Message-Queuing Model
Each application has its own private queue to which other applications
can send messages
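• A minimal sketch of the model using Python's standard queue module; the queue and
message names are hypothetical:

import queue
import threading

inbox_b = queue.Queue()        # application B's private message queue

def app_b():
    msg = inbox_b.get()        # B reads from its own queue whenever it likes
    print("B received:", msg)

threading.Thread(target=app_b).start()
inbox_b.put("report ready")    # application A sends by appending to B's queue

• Note that sender and receiver are decoupled: A does not wait for B to read the message.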
Multicast Session
Node generates a multicast identifier mid (a randomly chosen 160-bit key). It then
looks up succ(mid) that is the node responsible for this key and promotes it to
become the root of the multicast tree that is used to send data to interested
nodes
To join the tree, a node P executes operation LOOKUP(mid) that allows a
lookup message with request to join the multicast group mid to be routed from P
to succ(mid).
On the way up to the root, the join request will add forwarder nodes, or helpers, for
the group.
Multicasting is implemented by a node sending a multicast message towards the
root by executing LOOKUP(mid), after which the message can be sent along the tree.
Naming
Name, Identifiers and Addresses
Names are used to share resources, to uniquely identify entities, to refer to locations,
and more.
A name in a distributed system is a string of bits or characters that is used to refer to an
entity. An entity in a distributed system can be practically any resource such as hosts,
printers, disks, files, processes, users, mailboxes, newsgroups, web pages, graphical
windows, messages, network connections, and so on.
To operate on an entity, it needs to be accessed at an access point, which is also called
an address. For example, to print a file, we need to access it at an access point.
An address is thus just a special kind of name: it refers to an access point of an entity. A
name for an entity that is independent of an address is referred to as location
independent.
Identifier is a reference to an entity that is often unique and never reused.
Fig: A general naming graph with a single root node (directed acyclic graph)
Name Resolution
Name spaces offer a convenient mechanism for storing and retrieving
information about entities by means of names.
Given a path name, it should be possible to look up any information stored in
the node referred to by that name.
The process of looking up a name is thus called name
resolution.
Fig: Example of Partitioning of the DNS name space into three logical layers
Implementation of Name resolution
The distribution of a name space across multiple name servers affects the
implementation of name resolution.
Let us assume that name servers are not replicated and that no client-side
caches are used. Each client has access to a local name resolver, which is
responsible for ensuring that the name resolution process is carried out.
There are two ways to implement name resolution, both sketched below, and they are:
Iterative name resolution
Recursive name resolution
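• The Python sketch below contrasts the two approaches on a toy name space; the server
and label names are hypothetical:

# Each "name server" resolves one label and either returns a value or a
# referral to the next server.
servers = {
    "root":      {"nl":  ("ref", "nl-server")},
    "nl-server": {"vu":  ("ref", "vu-server")},
    "vu-server": {"ftp": ("val", "192.31.231.66")},
}

def resolve_iterative(path):
    server = "root"
    for label in path:                   # the client contacts each server in turn
        kind, target = servers[server][label]
        if kind == "val":
            return target
        server = target                  # referral: the client asks the next server
    raise KeyError(path)

def resolve_recursive(server, path):
    kind, target = servers[server][path[0]]
    if kind == "val":
        return target
    return resolve_recursive(target, path[1:])   # the server forwards the request itself

print(resolve_iterative(["nl", "vu", "ftp"]))          # 192.31.231.66
print(resolve_recursive("root", ["nl", "vu", "ftp"]))  # same result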
Fig: GNS directory tree and value tree for user Peter.Smith
Coordination
Introduction
Coordination and synchronization are two closely related phenomena. In
process synchronization, we make sure that one process waits for
another to complete its operation.
While dealing with data synchronization, the problem is to ensure that
two sets of data are the same.
When it comes to coordination, the goal is to manage the interactions
and dependencies between activities in a distributed system.
From this perspective, one could state that coordination encapsulates
synchronization.
Fig: Three processes, each with its own clock, running at different rates
Fig: Relation between clock time and UTC when clocks tick at different rates
Network Time Protocol (NTP) → (Cristian’s Algorithm)
A common approach in many protocols and originally proposed by Cristian
is to let clients contact a time server.
The latter can accurately provide the current time, for example, because it is
equipped with a UTC receiver or an accurate clock.
The problem, of course, is that when contacting the server, message delays
will have outdated the reported time.
The trick is to find a good estimation for these delays.
Cristian’s algorithm can also give external synchronization if the time server
is synchronized with an external clock reference.
NTP requires a special node with a time source
NTP is prone to failure of the central server.
In this method, each node periodically sends a message to the server. When
the time server receives the message, it responds with a message T, where T is
the current time of the server node. The client then compensates for the
message delay, as sketched below.
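• A sketch of this estimation in Python; the time server is simulated by a local function with
an artificial 3-second clock offset:

import time

def cristian_sync(request_time_server):
    t0 = time.time()                  # client records the send time
    server_time = request_time_server()
    t1 = time.time()                  # client records the receive time
    rtt = t1 - t0
    # The server's reply is assumed to be about rtt/2 old when it arrives.
    return server_time + rtt / 2

def fake_time_server():               # stand-in for the network round trip
    time.sleep(0.05)                  # simulated propagation delay
    return time.time() + 3.0          # this server's clock runs 3 s ahead

print("estimated server time:", cristian_sync(fake_time_server))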
Fig: Communication between the active and passive thread in a peer-sampling service.
The different selection operations are specified as follows:
selectPeer: Randomly select a neighbour from the local partial view.
selectToSend: Select some other entries from the partial view, and
add to the list intended for the selected neighbour.
selectToKeep: Add the received entries to partial view, remove
repeated items, and shrink view to the different items.
Consistency and Replication
Introduction to Replication
Replication is the process of storing copies of data at more than one
site or node.
It is useful in improving the availability of data.
It is simply copying data from a database on one server to
another server so that all the users can share the same data without
any inconsistency.
The result is a distributed database in which users can access data
relevant to their tasks without interfering with the work of others.
Thus, in conclusion, replication is the mechanism of maintaining
multiple copies of data at multiple nodes or sites.
Note (causal consistency example): W1(x)a → W2(x)b are causally related,
while W2(x)b || W1(x)c are concurrent.
Sequential consistency: all processes see all shared accesses in the same order;
accesses are not ordered in time.
Push-based versus pull-based protocols differ in the messages sent: update (and
possibly fetch update later) versus poll and update.
Fault Tolerance
Introduction to Fault Tolerance
A fault is a defect within hardware or software. Fault
tolerance is defined as the characteristic by which a system can
mask the occurrence of failures and recover from them.
In other words, a system is said to be fault tolerant if it can
continue to operate in the presence of failures.
A system is said to be k-fault tolerant if it is able to function
properly even if k nodes of the system suffer from concurrent
failures.
Security
Overview of the unit:
• Introduction: threats, policies and mechanisms; cryptography
• Secure Channels: authentication; message integrity and confidentiality
• Access Control: access control matrix; protection domains
• Security Management: key management; authorization management
Fig: An intruder (Eve/Chuck) intercepting the message "Let us meet at 12 PM" sent from
Alice to Bob.
Security mechanisms:
1. Encryption
• Transforms data into something an attacker cannot understand
2. Authentication
• Verifies the claimed identity of the user, host or other entity
3. Authorization
• Verifies if the entity is authorized to perform an operation
4. Auditing
• To trace which clients accessed what, and in which way
Cryptography
Fig: The encryption model. The sender encrypts the plain-text message P with key K to
produce the cipher text C = E_K(P), which the receiver decrypts as P = D_K(C). A passive
intruder can only listen to C, while an active intruder can also modify C or insert
messages of his own.
• Confidentiality in public-key cryptosystems:
• Approach:
• At A: Encrypt using B's public key: m' = E_K+B(m)
• At B: Decrypt using B's private key: m = D_K-B(m')
• Drawback: How does 'B' know that 'A' sent the message?
• Problem: Bob wants to make sure that the message came from Alice (and
not from some intruder)
• Approach:
• At A: Encrypt using A's private key: m' = E_K-A(m)
• At B: Decrypt using A's public key: m = D_K+A(m')
Properties of a cryptographic hash function H:
1. One-way function
• Given a hash value, it is hard to find the message it came from
• Given h = H(m), it is computationally infeasible to find m
2. Weak-collision resistance
• Given a message, it is hard to find another message that has the same
hash value
• Given m and h = H(m), it is computationally infeasible to find m' ≠ m
such that H(m') = H(m)
3. Strong-collision resistance
• Given a hash function, it is hard to find two messages with the same
hash value
• Given H(.), it is computationally infeasible to find two messages m and m'
such that H(m) = H(m')
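• A short illustration of these properties using SHA-256 from Python's hashlib:

import hashlib

m1 = b"pay Bob $10"
m2 = b"pay Bob $1000"

# A tiny change in the message yields a completely different digest, and
# finding two messages with the same digest is computationally infeasible.
print(hashlib.sha256(m1).hexdigest())
print(hashlib.sha256(m2).hexdigest())
print(hashlib.sha256(m1).digest() == hashlib.sha256(m2).digest())  # False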
Symmetric cryptosystems use a single key K for both directions: m_c = E_K(m_p)
and m_p = D_K(m_c).
Types of cryptographic systems: symmetric systems (e.g., DES), public-key systems
(e.g., RSA), and hybrid systems. Below we look at DES, then the RSA protocol and
encryption/decryption in public-key cryptosystems.
DES (Data Encryption Standard)
• DES is a symmetric cryptosystem: the encryption algorithm takes a 64-bit plain-text
block and a 56-bit master key KM and produces a 64-bit cipher text; decryption
reverses the process with the same 56-bit key.
• Outline of DES encryption:
1. Apply an initial permutation to the 64-bit input block.
2. Split the result into a left half L1 and a right half R1.
3. Perform 16 rounds; in round i:
i. Take the block halves Li and Ri produced by the previous round.
ii. Derive the 48-bit round key Ki from the master key KM.
iii. Mangle the bits in Li and Ri using Ki, computing f(Li, Ri, Ki), to produce Ri+1.
iv. Extract Ri as Li+1.
4. Perform an inverse permutation on the block L16-R16 to produce
the encrypted output block.
• History of DES:
• DES was invented in 1974
• In 1997, it was shown that DES can be easily cracked using brute-force attacks
• RSA key generation:
1. Choose two very large prime numbers, p and q
2. Compute n = pq and z = (p - 1)(q - 1)
3. Choose a number d that is relatively prime to z
4. Compute the number e such that e × d = 1 (mod z)
• d and e are the two keys computed by RSA: (n, e) is published as the public key, while
d is kept private; swapping their roles without this relation would be an incorrect
choice of keys.
• At the sender: split the message into blocks mi and compute ci = mi^e (mod n)
for each block.
• At the receiver:
• Receive ci from sender
• For each block ci, compute the actual message mi = ci^d (mod n)
• Merge all mi's to obtain the complete message
• Example (toy numbers): p = 7 and q = 19 give n = 133 and z = 108, with d = 65 and
e = 5. Sending m = 6 yields c = 6^5 (mod 133) = 62, and the receiver computes
m = 62^65 (mod 133) = 6.
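• The same toy numbers in Python (a sketch only; real RSA keys are hundreds of digits long):

# Toy RSA with the textbook numbers above: p = 7, q = 19.
p, q = 7, 19
n, z = p * q, (p - 1) * (q - 1)       # n = 133, z = 108
d, e = 65, 5                          # (e * d) mod z == 1

def encrypt(m):
    return pow(m, e, n)               # c = m^e mod n, using the public key

def decrypt(c):
    return pow(c, d, n)               # m = c^d mod n, using the private key

c = encrypt(6)
print(c, decrypt(c))                  # prints: 62 6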
Secure Channels
• We will study:
• Authentication: shared secret key based authentication, authentication using a
key distribution center (KDC), and authentication using public-key cryptography
• Confidentiality and message integrity
Authentication based on a shared secret key K_A,B (challenge-response):
1. 'A' sends her identity to 'B'
2. 'B' challenges 'A' by sending a nonce RB
3. 'A' responds by encrypting RB with K_A,B (denoted by K_A,B(RB)), and
sending it back to 'B'
4. 'A' challenges 'B' by sending RA
5. 'B' responds to the challenge by sending the encrypted message K_A,B(RA)
After step 5, A and B are mutually authenticated.
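• A sketch of the challenge-response idea in Python, using an HMAC with the shared key in
place of encryption (the key and challenge here are hypothetical):

import hashlib
import hmac
import os

K_AB = b"shared-secret-key"           # the secret shared between A and B

def respond(challenge):               # proves knowledge of K_AB
    return hmac.new(K_AB, challenge, hashlib.sha256).digest()

RB = os.urandom(16)                   # Bob's fresh random nonce (the challenge)
alice_reply = respond(RB)             # Alice answers using the shared key
# Bob verifies by computing the same value himself:
print(hmac.compare_digest(alice_reply, respond(RB)))   # True: Alice knows K_AB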
Authentication using a Key Distribution Center (KDC):
• Alice sends the pair (A, B) to the KDC. The KDC generates a session key K_A,B and
returns it as K_A,KDC(K_A,B) to Alice and K_B,KDC(K_A,B) to Bob.
• In the ticket variant, the KDC hands both parts to Alice, who forwards
A, K_B,KDC(K_A,B) to Bob herself.
The Needham-Schroeder authentication protocol:
M1: A → KDC: RA1, A, B
M2: KDC → A: K_A,KDC(RA1, B, K_A,B, K_B,KDC(A, K_A,B))
M3: A → B: K_A,B(RA2), K_B,KDC(A, K_A,B)
M4: B → A: K_A,B(RA2 - 1, RB)
M5: A → B: K_A,B(RB - 1)
Questions:
3. What if RA2 is returned in M4, instead of (RA2 - 1)?
4. What if Chuck has an old key K_A,B, intercepts M3, and replays M3 at a later point
of time to Alice?
Authentication using public-key cryptography
• Recall:
• K+N: Public key of user N
• K-N: Private key of user N
1. A → B: K+B(A, RA)
2. B → A: K+A(RA, RB, K_A,B)
3. A → B: K_A,B(RB)
Digital signatures for message integrity
• Alice can sign a message m with her private key, sending K-A(m); for confidentiality,
the signed message can additionally be encrypted with Bob's public key K+B and
decrypted by Bob with K-B.
• Since signing an entire message is expensive, Alice can instead compute the hash H(m)
and send m together with the signed digest K-A(H(m)).
• Bob can verify the signature by comparing the hash of the received message with the
hash attached in the digitally signed message (recovered with Alice's public key K+A).
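• A toy sketch of signing the digest, reusing the small RSA numbers from the section above
(far too weak for real use):

import hashlib

n, e, d = 133, 5, 65                  # toy RSA public (n, e) and private (d) keys

def sign(message):
    h = int(hashlib.sha256(message).hexdigest(), 16) % n   # digest, reduced mod n
    return pow(h, d, n)               # "encrypt" the digest with the private key

def verify(message, signature):
    h = int(hashlib.sha256(message).hexdigest(), 16) % n
    return pow(signature, e, n) == h  # recover the digest with the public key

m = b"transfer approved"
sig = sign(m)
print(verify(m, sig))                 # True
print(verify(b"tampered", sig))       # almost surely False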
Secure Group Communication
• Naïve solution (a client querying a group of replicated servers):
• Client collects the response from each server
• Client authenticates every response
• Client takes a majority vote
• Main idea (secret sharing):
• Multiple servers share a secret
• None of them knows the entire secret
• The secret can be revealed only if all/most of them cooperate
Access Control
• Topics: general access control issues (access control matrix, access control lists and
capabilities, protection domains) and controlling outsider attacks (firewalls,
controlling DoS attacks).
Access Control Matrix
• Rows of the matrix are subjects (e.g., S1, S2), columns are objects; the entry for a
(subject, object) pair records the permitted operations, such as R, W, X.
• The reference monitor will look up the AC matrix entry on each access request, as
sketched below.
Protection Domains
• Object permission is given to a protection domain (PD), e.g., PD: Guests, instead of
to each subject individually; every subject operates within some protection domain.
• A matrix is shown only for representation. Similar to the AC matrix, PDs can be
implemented using techniques similar to ACLs.
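• A minimal sketch of a reference monitor consulting an access control matrix; the subjects,
objects, and rights are hypothetical:

# The matrix is stored sparsely as a dict keyed by (subject, object);
# each entry holds the set of permitted operations.
ACM = {
    ("s1", "file_a"): {"R"},            # subject s1 may only read file_a
    ("s2", "file_a"): {"R", "W", "X"},
}

def reference_monitor(subject, obj, operation):
    allowed = ACM.get((subject, obj), set())   # an absent entry means no rights
    return operation in allowed

print(reference_monitor("s1", "file_a", "R"))  # True
print(reference_monitor("s1", "file_a", "W"))  # False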
Hierarchical Protection Domains
• Domains can be organized hierarchically as groups (e.g., with 'World' at the top).
• Advantage: managing group membership is easy.
• An optimization: let each user carry a certificate of all the groups they belong to;
the reference monitor validates the certificate (similar to the capabilities approach).
Controlling Denial-of-Service (DoS) Attacks
• Firewalls help against outsiders, but what if any user on the Internet must be able
to access the resources of the distributed system (search engines, sending mails, …)?
• Two forms of attack:
1. Bandwidth depletion
• Flood the target with so much traffic that legitimate messages can no longer
reach it
2. Resource depletion
• Trick the server into allocating large amounts of resources
• Example: TCP SYN flooding
• Initiate a lot of TCP connections on the server (by sending TCP SYN packets)
• Do not close the connections
• Each connection request blocks some memory for some time
Three types of defense exist; the first is to monitor at ingress routers:
• Monitor incoming traffic for large flows towards a destination
• Drawback: too late to detect, since regular traffic is already blocked
Key Management: Diffie-Hellman key exchange
• Alice and Bob agree on two public numbers, n and g.
• Alice picks a large secret x and transmits g^x mod n; Bob picks a large secret y
and transmits g^y mod n.
• Both sides can now compute the shared secret key g^(xy) mod n, while an
eavesdropper only ever sees n, g, g^x mod n, and g^y mod n.
• Diffie-Hellman can also be viewed as a public-key cryptosystem:
• A's private key = x
• A's public key = g^x mod n
Summary of the Security unit:
• Introduction: security policy and threats, mechanisms; cryptography (symmetric
systems such as DES, public-key systems such as RSA, hybrid systems; hash
functions; protocols)
• Secure Channels: authentication (shared secret key based, using a key distribution
center, using public-key cryptography); message integrity and confidentiality;
secure group communication
• Access Control: general access control issues (access control matrix, access control
lists and capabilities, protection domains); controlling outsider attacks (firewalls,
controlling DoS attacks)
• Security Management: key management (key generation, key distribution);
authorization management