3 PEER TO PEER SERVICES AND FILE SYSTEM
Unlike the client/server model, in which the client makes a service request and the
server fulfills the request, the P2P network model allows each node to function as both
client and server.
P2P systems can be used to provide anonymized routing of network traffic, massive
parallel computing environments, distributed storage and other functions. Most P2P programs
are focused on media sharing and P2P is therefore often associated with software
piracy and copyright violation.
5) The overall cost of building and maintaining this type of network is comparatively low.
4) Lots of movies, music and other copyrighted files are transferred using this type of file transfer. P2P is the technology used in torrents.
P2P Middleware
Middleware is the software that manages and supports the different components of
a distributed system.
Peer clients need to locate and communicate with any available resource, even
though resources may be widely distributed and configuration may be dynamic,
constantly adding and removing resources and connections.
The P2P middleware must possess the following characteristics:
Global Scalability
Load Balancing
Local Optimization
Security of data
Any node can access any object by routing each request through a sequence of
nodes, exploiting knowledge at each of them to locate the destination object.
Globally Unique Identifiers (GUIDs), also known as opaque identifiers, are used as
names, but they do not contain location information.
IP: The scalability of IPv4 is limited to 2^32 addressable nodes; IPv6 extends this to 2^128.
Overlay network: Peer-to-peer systems can address more objects using GUIDs.
Distributed Computation
Here a single problem is divided into many parts, and each part is solved by
different computers.
As long as the computers are networked, they can communicate with each other to
solve the problem.
The computers perform like a single entity.
It also ensures fault tolerance and enables resource accessibility in the event that
one of the components fails.
Later, the transfer of resources between systems took place by means of client-server
communication.
Later, Napster was developed for peer-to-peer file sharing, especially of MP3 files.
It was not fully peer-to-peer since it used central servers to maintain lists of
connected systems and the files they provided, while actual transactions were
conducted directly between machines.
The user runs the Napster program. Once executed, this program checks for an
Internet connection.
If an Internet connection is detected, another connection between the user's
computer and one of Napster's Central Servers will be established. This connection
is made possible by the Napster file-sharing software.
The Napster Central Server keeps a directory of all client computers connected to
it and stores information on them as described above.
3.6 Peer to Peer Services and File System
If a user wants a certain file, they place a request to the Napster Centralised Server
that it's connected to.
The Napster Server looks up its directory to see if it has any matches for the user's
request.
The server then sends the user a list of all the matches (if any) it has found,
including the corresponding IP address, user name, file size, ping number, bit rate,
etc.
The user chooses the file it wishes to download from the list of matches and tries
to establish a direct connection with the computer upon which the desired file
resides.
It tries to make this connection by sending a message to the client computer
indicating their own IP address and the file name they want to download from the
client.
If a connection is made, the client computer where the desired file resides is now
considered the host.
The host now transfers the file to the user.
The host computer breaks the connection with the user computer when
downloading is complete.
Figure: Napster peer-to-peer file sharing with a centralized, replicated index. The numbered steps between the requesting peer, the index server and the other peers include: 2. list of peers offering the file, 4. file delivered, 5. index update.
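The centralised-index lookup described above can be pictured with a small sketch; the class and method names below are illustrative and are not part of Napster's actual protocol.

```python
# Minimal sketch of a Napster-style centralised index (illustrative names only).

class CentralIndex:
    """Directory kept by the central server: file name -> peers offering it."""

    def __init__(self):
        self.index = {}          # file name -> list of (peer_ip, metadata)

    def register(self, peer_ip, shared_files):
        """When a peer connects, it advertises the files it shares."""
        for name, meta in shared_files.items():
            self.index.setdefault(name, []).append((peer_ip, meta))

    def search(self, file_name):
        """Return the list of peers offering the requested file (may be empty)."""
        return self.index.get(file_name, [])

    def unregister(self, peer_ip):
        """Index update when a peer disconnects or stops sharing."""
        for peers in self.index.values():
            peers[:] = [p for p in peers if p[0] != peer_ip]


# Usage: the requesting peer asks the index, then downloads directly from a host.
index = CentralIndex()
index.register("10.0.0.5", {"song.mp3": {"size": 4_200_000, "bitrate": 192}})
matches = index.search("song.mp3")   # -> [("10.0.0.5", {...})]; the transfer itself is peer-to-peer
```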
Peer clients need to locate and communicate with any available resource, even
though resources may be widely distributed and configuration may be dynamic,
constantly adding and removing resources and connections.
The following are the functional requirements of the middleware:
Any node can access any object by routing each request through a sequence of
nodes, exploiting knowledge at each of them to locate the destination object.
Globally Unique Identifiers (GUIDs), also known as opaque identifiers, are used as
names, but they do not contain location information.
A client wishing to invoke an operation on an object submits a request including
the object's GUID to the routing overlay, which routes the request to a node at
which a replica of the object resides.
Routing overlays are sub-systems within the peer-to-peer middleware that are meant
for locating nodes and objects.
They implement a routing mechanism in the application layer.
They ensure that any node can access any object by routing each request through a
sequence of nodes.
Features of GUID:
They are pure names or opaque identifiers that do not reveal anything about the
locations of the objects.
They are the building blocks for routing overlays.
They are computed from all or part of the state of the object using a function that
delivers a value that is very likely to be unique. Uniqueness is then checked against
all other GUIDs.
They are not human understandable.
Client submits a request including the object GUID, routing overlay routes the
request to a node at which a replica of the object resides.
A node introduces a new object by computing its GUID and announces it to the
routing overlay.
Note that clients can remove an object, and nodes may also join and leave the
service.
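As a rough illustration of these operations, the following sketch hashes an object's state into a GUID and routes put/get requests to the node whose identifier is numerically closest. It is a simplified stand-in for a real routing overlay (nodes here know about each other globally), and all names are assumptions.

```python
import hashlib

def make_guid(data: bytes) -> int:
    """Compute an opaque GUID from (part of) the object's state using SHA-1."""
    return int(hashlib.sha1(data).hexdigest(), 16)

class ToyOverlay:
    """Simplified routing overlay: every node is known globally (no real routing)."""

    def __init__(self, node_ids):
        self.nodes = {n: {} for n in node_ids}   # node id -> local object store

    def closest_node(self, guid):
        """Route a request to the live node numerically closest to the GUID."""
        return min(self.nodes, key=lambda n: abs(n - guid))

    def put(self, data: bytes) -> int:
        guid = make_guid(data)
        self.nodes[self.closest_node(guid)][guid] = data   # announce/store the object
        return guid

    def get(self, guid):
        return self.nodes[self.closest_node(guid)].get(guid)

overlay = ToyOverlay(node_ids=[make_guid(bytes([i])) for i in range(8)])
guid = overlay.put(b"report.pdf contents")
assert overlay.get(guid) == b"report.pdf contents"
```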
3.5 PASTRY
Pastry provides efficient request routing, deterministic object location, and load
balancing in an application-independent manner.
Furthermore, Pastry provides mechanisms that support and facilitate application-
specific object replication, caching, and fault recovery.
Each node in the Pastry network has a unique, uniform random identifier (nodeId) in
a circular 128-bit identifier space.
When presented with a message and a numeric 128-bit key, a Pastry node efficiently
routes the message to the node with a nodeId that is numerically closest to the key,
among all currently live Pastry nodes.
The expected number of forwarding steps in the Pastry overlay network is O(log N),
while the size of the routing table maintained in each Pastry node is only O(log N) in
size (where N is the number of live Pastry nodes in the overlay network).
At each Pastry node along the route that a message takes, the application is notified
and may perform application-specific computations related to the message.
Each Pastry node keeps track of its L immediate neighbors in the nodeId space, called
the leaf set, and notifies applications of new node arrivals, node failures and node
recoveries within the leaf set.
Pastry takes into account locality (proximity) in the underlying Internet.
It seeks to minimize the distance messages travel, according to a scalar proximity
metric like the ping delay.
Pastry is completely decentralized, scalable, and self-organizing; it automatically
adapts to the arrival, departure and failure of nodes.
Capabilities of Pastry
Mapping application objects to Pastry nodes
Application-specific objects are assigned unique, uniform random identifiers
(objIds) and mapped to the k, (k >= 1) nodes with nodeIds numerically closest
to the objId.
The number k reflects the application's desired degree of replication for the
object.
Inserting objects
Application-specific objects can be inserted by routing a Pastry message, using
the objId as the key.
When the message reaches a node with one of the k closest nodeIds to the
objId, that node replicates the object among the other k-1 nodes with closest
nodeIds (which are, by definition, in the same leaf set for k <= L/2).
Accessing objects
Application-specific objects can be looked up, contacted, or retrieved by
routing a Pastry message, using the objId as the key.
By definition, the message is guaranteed to reach a node that maintains a
replica of the requested object unless all k nodes with nodeIds closest to the
objId have failed.
Availability and persistence
Applications interested in availability and persistence of application-specific
objects maintain the following invariant as nodes join, fail and recover: object
replicas are maintained on the k nodes with numerically closest nodeIds to the
objId, for k > 1.
The fact that Pastry maintains leaf sets and notifies applications of changes in
the set's membership simplifies the task of maintaining this invariant.
Diversity
The assignment of nodeIds is uniform random, and cannot be corrupted by an
attacker. Thus, with high probability, nodes with adjacent nodeIds are diverse
in geographic location, ownership, jurisdiction, network attachment, etc.
The probability that such a set of nodes is conspiring or suffers from correlated
failures is low even for modest set sizes.
This minimizes the probability of a simultaneous failure of all k nodes that
maintain an object replica.
Load balancing
Both nodeIds and objIds are randomly assigned and uniformly distributed in
the 128-bit Pastry identifier space.
Without requiring any global coordination, this results in a good first-order
balance of storage requirements and query load among the Pastry nodes, as
well as network load in the underlying Internet.
Object caching
Applications can cache objects on the Pastry nodes encountered along the
paths taken by insert and lookup messages.
Subsequent lookup requests whose paths intersect are served the cached copy.
Pastry's network locality properties make it likely that messages routed with
the same key from nearby nodes converge early, thus lookups are likely to
intercept nearby cached objects.
This distributed caching offloads the k nodes that hold the primary replicas of
an object, and it minimizes client delays and network traffic by dynamically
caching copies near interested clients.
Efficient, scalable information dissemination
Applications can perform efficient multicast using reverse path forwarding along
the tree formed by the routes from clients to the node with nodeId numerically
closest to a given objId.
Pastry's network locality properties ensure that the resulting multicast trees are
efficient; i.e., they result in efficient data delivery and resource usage in the
underlying Internet.
First stage: In this stage a routing scheme is used that routes messages correctly but
inefficiently, without a routing table.
Each active node stores a leaf set – a vector L (of size 2l) containing the
GUIDs and IP addresses of the nodes whose GUIDs are numerically closest on
either side of its own (l above and l below).
Leaf sets are maintained by Pastry as nodes join and leave. Even after a node
failure, they will be corrected within a short time.
In the Pastry system, it is assumed that the leaf sets reflect a recent state of the
system and that they converge on the current state in the face of failures, up to
some maximum rate of failure.
Every leaf set includes the GUIDs and IP addresses of the current node's
immediate neighbours.
A Pastry system with correct leaf sets of size at least 2 can route messages to
any GUID trivially as follows: any node A that receives a message M with
destination address D routes the message by comparing D with its own GUID A
and with each of the GUIDs in its leaf set, and forwarding M to the node
amongst them that is numerically closest to D.
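A minimal sketch of this first-stage rule, assuming the leaf set is just a list of (GUID, address) pairs, is given below; it picks whichever known GUID is numerically closest to the destination.

```python
# First-stage Pastry routing sketch: choose the numerically closest GUID
# among the current node and its leaf set (illustrative only).

def closest_in_leaf_set(own_guid, leaf_set, destination):
    """leaf_set: list of (guid, address); returns the GUID that should handle/forward."""
    candidates = [own_guid] + [guid for guid, _ in leaf_set]
    return min(candidates, key=lambda g: abs(g - destination))

# Example with small integer GUIDs standing in for 128-bit values:
leaf = [(0x20, "node-20"), (0x28, "node-28"), (0x31, "node-31"), (0x3A, "node-3A")]
print(hex(closest_in_leaf_set(0x2C, leaf, destination=0x39)))   # -> 0x3a
```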
Second Stage:
This has a full routing algorithm, which routes a request to any node
in O(log N) messages.
Each Pastry node maintains a tree-structured routing table with GUIDs and
IP addresses for a set of nodes spread throughout the entire range of 2^128
possible GUID values, with increased density of coverage for GUIDs
numerically close to its own.
The structure of the routing table is as follows: GUIDs are viewed as hexadecimal
values and the table classifies GUIDs based on their hexadecimal prefixes.
The table has as many rows as there are hexadecimal digits in a GUID, so for
the prototype Pastry system that we are describing, there are 128/4 = 32 rows.
Any row n contains 15 entries – one for each possible value of the nth
hexadecimal digit, excluding the value in the local node's GUID.
Each entry in the table points to one of the potentially many nodes
whose GUIDs have the relevant prefix.
The routing process at any node A uses the information in its routing table R
and leaf set L to handle each request from an application and each incoming
message from another node according to the algorithm.
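The prefix-based next-hop choice can be sketched as follows, treating GUIDs as hexadecimal strings: the row is the length of the prefix shared with the destination, and the column is the destination's next digit; a missing entry means falling back to the leaf set. The structures and names here are illustrative, not Pastry's actual data layout.

```python
# Sketch of Pastry's prefix-based next-hop choice (GUIDs as hex strings).

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hexadecimal digits that two GUIDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(own_guid, routing_table, destination):
    """routing_table[row][digit] holds a (guid, address) pair or None.
    Row = length of prefix shared with this node; column = next digit of destination."""
    row = shared_prefix_len(own_guid, destination)
    col = int(destination[row], 16)
    return routing_table[row][col]   # None means: fall back to the leaf set

# Example: node 65A1FC routing a message towards D46A1C.
own = "65A1FC"
table = [[None] * 16 for _ in range(len(own))]
table[0][0xD] = ("D13DA3", "10.0.0.7")       # any known node whose GUID starts with 'D'
print(next_hop(own, table, "D46A1C"))        # -> ('D13DA3', '10.0.0.7')
```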
Locality
The entries in the ith row give the addresses of 16 nodes with GUIDs with i–1 initial
hexadecimal digits that match the current node's GUID and an ith digit that takes each
of the possible hexadecimal values.
A well-populated Pastry overlay will contain many more nodes than can be contained
in an individual routing table; whenever a new routing table is being constructed, a
choice is made for each position between several candidates based on a proximity
neighbour selection algorithm.
A locality metric is used to compare candidates and the closest available node is
chosen.
Since the information available is not comprehensive, this mechanism cannot
produce globally optimal routings, but simulations have shown that it results in routes
that are on average only about 30–50% longer than the optimum.
Fault tolerance
The Pastry routing algorithm assumes that all entries in routing tables and leaf sets
refer to live, correctly functioning nodes.
All nodes send 'heartbeat' messages (i.e., messages sent at fixed time intervals to
indicate that the sender is alive) to neighbouring nodes in their leaf sets, but
information about failed nodes detected in this manner may not be disseminated
sufficiently rapidly to eliminate routing errors.
Nor does it account for malicious nodes that may attempt to interfere with
correct routing.
To overcome these problems, clients that depend upon reliable message delivery
are expected to employ an at-least-once delivery mechanism and repeat their
requests several times in the absence of a response.
This will allow Pastry a longer time window to detect and repair node failures.
To deal with any remaining failures or malicious nodes, a small degree of
randomness is introduced into the route selection algorithm.
Dependability
Dependability measures include the use of acknowledgements at each hop in the
routing algorithm.
If the sending host does not receive an acknowledgement after a specified timeout,
it selects an alternative route and retransmits the message.
The node that failed to send an acknowledgement is then noted as a suspected
failure.
To detect failed nodes, each Pastry node periodically sends a heartbeat message to
its immediate neighbour to the left (i.e., with a lower GUID) in the leaf set.
Each node also records the time of the last heartbeat message received from its
immediate neighbour on the right (with a higher GUID).
If the interval since the last heartbeat exceeds a timeout threshold, the detecting
node starts a repair procedure that involves contacting the remaining nodes in the
leaf set with a notification about the failed node and a request for suggested
replacements.
Even in the case of multiple simultaneous failures, this procedure terminates with
all nodes on the left side of the failed node having leaf sets that contain the l live
nodes with the closest GUIDs.
Suspected failed nodes in routing tables are probed in a similar manner to that used
for the leaf set, and if they fail to respond, their routing table entries are replaced
with a suitable alternative obtained from a nearby node.
A simple gossip protocol is used to periodically exchange routing table
information between nodes in order to repair failed entries and prevent slow
deterioration of the locality properties.
The gossip protocol is run about every 20 minutes.
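A minimal sketch of the heartbeat-timeout check described above (the timeout value and class names are assumptions):

```python
import time

HEARTBEAT_TIMEOUT = 30.0   # assumed threshold in seconds; real values are deployment-specific

class NeighbourMonitor:
    """Tracks the last heartbeat from the left-hand leaf-set neighbour."""

    def __init__(self):
        self.last_heartbeat = time.monotonic()

    def on_heartbeat(self):
        self.last_heartbeat = time.monotonic()

    def neighbour_suspected_failed(self) -> bool:
        """True if the interval since the last heartbeat exceeds the timeout,
        which would trigger the leaf-set repair procedure."""
        return time.monotonic() - self.last_heartbeat > HEARTBEAT_TIMEOUT
```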
3.6 TAPESTRY
Tapestry is a decentralized distributed system.
It is an overlay network that implements simple key-based routing.
Each node serves as both an object store and a router that applications can contact
to obtain objects.
In a Tapestry network, objects are "published" at nodes, and once an object has
been successfully published, it is possible for any other node in the network to find
the location at which that object is published.
Identifiers are either NodeIds, which refer to computers that perform routing
operations, or GUIDs, which refer to the objects.
For any resource with GUID G there is a unique root node with GUID RG that is
numerically closest to G.
Hosts H holding replicas of G periodically invoke publish(G) to ensure that newly
arrived hosts become aware of the existence of G.
Figure: Tapestry routing example – replicas of the object "Phil's Books" (GUID 4378) are published at hosts 4228 and AA93, with node 4377 acting as the root node for the object. Other nodes shown include 43FE, 437A, 4361, 4664, 4B4F, 4A6D, E791 and 57EC.
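A much simplified sketch of the publish/locate idea: hosts publish the GUIDs of the replicas they hold, and lookups return the hosts that published them. The structure below is an assumption and does not model Tapestry's actual prefix routing.

```python
# DOLR-style publish/locate sketch (illustrative; real Tapestry routes via an overlay mesh).

class ToyDOLR:
    def __init__(self):
        self.locations = {}            # object GUID -> set of hosts holding a replica

    def publish(self, guid, host):
        """publish(G): a host announces that it holds a replica of object G."""
        self.locations.setdefault(guid, set()).add(host)

    def unpublish(self, guid, host):
        self.locations.get(guid, set()).discard(host)

    def locate(self, guid):
        """Return the hosts at which the object is currently published."""
        return self.locations.get(guid, set())

dolr = ToyDOLR()
dolr.publish(0x4378, "node-4228")
dolr.publish(0x4378, "node-AA93")
print(dolr.locate(0x4378))            # -> {'node-4228', 'node-AA93'}
```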
Gnutella
The Gnutella network is a fully decentralized, peer-to-peer application-layer
network that facilitates file sharing, and is built around an open protocol developed
to enable host discovery, distributed search and file transfer.
The GUID field provides a unique identifier for a message on the network; the
Type field indicates which type of message is being communicated.
TTL field enumerates the maximum number of hops that this message is
allowed to traverse.
Hops field provides a count of the hops already traversed.
Payload Size field provides a byte count of all data expected to follow the
message.
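These header fields can be modelled as a small record; the sketch below is a rough approximation of the descriptor header, not the exact on-the-wire layout (the type codes in the comment are the commonly cited Gnutella 0.4 values).

```python
from dataclasses import dataclass
import uuid

@dataclass
class GnutellaHeader:
    guid: bytes          # unique identifier for the message on the network
    msg_type: int        # e.g. 0x00 Ping, 0x01 Pong, 0x40 Push, 0x80 Query, 0x81 Query-Hit
    ttl: int             # maximum number of hops the message may still traverse
    hops: int            # number of hops already traversed
    payload_size: int    # byte count of the data that follows the header

header = GnutellaHeader(guid=uuid.uuid4().bytes, msg_type=0x00, ttl=7, hops=0, payload_size=0)
```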
Types of Message
There are only five types of messages:
Ping
Pong
Query
Query-Hit
Push
Ping and Pong messages facilitate host discovery, Query and Query-Hit messages make
possible searching the network, and Push messages ease file transfer from firewalled
hosts. Because there is no central index providing a list of hosts connected to the network, a
disconnected host must have offline knowledge of a connected host in order to connect to the
network.
Once connected to the network, a host is always involved in discovering hosts,
establishing connections, and propagating messages that it receives from its peers,
but it may also initiate any of the following voluntary activities: searching for
content, responding to searches, retrieving content, and distributing content. A host
will typically engage in these activities simultaneously.
A host will search for other hosts and establish connections so as to satisfy a
maximum requirement for active connections as specified by the user, or to replace
active connections dropped by it or its peer.
Consequently, a host tends to always maintain the maximum requirement for
active connections as specified by the user, or the one connection that it needs to
remain connected to the network.
To engage in host discovery, a host must issue a Ping message to the host of which
it has offline knowledge.
That host will then forward the ping message across its open connections, and
optionally respond to it with a Pong message.
Each host that subsequently receives the Ping message will act in a similar manner
until the TTL of the message has been exhausted.
A Pong message may only be routed along the reverse of the path that carried the
Ping message to which it is responding.
After having discovered a number of other hosts, the host that issued the initial
ping message may begin to open further connections to the network.
Doing so allows the host to issue ping messages to a wider range of hosts and
therefore discover a wider range of other hosts, as well as to begin querying the
network.
In searching for content, a host propagates search queries across its active
connections.
Those queries are then processed and forwarded by its peers.
When processing a query, a host will typically apply the query to its local database
of content, and respond with a set of URLs pointing to the matching files.
The propagation of Query and Query-Hit messages is identical to that of Ping and
Pong messages.
A host issues a Query message to the hosts to which it is connected.
The hosts receiving that Query message will then forward it across their open
connections, and optionally respond with a Query-Hit message.
Each host that subsequently receives the Query message will act in a similar
manner until the TTL of the message has been exhausted.
Query-Hit messages may only be routed along the reverse of the path that carried
the Query message to which it is a response.
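The forwarding rule for both message pairs can be sketched as TTL-limited flooding with reverse-path responses; the code below is a schematic model with illustrative names, not the Gnutella wire protocol.

```python
# Schematic Gnutella-style forwarding: flood requests while TTL remains,
# remember where each message id arrived from, and route responses back that way.

class Host:
    def __init__(self, name):
        self.name = name
        self.peers = []            # directly connected hosts
        self.seen_from = {}        # message id -> peer the request arrived from

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def forward_request(self, msg_id, ttl, hops, sender=None):
        if msg_id in self.seen_from or ttl <= 0:
            return                               # drop duplicates and expired messages
        self.seen_from[msg_id] = sender          # remember the reverse path
        for peer in self.peers:
            if peer is not sender:
                peer.forward_request(msg_id, ttl - 1, hops + 1, sender=self)

    def route_response(self, msg_id, response):
        """Responses travel only along the reverse of the request's path."""
        back = self.seen_from.get(msg_id)
        if back is None:
            print(f"{self.name}: response delivered: {response}")
        else:
            back.route_response(msg_id, response)

a, b, c = Host("A"), Host("B"), Host("C")
a.connect(b); b.connect(c)
a.forward_request("q1", ttl=7, hops=0)           # A floods a Query
c.route_response("q1", "Query-Hit from C")       # hit is routed back C -> B -> A
```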
The sharing of content on the Gnutella network is accomplished through the use of
the HTTP protocol.
Due to the decentralized nature of the architecture, each host participating in the
Gnutella network plays a key role in the organization and operation of the network.
Both the sharing of host information, as well as the propagation of search requests
and responses are the responsibility of all hosts on the network rather than a central
index.
Each host in the Gnutella network is capable of acting as both a server and client,
allowing a user to distribute and retrieve content simultaneously.
This spreads the provision of content across many hosts, and helps to eliminate the
bottlenecks that typically occur when placing enormous loads on a single host.
In this network, the amount and variety of content available to a user scales with
the number of users participating in the network.
Distributed file systems support the sharing of information in the form of files
and hardware resources.
Apart from the above transparency properties, the file systems also need the following:
Fault Tolerance: It is a design that enables a system to continue operation,
possibly at a reduced level, rather than failing completely, when some part
of the system fails.
Network Transparency: Same access operation as if they are local files.
Location Independence: The file name should not be changed when the physical
location of the file changes.
User Mobility: Users should be able to access the file from anywhere.
File Mobility: Moves files from one place to the other in a running system.
Examples:
The GOOGLE File System: A scalable distributed file system for large
distributed data-intensive applications. It provides fault tolerance while running on
inexpensive commodity hardware, and it delivers high aggregate performance to a
large number of clients.
The CODA distributed file system: Developed at CMU, it incorporates many
distinctive features which are not present in any other distributed file system.
The file service architecture is structured into three components: a flat file service, a directory service and a client module.
Figure: File service architecture – a client computer running an application program and the client module, and a server computer running the directory service and the flat file service.
Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file
service operations.
UFIDs are long sequences of bits chosen so that each file has a UFID that is unique
among all of the files in the distributed system.
Directory service:
This provides mapping between text names for the files and their UFIDs.
Clients may obtain the UFID of a file by quoting its text name to the directory service.
The directory service supports the functions needed to generate directories and to add
new files to directories.
Client module:
It runs on each computer and provides integrated service (flat file and directory) as
a single API to application programs.
It holds information about the network locations of the flat file and directory server
processes, and achieves better performance through the implementation of a cache of
recently used file blocks at the client.
Create() → FileId: Creates a new file of length 0 and delivers a UFID for it.
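A toy flat file service and directory service along these lines is sketched below, using integer UFIDs; the operation set and names are simplified assumptions.

```python
import itertools

class FlatFileService:
    """Toy flat file service: files are byte arrays addressed only by UFID."""

    _next_ufid = itertools.count(1)

    def __init__(self):
        self.files = {}                           # UFID -> bytearray

    def create(self) -> int:
        """Create(): creates a new file of length 0 and delivers a UFID for it."""
        ufid = next(self._next_ufid)
        self.files[ufid] = bytearray()
        return ufid

    def write(self, ufid, position, data: bytes):
        f = self.files[ufid]
        f[position:position + len(data)] = data

    def read(self, ufid, position, n) -> bytes:
        return bytes(self.files[ufid][position:position + n])

class DirectoryService:
    """Toy directory service: maps text names to UFIDs."""

    def __init__(self):
        self.entries = {}

    def add_name(self, name, ufid):
        self.entries[name] = ufid

    def lookup(self, name) -> int:
        return self.entries[name]

flat, directory = FlatFileService(), DirectoryService()
ufid = flat.create()
directory.add_name("notes.txt", ufid)
flat.write(ufid, 0, b"hello")
print(flat.read(directory.lookup("notes.txt"), 0, 5))   # -> b'hello'
```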
File Group
A file group is a collection of files that can be located on any server or moved between
servers while maintaining the same names.
A similar construct is used in a UNIX file system.
It helps with distributing the load of file serving between several servers.
File groups have identifiers which are unique throughout the system.
To construct a globally unique ID we use some unique attribute of the machine on
which it is created, e.g. IP number, even though the file group may move
subsequently.
– Like NFS, AFS provides transparent access to remote shared files for UNIX
programs running on workstations.
– AFS is implemented as two software components that exist as UNIX processes called
Vice and Venus.
– The files available to user processes running on workstations are either local or
shared.
– Local files are handled as normal UNIX files.
They are stored on the workstation's disk and are available only to local user
processes.
Figure: Distribution of processes in the Andrew File System – each workstation runs a user program, the Venus process and a UNIX kernel; servers run the Vice process over a UNIX kernel, and workstations and servers communicate over the network.
– The UNIX kernel in each workstation and server is a modified version of BSD
UNIX.
– The modifications are designed to intercept open, close and some other file system
calls when they refer to files in the shared name space and pass them to the Venus
process in the client computer.
Figure: System call interception in AFS – UNIX file system calls issued by a user program on a workstation are intercepted by the modified UNIX kernel; operations on non-local files are passed to the Venus process, while local files are handled by the UNIX file system on the local disk.
Create() → fid: Creates a new file and records a callback promise on it.
Remove(fid): Deletes the specified file.
SetLock(fid, mode): Sets a lock on the specified file or directory. The mode of the
lock may be shared or exclusive. Locks that are not removed expire after 30
minutes.
ReleaseLock(fid): Unlocks the specified file or directory.
RemoveCallback(fid): Informs the server that a Venus process has flushed a file from
its cache.
BreakCallback(fid): A call made by a Vice server to a Venus process; cancels the
callback promise on the relevant file.
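A rough sketch of the callback-promise bookkeeping on the client (Venus) side is given below; when the server breaks the callback, the cached copy is refetched before use. All names are assumptions.

```python
# Sketch of callback-promise handling at a Venus-like client cache.

class ClientCache:
    def __init__(self):
        self.cache = {}              # fid -> file contents
        self.callback_valid = {}     # fid -> True while the callback promise holds

    def store_fetched(self, fid, data):
        """Fetch(fid): cache the file and record a valid callback promise."""
        self.cache[fid] = data
        self.callback_valid[fid] = True

    def break_callback(self, fid):
        """BreakCallback(fid): the server cancels the promise after another client's update."""
        self.callback_valid[fid] = False

    def open_file(self, fid, fetch_from_server):
        """Use the cached copy only while its callback promise is valid."""
        if not self.callback_valid.get(fid, False):
            self.store_fetched(fid, fetch_from_server(fid))   # revalidate/refetch
        return self.cache[fid]

venus = ClientCache()
venus.store_fetched("fid-1", b"v1")
venus.break_callback("fid-1")                      # another workstation updated the file
print(venus.open_file("fid-1", lambda fid: b"v2")) # -> b'v2' (refetched)
```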
Thus, the client's request for file access is delivered across the network as a
message to the server, the server machine performs the access request, and the
result is sent to the client.
This model needs to minimize the number of messages sent and the overhead per
message.
2. Data-caching model
This model attempts to reduce the network traffic of the previous model by
caching the data obtained from the server node.
This takes advantage of the locality of reference found in file accesses.
A replacement policy such as LRU is used to keep the cache size bounded.
Although this model reduces network traffic, it has to deal with the cache coherency
problem during writes, because the local cached copy of the data needs to be
updated, the original file at the server node needs to be updated and copies in
any other caches need to be updated.
The data-caching model offers the possibility of increased performance and
greater system scalability because it reduces network traffic, contention for the
network, and contention for the file servers. Hence almost all distributed file
systems implement some form of caching.
In file systems that use the data-caching model, an important design issue is to
decide the unit of data transfer.
This refers to the fraction of a file that is transferred to and from clients as a
result of a single read or write operation.
A file block is a contiguous portion of a file and is of fixed length (which can also
be equal to a virtual memory page size).
This does not require client nodes to have large storage space.
It eliminates the need to copy an entire file when only a small portion of the
data is needed.
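A minimal sketch of a client-side block cache with LRU replacement, assuming fixed-size blocks (the block size and names are illustrative):

```python
from collections import OrderedDict

BLOCK_SIZE = 4096          # assumed fixed block size (e.g. a virtual-memory page)

class BlockCache:
    """Client-side data cache: keeps recently used file blocks, evicting the LRU one."""

    def __init__(self, capacity_blocks, fetch_block):
        self.capacity = capacity_blocks
        self.fetch_block = fetch_block           # function (file_id, block_no) -> bytes
        self.blocks = OrderedDict()              # (file_id, block_no) -> data

    def read_block(self, file_id, block_no):
        key = (file_id, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)         # mark as most recently used
            return self.blocks[key]
        data = self.fetch_block(file_id, block_no)   # miss: one network transfer per block
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)      # evict the least recently used block
        return data

cache = BlockCache(capacity_blocks=128, fetch_block=lambda f, b: b"\0" * BLOCK_SIZE)
```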
We operate on an entity through its Access Point. The Address is the name of the
access point.
The naming facility of a distributed operating system enables users and programs to
assign character-string names to objects and subsequently use these names to refer
to those objects.
The locating facility, which is an integral part of the naming facility, maps an
object's name to the object's location in a distributed system.
The naming and locating facilities jointly form a naming system that provides the
users with an abstraction of an object that hides the details of how and where an
object is actually located in the network.
Given an object name, it returns a set of the locations of the object's replicas.
The naming system plays a very important role in achieving the goal of
location transparency.
Example:
3.12.1 Identifiers:
3.12.2 Namespaces:
Since an object's properties are stored and maintained by the authoritative name
servers of that object, name resolution is basically the process of mapping an
object's name to the authoritative name servers of that object.
Once an authoritative name server of the object has been located, operations can be
invoked to read or update the object's properties.
Each name agent in a distributed system knows about at least one name server
a priori.
To get a name resolved, a client first contacts its name agent, which in turn
contacts a known name server, which may in turn contact other name servers.
– On the other hand, a relative name defines a path from the current context to the
specified object. It is called a relative name because it is "relative to" (i.e., starts from)
the user's current context.
In this method, a user may specify an object in any of the following ways:
1. Using the full (absolute) name
2. Using a relative name
3. Changing the current context first and then using a relative name
Managerial layer
– After this, the name servers that store the contexts of the given pathname are
recursively activated one after another until the authority attribute of the named
object is extracted from the context corresponding to the last component name of
the pathname.
– The last name server returns the authority attribute to its previous name server,
which then returns it to its own previous name server, and so on.
– Finally, the first name server that received the request from the name agent returns
the authority attribute to the name agent.
– Rather, the name agent retains control over the resolution process and one by one
calls each of the servers involved in the resolution process.
– To continue the name resolution, the name agent sends a name resolution request
along with the unresolved portion of the name to the next name server.
– The process continues until the name agent receives the authority attribute of the
named object
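The iterative scheme can be sketched as follows: the name agent keeps the unresolved suffix of the name and asks one server after another until the authority attribute is found. All structures and names here are illustrative.

```python
from dataclasses import dataclass

@dataclass
class NameServer:
    """Toy name server: knows some contexts directly and refers the rest onwards."""
    known: dict          # final path component -> authority attribute (a server address)
    referrals: dict      # first path component -> next NameServer to ask

    def resolve(self, components):
        head, rest = components[0], components[1:]
        if not rest and head in self.known:
            return ("authority", self.known[head])          # fully resolved
        return ("referral", self.referrals[head], rest)     # ask the next server

def resolve_iteratively(pathname, server):
    """The name agent retains control: it calls each server involved itself."""
    components = pathname.strip("/").split("/")
    while True:
        result = server.resolve(components)
        if result[0] == "authority":
            return result[1]
        _, server, components = result

leaf = NameServer(known={"report.txt": "fileserver-7"}, referrals={})
root = NameServer(known={}, referrals={"projects": leaf})
print(resolve_iteratively("/projects/report.txt", root))    # -> 'fileserver-7'
```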
Caching at the client is an effective way of pushing the processing workload from
the server out to client devices, if a client has the capacity.
If result data is likely to be reused by multiple clients or if the client devices do not
have the capacity then caching at the server is more effective.
Multi-level caching
– Client sends operation requests to the server and the server sends responses in turn.
– With some exceptions the client need not wait for a response before sending the
next request. Server may send the responses in any order.
StartTLS: Optionally protect the connection with Transport Layer Security
(TLS), to have a more secure connection.
Bind - authenticate and specify LDAP protocol version
Delete an entry
Modify an entry
Directory Structure
Directory is a tree of directory entries.
Each entry consists of a set of attributes.
An attribute has a name, an attribute type or attribute description, and one or
more values.
Attributes are defined in a schema.
Each entry has a unique identifier called Distinguished Name (DN).
The DN consists of its Relative Distinguished Name (RDN) constructed from
some attribute(s) in the entry.
It is followed by the parent entry's DN.
Think of the DN as a full filename and the RDN as a relative filename in a folder.
DN may change over the lifetime of the entry.
To reliably and unambiguously identify entries, a UUID might be provided in the
set of the entry's operational attributes.
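As an illustration, a DN can be read as the entry's RDN followed by the parent entry's DN; the attribute values below are hypothetical.

```python
# Building a Distinguished Name from an RDN plus the parent entry's DN (hypothetical values).

parent_dn = "ou=People,dc=example,dc=com"
rdn = "cn=Alice Example"                 # constructed from attribute(s) in the entry
dn = f"{rdn},{parent_dn}"                # -> "cn=Alice Example,ou=People,dc=example,dc=com"

entry = {                                # attributes of the entry, as defined by a schema
    "cn": ["Alice Example"],
    "mail": ["alice@example.com"],
    "objectClass": ["inetOrgPerson"],
}
print(dn)
```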
REVIEW QUESTIONS
PART - A
3. Data recovery or backup is very difficult. Each computer should have its own
backup system.
4. Lots of movies, music and other copyrighted files are transferred using this type of
file transfer. P2P is the technology used in torrents.
5. Give the characteristics of peer-to-peer middleware.
The P2P middleware must possess the following characteristics:
Global Scalability
Load Balancing
Local Optimization
Adjusting to dynamic host availability
Security of data
Anonymity, deniability, and resistance to censorship
6. Define routing overlay.
A routing overlay is a distributed algorithm for a middleware layer responsible for
routing requests from any client to a host that holds the object to which the request is
addressed.
7. Give the differences between Overlay networks and IP routing
IP: The scalability of IPv4 is limited to 2^32 addressable nodes; IPv6 extends this to 2^128.
Overlay network: Peer-to-peer systems can address more objects using GUIDs.
8. What is Napster?
Napster was developed for peer-to-peer file sharing, especially of MP3 files. It was not
fully peer-to-peer since it used central servers to maintain lists of connected systems
and the files they provided, while actual transactions were conducted directly between
machines.
9. Give the features of GUID.
They are pure names or opaque identifiers that do not reveal anything about the
locations of the objects.
They are the building blocks for routing overlays.
They are computed from all or part of the state of the object using a function that
delivers a value that is very likely to be unique. Uniqueness is then checked against
all other GUIDs.
They are not human understandable.
10. Give the types of routing overlays.
DHT – Distributed Hash Tables. GUIDs are stored based on hash values.
DOLR – Distributed Object Location and Routing. DOLR is a layer over the DHT that
maps GUIDs to the addresses of nodes. A GUID's host address is announced using the
publish() operation.
11. Define pastry.
Pastry is a generic, scalable and efficient substrate for peer-to-peer applications. Pastry
nodes form a decentralized, self-organizing and fault-tolerant overlay network within
the Internet.
12. Give the Capabilities of Pastry
Mapping application objects to Pastry nodes
Inserting objects
Accessing objects
Availability and persistence
Diversity
Load balancing
Object caching
Efficient, scalable information dissemination
Location Transparency: The names of the files do not reveal their physical
location.
PART – B
1. Explain about peer to peer communication.
2. Explain the working of routing overlays.
3. Describe Napster.
4. Write in detail about peer to peer middleware.
5. Explain about pastry.
6. Describe tapestry.