Assessments, Exam Practice

This 3 page document contains a 20 question assessment for a Distributed Systems Development course. The assessment covers topics such as distributed system identifiers, concurrency control, security, and key management. Students are instructed to attempt all questions, convert their responses to PDF, and not use online resources for answers. Questions involve short answers, explanations, examples, and discussions of distributed computing concepts. The assessment aims to evaluate students' understanding of essential distributed systems principles.

Uploaded by

Mthethwa Sbahle

Monday, 14th of December 2020

UNIVERSITY OF ZULULAND

FACULTY SCIENCE AND AGRICULTURE

DEPARTMENT OF COMPUTER SCIENCE

Assessment III

COURSE CODE: SCPS 312 Distributed Systems Development

Examiner: Ms I. N. Ezeji
Moderator: Mr. P Tarwireyi
Duration: 2 hours 75 marks

Instructions

1. This paper is three (3) pages long.


2. Attempt all questions.
3. Please convert the document to a PDF before submitting.
4. Please try not to use an existing online memo to answer these questions.

1. Give an example of where an address of an entity E needs to be further resolved into another
address to actually access E. [2]

ANSWER
IP addresses are used to address hosts, but to access a host on a local network, the IP
address needs to be resolved to an Ethernet (MAC) address, for example by ARP.

2. Would you consider a URL such as http://www.acme.org/index.html and
http://www.acme.nl/index.html to be location independent? Explain your answer. [3]

ANSWER
Yes, the name of the entity is independent of its address. Based on the name,
nothing can be said about the associated entity’s address


3. Give at least four examples of true identifiers. [2]

ANSWER
A true identifier is one that is unique and will never change, for example a MAC
address or a port on a single machine.

4. State the three properties of true identifiers. [3]

ANSWER
An identifier refers to at most one entity.
Each entity is referred to by at most one identifier.
An identifier always refers to the same entity.

5. Is an identifier allowed to contain information on the entity it refers to? Explain the
consequence. [4]

ANSWER

Yes, but that information cannot be allowed to change, as that would mean the
identifier also needs to be changed. For example, a MAC address often contains
information about the manufacturer of the device.

6. List the two simple solutions for locating an entity with regards to flat naming. [2]

ANSWER:
Broadcasting and Multicasting
Forwarding pointers


7. Discuss the listed solutions from 6. above in terms of their strengths and weaknesses. [9]

ANSWER

Forwarding pointers

• Advantage:
Dereferencing can be made transparent to clients, which simply follow the pointer chain.
• Geographical scalability problems:
The chain can become very long for highly mobile entities.

Multicasting:

Does not affect all computers on the network.
Can prevent unwanted message transmission and avoids clogging the network.
Relieves client computers that are not interested in the multicast traffic of the burden of
processing unwanted data packets.

WEAKNESS:
Multicasting over UDP has no reliability, flow control or error recovery functions.
Since the TCP window mechanism is not available in UDP, congestion is possible with
multicast.
Broadcasting becomes inefficient when the network grows. Not only is network bandwidth
wasted by request messages, but, more seriously, too many hosts may be interrupted by
requests they cannot answer.
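
The forwarding-pointer trade-off can be illustrated with a small Python sketch. It is only an illustration under assumptions: the names `Location`, `move` and `resolve` are hypothetical, and hosts are plain in-memory objects rather than networked machines. Each move leaves a pointer behind, so the number of hops a client must follow grows with the number of moves.

```python
class Location:
    """A host that either holds the entity or a forwarding pointer to the next host."""

    def __init__(self, name):
        self.name = name
        self.has_entity = False
        self.forward_to = None  # forwarding pointer left behind after a move


def move(src, dst):
    """Move the entity from src to dst, leaving a forwarding pointer at src."""
    src.has_entity = False
    src.forward_to = dst
    dst.has_entity = True
    dst.forward_to = None


def resolve(start):
    """Follow the pointer chain from start; return (location, number of hops)."""
    hops = 0
    current = start
    while not current.has_entity:
        if current.forward_to is None:
            # A lost or crashed intermediate host breaks the whole chain.
            raise LookupError("broken chain at " + current.name)
        current = current.forward_to
        hops += 1
    return current, hops


# A highly mobile entity: it starts at host0 and moves four times.
hosts = [Location(f"host{i}") for i in range(5)]
hosts[0].has_entity = True
for src, dst in zip(hosts, hosts[1:]):
    move(src, dst)

final, hops = resolve(hosts[0])
print(final.name, hops)  # host4 4
```

A client that only knows the original address pays one hop per move, which is the scalability weakness noted above; a single broken pointer also makes the entity unreachable.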


8. Figure 1 shows bank account operations in which transaction V transfers a sum from account
A to B and transaction W invokes the branchTotal method to obtain the sum of the balances
of all the accounts in the bank. The balances of the two bank accounts, A and B, are both
initially $200.

Figure 1

a. Identify and describe the concurrency control problem exhibited in Figure 1. [2]

b. Show the serially equivalent interleaving of Transactions V and W in Figure 1. [3]

9. Why are concurrency control protocols designed? [2]

ANSWER
To coordinate execution so that the VIEW or effect from the database's perspective is the
same as if the concurrently executing transactions were executed in a serial fashion.
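
To make "the same as a serial execution" concrete, here is a small Python sketch in the spirit of question 8. The figure itself is not reproduced in this document, so the transfer amount of $100, the operation names and the particular interleavings are assumptions made for illustration. Both accounts start at $200, as stated in the question.

```python
TRANSFER = 100  # assumed transfer amount; the original figure is not available


def run(schedule):
    """Execute (transaction, operation) steps in order against fresh accounts.

    Returns the total computed by W and the final account balances.
    """
    accounts = {"A": 200, "B": 200}
    total = 0
    for _transaction, operation in schedule:
        if operation == "withdraw_A":
            accounts["A"] -= TRANSFER
        elif operation == "deposit_B":
            accounts["B"] += TRANSFER
        elif operation == "read_A":
            total += accounts["A"]
        elif operation == "read_B":
            total += accounts["B"]
    return total, accounts


# Serial execution: V runs completely before W.
serial = [("V", "withdraw_A"), ("V", "deposit_B"),
          ("W", "read_A"), ("W", "read_B")]

# Inconsistent retrieval: W reads B before V has deposited into it.
inconsistent = [("V", "withdraw_A"), ("W", "read_A"),
                ("W", "read_B"), ("V", "deposit_B")]

# Serially equivalent interleaving: V accesses each account before W does.
equivalent = [("V", "withdraw_A"), ("W", "read_A"),
              ("V", "deposit_B"), ("W", "read_B")]

print(run(serial)[0])        # 400
print(run(inconsistent)[0])  # 300
print(run(equivalent)[0])    # 400
```

In the inconsistent schedule the two transactions access A and B in opposite orders, so W sees the money missing from A but not yet added to B. In the serially equivalent schedule every pair of conflicting operations is ordered the same way (V before W), so the result matches the serial execution.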

10. All concurrency control protocols are based on serial equivalence and are derived from three
rules of conflicting operations. What are these rules? [6]

ANSWER

11. Explain how the granting of locks is implemented. [3]

ANSWER
The granting of locks is implemented by a separate object in the server that is called the lock
manager. The lock manager provides setLock and unLock operations for use by the server.
The lock manager holds a set of locks. Each lock is an instance of the class Lock and is associated
with a particular object.
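
A minimal Python sketch of such a lock manager is shown below. It is an illustration under assumptions, not the textbook implementation: the method names `set_lock` and `un_lock` mirror setLock and unLock, while the parameter lists, the `"read"`/`"write"` modes and the helper `_compatible` are hypothetical choices made here.

```python
import threading


class Lock:
    """One lock per object: shared between readers, exclusive for a writer."""

    def __init__(self):
        self.holders = set()  # transactions currently holding this lock
        self.mode = None      # None, "read" or "write"


class LockManager:
    """Holds the set of locks and grants them to transactions."""

    def __init__(self):
        self._locks = {}  # object identifier -> Lock
        self._cond = threading.Condition()

    def _compatible(self, lock, txn, mode):
        # A request is compatible if nobody else holds the lock,
        # or if both the held lock and the request are read locks.
        others = lock.holders - {txn}
        if not others:
            return True
        return mode == "read" and lock.mode == "read"

    def set_lock(self, obj, txn, mode):
        """Block until txn can hold the lock on obj in the given mode."""
        with self._cond:
            lock = self._locks.setdefault(obj, Lock())
            while not self._compatible(lock, txn, mode):
                self._cond.wait()
            lock.holders.add(txn)
            if mode == "write" or lock.mode is None:
                lock.mode = mode  # first grant, or promotion to a write lock

    def un_lock(self, txn):
        """Release every lock held by txn and wake up waiting transactions."""
        with self._cond:
            for lock in self._locks.values():
                lock.holders.discard(txn)
                if not lock.holders:
                    lock.mode = None
            self._cond.notify_all()


manager = LockManager()
manager.set_lock("accountA", "T1", "read")
manager.set_lock("accountA", "T2", "read")   # read locks can be shared
manager.un_lock("T1")
manager.un_lock("T2")
manager.set_lock("accountA", "T3", "write")  # exclusive once readers are gone
```

Releasing all of a transaction's locks in a single `un_lock` call fits strict two-phase locking, where locks are held until the transaction commits or aborts.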


12. How can role changes be expressed in an access control matrix? [3]

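Roles, or protection domains in general, can be viewed as objects with a single operation: enter.
Whether this operation may be invoked depends on the role from which the request is issued.
Some more advanced approaches include allowing changes back to previous roles.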

13. It is required that transactions delay both their read and write operations so as to avoid both
‘dirty reads’ and ‘premature writes’. Explain this type of transaction? [3]

14. Describe how a non-recoverable situation could arise if write locks are released after the last
operation of a transaction but before its commitment. [2]

15. State and discuss the two aspects of atomicity in distributed systems transactions.
[4]

16. Suppose you were asked to develop a distributed application that would allow teachers to set
up exams. Give at least three statements that would be part of the security policy for such an
application. [6]

17. Would it be safe to join message 3 and message 4 in the authentication framework shown in
Figure 1? Justify your answer.

Figure 1 [2]

18. Why is it not necessary in Figure 2 for the KDC to know for sure it was talking to Alice
when it receives a request for a secret key that Alice can share with Bob?

answer

Key: Because the secret key between Alice and Bob is encrypted by the shared key between the
KDC and Alice, which means that only Alice can decrypt the message.


Figure 2 [2]

19. Access rights of objects with reference to subjects are enforced using Access Control Matrix
approaches like the Access Control List and Capabilities. Discuss their differences. [3]

20. Discuss any two advantages and two disadvantages of using centralized servers for key
management. [8]

Advantages:
It is relatively easy to secure several servers instead of all clients. It is easy to change the
shared keys between members. The server can be trusted by all members.

Using a centralized server allows efficient storage and maintenance facilities.

Simplicity.

Disadvantages:

The server can become a bottleneck with respect to performance and availability.

Scalability issues. Once a central server is corrupted, it is a disaster for the group.


-------------------------------------------------------------------------------------------------------------------------

Wish You All The Best


1. Give an example of where an address of an entity E needs to be further
resolved into another address to actually access E. [2]

ANSWER
IP addresses are used to address hosts, but to access a host, the IP address
needs to be resolved to an Ethernet address.
A client asks a DNS server for the IP address associated with "www.example.com".
The DNS server then replies with the corresponding IP address (e.g., 192.168.1.1).


2. Would you consider a URL such as http://www.acme.org/index.html and
http://www.acme.nl/index.html to be location independent? Explain your
answer. [3]

No. In the given example, the URLs contain specific domain names (acme.org and acme.nl) that indicate the
specific location or organization associated with them.

The key idea is that, by considering just the name (domain), you should not be able to deduce the precise
address (geographical or network) of the entity.

ANSWER
Yes, the name of the entity is independent of its address. Based on the name,
nothing can be said about the associated entity’s address.

3. Give at least four examples of true identifiers. [2]

ID number
Employee number
VIN number
Table name

ANSWER
A true identifier is one that is unique and will never change, for example a MAC
address or a port on a single machine.

4. State the three properties of true identifiers. [3]

ANSWER
An identifier refers to at most one entity.
Each entity is referred to by at most one identifier.
An identifier always refers to the same entity.

5. Is an identifier allowed to contain information on the entity it refers to?
Explain the consequence. [4]

ANSWER
For consistency.
Yes, but that information cannot be allowed to change, as that would mean the
identifier also needs to be changed. For example, a MAC address often contains
information about the manufacturer of the device.

6. List the two simple solutions for locating an entity with regards to flat
naming. [2]

Broadcasting
Forwarding pointers

ANSWER:
Broadcasting and Multicasting
Forwarding pointers

7. Discuss the listed solutions from 6. above in terms of their strengths and
weaknesses. [9]

Broadcasting
Strengths: good for message delivery; simple to implement; suitable for small networks.
Weaknesses: congestion on large-scale networks; limited scaling; high security risk.

Forwarding pointers
Strengths: efficient and scalable; high security; flexible.
Weaknesses: complex; latency; rule misconfiguration.

ANSWER

Forwarding pointers

• Advantage:
Dereferencing can be made transparent to clients, which simply follow the pointer chain.
• Geographical scalability problems:
The chain can become very long for highly mobile entities.

Multicasting:

Does not affect all computers on the network.
Can prevent unwanted message transmission and avoids clogging the network.
Relieves client computers that are not interested in the multicast traffic of the burden of
processing unwanted data packets.

WEAKNESS:
Multicasting over UDP has no reliability, flow control or error recovery functions.
Since the TCP window mechanism is not available in UDP, congestion is possible
with multicast.
Broadcasting becomes inefficient when the network grows. Not only is network
bandwidth wasted by request messages, but, more seriously, too many hosts
may be interrupted by requests they cannot answer.


20. Discuss any two advantages and two disadvantages of using
centralized servers for key management. [8]

ANSWER:

Centralized auditing and monitoring

A centralized server makes it possible to flexibly manage a very large number of keys throughout
their entire lifecycle. A fully automated and centralized key management system, such as used by
MasterCard, allows a business to maintain its secure infrastructure while significantly reducing
costs and improving operational efficiency.

Centralized policy management

A centralized and granular cryptographic policy can enable seamless updates for all necessary
cryptographic functions without any changes in the application code. Implementing centralized
policy enforcement, where the system collects all relevant information in a single place for easy
audit and in human-readable form, makes demonstrating compliance with internal and external
policies a more straightforward task.

Go over number nine

Assessment Three

1] Identify the three communication primitives of the request-reply protocol and explain them.

A: They are: doOperation, getRequest and sendReply.

The doOperation method is used by clients to invoke remote operations.

getRequest is used by a server process to acquire service requests.

When the server has invoked the specified operation, it then uses sendReply to send the reply message to
the client.
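
The interplay of the three primitives can be sketched in Python as follows. This is only a sketch under assumptions: an in-memory queue (the hypothetical `Channel` class) stands in for the network, the function names are Python-style renderings of doOperation, getRequest and sendReply, and the `OPERATIONS` table with its single `"add"` entry is invented for the example.

```python
import queue
import threading


class Channel:
    """In-memory stand-in for the network (an assumption; real systems use UDP or TCP)."""

    def __init__(self):
        self.requests = queue.Queue()


def do_operation(channel, operation_id, arguments):
    """Client side: send a request message and block until the reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    channel.requests.put((reply_box, operation_id, arguments))
    return reply_box.get(timeout=5)


def get_request(channel):
    """Server side: block until a client request is available."""
    return channel.requests.get(timeout=5)


def send_reply(reply_box, result):
    """Server side: send the result back to the waiting client."""
    reply_box.put(result)


# Hypothetical table of remote operations offered by the server.
OPERATIONS = {"add": lambda a, b: a + b}


def serve_one(channel):
    """Handle exactly one request: get it, invoke the operation, reply."""
    reply_box, operation_id, arguments = get_request(channel)
    send_reply(reply_box, OPERATIONS[operation_id](*arguments))


channel = Channel()
server = threading.Thread(target=serve_one, args=(channel,))
server.start()
result = do_operation(channel, "add", (2, 3))
server.join()
print(result)  # 5
```

Note that the client blocks inside `do_operation` until the reply arrives, so the reply doubles as confirmation that the request was received and executed.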
2] Indirect communication avoids direct coupling and hence inherits interesting properties. Two key
properties of this scheme are space and time uncoupling. Perform a comparative study between space and
time uncoupling.

Space coupling, time-coupled
Properties: Communication directed towards a given receiver or receivers; receivers must exist at that
time, e.g. remote invocation, message passing.

Space coupling, time-uncoupled
Properties: Communication directed towards a given receiver or receivers; sender and receivers can have
different lifetimes.

Space uncoupling, time-coupled
Properties: Sender does not need to know the identity of the receiver; receiver must exist at that time,
e.g. IP multicast.

Space uncoupling, time-uncoupled
Properties: Sender does not need to know the identity of the receiver; sender and receiver can have
independent lifetimes, e.g. most indirect communication paradigms.

3] Group communication is an important building block for reliable distributed systems. Identify the
four key areas of application.

• The reliable dissemination of information to potentially large numbers of clients, including in the financial
industry, where institutions require accurate and up-to-date access to a wide variety of information sources;
• Support for collaborative applications, where again events must be disseminated to multiple users to
preserve a common user view – for example, in multiuser games.
• Support for a range of fault-tolerance strategies, including the consistent update of replicated data.

• Support for system monitoring and management, including for example load balancing strategies.

4] Describe a scenario in remote invocation in which request-reply protocols are required.


Request-reply protocols are used in environments where the overheads of communication must be
minimized, for example in embedded systems.

5] Describe the two components of message identifiers in request-reply communication.


A:
RequestId, which is taken from an increasing sequence of integers by the sending process;
An identifier for the sender process, for example, its port and Internet address.
6] What are the three protocols used for implementing various types of request behaviour? Which
of these protocols can be used when the client requires no confirmation that the operation has
been executed?

• The request (R) protocol;


• The request-reply (RR) protocol;
• The request-reply-acknowledge reply (RRA) protocol.

The request protocol may be used when the client requires no confirmation that the operation has
been executed.

7] In publish-subscribe systems, explain how channel-based approaches can trivially be
implemented using a group communication service?

Each channel is mapped to a group. Subscribers join the group for a given channel and receive all
events published to that channel, and publishing an event is a multicast to that group. This is a less
optimal strategy because every event on the channel is delivered, regardless of the attributes that
describe it.
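
A minimal Python sketch of channel-based publish-subscribe layered on a group service is given below. The `Group` class is a hypothetical in-memory stand-in for a real group communication service, and `ChannelService` and the channel names are invented for illustration. Subscribing is joining the channel's group, and publishing is a multicast to it.

```python
class Group:
    """Hypothetical group communication service offering join, leave and multicast."""

    def __init__(self):
        self.members = []

    def join(self, member):
        self.members.append(member)

    def leave(self, member):
        self.members.remove(member)

    def multicast(self, message):
        # Deliver the message to every current member of the group.
        for member in list(self.members):
            member(message)


class ChannelService:
    """Channel-based publish-subscribe: one group per named channel."""

    def __init__(self):
        self.groups = {}  # channel name -> Group

    def subscribe(self, channel, callback):
        self.groups.setdefault(channel, Group()).join(callback)

    def unsubscribe(self, channel, callback):
        self.groups[channel].leave(callback)

    def publish(self, channel, event):
        group = self.groups.get(channel)
        if group is not None:
            group.multicast(event)


service = ChannelService()
sports, news = [], []
on_sports, on_news = sports.append, news.append

service.subscribe("sports", on_sports)
service.subscribe("news", on_news)
service.publish("sports", "goal scored")
service.publish("news", "election result")
service.unsubscribe("news", on_news)
service.publish("news", "late edition")  # nobody is subscribed any more

print(sports)  # ['goal scored']
print(news)    # ['election result']
```

Because the service forwards every event on a channel to every member, it cannot filter by event content, which is the limitation mentioned in the answer.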

8] Explain the mechanism of the request-reply (RR) protocol. How does the RR protocol
compensate for not using the acknowledge message?

Request-reply protocols are designed to support client-server communication. The RR protocol
compensates for not using the acknowledge message in that the server’s reply message is regarded as an
acknowledgement of the client’s request message.

9] List and discuss any three key elements of a group communication management.

Failure detection: The service monitors the group members not only in case they should crash, but also in
case they should become unreachable because of a communication failure.

Notifying members of group membership changes: The service notifies the group’s members when a
process is added, or when a process is excluded

Performing group address expansion: When a process multicasts a message, it supplies the group identifier
rather than a list of processes in the group.

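10] What are the two implementation models of publish-subscribe systems?

Centralized and distributed implementations. (Heterogeneity and asynchronicity are the two main
characteristics of publish-subscribe systems.)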


Distributed Tutorial questions

1.3) A user arrives at a railway station that she has never visited before, carrying a PDA that is
capable of wireless networking. Suggest how the user could be provided with information about the
local services and amenities at that station, without entering the station’s name or attributes. What
technical challenges must be overcome?

The user must be able to acquire the address of locally relevant information as automatically as
possible. One method is for the local wireless network to provide the URL of web pages about the
locality over a local wireless network. For this to work: (1) the user must run a program on her
device that listens for these URLs, and which gives the user sufficient control that she is not
swamped by unwanted URLs of the places she passes through; and (2) the means of propagating the
URL (e.g. infrared or an 802.11 wireless LAN) should have a reach that corresponds to the physical
spread of the place itself
1.4) Explain the use of distributed systems as a utility.

1.5 The INFO service manages a potentially very large set of resources, each of which can be
accessed by users throughout the Internet by means of a key (a string name). Discuss an approach to
the design of the names of the resources that achieves the minimum loss of performance as the
number of resources in the service increases. Suggest how the INFO service can be implemented so
as to avoid performance bottlenecks when the number of users becomes very large.

Algorithms that use hierarchic structures scale better than those that use linear structures.
Therefore the solution should suggest a hierarchic naming scheme. To allow for large numbers of
users, the resources are partitioned amongst several servers. To avoid performance bottlenecks the
algorithm for looking up a name must be decentralised. That is, the same server must not be
involved in looking up every name.

What is the range of techniques covered by remote invocation? Briefly explain each technique.

• The remote procedure call (RPC): provides higher-level support for programmers by extending the
concept of a procedure call to operate in a networked environment.

• Remote method invocation: method invocations are between objects in different processes, where
client objects may invoke methods of remote objects residing in another process, possibly running on
another computer, in the same way as local method invocations.
What is a mobile agent? How can it be a potential security threat?
A mobile agent is a running program (including both code and data) that travels from one computer to another
in a network carrying out a task on someone’s behalf, such as collecting information, and eventually returning
with the results. Mobile agents, like mobile code, are a potential security threat to the resources of the
computers that they visit.
What are the two variants of the interaction model in distributed systems? How do they differ?
Synchronous distributed systems and asynchronous distributed systems.

The term protocol is used to refer to a well-known set of rules and formats to be used for communication
between processes in order to perform a given task. The definition of a protocol has two important parts to it:
• Specification of the sequence of messages that must be exchanged;
• Specification of the format of the data in the messages.
Describe the three alternative approaches to external data representation and marshalling.
• CORBA’s common data representation: is concerned with an external representation for the
structured and primitive types that can be passed as the arguments and results of the method
invocations.
• Java’s object serialization: concerned with the flattening and external data representation of any
single object or tree that may need to be transmitted in a message or stored on a disk.
• XML (Extensible Markup Language): represents a textual format for representing structured data.
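
The contrast between a binary and a textual external representation can be sketched in Python. The binary format below is only CDR-like, not the CORBA specification: it assumes big-endian 4-byte lengths and integers, with strings padded to a multiple of four bytes. The record (name, place, year) and all function names are hypothetical. Java serialization is not shown because it has no direct Python counterpart.

```python
import struct
import xml.etree.ElementTree as ET


def marshal_string(text):
    """Length (4 bytes, big-endian) followed by the characters, padded to 4 bytes."""
    data = text.encode("ascii")
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def marshal_person(name, place, year):
    """Flatten the record into a byte sequence; both sides must know the field order."""
    return marshal_string(name) + marshal_string(place) + struct.pack(">I", year)


def unmarshal_string(buffer, offset):
    """Read one padded string; return it with the offset of the next field."""
    (length,) = struct.unpack_from(">I", buffer, offset)
    start = offset + 4
    text = buffer[start:start + length].decode("ascii")
    return text, start + length + (-length) % 4


def unmarshal_person(buffer):
    name, offset = unmarshal_string(buffer, 0)
    place, offset = unmarshal_string(buffer, offset)
    (year,) = struct.unpack_from(">I", buffer, offset)
    return name, place, year


def person_to_xml(name, place, year):
    """Textual, self-describing representation of the same record."""
    person = ET.Element("person")
    ET.SubElement(person, "name").text = name
    ET.SubElement(person, "place").text = place
    ET.SubElement(person, "year").text = str(year)
    return ET.tostring(person, encoding="unicode")


binary = marshal_person("Smith", "London", 1984)
print(len(binary))               # 28
print(unmarshal_person(binary))  # ('Smith', 'London', 1984)
print(person_to_xml("Smith", "London", 1984))
```

The binary form is compact but carries no type information, so sender and receiver must agree on the layout in advance. The XML form is larger but self-describing, because the tags name each field.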
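What are the different ways in which CORBA can represent constructed types?

Sequence, string, array, struct, enumerated and union. (Short, long, unsigned short, unsigned long, float,
double, char, boolean etc. are primitive types, not constructed types.)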

CORBA supports passing of non-CORBA objects by value. What are the properties of these non
CORBA objects? What are their limitations?

These non CORBA objects are object-like in the sense that they possess both attributes and methods.

What are the CORBA services used for?


They are used for naming, notifying, Trading, security and transactions
SCPS 312 Questions for the examination

1) Describe the advantages of the uniformity of URI and URL.

A: URI are uniform in that their syntax incorporates that of indefinitely many individual types of resource
identifiers. The advantage of uniformity is that it eases the process of introducing new types of identifiers as
well as using existing types of identifiers in new context, without disrupting existing usage.

2) Discuss the problem associated with name services in a distributed system. How can this
be solved?

3) Explain why a name space is important for a particular service. What is the advantage of a
hierarchic name space?

A: Another important aspect of the implementation of a name service is the use of replication and caching.
Both of these assist in making the service highly available, and both also reduce the time taken to resolve a
name. One important advantage of a hierarchy is that it makes large name spaces more manageable.

4) Discuss the shortcomings of the original Internet naming scheme, in which all host names and addresses were
held in a single central master file.

A: This original scheme was soon seen to suffer from three major shortcomings:
• It did not scale to large numbers of computers.
• Local organizations wished to administer their own naming systems.
• A general name service was needed – not one that serves only for looking up
computer addresses.

Transactions

1)State the five properties of a transaction.

A: Atomicity, Consistency, Isolation, Durability, Serialization (ACIDS).

2) State the four main problems that may arise in concurrent executions of transactions.

• The lost update problem


• Inconsistent retrievals
• Serial equivalence
• Conflicting operations

3)
Final Exam 2015

Section A[40 Marks]

1. Distributed systems are going through a period of significant change, which can
be traced back to a number of influential trends. Describe three of these trends?

Distributed systems are undergoing a period of significant change and this can
be traced back to a number of influential trends:
• the emergence of pervasive networking technology;
• the emergence of ubiquitous computing coupled with the desire to
support user mobility in distributed systems;
• the increasing demand for multimedia services;
• the view of distributed systems as a utility.

2. What are the two variants of the interaction model in distributed systems? On
what points do they differ?

Synchronous distributed systems: it is the one in which the following bounds are
defined:

• The time to execute each step of a process has known lower and upper
bounds.
• Each message transmitted over a channel is received within a known bounded
time.
• Each process has a local clock whose drift rate from real time has a known
bound.

Asynchronous distributed systems: An asynchronous distributed system is one in
which there are no bounds on:

• Process execution speeds – for example, one process step may take only a
picosecond and another a century; all that can be said is that each step may take
an arbitrarily long time.
• Message transmission delays – for example, one message from process A to
process B may be delivered in negligible time and another may take several
years. In other words, a message may be received after an arbitrarily long time.
• Clock drift rates – again, the drift rate of a clock is arbitrary.

Synchronous distributed systems: The time to execute each step of a process has
known lower and upper bounds.

Asynchronous distributed systems: The asynchronous model allows no assumptions
about the time intervals involved in any execution.
3. How does adaptive routing ensure the best route of communication between two
points in the network?

The best route for communication between two points in the network is re-evaluated
periodically, taking into account the current traffic in the network and any faults such
as broken connections or routers.
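
The idea of periodic re-evaluation can be sketched in Python. This is an illustration under assumptions: link costs are made-up numbers standing in for measured traffic, the four-node topology is invented, and Dijkstra's shortest-path algorithm is used as one plausible way to compute the best route. Real routing protocols distribute this computation across routers.

```python
import heapq


def best_route(links, source, target):
    """Return (cost, path) of the cheapest route using Dijkstra's algorithm.

    links maps an undirected link (u, v) to its current cost.
    Returns None if the target is unreachable.
    """
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    frontier = [(0, source, [source])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(frontier, (cost + link_cost, neighbour, path + [neighbour]))
    return None


# Hypothetical network: two routes from A to D, via B or via C.
links = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 2, ("C", "D"): 2}
first = best_route(links, "A", "D")

links[("B", "D")] = 10          # traffic builds up on the link B-D
second = best_route(links, "A", "D")

del links[("A", "C")]           # the link A-C fails
third = best_route(links, "A", "D")

print(first)   # (2, ['A', 'B', 'D'])
print(second)  # (4, ['A', 'C', 'D'])
print(third)   # (11, ['A', 'B', 'D'])
```

Each call recomputes the route from the current link costs, so congestion and failures change the chosen path at the next re-evaluation.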

4. What is socket abstraction? Name the main protocols used for interprocess
communication.

Socket abstraction provides an endpoint for communication between processes.
The main protocols are: TCP and UDP.

5. Describe the three alternative approaches to external data representation and
marshalling.

6. Explain the mechanism of the request-reply (RR) protocol. How does the RR
protocol compensate for not using the acknowledge message?

7. Group communication is an important building block for reliable distributed
systems. Identify any three key areas of application?

8. Explain why a name space is important for a particular name service. What is
the advantage of a hierarchic name space?

9. What are the two new name variants to the invocation semantics used by
asynchronous RMI (Remote Method Invocation)?

10. How is caching useful in placement strategies?

Section B [60 marks]
(Attempt 3 questions of your choice in this section)

1. Scalability is considered one of the issues faced in a distributed system.
a) Describe what is meant by a scalable system.
b) Describe four challenges presented by scalability.
c) Describe three different techniques that can be applied to achieve scalability in
a distributed system. [3, 8, 9]
2. The architecture of a system is its structure in terms of separately specified components
(elements) and their relationship

a) List and discuss the four architectural elements in distributed systems.

b) List any four architectural patterns in distributed systems

c) Describe any two architectural patterns mentioned in b)[12, 4, 4]

3. Indirect communication is defined as communication between entities in a distributed
system through an intermediary with no direct coupling between the sender and the receiver(s).

a) What are the two inherent characteristics of indirect communication?

b) List and discuss any three key elements of a group communication management.

c) In a publish-subscribe system, explain how channel-based approaches can trivially be
implemented using a group communication service?

d) List any two subscription filter models with regard to publish-subscribe systems. [6, 9, 3, 2]

4. In order to enforce a security policy, a security mechanism is required

a) Explain what a security policy is

b) List and discuss any four security mechanism

c) Suppose you were asked to develop a distributed application that will allow teachers to
set up exams. Give at least three statements that would be part of the security policy for such
an application.

• The requirements would include that students should not be able to access exams before a
specific time.
• Any teacher accessing an exam before the actual examination date should be authenticated.
• There may be a restricted group of people that should be given read access to any exam in
preparation, whereas only the responsible teacher should be given full access.
Final Exam 2015

Section A[40 Marks]

1. Distributed systems are going through a period of significant change, which can
be traced back to a number of influential trends. Describe three of these trends?

//Old Solution

The emergence of pervasive networking technology: the rise of different types of networking
technologies such as WiFi, WiMAX, Bluetooth etc.
Distributed multimedia systems: ability to support a wide variety of multimedia systems such
as audio video and images etc.
Mobile and Ubiquitous computing: Mobile computing is the performance of a computing task
while the user is on the move, or visiting places other than their usual environment. Ubiquitous
computing is the harnessing of small, cheap computational devices that are present in the
users’ physical environments including home, office, even natural settings.

//Slide 11 chapter 1

The Book

Distributed systems are undergoing a period of significant change and this can
be traced back to a number of influential trends:
• the emergence of pervasive networking technology;
• the emergence of ubiquitous computing coupled with the desire to
support user mobility in distributed systems;
• the increasing demand for multimedia services;
• the view of distributed systems as a utility.

2. What are the two variants of the interaction model in distributed systems? On
what points do they differ?

//Old Solution

Synchronous distributed systems: The time to execute each step of a process has
known lower and upper bounds.
Asynchronous Distributed systems: The asynchronous model allows no assumptions
about the time intervals involved in any execution.

The Book

Synchronous distributed systems: it is the one in which the following bounds are
defined:

• The time to execute each step of a process has known lower and upper
bounds.
• Each message transmitted over a channel is received within a known bounded
time.
• Each process has a local clock whose drift rate from real time has a known
bound.

Asynchronous distributed systems: An asynchronous distributed system is one in
which there are no bounds on:

• Process execution speeds – for example, one process step may take only a
picosecond and another a century; all that can be said is that each step may take
an arbitrarily long time.
• Message transmission delays – for example, one message from process A to
process B may be delivered in negligible time and another may take several
years. In other words, a message may be received after an arbitrarily long time.
• Clock drift rates – again, the drift rate of a clock is arbitrary.

3. How does adaptive routing ensure the best route of communication between two
points in the network?

//Old Solution

The best route for communication between two points in the network is re-evaluated
periodically, taking into account the current traffic in the network and any faults such
as broken connections or routers.

The Book
The best route for communication between two points in the network is
re-evaluated periodically, taking into account the current traffic in the network
and any faults such as broken connections or routers.

4. What is socket abstraction? Name the main protocols used in interprocess
communication.

//Old Solution

Socket abstraction provides an endpoint for communication between processes.


The main protocols are: TCP and UDP.

//Mnaka
A socket allows communication between two different processes on the same or different
machines. It provides an endpoint for communication over TCP or UDP.

5. Describe the three alternative approaches to external data representation and
marshalling

• Three alternative approaches to external data representation and marshalling
are:
– CORBA’s common data representation (CDR):
• Concerned with an external representation for the structured and
primitive types.
• Can be used by a variety of programming languages.
– Java’s object serialization:
• Concerned with the flattening and external data representation of
any single object or tree of objects.
• Used only by Java.
– XML(Extensible Markup Language)
• Defines textual format for representing data
• Can be used by a variety of languages on a variety of platforms

6. Explain the mechanism of the request-reply (RR) protocol. How does the RR
protocol compensate for not using the acknowledge message?

Request-reply protocols are designed to support client-server communication. The RR protocol
compensates for not using an acknowledgement message in that the server’s reply message is
regarded as an acknowledgement of the client’s request message.

7. Group communication is an important building block for reliable distributed
systems. Identify any three key areas of application.

• The reliable dissemination of information to potentially large numbers of clients, including in the financial
industry, where institutions require accurate and up-to-date access to a wide variety of information sources;
• Support for collaborative applications, where again events must be disseminated to multiple users to preserve
a common user view – for example, in multiuser games.
• Support for system monitoring and management, including for example load balancing strategies.

//CHAPTER 6 PAGE 233 SECTION 6.2

8. Explain why a name space is important for a particular name service. What is
the advantage of a hierarchic name space?

Namespaces provide a mechanism for scoping names


One important advantage of a hierarchy is that it makes large name spaces more manageable

9. What are the two new variants to the invocation semantics used by
asynchronous RMI (Remote Method Invocation)?

• callback, in which a client uses an extra parameter to pass a reference to a callback
with each invocation so that the server can call back with the results;
• polling, in which the server returns a valuetype object that can be used to poll or
wait for the reply.
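
The two variants can be illustrated with a rough sketch. This is not CORBA code: it assumes Python's standard `concurrent.futures` module as a stand-in, with a thread pool playing the server and a hypothetical `remote_add` operation playing the remote method.

```python
from concurrent.futures import ThreadPoolExecutor

def remote_add(a, b):
    """Stand-in for an operation executed by the server (hypothetical)."""
    return a + b

results = []

with ThreadPoolExecutor(max_workers=1) as server:
    # Callback variant: the client hands over a function that is
    # called with the outcome once the invocation has finished.
    cb_future = server.submit(remote_add, 2, 3)
    cb_future.add_done_callback(lambda f: results.append(f.result()))

    # Polling variant: the client keeps the returned object and either
    # polls it with done() or waits for the reply with result().
    poll_future = server.submit(remote_add, 10, 20)
    polled_value = poll_future.result()  # blocks until the reply is available
```

In both cases the client is free to do other work between issuing the invocation and receiving the result; the difference is whether the server side pushes the result or the client pulls it.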

10. How is caching useful in placement strategies?

• A cache is a store of recently used data objects that is closer to one client or a
particular set of clients than the objects themselves

Section B [60 marks]


(Attempt 3 questions of your choice in this section)

1. Scalability is considered one of the issues faced in a distributed system


a) Describe what is meant by a scalable system

A distributed system is described as scalable if it remains effective when there is a
significant increase in the number of resources and the number of users.

b) Describe four challenges presented by scalability

• Controlling the cost of physical resources
  • As demand for resources grows, it should be possible to extend the system
    at reasonable cost to meet the demand
• Controlling performance loss
  • As the number of clients of a resource increases, the performance loss
    should be reasonable
• Preventing software resources from running out
  • E.g. IPv4 addresses, hence the introduction of IPv6
• Avoiding performance bottlenecks
  • Performance bottlenecks arise from centralization (centralized servers,
    data and algorithms)
  • Decentralization is the solution

c) Describe three different techniques that can be applied to achieve scalability in a
distributed system. [3, 8, 9]

Hiding communication latencies


• Addresses geographical transparency
• Avoids waiting for responses from remote machines
Distribution
• It involves splitting a component into smaller parts, and
subsequently spreading those parts across the system.
Replication
• It involves keeping copies of a component across a distributed
system

2. The architecture of a system is its structure in terms of separately specified components
(elements) and their relationship

a) List and discuss the four architectural elements in distributed systems.

Communicating Entities
• The entities that communicate in a distributed system are typically processes, leading to the
prevailing view of a distributed system as processes coupled with appropriate interprocess
communication paradigms.
--From a programming perspective, more problem-oriented abstractions have been proposed:
*Objects, components, and web services

Communication paradigms

There are three types of communication paradigm:

*Interprocess communication

-Refers to the relatively low-level support for communication between processes,
including message-passing primitives, direct access to the API offered by Internet
protocols (socket programming) and support for multicast communication.

*Remote invocation

-Represents the most common communication paradigm in distributed
systems, covering a range of techniques based on a two-way exchange
between communicating entities in a distributed system and resulting in the
calling of a remote operation, procedure, or method.

*Indirect communication

Roles and Responsibilities

-Objects, components and services, including web services, interact with each other to
perform a useful activity, e.g. to support a chat session.

-Two architectural styles stem from the role of the individual processes,
i.e. client-server and peer-to-peer.

Placement
• Deals with how the object/components/web services are mapped on to the underlying
physical distributed infrastructure

• Placement has a bearing on the properties of a DS, which include: performance,
availability, reliability, security, etc

• Placement design strategies

• Mapping services to multiple servers

• Caching

• Mobile code
• Mobile agents

b) List any four architectural patterns in distributed systems

• Layering

• Tiered architecture

• Thin clients

• Proxy pattern

• Brokerage pattern

• Reflection

c) Describe any two architectural patterns mentioned in b) [12, 4, 4]

The proxy pattern

a. It is designed to support location transparency in distributed systems in RPC and
RMI

b. The proxy offers exactly the same interface as the remote object.

The brokerage Pattern

c. The brokerage pattern is an architectural pattern for supporting interoperability in
potentially complex distributed infrastructures

d. The pattern consists of a trio; the service requestor, service provider and the
brokerage

Reflection pattern

e. It is meant to support introspection and intercession

f. Introspection: is the dynamic discovery of properties of the system

g. Intercession: is the ability to dynamically modify structure or behaviour

h. Reflection has been widely used in the field of reflective middleware to support
configurable and reconfigurable middleware architecture
3. Indirect communication is defined as communication between entities in a distributed
system through an intermediary with no direct coupling between the sender and the receiver(s)

a) What are the two inherent characteristics of indirect communication?

• Space uncoupling
• Time uncoupling

b) List and discuss any three key elements of a group communication management.

Failure detection: The service monitors the group members not only in case they should crash, but also in case
they should become unreachable because of a communication failure.

Notifying members of group membership changes: The service notifies the group’s members when a process is
added, or when a process is excluded

Performing group address expansion: When a process multicasts a message, it supplies the group identifier
rather than a list of processes in the group.

c) In a publish-subscribe system, explain how channel-based approaches can trivially be
implemented using a group communication service.

Each channel can be mapped onto a group: subscribers join the group for a given channel and
receive all events published to that channel, and publishing an event amounts to multicasting it
to the group. This is a less optimal strategy because subscribers cannot filter on the attributes
that individual events carry.
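
The mapping from channels to groups can be sketched as follows. The `Group` and `ChannelPubSub` classes are hypothetical names for a toy, in-process illustration and do not correspond to any real group communication toolkit.

```python
class Group:
    """Toy group communication service: join and multicast."""
    def __init__(self):
        self.members = []

    def join(self, deliver):
        self.members.append(deliver)

    def multicast(self, message):
        for deliver in list(self.members):
            deliver(message)


class ChannelPubSub:
    """Channel-based publish-subscribe layered on groups: one group per channel."""
    def __init__(self):
        self.groups = {}

    def subscribe(self, channel, handler):
        # Subscribing to a channel is joining the group for that channel.
        self.groups.setdefault(channel, Group()).join(handler)

    def publish(self, channel, event):
        # Publishing an event is multicasting it to the channel's group.
        if channel in self.groups:
            self.groups[channel].multicast(event)


received = []
bus = ChannelPubSub()
bus.subscribe("stock-quotes", received.append)
bus.publish("stock-quotes", "UZ 12.50")
bus.publish("weather", "sunny")  # no subscribers, so nothing is delivered
```

Every subscriber to a channel receives every event on it; there is no place in this scheme to inspect an event's contents before delivery.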

d) List any two subscription filter models with regard to publish-subscribe systems. [6,9,3,2]

Any two of: channel-based, topic-based (subject-based), content-based and type-based.

4. In order to enforce a security policy, a security mechanism is required

a) Explain what a security policy is:

A security policy is a specification of the security requirements.

b) List and discuss any four security mechanisms

o Encryption
▪ Transforms the data into a form the attacker cannot understand
▪ It also helps to check whether data has been modified
o Authentication
▪ Used to verify the claimed identity of the user, client, server, host or any other
entity
o Authorisation
▪ Checks whether the client is authorised to perform the action requested
o Auditing
▪ Used to trace which clients accessed what and in which way.

c) Suppose you were asked to develop a distributed application that will allow teachers to
set up exams. Give at least three statements that would be part of the security policy for such
an application.

• Students should not be able to access exams before a specific time.


• Any teacher accessing an exam before the actual examination date should be authenticated.
• Also, there may be a restricted group of people that should be given read access to any exam in
preparation, whereas only the responsible teacher should be given full access.
6. "Consider a hypothetical ...": same question as on the 2014 exam, which has an answer.
Examination November 2014

Section A

1. How does adaptive routing ensure the best route of communication between two
points in the network?

The best route for communication between two points in the network is re-evaluated
periodically, taking into account the current traffic in the network and any faults such as
broken connections or routers.

2. What is socket abstraction? Name the main protocols used in Interprocess communication.

Socket abstraction provides an endpoint for communication between processes.


The main protocols are: TCP and UDP.

3. Describe two components of message identifiers in request-reply communication.


• A requestID which is taken from an increasing sequence of integers by the sending
process;
• An identifier for the sender process, for example, its port and Internet address.
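
A minimal sketch of such an identifier, assuming hypothetical `MessageId` and `Sender` classes and made-up example addresses:

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class MessageId:
    request_id: int    # taken from an increasing sequence kept by the sender
    sender_host: str   # Internet address of the sending process
    sender_port: int   # port of the sending process

class Sender:
    def __init__(self, host, port):
        self.host, self.port = host, port
        self._sequence = count(1)  # increasing sequence of integers

    def next_message_id(self):
        return MessageId(next(self._sequence), self.host, self.port)

client = Sender("192.0.2.10", 5000)
first = client.next_message_id()
second = client.next_message_id()
other = Sender("192.0.2.11", 5000).next_message_id()
```

The requestId makes an identifier unique for one sender, and the sender part makes it unique across the system: `first` and `other` share requestId 1 but are still different identifiers.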

4. In Synchronous communication how do the send and receive operation work?


Synchronous: both send and receive are blocking operations:
• Sender blocks until a receive is issued.
• Receiver blocks until a message arrives.
5. All concurrency control protocols are based on serial equivalence and are derived from three rules
on conflicting operations. What are these rules? [6]

• Locks are used to order transactions that access the same objects according to the order of arrival
of their operations at the objects.
• Optimistic concurrency control allows transactions to proceed until they are ready to commit,
whereupon a check is made to see whether they have performed conflicting operations on objects.
• Timestamp ordering uses timestamps to order transactions that access the same objects according
to their starting times.
6. To enforce a security policy, security mechanisms such as authentication, encryption,
authorisation and auditing are required. What is a security policy? [2]

A security policy is a specification of the security requirements.


Section B
11. Consider a hypothetical car hire company which has contracted you to develop a
distributed car hire system.
a) Sketch out a three-tier solution to the provision of their underlying distributed car hire
service.

b) Use your design in (a) to illustrate the benefits and drawbacks of a three-tier solution
considering the following issues:
i. Performance: this approach introduces extra latency in that requests must go from the
web-based interface to the middle tier and then to the database (and back).

ii. Scalability: processing load is also spread over three machines (especially over the middle
tier and the database) and this may help with performance. For this latter reason, the three-
tier solution may scale better.

iii. Dealing with failure: in terms of failure, there is an extra element involved and this
increases the probability of a failure occurring in the system.

iv. Maintaining the software over time: the middle tier only contains application logic
and this should therefore be easier to update and maintain.
c) Name two Technologies that you may use to implement the system.

Client Server, three-tiered architecture

[6,12,2]
12. Scalability problems covered!! ☺.

The design of scalable distributed systems presents the following challenges:

Controlling the cost of physical resources: As the demand for a resource grows, it should
be possible to extend the system at a reasonable cost.
Controlling performance loss: the time taken to access hierarchically structured data is
O(log n), where n is the size of the data. For a system to be scalable, the performance loss
should be no worse than this.

Preventing software resources from running out: example is IP addresses running out in a
large internet. IPv4, hence the introduction of IPv6

Avoiding performance bottlenecks: Performance bottlenecks arise from centralization
(centralized servers, data and algorithms). Decentralization is the solution

[3,8,9]
13. Computer networks are based on the following Principles: packet transmission, Data
streaming, Packet switching, protocol layering, and Routing.
a) What is the use of a switching system?
To transmit information between two arbitrary nodes
b) List the four different types of switching used in computer networks.
BROADCAST, CIRCUIT SWITCHING, PACKET SWITCHING, FRAME RELAY.

c) Describe any three switching schemes mentioned above.


DONE!! ☺☺
• Broadcast: Broadcasting is a transmission technique that involves no switching. Everything is
transmitted to every node, and it is up to potential receivers to notice transmissions
addressed to them.

• Circuit switching: plain old telephone system.

• Packet switching: instead of making and breaking connections to build circuits, a store and
forward network just forwards packets from their source to their destination.

• Frame relay: small packets (called frames) are switched on the fly, without the switching
nodes having to store each frame as a whole before forwarding it.

d) Describe the routing mechanism used in Mobile IP.

Figure 3.20 illustrates the MobileIP routing mechanism.
When an IP packet addressed to the mobile host’s home address is received at the home
network, it is routed to the home agent (HA). The HA then encapsulates the IP packet in a
MobileIP packet and sends it to the foreign agent (FA). The FA unpacks the original IP packet
and delivers it to the mobile host via the local network to which it is currently attached.
[2,4,6,8]
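
The encapsulation step in the Mobile IP answer above can be sketched in a few lines. This is only a toy model: the `IPPacket` class, the function names and all addresses are assumptions for illustration, and no real packet format is implemented.

```python
from dataclasses import dataclass

@dataclass
class IPPacket:
    src: str
    dst: str
    payload: object

HOME_ADDRESS = "198.51.100.7"    # mobile host's home address (example value)
HA_ADDRESS = "198.51.100.1"      # home agent on the home network (example value)
CARE_OF_ADDRESS = "203.0.113.1"  # foreign agent on the visited network (example value)

def home_agent_forward(packet):
    """HA: wrap the original packet in an outer packet addressed to the FA."""
    return IPPacket(src=HA_ADDRESS, dst=CARE_OF_ADDRESS, payload=packet)

def foreign_agent_deliver(tunnelled):
    """FA: unpack the original packet for delivery on the local network."""
    return tunnelled.payload

original = IPPacket(src="192.0.2.99", dst=HOME_ADDRESS, payload="hello")
tunnelled = home_agent_forward(original)
delivered = foreign_agent_deliver(tunnelled)
```

The sender never learns the mobile host's current location: the packet it sent arrives unchanged, still addressed to the home address.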

14. Remote Invocation is concerned with how processes communicate in a distributed system
using the following paradigms: Request-reply protocol, remote procedure call and remote
method invocation.
a) Describe the three paradigms.
Request-reply protocols: Request-reply protocols are effectively a pattern imposed on a
message-passing service to support client-server computing.
Remote procedure calls (RPC): Procedures in processes on remote computers can be called
as if they are procedures in the local address space.
Remote method invocation (RMI): method invocations between objects in different processes,
where client objects may invoke methods of remote objects residing in another process and
running on another host.
b) Describe the two components of message identifiers in request-reply communication.
a requestId, which is taken from an increasing sequence of integers by the sending process.
an identifier for the sender process, for example, its port and Internet address.
c) Explain the mechanism of the request-reply (RR) protocol. How does the RR protocol
compensate for not using the acknowledge message?
DONE!! ☺
Request-reply protocols are designed to support client-server communication. The RR protocol
compensates for not using an acknowledgement message in that the server’s reply message is
regarded as an acknowledgement of the client’s request message.
d) What are the three protocols used for implementing various types of request behaviour? Which
of these protocols can be used when the client requires no confirmation that the operation has
been executed?
• The request (R) protocol;
• The request-reply (RR) protocol;
• The request-reply-acknowledge reply (RRA) protocol.
The request protocol may be used when the client requires no confirmation that the operation
has been executed.

15. Indirect communication avoids direct coupling and hence inherits interesting properties. Two key properties
of this scheme are space and time uncoupling.

a)Describe space and time uncoupling.


Space uncoupling: in which the sender does not know or need to know the identity of the receiver(s), and vice
versa.
Time uncoupling: in which the sender and receiver(s) can have independent lifetimes.
b) Using a table, describe space and time uncoupling in DS. DONE! ☺

c) Explain how the loose coupling inherent in message queues can aid with Enterprise Application Integration.
Loose coupling means that an application does not need to know the intimate details of how
to reach and interface with other applications.

d) Based on your answer in c, consider to what extent this can be traced to time uncoupling, space uncoupling
or a combination of both.
NAME SERVICES

Name Services
Mba IN


Topics

• Introduction
• Name Services and the Domain Name
System
• Directory Service

Introduction
❖ Entities need ways through which they can be accessed
❖ This is done through special entities known as access
points
❖ The name of an access point is known as an address
❖ So an address is a special kind of name that refers only to
access points
❖ It is possible but not convenient to name an entity with the
name of its access point.
❖ An entity may easily change its access point or
❖ An access point may be reassigned to another entity
❖ It is desirable for an entity to be known by a separate name
independent of its access point.
❖ This also helps to support distribution.
❖ An entity can have more than one access point

Introduction
❖ An identifier is a name that uniquely identifies an entity
and is interpreted only by computer programs
❖ Resources are accessed using an identifier or reference
➢ An identifier can be stored in variables and retrieved from
tables quickly.
➢ An identifier includes or can be transformed to an address for an
object.
❖ E.g. CORBA remote object reference.
❖ Properties of a true identifier:
• An identifier refers to at most one entity.
• Each entity is referred to by at most one identifier.
• An identifier always refers to the same entity

Introduction
❖ A name is a human-readable value (usually a string) that
can be resolved to an identifier or address.
❖ These are generally defined entirely by the user.
❖ Examples are: Internet domain name, file pathname.
• /etc/passwd, http://www.cdk5.net
❖ For many purposes, names are preferable to identifiers
➢ The binding of the named resource to a physical location is
deferred and can be changed.
➢ They are more meaningful to users.
▪ Resource names are resolved by name services
➢ To give identifiers and other useful attributes.
Resolving Names to addresses and identifiers
❖ Key Challenge
❖ How do we resolve names to identifiers and addresses
❖ A naming system maintains name-to-address binding which in
its simplest form is just a table of (name, address) pairs
❖ For distributed systems that span large networks a centralised
table will not work
Uniform Resource Identifiers
❖These came about with the need to represent
resources on the web
❖URIs can be grouped into two kinds:
➢ Uniform Resource Locators (URL)
❖These provide location information for a resource
❖Also specify the method used to access a resource
❖E.g. http://uzulu.ac.za
❖They are efficient identifiers
➢ Uniform Resource Names (URN)
❖These are pure resource names rather than locators
❖Lookup is very difficult with URNs

Name Resolution

[Figure 1. Composed naming domains used to access a resource from a URL. A DNS lookup
on the URL http://www.cdk3.net:8888/WebExamples/earth.html yields the resource ID
(IP number, port number, pathname) = (138.37.88.61, 8888, WebExamples/earth.html). An ARP
lookup maps the IP number to the (Ethernet) network address 2:60:8c:2:b0:5a, the port
number identifies the web server's socket, and the pathname identifies the file.]
Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts & Design, Edn. 4, Pearson Education 2005
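
As a small illustration of the first step, the sketch below splits the example URL into the parts that each naming domain resolves. It uses Python's standard `urllib.parse`; the actual DNS and ARP lookups are deliberately left out so that the sketch runs without network access.

```python
from urllib.parse import urlsplit

url = "http://www.cdk3.net:8888/WebExamples/earth.html"
parts = urlsplit(url)

host = parts.hostname  # resolved to an IP number by a DNS lookup
port = parts.port      # identifies the web server's socket on that host
path = parts.path      # resolved to a file by the web server

# A real client would next resolve the host name, for example with
# socket.getaddrinfo(host, port); that step is omitted here on purpose.
```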

Name Services
▪ A name service stores a collection of one
or more naming contexts, sets of bindings
between textual names and attributes for
objects such as computers, services, and
users.

Name spaces

✓ Names are commonly organised into what are called name spaces
✓ Name spaces for structured names can be represented by a
directed graph with two types of nodes
– Leaf Nodes
• Represents a named entity
• Has no outgoing edges
• Stores information about the entity it is representing, e.g. its
address
• It alternatively contains the state of the entity it represents
– Directory Nodes
• Has a number of outgoing edges
• It stores a table in which an outgoing edge is represented as a
pair (edge label, node identifier)

Relative and Absolute Path names


❖ Each path in a naming graph is referred to by a sequence of
labels corresponding to the edges in that path
N:<label-1, label-2, ..., label-n>
❖ Such a sequence is known as a path name;
❖ If the first node in the path is the root node of the naming
graph, it is called the absolute path name. Otherwise it is
known as a relative path;
❖ Names are only defined relative to a directory node
❖ A Global name is a name that denotes the same entity, no
matter where the name is used in the system.
❖ A local name is a name whose interpretation depends on
where the name is being used.
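
A naming graph and path-name resolution can be sketched as below. The class and function names, the labels and the stored address are all assumptions chosen for the example.

```python
class LeafNode:
    """Represents a named entity and stores information about it, e.g. its address."""
    def __init__(self, address):
        self.address = address

class DirectoryNode:
    """Stores a table with one (edge label, node) entry per outgoing edge."""
    def __init__(self):
        self.table = {}

    def bind(self, label, node):
        self.table[label] = node
        return node

def resolve(start, labels):
    """Follow the labels of a path name, one edge per label, starting at node start."""
    node = start
    for label in labels:
        if not isinstance(node, DirectoryNode) or label not in node.table:
            raise KeyError(f"cannot resolve label {label!r}")
        node = node.table[label]
    return node

root = DirectoryNode()
home = root.bind("home", DirectoryNode())
steen = home.bind("steen", DirectoryNode())
steen.bind("mbox", LeafNode("mailbox-42"))

absolute = resolve(root, ["home", "steen", "mbox"])  # absolute path name: starts at the root
relative = resolve(steen, ["mbox"])                  # relative path name: starts elsewhere
```

Both path names lead to the same leaf node; only the starting directory node differs.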

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5

DNS name space

An example partitioning of the DNS name space, including
Internet-accessible files, into three layers.

Implementation of Name Resolution


▪ There are two ways to implement name resolution
▪ Iterative name resolution
▪ Assumes that the address of the root server is
known;
▪ At every step, the contacted name server resolves the path
name as far as it can and returns the result to the client.
▪ Recursive name resolution
▪ Instead of returning each intermediate result to the client, a
name server passes the result to the next name server it finds.
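
The difference between the two techniques can be sketched with a toy simulation. The `NameServer` class, the server names and the address are assumptions; real DNS messages and caching are not modelled.

```python
class NameServer:
    def __init__(self, name):
        self.name = name
        self.children = {}   # label -> NameServer responsible for the next level
        self.addresses = {}  # label -> address, for names resolved at this server

    def lookup(self, label):
        """One resolution step: ('address', value) or ('referral', next server)."""
        if label in self.addresses:
            return "address", self.addresses[label]
        return "referral", self.children[label]

def resolve_iteratively(root, labels, contacted):
    """The client contacts each server in turn and follows the referrals itself."""
    server = root
    for label in labels:
        contacted.append(server.name)
        kind, value = server.lookup(label)
        if kind == "address":
            return value
        server = value
    raise KeyError("name not fully resolved")

def resolve_recursively(server, labels, contacted):
    """Each server hands the rest of the path name to the next server itself."""
    if not labels:
        raise KeyError("name not fully resolved")
    contacted.append(server.name)
    kind, value = server.lookup(labels[0])
    if kind == "address":
        return value
    return resolve_recursively(value, labels[1:], contacted)

root, nl, vu, cs = (NameServer(n) for n in ("root", "nl", "vu", "cs"))
root.children["nl"] = nl
nl.children["vu"] = vu
vu.children["cs"] = cs
cs.addresses["ftp"] = "192.0.2.21"  # made-up address

path = ["nl", "vu", "cs", "ftp"]
iter_log, rec_log = [], []
iter_result = resolve_iteratively(root, path, iter_log)
rec_result = resolve_recursively(root, path, rec_log)
```

The same servers are contacted in both cases. What differs is who does the contacting: in the iterative version the client issues every request, whereas in the recursive version the client issues one request and the servers forward the rest.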


Iterative Name Resolution

The principle of iterative name resolution.

▪ Path name: root:<nl, vu, cs, ftp, pub, globe, index.html>


▪ URL: ftp://ftp.cs.vu.nl/pub/globe/index.html

Recursive Name resolution


▪ DNS offers recursive navigation as an
option, but iterative is the standard
technique.
▪ Recursive navigation must be used in
domains that limit client access to their
DNS information for security reasons.


Recursive Name Resolution

The principle of recursive name resolution.


Recursive Name Resolution


❖Advantages
❖Caching is more effective compared to
iterative name resolution;
❖Reduced communication cost ;
❖Disadvantages
❖It puts higher performance demands on each
name server

Attribute-based Naming
❖There are cases where a user needs only to provide
a description of the resource he or she wants to access;
❖Such cases cannot be covered by flat and structured
names;
❖There are many ways in which descriptions can be
provided, but the most popular is to describe an entity
in terms of (attribute, value) pairs;
❖This is generally referred to as attribute-based naming;
❖Here an entity is seen as a set of (attribute, value)
pairs;
❖It is up to the naming system to return one or more
entities which meet the description
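
A minimal sketch of attribute-based lookup, assuming a made-up in-memory directory and a hypothetical `search` helper:

```python
directory = [
    {"name": "printer-1", "type": "printer", "colour": "yes", "building": "science"},
    {"name": "printer-2", "type": "printer", "colour": "no", "building": "library"},
    {"name": "fileserver", "type": "server", "building": "science"},
]

def search(entries, **description):
    """Return every entry whose attributes include all the given (attribute, value) pairs."""
    return [
        entry for entry in entries
        if all(entry.get(attr) == value for attr, value in description.items())
    ]

colour_printers = search(directory, type="printer", colour="yes")
in_science = search(directory, building="science")
```

Note that the search has to examine every entry, which hints at why lookups by description are more expensive than lookups by structured name.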

Directory Services
❖ Attribute-based naming systems are also known as directory
services.
❖ The attributes are used for searching for entities
❖ The major challenge is definition of the appropriate set of
attributes
❖ To alleviate this challenge research has been conducted on
unifying the ways that resources can be described
❖ One such development is LDAP
❖ Directory services are usually centralised
❖ Distribution usually comes with compromised performance

Hierarchical Implementations: LDAP


❖Common approach to distributed directory services is
to combine structured naming and attribute-based
naming
❖The Lightweight Directory Access Protocol(LDAP) was
developed to enable distribution of directory services
❖An LDAP directory service consists of a number of records, usually
referred to as directory entries;
❖Each record is made up of a collection of (attribute ,
value) pairs, where each attribute has an associated
type.

Hierarchical Implementations: LDAP (1)

A simple example of an LDAP directory entry using LDAP naming
conventions.

Implementing LDAP
▪ Implementation of LDAP directory is similar to that of
DNS except that LDAP supports more lookup
operations;
▪ Searching in LDAP is generally expensive compared
to the DNS;
▪ One notable recent directory that extends the LDAP
idea for web services and grid computing is
Universal Description, Discovery and Integration (UDDI).
Transactions and Concurrency Control
Introduction to Transactions and Concurrency Control

• A transaction defines a sequence of server operations that is
guaranteed to be atomic in the presence of multiple clients
and server crashes.
• All concurrency control protocols are based on serial
equivalence and are derived from rules for conflicting
operations.
▪ Locks are used to order transactions that access the same object
according to request order.
▪ Optimistic concurrency control allows transactions to proceed until
they are ready to commit, whereupon a check is made to see whether they
have performed any conflicting operations on objects.
▪ Timestamp ordering uses timestamps to order transactions that
access the same object according to their starting time.
Banking Example

• Each account is represented by a remote object whose
interface Account provides operations for making deposits
and withdrawals and for enquiring about and setting the
balance.
• Each branch of the bank is represented by a remote object
whose interface Branch provides operations for creating a
new account, for looking up an account by name and for
enquiring about the total funds at that branch.
• Main issue: unless a server is carefully designed, its
operations performed on behalf of different clients may
sometimes interfere with one another. Such interference may
result in incorrect values in the object.
Banking Example interfaces

Operations of the Account interface

deposit(amount)
    deposit amount in the account
withdraw(amount)
    withdraw amount from the account
getBalance() -> amount
    return the balance of the account
setBalance(amount)
    set the balance of the account to amount

Operations of the Branch interface

create(name) -> account
    create a new account with a given name
lookUp(name) -> account
    return a reference to the account with the given name
branchTotal() -> amount
    return the total of all the balances at the branch
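
The two interfaces can be sketched as ordinary classes. This is a local, single-threaded illustration only: the objects are not remote, the method names are Python-style renderings of the operations listed above, and the starting balance of zero is an assumption.

```python
class Account:
    def __init__(self, name, balance=0):
        self.name = name
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        self._balance -= amount

    def get_balance(self):
        return self._balance

    def set_balance(self, amount):
        self._balance = amount


class Branch:
    def __init__(self):
        self._accounts = {}

    def create(self, name):
        account = Account(name)
        self._accounts[name] = account
        return account

    def look_up(self, name):
        return self._accounts[name]

    def branch_total(self):
        return sum(acc.get_balance() for acc in self._accounts.values())


branch = Branch()
a = branch.create("A")
b = branch.create("B")
a.deposit(100)
b.set_balance(200)
b.withdraw(50)
```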
Simple Synchronization without Transactions

• The use of multiple threads is beneficial to performance. However,
multiple threads may access the same objects.
▪ For example, deposit and withdraw methods: the actions of two concurrent executions
of the methods could be interleaved arbitrarily and have strange effects on the instance
variables of the account object.
• The synchronized keyword can be applied to a method in Java, so that only one
thread at a time can execute a synchronized method of an object.
• If one thread invokes a synchronized method on an object, then that
object is locked; another thread that invokes one of its synchronized
methods will be blocked until the lock is released.
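A minimal sketch of the point above, assuming a hypothetical SyncAccount class (the class names and the two-thread harness are illustrative and not part of the Account interface in these notes). Declaring the methods synchronized makes each read-modify-write of the balance run under the object's lock:

```java
// Hypothetical account class: names are illustrative, not from the slides.
class SyncAccount {
    private int balance;

    SyncAccount(int initial) { balance = initial; }

    // Each synchronized method holds the object's monitor for its duration,
    // so the read-modify-write of balance cannot interleave with another call.
    synchronized void deposit(int amount) { balance += amount; }
    synchronized void withdraw(int amount) { balance -= amount; }
    synchronized int getBalance() { return balance; }
}

class SyncAccountDemo {
    // Two threads each deposit 1, n times; returns the final balance.
    static int run(int n) {
        SyncAccount a = new SyncAccount(0);
        Runnable r = () -> { for (int i = 0; i < n; i++) a.deposit(1); };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a.getBalance();
    }
}
```

Without the synchronized keyword on deposit, the two threads could interleave their updates and the final balance could fall short of 2n.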
Enhancing Client Cooperation by Signaling

• We have seen how clients may use a server as a
means of sharing some resources.
▪ E.g. some clients update the server’s objects and other
clients access them.
• However, in some applications, threads need to
communicate and coordinate their actions.
• E.g.
• Producer and Consumer threads.
• Clients sharing a resource
▪ Java provides the wait and notify methods, which allow threads
to coordinate their access to shared objects.
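The producer and consumer case can be sketched as follows. This is an illustration under assumptions: OneSlotBuffer and the demo harness are hypothetical names, and a one-element buffer is chosen only to keep the example short.

```java
// Hypothetical one-slot buffer shared by a producer and a consumer thread.
class OneSlotBuffer {
    private Integer slot = null; // null means the slot is empty

    synchronized void put(int value) throws InterruptedException {
        while (slot != null) wait();   // wait until the consumer empties the slot
        slot = value;
        notifyAll();                   // wake a consumer waiting in take()
    }

    synchronized int take() throws InterruptedException {
        while (slot == null) wait();   // wait until the producer fills the slot
        int v = slot;
        slot = null;
        notifyAll();                   // wake a producer waiting in put()
        return v;
    }
}

class ProducerConsumerDemo {
    // Producer sends 1..n; the calling thread consumes and returns the sum.
    static int sumOfConsumed(int n) {
        OneSlotBuffer buf = new OneSlotBuffer();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) buf.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += buf.take();
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

The wait calls sit inside while loops so that a thread rechecks its condition after every wake-up.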
Transactions

• Transactions originate from database
management systems.
• Clients require a sequence of separate requests to a
server to be atomic in the sense that:
▪ They are free from interference by operations being
performed on behalf of other concurrent clients; and
▪ Either all of the operations must be completed successfully
or they must have no effect at all in the presence of server
crashes.
• Transactions apply to recoverable objects and are
intended to be atomic.
Atomicity

▪ There are two aspects to atomicity


1. All or nothing: a transaction either completes successfully, and
effects of all of its operations are recorded in the object, or it
has no effect at all.
▪ Failure atomicity: effects are atomic even when the server crashes
▪ Durability: after a transaction has completed successfully, all its effects are saved
in permanent storage for later recovery.
2. Isolation: each transaction must be performed without
interference from other transactions. The intermediate effects
of a transaction must not be visible to other transactions.
Transactions: ACIDS Properties
• Atomic: All or nothing. No intermediate states are
visible.
• Consistent: system invariants preserved, e.g., if there
were n dollars in a bank before a transfer transaction
then there will be n dollars in the bank after the transfer.
• Isolated: Two transactions do not interfere with each
other.
• Durable: The commit causes a permanent change.
• Serialisability: the results of concurrent transactions
should appear as if they were executed one after the
other
Example: A client’s banking transaction
Assume each operation is synchronized.

Transaction T:
a.withdraw(100);
b.deposit(100);
c.withdraw(200);
b.deposit(200);

The aim of any server that supports transactions is to maximize concurrency.
So, transactions are allowed to execute concurrently if they would have the same
effect as serial execution.

Each transaction is created and managed by a coordinator.
Operations in Coordinator interface

openTransaction() -> trans;


starts a new transaction and delivers a unique TID trans. This
identifier will be used in the other operations in the transaction.

closeTransaction(trans) -> (commit, abort);


ends a transaction: a commit return value indicates that the
transaction has committed; an abort return value indicates that
it has aborted.

abortTransaction(trans);
aborts the transaction.
Example: A client’s banking transaction

Transaction T:
tid = openTransaction();
a.withdraw(tid, 100);
b.deposit(tid, 100);
c.withdraw(tid, 200);
b.deposit(tid, 200);
closeTransaction(tid) or abortTransaction(tid)

Coordinator interface:
openTransaction() -> transID
closeTransaction(transID) -> commit or abort
abortTransaction(transID)
Transaction life histories

Successful           Aborted by client     Aborted by server

openTransaction      openTransaction       openTransaction
operation            operation             operation
operation            operation             operation
                                           (server aborts transaction)
operation            operation             operation: ERROR reported to client
closeTransaction     abortTransaction

If a transaction aborts for any reason (self abort or server abort), it must be guaranteed
that future transactions will not see its effects, either in the objects or in their copies in
permanent storage.
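One way to obtain this guarantee is to keep each transaction's updates in a private tentative set until it commits. The sketch below is an illustration under assumptions: TinyCoordinator, its string account names, and the tentative-version approach are hypothetical choices, and it is single-threaded with no concurrency control at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory coordinator: updates are tentative until commit.
class TinyCoordinator {
    private final Map<String, Integer> committed = new HashMap<>();
    private final Map<Integer, Map<String, Integer>> tentative = new HashMap<>();
    private int nextTid = 1;

    TinyCoordinator(Map<String, Integer> initialBalances) {
        committed.putAll(initialBalances);
    }

    int openTransaction() {
        int tid = nextTid++;
        tentative.put(tid, new HashMap<>());
        return tid;
    }

    // A transaction sees its own tentative writes, otherwise committed state.
    // Assumes the account exists.
    int getBalance(int tid, String account) {
        Map<String, Integer> mine = tentative.get(tid);
        return mine.containsKey(account) ? mine.get(account) : committed.get(account);
    }

    void deposit(int tid, String account, int amount) {
        tentative.get(tid).put(account, getBalance(tid, account) + amount);
    }

    void withdraw(int tid, String account, int amount) {
        deposit(tid, account, -amount);
    }

    // Commit: all tentative values become visible together.
    void closeTransaction(int tid) { committed.putAll(tentative.remove(tid)); }

    // Abort: tentative values are discarded, so no effect is ever visible.
    void abortTransaction(int tid) { tentative.remove(tid); }

    int committedBalance(String account) { return committed.get(account); }
}
```

Because an aborted transaction never touches the committed map, later transactions cannot observe any of its effects.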
Concurrency Control: the lost update problem

Transaction T:                        Transaction U:
balance = b.getBalance();             balance = b.getBalance();
b.setBalance(balance*1.1);            b.setBalance(balance*1.1);
a.withdraw(balance/10)                c.withdraw(balance/10)

balance = b.getBalance();     $200
                                      balance = b.getBalance();     $200
                                      b.setBalance(balance*1.1);    $220
b.setBalance(balance*1.1);    $220
a.withdraw(balance/10)        $80
                                      c.withdraw(balance/10)        $280
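The arithmetic behind the lost update can be replayed in a few lines. This is only a sketch: LostUpdateDemo is a hypothetical name, the two "transactions" are simulated by straight-line code rather than threads, and integer arithmetic (balance*11/10) stands in for balance*1.1.

```java
class LostUpdateDemo {
    // Interleaving from the slide: both T and U read b before either writes.
    static int interleaved(int b) {
        int balanceT = b;            // T: balance = b.getBalance()
        int balanceU = b;            // U: balance = b.getBalance()
        b = balanceU * 11 / 10;      // U: b.setBalance(balance*1.1)
        b = balanceT * 11 / 10;      // T: overwrites U's update
        return b;
    }

    // Serial execution: T completes before U starts.
    static int serial(int b) {
        int balanceT = b;
        b = balanceT * 11 / 10;
        int balanceU = b;            // U reads the value T wrote
        b = balanceU * 11 / 10;
        return b;
    }
}
```

Starting from $200, the interleaved run leaves b at $220, whereas either serial order gives $242: one of the two 10% increases has been lost.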
Concurrency Control: The inconsistent retrievals problem

Transaction V:                        Transaction W:
a.withdraw(100)                       aBranch.branchTotal()
b.deposit(100)

a.withdraw(100);              $100
                                      total = a.getBalance()            $100
                                      total = total+b.getBalance()      $300
                                      total = total+c.getBalance()
b.deposit(100)                $300

Accounts a and b both start with a balance of $200.


Serial equivalence

• If these transactions are done one at a time in some
order, then the final result will be correct.
• If we do not want to sacrifice the concurrency, an
interleaving of the operations of transactions may
lead to the same effect as if the transactions had
been performed one at a time in some order.
• We say it is a serially equivalent interleaving.
• The use of serial equivalence is a criterion for
correct concurrent execution to prevent lost updates,
and inconsistent retrievals.
A serially equivalent interleaving of T and U

Transaction T:                        Transaction U:
balance = b.getBalance()              balance = b.getBalance()
b.setBalance(balance*1.1)             b.setBalance(balance*1.1)
a.withdraw(balance/10)                c.withdraw(balance/10)

balance = b.getBalance()      $200
b.setBalance(balance*1.1)     $220
                                      balance = b.getBalance()      $220
                                      b.setBalance(balance*1.1)     $242
a.withdraw(balance/10)        $80
                                      c.withdraw(balance/10)        $278
Conflicting Operations

• When we say a pair of operations conflicts we mean
that their combined effect depends on the order in
which they are executed. E.g. read and write.
• For two transactions to be serially equivalent, it is necessary and sufficient that all
pairs of conflicting operations of the two transactions be executed in the same order
at all of the objects they both access.

• Three ways to ensure serializability:


▪ Locking
▪ Optimistic concurrency control
▪ Timestamp ordering
Read and write operation conflict rules

Operations of different    Conflict   Reason
transactions
read     read              No         Because the effect of a pair of read operations
                                      does not depend on the order in which they are
                                      executed
read     write             Yes        Because the effect of a read and a write operation
                                      depends on the order of their execution
write    write             Yes        Because the effect of a pair of write operations
                                      depends on the order of their execution
A non-serially equivalent interleaving of operations of transactions T and U

Transaction T:             Transaction U:

x = read(i)
write(i, 10)
                           y = read(j)
                           write(j, 30)
write(j, 20)
                           z = read(i)

• T’s and U’s accesses to objects i and j are each serialised with respect to each
other, but…
• The ordering in the above figure is not serially equivalent.
• Serial equivalence requires that either:
1. T accesses i before U and T accesses j before U, OR
2. U accesses i before T and U accesses j before T.
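The rule can be checked mechanically for a schedule of two transactions. The sketch below is an illustration under assumptions: the Op encoding and the class names are hypothetical, and the checker only handles the two-transaction case discussed on these slides.

```java
import java.util.ArrayList;
import java.util.List;

// One read or write by a transaction on a named object (hypothetical encoding).
class Op {
    final char txn;
    final boolean isWrite;
    final String obj;

    Op(char txn, boolean isWrite, String obj) {
        this.txn = txn;
        this.isWrite = isWrite;
        this.obj = obj;
    }
}

class SerialEquivalence {
    // Two operations conflict if they come from different transactions,
    // access the same object, and at least one of them is a write.
    static boolean conflicts(Op p, Op q) {
        return p.txn != q.txn && p.obj.equals(q.obj) && (p.isWrite || q.isWrite);
    }

    // For a schedule of two transactions: true if every conflicting pair
    // is executed in the same transaction order.
    static boolean seriallyEquivalent(List<Op> schedule) {
        char first = 0; // transaction that came first in the conflicts seen so far
        for (int i = 0; i < schedule.size(); i++) {
            for (int j = i + 1; j < schedule.size(); j++) {
                Op p = schedule.get(i), q = schedule.get(j);
                if (!conflicts(p, q)) continue;
                if (first == 0) first = p.txn;
                else if (first != p.txn) return false;
            }
        }
        return true;
    }

    // The interleaving of T and U shown on the slide.
    static List<Op> slideSchedule() {
        List<Op> s = new ArrayList<>();
        s.add(new Op('T', false, "i")); // x = read(i)
        s.add(new Op('T', true, "i"));  // write(i, 10)
        s.add(new Op('U', false, "j")); // y = read(j)
        s.add(new Op('U', true, "j"));  // write(j, 30)
        s.add(new Op('T', true, "j"));  // write(j, 20)
        s.add(new Op('U', false, "i")); // z = read(i)
        return s;
    }
}
```

For the slide's schedule the check fails: T comes first at i (write before U's read), but U comes first at j.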
Recoverability From Aborts

• Servers must record all the effects of committed
transactions and none of the effects of aborted transactions
• They must therefore allow transactions to abort without
affecting concurrent transactions.
• Two problems are associated with aborting
transactions
• Dirty reads (of uncommitted data)
• Premature writes
A dirty read when transaction T aborts

Transaction T:                        Transaction U:
a.getBalance()                        a.getBalance()
a.setBalance(balance + 10)            a.setBalance(balance + 20)

balance = a.getBalance()      $100
a.setBalance(balance + 10)    $110
                                      balance = a.getBalance()      $110
                                      a.setBalance(balance + 20)    $130
                                      commit transaction
abort transaction

A dirty read is caused by a read in one transaction (U) of a value written earlier by
another transaction (T) that subsequently aborts.
When T is rolled back, the original value of a is restored, so U has seen a
value that never existed. U has already committed, so it cannot be undone. U has
performed a dirty read.
Premature Write: Overwriting uncommitted values

Transaction T:                Transaction U:
a.setBalance(105)             a.setBalance(110)

                      $100
a.setBalance(105)     $105
                              a.setBalance(110)     $110

Premature writes concern the interaction between write operations on the same
object belonging to different transactions.
Some systems implement abort by restoring “before images”, namely the value of
the object before the transaction’s write. Here $100 is the before image of T’s
write and $105 is the before image of U’s write.
a. If U aborts and then T commits, a is restored to $105, which is correct.
b. If U commits and then T aborts, a is restored to $100, which is wrong; it should be $110.
c. Similarly, if T aborts and then U aborts, a is restored to $105, which is wrong; it should be $100.
So to ensure correctness, write operations must be delayed until earlier transactions
that updated the same object have either committed or aborted.
Nested transactions
T : top-level transaction (commit)
    T1 = openSubTransaction (prov. commit)
        T11 = openSubTransaction (prov. commit)
        T12 = openSubTransaction (prov. commit)
    T2 = openSubTransaction (abort)
        T21 = openSubTransaction (prov. commit)
            T211 = openSubTransaction (prov. commit)

• Nested transactions allow transactions to be composed of other transactions, i.e. several
transactions may be started within a transaction
• The outermost transaction is known as the top-level transaction
• Transactions other than the top-level transaction are known as subtransactions.
• Subtransactions appear atomic to their parents
• Subtransactions at the same level can be executed concurrently
Locks

• A simple example of a serializing mechanism is the use of
exclusive locks.
• The server can lock any object that is about to be used by a
client.
• If another client wants to access the same object, it has to
wait until the object is unlocked.
Transactions T and U with exclusive locks
Transaction T:                            Transaction U:
bal = b.getBalance()                      bal = b.getBalance()
b.setBalance(bal*1.1)                     b.setBalance(bal*1.1)
a.withdraw(bal/10)                        c.withdraw(bal/10)

Operations               Locks            Operations               Locks
openTransaction
bal = b.getBalance()     lock B
b.setBalance(bal*1.1)                     openTransaction
a.withdraw(bal/10)       lock A           bal = b.getBalance()     waits for T’s lock on B
closeTransaction         unlock A, B
                                                                   lock B
                                          b.setBalance(bal*1.1)
                                          c.withdraw(bal/10)       lock C
                                          closeTransaction         unlock B, C
Lock compatibility
For one object                  Lock requested
                                read     write
Lock already set     none       OK       OK
                     read       OK       wait
                     write      wait     wait

An object can be read and written. From the compatibility table, we know that pairs of
read operations from different transactions do not conflict. So a simple exclusive
lock used for both read and write reduces concurrency more than necessary.
(Many readers/Single writer)
Rules:
1. If T has already performed a read operation, then a concurrent transaction U
must not write until T commits or aborts.
2. If T has already performed a write operation, then a concurrent transaction U must not read or
write until T commits or aborts.
Shared Lock and Exclusive lock
• Locks must be obtained before read/write can begin
• If a transaction wants to read and write the same object, it can either
▪ Obtain an X-lock before reading and unlock it immediately afterwards
▪ Obtain an S-lock before reading, then obtain an X-lock before writing, and
unlock it immediately afterwards.
▪ Consider the following examples

T1 (Transfer)              T2 (Dividend)
1. A1 <- Read(X)           1. A1 <- Read(X)
2. A1 <- A1 – k            2. A1 <- A1 * 1.01
3. Write(X, A1)            3. Write(X, A1)
4. A2 <- Read(Y)           4. A2 <- Read(Y)
5. A2 <- A2 + k            5. A2 <- A2 * 1.01
6. Write(Y, A2)            6. Write(Y, A2)
Lock-based protocol – example: Example of schedule with locks

T1                         T2
1. S-lock(X)
2. A1 <- Read(X)
                           1. S-lock(X)          (no wait: S-locks are shared)
3. Unlock(X)
                           2. A1 <- Read(X)
4. A1 <- A1 – k
5. X-lock(X)                                     (T1 waits)
                           3. Unlock(X)          (T1 can go ahead)
                           4. A1 <- A1 * 1.01
                           5. X-lock(X)          (T2 waits)
6. Write(X, A1)
7. Unlock(X)                                     (T2 can go ahead)
8. ….
Lock based protocols -- questions

• Does having locks this way guarantee conflict
serializability?
• Are there any other requirements on the order/manner
of acquiring/releasing locks?
• Does it matter when to acquire locks?
• Does it matter when to release locks?
Lock based protocol – need for a protocol
T1 (Transfer)                    T2 (Dividend)
S-lock(X)
1. A1 <- Read(X)
Unlock(X)
2. A1 <- A1 – k
X-lock(X)
3. Write(X, A1)
Unlock(X)
                                 S-lock(X)
                                 1. A1 <- Read(X)
                                 Unlock(X)
                                 2. A1 <- A1 * 1.01
                                 X-lock(X)
                                 3. Write(X, A1)
                                 Unlock(X)
                                 S-lock(Y)
                                 4. A2 <- Read(Y)
                                 Unlock(Y)
                                 5. A2 <- A2 * 1.01
                                 X-lock(Y)
                                 6. Write(Y, A2)
                                 Unlock(Y)
S-lock(Y)
4. A2 <- Read(Y)
Unlock(Y)
5. A2 <- A2 + k
X-lock(Y)
6. Write(Y, A2)
Unlock(Y)

Not conflict serializable.
X: 100 -> 50 -> 50.5; Y: 200 -> 202 -> 252; X+Y = 302.5, not 303
Two-phase locking -- motivation

• What is the problem?
• When a transaction releases a lock on an object, other
transactions can obtain a lock on it.

T1                    T2
X-lock(X)
Write(X, 100)
Unlock(X)
                      S-lock(X)
                      Read(X)
                      ……

• In this case, there is a conflict (dependency) from T1 to T2
• To ensure serializability, we must ensure there is no conflict
from T2 back to T1
• How?
Two-phase locking -- motivation

• Ensure that T1 does not read/write anything that T2
reads/writes.
▪ Unrealistic to check in real life
• What is a sufficient condition then?
• Ensure T1 does not lock (and hence read/write) anything new after
releasing a lock!
• → (basic) Two-phase locking
Two phase locking – definition

• The basic two-phase locking (2PL) protocol


▪ A transaction T must hold a lock on an item x in the appropriate
mode before T accesses x.
▪ If a conflicting lock on x is being held by another transaction, T
waits.
▪ Once T releases a lock, it cannot obtain any other lock
subsequently.
• Note: a transaction is divided into two phases:
▪ A growing phase (obtaining locks)
▪ A shrinking phase (releasing locks)
• Claim : 2PL ensures conflict serializability
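The third rule (no lock may be obtained after any lock has been released) is easy to check for a single transaction's sequence of lock operations. A minimal sketch, assuming a hypothetical string encoding of events ("lock:X", "unlock:X") chosen only for this illustration:

```java
import java.util.List;

class TwoPhaseCheck {
    // Returns true if the transaction never acquires a lock after its
    // first release, i.e. a growing phase followed by a shrinking phase.
    static boolean obeysTwoPhase(List<String> events) {
        boolean shrinking = false;
        for (String e : events) {
            if (e.startsWith("unlock:")) {
                shrinking = true;              // the shrinking phase has begun
            } else if (e.startsWith("lock:") && shrinking) {
                return false;                  // lock requested after a release
            }
        }
        return true;
    }
}
```

A transaction that locks X, unlocks X and then locks Y fails the check, while one that takes all its locks before its first unlock passes.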
Two phase locking – Serializability

• Lock-point: the point at which the transaction obtains
its last lock, and so holds all the locks it needs
• With 2PL, a schedule is conflict equivalent to a serial
schedule ordered by the lock-points of the
transactions
2-phase locking -- example
T1                         T2
1. S-lock(X)
2. A1 <- Read(X)
3. A1 <- A1 – k
4. X-lock(X)
                           1. S-lock(X)          (T2 waits)
5. Write(X, A1)
6. S-lock(Y)
7. A2 <- Read(Y)
8. A2 <- A2 + k
9. X-lock(Y)                                     (lock point for T1)
10. Write(Y, A2)
11. Unlock(X)
                           1. S-lock(X)          (granted)
                           2. A1 <- Read(X)
                           3. A1 <- A1 * 1.01
                           4. X-lock(X)
                           5. Write(X, A1)
                           6. S-lock(Y)          (T2 waits)
12. Unlock(Y)
                           6. S-lock(Y)          (granted)
                           7. A2 <- Read(Y)
                           8. A2 <- A2 * 1.01
                           9. X-lock(Y)          (lock point for T2)
                           10. Write(Y, A2)
                           11. Unlock(Y)
                           12. Unlock(X)
Recovery problem with 2PL

T1                         T2
1. X-lock(X)
2. A1 <- Read(X)
3. A1 <- A1 * 10
4. Write(X, A1)
5. Unlock(X)
                           1. X-lock(X)
                           2. A1 <- Read(X)
                           3. A1 <- A1 + 1
                           4. Write(X, A1)
                           5. Unlock(X)
                           6. Commit
6. Abort!

• Dirty read problem!
• There is a gap between releasing locks and the decision to commit/abort
• Other transactions can still access data written by an uncommitted transaction
Strict two-phase Locking Protocol

• Because transactions may abort, strict executions are needed
to prevent dirty reads and premature writes, which are
caused by a read or write of an object that was written by an
earlier transaction that has not yet committed and may still abort.
• To prevent these problems, a transaction that needs to read
or write an object must be delayed until other transactions
that wrote the same object have committed or aborted.
• Rule:
▪ Any locks applied during the progress of a transaction are held until
the transaction commits or aborts.
Use of locks in strict two-phase locking

1. When an operation accesses an object within a transaction:


(a) If the object is not already locked, it is locked and the operation proceeds.
(b) If the object has a conflicting lock set by another transaction, the
transaction must wait until it is unlocked.
(c) If the object has a non-conflicting lock set by another transaction, the
lock is shared and the operation proceeds.
(d) If the object has already been locked in the same transaction, the lock will
be promoted if necessary and the operation proceeds. (Where promotion
is prevented by a conflicting lock, rule (b) is used.)
2. When a transaction is committed or aborted, the server unlocks all objects it
locked for the transaction.

A transaction with a read lock that is shared by other transactions cannot
promote its read lock to a write lock, because the write lock would conflict with the
other transactions’ read locks.
Lock class
import java.util.Vector;

public class Lock {
    private Object object;                // the object being protected by the lock
    private Vector<TransID> holders;      // the TIDs of current holders
    private LockType lockType;            // the current type

    public synchronized void acquire(TransID trans, LockType aLockType) {
        while (/* another transaction holds the lock in conflicting mode */) {
            try {
                wait();
            } catch (InterruptedException e) { /*...*/ }
        }
        if (holders.isEmpty()) {          // no TIDs hold lock
            holders.addElement(trans);
            lockType = aLockType;
        } else if (/* another transaction holds the lock, share it */) {
            if (/* this transaction not a holder */) holders.addElement(trans);
        } else if (/* this transaction is a holder but needs a more exclusive lock */) {
            lockType.promote();
        }
    }

    public synchronized void release(TransID trans) {
        holders.removeElement(trans);     // remove this holder
        // set locktype to none
        notifyAll();
    }
}
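The skeleton above leaves its conditions as comments. The following is one possible runnable variant, offered as a sketch under assumptions rather than as the reference implementation: RwLock is a hypothetical name, transaction identifiers are plain strings, the lock type is a two-value enum, and wouldBlock is a helper added only so that the conflict rule can be inspected without blocking.

```java
import java.util.HashSet;
import java.util.Set;

class RwLock {
    enum LockType { READ, WRITE }

    private final Set<String> holders = new HashSet<>(); // TIDs of current holders
    private LockType lockType = null;                    // null means unlocked

    // True if some other transaction holds the lock in a mode that conflicts
    // with the request (read/read is the only compatible combination).
    private boolean conflicts(String trans, LockType wanted) {
        boolean othersHold = holders.size() > (holders.contains(trans) ? 1 : 0);
        return othersHold && (wanted == LockType.WRITE || lockType == LockType.WRITE);
    }

    synchronized void acquire(String trans, LockType wanted) {
        while (conflicts(trans, wanted)) {
            try {
                wait();                                  // released by notifyAll()
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        holders.add(trans);                              // join or share the lock
        if (lockType == null || wanted == LockType.WRITE) {
            lockType = wanted;                           // set, or promote to write
        }
    }

    synchronized void release(String trans) {
        holders.remove(trans);
        if (holders.isEmpty()) lockType = null;          // set lock type to none
        notifyAll();
    }

    synchronized boolean wouldBlock(String trans, LockType wanted) {
        return conflicts(trans, wanted);
    }

    synchronized int holderCount() { return holders.size(); }
    synchronized LockType type() { return lockType; }
}
```

With T and U both holding a read lock, a request by T to promote to a write lock would block, matching the remark on promotion above; once U releases, the promotion can proceed.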
Deadlock with write locks

Transaction T                          Transaction U

Operations         Locks               Operations         Locks

a.deposit(100);    write lock A
                                       b.deposit(200)     write lock B
b.withdraw(100)    waits for U’s
                   lock on B
                                       a.withdraw(200);   waits for T’s
                                                          lock on A
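The wait-for graph

[Figure: wait-for graph for the deadlock above. Object A is held by T and waited for by U; object B is held by U and waited for by T. Equivalently, T waits for U and U waits for T.]

A cycle in a wait-for graph

[Figure: transactions, including V, waiting for one another in a cycle.]

Another wait-for graph

[Figure: wait-for graph involving transactions T, U, V and W and objects B and C, with "held by" and "waits for" edges.]

T and W then request write locks on object C and a deadlock arises. V is
involved in two cycles.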
Deadlock Prevention
• Deadlock prevention:
1. A simple way is to lock all of the objects used by a
transaction when it starts. This should be done as an atomic
action to prevent deadlock.
▪ This is problematic
▪ It is inefficient to lock an object you only need for a short period of time.
▪ It is hard to predict which objects a transaction will require.
2. Impose a fixed order on the objects. Transactions must
acquire their locks in this order.
▪ This can result in premature locking and a reduction in concurrency.
Deadlock Detection

• Deadlock may be detected by finding cycles in
the wait-for graph. Having detected a deadlock,
a transaction must be selected for abortion to
break the cycle.
▪ When the lock manager blocks a request, an edge is added to the graph.
The graph is checked for a cycle each time a new edge is added.
▪ If a cycle is found, one transaction is selected to abort.
The age of a transaction and the number of cycles it is involved in are used
when selecting a victim
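The edge-by-edge cycle check can be sketched as follows. This is an illustration under assumptions: WaitForGraph is a hypothetical class, transactions are identified by strings, and victim selection is left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WaitForGraph {
    // waiter -> transactions whose locks it is waiting for
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    // Records that `waiter` is blocked by a lock held by `holder`.
    // Returns true if the new edge closes a cycle, i.e. a deadlock.
    boolean addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
        return reaches(holder, waiter, new HashSet<>());
    }

    // Depth-first search: can `target` be reached from `from`?
    private boolean reaches(String from, String target, Set<String> seen) {
        if (from.equals(target)) return true;
        if (!seen.add(from)) return false;       // already explored
        for (String next : waitsFor.getOrDefault(from, Collections.<String>emptySet())) {
            if (reaches(next, target, seen)) return true;
        }
        return false;
    }

    // Called when a transaction commits or aborts and stops waiting.
    void remove(String txn) {
        waitsFor.remove(txn);
        for (Set<String> targets : waitsFor.values()) targets.remove(txn);
    }
}
```

In the deadlock shown earlier, adding the edge "T waits for U" reports no cycle, and adding "U waits for T" reports one.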
Resolution of Deadlock

▪ Timeouts are commonly used to resolve deadlock.
Each lock is given a limited period in which it is
invulnerable. After this time, a lock becomes
vulnerable.
▪ If no other transaction is competing for the object,
an object with a vulnerable lock remains locked. However, if
another transaction is waiting, the lock is broken.
▪ Disadvantages:
▪ A transaction may be aborted simply because its lock timed out while another
transaction was waiting, even if there is no deadlock.
▪ It is hard to choose the timeout period
Resolution of the deadlock

Transaction T                          Transaction U
Operations         Locks               Operations         Locks

a.deposit(100);    write lock A
                                       b.deposit(200)     write lock B
b.withdraw(100)    waits for U’s
                   lock on B
                                       a.withdraw(200);   waits for T’s
                                                          lock on A
                   (timeout elapses)
                   T’s lock on A becomes vulnerable,
                   unlock A, abort T
                                       a.withdraw(200);   write locks A
                                                          unlock A, B
Problems with the use of locks

• Kung and Robinson [1981] identified a number of inherent
disadvantages of locking and proposed an alternative optimistic
approach to the serialization of transactions that avoids these
drawbacks. Disadvantages of the lock-based approach:
▪ Lock maintenance represents an overhead that is not present in
systems that do not support concurrent access to shared data. Locking
is sometimes needed only for cases that occur with low probability.
▪ The use of locks can result in deadlock. Deadlock prevention reduces
concurrency severely. The use of timeouts and deadlock detection is not
ideal for interactive programs.
▪ To avoid cascading aborts, locks cannot be released until the end of the
transaction. This may reduce the potential for concurrency.
▪ This has led to the introduction of Optimistic Concurrency
Control and Timestamping.
Distributed System Principles
Naming

Naming
• Names are associated with entities (files,
computers, Web pages, remote and local services,
disks, printers, objects, etc.)
– Entities (1) have a location and (2) can be
operated on.
• Name Resolution: the process of
associating a name with the entity/object it
represents.
– Naming systems prescribe the rules for doing
this.
Names
• Types of names
– Addresses
– Identifiers
– Human friendly
• Representation of names
– Human friendly format
– Machine readable – generally random bit
strings

Addresses as Names
• To operate on an entity in a distributed
system, we need an access point.
• Access points are physical entities
named by an address.
– Compare to telephones, mailboxes
• Objects may have multiple access
points
– Replicated servers represent a logical
entity (the service) but have many access
points (the various machines hosting the
service)
Addresses as Names
• Entities may change access points over time
– A server moves to a different host machine, with
a different address, but is still the same service.
• New entities may take over the vacated
access point and its address.
• Better: a location-independent name for an
entity E
– should be independent of the addresses of the
access points offered by E.

Identifiers as Names
• Identifiers are names that are unique and
location independent.
• Properties of identifiers:
– An identifier refers to at most one entity
– Each entity has at most one identifier
– An identifier always refers to the same entity; it is
never reused.
• Human comparison?
• An entity’s address may change, but its identifier
cannot change.

Human-Friendly Names
• Human-friendly names are designed to be
used by humans instead of a computer
• They usually contain contextual
information; e.g., file names or DNS
names.
• Do not usually contain information that is
useful to a computer

Representation
• Addresses and identifiers are usually
represented as bit strings (a pure name)
rather than in human readable form.
– Unstructured or flat names.
• Human-friendly names are more likely to
be character strings (have semantics)

Name Resolution
• The central naming issue: how can other
forms of names (human-friendly,
identifiers) be resolved to addresses?
• Naming systems maintain name-to-
address bindings
• In a distributed system a centralized
directory of name-address pairs is not
practical.

Naming Systems
• Flat Naming
– Unstructured; e.g., a random bit string
– Resolves identifiers to addresses
• Structured Naming
– Human-readable, consist of parts; e.g., file names or
Internet host naming
– Resolves structured human-friendly names to
addresses
• Attribute-Based Naming
– An exception to the rule that named objects must be
unique
– Entities have attributes; request an object by
specifying the attribute values of interest.
– Resolves descriptive names to addresses
Flat Naming
• Addresses and identifiers are usually pure
names (bit strings – often random)
• Identifiers are location independent:
– Do not contain any information about how to locate
the associated entity.
• Addresses are not location independent.
• In a small LAN name resolution can be simple.
– Broadcast or multicast to all stations in the network.
– Each receiver must “listen” to network transmissions
– Not scalable

Flat Naming
• Simple Solutions
– Broadcasting
– Forwarding pointers
• Home-based Solutions
• Hierarchical Solutions
• Distributed Hash Tables
Broadcasting (I)
Internet ARP: Network IP addresses → data-link MAC addresses

[Figure: a host that needs the address of Entity(A) broadcasts a request; the other hosts ignore the request, and Entity(A) replies with its address.]
Broadcasting (II)
• Broadcasting is not suitable for larger
networks
– Bandwidth is wasted
– Hosts are interrupted for no reason
Flat Names – Resolution in WANs
• Simple solutions for mobile entities
– Chained forwarding pointers
• Directory locates initial position; follow chain of
pointers left behind at each host as the server
moves
• Broken links
– Home-based approaches
• Each entity has a home base; as it moves, update
its location with its home base.
• Permanent moves?
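The forwarding-pointer idea, and its weakness when a link in the chain is lost, can be sketched as below. This is a toy model under assumptions: ForwardingPointers is a hypothetical class, hosts are named by strings, and a single entity is tracked.

```java
import java.util.HashMap;
import java.util.Map;

class ForwardingPointers {
    // host -> host the entity moved to when it last left that host
    private final Map<String, String> pointer = new HashMap<>();
    private String current; // host where the entity resides now

    ForwardingPointers(String initialHost) { current = initialHost; }

    // The entity moves and leaves a forwarding pointer behind.
    void move(String newHost) {
        pointer.put(current, newHost);
        current = newHost;
    }

    // Simulates a host failure: its forwarding pointer is lost.
    void crash(String host) { pointer.remove(host); }

    // Follows the chain from startHost; returns the number of hops needed
    // to reach the entity, or -1 if the chain is broken.
    int hopsFrom(String startHost) {
        int hops = 0;
        String host = startHost;
        while (!host.equals(current)) {
            host = pointer.get(host);
            if (host == null) return -1; // broken link
            hops++;
        }
        return hops;
    }
}
```

Each move lengthens the chain by one hop, and losing any single pointer makes the entity unreachable from every host earlier in the chain.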

Useful for contacting mobile hosts
Comparison
• Broadcasting:
– Scalability problems
– Efficiency problems in large scale systems

• Forwarding Pointers:
– Geographical scalability problems
• Long chains: performance problem
• Prone to failure
Home-Based Approaches

• Home-location: popular for supporting
mobile entities in large-scale networks
• Keeps track of the current location
Mobile IP
• Assign a fixed IP (home location) to a mobile
host
• Contact host through home location
Hierarchical Approaches

Hierarchical organization of a location service into
domains, each having an associated directory
node.
Hierarchical Approaches –
Lookup Operation

Looking up a location in a hierarchically organized location service.


Hierarchical Approaches –
Update Operation

a) An insert request is forwarded to the first node
that knows about entity E.
b) A chain of forwarding pointers to the leaf node is
created.
Hierarchical Approaches –
Replicating Entities

An example of storing information of an entity
having two addresses in different leaf
domains.
Hierarchical Approaches –
Delete Operation

• Delete a replica of entity E?


Hierarchical Approaches –
Delete Operation
• Delete Replica R of Entity E from
domain D
• Delete pointer from dir(D) to R
• If location record of E at dir(D) is empty,
delete record
• Apply recursively, going up the tree
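The recursive delete can be sketched with a small tree of directory nodes. This is an illustration under assumptions: DirNode is a hypothetical class, location records are sets of strings (child domain names, or replica addresses at a leaf), and insert is simplified to propagate upwards whenever a node first learns about the entity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DirNode {
    final String domain;
    final DirNode parent; // null for the root
    // entity -> pointers: child domain names, or replica addresses at a leaf
    final Map<String, Set<String>> records = new HashMap<>();

    DirNode(String domain, DirNode parent) {
        this.domain = domain;
        this.parent = parent;
    }

    // Adds a pointer; if this node did not know the entity before,
    // the parent is told to point at this domain.
    void insert(String entity, String pointer) {
        boolean known = records.containsKey(entity);
        records.computeIfAbsent(entity, k -> new HashSet<>()).add(pointer);
        if (!known && parent != null) parent.insert(entity, domain);
    }

    // Removes a pointer; if the location record becomes empty it is deleted
    // and the same step is applied at the parent, going up the tree.
    void delete(String entity, String pointer) {
        Set<String> pointers = records.get(entity);
        if (pointers == null) return;
        pointers.remove(pointer);
        if (pointers.isEmpty()) {
            records.remove(entity);
            if (parent != null) parent.delete(entity, domain);
        }
    }
}
```

With replicas in two leaf domains, deleting one replica removes only that leaf's record and the root's pointer to it; the root's record disappears only when the last replica is deleted.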
5.3 Structured Naming
• Flat name – bit string
• Structured name – sequence of words
• Name spaces for structured names –
labeled, directed graphs
• Example: UNIX file system
• Example: DNS (Domain Name System)
– Distributed name resolution
– Multiple name servers
Name Spaces - Figure 5-9
1. Entities in a structured name space are named by a
path name
2. Leaf nodes represent named entities (e.g., files) and
have only incoming edges
3. Directory nodes have named outgoing edges and
define the path used to find a leaf node
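Resolving a path name in such a graph amounts to following one labeled edge per path component. A minimal sketch, assuming a hypothetical NameNode class and "/"-separated absolute path names as in the UNIX example:

```java
import java.util.HashMap;
import java.util.Map;

class NameNode {
    // Named outgoing edges; empty for leaf nodes.
    final Map<String, NameNode> edges = new HashMap<>();
    // The named entity stored at a leaf; null for directory nodes.
    final String entity;

    NameNode(String entity) { this.entity = entity; }

    // Adds an outgoing edge with the given label and returns the child.
    NameNode child(String label, NameNode node) {
        edges.put(label, node);
        return node;
    }

    // Follows the labels of an absolute path from the root;
    // returns null if some label has no matching edge.
    static NameNode resolve(NameNode root, String path) {
        NameNode node = root;
        for (String label : path.split("/")) {
            if (label.isEmpty()) continue; // skip the leading "/"
            node = node.edges.get(label);
            if (node == null) return null;
        }
        return node;
    }
}
```

Each lookup step only needs the directory node reached so far, which is what allows resolution to be spread over several name servers.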

5.4 – Attribute-Based Naming
• Allows a user to search for an entity whose
name is not known.
• Entities are associated with various attributes,
which can have specific values.
• By specifying a collection of <attribute, value>
pairs, a user can identify one (or more) entities
• Attribute based naming systems are also
referred to as directory services, as opposed to
naming systems.

5.4 – Attribute-Based Naming
• Examples: search a music database for a
particular kind of music, or music by a particular
artist, or . . .
• Difficulty: choosing an appropriate set of
attributes – how many, what variety, etc.
– E.g., should there be a category for ragga music (a
type of reggae)?
• Satisfying a request may require an exhaustive
search through the complete set of entity
descriptors
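The exhaustive search mentioned above can be sketched directly. This is an illustration under assumptions: AttributeDirectory and its attrs helper are hypothetical, and every entity descriptor is a flat map of attribute names to string values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AttributeDirectory {
    private final List<Map<String, String>> descriptors = new ArrayList<>();

    // Builds a descriptor from alternating attribute names and values.
    static Map<String, String> attrs(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) m.put(kv[i], kv[i + 1]);
        return m;
    }

    void register(Map<String, String> descriptor) {
        descriptors.add(new HashMap<>(descriptor));
    }

    // Returns every entity whose descriptor contains all <attribute, value>
    // pairs of the query; every stored descriptor is inspected.
    List<Map<String, String>> search(Map<String, String> query) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> d : descriptors) {
            if (d.entrySet().containsAll(query.entrySet())) hits.add(d);
        }
        return hits;
    }
}
```

A query may match zero, one or many entities, and its cost grows with the total number of descriptors, which is the scalability concern raised here.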
Attribute-Based Naming
• Not particularly scalable if it requires storing all
descriptors in a single database.
• RDF: Resource Description Framework
– Standardized data representation for the Semantic
Web
– Subject-predicate-object triplet (person, name, Alice)
• Some proposed solutions: (page 218)
– LDAP (Lightweight Directory Access Protocol)
combines structured naming with attribute based
names. Provides access to directory services via the
Internet.

DISTRIBUTED SYSTEMS
Principles and Paradigms

Chapter 9
Security

Mba IN

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Chapter Outline
❖ Introduction to security

❖ Secure Channels

❖ Access Control

Introduction to security
❖ Security in Distributed Systems can be roughly divided into two
parts:
❖ Secure Channels
❖ This deals with the communication between users and processes
❖ This entails authentication of users, message integrity, and
confidentiality
❖ Authorisation
❖ Authorisation deals with ensuring that processes get access to only
those resources they are entitled to.
❖ It is also loosely referred to as access control
❖ Secure channels and access control require mechanisms to
distribute cryptographic keys, and to add and remove users from
systems
❖ These issues are addressed by what is known as Security
Management
Security Threats
❖ One way of looking at security in DS is to view it from the
point of protecting the services and data it offers against
security threats.
❖ Types of security threats to consider:
➢ Interception
• An unauthorized party gains access to a service or data
➢ Interruption
• Services or data become unavailable, unusable
or destroyed
➢ Modification
• Unauthorized tampering with data or services so that they no longer
adhere to their original specification
➢ Fabrication
• Additional data or activity is generated that would
normally not exist
Security policy
❖ Simply knowing which security threats exist is not
enough to build a secure system;
❖ A description of the security requirements is needed.
A specification of the security requirements is known
as the security policy;
❖ A security policy describes precisely which actions
the entities in a system are allowed to take and
which ones are prohibited
❖ To enforce a security policy, security
mechanisms are required

Security Mechanisms
❖ Following are some important security mechanisms:
▪ Encryption
• Transforms the data into a form the attacker can not understand
• It also helps to check whether data has been modified
▪ Authentication
• Used to verify the claimed identity of the user, client, server, host or any
other entity
▪ Authorisation
• Checks whether the client is authorised to perform the action requested
▪ Auditing
• Used to trace which clients accessed what and in which way.

Cryptography (1)

Intruders and eavesdroppers in communication.

Cryptography(2)
❖ In cryptography the aim is protection against the following 3
attacks:
❖ Message interception
• Without the proper key the intruder will see only unintelligible
data
❖ Message modification
• Without first decrypting the message the intruder cannot
meaningfully modify the message
❖ Insertion of encrypted messages
• If an intruder cannot meaningfully modify the message he
also cannot meaningfully insert messages

Cryptography (3)
❖ There are two different types of Cryptography
❖ Symmetric Cryptography
• The same key is used to encrypt and decrypt a
message
❖ Asymmetric Cryptography (Public-key systems)
• Separate keys are used for encryption and decryption
• The following notation is used:
– K_A^+: public key belonging to A
– K_A^-: private key belonging to A
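The symmetric case (one key for both encryption and decryption) can be sketched with the standard javax.crypto API. The choices here are assumptions made for illustration and are not prescribed by the slides: AES with a 128-bit key in GCM mode, a random 12-byte IV, and the SymmetricDemo class with its helper methods. GCM is used because it also detects modification of the ciphertext.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class SymmetricDemo {
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    static byte[] newIv() {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE; the same key serves both.
    static byte[] crypt(int mode, SecretKey key, byte[] iv, byte[] data)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(mode, key, new GCMParameterSpec(128, iv));
        return c.doFinal(data);
    }

    // Encrypts and decrypts with one key; true if the plaintext survives.
    static boolean roundTripOk(String message) {
        try {
            SecretKey k = newKey();
            byte[] iv = newIv();
            byte[] plain = message.getBytes(StandardCharsets.UTF_8);
            byte[] ct = crypt(Cipher.ENCRYPT_MODE, k, iv, plain);
            return Arrays.equals(crypt(Cipher.DECRYPT_MODE, k, iv, ct), plain);
        } catch (GeneralSecurityException e) {
            return false;
        }
    }

    // Flips one bit of the ciphertext; true if decryption rejects it.
    static boolean tamperDetected(String message) {
        try {
            SecretKey k = newKey();
            byte[] iv = newIv();
            byte[] ct = crypt(Cipher.ENCRYPT_MODE, k, iv, message.getBytes(StandardCharsets.UTF_8));
            ct[0] ^= 1;
            crypt(Cipher.DECRYPT_MODE, k, iv, ct);
            return false;
        } catch (AEADBadTagException e) {
            return true;
        } catch (GeneralSecurityException e) {
            return false;
        }
    }
}
```

In the asymmetric case the encryption and decryption calls would use different keys of a key pair instead of the single shared key.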

Cryptography (4)

Notation used in this chapter.

Chapter Outline
❖ Introduction to security

❖ Secure Channels

❖ Access Control

Secure Channels
❖ Concentrate on the following two issues:
❖ Authentication
❖ Message Integrity and Confidentiality
❖ Authentication and message integrity cannot do without each
other;
❖ Before two parties can communicate, a channel is first
established through authentication;
❖ After authentication, data integrity and confidentiality come
into play. They are achieved through the use of session keys;
❖ A session key is a shared (secret) key that is used to encrypt
messages for a given session;
❖ A session key is only valid as long as the channel exists

Authentication
❖ Authentication Based on a Shared Secret Key

❖ Authentication using a Key Distribution Center

❖ Authentication using Public-Key Cryptography

Authentication Based on a Shared Secret Key (1)

Authentication based on a shared secret key: the challenge-response protocol.
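
A minimal sketch of the five-message challenge-response exchange follows. It assumes an HMAC over the challenge as a stand-in for encrypting the challenge with the shared key K_AB; the function names are hypothetical and network transport is not modelled.

```python
import hashlib
import hmac
import secrets

def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Answer a challenge; HMAC stands in for K_AB(challenge) (assumption)."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def check(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verify that the responder knows the shared key."""
    return hmac.compare_digest(respond(shared_key, challenge), response)

def mutual_authenticate(alice_key: bytes, bob_key: bytes) -> bool:
    # 1. Alice -> Bob: her identity A (not modelled here)
    # 2. Bob -> Alice: challenge R_B
    r_b = secrets.token_bytes(16)
    # 3. Alice -> Bob: K_AB(R_B)
    if not check(bob_key, r_b, respond(alice_key, r_b)):
        return False               # Bob rejects Alice
    # 4. Alice -> Bob: challenge R_A
    r_a = secrets.token_bytes(16)
    # 5. Bob -> Alice: K_AB(R_A)
    return check(alice_key, r_a, respond(bob_key, r_a))
```

Authentication succeeds only when both sides hold the same key, for example mutual_authenticate(k, k) with any 32-byte k.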
Authentication Based on a Shared Secret Key (2)

Authentication based on a shared secret key, but using three instead of five messages.

Authentication Based on a Shared Secret Key (3)

The reflection attack.
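
The reflection attack on the three-message variant can be sketched as follows. This is an illustrative model under stated assumptions: HMAC stands in for encryption with K_AB, and the Bob class, its method names, and the session identifiers are hypothetical.

```python
import hashlib
import hmac
import secrets

def keyed(key: bytes, challenge: bytes) -> bytes:
    """Stand-in for K_AB(challenge) (assumption: HMAC instead of encryption)."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

class Bob:
    def __init__(self, key: bytes):
        self._key = key
        self._pending = {}  # session id -> challenge Bob issued

    def message_1(self, session: str, claimed_identity: str, challenge: bytes):
        """Receives (A, R_A); replies with (R_B, K_AB(R_A))."""
        r_b = secrets.token_bytes(16)
        self._pending[session] = r_b
        return r_b, keyed(self._key, challenge)

    def message_3(self, session: str, response: bytes) -> bool:
        """Receives K_AB(R_B); accepts the peer if it matches."""
        r_b = self._pending.pop(session)
        return hmac.compare_digest(response, keyed(self._key, r_b))

def reflection_attack(bob: Bob) -> bool:
    """Chuck does not know K_AB, yet gets Bob to accept him as Alice."""
    # Session 1: Chuck poses as Alice and receives Bob's challenge R_B.
    r_b, _ = bob.message_1("s1", "alice", secrets.token_bytes(16))
    # Session 2: Chuck reflects R_B back as his own challenge,
    # so Bob computes K_AB(R_B) for him.
    _, answer = bob.message_1("s2", "alice", r_b)
    # Back in session 1: Chuck returns Bob's own answer.
    return bob.message_3("s1", answer)
```

The attack works because Bob answers a challenge before the initiator has proved anything, and because both directions use the same kind of challenge.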


Authentication Using a Key Distribution Center (1)

The principle of using a KDC.

▪ The main problem with this protocol is that Alice may want to start setting
up a secure channel with Bob even before Bob has received the shared key
from the KDC.
▪ One solution is for the KDC to hand Bob's copy of the key to Alice as a
ticket and let Alice set up the connection to Bob herself

Authentication Using a Key Distribution Center (2)

Using a ticket and letting Alice set up a connection to Bob.

❖ This protocol is a variant of a well-known KDC-based authentication
protocol, the Needham-Schroeder authentication protocol.
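
The ticket idea can be sketched as below. This is a simplified model, not the full protocol: seal and unseal are a toy HMAC-based stand-in for symmetric encryption (the standard library has no cipher), and the KDC class and all names are hypothetical.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for K(plaintext): XOR keystream plus an integrity tag."""
    nonce = secrets.token_bytes(16)
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def unseal(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified message")
    return bytes(c ^ k for c, k in zip(body, _keystream(key, nonce, len(body))))

class KDC:
    def __init__(self):
        self._keys = {}  # principal -> key shared with the KDC

    def register(self, name: str) -> bytes:
        self._keys[name] = secrets.token_bytes(32)
        return self._keys[name]

    def request(self, a: str, b: str):
        """Return K_A,KDC(K_AB) for a, plus the ticket K_B,KDC(a, K_AB) for b."""
        k_ab = secrets.token_bytes(32)
        return seal(self._keys[a], k_ab), seal(self._keys[b], a.encode() + b"|" + k_ab)

kdc = KDC()
k_alice, k_bob = kdc.register("alice"), kdc.register("bob")
for_alice, ticket = kdc.request("alice", "bob")
alice_session_key = unseal(k_alice, for_alice)
# Alice forwards the ticket; only Bob can open it.
peer, bob_session_key = unseal(k_bob, ticket).split(b"|", 1)
```

Alice cannot read or alter the ticket, but she can deliver it, so Bob learns the shared key exactly when Alice contacts him.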

Authentication Using a Key Distribution Center (3)

The Needham-Schroeder authentication protocol.

❖ The concept of a nonce is introduced to uniquely relate request and
response messages to each other
❖ Messages 1 and 2 are protected, but an intruder can replay message 3
and get Bob to think he is communicating with Alice
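
How a nonce ties a response to one specific request can be sketched as follows. This is a minimal model assuming an HMAC tag over nonce and payload; the function names and payloads are hypothetical and do not reproduce the Needham-Schroeder message formats.

```python
import hashlib
import hmac
import secrets

def make_reply(key: bytes, nonce: bytes, payload: bytes):
    """Responder echoes the requester's nonce inside the protected reply."""
    tag = hmac.new(key, nonce + payload, hashlib.sha256).digest()
    return nonce, payload, tag

def accept_reply(key: bytes, expected_nonce: bytes, reply) -> bool:
    """Requester accepts only a genuine reply that carries its own fresh nonce."""
    nonce, payload, tag = reply
    genuine = hmac.compare_digest(
        tag, hmac.new(key, nonce + payload, hashlib.sha256).digest()
    )
    return genuine and hmac.compare_digest(nonce, expected_nonce)

key = secrets.token_bytes(32)

old_nonce = secrets.token_bytes(16)
old_reply = make_reply(key, old_nonce, b"old session key")   # recorded by an intruder

new_nonce = secrets.token_bytes(16)
fresh_reply = make_reply(key, new_nonce, b"new session key")

fresh_ok = accept_reply(key, new_nonce, fresh_reply)   # matches this request
replay_ok = accept_reply(key, new_nonce, old_reply)    # stale nonce, rejected
```

The replayed reply is authentic but belongs to an earlier request, which the nonce comparison detects.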
Authentication Using a Key Distribution Center (4)

Protection against malicious reuse of a previously generated key in the Needham-Schroeder protocol.

Authentication Using Public Key Cryptography

Mutual authentication in a public-key cryptosystem.

Message Integrity and Confidentiality
❖ Besides authentication, a secure channel should also provide
guarantees for message integrity and confidentiality
❖ Confidentiality is easily established through encryption
❖ However, protecting messages against modification is more
complicated
❖ Digital signatures are used to ensure the integrity of the messages
exchanged
❖ There are several ways of placing digital signatures on
messages, which include:
– Public Key Cryptography
– Use of message digests
– Use of session keys

Digital Signatures (1)

Digitally signing a message using public-key cryptography.

Digital Signatures (2)

Digitally signing a message using a message digest.
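
Signing a digest rather than the whole message can be sketched with textbook RSA. This is a toy under stated assumptions: the two Mersenne primes give a modulus far too small for real use, there is no padding, and the names sign and verify are hypothetical.

```python
import hashlib

# Toy key pair (assumed values): two Mersenne primes, far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                              # public exponent (K+A)
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (K-A), Python 3.8+

def digest(message: bytes) -> int:
    """Fixed-length digest of the message, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Signer applies the private key to the digest, not to the message."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver recomputes the digest and compares it with the opened signature."""
    return pow(signature, E, N) == digest(message)

signature = sign(b"pay Bob 10")
valid = verify(b"pay Bob 10", signature)
tampered = verify(b"pay Bob 1000", signature)
```

Only the short digest goes through the expensive public-key operation, and any change to the message changes the digest, so the signature no longer matches.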

Session Keys
❖ After the authentication phase the communicating parties
generally use a unique shared session key for confidentiality
❖ The session key is safely discarded when the channel is no
longer in use.
❖ An alternative to session keys is to keep using the key established
during authentication, but session keys have important
advantages:
▪ A key that is used often becomes easier to reveal
▪ Session keys provide protection against replay attacks
▪ If a session key is compromised, damage is limited to that session

Chapter Outline
❖ Introduction to security

❖ Secure Channels

❖ Access Control

Access Control
❖ Requests involve carrying out operations on
resources that are controlled by the server;
❖ These operations can only be carried out if the
client has sufficient access rights for invoking
them;
❖ Verifying access rights is referred to as access
control;
❖ Granting these access rights is referred to as
authorisation.

General Issues in Access Control

General model of controlling access to objects.

❖ A reference monitor:
❖ records which subject may do what;
❖ decides whether a subject is allowed to invoke a specific operation.
❖ Is called each time an object is invoked
❖ It is therefore important that the reference monitor be
tamperproof
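
The role of a reference monitor can be sketched as a single check that every invocation must pass. This is a minimal model; the class, its methods, the audit log, and the example subjects and objects are hypothetical.

```python
class ReferenceMonitor:
    """Records which subject may do what and decides on each invocation."""

    def __init__(self):
        self._rights = {}      # (subject, object) -> set of allowed operations
        self.audit_log = []    # every decision is recorded for later auditing

    def grant(self, subject: str, obj: str, operation: str) -> None:
        self._rights.setdefault((subject, obj), set()).add(operation)

    def check(self, subject: str, obj: str, operation: str) -> bool:
        allowed = operation in self._rights.get((subject, obj), set())
        self.audit_log.append((subject, obj, operation, allowed))
        return allowed

def invoke(monitor: ReferenceMonitor, subject: str, obj: str, operation: str) -> str:
    """Every invocation goes through the monitor; there is no other path."""
    if not monitor.check(subject, obj, operation):
        raise PermissionError(f"{subject} may not {operation} {obj}")
    return f"{operation} on {obj} done"

monitor = ReferenceMonitor()
monitor.grant("alice", "report.txt", "read")
result = invoke(monitor, "alice", "report.txt", "read")
```

If an attacker could modify _rights or bypass invoke, every decision would be worthless, which is why the monitor itself has to be tamperproof.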
Access Control Matrix
❖ Access Control Matrix is a common approach to modelling
the access rights of subjects with respect to objects

❖ It is not economical to implement a big matrix with thousands of
users and millions of objects
❖ Many entries in the matrix will be empty.
❖ More efficient ways of implementing these matrices are
needed

Access Control Matrix (1)

Comparison between ACLs and capabilities for protecting objects. (a) Using an ACL.

Access Control Matrix (2)

Comparison between ACLs and capabilities for protecting objects. (b) Using capabilities.

❖ ACLs or capability lists can still become so large that further measures
need to be taken.
❖ One general way is to make use of protection domains
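
ACLs and capability lists can be viewed as two ways of slicing the same sparse access control matrix, as in this sketch. The matrix contents and the function names are hypothetical.

```python
# Sparse access control matrix: only non-empty entries are stored.
MATRIX = {
    ("alice", "file1"): {"read", "write"},
    ("bob", "file1"): {"read"},
    ("alice", "printer"): {"print"},
}

def acl_for(obj: str) -> dict:
    """ACL: one column of the matrix, stored with the object."""
    return {s: ops for (s, o), ops in MATRIX.items() if o == obj}

def capabilities_for(subject: str) -> dict:
    """Capability list: one row of the matrix, held by the subject."""
    return {o: ops for (s, o), ops in MATRIX.items() if s == subject}

file1_acl = acl_for("file1")
alice_caps = capabilities_for("alice")
```

With an ACL the server looks up the requesting subject in the object's list; with capabilities the subject presents the entry and the server only has to validate it.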

Protection Domains

The hierarchical organization of protection domains as groups of users.
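
Hierarchical groups keep ACLs short because rights are listed per group instead of per user. A minimal sketch, assuming each group has at most one parent; the group names, users, and objects are hypothetical.

```python
# Hypothetical hierarchy: each group points to its parent group.
PARENT_GROUP = {"employee_ams": "employee_nl", "employee_nl": "employee"}
USER_GROUP = {"dick": "employee_ams", "eve": "anonymous"}

# The ACL lists groups (protection domains) instead of individual users.
OBJECT_ACL = {"intranet_page": {"employee": {"read"}}}

def groups_of(user: str):
    """Yield the user's group and then every enclosing group."""
    group = USER_GROUP.get(user)
    while group is not None:
        yield group
        group = PARENT_GROUP.get(group)

def is_allowed(user: str, obj: str, operation: str) -> bool:
    """A user holds the rights of every group that contains them."""
    acl = OBJECT_ACL.get(obj, {})
    return any(operation in acl.get(g, set()) for g in groups_of(user))
```

The price of the short ACL is the lookup cost: a membership check may have to walk up the whole hierarchy.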
