Chapter 1 Introduction

Uploaded by

siraj mohammed
Chapter 1 - Introduction

1.1 Introduction and Definition


 Before the mid-1980s, computers were
 very expensive (hundreds of thousands or even millions of dollars)
 very slow (a few thousand instructions per second)
 not connected to one another
 After the mid-1980s: two major developments
 cheap and powerful microprocessor-based computers
appeared
 computer networks
 LANs at speeds ranging from 10 to 1000 Mbps
 WANs at speeds ranging from 64 Kbps to gigabits/sec
 Consequence
 feasibility of using a large network of computers to work for the
same application; this is in contrast to the old centralized
systems where there was a single computer with its peripherals

 Definition of a Distributed System

 Various definitions of distributed systems have been given in the literature, none of them satisfactory and none of them in agreement with any of the others.
 For our purposes it is sufficient to give a loose characterization:

 a distributed system is a collection of independent computers that appears to its users as a single coherent system.

 This definition has two aspects:


1. hardware: autonomous machines
2. software: a single system view for the users

Examples of Distributed systems

 Web Applications
 E.g., Google search or Amazon e-commerce platform
 Cloud computing
 E.g., Google Cloud, Microsoft Azure, ...
 File sharing networks
 E.g., Peer-to-peer systems like BitTorrent
 Distributed databases
 E.g., Apache Cassandra or Google Spanner
 Blockchain
 E.g., Bitcoin

 Why Distributed?
 Resource and Data Sharing
 printers, databases, multimedia servers, ...
 Availability, Reliability
 the loss of some instances can be hidden
 Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
 Performance
 huge power (CPU, memory, ...) available
 Inherent distribution, communication
 organizational distribution, e-mail, video

 Problems of Distribution
 Concurrency, Security
 clients must not disturb each other
 Privacy
 e.g., when building a preference profile
 unwanted communication such as spam
 Partial failure
 we often do not know where the error is (e.g., RPC)
 Location, Migration, Replication
 clients must be able to find their servers
 Heterogeneity
 hardware, platforms, languages, management

1.2 Organization and Goals of a Distributed System
 to support heterogeneous computers and networks and
 to provide a single-system view,
 a distributed system is often organized by means of a
layer of software called middleware that extends over
multiple machines

a distributed system organized as middleware; note that the middleware layer extends over multiple machines
 Goals of a distributed system: a distributed system should
 make remote resources easily accessible
 E.g., printers, computers, storage facilities, data, files, Web pages,
networks, ...
 There are many reasons for wanting to share resources:
 economics,
 to collaborate and exchange information
 be transparent: hide the fact that the resources and processes are
distributed across multiple computers
 be open
 be scalable

Transparency in a Distributed System


 a distributed system that is able to present itself to users
and applications as if it were only a single computer
system is said to be transparent
 different forms of transparency in a distributed system

| Transparency | Description |
|---|---|
| Access | Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed |
| Location | Hide where a resource is physically located; e.g., where is http://www.google.com? (naming) |
| Migration | Hide that a resource may move to another location |
| Relocation | Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops |
| Replication | Hide that a resource is replicated |
| Concurrency | Hide that a resource may be shared by several competing users; a resource must be left in a consistent state |
 Openness in a Distributed System
 a distributed system should be open
 we need well-defined interfaces
 interoperability
 components of different origins can communicate/work together
 portability
 components work on different platforms
 another goal of an open distributed system is that it should be
flexible and extensible;
 easy to configure the system out of different components;
 easy to add new components,
 replace existing ones
 an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services;
 e.g., protocols in networks

 Scalability in Distributed Systems
 a distributed system should be scalable; scalability can be measured along at least three dimensions:
 size: it should be easy to add more users and resources to the system
 geographically: users and resources may lie far apart
 administratively: the system should remain easy to manage even if it spans many administrative organizations

 Scalability problems: performance problems caused by limited capacity of servers and networks

| Concept | Example |
|---|---|
| Centralized services | A single server for all users (mostly for security reasons) |
| Centralized data | A single on-line telephone book |
| Centralized algorithms | Doing routing based on complete information |

examples of scalability limitations

 Scaling Techniques
 how to solve scaling problems
 the problem is mainly performance, and arises as a result of limitations in
the capacity of servers and networks (for geographical scalability)
 Three possible solutions:
 Hiding communication latencies,
 Distribution, and
 Replication

A. Hide Communication Latencies
 is important to achieve geographical scalability
 The basic idea is simple: try to avoid waiting for responses to
remote service requests
 let the requester do other useful work in the meantime
 i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted.
 good for batch processing and parallel applications but not for
interactive applications
 for interactive applications, move part of the job to the client to
reduce communication;
 e.g. filling a form and checking the entries
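
The idea of not waiting for replies can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: `remote_lookup` is a hypothetical remote call whose `sleep` merely stands in for network latency, and a thread pool is just one of several ways to issue requests asynchronously.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_lookup(key: str) -> str:
    """Hypothetical remote service call; the sleep stands in for network latency."""
    time.sleep(0.05)
    return key.upper()


def fetch_async(keys: list[str]) -> tuple[list[str], int]:
    """Issue all requests without waiting, do local work, then collect the replies."""
    with ThreadPoolExecutor(max_workers=max(1, len(keys))) as pool:
        # submit() returns immediately, so the requests are in flight concurrently
        futures = [pool.submit(remote_lookup, k) for k in keys]
        # useful local work done while the replies are still outstanding
        local_work = sum(range(1000))
        # block only at the point where a reply is actually needed
        replies = [f.result() for f in futures]
    return replies, local_work


replies, local_work = fetch_async(["a", "b", "c"])
```

With synchronous calls the three lookups would take roughly three latencies in sequence; here they overlap with each other and with the local work.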

(a) a server checking the correctness of field entries
(b) a client doing the job

B. Distribution
 Distribution involves taking a component, splitting it into smaller parts, and spreading those parts across the system
 e.g., DNS - Domain Name System
 divide the name space into zones
 for details, see later in Chapter 4 - Naming
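
A rough sketch of the zone idea, under simplifying assumptions: the zone names, host names, and addresses below are hypothetical (the addresses come from the documentation range 192.0.2.0/24), and real DNS resolution involves referrals between servers that are not modelled here. The point is only that each name is handled by the one server responsible for its zone, so no single server holds the whole name space.

```python
# Hypothetical zone table: each zone is served by its own name server,
# which knows only the names inside that zone.
ZONES = {
    "example.org": {"www.example.org": "192.0.2.10"},
    "cs.example.org": {"ftp.cs.example.org": "192.0.2.20"},
}


def find_zone(name: str) -> str:
    """Return the most specific zone whose suffix matches the name."""
    matches = [z for z in ZONES if name == z or name.endswith("." + z)]
    if not matches:
        raise KeyError(f"no zone is responsible for {name}")
    return max(matches, key=len)  # longest suffix = most specific zone


def resolve(name: str) -> str:
    """Ask only the server responsible for the name's zone."""
    return ZONES[find_zone(name)][name]
```

For example, `resolve("ftp.cs.example.org")` consults only the `cs.example.org` zone, while `resolve("www.example.org")` consults only `example.org`.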

an example of dividing the DNS name space into zones


C. Replication
 replicate components across a distributed system to
increase availability and for load balancing, leading to
better performance
 decided by the owner of a resource
 caching (a special form of replication) also reduces
communication latency; decided by the user
 but, caching and replication may lead to consistency
problems (see Chapter 6 - Consistency and Replication)
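
The consistency problem mentioned above can be illustrated with a small sketch. The `Origin` and `CachingClient` classes are hypothetical names invented for this example, and explicit invalidation is only one of several possible ways to restore consistency.

```python
class Origin:
    """Hypothetical server holding the authoritative copy of each item."""

    def __init__(self) -> None:
        self.data = {"price": 10}


class CachingClient:
    """Client that keeps local copies to avoid repeated remote reads."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache: dict[str, int] = {}

    def read(self, key: str) -> int:
        if key not in self.cache:              # miss: one remote access
            self.cache[key] = self.origin.data[key]
        return self.cache[key]                 # hit: no communication at all

    def invalidate(self, key: str) -> None:
        self.cache.pop(key, None)              # drop the local copy


origin = Origin()
client = CachingClient(origin)
first = client.read("price")     # fetched from the origin
origin.data["price"] = 12        # the authoritative copy changes
stale = client.read("price")     # the cached copy is now out of date
client.invalidate("price")
fresh = client.read("price")     # refetched after invalidation
```

The second read is fast because it needs no communication, but it returns the old value; that is the trade-off between latency and consistency.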

1.3 Hardware Concepts
 All distributed systems consist of multiple CPUs and memories.
 The hardware can be organized in several different ways in terms of how the components are connected and how they communicate
 From Hardware perspective, we can divide distributed
systems into two broad categories:
 Multiprocessor Systems
 Multicomputer Systems

Hardware requirements

 We can't define complete hardware requirements for a distributed system in general, but the most basic requirements are:
 Processors
 Memory (especially RAM)
 Interconnecting Resources

Multiprocessor Systems

 Such a system consists of a computer with multiple processors and memory connected through a high-speed backplane on the motherboard.

OR

 A multiprocessor is a single computer system containing multiple processors interconnected with common I/O and memory devices.

Multiprocessor Systems

 There are two types of multiprocessor system with respect to memory:
 Multiprocessor systems with shared memory
 Multiprocessor systems with non-shared memory
 There are two types of multiprocessor system with respect to the interconnection of memory and processors:
 Bus-based systems
 Switch-based systems

Multiprocessor Systems

different basic organizations of processors and memories in distributed systems
1.3.1 Bus-based Multiprocessor Systems with Shared Memory
 the shared memory has to be coherent - a value written by one processor must be the value subsequently read by any other processor
 performance problem for bus-based organization since the bus will be overloaded as the number of processors increases
 the solution is to add a high-speed cache memory between the processors and the bus to hold the most recently accessed words; however, caching may result in incoherent memory

a bus-based multiprocessor

 bus-based multiprocessors are difficult to scale even with caches


 To address scalability issue, there are two possible solutions:
 Crossbar switch and
 Omega network

Switch-Based Multiprocessor Systems with Shared Memory

 There are two types:
 Crossbar switch based
 Omega switch based

 Crossbar switch
 divide memory into modules and connect them to the
processors with a crossbar switch
 at every intersection, a crosspoint switch is opened and closed
to establish connection
 problem: expensive; with n CPUs and n memories, n² switches are required

Crossbar switch-based multiprocessor system with shared memory
 Omega network
 use switches with multiple input and output lines
 drawback: high latency because of several switching
stages between the CPU and memory
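
The cost and latency trade-off between the two switching schemes can be made concrete with a little arithmetic. The sketch below assumes an omega network built from 2x2 switches with n a power of two, which is the textbook construction but is not stated on the slide; under that assumption there are log2(n) stages of n/2 switches each.

```python
def crossbar_switches(n: int) -> int:
    """Crosspoint switches needed to connect n CPUs to n memory modules."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages between CPU and memory (assumes 2x2 switches)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1  # exact log2 for a power of two


def omega_switches(n: int) -> int:
    """Total 2x2 switches: n/2 per stage, log2(n) stages."""
    return (n // 2) * omega_stages(n)
```

For n = 1024 this gives 1,048,576 crosspoints for the crossbar against 5,120 switches for the omega network, at the price of 10 switching stages on every memory access.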

Omega switch-based multiprocessor system with shared memory
1.3.2 Multicomputer Systems

 What is a multicomputer system?

 A multicomputer system consists of several individual computers connected by a network that work together to solve a problem.
 Each individual computer has its own computing resources, such as processor, I/O, and memory devices.

1.3.2 Multicomputer Systems(cont.)

 Types Of Multi-Computer Systems


 There are two types, depending on the kinds of computers used in the system:
 Homogeneous Multi-Computer systems.
 Heterogeneous Multi-Computer Systems.

Homogeneous Multicomputer Systems

 also referred to as System Area Networks (SANs)


 The system consists of computers of the same type.
 Each computer has its own memory and CPUs.
 The computers communicate with each other through a high-speed interconnection network.
 This high speed interconnection network can be a
 Bus based or
 Switch based

Homogeneous Multicomputer Systems(cont.)

 Bus-based Homogeneous Multicomputer Systems


 a shared multiaccess network such as Fast Ethernet can be used, and messages are broadcast
 performance drops sharply with more than 25-100 nodes (contention)

Homogeneous Multicomputer Systems

 Switch-based
 messages are routed through an interconnection network
 two popular topologies: meshes (or grids) and
hypercubes

two switch-based topologies: a grid (mesh) and a hypercube
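
The two topologies differ in how a node's neighbors are determined, which a short sketch can show. The node numbering is an assumption made for illustration: hypercube nodes are labelled with binary addresses so that neighbors differ in exactly one bit, and mesh nodes are labelled by (row, column) without wrap-around links.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Neighbors of a node in a dim-dimensional hypercube: flip one address bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Neighbors of a node in a rows x cols grid (no wrap-around at the edges)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

In a hypercube of dimension d every node has exactly d neighbors, whereas in a mesh a node has between two and four depending on whether it sits in a corner, on an edge, or in the interior.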
Heterogeneous Multicomputer Systems

 Most distributed systems are built on heterogeneous multicomputer systems


 The computers could be different in
 processor type,
 memory size,
 architecture,
 power,
 operating system, etc.
 The interconnection network may be highly heterogeneous as well
 The distributed system provides a software layer to hide the heterogeneity at
the hardware level; i.e., provides transparency

1.4 Software Concepts

 What is Software/Operating systems?

 Operating systems in relation to distributed systems can be roughly divided into two categories:
 Tightly-coupled systems
 Loosely-coupled systems

1.4 Software Concepts(cont.)

 Tightly-coupled systems, referred to as distributed OSs (DOS)
 the OS tries to maintain a single, global view of the resources it manages
 used for multiprocessors and homogeneous multicomputers
 Loosely-coupled systems, referred to as network OSs (NOS)
 a collection of computers, each running its own OS;
 they work together to make their services and resources available to others
 used for heterogeneous multicomputers
 Middleware:
 an additional layer on top of a NOS implementing general-purpose services
 Summary of main issues

| System | Description | Main Goal |
|---|---|---|
| DOS | Tightly-coupled operating system for multiprocessors and homogeneous multicomputers | Hide and manage hardware resources |
| NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients |
| Middleware | Additional layer on top of NOS implementing general-purpose services | Provide distribution transparency |

an overview of DOSs, NOSs, and middleware

Distributed Operating Systems

 There are two types


 multiprocessor operating system:
 to manage the resources of a multiprocessor
 multicomputer operating system:
 To manage homogeneous multicomputers

Multiprocessor Operating Systems

 Uniprocessor operating systems extended to support multiple processors having access to a shared memory
 Modern OSs are designed to handle multiple processors.
 Multiprocessor operating systems aim to support high performance through multiple CPUs.
 a protection mechanism is required for concurrent access to guarantee consistency
 Two synchronization mechanisms are commonly used:
 Semaphores and
 Monitors

Multiprocessor Operating Systems(cont.)

 Semaphore:
 Can be thought of as an integer with two atomic operations:
 Down
 Up
 Monitor:
 a programming language construct consisting of procedures
and variables that can be accessed only by the procedures of
the monitor;
 Only a single process at a time is allowed to execute a
procedure
 Goal:
 To ensure orderly access to shared resources and prevent
issues like race conditions, deadlocks, or data inconsistencies.
 To ensure replicated databases are up-to-date
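
Both mechanisms can be sketched with Python's standard `threading` module. This is an illustration under stated assumptions, not how an operating system implements them: `acquire` and `release` play the roles of Down and Up, and since Python has no monitor construct, the `CounterMonitor` class (a name invented here) only approximates one by guarding every method with the same lock.

```python
import threading

mutex = threading.Semaphore(1)   # binary semaphore with initial value 1
counter = 0


def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        mutex.acquire()          # Down: blocks while the value is 0
        counter += 1             # critical section on the shared variable
        mutex.release()          # Up: increments the value, letting a waiter proceed


class CounterMonitor:
    """Monitor-style object: its variable is reachable only through its
    procedures, and only one thread at a time may execute inside them."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def increment(self) -> None:
        with self._lock:
            self._value += 1

    def value(self) -> int:
        with self._lock:
            return self._value


monitor = CounterMonitor()
threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
threads += [
    threading.Thread(target=lambda: [monitor.increment() for _ in range(1000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every update happens inside a protected region, both counters end at 4000 regardless of how the threads interleave.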

Multicomputer Operating Systems
 Totally different structure and complexity than multiprocessor OSs.
 Processors cannot share memory; instead, communication is through message passing
 each node has its own
 kernel for managing local resources
 separate module for handling interprocessor communication
 They are organized as follows:

general structure of a multicomputer operating system


Multicomputer Operating Systems(cont.)

Blocking and buffering in message passing

 In multicomputer operating systems, blocking and buffering can take place at several points between the sender and the receiver.
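
One such arrangement, a bounded buffer on the receiving side, can be sketched with Python's `queue.Queue`. The buffer size of two and the thread roles are assumptions chosen for illustration; the sender blocks when the buffer is full and the receiver blocks when it is empty.

```python
import queue
import threading

buffer: queue.Queue = queue.Queue(maxsize=2)   # hypothetical two-slot message buffer
received: list[int] = []


def receiver(count: int) -> None:
    for _ in range(count):
        received.append(buffer.get())   # blocks while the buffer is empty


t = threading.Thread(target=receiver, args=(5,))
t.start()
for message in range(5):
    buffer.put(message)                 # blocks while the buffer is full
t.join()
```

A larger buffer lets the sender run further ahead of the receiver; a buffer of size zero would correspond to fully synchronous communication, which `queue.Queue` does not model directly.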

Distributed Shared Memory Systems

 Practice showed that programming multicomputers is much harder than programming multiprocessors.

 The difference is caused by the means of communication, i.e., shared memory versus message passing.

 In message passing, issues like buffering, blocking, and reliable communication make things worse.

 For this reason, researchers have tried to implement shared-memory communication on multicomputer systems.

 The goal is to provide a virtual shared memory machine running on a multicomputer system.

 This leads to what is called page-based distributed shared memory (DSM)

Distributed Shared Memory Systems(cont..)
Page-based distributed shared memory (DSM) - uses the virtual memory
capabilities of each individual node.

pages of address space distributed among four machines


situation after CPU 1 references page 10

 read-only pages can be easily replicated

situation if page 10 is read only and replication is used
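
The behaviour described by these figure captions can be mimicked with a toy model. The `DSM` class, the page numbers, and the initial placement below are hypothetical; a real system would rely on the hardware's page-fault mechanism and actually transfer page contents, which is omitted here.

```python
class DSM:
    """Toy page-based DSM: tracks which nodes hold a copy of each page."""

    def __init__(self, placement: dict[int, int], read_only: set[int]) -> None:
        self.holders = {page: {node} for page, node in placement.items()}
        self.read_only = read_only
        self.faults = 0

    def reference(self, node: int, page: int) -> None:
        if node in self.holders[page]:
            return                          # page is local: ordinary memory access
        self.faults += 1                    # page fault: the page must be fetched
        if page in self.read_only:
            self.holders[page].add(node)    # replicate a read-only page
        else:
            self.holders[page] = {node}     # migrate a writable page


# Page 10 is writable and starts on node 2; page 11 is read-only on node 3.
dsm = DSM({10: 2, 11: 3}, read_only={11})
dsm.reference(1, 10)   # fault: page 10 migrates to node 1
dsm.reference(1, 11)   # fault: page 11 is replicated on node 1
dsm.reference(1, 10)   # no fault: page 10 is now local
```

After these references node 1 is the sole holder of page 10, while page 11 exists on both nodes 1 and 3.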


 Network Operating Systems

 possibly heterogeneous underlying hardware


 constructed from a collection of uniprocessor systems, each with its own
operating system and connected to each other in a computer network

general structure of a network operating system


 Services offered by network operating systems

 remote login (rlogin)


 remote file copy (rcp)
 shared file systems through file servers

two clients and a server in a network operating system

 Middleware
 a distributed operating system is not intended to handle a
collection of independent computers but provides
transparency and ease of use
 a network operating system does not provide a view of a
single coherent system but is scalable and open
 combine the scalability and openness of network operating
systems and the transparency and ease of use of distributed
operating systems
 this is achieved through middleware, another layer of software

• Positioning Middleware
• The goal is to hide heterogeneity

general structure of a distributed system as middleware

Middleware models

 Model-1: UNIX (or Plan 9)


 All resources including I/O devices are treated as
files
 Middleware based on a distributed file system has proven to be reasonably scalable
 Model-2: Remote procedure calls (RPCs)
 Model-3: Remote method invocation (RMI)
(for details, see Chapter 2 - Communication)
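
To give a feel for the RPC model, here is a deliberately simplified sketch in which the "network" is just a function call. Everything in it is an assumption made for illustration: the JSON message format, the `ClientStub` class, and `server_dispatch` are invented names, and a real RPC system adds transport, error handling, and interface definitions.

```python
import json


def add(a: int, b: int) -> int:
    """Procedure that lives on the server side."""
    return a + b


PROCEDURES = {"add": add}


def server_dispatch(request: bytes) -> bytes:
    """Server stub: unmarshal the request, call the procedure, marshal the reply."""
    call = json.loads(request)
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode()


class ClientStub:
    """Client stub: makes a remote procedure look like a local method."""

    def __init__(self, transport) -> None:
        self._transport = transport      # stands in for the network

    def __getattr__(self, proc: str):
        def call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode()
            reply = self._transport(request)
            return json.loads(reply)["result"]
        return call


stub = ClientStub(server_dispatch)
total = stub.add(2, 3)   # looks like a local call; the arguments travel as a message
```

The caller never sees the marshalling, which is the transparency that RPC-based middleware aims for.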

 a comparison between multiprocessor operating systems,
multicomputer operating systems, network operating
systems, and middleware-based distributed systems

| Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS |
|---|---|---|---|---|
| Degree of transparency | Very high | High | Low | High |
| Same OS on all nodes | Yes | Yes | No | No |
| Number of copies of OS | 1 | N | N | N |
| Basis for communication | Shared memory | Messages | Files | Model specific |
| Resource management | Global, central | Global, distributed | Per node | Per node |
| Scalability | No | Moderately | Yes | Varies |
| Openness | Closed | Closed | Open | Open |
1.5 The Client-Server Model

 how are processes organized in a system


 thinking in terms of clients requesting services from servers

general interaction between a client and a server

 Application Layering
 no clear distinction between a client and a server;
 for instance a server for a distributed database may act as a
client when it forwards requests to different file servers
 three levels exist
 the user-interface level:
 implemented by clients and contains all that is required by a
client;
 usually through GUIs, but not necessarily
 the processing level:
 contains the applications
 the data level:
 contains the programs that maintain the actual data dealt with

 the general organization of an Internet search engine into three
different layers
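
The three levels can be sketched as three functions, one per level, for a toy search engine. The index contents, the ranking rule (count of matching terms), and all function names are hypothetical; the point is only that each level talks to the one below it.

```python
# Data level: a hypothetical in-memory index mapping terms to pages.
INDEX = {
    "distributed": ["page1", "page3"],
    "systems": ["page1", "page2"],
}


def data_lookup(term: str) -> list[str]:
    """Data level: return the pages stored for one term."""
    return INDEX.get(term, [])


def process_query(query: str) -> list[str]:
    """Processing level: split the query, gather hits, rank by number of matches."""
    hits: dict[str, int] = {}
    for term in query.lower().split():
        for page in data_lookup(term):
            hits[page] = hits.get(page, 0) + 1
    return sorted(hits, key=lambda page: (-hits[page], page))


def user_interface(query: str) -> str:
    """User-interface level: format the ranked list for display."""
    ranked = process_query(query)
    return "\n".join(f"{i}. {page}" for i, page in enumerate(ranked, 1))
```

Because the levels interact only through these calls, each one could be placed on a different machine, which is what the tiered architectures that follow exploit.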

Client-Server Architectures
 how to physically distribute a client-server application across several machines
 One approach is multi-tiered architectures, which can be divided into two groups:
 Two-tiered architecture
 Three-tiered architecture

Two tiered architecture

Two-tiered architecture: alternative client-server organizations

a) put only the terminal-dependent part of the user interface on the client machine and let the applications remotely control the presentation
b) put the entire user-interface software on the client side
c) move part of the application to the client, e.g. checking correctness in
filling forms
d) and e) are for powerful client machines

Three tiered architecture

three tiered architecture: an example of a server acting as a client

 Modern Architectures
 Utilize vertical distribution and horizontal distribution to design scalable, efficient, and fault-tolerant
systems.
 These architectures define how components are distributed in a system.
 Two architectures:
 Vertical distribution:
 horizontal distribution:

Modern Architectures(cont.)

 Vertical distribution:
 Vertical distribution refers to distributing different functions
or layers of a system across various nodes or machines.
 It is commonly seen in tiered architectures, where
responsibilities are divided by their role or function (e.g.,
presentation, application, and data storage)
 Key Characteristics:
 Focuses on functional separation.
 Commonly follows multi-tiered models like 2-tier or 3-tier
architectures.
 Communication occurs vertically between layers, with each layer
dependent on the others.
 Real-World Use Cases:
 E-commerce Platforms and Enterprise Applications

Modern Architectures(cont.)
 Horizontal distribution:
 physically split up the client or the server into logically equivalent parts, e.g., a Web server
 Horizontal distribution involves duplicating or
replicating the same components or functions across
multiple nodes.
 It focuses on scaling out to handle larger workloads or
improve fault tolerance.
 Key Characteristics:
 Focuses on scalability and redundancy.
 Components at the same level perform similar tasks.
 Often associated with load balancing and replication.
 Examples:
 Cluster computing, content delivery networks, and distributed databases
Modern Architectures(cont.)

an example of horizontal distribution of a Web service
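
A minimal sketch of horizontal distribution, assuming a round-robin policy and hypothetical replica names: every replica is logically equivalent, so any of them can serve any request, and the balancer simply rotates through them.

```python
import itertools


class RoundRobinBalancer:
    """Spread requests over logically equivalent replicas in rotation."""

    def __init__(self, replicas: list[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._cycle = itertools.cycle(replicas)

    def route(self, request: str) -> str:
        replica = next(self._cycle)          # pick the next replica in turn
        return f"{replica} handled {request}"


balancer = RoundRobinBalancer(["web1", "web2"])
log = [balancer.route(r) for r in ["a", "b", "c"]]
```

Round-robin is only one possible policy; real deployments may also weigh replicas by load or remove failed replicas from the rotation.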

Thank You!

Next: Chapter Two
