Lectures on Distributed Systems
Introduction
Distributed computing arose only recently in the brief history of computer systems. Several factors contributed to this. Computers got smaller and cheaper: we can fit more of them in a given space and we can afford to do so. Tens to thousands can fit in a box, whereas in the past only one would fit in a good-sized room. Their price often ranges from less than ten to a few thousand dollars instead of several million dollars. More importantly, computers are faster. Network communication takes computational effort, so a slower computer would spend a greater fraction of its time communicating rather than working on the user's program. Couple this with past CPU performance and cost, and networking just wasn't viable. Finally, interconnect technologies have advanced to the point where it is very easy and inexpensive to connect computers together. Over local area networks, we can expect connectivity in the range of tens of Mbits/sec to a Gbit/sec.

Tanenbaum defines a distributed system as "a collection of independent computers that appear to the users of the system as a single computer." There are two essential points in this definition. The first is the use of the word independent: architecturally, the machines are capable of operating on their own. The second is that the software enables this set of connected machines to appear as a single computer to its users. This is known as the single system image and is a major goal in designing distributed systems that are easy to maintain and operate.
Distribution arises naturally in many settings. A bank, for example, may have many offices, each with one or more computers networked with each other and with other banks. For computer graphics, it makes sense to put the graphics processing at the user's terminal to maximize the bandwidth between the device and processor.

Computer-supported cooperative networking. Users that are geographically separated can now work and play together. Examples of this are electronic whiteboards, distributed document systems, audio/video teleconferencing, email, file transfer, and games such as Doom, Quake, Age of Empires, Duke Nukem, Starcraft, and scores of others.

Increased reliability. If a small percentage of machines break, the rest of the system remains intact and can do useful work.

Incremental growth. A company may buy a computer. Eventually the workload is too great for the machine. The only option then is to replace it with a faster one. Networking instead allows you to add on to an existing infrastructure.

Remote services. Users may need to access information held by others at their systems. Examples include web browsing, remote file access, and programs such as Napster and Gnutella for accessing MP3 music.

Mobility. Users move around with their laptop computers, Palm Pilots, and WAP phones. It is not feasible for them to carry all the information they need with them.

A distributed system has distinct advantages over a set of non-networked smaller computers. Data can be shared dynamically; giving users private copies (via floppy disk, for example) does not work if the data is changing. Peripherals can also be shared. Some peripherals are expensive and/or infrequently used, so it is not justifiable to give each PC its own; these include optical and tape jukeboxes, typesetters, large-format color printers, and expensive drum scanners. Machines themselves can be shared and workload can be distributed among idle machines. Finally, networked machines are useful for supporting person-to-person networking: exchanging email, file transfer, and information access (e.g., the web).

As desirable as they may now be, distributed systems are not without problems. Designing, implementing, and using distributed software may be difficult: issues of creating operating systems and/or languages that support distributed systems arise. The network may lose messages and/or become overloaded, and rewiring the network can be costly and difficult. Security becomes a far greater concern: easy and convenient data access from anywhere creates security problems.
Interconnect
There are different ways in which we can connect CPUs together. The most widely used classification scheme (taxonomy) is that created by Flynn in 1972. It classifies machines by the number of instruction streams and the number of data streams. An instruction stream
refers to the sequence of instructions that the computer processes. Multiple instruction streams means that different instructions can be executed concurrently. Data streams refer to memory operations. Four combinations are possible:

SISD: single instruction stream, single data stream. This is the traditional uniprocessor computer.

SIMD: single instruction stream, multiple data streams. This is an array processor; a single instruction operates on many data units in parallel.

MISD: multiple instruction streams, single data stream. Having multiple concurrent instructions operating on a single data element makes no sense, so this is not a useful category.

MIMD: multiple instruction streams, multiple data streams. This is a broad category covering all forms of machines that contain multiple computers, each with a program counter, program, and data. It covers both parallel and distributed systems.
Since the MIMD category is of particular interest to us, we can divide it into further classifications. Three aspects are of interest to us:

Memory. We refer to machines with shared memory as multiprocessors and to machines without shared memory as multicomputers. A multiprocessor contains a single virtual address space: if one processor writes to a memory location, we expect another processor to read the value from that same location. A multicomputer is a system in which each machine has its own memory and address space.

Interconnection network. Machines can be connected by either a bus or a switched network. On a bus, a single network, bus, or cable connects all machines, and the bandwidth of the interconnect is shared. On a switched network, individual connections exist between machines, guaranteeing the full available bandwidth between them.

Coupling. A tightly-coupled system is one whose components tend to be reliably connected in close proximity. It is characterized by short message delays, high bandwidth, and high total system reliability. A loosely-coupled system is one whose components tend to be distributed: message delays tend to be longer and bandwidth tends to be lower than in tightly-coupled systems, and the expectation is that individual components may fail without affecting the functionality of other components.
Bus-based multiprocessors
In a bus-based system, all CPUs are connected to one bus, along with system memory and peripherals (Figure 1. Bus-based interconnect). If CPU A writes a word to memory and CPU B can read that word back immediately, the memory is coherent. A bus can get overloaded rather quickly with each CPU accessing the bus for all data and instructions. A solution to this is to add cache memory between each CPU and the bus (Figure 2. Bus-based interconnect with cache). The cache holds the most recently accessed regions of memory, so the CPU has to go out to the bus to access main memory only when the regions it needs are not in its cache.

The problem that arises now is that if two CPUs access the same word (or the same region of memory), they each load it into their respective caches and make future references from the cache. Suppose CPU A modifies a memory location. The modification is local to its cache, so when CPU B reads that memory location, it will not get A's modification. One solution to this is to use a write-through cache: any write is written not only to the cache but is also sent on the bus to main memory. Writes now always generate bus traffic, but reads generate it only if the data needed is not cached, and we expect systems to have far more reads than writes. This alone is not sufficient, since other CPUs' caches may still store local copies of data that has now been modified. We can solve this by having every cache monitor the bus. If a cache sees a write to a memory location that it has cached, it either removes the entry from its cache (invalidates it) or updates it with the new data that's on the bus. If it ever needs that region of memory again, it will have to load it from main memory. This is known as a snoopy cache (because it snoops on the bus).
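The write-through/snooping behavior can be summarized in a few lines of code. The sketch below is not from the notes; it is a minimal, illustrative Python simulation in which Bus, Cache, and the attach/invalidate methods are made-up names, and the bus object stands in for both main memory and the broadcast medium that every cache snoops.

class Bus:
    def __init__(self):
        self.memory = {}      # main memory: address -> value
        self.caches = []      # every cache attached here snoops all writes

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, source, addr, value):
        self.memory[addr] = value             # write-through: memory always updated
        for cache in self.caches:
            if cache is not source:
                cache.invalidate(addr)        # snooping caches drop stale copies

    def read(self, addr):
        return self.memory.get(addr, 0)

class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                       # address -> locally cached value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:            # miss: go out on the bus
            self.lines[addr] = self.bus.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value              # update local copy
        self.bus.write(self, addr, value)     # and write through to memory

    def invalidate(self, addr):
        self.lines.pop(addr, None)

bus = Bus()
a, b = Cache(bus), Cache(bus)
b.read(100)          # B caches address 100
a.write(100, 42)     # A's write appears on the bus; B invalidates its copy
print(b.read(100))   # B misses, reloads from memory, and sees 42

The update variant mentioned above would simply replace the invalidation with an overwrite of the snooped value.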
Switched multiprocessors
A bus-based architecture doesn't scale to a large number of CPUs (e.g., 64). Using switches enables us to achieve a far greater CPU density in multiprocessor systems. An m×n crossbar switch is a switch that allows any of m elements to be switched to any of n elements. A crossbar switch contains a crosspoint switch at each switching point in the m×n array, so m×n crosspoint switches are needed (Figure 3. Crossbar interconnect). To use a crossbar switch, we place the CPUs on one axis (e.g., m) and break the memory into a number of chunks which are placed on the second axis (e.g., n memory chunks). There will be a delay only when multiple CPUs try to access the same memory chunk.

A problem with crossbar switches is that they are expensive: connecting n CPUs with n memory modules requires n² crosspoint switches. We'd like an alternative to using this many switches. Reducing the number of switches while maintaining the same connectivity requires increasing the number of switching stages. This results in an omega network built from 2×2 switches (Figure 4), which, for a system of n CPUs and n memory modules, requires log₂n switching stages, each with n/2 switches, for a total of (n log₂n)/2 switches. This is better than n² but can still amount to many switches (see the worked numbers below). As we add more switching stages, the delay increases: with 1,024 CPUs and memories, we have to pass through ten switching stages to get to the memory and through ten more to get back.

To try to avoid these delays, we can use a hierarchical memory access system: each CPU can access its own memory quickly, but accessing another CPU's memory takes longer. This is known as a Non-Uniform Memory Access, or NUMA, architecture. It provides better average access time, but placement of code and data to optimize performance becomes difficult.
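To make the switch counts above concrete, the short Python sketch below (illustrative, not from the notes) computes the n² crosspoints of a crossbar and the (n log₂n)/2 switches of an omega network for a few values of n, including the 1,024-CPU case with its ten stages.

import math

def crossbar_switches(n):
    # n CPUs by n memory modules: one crosspoint per (CPU, memory) pair
    return n * n

def omega_switches(n):
    # log2(n) stages, each containing n/2 two-by-two switches
    stages = int(math.log2(n))
    return stages, (n * stages) // 2

for n in (16, 64, 1024):
    stages, omega = omega_switches(n)
    print(f"n={n}: crossbar={crossbar_switches(n)} crosspoints, "
          f"omega={omega} switches in {stages} stages")

# For n=1024 this prints 1048576 crosspoints versus 5120 switches in 10 stages,
# matching the ten-stage delay mentioned above.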
Bus-based multicomputers
Bus-based multicomputers are easier to design in that we don't need to contend with issues of shared memory: every CPU simply has its own local memory. However, without
shared memory, some other communication mechanism is needed so that processes can communicate and synchronize as needed. The communication network among the machines is a bus (for example, an Ethernet local area network). The traffic requirements are typically far lower than those for memory access, so more systems can be attached to the bus. The bus can be either a system bus or a local area network. Bus-based multicomputers most commonly manifest themselves as a collection of workstations on a local area network.
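As a concrete, hypothetical illustration of such a message-based communication mechanism, the Python sketch below exchanges a request and reply between two processes over TCP sockets, much as two workstations on an Ethernet LAN might; the host name workstation-2 and port 5000 are placeholders.

import socket

def server(port=5000):
    # runs on one workstation and answers one request
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", port))
        s.listen(1)
        conn, _ = s.accept()
        with conn:
            request = conn.recv(1024)              # receive a request message
            conn.sendall(b"reply to " + request)   # send a reply back

def client(host="workstation-2", port=5000):
    # runs on another workstation; no shared memory, so it must ask by message
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(b"read block 7")
        print(s.recv(1024))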
Switched multicomputers
In a switched multicomputer system, each CPU still has its own local memory, but it also has a switched interconnect to its neighbors. Common arrangements are a grid, cube, or hypercube network. Only nearest neighbors are connected in this network; messages to other machines require multiple hops.
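As an illustration of hop counts (not part of the original notes), a hypercube is conventionally labeled so that neighboring nodes differ in exactly one bit of their binary addresses, which makes the minimum number of hops between two nodes the number of bits in which their labels differ (the Hamming distance).

def hypercube_hops(src, dst):
    # neighbours differ in exactly one bit, so the minimum route length
    # is the number of differing bits between the two node labels
    return bin(src ^ dst).count("1")

# In a 4-dimensional hypercube (16 nodes), a message from node 0b0000 to
# node 0b1011 needs 3 hops, one per differing bit.
print(hypercube_hops(0b0000, 0b1011))  # 3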
Software issues
The software design goal in building a distributed system is to create a single system image: to have a collection of independent computers appear as a single system to the user(s). By single system, we mean that the user is not aware of the presence of multiple computers or of distribution.

In discussing software for distributed systems, it makes sense to distinguish loosely-coupled from tightly-coupled software. While this is a continuum without clear demarcation, by loosely-coupled we refer to software in which the systems interact with each other only to a limited extent as needed. For the most part, they operate as fully-functioning stand-alone machines; if the network goes down, things are pretty much functional. Loosely-coupled systems may be ones in which there are shared devices or services (parts of file service, web service). With tightly-coupled software, there is a strong dependence on other machines for all aspects of the system: essentially, both the interconnect and the functioning of the remote systems are necessary for the local system's operation.

The most common distributed systems today are those with loosely-coupled software and loosely-coupled hardware. The quintessential example is that of workstations (each with its own CPU and operating system) on a LAN. Interaction is often primitive, explicit interaction, with programs such as rcp and rlogin. File servers may also be present, which accept requests for files and provide the data. There is a high degree of autonomy and few system-wide requirements.

The next step in building distributed systems is placing tightly-coupled software on loosely-coupled hardware. With this structure we attempt to make a network of machines appear as one single timesharing system, realizing the single system image. Users should not be aware of the fact that the machine is distributed and contains multiple CPUs. If we succeed in this, we will have a true distributed system. To accomplish this, we need certain capabilities: a single global IPC mechanism (any process should be able to talk to any other process in the same manner, whether it is local or remote); a global protection scheme; uniform naming from anywhere, so the file system looks the same everywhere; and the same system call interface everywhere. The kernel on each machine remains responsible for controlling its own resources (such as doing its own memory management and paging).

Multiprocessor time-sharing systems employing tightly-coupled hardware and software are rather common. Since memory is shared, all operating system structures can be shared. In fact, as long as critical sections are properly taken care of, a traditional uniprocessor system does not need a great deal of modification. A single run queue is employed amongst all the processors: when a CPU is ready to call the scheduler, it accesses the single run queue (exclusively, of course), as sketched below. The file system interface can remain as is (with a shared buffer cache), as can the system call interface (traps).
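A minimal sketch of such a shared run queue, assuming the shared memory and a locking primitive are already available; the names below are illustrative and not taken from any particular kernel.

from collections import deque
from threading import Lock

run_queue = deque()        # one global queue of runnable processes
run_queue_lock = Lock()    # critical section protecting the shared queue

def enqueue(process):
    # any CPU can make a process runnable
    with run_queue_lock:
        run_queue.append(process)

def schedule_next():
    # called by whichever CPU is ready to run the scheduler;
    # the lock ensures exclusive access to the single run queue
    with run_queue_lock:
        return run_queue.popleft() if run_queue else None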
Design issues
There are a number of issues with which a designer of a distributed system has to contend. Tanenbaum enumerates them:

Transparency. At the high levels, transparency means hiding distribution from the users. At the low levels, transparency means hiding the distribution from the programs. There are several forms of transparency:
Location transparency: users don't care where the resources are located.
Migration transparency: resources may move at will.
Replication transparency: users cannot tell whether there are multiple copies of the same resource.
Concurrency transparency: users share resources transparently with each other without interference.
Parallelism transparency: operations can take place in parallel without the users knowing.

Flexibility. It should be easy to develop distributed systems. One popular approach is the use of a microkernel. A microkernel is a departure from monolithic operating systems that try to handle all system requests; instead, it supports only the very basic operations: IPC, some memory management, a small amount of process management, and low-level I/O. All else is performed by user-level servers.

Reliability. We strive to build highly reliable and highly available systems. Availability is the fraction of time that a system is usable; we can achieve it through redundancy and by not requiring the simultaneous functioning of a large number of components (see the sketch below). Reliability encompasses a few other factors as well: data must not get lost, the system must be secure, and the system must be fault tolerant.

Performance. We have to understand the environment in which the system will operate: the communication links may be slow and affect network performance. If we exploit parallelism, it may be at a fine grain (within a procedure, array operations, etc.) or a coarse grain (procedure level, service level).

Scalability. We'd like a distributed system to scale indefinitely. This generally won't be possible, but the extent of scalability will always be a consideration. In evaluating algorithms, we'd like to consider distributable algorithms versus centralized ones.
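To illustrate the point about availability and redundancy, the short example below is not from the notes; it assumes components fail independently and that each is available a fraction p of the time.

def any_of(p, n):
    # system is up if at least one of n redundant replicas is up
    return 1 - (1 - p) ** n

def all_of(p, n):
    # system is up only if all n components are up simultaneously
    return p ** n

p = 0.95
print(any_of(p, 3))   # ~0.999875: three replicas of a 95%-available service
print(all_of(p, 3))   # ~0.857:    requiring all three at once is worse than one

Redundancy multiplies the unavailability of the replicas together, while requiring many components to function simultaneously multiplies their availabilities, which is why the second figure drops as components are added.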