Lectures on Distributed Systems
Introduction
Distributed computing arose only recently in the brief history of computer systems. Several factors contributed to this. Computers got smaller and cheaper: we can fit more of them in a given space and we can afford to do so. Tens to thousands can fit in a box, whereas in the past only one would fit in a good-sized room. Their price often ranges from less than ten to a few thousand dollars instead of several million dollars. More importantly, computers are faster. Network communication takes computational effort, so a slower computer would spend a greater fraction of its time communicating rather than working on the user's program. Couple this with past CPU performance and cost, and networking just wasn't viable. Finally, interconnect technologies have advanced to the point where it is very easy and inexpensive to connect computers together. Over local area networks, we can expect connectivity in the range of tens of Mbits/sec to a Gbit/sec.

Tanenbaum defines a distributed system as "a collection of independent computers that appear to the users of the system as a single computer." There are two essential points in this definition. The first is the use of the word independent: architecturally, the machines are capable of operating on their own. The second is that the software enables this set of connected machines to appear as a single computer to its users. This is known as the single system image and is a major goal in designing distributed systems that are easy to maintain and operate.
Distribution arises naturally in many settings. A bank, for example, may have many offices, each with one or more computers networked with each other and with other banks. For computer graphics, it makes sense to put the graphics processing at the user's terminal to maximize the bandwidth between the device and processor.

Computer-supported cooperative networking. Users that are geographically separated can now work and play together. Examples of this are electronic whiteboards, distributed document systems, audio/video teleconferencing, email, file transfer, and games such as Doom, Quake, Age of Empires, Duke Nukem, Starcraft, and scores of others.

Increased reliability. If a small percentage of machines break, the rest of the system remains intact and can do useful work.

Incremental growth. A company may buy a computer. Eventually the workload is too great for the machine. The only option then is to replace it with a faster one. Networking instead allows you to add on to an existing infrastructure.

Remote services. Users may need to access information held by others at their systems. Examples include web browsing, remote file access, and programs such as Napster and Gnutella for accessing MP3 music.

Mobility. Users move around with their laptop computers, Palm Pilots, and WAP phones. It is not feasible for them to carry all the information they need with them.

A distributed system has distinct advantages over a set of non-networked smaller computers. Data can be shared dynamically; giving users private copies (via floppy disk, for example) does not work if the data is changing. Peripherals can also be shared. Some peripherals are expensive and/or infrequently used, so it is not justifiable to give each PC its own; these include optical and tape jukeboxes, typesetters, large-format color printers, and expensive drum scanners. Machines themselves can be shared and workload can be distributed among idle machines. Finally, networked machines are useful for supporting person-to-person networking: exchanging email, file transfer, and information access (e.g., the web).

As desirable as they may now be, distributed systems are not without problems. Designing, implementing, and using distributed software may be difficult: issues of creating operating systems and/or languages that support distributed systems arise. The network may lose messages and/or become overloaded, and rewiring the network can be costly and difficult. Security becomes a far greater concern: easy and convenient data access from anywhere creates security problems.
Interconnect
There are different ways in which we can connect CPUs together. The most widely used classification scheme (taxonomy) is that created by Flynn in 1972. It classifies machines by the number of instruction streams and the number of data streams. An instruction stream
refers to the sequence of instructions that the computer processes. Multiple instruction streams means that different instructions can be executed concurrently. Data streams refer to memory operations. Four combinations are possible:

SISD: single instruction stream, single data stream. This is the traditional uniprocessor computer.

SIMD: single instruction stream, multiple data streams. This is an array processor; a single instruction operates on many data units in parallel.

MISD: multiple instruction streams, single data stream. Having multiple concurrent instructions operating on a single data element makes no sense, so this is not a useful category.

MIMD: multiple instruction streams, multiple data streams. This is a broad category covering all forms of machines that contain multiple computers, each with a program counter, program, and data. It covers both parallel and distributed systems.
Since the MIMD category is of particular interest to us, we can divide it into further classifications. Three aspects are of interest to us:

Memory. We refer to machines with shared memory as multiprocessors and to machines without shared memory as multicomputers. A multiprocessor contains a single virtual address space: if one processor writes to a memory location, we expect another processor to read the value from that same location. A multicomputer is a system in which each machine has its own memory and address space.

Interconnection network. Machines can be connected by either a bus or a switched network. On a bus, a single network, bus, or cable connects all machines, and the bandwidth of the interconnect is shared. On a switched network, individual connections exist between machines, guaranteeing the full available bandwidth between them.

Coupling. A tightly-coupled system is one whose components tend to be reliably connected in close proximity. It is characterized by short message delays, high bandwidth, and high total system reliability. A loosely-coupled system is one whose components tend to be distributed: message delays tend to be longer and bandwidth tends to be lower than in tightly-coupled systems, and the expectation is that individual components may fail without affecting the functionality of other components.
Bus-based multiprocessors
In a bus-based system, all CPUs are connected to one bus, along with system memory and peripherals (Figure 1. Bus-based interconnect). If CPU A writes a word to memory and CPU B can read that word back immediately, the memory is coherent. A bus can get overloaded rather quickly with each CPU accessing the bus for all data and instructions. A solution to this is to add cache memory between each CPU and the bus (Figure 2. Bus-based interconnect with cache). The cache holds the most recently accessed regions of memory, so the CPU has to go out to the bus to access main memory only when the regions it needs are not in its cache.

The problem that arises now is that if two CPUs access the same word (or the same region of memory), they each load it into their respective caches and make future references from the cache. Suppose CPU A modifies a memory location. The modification is local to its cache, so when CPU B reads that memory location, it will not get A's modification. One solution to this is to use a write-through cache: any write is written not only to the cache but is also sent on the bus to main memory. Writes now always generate bus traffic, but reads generate it only if the data needed is not cached, and we expect systems to have far more reads than writes. This alone is not sufficient, since other CPUs' caches may still store local copies of data that has now been modified. We can solve this by having every cache monitor the bus. If a cache sees a write to a memory location that it has cached, it either removes the entry from its cache (invalidates it) or updates it with the new data that's on the bus. If it ever needs that region of memory again, it will have to load it from main memory. This is known as a snoopy cache (because it snoops on the bus).
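The write-through/snooping behavior can be summarized in a few lines of code. The sketch below is not from the notes; it is a minimal, illustrative Python simulation in which Bus, Cache, and the attach/invalidate methods are made-up names, and the bus object stands in for both main memory and the broadcast medium that every cache snoops.

class Bus:
    def __init__(self):
        self.memory = {}      # main memory: address -> value
        self.caches = []      # every cache attached here snoops all writes

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, source, addr, value):
        self.memory[addr] = value             # write-through: memory always updated
        for cache in self.caches:
            if cache is not source:
                cache.invalidate(addr)        # snooping caches drop stale copies

    def read(self, addr):
        return self.memory.get(addr, 0)

class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                       # address -> locally cached value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:            # miss: go out on the bus
            self.lines[addr] = self.bus.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value              # update local copy
        self.bus.write(self, addr, value)     # and write through to memory

    def invalidate(self, addr):
        self.lines.pop(addr, None)

bus = Bus()
a, b = Cache(bus), Cache(bus)
b.read(100)          # B caches address 100
a.write(100, 42)     # A's write appears on the bus; B invalidates its copy
print(b.read(100))   # B misses, reloads from memory, and sees 42

The update variant mentioned above would simply replace the invalidation with an overwrite of the snooped value.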
Switched multiprocessors
A bus-based architecture doesn't scale to a large number of CPUs (e.g., 64). Using switches enables us to achieve a far greater CPU density in multiprocessor systems. An m×n crossbar switch is a switch that allows any of m elements to be switched to any of n elements. A crossbar switch contains a crosspoint switch at each switching point in the m×n array, so m×n crosspoint switches are needed (Figure 3. Crossbar interconnect). To use a crossbar switch, we place the CPUs on one axis (e.g., m) and break the memory into a number of chunks which are placed on the second axis (e.g., n memory chunks). There will be a delay only when multiple CPUs try to access the same memory chunk.

A problem with crossbar switches is that they are expensive: connecting n CPUs with n memory modules requires n² crosspoint switches. We'd like an alternative to using this many switches. Reducing the number of switches while maintaining the same connectivity requires increasing the number of switching stages. This results in an omega network built from 2×2 switches (Figure 4), which, for a system of n CPUs and n memory modules, requires log₂n switching stages, each with n/2 switches, for a total of (n log₂n)/2 switches. This is better than n² but can still amount to many switches (see the worked numbers below). As we add more switching stages, the delay increases: with 1,024 CPUs and memories, we have to pass through ten switching stages to get to the memory and through ten more to get back.

To try to avoid these delays, we can use a hierarchical memory access system: each CPU can access its own memory quickly, but accessing another CPU's memory takes longer. This is known as a Non-Uniform Memory Access, or NUMA, architecture. It provides better average access time, but placement of code and data to optimize performance becomes difficult.
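To make the switch counts above concrete, the short Python sketch below (illustrative, not from the notes) computes the n² crosspoints of a crossbar and the (n log₂n)/2 switches of an omega network for a few values of n, including the 1,024-CPU case with its ten stages.

import math

def crossbar_switches(n):
    # n CPUs by n memory modules: one crosspoint per (CPU, memory) pair
    return n * n

def omega_switches(n):
    # log2(n) stages, each containing n/2 two-by-two switches
    stages = int(math.log2(n))
    return stages, (n * stages) // 2

for n in (16, 64, 1024):
    stages, omega = omega_switches(n)
    print(f"n={n}: crossbar={crossbar_switches(n)} crosspoints, "
          f"omega={omega} switches in {stages} stages")

# For n=1024 this prints 1048576 crosspoints versus 5120 switches in 10 stages,
# matching the ten-stage delay mentioned above.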
Bus-based multicomputers
Bus-based multicomputers are easier to design in that we don't need to contend with issues of shared memory: every CPU simply has its own local memory. However, without
shared memory, some other communication mechanism is needed so that processes can communicate and synchronize as needed. The communication network among the machines is a bus (for example, an Ethernet local area network). The traffic requirements are typically far lower than those for memory access, so more systems can be attached to the bus. The bus can be either a system bus or a local area network. Bus-based multicomputers most commonly manifest themselves as a collection of workstations on a local area network.
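As a concrete, hypothetical illustration of such a message-based communication mechanism, the Python sketch below exchanges a request and reply between two processes over TCP sockets, much as two workstations on an Ethernet LAN might; the host name workstation-2 and port 5000 are placeholders.

import socket

def server(port=5000):
    # runs on one workstation and answers one request
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", port))
        s.listen(1)
        conn, _ = s.accept()
        with conn:
            request = conn.recv(1024)              # receive a request message
            conn.sendall(b"reply to " + request)   # send a reply back

def client(host="workstation-2", port=5000):
    # runs on another workstation; no shared memory, so it must ask by message
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(b"read block 7")
        print(s.recv(1024))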
Switched multicomputers
In a switched multicomputer system, each CPU still has its own local memory, but it also has a switched interconnect to its neighbors. Common arrangements are a grid, cube, or hypercube network. Only nearest neighbors are connected in this network; messages to other machines require multiple hops.
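As an illustration of hop counts (not part of the original notes), a hypercube is conventionally labeled so that neighboring nodes differ in exactly one bit of their binary addresses, which makes the minimum number of hops between two nodes the number of bits in which their labels differ (the Hamming distance).

def hypercube_hops(src, dst):
    # neighbours differ in exactly one bit, so the minimum route length
    # is the number of differing bits between the two node labels
    return bin(src ^ dst).count("1")

# In a 4-dimensional hypercube (16 nodes), a message from node 0b0000 to
# node 0b1011 needs 3 hops, one per differing bit.
print(hypercube_hops(0b0000, 0b1011))  # 3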
Software issues
The software design goal in building a distributed system is to create a single system image: to have a collection of independent computers appear as a single system to the user(s). By single system, we mean that the user is not aware of the presence of multiple computers or of distribution.

In discussing software for distributed systems, it makes sense to distinguish loosely-coupled from tightly-coupled software. While this is a continuum without clear demarcation, by loosely-coupled we refer to software in which the systems interact with each other only to a limited extent as needed. For the most part, they operate as fully-functioning stand-alone machines; if the network goes down, things are pretty much functional. Loosely-coupled systems may be ones in which there are shared devices or services (parts of file service, web service). With tightly-coupled software, there is a strong dependence on other machines for all aspects of the system: essentially, both the interconnect and the functioning of the remote systems are necessary for the local system's operation.

The most common distributed systems today are those with loosely-coupled software and loosely-coupled hardware. The quintessential example is that of workstations (each with its own CPU and operating system) on a LAN. Interaction is often primitive, explicit interaction, with programs such as rcp and rlogin. File servers may also be present, which accept requests for files and provide the data. There is a high degree of autonomy and few system-wide requirements.

The next step in building distributed systems is placing tightly-coupled software on loosely-coupled hardware. With this structure we attempt to make a network of machines appear as one single timesharing system, realizing the single system image. Users should not be aware of the fact that the machine is distributed and contains multiple CPUs. If we succeed in this, we will have a true distributed system. To accomplish this, we need certain capabilities: a single global IPC mechanism (any process should be able to talk to any other process in the same manner, whether it is local or remote); a global protection scheme; uniform naming from anywhere, so the file system looks the same everywhere; and the same system call interface everywhere. The kernel on each machine remains responsible for controlling its own resources (such as doing its own memory management and paging).

Multiprocessor time-sharing systems employing tightly-coupled hardware and software are rather common. Since memory is shared, all operating system structures can be shared. In fact, as long as critical sections are properly taken care of, a traditional uniprocessor system does not need a great deal of modification. A single run queue is employed amongst all the processors: when a CPU is ready to call the scheduler, it accesses the single run queue (exclusively, of course), as sketched below. The file system interface can remain as is (with a shared buffer cache), as can the system call interface (traps).
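A minimal sketch of such a shared run queue, assuming the shared memory and a locking primitive are already available; the names below are illustrative and not taken from any particular kernel.

from collections import deque
from threading import Lock

run_queue = deque()        # one global queue of runnable processes
run_queue_lock = Lock()    # critical section protecting the shared queue

def enqueue(process):
    # any CPU can make a process runnable
    with run_queue_lock:
        run_queue.append(process)

def schedule_next():
    # called by whichever CPU is ready to run the scheduler;
    # the lock ensures exclusive access to the single run queue
    with run_queue_lock:
        return run_queue.popleft() if run_queue else None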
Design issues
There are a number of issues with which a designer of a distributed system has to contend. Tanenbaum enumerates them:

Transparency. At the high levels, transparency means hiding distribution from the users. At the low levels, transparency means hiding the distribution from the programs. There are several forms of transparency:
Location transparency: users don't care where the resources are located.
Migration transparency: resources may move at will.
Replication transparency: users cannot tell whether there are multiple copies of the same resource.
Concurrency transparency: users share resources transparently with each other without interference.
Parallelism transparency: operations can take place in parallel without the users knowing.

Flexibility. It should be easy to develop distributed systems. One popular approach is the use of a microkernel. A microkernel is a departure from monolithic operating systems that try to handle all system requests; instead, it supports only the very basic operations: IPC, some memory management, a small amount of process management, and low-level I/O. All else is performed by user-level servers.

Reliability. We strive to build highly reliable and highly available systems. Availability is the fraction of time that a system is usable; we can achieve it through redundancy and by not requiring the simultaneous functioning of a large number of components (see the sketch below). Reliability encompasses a few other factors as well: data must not get lost, the system must be secure, and the system must be fault tolerant.

Performance. We have to understand the environment in which the system will operate: the communication links may be slow and affect network performance. If we exploit parallelism, it may be at a fine grain (within a procedure, array operations, etc.) or a coarse grain (procedure level, service level).

Scalability. We'd like a distributed system to scale indefinitely. This generally won't be possible, but the extent of scalability will always be a consideration. In evaluating algorithms, we'd like to consider distributable algorithms versus centralized ones.
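To illustrate the point about availability and redundancy, the short example below is not from the notes; it assumes components fail independently and that each is available a fraction p of the time.

def any_of(p, n):
    # system is up if at least one of n redundant replicas is up
    return 1 - (1 - p) ** n

def all_of(p, n):
    # system is up only if all n components are up simultaneously
    return p ** n

p = 0.95
print(any_of(p, 3))   # ~0.999875: three replicas of a 95%-available service
print(all_of(p, 3))   # ~0.857:    requiring all three at once is worse than one

Redundancy multiplies the unavailability of the replicas together, while requiring many components to function simultaneously multiplies their availabilities, which is why the second figure drops as components are added.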