CHAPTER NINE  Pipeline and Vector Processing

IN THIS CHAPTER
9-1 Parallel Processing
9-2 Pipelining
9-3 Arithmetic Pipeline
9-4 Instruction Pipeline
9-5 RISC Pipeline
9-6 Vector Processing
9-7 Array Processors

9-1 Parallel Processing

Parallel processing is a term used to denote a large class of techniques that are used to provide simultaneous data-processing tasks for the purpose of increasing the computational speed of a computer system. Instead of processing each instruction sequentially as in a conventional computer, a parallel processing system is able to perform concurrent data processing to achieve faster execution time. For example, while an instruction is being executed in the ALU, the next instruction can be read from memory. The system may have two or more ALUs and be able to execute two or more instructions at the same time. Furthermore, the system may have two or more processors operating concurrently. The purpose of parallel processing is to speed up the computer processing capability and increase its throughput, that is, the amount of processing that can be accomplished during a given interval of time. The amount of hardware increases with parallel processing, and with it, the cost of the system increases. However, technological developments have reduced hardware costs to the point where parallel processing techniques are economically feasible.

Parallel processing can be viewed from various levels of complexity. At the lowest level, we distinguish between parallel and serial operations by the type of registers used. Shift registers operate in serial fashion one bit at a time, while registers with parallel load operate with all the bits of the word simultaneously. Parallel processing at a higher level of complexity can be achieved by having a multiplicity of functional units that perform identical or different operations simultaneously.
Parallel processing is established by distributing the data among the multiple functional units. For example, the arithmetic, logic, and shift operations can be separated into three units and the operands diverted to each unit under the supervision of a control unit. Figure 9-1 shows one possible way of separating the execution unit into eight functional units operating in parallel. The operands in the registers are applied to one of the units depending on the operation specified by the instruction associated with the operands.

[Figure 9-1: Processor with multiple functional units. The processor registers, connected to memory, feed eight parallel units: adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating-point add-subtract, floating-point multiply, and floating-point divide.]

The operation performed in each functional unit is indicated in each block of the diagram. The adder and integer multiplier perform the arithmetic operations with integer numbers. The floating-point operations are separated into three circuits operating in parallel. The logic, shift, and increment operations can be performed concurrently on different data. All units are independent of each other, so one number can be shifted while another number is being incremented. A multifunctional organization is usually associated with a complex control unit to coordinate all the activities among the various components.

There are a variety of ways that parallel processing can be classified. It can be considered from the internal organization of the processors, from the interconnection structure between processors, or from the flow of information through the system. One classification introduced by M. J. Flynn considers the organization of a computer system by the number of instructions and data items that are manipulated simultaneously.
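The routing role of the control unit described above can be sketched in a few lines of code. This is an illustrative model only, not from the text: the unit names follow the blocks of Fig. 9-1, but the function signatures and the dispatch-table idea are my own.

```python
# Sketch: a "control unit" routing each operation to one of the
# independent functional units of Fig. 9-1 (names illustrative).
def adder_subtractor(a, b):
    return a + b

def integer_multiply(a, b):
    return a * b

def logic_unit(a, b):
    return a & b            # one representative logic operation

def shift_unit(a, amount):
    return a << amount

def incrementer(a, _unused=0):
    return a + 1

# The control unit as a dispatch table: the operation specified by
# the instruction selects which functional unit receives the operands.
FUNCTIONAL_UNITS = {
    "add": adder_subtractor,
    "mul": integer_multiply,
    "and": logic_unit,
    "shl": shift_unit,
    "inc": incrementer,
}

def execute(op, a, b=0):
    """Divert the operands to the unit selected by the opcode."""
    return FUNCTIONAL_UNITS[op](a, b)
```

Because the units are independent, a real machine could shift one number while incrementing another in the same cycle; the sketch only shows the routing, not the concurrency.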
The normal operation of a computer is to fetch instructions from memory and execute them in the processor. The sequence of instructions read from memory constitutes an instruction stream. The operations performed on the data in the processor constitute a data stream. Parallel processing may occur in the instruction stream, in the data stream, or in both. Flynn's classification divides computers into four major groups as follows:

Single instruction stream, single data stream (SISD)
Single instruction stream, multiple data stream (SIMD)
Multiple instruction stream, single data stream (MISD)
Multiple instruction stream, multiple data stream (MIMD)

SISD represents the organization of a single computer containing a control unit, a processor unit, and a memory unit. Instructions are executed sequentially and the system may or may not have internal parallel processing capabilities. Parallel processing in this case may be achieved by means of multiple functional units or by pipeline processing. SIMD represents an organization that includes many processing units under the supervision of a common control unit. All processors receive the same instruction from the control unit but operate on different items of data. The shared memory unit must contain multiple modules so that it can communicate with all the processors simultaneously. MISD structure is only of theoretical interest since no practical system has been constructed using this organization. MIMD organization refers to a computer system capable of processing several programs at the same time. Most multiprocessor and multicomputer systems can be classified in this category.

Flynn's classification depends on the distinction between the performance of the control unit and the data-processing unit. It emphasizes the behavioral characteristics of the computer system rather than its operational and structural interconnections.
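The SISD/SIMD distinction above can be modeled in a few lines. This is a behavioral sketch of my own, not from the text: a list comprehension stands in for the lockstep processing units of a SIMD machine, so both models produce the same results and differ only in the execution pattern they represent.

```python
# Behavioral sketch (illustrative, not from the text) of two of
# Flynn's categories.

def sisd(instruction, data):
    """SISD: one instruction stream applied to one data item at a time,
    strictly sequentially."""
    results = []
    for item in data:
        results.append(instruction(item))
    return results

def simd(instruction, data):
    """SIMD: the control unit broadcasts one instruction and every
    processing unit applies it to its own data item. The comprehension
    models the lockstep units operating on different items of data."""
    return [instruction(item) for item in data]

double = lambda x: 2 * x
# Same instruction stream, same data stream contents -> same results;
# only the execution organization differs.
assert sisd(double, [1, 2, 3]) == simd(double, [1, 2, 3]) == [2, 4, 6]
```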
One type of parallel processing that does not fit Flynn's classification is pipelining. The only two categories used from this classification are SIMD array processors, discussed in Sec. 9-7, and MIMD multiprocessors, presented in Chap. 13. In this chapter we consider parallel processing under the following main topics:

1. Pipeline processing
2. Vector processing
3. Array processors

Pipeline processing is an implementation technique where arithmetic suboperations or the phases of a computer instruction cycle overlap in execution. Vector processing deals with computations involving large vectors and matrices. Array processors perform computations on large arrays of data.

9-2 Pipelining

Pipelining is a technique of decomposing a sequential process into suboperations, with each subprocess being executed in a special dedicated segment that operates concurrently with all other segments. A pipeline can be visualized as a collection of processing segments through which binary information flows. Each segment performs partial processing dictated by the way the task is partitioned. The result obtained from the computation in each segment is transferred to the next segment in the pipeline. The final result is obtained after the data have passed through all segments. The name "pipeline" implies a flow of information analogous to an industrial assembly line. It is characteristic of pipelines that several computations can be in progress in distinct segments at the same time. The overlapping of computation is made possible by associating a register with each segment in the pipeline. The registers provide isolation between each segment so that each can operate on distinct data simultaneously.

Perhaps the simplest way of viewing the pipeline structure is to imagine that each segment consists of an input register followed by a combinational circuit. The register holds the data and the combinational circuit performs the suboperation in the particular segment.
The output of the combinational circuit in a given segment is applied to the input register of the next segment. A clock is applied to all registers after enough time has elapsed to perform all segment activity. In this way the information flows through the pipeline one step at a time.

The pipeline organization will be demonstrated by means of a simple example. Suppose that we want to perform the combined multiply and add operations with a stream of numbers:

    Ai * Bi + Ci    for i = 1, 2, 3, ..., 7

Each suboperation is to be implemented in a segment within a pipeline. Each segment has one or two registers and a combinational circuit as shown in Fig. 9-2. R1 through R5 are registers that receive new data with every clock pulse. The multiplier and adder are combinational circuits. The suboperations performed in each segment of the pipeline are as follows:

    R1 <- Ai, R2 <- Bi         Input Ai and Bi
    R3 <- R1 * R2, R4 <- Ci    Multiply and input Ci
    R5 <- R3 + R4              Add Ci to product

The five registers are loaded with new data every clock pulse. The effect of each clock is shown in Table 9-1. The first clock pulse transfers A1 and B1 into R1 and R2.

[Figure 9-2: Example of pipeline processing. Ai and Bi are loaded into registers R1 and R2, which feed a multiplier; the product is latched in R3 while Ci is latched in R4, and an adder combines R3 and R4 into R5.]
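The clock-by-clock behavior of this pipeline can be simulated directly. The sketch below follows the register transfers listed above (R1 through R5, with all registers clocked simultaneously, as tabulated in Table 9-1); the function name and the simultaneous-update idiom are my own. With n operand triples, the pipeline needs n + 2 clock pulses: two extra pulses drain the last item through segments 2 and 3.

```python
def pipeline(A, B, C):
    """Simulate the 3-segment multiply-add pipeline of Fig. 9-2,
    computing Ai * Bi + Ci for each operand triple (a sketch)."""
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None   # empty registers before clock 1
    results = []
    for clock in range(1, n + 3):    # n items + 2 pulses to drain
        # Next state is computed entirely from the CURRENT register
        # contents: all five registers are clocked simultaneously.
        next_R5 = R3 + R4 if R3 is not None else None          # segment 3
        next_R3 = R1 * R2 if R1 is not None else None          # segment 2
        next_R4 = C[clock - 2] if 2 <= clock <= n + 1 else None
        next_R1 = A[clock - 1] if clock <= n else None         # segment 1
        next_R2 = B[clock - 1] if clock <= n else None
        R1, R2, R3, R4, R5 = next_R1, next_R2, next_R3, next_R4, next_R5
        if R5 is not None:           # a finished result leaves segment 3
            results.append(R5)
    return results

A, B, C = [1, 2, 3], [4, 5, 6], [7, 8, 9]
assert pipeline(A, B, C) == [a * b + c for a, b, c in zip(A, B, C)]
```

The first result emerges on clock pulse 3, and one new result appears on every pulse thereafter, which is exactly the overlapped behavior Table 9-1 tabulates for the seven-item stream.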
