Unit 5: Processor Organization

-S.R.Milke
The Indirect Cycle
• The execution of an instruction may involve one or more operands in memory, each of which requires a memory access.
• Further, if indirect addressing is used, then additional memory accesses are required.
• We can think of the fetching of indirect addresses as one more instruction stage.
• The main line of activity consists of alternating instruction fetch and instruction execution activities.
• After an instruction is fetched, it is examined to determine whether any indirect addressing is involved.
• If so, the required operands are fetched using indirect addressing; a small sketch of the extra memory access this implies follows below.
• Following execution, an interrupt may be processed before the next instruction fetch.
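To make the extra cost of the indirect cycle concrete, here is a minimal sketch in Python (not from the slides: the dictionary standing in for memory, the addresses, and the values are all invented). It only counts how many memory accesses are needed to obtain an operand with and without indirect addressing.

memory = {100: 250, 250: 7}   # address 100 holds a pointer to address 250; 250 holds the operand

def fetch_operand(address, indirect):
    """Return (operand, number_of_memory_accesses)."""
    accesses = 0
    if indirect:
        address = memory[address]   # extra access: read the operand's address first
        accesses += 1
    operand = memory[address]       # read the operand itself
    accesses += 1
    return operand, accesses

print(fetch_operand(250, indirect=False))  # (7, 1)  direct addressing: one access
print(fetch_operand(100, indirect=True))   # (7, 2)  indirect addressing: one extra access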
Instruction Cycle State Diagram
Data Flow
• The exact sequence of events during an instruction cycle depends on the design of the processor.
• We can, however, indicate in general terms what must happen.
• Let us assume a processor that employs a memory address register (MAR), a memory buffer register (MBR), a program counter (PC), and an instruction register (IR); the data flow of the fetch cycle with these registers is sketched below.
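As a minimal sketch of that data flow, the following assumes the standard fetch cycle (PC to MAR, memory read into MBR, MBR to IR, then PC incremented); the toy instruction memory and its contents are invented for illustration.

memory = {0: "LOAD R1, 300", 1: "ADD R1, R2"}   # toy instruction memory
PC, MAR, MBR, IR = 0, None, None, None

def fetch():
    global PC, MAR, MBR, IR
    MAR = PC             # PC -> MAR: address of the next instruction
    MBR = memory[MAR]    # memory read: the instruction lands in the MBR
    IR = MBR             # MBR -> IR: the instruction is ready to be decoded
    PC = PC + 1          # PC is incremented for the following fetch

fetch()
print(IR, PC)            # prints: LOAD R1, 300 1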
System Attributes to Performance
Clock Rate and CPI
Execution Time (CPU Time)
System Attributes
MIPS Rate
Throughput Rate (Performance)
Instruction Types and CPI
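The slide titles above refer to the standard performance relations without showing them; as a hedged reminder, CPU time = instruction count * CPI / clock rate and MIPS rate = clock rate / (CPI * 10^6). The short sketch below just evaluates these for invented example values.

instruction_count = 2_000_000    # Ic: dynamic instruction count (invented)
cpi = 2.0                        # average clock cycles per instruction (invented)
clock_rate = 500e6               # f: 500 MHz clock (invented)

cpu_time = instruction_count * cpi / clock_rate   # seconds
mips_rate = clock_rate / (cpi * 1e6)              # millions of instructions per second

print(f"CPU time = {cpu_time * 1e3:.1f} ms, MIPS rate = {mips_rate:.0f}")
# CPU time = 8.0 ms, MIPS rate = 250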
Example 1
Example 2
Consider a non-pipelined processor with a clock rate of 2.5 gigahertz and
average cycles per instruction of four. The same processor is upgraded to
a pipelined processor with five stages; but due to the internal pipeline
delay, the clock speed is reduced to 2 gigahertz. Assume that there are
no stalls in the pipeline. The speedup achieved in this pipelined processor is __________.
(A) 3.2
(B) 3.0
(C) 2.2
(D) 2.0
Speedup = ExecutionTimeOld / ExecutionTimeNew

ExecutionTimeOld = CPIold * CycleTimeOld = 4 * (1/2.5) ns = 1.6 ns

Since there are no stalls, CPInew can be assumed to be 1 on average.

ExecutionTimeNew = CPInew * CycleTimeNew = 1 * (1/2) ns = 0.5 ns

Speedup = 1.6 / 0.5 = 3.2, so the correct option is (A).
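A short re-check of the arithmetic in this example (a sketch only; the numbers are the ones given above):

cpi_old, clock_old_ghz = 4, 2.5      # non-pipelined: CPI = 4 at 2.5 GHz
cpi_new, clock_new_ghz = 1, 2.0      # pipelined, no stalls: CPI of 1 at 2 GHz

time_old = cpi_old / clock_old_ghz   # 1.6 ns per instruction
time_new = cpi_new / clock_new_ghz   # 0.5 ns per instruction
print(time_old / time_new)           # 3.2 -> option (A)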


Parallelism
• Computer architects are constantly striving to improve the performance of the machines they design.
• Making the chips run faster by increasing their clock speed is one way. However, most computer architects look to parallelism (doing two or more things at once) as a way to get even more performance for a given clock speed.
• Parallelism comes in two general forms:
1) instruction-level parallelism, and
2) processor-level parallelism.
Instruction-Level Parallelism
• Parallelism is exploited within individual
instructions to get more instructions/sec out
of the machine.
• We will consider two approaches:
– Pipelining
– Superscalar Architectures
Pipelining
• Fetching of instructions from memory is a major bottleneck in instruction execution speed. However, computers have the ability to fetch instructions from memory in advance.
• These instructions are stored in a set of registers called the prefetch buffer.
• Thus, instruction execution is divided into two parts: fetching and actual execution.
• The concept of a pipeline carries this strategy much further.
• Instead of dividing instruction execution into only two parts, it is often divided into many parts (stages), each one handled by a dedicated piece of hardware, all of which can run in parallel; the timing benefit is sketched below.
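As a minimal sketch of why pipelining helps, assuming one clock cycle per stage and no stalls (the ideal-pipeline formula is standard; the instruction and stage counts below are invented):

def cycles_unpipelined(n, k):
    return n * k              # each instruction occupies the hardware for all k stages

def cycles_pipelined(n, k):
    return k + (n - 1)        # fill the pipe once, then one instruction completes per cycle

n, k = 100, 5                                              # invented workload: 100 instructions, 5 stages
print(cycles_unpipelined(n, k), cycles_pipelined(n, k))    # 500 104
print(cycles_unpipelined(n, k) / cycles_pipelined(n, k))   # ~4.8, approaching k as n grows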
Dual Pipelines
• If one pipeline is good, then surely two pipelines
are better.
• Here a single instruction fetch unit fetches pairs of
instructions together and puts each one into its
own pipeline, complete with its own ALU for
parallel operation.
• To be able to run in parallel, the two instructions
must not conflict over resource usage (e.g.,
registers), and neither must depend on the result
of the other.
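A minimal sketch of the issue check described above, assuming a made-up instruction representation of (destination register, list of source registers); a real dual-pipeline issue unit checks more conditions than this.

def can_issue_together(i1, i2):
    """Each instruction is (destination_register, [source_registers])."""
    d1, s1 = i1
    d2, s2 = i2
    if d1 == d2:       # both want to write the same register: resource conflict
        return False
    if d1 in s2:       # the second instruction needs the first one's result: dependence
        return False
    return True

print(can_issue_together(("R1", ["R2", "R3"]), ("R4", ["R5", "R6"])))  # True: independent
print(can_issue_together(("R1", ["R2", "R3"]), ("R4", ["R1", "R6"])))  # False: R1 dependence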
Superscalar Architectures
• Going to four pipelines is conceivable, but doing so duplicates too much hardware.
• Instead, a different approach is used on high-end CPUs.
• The basic idea is to have just a single pipeline but give it
multiple functional units.
• This is a superscalar architecture – using more than one
ALU, so that more than one instruction can be executed in
parallel.
• Implicit in the idea of a superscalar processor is that the
S3 stage can issue instructions considerably faster than the
S4 stage is able to execute them.
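A rough sketch of the issue idea in a superscalar design (the functional-unit names and instruction strings are invented; only the S3/S4 wording is taken from the slide): one issue stage hands each instruction to any free functional unit of the right kind.

functional_units = {"alu1": None, "alu2": None, "load_store": None}   # None means the unit is free

def issue(instruction, unit_kind):
    """Hand the instruction to any free functional unit of the requested kind."""
    for name, busy_with in functional_units.items():
        if busy_with is None and name.startswith(unit_kind):
            functional_units[name] = instruction
            return name
    return None   # every matching unit is busy: the issue stage must wait

print(issue("ADD R1,R2", "alu"))   # alu1
print(issue("SUB R3,R4", "alu"))   # alu2 -- two ALU operations proceed in parallel
print(issue("MUL R5,R6", "alu"))   # None -- no free ALU left this cycle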
Processor-Level Parallelism
• Instruction-level parallelism (pipelining and superscalar operation) rarely wins more than a factor of five or ten in processor speed.
• To get gains of 50, 100, or more, the only way is to design computers with multiple CPUs.
• We will consider three alternative architectures:
– Array Computers
– Multiprocessors
– Multicomputers
Array Computers
• An array processor consists of a large number
of identical processors that perform the same
sequence of instructions on different sets of
data.
• A vector processor is efficient at executing a sequence of operations on pairs of data elements; all of the addition operations are performed in a single, heavily pipelined adder.
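A minimal sketch of the array/vector idea with invented data: one operation (addition) is applied across whole sets of elements. The list comprehension only models the result; on an array or vector processor the element-wise additions would proceed in lockstep or flow through a pipelined adder.

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# A single "vector add" produces all the element-wise sums.
c = [x + y for x, y in zip(a, b)]
print(c)   # [11, 22, 33, 44]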
Multiprocessors
• The processing elements in an array processor are not independent CPUs, since there is only one control unit.
• The first parallel system with multiple full-blown CPUs is the multiprocessor.
• This is a system with more than one CPU sharing a common memory, coordinated in software.
• The simplest organization is a single bus with multiple CPUs and one memory all plugged into it.
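A minimal sketch of the shared-memory idea, with two Python threads standing in for two CPUs and a lock as the software coordination (everything here is invented for illustration, not how a hardware multiprocessor is built):

import threading

shared_counter = 0            # the "common memory" both workers see
lock = threading.Lock()       # software coordination

def worker():
    global shared_counter
    for _ in range(100_000):
        with lock:
            shared_counter += 1

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_counter)         # 200000: both workers updated the same shared memory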
Multicomputers
• Although multiprocessors with a small number of
processors (< 64) are relatively easy to build, large ones are
surprisingly difficult to construct.
• The difficulty is in connecting all the processors to the
memory.
• To get around these problems, many designers have simply abandoned the idea of shared memory and instead build systems consisting of large numbers of interconnected computers, each having its own private memory but no common memory.
• These systems are called multicomputers; a sketch of the message-passing style they imply follows below.
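A minimal sketch of that style, with Python multiprocessing queues standing in for the interconnect (the node code and data are invented): each node works only on its private memory, and data moves between nodes only as explicit messages.

from multiprocessing import Process, Queue

def node(inbox, outbox):
    local_memory = inbox.get()       # data arrives only as an explicit message
    outbox.put(sum(local_memory))    # the result travels back the same way

if __name__ == "__main__":
    to_node, from_node = Queue(), Queue()
    p = Process(target=node, args=(to_node, from_node))
    p.start()
    to_node.put([1, 2, 3, 4])        # no shared memory: the data must be sent
    print(from_node.get())           # 10
    p.join()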
