
Solution Chapter 1

1. The document discusses performance metrics like clock rate, CPI, instructions per cycle (IPC), and execution time for different processors P1, P2, and P3 executing the same instruction set. It provides calculations to compare their performance in instructions per second (MIPS) and the number of cycles and instructions needed to execute a program in 10 seconds. 2. The document also analyzes the performance of two implementations (P1 and P2) of the same instruction set architecture with different clock rates and CPI values for instruction classes A, B, C, and D. It calculates total execution times for programs with different distributions of instruction classes to determine which implementation is faster. 3. Further sections

1.2.3 If a computer connected to a 1 Gigabit Ethernet network needs to send a 256 Kbyte file, how long would it take?
Answer: Network speed: 1 Gigabit Ethernet ==> 1 gigabit per second = 125 Mbytes/second.
File size: 256 Kbytes = 0.256 Mbytes.
Time for 0.256 Mbytes = 0.256/125 s = 2.048 ms
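
The same arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `transfer_time_s` is made up for illustration, decimal units are assumed (256 Kbytes = 256 × 10^3 bytes, as in the answer), and protocol overhead is ignored.

```python
def transfer_time_s(file_bytes: float, link_bits_per_s: float) -> float:
    """Seconds needed to push file_bytes over the link, ignoring protocol overhead."""
    return file_bytes * 8 / link_bits_per_s

# Assumption: decimal units, as in the answer (256 Kbytes = 256e3 bytes).
t = transfer_time_s(256e3, 1e9)
print(f"{t * 1e3:.3f} ms")
```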

For problems below, use the information about access time for every type of memory in the following table.
Cache    DRAM     Flash Memory    Magnetic Disk
5 ns     50 ns    5 µs            5 ms

1.2.4 Find how long it takes to read a file from a DRAM if it takes 2
microseconds from the cache memory.

Answer: 2 microseconds from cache ==> 20 microseconds from DRAM.
20 microseconds from DRAM ==> 2 seconds from magnetic disk.
20 microseconds from DRAM ==> 2 ms from flash memory.
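
Each conversion above scales the read time by the ratio of the two access times. A minimal sketch of that scaling follows; the names `ACCESS_TIME_S` and `scaled_read_time` are illustrative, and the flash access time is assumed to be 5 microseconds, which is the value the 2 ms result implies.

```python
# Access times from the table, in seconds. The flash entry is assumed to be
# 5 microseconds, the value implied by the answer's 2 ms result.
ACCESS_TIME_S = {"cache": 5e-9, "dram": 50e-9, "flash": 5e-6, "disk": 5e-3}

def scaled_read_time(time_s: float, src: str, dst: str) -> float:
    """Scale a read time measured on `src` memory to the slower or faster `dst`."""
    return time_s * ACCESS_TIME_S[dst] / ACCESS_TIME_S[src]

dram = scaled_read_time(2e-6, "cache", "dram")
flash = scaled_read_time(dram, "dram", "flash")
disk = scaled_read_time(dram, "dram", "disk")
print(dram, flash, disk)
```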

Exercise 1.3
Consider three different processors P1, P2, and P3 executing the same instruction set with the clock rates and CPIs given in the following table.

Processor  Clock Rate  CPI
P1         2 GHz       1.5
P2         1.5 GHz     1.0
P3         3 GHz       2.5

1.3.1 [5] <1.4> Which processor has the highest performance expressed in
instructions per second (MIPS)?

Answer: P2 has the highest performance.
IPS = clock rate / CPI
IPS(P1) = 2 × 10^9 / 1.5 = 1.33 × 10^9     MIPS(P1) = 1.33 × 10^3
IPS(P2) = 1.5 × 10^9 / 1.0 = 1.5 × 10^9    MIPS(P2) = 1.5 × 10^3
IPS(P3) = 3 × 10^9 / 2.5 = 1.2 × 10^9      MIPS(P3) = 1.2 × 10^3
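
As a quick check, the comparison can be scripted. The sketch below assumes only the table values for Exercise 1.3; the names `PROCESSORS`, `ips`, and `fastest` are chosen here for illustration.

```python
# (clock rate in Hz, CPI) for each processor, from the Exercise 1.3 table.
PROCESSORS = {"P1": (2e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3e9, 2.5)}

def ips(clock_hz: float, cpi: float) -> float:
    """Instructions per second = clock rate / cycles per instruction."""
    return clock_hz / cpi

mips = {name: ips(clock, cpi) / 1e6 for name, (clock, cpi) in PROCESSORS.items()}
fastest = max(mips, key=mips.get)

for name, value in mips.items():
    print(f"{name}: {value:.0f} MIPS")
print("highest performance:", fastest)
```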


1.3.2 [10] <1.4> If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions.

Answer:
No. cycles = time × clock rate
cycles(P1) = 10 × 2 × 10^9 = 20 × 10^9
cycles(P2) = 10 × 1.5 × 10^9 = 15 × 10^9
cycles(P3) = 10 × 3 × 10^9 = 30 × 10^9

time = (No. instr. × CPI) / clock rate => No. instructions = No. cycles / CPI
instructions(P1) = 20 × 10^9 / 1.5 = 13.33 × 10^9
instructions(P2) = 15 × 10^9 / 1 = 15 × 10^9
instructions(P3) = 30 × 10^9 / 2.5 = 12 × 10^9
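
The two relations used in this answer translate directly into code. This is a small sketch with illustrative function names, using the clock rates and CPIs from the Exercise 1.3 table and the 10-second run time from the question.

```python
PROCESSORS = {"P1": (2e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3e9, 2.5)}
RUN_TIME_S = 10.0

def cycles(time_s: float, clock_hz: float) -> float:
    """Clock cycles elapsed = time x clock rate."""
    return time_s * clock_hz

def instructions(time_s: float, clock_hz: float, cpi: float) -> float:
    """Instructions executed = cycles / CPI."""
    return cycles(time_s, clock_hz) / cpi

for name, (clock, cpi) in PROCESSORS.items():
    print(name, f"{cycles(RUN_TIME_S, clock):.3e} cycles,",
          f"{instructions(RUN_TIME_S, clock, cpi):.3e} instructions")
```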


1.3.3 [10] <1.4> We are trying to reduce the time by 30% but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?
Answer:

CPI_New = CPI_Old * 1.2
CPI_New(P1) = 1.5 * 1.2 = 1.8
CPI_New(P2) = 1 * 1.2 = 1.2
CPI_New(P3) = 2.5 * 1.2 = 3

Time_New = Time_Old * 0.7 = 10 * 0.7 = 7 s

Clock rate = No. instr. × CPI_New / Time_New (No. instr. taken from 1.3.2)
Clock rate(P1) = 13.33 × 10^9 × 1.8 / 7 = 3.43 GHz
Clock rate(P2) = 15 × 10^9 × 1.2 / 7 = 2.57 GHz
Clock rate(P3) = 12 × 10^9 × 3 / 7 = 5.14 GHz
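
The calculation can be sketched as a single function. The name `required_clock_hz` and its parameter names are illustrative; the instruction counts are recomputed from the 10-second run in 1.3.2 rather than typed in rounded form.

```python
PROCESSORS = {"P1": (2e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3e9, 2.5)}
OLD_TIME_S = 10.0

def required_clock_hz(n_instr, cpi_old, time_old_s, cpi_growth=1.2, time_scale=0.7):
    """Clock rate needed when CPI grows by cpi_growth and time must shrink by time_scale."""
    return n_instr * (cpi_old * cpi_growth) / (time_old_s * time_scale)

new_clock = {}
for name, (clock, cpi) in PROCESSORS.items():
    n_instr = OLD_TIME_S * clock / cpi  # instruction count from 1.3.2
    new_clock[name] = required_clock_hz(n_instr, cpi, OLD_TIME_S)
    print(f"{name}: {new_clock[name] / 1e9:.2f} GHz")
```

Because the instruction count does not change, the result is the old clock rate multiplied by 1.2/0.7, or about 1.71.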

1.3.4 Find the IPC (instructions per cycle) for each processor.
For problems below, use the information in the following table.
Processor  Clock Rate  No. Instructions  Time
P1         2 GHz       20 × 10^9         7 s
P2         1.5 GHz     30 × 10^9         10 s
P3         3 GHz       90 × 10^9         9 s

Answer: IPC = 1/CPI = No. instr. / (time × clock rate)
IPC(P1) = 20 × 10^9 / (7 × 2 GHz) = 1.43
IPC(P2) = 30 × 10^9 / (10 × 1.5 GHz) = 2
IPC(P3) = 90 × 10^9 / (9 × 3 GHz) = 3.33
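
A short sketch of the IPC formula follows. The names are illustrative, and P1 is taken at 2 GHz, the clock rate that the answer's arithmetic uses.

```python
# (clock rate in Hz, instruction count, run time in s).
# Assumption: P1 runs at 2 GHz, the value used in the answer's calculation.
RUNS = {"P1": (2e9, 20e9, 7.0), "P2": (1.5e9, 30e9, 10.0), "P3": (3e9, 90e9, 9.0)}

def ipc(n_instr: float, time_s: float, clock_hz: float) -> float:
    """Instructions per cycle = instructions / (time x clock rate)."""
    return n_instr / (time_s * clock_hz)

ipc_values = {name: ipc(n, t, f) for name, (f, n, t) in RUNS.items()}
for name, value in ipc_values.items():
    print(f"IPC({name}) = {value:.2f}")
```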

1.3.5 [5] <1.4> Find the clock rate for P2 that reduces its execution time to that of P1.
Answer:
f_new = No. instr. × CPI / time_new
f_old = No. instr. × CPI / time_old
f_new / f_old = time_old / time_new
f_new = f_old × 10/7 = 1.5 GHz × 10/7 = 2.14 GHz

1.3.6 [5] <1.4> Find the number of instructions for P2 that reduces its execution time to that of P3.
Answer:
No.instr_new = (f × time_new) / CPI
No.instr_old = (f × time_old) / CPI
No.instr_new / No.instr_old = time_new / time_old
No.instr_new = No.instr_old × 9/10 = 30 × 10^9 × 9/10 = 27 × 10^9
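
Both 1.3.5 and 1.3.6 rely on proportional scaling with everything else held fixed: clock rate is inversely proportional to time, and instruction count is directly proportional to it. A minimal sketch, with function names chosen here for illustration:

```python
def clock_for_target_time(clock_hz: float, time_old_s: float, time_new_s: float) -> float:
    """With instruction count and CPI fixed, clock rate scales as time_old / time_new."""
    return clock_hz * time_old_s / time_new_s

def instructions_for_target_time(n_instr: float, time_old_s: float, time_new_s: float) -> float:
    """With clock rate and CPI fixed, instruction count scales as time_new / time_old."""
    return n_instr * time_new_s / time_old_s

f_new = clock_for_target_time(1.5e9, 10.0, 7.0)          # 1.3.5: P2 matched to P1's 7 s
n_new = instructions_for_target_time(30e9, 10.0, 9.0)    # 1.3.6: P2 matched to P3's 9 s
print(f"{f_new / 1e9:.2f} GHz, {n_new:.3e} instructions")
```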


Exercise 1.4
Consider two different implementations of the same instruction set architecture. There are four classes of instructions, A, B, C, and D. The clock rate and CPI of each implementation are given in the following table.

     Clock rate  CPI Class A  CPI Class B  CPI Class C  CPI Class D
P1   1.5 GHz     1            2            3            4
P2   2 GHz       2            2            2            2

1.4.1 Given a program with 10^6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D, which implementation is faster?

Answer: P2
Class A: 10^5 instr.
Class B: 2 × 10^5 instr.
Class C: 5 × 10^5 instr.
Class D: 2 × 10^5 instr.
Time = No. instr. × CPI / clock rate

P1: Time class A = 10^5 × 1 / (1.5 × 10^9) = 0.67 × 10^-4 s
Time class B = 2.67 × 10^-4 s
Time class C = 10 × 10^-4 s
Time class D = 5.33 × 10^-4 s

Total time P1 = 18.67 × 10^-4 s

P2: Time class A = 10^5 × 2 / (2 × 10^9) = 10^-4 s
Time class B = 2 × 10^-4 s
Time class C = 5 × 10^-4 s
Time class D = 2 × 10^-4 s

Total time P2 = 10 × 10^-4 s

1.4.2 [5] <1.4> What is the global CPI for each implementation?
Answer: CPI = time × clock rate / No. instr.
CPI(P1) = 18.67 × 10^-4 × 1.5 × 10^9 / 10^6 = 2.8
CPI(P2) = 10 × 10^-4 × 2 × 10^9 / 10^6 = 2

1.4.3 [5] <1.4> Find the clock cycles required in both cases.
Answer:
Clock cycles = sum over classes of (No. instr. in class × CPI of class)
clock cycles(P1) = InstrucA × CPIA + InstrucB × CPIB + InstrucC × CPIC + InstrucD × CPID
= 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 4 = 28 × 10^5
clock cycles(P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2 = 20 × 10^5
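
Parts 1.4.1 to 1.4.3 all follow from one per-class sum, so they can be evaluated together. The sketch below works directly from the Exercise 1.4 table and the 10/20/50/20 instruction mix; the names `MIX`, `IMPLS`, and `summarize` are illustrative.

```python
N_INSTR = 10**6
MIX = {"A": 0.10, "B": 0.20, "C": 0.50, "D": 0.20}
IMPLS = {
    "P1": {"clock_hz": 1.5e9, "cpi": {"A": 1, "B": 2, "C": 3, "D": 4}},
    "P2": {"clock_hz": 2.0e9, "cpi": {"A": 2, "B": 2, "C": 2, "D": 2}},
}

def total_cycles(cpi: dict) -> float:
    """Sum over classes of (instructions in class x CPI of class)."""
    return sum(N_INSTR * MIX[c] * cpi[c] for c in MIX)

def summarize(impl: dict) -> dict:
    """Clock cycles, execution time, and global CPI for one implementation."""
    cyc = total_cycles(impl["cpi"])
    return {"cycles": cyc, "time_s": cyc / impl["clock_hz"], "cpi": cyc / N_INSTR}

results = {name: summarize(impl) for name, impl in IMPLS.items()}
for name, r in results.items():
    print(f"{name}: {r['cycles']:.0f} cycles, {r['time_s'] * 1e3:.3f} ms, CPI {r['cpi']:.2f}")
```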

1.4.4 [5] <1.4> Assuming that arith instructions take 1 cycle, load and store 5 cycles, and branches 2 cycles, what is the execution time of the program in a 2 GHz processor?

The following table shows the number of instructions for a program.
Arith  Store  Load  Branch  Total
500    50     100   50      700

Answer:
CPU time = clock cycles / clock rate
= (500 × 1 + 50 × 5 + 100 × 5 + 50 × 2) / (2 × 10^9) = 675 × 10^-9 s = 675 ns

1.4.5 [5] <1.4> Find the CPI for the program.
Answer: CPI = time × clock rate / No. instr.
CPI = 675 × 10^-9 × 2 × 10^9 / 700 = 1.93

1.4.6 [10] <1.4> If the number of load instructions can be reduced by one half, what is the speedup and the CPI?
Answer:
Time = (500 × 1 + 50 × 5 + 50 × 5 + 50 × 2) × 0.5 × 10^-9 s = 550 ns
Speed-up = 675 ns / 550 ns = 1.23
The program now has 650 instructions, so
CPI = 550 × 10^-9 × 2 × 10^9 / 650 = 1.69
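
Parts 1.4.4 to 1.4.6 can be sketched with one helper that returns both the execution time and the CPI. The names are illustrative. Note that after halving the loads, the helper divides by the 650 instructions that remain rather than the original 700.

```python
CYCLES_PER = {"arith": 1, "store": 5, "load": 5, "branch": 2}
CLOCK_HZ = 2e9

def time_and_cpi(counts: dict) -> tuple:
    """Return (execution time in seconds, average CPI) for an instruction mix."""
    cycles = sum(counts[k] * CYCLES_PER[k] for k in counts)
    n_instr = sum(counts.values())
    return cycles / CLOCK_HZ, cycles / n_instr

base = {"arith": 500, "store": 50, "load": 100, "branch": 50}
fewer_loads = dict(base, load=base["load"] // 2)  # loads reduced by one half

t_base, cpi_base = time_and_cpi(base)
t_new, cpi_new = time_and_cpi(fewer_loads)
speedup = t_base / t_new
print(f"{t_base * 1e9:.0f} ns, CPI {cpi_base:.2f}")
print(f"{t_new * 1e9:.0f} ns, CPI {cpi_new:.2f}, speedup {speedup:.2f}")
```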

Exercise 1.5
Consider two different implementations, P1 and P2, of the same instruction set. There are five classes of instructions (A, B, C, D, and E) in the instruction set. The clock rate and CPI of each class are given below.

       Clock Rate  CPI Class A  CPI Class B  CPI Class C  CPI Class D  CPI Class E
a  P1  1.0 GHz     1            2            3            4            3
   P2  1.5 GHz     2            2            2            4            4
b  P1  1.0 GHz     1            1            2            3            2
   P2  1.5 GHz     1            2            3            4            3

1.5.1 [5] <1.4> Assume that peak performance is defined as the fastest rate that a computer can execute any instruction sequence. What are the peak performances of P1 and P2 expressed in instructions per second?

Answer:
a. Peak performance occurs when only the instructions with the lowest CPI are executed (class A on P1; class A, B, or C on P2).
peak P1 = 1 inst/cycle × 1 × 10^9 cycles/sec = 1 × 10^9 inst/sec = 1G inst/sec
peak P2 = (1/2) inst/cycle × 1.5 × 10^9 cycles/sec = 0.75 × 10^9 inst/sec = 0.75G inst/sec
b. Peak performance occurs when only the instructions with the lowest CPI are executed (class A or B on P1; class A on P2).
peak P1 = 1 inst/cycle × 1 × 10^9 cycles/sec = 1 × 10^9 inst/sec = 1G inst/sec
peak P2 = 1 inst/cycle × 1.5 × 10^9 cycles/sec = 1.5 × 10^9 inst/sec = 1.5G inst/sec
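
Peak rate is the clock rate divided by the smallest CPI in the row. A minimal sketch over the Exercise 1.5 table, with the names `TABLE` and `peak_ips` chosen here for illustration:

```python
# Exercise 1.5 table: (clock rate in Hz, CPI for classes A..E).
TABLE = {
    "a": {"P1": (1.0e9, [1, 2, 3, 4, 3]), "P2": (1.5e9, [2, 2, 2, 4, 4])},
    "b": {"P1": (1.0e9, [1, 1, 2, 3, 2]), "P2": (1.5e9, [1, 2, 3, 4, 3])},
}

def peak_ips(clock_hz: float, cpis: list) -> float:
    """Peak instructions per second: run only the cheapest instruction class."""
    return clock_hz / min(cpis)

peaks = {(row, p): peak_ips(*TABLE[row][p]) for row in TABLE for p in TABLE[row]}
for (row, p), value in peaks.items():
    print(f"{row}. peak {p} = {value / 1e9:.2f}G inst/sec")
```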

1.5.2 [10] <1.4> If the number of instructions executed in a certain program is divided equally among the classes of instructions except for class A, which occurs twice as often as each of the others, which computer is faster? How much faster is it?
Answer:
Class A accounts for 2/6 of the instructions and each other class for 1/6. For case a:

Class  Freq   CPI(P1)  Freq × CPI(P1)  CPI(P2)  Freq × CPI(P2)
A      0.333  1        0.333           2        0.667
B      0.167  2        0.333           2        0.333
C      0.167  3        0.500           2        0.333
D      0.167  4        0.667           4        0.667
E      0.167  3        0.500           4        0.667
Total                  2.333                    2.667

CPU time = <CPI> × I / F
CPU time1 = 2.333 × I / 1 GHz
CPU time2 = 2.667 × I / 1.5 GHz
perf2/perf1 = CPU time1 / CPU time2 = 1.5 × 2.333 / 2.667 = 1.31 (Performance = 1 / execution time)
a. P2 is 1.31 times faster than P1
b. Same method as a: <CPI> is 1.667 on P1 and 2.333 on P2, so perf2/perf1 = 1.5 × 1.667 / 2.333 = 1.07 and P2 is 1.07 times faster than P1

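1.5.3 [10] <1.4> If the number of instructions executed in a certain program is divided equally among the classes of instructions except for class E, which occurs twice as often as each of the others, which computer is faster? How much faster is it?
Answer: Same method as 1.5.2, with class E weighted 2/6 and each other class 1/6.
a. <CPI> is 2.667 on P1 and 3 on P2, so perf2/perf1 = 1.5 × 2.667 / 3 = 1.33 and P2 is 1.33 times faster than P1
b. <CPI> is 1.833 on P1 and 2.667 on P2, so perf2/perf1 = 1.5 × 1.833 / 2.667 = 1.03 and P2 is 1.03 times faster than P1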
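
The weighted-CPI comparison in 1.5.2 and 1.5.3 can be sketched as follows, working directly from the Exercise 1.5 table. The function names are illustrative. A ratio above 1 means P2 is the faster machine.

```python
# Exercise 1.5 table: (clock rate in Hz, CPI for classes A..E).
TABLE = {
    "a": {"P1": (1.0e9, [1, 2, 3, 4, 3]), "P2": (1.5e9, [2, 2, 2, 4, 4])},
    "b": {"P1": (1.0e9, [1, 1, 2, 3, 2]), "P2": (1.5e9, [1, 2, 3, 4, 3])},
}
CLASSES = "ABCDE"

def average_cpi(cpis: list, doubled: str) -> float:
    """Weighted CPI when class `doubled` occurs twice as often as each other class."""
    weights = [2 if c == doubled else 1 for c in CLASSES]
    return sum(w * cpi for w, cpi in zip(weights, cpis)) / sum(weights)

def p2_over_p1(row: str, doubled: str) -> float:
    """Performance ratio perf2/perf1 = time per instruction on P1 / on P2."""
    (f1, cpi1), (f2, cpi2) = TABLE[row]["P1"], TABLE[row]["P2"]
    time1 = average_cpi(cpi1, doubled) / f1
    time2 = average_cpi(cpi2, doubled) / f2
    return time1 / time2

for doubled in ("A", "E"):
    for row in ("a", "b"):
        print(f"class {doubled} doubled, case {row}: {p2_over_p1(row, doubled):.2f}")
```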

1.5.4 [5] <1.4> Assuming that computes take 1 cycle, load and store instructions take 10 cycles, and branches take 3 cycles, find the execution time on a 3 GHz MIPS processor.

The table below shows the instruction type breakdown for different programs. Using this data, you will be exploring the performance trade-offs for different changes made to a MIPS processor.

           No. Instructions
           Compute  Load  Store  Branch  Total
Program 1  1000     400   100    50      1550
Program 2  1500     300   100    100     2000

Answer:

Instruction  Cycles  Program 1 instr.  Program 1 cycles  Program 2 instr.  Program 2 cycles
Compute      1       1000              1000              1500              1500
Load         10      400               4000              300               3000
Store        10      100               1000              100               1000
Branch       3       50                150               100               300
Total                                  6150                                5800

CPU time1 = total cycles1 / F = 6150 / 3 GHz = 2.05 × 10^-6 s = 2.05 µs
CPU time2 = total cycles2 / F = 5800 / 3 GHz = 1.93 µs

1.5.5 [5] <1.4> Assuming that computes take 1 cycle, load and store instructions take 2 cycles, and branches take 3 cycles, find the execution time on a 3 GHz MIPS processor.

Answer:

Instruction  Cycles  Program 1 instr.  Program 1 cycles  Program 2 instr.  Program 2 cycles
Compute      1       1000              1000              1500              1500
Load         2       400               800               300               600
Store        2       100               200               100               200
Branch       3       50                150               100               300
Total                                  2150                                2600

CPU time1 = total cycles1 / F = 2150 / 3 GHz = 717 × 10^-9 s = 0.72 µs
CPU time2 = total cycles2 / F = 2600 / 3 GHz = 0.87 µs
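
The two tables in 1.5.4 and 1.5.5 differ only in the cycle cost of loads and stores, so one function covers both. This is a sketch with illustrative names, using the per-type instruction counts from the program table.

```python
PROGRAMS = {
    "Program 1": {"compute": 1000, "load": 400, "store": 100, "branch": 50},
    "Program 2": {"compute": 1500, "load": 300, "store": 100, "branch": 100},
}
CLOCK_HZ = 3e9

def exec_time_s(counts: dict, cycles_per: dict) -> float:
    """Execution time = sum of (count x cycles per instruction type) / clock rate."""
    total_cycles = sum(counts[k] * cycles_per[k] for k in counts)
    return total_cycles / CLOCK_HZ

slow_mem = {"compute": 1, "load": 10, "store": 10, "branch": 3}  # 1.5.4
fast_mem = {"compute": 1, "load": 2, "store": 2, "branch": 3}    # 1.5.5

times = {
    (name, label): exec_time_s(counts, costs)
    for name, counts in PROGRAMS.items()
    for label, costs in (("1.5.4", slow_mem), ("1.5.5", fast_mem))
}
for (name, label), t in times.items():
    print(f"{label} {name}: {t * 1e6:.2f} microseconds")
```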
