CS232 Section 6: Performance - Solutions

1. Recall the execution time equation:

$$\text{CPU time}_{X,P} = \text{Instructions executed}_{P} \times \text{CPI}_{X,P} \times \text{Clock cycle time}_{X} \tag{1}$$

Consider a basic machine with the following characteristics:

OP Type   Freq (f_i)   Cycles   CPI_i
Load      20%          5        1.0
Store     10%          3        0.3
Branch    20%          2        0.4
ALU       50%          3        1.5

Calculate CPI for each instruction type.

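Summing the per-type contributions (CPI_i = f_i * Cycles) gives an overall CPI of 3.2, so the
execution time before improvement is 3.2 * I * CCT, where I is the instruction count and CCT is
the clock cycle time. Since I and CCT remain unchanged, we will only consider CPI.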
How much faster would the machine be if:
(a) we added a cache to reduce average load time to 3 cycles?
Originally, load instructions added 1.0 to the CPI. After decreasing the number
of cycles by 2, they add 0.2 * 3 = .6 CPI. CPI after improvement: 2.8. Improvement:
(3.2 - 2.8) / 3.2 = 12.5%
(b) we added a branch predictor to reduce branch time by 1 cycle?
Originally, branch instructions added .4 to the CPI. After reducing the number of
cycles from 2 to 1, they add 0.2 * 1 = .2 CPI. CPI after improvement: 3.0. Improvement:
(3.2 - 3.0) / 3.2 = 6.25%
(c) we could do two ALU operations in parallel?
Originally, ALU instructions added 1.5 to the CPI. Running two ALU operations in
parallel is the same as halving the number of cycles per ALU instruction, down to .75 CPI.
CPI after improvement: 2.45. Improvement: (3.2 - 2.45) / 3.2 = 23.4%
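
The arithmetic in question 1 can be checked with a short script. This is a minimal sketch rather than part of the original solution: the names `BASE_MIX`, `cpi`, and `improvement` are made up for illustration, and part (c) is modeled, as in the answer above, by treating each ALU instruction as taking 1.5 cycles.

```python
# Instruction mix from the table in question 1: type -> (frequency, cycles).
BASE_MIX = {
    "load": (0.20, 5),
    "store": (0.10, 3),
    "branch": (0.20, 2),
    "alu": (0.50, 3),
}


def cpi(mix):
    """Overall CPI: sum of frequency * cycles over all instruction types."""
    return sum(freq * cycles for freq, cycles in mix.values())


def improvement(mix, op, new_cycles):
    """Fractional reduction in CPI when `op` takes `new_cycles` cycles."""
    changed = dict(mix)
    changed[op] = (mix[op][0], new_cycles)
    return (cpi(mix) - cpi(changed)) / cpi(mix)


print(f"base CPI: {cpi(BASE_MIX):.2f}")
print(f"(a) loads in 3 cycles:      {improvement(BASE_MIX, 'load', 3):.2%}")
print(f"(b) branches in 1 cycle:    {improvement(BASE_MIX, 'branch', 1):.2%}")
print(f"(c) ALU ops in 1.5 cycles:  {improvement(BASE_MIX, 'alu', 1.5):.2%}")
```

Because I and CCT are unchanged, the fractional drop in CPI equals the fractional drop in execution time.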

2. Use the basic machine table from question 1, but assume that the frequency column indicates the
percentage of execution time spent in the corresponding instruction type. Use Amdahl’s Law to
answer the following: if you could decrease the cycle time of one of the instruction types by 1 cycle,
which instruction type would you optimize? How would the new execution time compare to the
original one?

$$\text{Execution time after improvement} = \frac{\text{Time affected by improvement}}{\text{Amount of improvement}} + \text{Time unaffected by improvement} \tag{2}$$

OP Type   Exec time   Cycles   New exec time                        Improvement
Load      20%         5        .2 / (5/4) + .8 = .16 + .8 = .96     4%
Store     10%         3        .1 / (3/2) + .9 = .067 + .9 = .967   3.3%
Branch    20%         2        .2 / (2/1) + .8 = .1 + .8 = .9       10%
ALU       50%         3        .5 / (3/2) + .5 = .33 + .5 = .83     16.7%

It’s best to optimize ALU instructions. The new execution time would be about 83.3% of the
original, an improvement of 16.7%.
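
The Amdahl's Law table can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the table: the original execution time is normalized to 1, and removing one cycle from a type that took n cycles speeds that type up by a factor of n / (n - 1). The names `time_after`, `TYPES`, and `new_times` are illustrative only.

```python
def time_after(fraction, speedup):
    """Amdahl's Law with the original execution time normalized to 1."""
    return fraction / speedup + (1 - fraction)


# type -> (fraction of execution time, original cycles per instruction)
TYPES = {
    "load": (0.20, 5),
    "store": (0.10, 3),
    "branch": (0.20, 2),
    "alu": (0.50, 3),
}

# Each candidate optimization removes exactly one cycle from one type.
new_times = {
    op: time_after(frac, cycles / (cycles - 1))
    for op, (frac, cycles) in TYPES.items()
}

for op, t in new_times.items():
    print(f"{op:6s} new time = {t:.3f}  improvement = {1 - t:.1%}")

best = min(new_times, key=new_times.get)
print("best type to optimize:", best)
```

Branches get the largest per-type speedup (2x), but ALU instructions still win because they account for half of the execution time.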

3. Intel’s Itanium (IA-64) ISA is designed to facilitate executing multiple instructions per cycle. If
an Itanium processor achieves an average CPI of .3 (roughly 3 instructions per cycle), how much
faster is it than a Pentium 4 (which uses the x86 ISA) with an average CPI of 1?

(a) Itanium is three times faster

(b) Itanium is one third as fast
(c) Not enough information - We need the cycle time and the number of instructions to
calculate the execution times of these machines. Comparing only CPIs, just like comparing
only CCTs, is misleading.
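
To see why answer (c) holds, equation (1) can be evaluated for two scenarios. This is only an illustrative sketch: the CPIs come from the question, but every instruction count and cycle time below is a made-up value, not a measurement of either processor.

```python
def cpu_time_ns(instructions, cpi, cycle_time_ns):
    """Equation (1): instructions executed * CPI * clock cycle time."""
    return instructions * cpi * cycle_time_ns


ITANIUM_CPI, PENTIUM4_CPI = 0.3, 1.0  # CPIs given in the question

# HYPOTHETICAL scenario A: same instruction count and same cycle time.
a_itanium = cpu_time_ns(1_000_000, ITANIUM_CPI, 1.0)
a_pentium4 = cpu_time_ns(1_000_000, PENTIUM4_CPI, 1.0)

# HYPOTHETICAL scenario B: the Itanium binary needs twice as many
# instructions and the Itanium clock cycle is twice as long.
b_itanium = cpu_time_ns(2_000_000, ITANIUM_CPI, 2.0)
b_pentium4 = cpu_time_ns(1_000_000, PENTIUM4_CPI, 1.0)

print(f"A: Pentium 4 time / Itanium time = {a_pentium4 / a_itanium:.2f}")
print(f"B: Pentium 4 time / Itanium time = {b_pentium4 / b_itanium:.2f}")
```

With identical CPIs, the Itanium comes out faster in scenario A and slower in scenario B, so CPI alone cannot rank the two machines.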


4. Assume the following delays for the main functional units of the single-cycle datapath shown below:

Functional Unit   Time Delay
Memory            5ns
ALU               4ns
Register File     3ns

Given the following instructions: lw, sw and add, calculate:

(a) Minimum time to perform each instruction

(b) Time required on a single-cycle datapath

Write your answers in the table below. State any assumptions.

(a) lw (in that order): access memory (to read instruction), read operands from register
file, ALU (effective address calculation), access memory (to read data) and write
back to register file (the data read from memory).
(b) sw (in that order): access memory (to read instruction), read operands from register
file, ALU (effective address calculation), access memory (to write data).
(c) add (in that order): access memory (to read instruction), read operands from register
file, ALU (ALU computation based on func value) and write back to register file
(value computed in ALU).

Instruction   Minimum instruction time   Time in single-cycle datapath
lw            20ns                       20ns
sw            17ns                       20ns
add           15ns                       20ns
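
The table follows from summing the functional-unit delays along each instruction's path, and the single-cycle time is set by the slowest instruction. A minimal sketch of that calculation, using the delays and the unit orderings listed above (the names `DELAY_NS`, `STEPS`, `min_time`, and `single_cycle_time` are illustrative, and all other delays such as muxes and control are assumed to be zero):

```python
# Functional-unit delays in nanoseconds, from the table in question 4.
DELAY_NS = {"memory": 5, "alu": 4, "regfile": 3}

# Units each instruction passes through, in order.
STEPS = {
    "lw": ["memory", "regfile", "alu", "memory", "regfile"],
    "sw": ["memory", "regfile", "alu", "memory"],
    "add": ["memory", "regfile", "alu", "regfile"],
}

# Minimum time per instruction: sum of the delays along its path.
min_time = {
    instr: sum(DELAY_NS[unit] for unit in units)
    for instr, units in STEPS.items()
}

# A single-cycle datapath must use one clock period long enough
# for the slowest instruction, so every instruction takes that long.
single_cycle_time = max(min_time.values())

for instr, t in min_time.items():
    print(f"{instr:4s} minimum = {t}ns, single-cycle = {single_cycle_time}ns")
```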
5. Consider the following set of instructions:

add $sp, $sp, -4
sub $v0, $a0, $a1
lw $t0, 4($sp)
or $s0, $s1, $s2
lw $t1, 8($sp)

Assuming the instructions are executed on a single-cycle machine with 10ns cycle time, compute:

(a) cycle time - 10ns
(b) instruction latency - 10ns
(c) instruction throughput - 1 instruction / 10ns

6. Assume that the code above is executed on a 5-stage pipelined machine discussed in lecture. You
might first draw the pipeline diagram in the space below.
If a pipeline stage takes 2ns, calculate:
(a) cycle time - 2ns
(b) instruction latency - 5 ∗ 2ns = 10ns, just like single-cycle
(c) instruction throughput - 5 instructions / 18ns, much better than single-cycle. Assuming no
stalls, the five instructions finish in 5 + 4 = 9 cycles, and 9 ∗ 2ns = 18ns.
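
The answers to questions 5 and 6 can be compared side by side. The sketch below assumes an ideal pipeline with no stalls, so n instructions on a k-stage pipeline take k + (n - 1) cycles; the function names are illustrative only.

```python
def single_cycle_total_ns(n_instructions, cycle_ns):
    """Each instruction occupies one full (long) cycle."""
    return n_instructions * cycle_ns


def pipelined_total_ns(n_instructions, stages, stage_ns):
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    return (stages + n_instructions - 1) * stage_ns


N = 5  # instructions in the code sequence from question 5

single_total = single_cycle_total_ns(N, 10)   # 10ns cycle time
pipe_total = pipelined_total_ns(N, 5, 2)      # 5 stages, 2ns each
pipe_latency = 5 * 2                          # one instruction through all stages

print(f"single-cycle: total {single_total}ns, "
      f"throughput {N / single_total:.3f} instructions/ns")
print(f"pipelined:    total {pipe_total}ns, latency {pipe_latency}ns, "
      f"throughput {N / pipe_total:.3f} instructions/ns")
```

Latency per instruction is the same 10ns in both designs; the gain comes entirely from overlapping instructions, and it grows as the instruction count increases.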
