Ex4 Updated
4. Problems in this exercise assume that the logic blocks used to implement a processor’s
datapath have the following latencies:
“Register read” is the time needed after the rising clock edge for the new register value
to appear on the output. This value applies to the PC only. “Register setup” is the
amount of time a register’s data input must be stable before the rising edge of the clock.
This value applies to both the PC and Register File.
a. What is the latency of an R-type instruction (i.e., how long must the clock period be
to ensure that this instruction works correctly)?
b. What is the latency of lw?
c. What is the latency of sw?
d. What is the latency of beq?
e. What is the latency of an arithmetic, logical, or shift I-type (non-load) instruction?
f. What is the minimum clock period for this CPU?
5. In this exercise, we examine how pipelining affects the clock cycle time of the processor.
Problems in this exercise assume that individual stages of the datapath have the
following latencies:
Also, assume that instructions executed by the processor are broken down as
follows:
9. The importance of having a good branch predictor depends on how often conditional
branches are executed. Together with branch predictor accuracy, this will determine how
much time is spent stalling due to mispredicted branches. In this exercise, assume that
the breakdown of dynamic instructions into various instruction categories is as follows:
a. Stall cycles due to mispredicted branches increase the CPI. What is the extra CPI due
to mispredicted branches with the always-taken predictor? Assume that branch
outcomes are determined in the ID stage and applied in the EX stage, that there are
no data hazards, and that no delay slots are used.
b. What is the CPI for the “always-not-taken” predictor?
c. What is the CPI for the 2-bit predictor?
d. With the 2-bit predictor, what speedup would be achieved if we could convert half of
the branch instructions to some ALU instruction? Assume that correctly and
incorrectly predicted instructions have the same chance of being replaced.
e. With the 2-bit predictor, what speedup would be achieved if we could convert half of
the branch instructions in a way that replaced each branch instruction with two ALU
instructions? Assume that correctly and incorrectly predicted instructions have the
same chance of being replaced.
f. Some branch instructions are much more predictable than others. If we know that
80% of all executed branch instructions are easy-to-predict loop-back branches that
are always predicted correctly, what is the accuracy of the 2-bit predictor on the
remaining 20% of the branch instructions?
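The relationship behind parts (a) through (e) can be sketched as a small helper. This is only an illustrative sketch: the function names `extra_cpi` and `speedup` are hypothetical, the misprediction penalty is left as a parameter because it depends on how the pipeline assumptions in part (a) are read, and the numbers in the usage lines are placeholders, not values from the exercise's tables.

```python
def extra_cpi(branch_fraction: float, accuracy: float, penalty_cycles: float) -> float:
    """Extra CPI caused by mispredicted branches.

    branch_fraction: fraction of dynamic instructions that are branches (0..1)
    accuracy:        fraction of branches predicted correctly (0..1)
    penalty_cycles:  stall cycles paid per mispredicted branch (an assumption
                     the reader must derive from the pipeline description)
    """
    if not (0.0 <= branch_fraction <= 1.0 and 0.0 <= accuracy <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")
    return branch_fraction * (1.0 - accuracy) * penalty_cycles


def speedup(cpi_before: float, count_before: float,
            cpi_after: float, count_after: float) -> float:
    """Speedup = old execution time / new execution time.

    Assumes the clock period is unchanged, so time is proportional to
    CPI * instruction count.
    """
    return (cpi_before * count_before) / (cpi_after * count_after)


# Hypothetical placeholder values, NOT taken from the exercise:
base_cpi = 1.0
cpi = base_cpi + extra_cpi(branch_fraction=0.2, accuracy=0.9, penalty_cycles=2)
print(f"CPI with mispredictions: {cpi:.3f}")
```

For parts (d) and (e), the same helper applies once the branch fraction and instruction count are recomputed for the modified program; in part (e) the instruction count grows because each replaced branch becomes two ALU instructions.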
10. This exercise examines the accuracy of various branch predictors for the following
repeating pattern (e.g., in a loop) of branch outcomes: T, NT, T, T, NT.
a. What is the accuracy of always-taken and always-not-taken predictors for this
sequence of branch outcomes?
b. What is the accuracy of the 2-bit predictor for the first four branches in this pattern,
assuming that the predictor starts off in the bottom left state (predict not taken)?
c. What is the accuracy of the 2-bit predictor if this pattern is repeated forever?
d. Design a predictor that would achieve perfect accuracy if this pattern is repeated
forever. Your predictor should be a sequential circuit with one output that provides a
prediction (1 for taken, 0 for not taken) and no inputs other than the clock and the
control signal that indicates that the instruction is a conditional branch.
e. What is the accuracy of your predictor if it is given a repeating pattern that is the
exact opposite of this one?
f. Design a predictor similar to (d), but now your predictor should be able to eventually
(after a warm-up period during which it can make wrong predictions) start perfectly
predicting both this pattern and its opposite. Your predictor should have an input
that tells it what the real outcome was. Hint: this input lets your predictor determine
which of the two repeating patterns it is given.
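One way to check answers to parts (a) through (c) is to simulate the predictors on the pattern. The sketch below makes an assumption worth stating loudly: it models the 2-bit predictor as a saturating counter (states 0 and 1 predict not taken, states 2 and 3 predict taken) and treats the "bottom left state" as state 0. The state diagram is not reproduced in this text, so verify that mapping against the figure you are using. All function names are hypothetical.

```python
from itertools import cycle, islice

# The repeating pattern from the exercise: T, NT, T, T, NT (True = taken).
PATTERN = [True, False, True, True, False]


def static_accuracy(outcomes, predict_taken: bool) -> float:
    """Accuracy of a predictor that always makes the same prediction."""
    outcomes = list(outcomes)
    hits = sum(1 for taken in outcomes if taken == predict_taken)
    return hits / len(outcomes)


def two_bit_accuracy(outcomes, start_state: int = 0) -> float:
    """Accuracy of a 2-bit saturating-counter predictor.

    ASSUMPTION: states 0..3, where 0/1 predict not taken and 2/3 predict
    taken; a taken branch increments the counter and a not-taken branch
    decrements it, saturating at 0 and 3.
    """
    state = start_state
    hits = 0
    total = 0
    for taken in outcomes:
        prediction = state >= 2
        hits += (prediction == taken)
        total += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return hits / total


# Usage: one pass of the pattern, the first four branches, and a long run.
print(static_accuracy(PATTERN, predict_taken=True))
print(two_bit_accuracy(islice(cycle(PATTERN), 4)))
print(two_bit_accuracy(islice(cycle(PATTERN), 5000)))
```

Over a long run, the first few warm-up predictions have a negligible effect, so the last line approximates the steady-state accuracy asked for in part (c).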