Pipeline Hazards Selected
• Introduction
– Defining Pipelining
– Pipelining Instructions
• Hazards
– Structural hazards
– Data Hazards
– Control Hazards
• Performance
• Controller implementation
[Figure: pipeline occupancy for the instruction sequence LW, SW, ADD, SUB over clock cycles CC 1 to CC 8.]
Memory Conflict
[Figure: instructions in program order: Load, Instr 1, Instr 2, Stall (bubble in every stage), Instr 3. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
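The conflict in the figure can be reproduced with a small sketch. It assumes an ideal five-stage pipeline (IF, ID, EX, MEM, WB) and a single memory port shared by instruction fetch and data access; the function name `memory_conflicts` and the boolean-list encoding are illustrative and not taken from the slides.

```python
def memory_conflicts(accesses_data_memory):
    """Return (cycle, mem_instr, fetch_instr) triples where a data access and
    an instruction fetch need the single memory port in the same cycle.

    Assumption: with no stalls, instruction i (0-based) is fetched in
    cycle i + 1 and reaches its MEM stage in cycle i + 4.
    """
    conflicts = []
    for i, uses_mem in enumerate(accesses_data_memory):
        j = i + 3  # the instruction being fetched while instruction i is in MEM
        if uses_mem and j < len(accesses_data_memory):
            conflicts.append((i + 4, i, j))
    return conflicts


# Load, Instr 1, Instr 2, Instr 3: only the load accesses data memory.
print(memory_conflicts([True, False, False, False]))  # [(4, 0, 3)]
```

Under these assumptions the load's data access in cycle 4 collides with the fetch of Instr 3, which is why the figure delays Instr 3 by one bubble.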
Data Hazards
[Figure: program execution order (in instructions) starting with sub $2, $1, $3 (IM, Reg, DM, Reg) over CC 1 to CC 9. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9.]
The use of the result of the SUB instruction in the next three instructions causes a
data hazard, since the register $2 is not written until after those instructions read it.
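• Read After Write (RAW): instruction J tries to read r1 before instruction I writes it.
I: add r1,r2,r3
J: sub r4,r1,r3
• Write After Read (WAR): instruction J tries to write r1 before instruction I reads it.
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
– Called an “anti-dependence” by compiler writers.
It results from reuse of the name “r1”.
• Write After Write (WAW): instruction J tries to write r1 before instruction I writes it.
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7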
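The three instruction pairs above differ only in which register is read or written first. As a minimal sketch, the dependence between two instructions can be classified as read after write (RAW), write after read (WAR), or write after write (WAW). The tuple encoding `(dest, src1, src2)` and the name `classify_hazards` are assumptions made for illustration.

```python
def classify_hazards(first, second):
    """Classify register dependences between two instructions.

    Each instruction is a (dest, src1, src2) tuple of register names;
    `first` precedes `second` in program order.
    """
    dest1, *srcs1 = first
    dest2, *srcs2 = second
    hazards = []
    if dest1 in srcs2:
        hazards.append("RAW")  # second reads what first writes
    if dest2 in srcs1:
        hazards.append("WAR")  # second overwrites what first reads
    if dest1 == dest2:
        hazards.append("WAW")  # both write the same register
    return hazards


print(classify_hazards(("r1", "r2", "r3"), ("r4", "r1", "r3")))  # ['RAW']
print(classify_hazards(("r4", "r1", "r3"), ("r1", "r2", "r3")))  # ['WAR']
print(classify_hazards(("r1", "r4", "r3"), ("r1", "r2", "r3")))  # ['WAW']
```

The three calls correspond to the I/J pairs listed above, in order.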
[Figure: the same sequence starting with sub $2, $1, $3 (IM, Reg, DM, Reg) over CC 1 to CC 9, annotated with the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9.]
[Figure: add $s0,$t0,$t1 (IF ID EX MEM WB; $s0 written in WB), followed by two STALL rows of bubbles, then sub $t2,$s0,$t3 (IF ID EX MEM WB; $s0 read in ID). Time axis: 0 to 18.]
[Figure: lw $s0,20($t1) (IF ID EX MEM WB; annotated "new value of s0"), followed by one STALL row of bubbles, then sub $t2,$s0,$t3 (IF ID EX MEM WB; s0 read). Time axis: 0 to 18.]
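The bubble counts in these two figures can be derived with a short calculation. The sketch below assumes a five-stage pipeline in which the register file is written in the first half of a cycle and read in the second half, and it assumes the single-bubble load case relies on forwarding. The function name and parameters are illustrative only.

```python
def stall_cycles(distance, producer_is_load=False, forwarding=False):
    """Bubbles needed between a producer and a dependent consumer.

    distance: how many instructions after the producer the consumer is
    issued (1 means the very next instruction).
    """
    if distance < 1:
        raise ValueError("consumer must follow the producer")
    if not forwarding:
        # Consumer's ID stage must not come before the producer's WB stage.
        needed_gap = 3
    elif producer_is_load:
        # Loaded value exists only after MEM; the consumer needs it in EX.
        needed_gap = 2
    else:
        # ALU result is forwarded from EX straight to the next EX.
        needed_gap = 1
    return max(0, needed_gap - distance)


print(stall_cycles(1))                                         # 2
print(stall_cycles(1, producer_is_load=True, forwarding=True)) # 1
print(stall_cycles(1, forwarding=True))                        # 0
```

The first call matches the two bubbles between add and sub, and the second matches the single bubble between lw and sub.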
A branch is either
– Taken: PC <= PC + 4 + Immediate
– Not Taken: PC <= PC + 4
[Figure: control hazard on a branch. Instruction sequence, each passing through Ifetch, Reg, ALU, DMem, Reg:
10: beq r1,r3,36
14: and r2,r3,r5
18: or r6,r1,r7
22: add r8,r1,r9
36: xor r10,r1,r11]
(Fig. 6.37)
[Figure: stalling on a branch. beq is followed by a STALL row of bubbles, then sw $s4,200($t5) (IF ID EX MEM WB). Annotations: "beq writes PC here", "new PC used here". Time axis: 0 to 18.]
[Figure: branch prediction. The target instruction tgt: sw $s4,200($t5) (IF ID EX MEM WB) is fetched assuming the branch is taken. Time axis: 0 to 18.]
[Figure: misprediction. tgt: sw $s4,200($t5) is fetched (IF) but turns out to be incorrect, so it is "squashed" and its remaining stages become bubbles (STALL); or $r8,$r8,$r9 then proceeds through IF ID EX MEM WB. Time axis: 0 to 18.]
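[Figure: branch history table (BHT). A 10-bit index taken from the branch instruction address a31 a30 … a11 … a2 a1 a0 selects an entry in a 1K-entry BHT; the branch instruction itself comes from instruction memory.]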
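A 10-bit index into a 1K-entry table can be sketched as a shift and a mask. The choice of address bits a11 to a2 (dropping the two low bits of a word-aligned address) is an assumption read off the figure labels, and `bht_index` is a hypothetical helper name.

```python
BHT_ENTRIES = 1024  # 1K entries, so the index is 10 bits wide


def bht_index(pc):
    """Map a branch instruction address to a BHT entry.

    Assumption: instructions are word aligned, so bits a1 and a0 are
    always zero and are skipped; bits a11..a2 form the index.
    """
    return (pc >> 2) & (BHT_ENTRIES - 1)


print(bht_index(0x0040000C))  # 3
print(bht_index(0x00001000))  # 0, aliases with address 0x0
```

Because only 10 address bits are used, two branches whose addresses agree in those bits share one entry, which is the aliasing the second call shows.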
• Example:
Consider a loop branch that is taken 9 times in a
row and then not taken once. What is the prediction
accuracy of the 1-bit predictor for this branch
assuming only this branch ever changes its
corresponding prediction bit?
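The example can be checked by simulating a 1-bit predictor, which simply predicts whatever the branch did last time. This is a minimal sketch; the function name and the choice of "not taken" as the initial prediction (the state left behind by the previous loop exit) are assumptions.

```python
def one_bit_accuracy(outcomes, initial_prediction=False):
    """Fraction of correct predictions for a 1-bit predictor.

    outcomes: sequence of booleans, True meaning the branch was taken.
    The predictor stores only the most recent outcome.
    """
    prediction = initial_prediction
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken  # remember the latest outcome
    return correct / len(outcomes)


loop_pass = [True] * 9 + [False]  # taken 9 times, then not taken once
print(one_bit_accuracy(loop_pass * 100))  # 0.8
```

In steady state the predictor is wrong twice per pass through the loop: on the first iteration (it still remembers the previous exit) and on the final, not-taken iteration. That gives 8 correct predictions out of 10.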
CSCE430/830 Pipeline Hazards
n-bit Saturating Counter
• Values: 0 to 2^n - 1
• When the counter is greater than or equal to one-half
of its maximum value (2^n - 1), the branch is predicted as
taken; otherwise it is predicted as not taken.
• Studies of n-bit predictors have shown that 2-bit predictors do
almost as well, and thus most systems rely on 2-bit
branch predictors.
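The counter rule above can be sketched directly. The update policy (increment on taken, decrement on not taken, saturating at both ends) is the usual one for saturating counters and is stated here as an assumption, as are the function name and parameters.

```python
def counter_accuracy(outcomes, n_bits=2, initial=0):
    """Fraction of correct predictions for one n-bit saturating counter.

    outcomes: sequence of booleans, True meaning the branch was taken.
    """
    max_value = (1 << n_bits) - 1
    threshold = 1 << (n_bits - 1)  # smallest integer >= half of max_value
    counter = initial
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= threshold
        if predicted_taken == taken:
            correct += 1
        if taken:
            counter = min(counter + 1, max_value)  # saturate at the top
        else:
            counter = max(counter - 1, 0)          # saturate at the bottom
    return correct / len(outcomes)


loop_pass = [True] * 9 + [False]  # taken 9 times, then not taken once
print(counter_accuracy(loop_pass * 100, n_bits=2, initial=3))  # 0.9
print(counter_accuracy(loop_pass * 100, n_bits=1, initial=0))  # 0.8
```

On the loop branch that is taken 9 times and then not taken once, the 2-bit counter mispredicts only the loop exit, because a single not-taken outcome is not enough to flip its prediction.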
Prediction accuracy of 4K-entry 2-bit prediction buffer vs. “infinite” 2-bit buffer:
increasing buffer size from 4K does not significantly improve performance