CS/CoE 1541: Single and Multi-Cycle Implementations
1) Fetch instruction
2) Decode instruction
3) If necessary, perform an ALU operation
4) If memory access, perform load/store
5) Write results back to register file and increment the PC
Fetching Instruction
[Figure: instruction fetch datapath. The Program Counter (a register) supplies the instruction address to the Instruction Memory (RAM); an adder computes Next PC = Current PC + 4.]
Introduction to Computer Architecture
ALU Operation
Consider a basic ALU operation
add R1,R2,R3
Requires a Register File and an ALU
[Figure: Register File (inputs Read Reg 1, Read Reg 2, Write Reg, Write Data; outputs Read Data 1, Read Data 2) feeding an ALU controlled by ALU Operation; the ALU result returns to Write Data.]
Instruction fields: rt = 00100, rd = 00010, shamt = 00000, funct = 100000
[Figure: the instruction's register fields supply the register numbers to the Register File; the ALU result is written back under Write Enable.]
[Figures: load/store datapath. The Register File and ALU are extended with a Data Memory (RAM); ALU Add forms the read/write address, the 16-bit offset is sign-extended to 32 bits, and Write Enable controls register writes.]
Sign Extender
Instruction fields: op = 000000, rs = 00011, rt = 00100, immediate = 00010000 00100000
[Figure: sign extender. The 16-bit immediate (bits 15...0) is sign-extended to 32 bits before it enters the ALU.]
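The sign extender can be sketched in a few lines. This is a minimal illustration, assuming 32-bit values are held as unsigned Python integers; the function name `sign_extend_16` is made up for this example and does not come from the slides.

```python
def sign_extend_16(value: int) -> int:
    """Sign-extend a 16-bit field to a 32-bit two's-complement bit pattern."""
    value &= 0xFFFF          # keep only bits 15..0 of the immediate
    if value & 0x8000:       # bit 15 is the sign bit
        value |= 0xFFFF0000  # copy the sign bit into bits 31..16
    return value


if __name__ == "__main__":
    print(hex(sign_extend_16(0x0004)))  # positive immediate is unchanged
    print(hex(sign_extend_16(0xFFFC)))  # negative immediate gets ones in the upper half
```

A hardware sign extender does the same thing with wiring only: bit 15 is simply fanned out to the upper 16 output bits.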
[Figure: load datapath. ALU Add forms the Data Memory (RAM) read/write address from a register and the 16-bit instruction offset sign-extended to 32 bits; the data read is returned to the Register File as Write Data.]
Branch
BEQ
[Figures: branch datapath. The ALU compares the two register operands and sends Zero to the branch control logic; a separate ADDER computes the Branch Target from PC + 4 (from the instruction datapath) and the 16-bit offset, sign-extended to 32 bits and shifted left by 2.]
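The branch datapath above can be mirrored in software. The sketch below is illustrative only: `branch_target` and `next_pc` are hypothetical helper names, 32-bit wraparound is modelled with a mask, and the offset is treated as a word offset that is shifted left by 2, as in the figure.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16(value: int) -> int:
    """Sign-extend a 16-bit field to a 32-bit two's-complement bit pattern."""
    value &= 0xFFFF
    return value | 0xFFFF0000 if value & 0x8000 else value


def branch_target(pc: int, imm16: int) -> int:
    """Branch ADDER: (PC + 4) + (sign_extend(imm16) << 2), modulo 2**32."""
    pc_plus_4 = (pc + 4) & MASK32
    offset = (sign_extend_16(imm16) << 2) & MASK32
    return (pc_plus_4 + offset) & MASK32


def next_pc(pc: int, imm16: int, a: int, b: int) -> int:
    """BEQ: the ALU subtracts the operands; Zero selects the branch target."""
    zero = ((a - b) & MASK32) == 0
    return branch_target(pc, imm16) if zero else (pc + 4) & MASK32


if __name__ == "__main__":
    print(hex(next_pc(0x100, 3, 5, 5)))  # operands equal: branch taken
    print(hex(next_pc(0x100, 3, 5, 6)))  # operands differ: fall through to PC + 4
```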
[Figures: complete single-cycle datapath combining the PC, Instruction Memory (RAM), Register File, sign extend, ALU with Zero output, Data Memory (RAM), the branch ADDER with << 2, and MUXes that select the second ALU operand, the write-back data, and the next PC.]
ALU: 10 ns
Register File: 5 ns
Memory: 10 ns
Assume everything else takes zero time
Instruction Timings
Instr Type   InstrMem   Reg Read   ALU   DataMem   Reg Write   Total
R-format     10         5          10    -         5           30 ns
Load         10         5          10    10        5           40 ns
Store        10         5          10    10        -           35 ns
Branch       10         5          10    -         -           25 ns
Jump         10         -          -     -         -           10 ns
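The totals in the timing table follow from adding up the units each instruction type uses. The sketch below recomputes them; the dictionary names are invented for this example, and the final step assumes, as is usual for a single-cycle design, that the clock period must fit the slowest instruction.

```python
# Unit delays in ns, taken from the slide (everything else takes zero time).
UNIT_NS = {"InstrMem": 10, "RegRead": 5, "ALU": 10, "DataMem": 10, "RegWrite": 5}

# Which units each instruction type passes through (from the timing table).
USES = {
    "R-format": ["InstrMem", "RegRead", "ALU", "RegWrite"],
    "Load": ["InstrMem", "RegRead", "ALU", "DataMem", "RegWrite"],
    "Store": ["InstrMem", "RegRead", "ALU", "DataMem"],
    "Branch": ["InstrMem", "RegRead", "ALU"],
    "Jump": ["InstrMem"],
}


def instruction_time(kind: str) -> int:
    """Total delay in ns for one instruction type."""
    return sum(UNIT_NS[unit] for unit in USES[kind])


totals = {kind: instruction_time(kind) for kind in USES}
# Assumption: a single-cycle clock is set by the slowest instruction type.
single_cycle_period = max(totals.values())

if __name__ == "__main__":
    for kind, ns in totals.items():
        print(f"{kind:9s} {ns} ns")
    print("single-cycle clock period:", single_cycle_period, "ns")
```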
[Figure: multicycle datapath with a single Memory, Instruction Register, MDR, A and B registers, ALU with Zero output, ALUOut register, sign extend, shift left 2 units, the Jump Address path, and MUXes on the PC, memory address, register write, and ALU inputs.]
IR = Memory[PC];
PC = PC + 4;
A = Reg[IR[25..21]];
B = Reg[IR[20..16]];
ALUOut = PC + (signExtend(IR[15..0]) << 2);
ALUOut = A + B;
Branch
if (A == B) PC = ALUOut;
Jump
PC = PC[31..28] || (IR[25..0] << 2);
Reg[IR[20..16]] = MDR;
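The jump step concatenates the top four PC bits with the 26-bit instruction field shifted left by 2. A small sketch of that bit manipulation follows; `jump_target` is a hypothetical name, and `pc` is assumed to be the already incremented PC, since the fetch step performs PC = PC + 4.

```python
def jump_target(pc: int, ir: int) -> int:
    """PC[31..28] || (IR[25..0] << 2) for 32-bit pc and ir values."""
    upper = pc & 0xF0000000          # PC[31..28]
    lower = (ir & 0x03FFFFFF) << 2   # IR[25..0] shifted left by 2 (28 bits)
    return upper | lower


if __name__ == "__main__":
    # Illustrative values only: a jump word whose 26-bit field is 0x10.
    print(hex(jump_target(0x10000004, 0x08000010)))
```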
Multicycle Control
[Figure: multicycle datapath with control signals MemRead, MemWrite, RegWrite, IRWrite, IorD, RegDest, ALU SelA, ALU SelB, MemToReg, and ALU Op; the ALU Control unit takes Instruction [5:0].]
Instruction   Cycles     Frequency
loads         5 cycles   22%
stores        4 cycles   11%
R-type        4 cycles   49%
branches      3 cycles   16%
jump          3 cycles   2%
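One natural use of this table is to compute the average cycles per instruction (CPI) of the multicycle design as a frequency-weighted sum. The slides give only the cycle counts and frequencies; the calculation below is a sketch of that standard weighted average, with variable names chosen for this example.

```python
# (cycles, percent of instructions) per class, copied from the table.
MIX = {
    "loads": (5, 22),
    "stores": (4, 11),
    "R-type": (4, 49),
    "branches": (3, 16),
    "jump": (3, 2),
}

# Weighted average: sum of cycles * frequency, with frequencies in percent.
weighted_cycles = sum(cycles * percent for cycles, percent in MIX.values())
average_cpi = weighted_cycles / 100

if __name__ == "__main__":
    print(f"average CPI = {average_cpi:.2f}")
```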
CS/CoE 1541
Pipelining
Fetch an instruction
Decode the instruction
ALU OP
Memory Access
Write-back
[Figure: datapath with Memory, Instruction Register, register file, sign extend, shift left 2, ALU with Zero output, and MUXes.]
[Figure: instructions over time, unpipelined versus pipelined, with the latency of one instruction marked.]
Ideally, Speedup_pipeline = Time_sequential / Time_pipelined = Pipeline Depth
[Figures: the datapath divided into stages (Stage 2 Decode, Stage 3 ALU, Stage 4 MemAcc, Stage 5 Writeback), first without and then with registers between the stages; the pipeline registers are named IF/ID, ID/EX, EX/MEM, and MEM/WB.]
Time (clock cycle)    1    2    3    4    5    6    7
LW R1, 100(R0)        IM   REG  ALU  DM   Reg
LW R2, 200(R0)             IM   REG  ALU  DM   Reg
LW R3, 300(R0)                  IM   REG  ALU  DM   Reg
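A timing diagram like this one can be generated mechanically: instruction i enters stage s in cycle i + s + 1. The sketch below assumes an ideal pipeline with no stalls; `ideal_schedule` and the stage labels list are illustrative names, with labels borrowed from the diagram.

```python
STAGES = ["IM", "REG", "ALU", "DM", "Reg"]  # stage labels used in the diagram


def ideal_schedule(instructions):
    """Return (instruction, row) pairs; row[c] is the stage occupied in cycle c + 1."""
    n_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, instr in enumerate(instructions):
        row = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # each instruction starts one cycle after the previous one
        rows.append((instr, row))
    return rows


if __name__ == "__main__":
    program = ["LW R1, 100(R0)", "LW R2, 200(R0)", "LW R3, 300(R0)"]
    for instr, row in ideal_schedule(program):
        print(f"{instr:16s}" + "".join(f"{stage:5s}" for stage in row))
```

With k stages and n instructions the last instruction finishes in cycle n + k - 1, which is 7 for the three loads above.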
[Figures: pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB registers, repeated on two slides.]
Stage 3 - EX (Execution)
[Figures: LW in the Execution stage, followed by further copies of the pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB registers.]
Clock Speed
If a single-cycle machine is broken into 2 pipeline stages,
how much faster can the clock run?
Latency is time from start to completion of instruction
[Figures: instructions entering and results leaving a unit with 100 nsecs latency, repeated as the unit is divided into pipeline stages.]
5 Stage Pipeline
[Figure: 5-stage pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB registers.]
Pipeline Control
[Figure: pipelined datapath with control. The CONTROL unit produces the EX, M, and WB control fields, which are carried through ID/EX, EX/MEM, and MEM/WB; an ALU Control unit drives the ALU.]
Time (clock cycle)    1    2    3    4    5    6    7    8
ADD R2,R3,R1          IM   REG  ALU  DM   Reg
SUB R5,R6,R7               IM   REG  ALU  DM   Reg
ADD R10,R11,R12                 IM   REG  ALU  DM   Reg
[Figure: pipeline timing diagram over clock cycles 1 to 8; the instruction labels are not recoverable.]
Time (clock cycle)    1    2    3    4    5    6    7    8
ADD R2,R3,R1          IM   REG  ALU  DM   Reg
SUB R5,R6,R7               IM   REG  ALU  DM   Reg
ADD R10,R11,R12                 IM   REG  ALU  DM   Reg
ADD R12,R10,R11                      IM   REG  ALU  DM   Reg
Writeback of ADD R10,R11,R12: result into R10
Data Hazards
Programs assume instructions are executed sequentially with one
instruction completing before the next one begins
Usually the compiler assumes the single machine model
Write-after-read (WAR)
Artificial dependency due to register assignment
Write-after-write (WAW)
Artificial dependency due to register assignment
[Figure: pipeline timing diagram over clock cycles 1 to 8; the instruction labels are not recoverable.]
Solution 1: Stall
[Figure: program execution timing diagram over clock cycles 1 to 8 in which bubbles are inserted to delay a later instruction.]
[Figure: pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB registers.]
Stall Conditions
Need to detect data hazards
A data hazard occurs when one instruction tries to read the result of a previous instruction that hasn't completed yet.
Specifically, when the instruction in the Execute stage tries to read a register that an instruction in the MemAcc or WB stage will write back to the Register File
H&P Notation
ID/EX.RegisterRs refers to the number of the first source register found in the pipeline register ID/EX.
ID/EX.RegisterRt refers to the number of the second source register found in the pipeline register ID/EX.
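The stall condition can be written as a comparison between pipeline-register fields. The sketch below is a simplified model: the `Latch` type and `data_hazard` function are hypothetical names, an empty field is modelled as `None`, and refinements such as checking whether the older instruction really writes a register are left out.

```python
from typing import NamedTuple, Optional


class Latch(NamedTuple):
    """Register-number fields held in one pipeline register (None = not used)."""
    rs: Optional[int] = None
    rt: Optional[int] = None
    rd: Optional[int] = None


def data_hazard(id_ex: Latch, ex_mem: Latch, mem_wb: Latch) -> bool:
    """True if ID/EX.RegisterRs or ID/EX.RegisterRt matches a pending destination."""
    sources = {r for r in (id_ex.rs, id_ex.rt) if r is not None}
    pending = {r for r in (ex_mem.rd, mem_wb.rd) if r is not None}
    return bool(sources & pending)


if __name__ == "__main__":
    # Illustrative register numbers only.
    print(data_hazard(Latch(rs=2, rt=5, rd=12), Latch(rd=2), Latch()))
    print(data_hazard(Latch(rs=6, rt=7, rd=13), Latch(rd=12), Latch(rd=2)))
```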
Machine word: 00000010 00110010 01000000 00100000
Bits    31-26    25-21   20-16   15-11   10-6    5-0
Field   op       rs      rt      rd      shamt   funct
Value   000000   10001   10010   01000   00000   100000
op: operation of the instruction
rs: first register source operand
rt: second register source operand
rd: register destination operand
shamt: shift amount
funct: function (select type of operation)
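The field boundaries above translate directly into shifts and masks. The following sketch decodes the example word from the slide; `decode_r_format` is a name invented for this illustration.

```python
def decode_r_format(word: int) -> dict:
    """Split a 32-bit R-format instruction word into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31..26
        "rs": (word >> 21) & 0x1F,     # bits 25..21
        "rt": (word >> 16) & 0x1F,     # bits 20..16
        "rd": (word >> 11) & 0x1F,     # bits 15..11
        "shamt": (word >> 6) & 0x1F,   # bits 10..6
        "funct": word & 0x3F,          # bits 5..0
    }


# The bit pattern shown on the slide, written field by field.
EXAMPLE_WORD = 0b000000_10001_10010_01000_00000_100000

if __name__ == "__main__":
    print(hex(EXAMPLE_WORD))
    print(decode_r_format(EXAMPLE_WORD))
```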
[Figures: pipelined datapath with data hazard detection. ID/EX carries Rs, Rt, and Rd; EX/MEM and MEM/WB carry Rd. Comparators (==) check ID/EX.RegisterRs and ID/EX.RegisterRt against EX/MEM.RegisterRd and against MEM/WB.RegisterRd (Rs =? Rd, Rt =? Rd).]
Example
sub R2, R1, R3      Rd = R2    Rs = R1   Rt = R3
and R12, R2, R5     Rd = R12   Rs = R2   Rt = R5
or  R13, R6, R2     Rd = R13   Rs = R6   Rt = R2
add R14, R2, R2     Rd = R14   Rs = R2   Rt = R2
sw  R15, 100(R2)    Rd = R15   Rs = R2   Rt = XX
SUB-AND Hazard: EX/MEM.RegisterRd == ID/EX.RegisterRs == R2
SUB-OR Hazard: MEM/WB.RegisterRd == ID/EX.RegisterRt == R2
Example (cont)
Data Hazard Logic
[Figure: pipelined datapath with Rs, Rt, and Rd fields in ID/EX and Rd fields in EX/MEM and MEM/WB, all initially empty.]
ID/EX: Rs = R3, Rt = R1, Rd = R2
EX/MEM: Rd empty
MEM/WB: Rd empty
Example (cont)
Data Hazard Logic
IF/ID: OR R13, R6, R2
ID/EX: AND R12, R2, R5 (Rs = R5, Rt = R2, Rd = R12)
EX/MEM: SUB R2, R1, R3 (Rd = R2)
MEM/WB: Rd empty
EX/MEM.RegisterRd = R2 != ID/EX.RegisterRs = R5
EX/MEM.RegisterRd = R2 == ID/EX.RegisterRt = R2
Example (cont)
Data Hazard Logic
IF/ID: ADD R14, R2, R2
ID/EX: OR R13, R6, R2 (Rs = R6, Rt = R2, Rd = R13)
EX/MEM: AND R12, R2, R5 (Rd = R12)
MEM/WB: Rd = R2
EX/MEM.RegisterRd = R12 != ID/EX.RegisterRs = R6
EX/MEM.RegisterRd = R12 != ID/EX.RegisterRt = R2
MEM/WB.RegisterRd = R2 != ID/EX.RegisterRs = R6
MEM/WB.RegisterRd = R2 == ID/EX.RegisterRt = R2
[Figures: pipeline timing diagrams over clock cycles 1 to 8 for the example (including OR R13, R6, R2): the original sequence, a version with a nop inserted, and a version with a stall and bubbles; followed by the pipelined datapath.]
Forwarding Continued
[Figure: program execution timing diagram over clock cycles 1 to 8 illustrating forwarding; the instruction labels are not recoverable.]
[Figure: pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB registers.]
[Figures: pipeline timing diagrams over clock cycles 1 to 8 for LW R10, 0x00(R4) and the instructions after it, first as written and then with a nop inserted after the load.]
Write-after-read (WAR)
artificial dependency due to register assignment
Example
LW R1,0(R2)
ADD R2, R6, R3
Write-after-write (WAW)
artificial dependency due to register assignment
Example
LW R1, 0(R2)
ADD R1, R3, R4
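The two examples can be checked with a small dependency classifier. This is a sketch under stated assumptions: `classify` is a hypothetical helper, each instruction is reduced to one destination register and a set of source registers, and read-after-write (RAW), the true dependency, is included for completeness even though the list above names only WAR and WAW.

```python
def classify(first_dst, first_srcs, second_dst, second_srcs):
    """Return the dependency types between an earlier and a later instruction."""
    found = set()
    if first_dst in second_srcs:
        found.add("RAW")  # later instruction reads what the earlier one writes
    if second_dst in first_srcs:
        found.add("WAR")  # later instruction writes what the earlier one reads
    if second_dst == first_dst:
        found.add("WAW")  # both instructions write the same register
    return found


if __name__ == "__main__":
    # LW R1, 0(R2) followed by ADD R2, R6, R3
    print(classify("R1", {"R2"}, "R2", {"R6", "R3"}))
    # LW R1, 0(R2) followed by ADD R1, R3, R4
    print(classify("R1", {"R2"}, "R1", {"R3", "R4"}))
```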
Taxonomy of Hazards
Data hazards are just one type of hazard that can occur in a machine. There are actually 3 basic types of hazards.
Hazard Taxonomy
Data hazards
Instruction depends on result of prior computation which is not ready yet
Structural hazards
HW cannot support a combination of instructions
Control hazards
Pipelining of branches and other instructions which change the PC
Structural Hazards
Structural hazards
HW cannot support a combination of instructions
Occurs when two or more instructions want to use the same
hardware resource in the same cycle
Causes bubble (stall) in pipelined machines
Overcome by replicating hardware resources
Examples
Multiple accesses to the register file
Branch adder and ALU
Multiple accesses to memory
[Figure: pipelined datapath annotated with an instruction whose operands are r1,r2, offset.]
Time (clock cycle)    1    2    3    4    5    6    7    8
LW R2, 0x10(R4)       IM   REG  ALU  DM   Reg
SUB R5,R6,R7               IM   REG  ALU  DM   Reg
ADD R10,R11,R12                 IM   REG  ALU  DM   Reg
(fourth instruction)                 IM   REG  ALU  DM   Reg
[Figure: program execution timing diagram over clock cycles 1 to 8 for LW R2, 0x10(R4), SUB R5,R6,R7, and ADD R10,R11,R12, with a Stall inserted before the following instruction.]
Address   Instruction
36        NOP
40        ADD R30,R30,R30
44        BEQ R1, R3, 24      <- this branches to address 72
48        AND R12, R2, R5
52        OR R13, R6, R2
56        ADD R14, R2, R2
...
...
...
72        LW R4, 50(R7)
...
Branch Hazards
Flow of instructions if branch is taken: 36, 40, 44, 72, ...
Flow of instructions if branch is not taken: 36, 40, 44, 48, ...
[Figure: pipeline timing diagram over clock cycles 1 to 9 starting at 44 BEQ R1, R3, 24; the following instructions, including 52 OR R13, R6, R2, enter the pipeline before the fetch of 60 or 72 (depending on branch).]
[Figures: pipeline timing diagrams over clock cycles 1 to 9 for the branch example, showing stall bubbles after the branch and the cases in which 52 OR R13, R6, R2 and 72 LW R4, 50(R7) follow it; pipelined datapath figures, the last with a Compare unit and an extra ADDER placed next to IF/ID.]
Instruction
NOP
ADD R30,R30,R30
BEQ R1, R3, 24
AND R12, R2, R5
OR R13, R6, R2
ADD R14, R2, R2
...
...
...
LW R4, 50(R7)
...
Instruction
NOP
BEQ R1, R3, 28
ADD R30, R30, R30
AND R12, R2, R5
OR R13, R6, R2
ADD R14, R2, R2
...
...
...
LW R4, 50(R7)
...
Example
lw R15, 0x00(R2)
add R14, R15, R15
lw R16, 0x04(R2)
Might become:
lw R15, 0x00(R2)
lw R16, 0x04(R2)
add R14, R15, R15
[Figure: instructions over time, unpipelined versus pipelined, with the latency of one instruction marked.]
Ideally, Throughput_pipeline = Pipeline Depth / Time_sequential
Without pipelining
Throughput = $1 / \sum_{i=1}^{n} t_i$
Latency = 1/throughput = $\sum_{i=1}^{n} t_i$
With pipelining
Throughput = $1 / \max_i t_i \le n / \sum_{i=1}^{n} t_i$
Latency = n/throughput = $n \cdot \max_i t_i$
Speedup = $\sum_{i=1}^{n} t_i / \max_i t_i \le n$
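These formulas are easy to evaluate for concrete stage times. The sketch below uses the unit delays from the earlier timing slide (10, 5, 10, 10, 5 ns) as the five stage times, which is an assumption made for illustration; `pipeline_metrics` is a hypothetical helper name.

```python
def pipeline_metrics(stage_times):
    """Evaluate the throughput, latency, and speedup formulas for stage times t_i."""
    n = len(stage_times)
    total = sum(stage_times)      # sum of t_i
    slowest = max(stage_times)    # max t_i sets the pipelined clock period
    return {
        "unpipelined_throughput": 1 / total,
        "unpipelined_latency": total,
        "pipelined_throughput": 1 / slowest,
        "pipelined_latency": n * slowest,
        "speedup": total / slowest,
    }


if __name__ == "__main__":
    # Assumed stage times in ns: InstrMem, Reg Read, ALU, DataMem, Reg Write.
    for name, value in pipeline_metrics([10, 5, 10, 10, 5]).items():
        print(f"{name}: {value}")
```

Because the stages are unbalanced, the speedup here is 4 rather than the pipeline depth of 5, and the latency of a single instruction grows from 40 ns to 50 ns.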