Chapter 4
Edition
The Hardware/Software Interface
The Processor
§4.1 Introduction
◼ CPU performance factors
    ◼ Instruction count
        ◼ Determined by ISA and compiler
    ◼ CPI and cycle time
        ◼ Determined by CPU hardware
◼ We will examine two RISC-V implementations
    ◼ A simplified version
    ◼ A more realistic pipelined version
◼ Simple subset, shows most aspects
    ◼ Memory reference: ld, sd
    ◼ Arithmetic/logical: add, sub, and, or
    ◼ Control transfer: beq
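The performance factors listed above combine in the usual way: CPU time = instruction count × CPI × clock cycle time. The sketch below only illustrates that product; the function name `cpu_time` and all of the numbers are hypothetical and chosen for the example.

```python
def cpu_time(instruction_count: int, cpi: float, cycle_time_s: float) -> float:
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: one million instructions on a 1 ns clock.
# Instruction count comes from the ISA and compiler; CPI and cycle
# time come from the CPU hardware.
base = cpu_time(1_000_000, cpi=2.0, cycle_time_s=1e-9)
faster = cpu_time(1_000_000, cpi=1.0, cycle_time_s=1e-9)

print(f"base:    {base:.6f} s")
print(f"faster:  {faster:.6f} s")
print(f"speedup: {base / faster:.2f}x")
```

Halving CPI at a fixed instruction count and cycle time halves CPU time, which is the kind of gain a hardware change such as pipelining is aimed at.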
◼ Multiplexer
    ◼ Y = S ? I1 : I0
◼ Arithmetic/Logic Unit
    ◼ Y = F(A, B)
[Figure: multiplexer with inputs I0 and I1, select S, and output Y; ALU with inputs A and B, function select F, and output Y]
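The multiplexer and ALU can be modeled as pure functions of their inputs. This is a minimal sketch, not a hardware description: `mux2` and `alu` are hypothetical names, values are modeled as unsigned 64-bit integers, and the AND (0000) and OR (0001) function codes are assumed from the usual textbook encoding. Only add (0010) and subtract (0110) appear in these slides.

```python
MASK64 = (1 << 64) - 1  # model 64-bit datapath values as unsigned ints


def mux2(s: int, i0: int, i1: int) -> int:
    """Two-input multiplexer: Y = S ? I1 : I0."""
    return i1 if s else i0


def alu(f: int, a: int, b: int) -> int:
    """Y = F(A, B), with F selected by a 4-bit function code."""
    if f == 0b0000:    # AND (assumed encoding)
        y = a & b
    elif f == 0b0001:  # OR (assumed encoding)
        y = a | b
    elif f == 0b0010:  # add
        y = a + b
    elif f == 0b0110:  # subtract
        y = a - b
    else:
        raise ValueError(f"unsupported ALU function {f:04b}")
    return y & MASK64  # wrap to 64 bits, as a fixed-width ALU would


print(mux2(0, 5, 9), mux2(1, 5, 9))
print(alu(0b0010, 7, 5))
```

Because both elements are combinational, the output depends only on the current inputs, so plain functions with no stored state are enough to model them.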
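[Figure: register with inputs D and Clk and output Q, with timing diagram of Clk, D, and Q]
[Figure: register with write control, with inputs D, Clk, and Write and output Q, with timing diagram of Clk, Write, D, and Q]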
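A register with write control can be sketched as a small state-holding object. The class and method names here are hypothetical, and the sketch assumes an edge-triggered element: each call to `clock_edge` stands for one active clock edge, and Q changes only when Write is asserted on that edge.

```python
class Register:
    """State element: Q takes the value of D on a clock edge when Write is 1."""

    def __init__(self, value: int = 0) -> None:
        self.q = value

    def clock_edge(self, d: int, write: bool = True) -> int:
        # One call models one active clock edge (assumed rising edge).
        if write:
            self.q = d
        return self.q


r = Register()
r.clock_edge(7)               # Write asserted: Q becomes 7
r.clock_edge(9, write=False)  # Write deasserted: Q holds 7
print(r.q)
```

A register without write control is the special case where `write` is always true, which is why the default argument is `True`.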
[Figure callouts: 64-bit register; increment by 4 for next instruction; sign-bit wire replicated]
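The two operations named in these callouts, stepping the 64-bit PC by 4 and extending an immediate by replicating its sign bit, can be sketched as follows. The function names and the 12-bit immediate width used in the example are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1


def next_pc(pc: int) -> int:
    """Sequential fetch: increment the 64-bit PC by 4."""
    return (pc + 4) & MASK64


def sign_extend(value: int, bits: int) -> int:
    """Extend a `bits`-wide field to 64 bits by replicating its sign bit."""
    value &= (1 << bits) - 1
    sign = value >> (bits - 1)
    # If the sign bit is 1, set every bit above the field; otherwise leave 0s.
    upper = ((MASK64 >> bits) << bits) if sign else 0
    return value | upper


print(hex(next_pc(0x1000)))
print(hex(sign_extend(0x7FF, 12)))  # positive: upper bits stay 0
print(hex(sign_extend(0x800, 12)))  # negative: upper bits become 1
```

In hardware the replication costs no logic: the sign-bit wire is simply fanned out to all the upper bit positions, which the `upper` mask mimics here.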
opcode  ALUOp  Operation        Opcode field  ALU function  ALU control
ld      00     load register    XXXXXXXXXXX   add           0010
sd      00     store register   XXXXXXXXXXX   add           0010
beq     01     branch on equal  XXXXXXXXXXX   subtract      0110
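The rows above amount to a lookup from ALUOp to an ALU control value. This sketch covers only the two ALUOp values shown in the table; the dictionary and function names are hypothetical, and any other ALUOp is rejected because its decoding is not given here.

```python
# ALUOp -> (ALU function, ALU control), taken from the table rows above.
ALU_CONTROL = {
    0b00: ("add", 0b0010),       # ld, sd: add to form the memory address
    0b01: ("subtract", 0b0110),  # beq: subtract to compare the operands
}


def alu_control(alu_op: int) -> int:
    """Return the 4-bit ALU control value for a 2-bit ALUOp."""
    try:
        return ALU_CONTROL[alu_op][1]
    except KeyError:
        # Other ALUOp values are not listed in this table.
        raise ValueError(f"ALUOp {alu_op:02b} is not covered by this sketch")


print(f"{alu_control(0b00):04b}")
print(f"{alu_control(0b01):04b}")
```

For these opcodes the opcode field is a don't-care (all X), so ALUOp alone determines the ALU control value.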
◼ Four loads:
    ◼ Speedup = 8/3.5 = 2.3
◼ Non-stop:
    ◼ Speedup = 2n/(0.5n + 1.5) ≈ 4 for large n
    ◼ = number of stages
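The speedup figures can be checked numerically. The sketch assumes what the formula implies: four stages of 0.5 time units each, so one load takes 2 units on its own, and a pipeline needs 1.5 units to fill before finishing one load every 0.5 units. The function names are hypothetical.

```python
def sequential_time(n: int, stages: int = 4, stage_time: float = 0.5) -> float:
    """Each load runs all stages before the next starts: 2n for the defaults."""
    return n * stages * stage_time


def pipelined_time(n: int, stages: int = 4, stage_time: float = 0.5) -> float:
    """Fill the pipeline once, then finish one load per stage time: 0.5n + 1.5."""
    return (stages - 1) * stage_time + n * stage_time


def speedup(n: int) -> float:
    return sequential_time(n) / pipelined_time(n)


print(f"four loads: {speedup(4):.1f}")
print(f"non-stop:   {speedup(1_000_000):.3f}")
```

As n grows, the 1.5-unit fill time becomes negligible and the speedup approaches 2n/0.5n = 4, the number of stages.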
◼ In RISC-V pipeline
    ◼ Need to compare registers and compute target early in the pipeline
    ◼ Add hardware to do it in ID stage
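Resolving a beq in the ID stage requires two things there: a register comparison and a target-address computation. This minimal sketch shows both; the function name is hypothetical, and `offset` is assumed to be the already sign-extended byte offset of the branch.

```python
MASK64 = (1 << 64) - 1


def resolve_beq_in_id(pc: int, rs1_val: int, rs2_val: int, offset: int) -> int:
    """Return the next PC for a beq whose outcome is decided in the ID stage."""
    taken = rs1_val == rs2_val      # register comparator
    target = (pc + offset) & MASK64  # target-address adder
    fall_through = (pc + 4) & MASK64
    return target if taken else fall_through


print(hex(resolve_beq_in_id(0x100, 5, 5, -16)))  # equal: branch taken
print(hex(resolve_beq_in_id(0x100, 5, 6, -16)))  # not equal: fall through
```

Deciding the outcome this early means fewer wrongly fetched instructions follow a taken branch than if the comparison waited for a later stage.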
[Figure callout, MEM and WB stages: right-to-left flow leads to hazards]
[Figure callout: wrong register number]
Mux control    Source  Explanation
ForwardB = 00  ID/EX   The second ALU operand comes from the register file.
ForwardB = 10  EX/MEM  The second ALU operand is forwarded from the prior ALU result.
ForwardB = 01  MEM/WB  The second ALU operand is forwarded from data memory or an earlier ALU result.
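The ForwardB values above can be produced by a small priority check. The conditions below follow the standard textbook forwarding rules and are not stated in this excerpt, so treat them as an assumption: forward only from an instruction that writes a register other than x0, and prefer the more recent EX/MEM result over MEM/WB. The function and parameter names are hypothetical.

```python
def forward_b(id_ex_rs2: int,
              ex_mem_reg_write: bool, ex_mem_rd: int,
              mem_wb_reg_write: bool, mem_wb_rd: int) -> int:
    """Select the source of the second ALU operand (ForwardB)."""
    # EX hazard: the prior instruction's ALU result is the newest value.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs2:
        return 0b10
    # MEM hazard: only reached when there is no EX hazard.
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_rs2:
        return 0b01
    # No hazard: use the value read from the register file.
    return 0b00


print(f"{forward_b(5, True, 5, True, 5):02b}")    # both match: EX/MEM wins
print(f"{forward_b(5, False, 5, True, 5):02b}")   # only MEM/WB writes x5
print(f"{forward_b(5, False, 0, False, 0):02b}")  # no forwarding needed
```

Checking EX/MEM first matters when two earlier instructions write the same register: the later write is the value the program order requires.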
[Figure callout: stall inserted here]
[Figure callout: flush these instructions (set control values to 0)]
[Figure callouts: PC; hold pending operands]