Instruction Pipeline - Study Notes
Pipeline
COMPUTER ORGANIZATION
Copyright © 2014-2021 Testbook Edu Solutions Pvt. Ltd.: All rights reserved
Pipelining
Pipelining is a mechanism for overlapping the execution of many input sets by dividing one computation into
k computation sub-stages.
Throughput improves, and the ideal speedup approaches k when the stages are balanced.
Working of Pipeline
For a single item, S1 must happen before S2, and S2 must happen before S3 (sequential execution).
Note: When Item 1 is in stage S2, stage S1 is free, so S1 can start processing Item 2 at that time, in
parallel with the execution of Item 1 in S2.
In the processor pipeline we need a latch between successive stages to hold the intermediate results
temporarily.
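The overlap described above can be sketched as a space-time table. This is a minimal illustration, assuming every stage takes exactly one cycle and no item ever stalls; the function name space_time and the choice of 4 items over 3 stages are made up for the example.

```python
def space_time(n_items, n_stages):
    """Return table[stage][cycle] = item number (1-based), or 0 if the stage is idle."""
    total_cycles = n_stages + n_items - 1
    table = [[0] * total_cycles for _ in range(n_stages)]
    for item in range(n_items):
        for stage in range(n_stages):
            # Item `item` enters stage `stage` exactly `stage` cycles after it starts.
            table[stage][item + stage] = item + 1
    return table


table = space_time(n_items=4, n_stages=3)
for s, row in enumerate(table, start=1):
    print(f"S{s}: " + " ".join(f"I{x}" if x else "--" for x in row))
```

Each printed row is one stage and each column is one clock cycle, so reading down a column shows which items are being processed in parallel during that cycle.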
Pipelined Processors
a. Degree of Overlap:
Serial: Next operation starts only after the previous operation gets completed.
b. Depth of Pipeline:
Performance of the pipeline depends on the number of stages and how they are utilized without conflict.
c. Scheduling alternatives:
Static Pipeline:
ii. If one instruction stalls, all subsequent ones also get delayed.
Dynamic Pipeline:
Clock period: τ = τm + dL, where τm is the maximum stage delay and dL is the latch delay.
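As a small worked example of τ = τm + dL, the sketch below takes τm to be the largest stage delay and dL to be the latch delay. The stage delays and latch delay are hypothetical values in nanoseconds, chosen only for illustration.

```python
def clock_period(stage_delays, latch_delay):
    """tau = tau_m + d_L, with tau_m the slowest stage delay."""
    tau_m = max(stage_delays)
    return tau_m + latch_delay


# Hypothetical delays in nanoseconds (not from the notes).
stage_delays_ns = [10, 8, 12, 9]
latch_delay_ns = 1

tau_ns = clock_period(stage_delays_ns, latch_delay_ns)
frequency_mhz = 1000 / tau_ns  # 1 / tau, converted from ns to MHz
print(f"tau = {tau_ns} ns, f = {frequency_mhz:.1f} MHz")
```

The slowest stage sets the clock for the whole pipeline, which is why balancing stage delays matters.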
Latency
The number of time units between two initiations of a pipeline is called the latency between them.
When two or more inputs attempt to use the same pipeline stage at the same time, a collision occurs.
Latencies that cause collisions are called forbidden latencies.
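One common way to find forbidden latencies is from a reservation table, which marks the time steps at which each stage is used by a single initiation. The notes do not give a table, so the one below is a hypothetical example and the function name is an assumption. Any distance between two marks in the same row is a forbidden latency, because a second initiation started that many cycles later would need the same stage at the same time.

```python
from itertools import combinations


def forbidden_latencies(reservation_table):
    """Collect the distances between every pair of marks in the same stage row."""
    forbidden = set()
    for row in reservation_table:
        used_times = [t for t, mark in enumerate(row) if mark]
        for t1, t2 in combinations(used_times, 2):
            forbidden.add(t2 - t1)
    return forbidden


# Hypothetical reservation table: rows are stages, columns are time steps.
reservation_table = [
    [1, 0, 0, 0, 0, 1, 0, 1],  # S1 used at t = 0, 5, 7
    [0, 1, 0, 1, 0, 0, 0, 0],  # S2 used at t = 1, 3
    [0, 0, 1, 0, 1, 0, 1, 0],  # S3 used at t = 2, 4, 6
]

forbidden = forbidden_latencies(reservation_table)
max_latency = len(reservation_table[0]) - 1
permissible = [p for p in range(1, max_latency + 1) if p not in forbidden]
print("forbidden:", sorted(forbidden))
print("permissible:", permissible)
```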
Treat each of the five steps IF, ID, EX, MEM and WB as a pipeline stage.
Each stage must finish its execution within one clock cycle.
Since many instructions will be overlapped, we must ensure that there is no conflict.
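In a pipelined processor, the time to execute n instructions (stage time T, latch delay Δ):
= (4 + n)(T + Δ)
= (4 + n)T, if T >> Δ
Speedup over non-pipelined execution (5nT) = 5n / (4 + n) ≃ 5 if n is very large.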
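The limit of about 5 can be checked numerically. This sketch assumes the non-pipelined processor takes 5T per instruction with no latch overhead, which is the assumption under which the ratio tends to 5; the function names and the sample values of T and Δ are illustrative only.

```python
def pipelined_time(n, T, delta, stages=5):
    """(stages - 1 + n) * (T + delta); equals (4 + n)(T + delta) for 5 stages."""
    return (stages - 1 + n) * (T + delta)


def nonpipelined_time(n, T, stages=5):
    """Assumption: each instruction takes `stages` steps of T, with no latches."""
    return stages * n * T


def speedup(n, T, delta, stages=5):
    return nonpipelined_time(n, T, stages) / pipelined_time(n, T, delta, stages)


for n in (1, 10, 100, 10_000):
    print(n, round(speedup(n, T=1.0, delta=0.0), 3))
```

With Δ = 0 a single instruction gains nothing, and the speedup climbs toward 5 as n grows. A nonzero Δ lowers the limit to 5T / (T + Δ).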
Conflict Stages
IF and MEM: Both these stages access memory. So, they should not be in the same cycle.
SOLUTION: Using separate instruction and data cache. (i-cache and d-cache)
ID and WB: Both these stages access register banks. So, they should not be used in the same clock cycle.
SOLUTION: Allow both read and write access to registers in the same clock cycle.
Points to Remember
1. In a pipelined processor we have to fetch an instruction every clock cycle, so the program counter must
be incremented in the fetch stage itself. Otherwise, the next instruction cannot be fetched.
2. In a non-pipelined processor there is no need to fetch an instruction every clock cycle, so the program
counter can be incremented in the MEM stage.
Pipeline Hazards
An instruction pipeline should complete the execution of an instruction every clock cycle.
Hazards are situations that prevent this from happening (for some instructions).
Hazards
1. Structural Hazards (resource conflicts)
2. Data Hazards (data dependencies)
3. Control Hazards (branches and other changes in the flow of control)
When one instruction is stalled, all the instructions that follow it are also stalled.
Structural Hazards
Arise due to resource conflicts, when two instructions need the same hardware resource in the same cycle.
Data Hazards
Data hazards occur due to data dependencies between instructions.
Bypassing
The result computed by the previous instruction is held in some register within the data path.
Take the value directly from that register and forward it to the instruction that needs it.
Register Read/Write
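This scheme reduces the number of instructions that need forwarding.
The conflict that arises when WB and ID fall in the same cycle can be avoided with the register read/write
scheme: the register write is done in the first half of the clock cycle and the read in the second half.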
Pipeline Interlock: Hardware detects the hazard and stalls the pipeline until the hazard is cleared.
Instruction Issue
For a typical ALU instruction, the instruction is decoded in the ID stage, before the EX stage.
Moving from the ID stage to the EX stage means the operation starts to execute. This is the point at which
the instruction is issued.
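MIPS 32 Code
Unscheduled sequence (R1 and R2 are reused for both pairs of loads):
LW R1, a
LW R2, b
SW R8, x
LW R1, c
LW R2, d
SW R9, y
Scheduled sequence (all loads issued first, using the extra registers R3 and R4):
LW R1, a
LW R2, b
LW R3, c
LW R4, d
SW R8, x
SW R9, y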
Pipeline scheduling can increase the number of registers required, but it improves performance.
Delayed load: a load instruction requires that the instruction immediately following it does not use the
value being loaded.
If the compiler cannot move some instruction to fill up the delay slot, it can insert a NOP (No operation)
instruction.
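The delayed-load rule and the NOP fallback can be sketched as a small pass over an instruction list. This is an illustration under stated assumptions, not a real assembler: instructions are modelled as (opcode, destination, sources) tuples, the load delay is one slot, and the ADD instructions that produce R8 and R9 are hypothetical, since the listing above shows only the loads and stores.

```python
NOP = ("NOP", None, ())


def fill_load_delay_slots(program):
    """Insert a NOP after every LW whose loaded register is read by the next instruction."""
    result = []
    for i, (op, dest, sources) in enumerate(program):
        result.append((op, dest, sources))
        is_last = i + 1 == len(program)
        if op == "LW" and not is_last and dest in program[i + 1][2]:
            result.append(NOP)
    return result


# (opcode, destination register, source operands); the ADDs are hypothetical.
unscheduled = [
    ("LW", "R1", ("a",)),
    ("LW", "R2", ("b",)),
    ("ADD", "R8", ("R1", "R2")),
    ("SW", None, ("R8", "x")),
    ("LW", "R1", ("c",)),
    ("LW", "R2", ("d",)),
    ("ADD", "R9", ("R1", "R2")),
    ("SW", None, ("R9", "y")),
]
scheduled = [
    ("LW", "R1", ("a",)),
    ("LW", "R2", ("b",)),
    ("LW", "R3", ("c",)),
    ("LW", "R4", ("d",)),
    ("ADD", "R8", ("R1", "R2")),
    ("ADD", "R9", ("R3", "R4")),
    ("SW", None, ("R8", "x")),
    ("SW", None, ("R9", "y")),
]

nops_unscheduled = fill_load_delay_slots(unscheduled).count(NOP)
nops_scheduled = fill_load_delay_slots(scheduled).count(NOP)
print("NOPs needed:", nops_unscheduled, "unscheduled vs", nops_scheduled, "scheduled")
```

Under these assumptions the scheduled order needs fewer NOPs, at the cost of two extra registers, which is the trade-off noted above.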
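RAW (read after write): an instruction refers to a result that has not yet been calculated.
Example:
i1: R2 ← R5 + R3
i2: R4 ← R2 + R3
WAR (write after read): an instruction writes a register that an earlier instruction still has to read.
Example:
i1: R4 ← R1 + R5
i2: R5 ← R1 + R2
WAW (write after write): two instructions write the same register, so the writes must stay in order.
Example:
i1: R2 ← R4 + R7
i2: R2 ← R1 + R3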
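The three example pairs above can be classified mechanically by comparing destination and source registers. The sketch below assumes each instruction is reduced to a (destination, sources) pair and that i1 comes before i2 in program order; the function name is made up for the example.

```python
def classify_hazards(i1, i2):
    """Return the data dependencies of i2 on the earlier instruction i1."""
    dest1, sources1 = i1
    dest2, sources2 = i2
    hazards = []
    if dest1 in sources2:
        hazards.append("RAW")  # i2 reads what i1 writes
    if dest2 in sources1:
        hazards.append("WAR")  # i2 writes what i1 reads
    if dest1 == dest2:
        hazards.append("WAW")  # both write the same register
    return hazards


examples = [
    (("R2", {"R5", "R3"}), ("R4", {"R2", "R3"})),
    (("R4", {"R1", "R5"}), ("R5", {"R1", "R2"})),
    (("R2", {"R4", "R7"}), ("R2", {"R1", "R3"})),
]
results = [classify_hazards(i1, i2) for i1, i2 in examples]
for result in results:
    print(result)
```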
Control Hazard
Arise because of change in flow of control or branch instructions.
If the branch is taken, the PC is normally not updated until the end of MEM.
The next instruction can be fetched only after that (3 stall cycles).
By applying comparison logic to the register values read in ID, the branch outcome and the effective
(target) address can be computed by the end of the ID stage, which shortens the stall.
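The cost of branch stalls can be estimated with a simple CPI model. This is a rough sketch that assumes an ideal CPI of 1, that every branch pays the full penalty, and a hypothetical 20% branch frequency. The 3-cycle penalty is the figure given above; the 1-cycle penalty for a branch resolved in ID is an assumption used for comparison.

```python
def effective_cpi(branch_fraction, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction when each branch adds `penalty_cycles` stalls."""
    return base_cpi + branch_fraction * penalty_cycles


branch_fraction = 0.2  # hypothetical: one instruction in five is a branch

cpi_resolved_in_mem = effective_cpi(branch_fraction, penalty_cycles=3)
cpi_resolved_in_id = effective_cpi(branch_fraction, penalty_cycles=1)  # assumed penalty

print(f"branch resolved in MEM: CPI = {cpi_resolved_in_mem:.2f}")
print(f"branch resolved in ID:  CPI = {cpi_resolved_in_id:.2f}")
```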
Branch Instruction
→ The task of the compiler is to fill the branch delay slots with useful instructions so that they are used effectively.
→ Instructions in branch delay slots are always executed, irrespective of whether the branch is taken or not.