module-3-chapter-2

The document discusses pipelining and superscalar techniques, focusing on linear and non-linear pipeline processors, instruction pipeline design, and arithmetic pipeline design. It covers various models, mechanisms, and design issues related to instruction execution phases, dynamic scheduling, hazard avoidance, and branch handling techniques. Additionally, it highlights fixed-point and floating-point operations, as well as the design of multifunctional arithmetic pipelines.

Pipelining and Superscalar Techniques

• Linear Pipeline Processors


• Non-linear Pipeline Processors
• Instruction Pipeline Design
• Arithmetic Pipeline Design
• Superscalar Pipeline Design

LINEAR PIPELINE PROCESSORS
• Linear Pipeline Processor
o Is a cascade of processing stages which are linearly connected to perform a fixed function over a stream of data flowing from one end to the other.
• Models of Linear Pipeline
o Synchronous Model
o Asynchronous Model

• Clocking and Timing Control


o Clock Cycle
o Pipeline Frequency
o Clock skewing
o Flow-through delay
o Speedup, Efficiency and Throughput
• Optimal number of Stages and Performance-Cost Ratio (PCR)
• Clock cycle: τ = max{τi} + d = τm + d   (τm: delay of the slowest stage, d: latch delay)
• Pipeline frequency: f = 1/τ
• Total time required for n tasks on k stages: Tk = [k + (n-1)]τ
• Speedup factor: Sk = T1/Tk = nkτ / [k + (n-1)]τ = nk / [k + (n-1)]
• Performance/cost ratio: PCR = f / (c + kh) = 1 / [(t/k + d)(c + kh)]   (t: total flow-through delay, c: cost of stage logic, h: cost of each latch)
• Efficiency: Ek = Sk/k = n / [k + (n-1)]
• Throughput: Hk = n / {[k + (n-1)]τ}
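A minimal Python sketch (not part of the slides) that evaluates these formulas; the stage delays, latch delay d, number of tasks n, and the cost figures c and h used below are assumed values for illustration:

```python
# Sketch: evaluating the linear-pipeline performance formulas above.
# All numeric inputs are illustrative assumptions, not values from the slides.

def pipeline_metrics(stage_delays, d, n, c=100.0, h=10.0):
    k = len(stage_delays)                  # number of stages
    tau = max(stage_delays) + d            # clock cycle: slowest stage + latch delay
    f = 1.0 / tau                          # pipeline frequency
    Tk = (k + (n - 1)) * tau               # time for n tasks on the k-stage pipeline
    T1 = n * k * tau                       # time on an equivalent non-pipelined unit
    Sk = T1 / Tk                           # speedup = nk / [k + (n-1)]
    Ek = Sk / k                            # efficiency = n / [k + (n-1)]
    Hk = n / Tk                            # throughput = n / {[k + (n-1)] tau}
    t = sum(stage_delays)                  # total flow-through delay
    PCR = 1.0 / ((t / k + d) * (c + k * h))  # performance/cost ratio
    return dict(tau=tau, f=f, Tk=Tk, Sk=Sk, Ek=Ek, Hk=Hk, PCR=PCR)

print(pipeline_metrics(stage_delays=[10, 12, 9, 11], d=1, n=64))
```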
LINEAR PIPELINE PROCESSORS
NON-LINEAR PIPELINE PROCESSORS

• Dynamic Pipeline
o Static v/s Dynamic Pipeline
o Streamline connection, feed-forward connection and feedback connection

• Reservation and Latency Analysis


o Reservation tables
o Evaluation time

• Latency Analysis
o Latency
o Collision
o Forbidden latencies
o Latency Sequence, Latency Cycle and Average Latency
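As a rough illustration of latency analysis, the Python sketch below derives the forbidden latencies of an assumed 3-stage reservation table (the table itself is made up, not taken from the slides); a latency is forbidden when two marks in the same row lie that many columns apart:

```python
# Sketch: forbidden latencies of a non-linear pipeline from its reservation table.
# reservation[s][t] == 1 means stage s is busy at clock t for one initiation.

reservation = [
    [1, 0, 0, 0, 0, 1],   # stage 1
    [0, 1, 1, 0, 0, 0],   # stage 2
    [0, 0, 0, 1, 1, 0],   # stage 3
]

def forbidden_latencies(table):
    """A latency p is forbidden if two initiations p cycles apart collide,
    i.e. some row of the table has marks p columns apart."""
    forbidden = set()
    for row in table:
        times = [t for t, used in enumerate(row) if used]
        for i in times:
            for j in times:
                if j > i:
                    forbidden.add(j - i)
    return sorted(forbidden)

print("Forbidden latencies:", forbidden_latencies(reservation))
# Any latency sequence avoiding these values is collision-free; repeating such a
# sequence gives a latency cycle, whose mean is the average latency.
```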
INSTRUCTION PIPELINE DESIGN
• Instruction Execution Phases
o E.g. Fetch, Decode, Issue, Execute, Write-back
o In-order Instruction issuing and Reordered Instruction issuing
• E.g. X = Y + Z , A = B x C
• Mechanisms/Design Issues for Instruction Pipelining
o Pre-fetch Buffers
o Multiple Functional Units
o Internal Data Forwarding
o Hazard Avoidance
• Dynamic Scheduling
• Branch Handling Techniques
INSTRUCTION PIPELINE DESIGN

• Fetch: fetches instructions from memory; ideally one per cycle


• Decode: reveals instruction operations to be performed and identifies the resources needed
• Issue: reserves the resources and reads the operands from registers
• Execute: actual processing of the operations indicated by the instruction
• Write Back: writing results into the registers
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Pre-fetch Buffers
o Sequential Buffers
o Target Buffers
o Loop Buffers
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline

• Multiple Functional Units


o Reservation Station and Tags
o Slow station as bottleneck stage (see the sketch after this list)
• Subdivision of Pipeline Bottleneck stage
• Replication of Pipeline Bottleneck stage
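A small Python sketch (with assumed stage delays, not from the slides) showing why the slowest stage limits throughput and how subdividing or replicating that stage restores it:

```python
# Sketch: a bottleneck stage limits pipeline throughput to 1/(slowest delay).
# Subdividing the stage into shorter substages, or replicating it and
# interleaving work across the copies, raises its effective service rate.

def throughput(delays, bottleneck_copies=1, bottleneck_subdiv=1):
    """Results per time unit for a pipeline with the given stage delays."""
    slow = max(delays)
    rates = []
    for d in delays:
        if d == slow:
            # subdivision shortens the stage; replication multiplies its rate
            rates.append(bottleneck_copies / (d / bottleneck_subdiv))
        else:
            rates.append(1.0 / d)
    return min(rates)

print(throughput([1, 1, 3, 1]))                        # limited to 1/3 by stage 3
print(throughput([1, 1, 3, 1], bottleneck_subdiv=3))   # back to 1 result per cycle
print(throughput([1, 1, 3, 1], bottleneck_copies=3))   # back to 1 result per cycle
```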
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Internal Forwarding and Register Tagging
o Internal Forwarding:
• A “short-circuit” technique to replace unnecessary memory accesses by register-register transfers in a sequence of fetch-arithmetic-store operations
o Register Tagging:
• Use of tagged registers, buffers and reservation stations for exploiting concurrent activities among multiple arithmetic units
o Store-Fetch Forwarding
• (M ← R1, R2 ← M) replaced by (M ← R1, R2 ← R1)
o Fetch-Fetch Forwarding
• (R1 ← M, R2 ← M) replaced by (R1 ← M, R2 ← R1)
o Store-Store Overwriting
• (M ← R1, M ← R2) replaced by (M ← R2)
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline

• Hazard Detection and Avoidance


o Domain or Input Set of an instruction
o Range or Output Set of an instruction
o Data Hazards: RAW, WAR and WAW
o Resolution using Register Renaming approach
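A minimal Python sketch (illustrative, with hypothetical register sets) of how the domain D and range R of two instructions expose RAW, WAR and WAW hazards:

```python
# Sketch: hazard detection between instruction i and a later instruction j,
# using the domain (input set) D and range (output set) R of each instruction.

def hazards(Di, Ri, Dj, Rj):
    """Hazard types created if j follows i without interlocks."""
    found = []
    if Ri & Dj:
        found.append("RAW")   # j reads what i writes (flow dependence)
    if Di & Rj:
        found.append("WAR")   # j overwrites what i still reads (anti-dependence)
    if Ri & Rj:
        found.append("WAW")   # both write the same location (output dependence)
    return found

# i: R1 <- R2 + R3     j: R4 <- R1 * R5   -> RAW on R1
print(hazards(Di={"R2", "R3"}, Ri={"R1"}, Dj={"R1", "R5"}, Rj={"R4"}))

# Register renaming removes WAR/WAW hazards by giving j's result a fresh register.
```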
INSTRUCTION PIPELINE DESIGN
Dynamic Instruction Scheduling

• Idea of Static Scheduling


o Compiler-based scheduling strategy to resolve interlocks among instructions

• Dynamic Scheduling
o Tomasulo’s Algorithm (Register-Tagging Scheme)
• Hardware based dependence-resolution
o Scoreboarding Technique
• Scoreboard: the centralized control unit
• A kind of data-driven mechanism
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques
• Branch Taken, Branch Target, Delay Slot
• Effect of Branching
o Parameters:
k : No. of stages in the pipeline
n : Total no. of instructions or tasks
p : Percentage of branch instructions over n
q : Percentage of successful branch instructions (branch taken) over p.
b : Delay Slot
τ : Pipeline Cycle Time

o Branch Penalty = q of (p of n) * bτ = pqnbτ


o Effective Execution Time:
• Teff = [k + (n-1)] τ + pqnbτ = [k + (n-1) + pqnb]τ
• Effect of Branching
o Effective Throughput:
• Heff = n/Teff
• Heff = n / {[k + (n-1) + pqnb]τ} = nf / [k + (n-1) + pqnb]
• As n → ∞ and b = k-1
o H*eff = f / [pq(k-1)+1]
• If p=0 and q=0 (no branching occurs)
o H**eff = f = 1/τ
o Performance Degradation Factor
• D = 1 – H*eff / f = pq(k-1) / [pq(k-1)+1]
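A short Python sketch evaluating the branching formulas above; the parameter values in the example call are assumed for illustration:

```python
# Sketch: effect of branching on pipeline performance, per the formulas above.

def branching_effect(k, n, p, q, b, tau):
    """Effective execution time, effective throughput, asymptotic throughput
    (n -> infinity, b = k-1), and the performance degradation factor D."""
    T_eff = (k + (n - 1) + p * q * n * b) * tau
    H_eff = n / T_eff
    f = 1.0 / tau
    H_star = f / (p * q * (k - 1) + 1)
    D = p * q * (k - 1) / (p * q * (k - 1) + 1)
    return T_eff, H_eff, H_star, D

# Assumed example: 8-stage pipeline, 1000 instructions, 20% branches,
# 60% of them taken, delay slot b = k-1 = 7, cycle time tau = 1.
print(branching_effect(k=8, n=1000, p=0.20, q=0.60, b=7, tau=1.0))
```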
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques



• Branch Prediction
o Static Branch Prediction: based on branch code types
o Dynamic Branch prediction: based on recent branch history
• Strategy 1: Predict the branch direction based on information found at decode stage.
• Strategy 2: Use a cache to store target addresses at effective address calculation stage.
• Strategy 3: Use a cache to store target instructions at fetch stage
o Branch Target Buffer Organization

• Delayed Branches
o A delayed branch of d cycles allows at most d-1 useful instructions to be executed following the branch taken.
o Execution of these instructions should be independent of the branch instruction to achieve a zero branch penalty.
ARITHMETIC PIPELINE DESIGN
• Finite-precision arithmetic
• Overflow and Underflow
• Fixed-Point operations
o Notations:
• Signed-magnitude, one's complement and two's complement notation
o Operations:

• Addition: (n bit, n bit) → (n bit) sum, 1-bit output carry
• Subtraction: (n bit, n bit) → (n bit) difference
• Multiplication: (n bit, n bit) → (2n bit) product
• Division: (2n bit, n bit) → (n bit) quotient, (n bit) remainder
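A small Python sketch (using plain Python ints and an assumed width n = 8) illustrating the operand and result widths listed above:

```python
# Sketch: fixed-point operation widths for n-bit operands (n = 8 assumed).

N = 8
MASK = (1 << N) - 1

def add(a, b):
    s = a + b
    return s & MASK, (s >> N) & 1          # n-bit sum, 1-bit carry out

def multiply(a, b):
    return (a * b) & ((1 << 2 * N) - 1)    # 2n-bit product

def divide(dividend, divisor):
    return dividend // divisor, dividend % divisor   # 2n-bit dividend -> n-bit quotient, n-bit remainder

print(add(200, 100))        # (44, 1): sum wraps to 8 bits, carry out set
print(multiply(200, 100))   # 20000 fits in 16 bits
print(divide(20000, 100))   # (200, 0)
```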
• Floating-Point Numbers
o X = (m, e) representation
• m: mantissa or fraction
• e: exponent with an implied base or radix r.
• Actual value X = m × r^e
o Operations on numbers X = (mx, ex) and Y = (my, ey)
• Addition: (mx × r^(ex−ey) + my) × r^ey   (assuming ex ≥ ey)
• Subtraction: (mx × r^(ex−ey) − my) × r^ey
• Multiplication: (mx × my) × r^(ex+ey)
• Division: (mx / my) × r^(ex−ey)
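A minimal Python sketch of these (m, e) operations, assuming radix r = 2 and ignoring normalization and rounding for brevity:

```python
# Sketch: floating-point (mantissa, exponent) arithmetic with radix R = 2.
# No normalization, rounding, or overflow handling; illustration only.

R = 2

def fadd(x, y):
    (mx, ex), (my, ey) = x, y
    if ex < ey:                      # align so that ex >= ey
        (mx, ex), (my, ey) = (my, ey), (mx, ex)
    return (mx * R ** (ex - ey) + my, ey)

def fsub(x, y):
    my, ey = y
    return fadd(x, (-my, ey))

def fmul(x, y):
    (mx, ex), (my, ey) = x, y
    return (mx * my, ex + ey)

def fdiv(x, y):
    (mx, ex), (my, ey) = x, y
    return (mx / my, ex - ey)

# X = 3 * 2^2 = 12, Y = 5 * 2^0 = 5
print(fadd((3, 2), (5, 0)))   # (17, 0) -> 17
print(fmul((3, 2), (5, 0)))   # (15, 2) -> 60
```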

• Elementary Functions
o Transcendental functions like: Trigonometric, Exponential, Logarithmic, etc.
• Separate units for fixed point operations and floating point operations
• Scalar and Vector Arithmetic Pipelines
• Uni-functional or Static Pipelines
• Arithmetic Pipeline Stages
o Mainly involve hardware to perform add and shift micro-operations
o Addition using: Carry Propagation Adder (CPA) and Carry Save Adder (CSA)
o Shift using: Shift Registers

• Multiplication Pipeline Design
o E.g. multiplying two 8-bit numbers to yield a 16-bit product, using a Wallace tree of CSAs followed by a CPA.
ARITHMETIC PIPELINE DESIGN
Static Arithmetic Pipelines



A × B = P, where P is the 16-bit product.
P = A × B = P0 + P1 + P2 + … + P7, where × and + are arithmetic multiply and add operations and each Pi is a shifted partial product.
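A rough Python model (not the hardware design from the slides) of this summation: the csa function reduces three operands to a sum word and a carry word without carry propagation, the partial products are reduced Wallace-tree style until two words remain, and a final carry-propagate addition yields the 16-bit product. The operand values and the reduction order below are illustrative assumptions.

```python
# Sketch: summing the eight shifted partial products P0..P7 of an 8-bit multiply
# with carry-save adders (CSA) and one final carry-propagate adder (CPA).

def csa(a, b, c):
    """Carry-save adder: three operands -> (sum word, carry word), no propagation."""
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return s, carry

def multiply_8x8(a, b):
    # Partial products: Pi = (bit i of B) * A, shifted left by i.
    partials = [((b >> i) & 1) * (a << i) for i in range(8)]
    # CSA reduction (done sequentially here; a Wallace tree does it in parallel levels)
    # until only two words remain.
    while len(partials) > 2:
        s, c = csa(partials[0], partials[1], partials[2])
        partials = partials[3:] + [s, c]
    # Final carry-propagate addition produces the 16-bit product.
    return (partials[0] + partials[1]) & 0xFFFF

print(multiply_8x8(0xAB, 0xCD), 0xAB * 0xCD)   # both print 35055
```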
ARITHMETIC PIPELINE DESIGN
Multifunctional Arithmetic Pipelines
• Multifunctional Pipeline:
o Static multifunctional pipeline
o Dynamic multifunctional pipeline

• Case Study: TI/ASC static multifunctional pipeline architecture
