Super Scalar 2

Uploaded by

Tharun Chitipolu

Computer Architecture

ELE 475 / COS 475

Slide Deck 5: Superscalar 2 and Exceptions
David Wentzlaff
Department of Electrical Engineering
Princeton University

1
Agenda
• Interrupts
• Out-of-Order Processors

2
Interrupts:
altering the normal flow of control

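[Figure: the program executes Ii-1, Ii, Ii+1; an interrupt at Ii transfers control to the interrupt handler (HI1, HI2, ..., HIn), which then returns to the program.]

An external or internal event that needs to be processed by another (system) program. The event is usually unexpected or rare from the program’s point of view.
3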
Causes of Exceptions
Interrupt: an event that requests the attention of the processor
• Asynchronous: an external event
– input/output device service request
– timer expiration
– power disruptions, hardware failure
• Synchronous: an internal event (a.k.a. exception or trap)
– undefined opcode, privileged instruction
– arithmetic overflow, FPU exception
– misaligned memory access
– virtual memory exceptions: page faults,
TLB misses, protection violations
– software exceptions: system calls, e.g., jumps into kernel
4
Asynchronous Interrupts:
invoking the interrupt handler

• An I/O device requests attention by asserting one of the prioritized interrupt request lines
• When the processor decides to process the interrupt
– It stops the current program at instruction Ii, completing all the instructions up to Ii-1 (a precise interrupt)
– It saves the PC of instruction Ii in a special register (EPC)
– It disables interrupts and transfers control to a designated interrupt handler running in kernel mode

5
Interrupt Handler
• Saves EPC before re-enabling interrupts to allow nested
interrupts 
– need an instruction to move EPC into GPRs
– need a way to mask further interrupts at least until EPC can be saved
• Needs to read a status register that indicates the cause
of the interrupt
• Uses a special indirect jump instruction, RFE (return-from-exception), to resume user code; this:
– enables interrupts
– restores the processor to the user mode
– restores hardware status and control state
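
The entry and return steps on the last two slides can be sketched in a few lines of Python. This is a minimal illustrative model, not the behaviour of any particular ISA: the `CPU` fields, the `HANDLER_PC` value and the function names are assumptions made for the example, and the "hardware status and control state" restored by RFE is reduced to two flags.

```python
from dataclasses import dataclass

HANDLER_PC = 0x8000_0180  # hypothetical handler address, chosen for the example


@dataclass
class CPU:
    pc: int = 0
    epc: int = 0
    cause: int = 0
    interrupts_enabled: bool = True
    kernel_mode: bool = False


def take_interrupt(cpu: CPU, cause: int) -> bool:
    """Enter the handler; returns False if interrupts are currently masked."""
    if not cpu.interrupts_enabled:
        return False
    cpu.epc = cpu.pc                # save the PC of instruction Ii
    cpu.cause = cause               # status register the handler will read
    cpu.interrupts_enabled = False  # mask further interrupts until EPC is saved
    cpu.kernel_mode = True
    cpu.pc = HANDLER_PC
    return True


def rfe(cpu: CPU, saved_epc: int) -> None:
    """Return from exception: jump back, re-enable interrupts, drop to user mode."""
    cpu.pc = saved_epc
    cpu.interrupts_enabled = True
    cpu.kernel_mode = False
```

A handler that wants nested interrupts would copy `cpu.epc` somewhere safe before re-enabling interrupts, which is why the slide asks for an instruction that moves EPC into a GPR.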

6
Synchronous Interrupts
• A synchronous interrupt (exception) is caused by a
particular instruction

• In general, the instruction cannot be completed and needs to be restarted after the exception has been handled
– requires undoing the effect of one or more partially executed instructions
• In the case of a system call trap, the instruction is considered to have been completed
– syscall is a special jump instruction involving a change to privileged kernel mode
– Handler resumes at instruction after system call
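
The difference between restarting a faulting instruction and resuming after a system call comes down to which PC the handler returns to. A small sketch, assuming EPC holds the PC of the trapping instruction and that instructions are a fixed 4 bytes (both are assumptions for the example; `resume_pc` is a hypothetical helper):

```python
INSTR_BYTES = 4  # assumption: fixed-length 4-byte instructions


def resume_pc(epc: int, is_syscall: bool) -> int:
    """PC at which user code continues once the handler returns."""
    if is_syscall:
        # The syscall is considered complete, so continue after it.
        return epc + INSTR_BYTES
    # Any other synchronous exception: restart the faulting instruction.
    return epc
```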

7
Exception Handling 5-Stage Pipeline
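[Figure: 5-stage pipeline: PC, Inst. Mem, D, Decode, E, +, M, Data Mem, W. Exception sources by stage: PC address exception (fetch), illegal opcode (decode), overflow (execute), data address exceptions (memory). Asynchronous interrupts arrive from outside the pipeline.]

• How to handle multiple simultaneous exceptions in different pipeline stages?
• How and where to handle external asynchronous interrupts?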
8
Exception Handling 5-Stage Pipeline
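[Figure: the same 5-stage pipeline with the commit point at the M stage. Exception flags (Exc D, Exc E, Exc M) and PCs (PC D, PC E, PC M) travel down the pipeline with each instruction and feed the Cause and EPC registers at the commit point. Asynchronous interrupts are injected at the commit point. Control signals: Select Handler PC, Kill F Stage, Kill D Stage, Kill E Stage, Kill Writeback.]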

9
Exception Handling 5-Stage Pipeline
• Hold exception flags in the pipeline until the commit point (M stage)
• Exceptions in earlier pipe stages override later exceptions for a given instruction
• Inject external interrupts at the commit point (override others)
• If exception at commit: update Cause and EPC registers, kill all stages, inject handler PC into fetch stage
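
The priority rules above can be sketched as a small selection function evaluated at the commit point. This is an illustrative model only: the stage names in `STAGES`, the dictionary layout and the returned field names are assumptions made for the example, not signals taken from the slides.

```python
from typing import Optional

# Pipeline stages up to the commit point, earliest first.
STAGES = ["F", "D", "E", "M"]


def select_cause(flags: dict, external_interrupt: bool = False) -> Optional[str]:
    """Pick the one exception to report for the instruction at the commit point.

    flags maps a stage name to the exception raised there, if any.
    """
    if external_interrupt:
        return "external interrupt"  # injected at commit, overrides the others
    for stage in STAGES:             # earlier stages override later ones
        cause = flags.get(stage)
        if cause:
            return cause
    return None


def commit(pc: int, flags: dict, handler_pc: int,
           external_interrupt: bool = False) -> Optional[dict]:
    """Return the actions to take at commit, or None if nothing was raised."""
    cause = select_cause(flags, external_interrupt)
    if cause is None:
        return None
    return {"Cause": cause, "EPC": pc,
            "kill_all_stages": True, "next_fetch_pc": handler_pc}
```

For example, an instruction that raised both a PC address exception in fetch and a data address exception in memory reports only the PC address exception.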

10
Speculating on Exceptions
• Prediction mechanism
– Exceptions are rare, so simply predicting no exceptions is very
accurate!
• Check prediction mechanism
– Exceptions detected at end of instruction execution pipeline, special
hardware for various exception types
• Recovery mechanism
– Only write architectural state at commit point, so can throw away
partially executed instructions after exception
– Launch exception handler after flushing pipeline

• Bypassing allows use of uncommitted instruction results by following instructions
11
Exception Pipeline Diagram
time                    t0  t1  t2  t3  t4  t5  t6  t7  t8
(I1) 096: ADD           IF1 ID1 EX1 MA1 nop                  overflow!
(I2) 100: XOR               IF2 ID2 EX2 nop nop
(I3) 104: SUB                   IF3 ID3 nop nop nop
(I4) 108: ADD                       IF4 nop nop nop nop
(I5) Exc. Handler code                  IF5 ID5 EX5 MA5 WB5

Resource usage:
time    t0  t1  t2  t3  t4  t5  t6  t7  t8
IF      I1  I2  I3  I4  I5
ID          I1  I2  I3  nop I5
EX              I1  I2  nop nop I5
MA                  I1  nop nop nop I5
WB                      nop nop nop nop I5

12
Agenda
• Interrupts
• Out-of-Order Processors

13
Out-Of-Order (OOO) Introduction
Name  Frontend  Issue  Writeback  Commit  Structures
I4    IO        IO     IO         IO      Fixed-length pipelines, Scoreboard
I2O2  IO        IO     OOO        OOO     Scoreboard
I2OI  IO        IO     OOO        IO      Scoreboard, Reorder Buffer, and Store Buffer
IO3   IO        OOO    OOO        OOO     Scoreboard and Issue Queue
IO2I  IO        OOO    OOO        IO      Scoreboard, Issue Queue, Reorder Buffer, and Store Buffer

IO = in-order, OOO = out-of-order

14
OOO Motivating Code Sequence
0 MUL R1, R2, R3
1 ADDIU R11,R10,1
2 MUL R5, R1, R4
3 MUL R7, R5, R6
4 ADDIU R12,R11,1
5 ADDIU R13,R12,1
6 ADDIU R14,R12,2

[Figure: dependence graph over instructions 0 to 6. Chain 0 → 2 → 3 (the MULs); chain 1 → 4, with 4 → 5 and 4 → 6 (the ADDIUs).]

• Two independent sequences of instructions enable flexibility in terms of how instructions are scheduled in total order
• We can schedule statically in software or dynamically in hardware
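
The two independent sequences can be recovered mechanically from the register names. The sketch below computes read-after-write (RAW) dependences for the code sequence on this slide; the tuple encoding of the instructions and the function name `raw_dependences` are choices made for this example.

```python
# (opcode, destination register, source registers) for instructions 0 to 6
program = [
    ("MUL",   "R1",  ["R2", "R3"]),
    ("ADDIU", "R11", ["R10"]),
    ("MUL",   "R5",  ["R1", "R4"]),
    ("MUL",   "R7",  ["R5", "R6"]),
    ("ADDIU", "R12", ["R11"]),
    ("ADDIU", "R13", ["R12"]),
    ("ADDIU", "R14", ["R12"]),
]


def raw_dependences(prog):
    """Map each instruction index to the indices of the instructions it reads from."""
    last_writer = {}  # register name -> index of the most recent writer
    deps = {}
    for i, (_, dest, srcs) in enumerate(prog):
        deps[i] = sorted({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dest] = i
    return deps


deps = raw_dependences(program)
```

No instruction in the MUL chain (0, 2, 3) depends on the ADDIU chain (1, 4, 5, 6) or vice versa, which is what gives a scheduler room to interleave them.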

15
I4: In-Order Front-End, Issue,
Writeback, Commit

F D X M W

16
I4: In-Order Front-End, Issue,
Writeback, Commit

[Figure: F and D feed two parallel two-stage pipes, X0 X1 and M0 M1, which merge at W.]

17
I4: In-Order Front-End, Issue,
Writeback, Commit (4-stage MUL)
[Figure: F and D feed three parallel four-stage pipes, X0 X1 X2 X3, M0 M1 X2 X3 and Y0 Y1 Y2 Y3, which merge at W.]

To avoid increasing CPI, this design needs full bypassing, which can be expensive. To help cycle time, add an Issue stage where the register file is read and the instruction is “issued” to a functional unit.
18
I4: In-Order Front-End, Issue,
Writeback, Commit (4-stage MUL)
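[Figure: F, D and I (Issue) feed three parallel four-stage pipes, X0 X1 X2 X3, M0 M1 X2 X3 and Y0 Y1 Y2 Y3, which merge at W. The scoreboard (SB) and architectural register file (ARF) are drawn alongside the pipeline.]

Structure accesses, in pipeline order:
ARF  R    W
SB   R/W  W
19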
Basic Scoreboard
        P   F   Data Avail.
                4 3 2 1 0
R1
R2
R3
…
R31

P: Pending, write to destination in flight
F: Which functional unit is writing the register
Data Avail.: Where the write data is in the functional unit pipeline

• A one in Data Avail. column ‘i’ means that the result data is in stage ‘i’ of functional unit F
• Can use the F and Data Avail. fields to determine when to bypass and where to bypass from
• A one in column zero means that, in that cycle, the functional unit is in the Writeback stage
• Bits in the Data Avail. field shift right every cycle.
20
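
The shift-register behaviour of the Data Avail. field can be sketched as follows. This is a simplified model under stated assumptions: Data Avail. is stored as an integer whose bit i stands for column i, the P bit is cleared on the tick after the bit passes through column 0, and the class and method names are invented for the example.

```python
class Scoreboard:
    def __init__(self, regs):
        self.pending = {r: False for r in regs}  # P
        self.unit = {r: None for r in regs}      # F
        self.avail = {r: 0 for r in regs}        # Data Avail.; bit i = column i

    def issue(self, dest, unit, column=4):
        """Record a write to dest that is now in the given Data Avail. column."""
        self.pending[dest] = True
        self.unit[dest] = unit
        self.avail[dest] = 1 << column

    def stage_of(self, reg):
        """Column currently holding the result, or None if no write is in flight."""
        a = self.avail[reg]
        return a.bit_length() - 1 if a else None

    def tick(self):
        """Advance one cycle: every Data Avail. field shifts right."""
        for r in self.avail:
            if self.avail[r] & 1:  # column 0: the write reached Writeback
                self.pending[r] = False
                self.unit[r] = None
            self.avail[r] >>= 1
```

A bypass network would consult `unit` and `stage_of` to decide whether a source value can be forwarded this cycle and from which stage of which functional unit.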
Basic Scoreboard
[Same table, field definitions and notes as the previous slide, now with a 1 entered in the R1 row.]
21
Cycle               0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W
1 ADDIU R11,R10,1      F  D  I  X0 X1 X2 X3 W
2 MUL R5, R1, R4          F  D  I  I  I  Y0 Y1 Y2 Y3 W
3 MUL R7, R5, R6             F  D  D  D  I  I  I  I  Y0 Y1 Y2 Y3 W
4 ADDIU R12,R11,1               F  F  F  D  D  D  D  I  X0 X1 X2 X3 W
5 ADDIU R13,R12,1                        F  F  F  F  D  I  X0 X1 X2 X3 W
6 ADDIU R14,R12,2                                    F  D  I  X0 X1 X2 X3 W

Cyc  D  I   4 3 2 1 0   Dest Regs
  1  0
  2  1  0
  3  2  1   1           R1
  4         1 1         R11
  5           1 1
  6  3  2       1 1
  7         1     1 1   R5
  8           1     1
  9             1
 10  4  3         1
 11  5  4   1       1   R7
 12  6  5   1 1         R12
 13     6   1 1 1       R13
 14         1 1 1 1     R14
 15           1 1 1 1
 16             1 1 1
 17               1 1
 18                 1

RED (on the original slide) indicates that, if we look at the F field, we can bypass on this cycle.
22
I2O2: In-order Frontend/Issue, Out-of-
order Writeback/Commit
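[Figure: F, D and I (Issue) feed three pipes of different lengths, X0 (one stage), M0 M1 (two stages) and Y0 Y1 Y2 Y3 (four stages), which share the W stage. The scoreboard (SB) and architectural register file (ARF) are drawn alongside the pipeline.]

Structure accesses, in pipeline order:
ARF  R  W
SB   R  R/W  W
23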
I2O2 Scoreboard
• Similar to I4, but we can now use it to track structural hazards on the Writeback port
• Set bit in Data Avail. according to length of pipeline
• The architecture conservatively avoids WAW hazards by stalling in Decode, so the current scoreboard is sufficient. A more complicated scoreboard is needed for processing WAW hazards
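
Tracking the structural hazard on the single Writeback port amounts to keeping a reservation mask that shifts every cycle. A sketch, assuming the mask's bit i means "the W port is taken i cycles from now" and that latency is counted from the issue cycle to the W cycle (5 for MUL and 2 for ADDIU in the pipeline diagrams of this deck); the function names are hypothetical.

```python
def can_issue(wb_busy: int, latency: int) -> bool:
    """True if the W port is free `latency` cycles from now."""
    return ((wb_busy >> latency) & 1) == 0


def reserve(wb_busy: int, latency: int) -> int:
    """Claim the W port `latency` cycles from now."""
    return wb_busy | (1 << latency)


def tick(wb_busy: int) -> int:
    """Advance one cycle: every reservation moves one step closer."""
    return wb_busy >> 1


# Replay of issue cycles 10 to 14 of the I2O2 example.
busy = reserve(0, 5)              # cycle 10: MUL R7 issues, W in cycle 15
busy = tick(busy)                 # cycle 11
ok_11 = can_issue(busy, 2)
busy = reserve(busy, 2)           # ADDIU R12 issues, W in cycle 13
busy = tick(busy)                 # cycle 12
ok_12 = can_issue(busy, 2)
busy = reserve(busy, 2)           # ADDIU R13 issues, W in cycle 14
busy = tick(busy)                 # cycle 13
ok_13 = can_issue(busy, 2)        # ADDIU R14 would collide with MUL R7 at W
busy = tick(busy)                 # cycle 14
ok_14 = can_issue(busy, 2)
```

In this replay the last ADDIU cannot issue in cycle 13 and goes one cycle later, which matches the extra I cycle for instruction 6 in the I2O2 pipeline diagram.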

24
Cycle               0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W
1 ADDIU R11,R10,1      F  D  I  X0 W
2 MUL R5, R1, R4          F  D  I  I  I  Y0 Y1 Y2 Y3 W
3 MUL R7, R5, R6             F  D  D  D  I  I  I  I  Y0 Y1 Y2 Y3 W
4 ADDIU R12,R11,1               F  F  F  D  D  D  D  I  X0 W
5 ADDIU R13,R12,1                        F  F  F  F  D  I  X0 W
6 ADDIU R14,R12,2                                    F  D  I  I  X0 W

Cyc  D  I   4 3 2 1 0   Dest Regs
  1  0
  2  1  0
  3  2  1   1           R1
  4           1   1     R11
  5             1   1
  6  3  2         1
  7         1       1   R5
  8           1
  9             1
 10  4  3         1
 11  5  4   1       1   R7
 12  6  5     1   1     R12
 13             1 1 1   R13
 14     6         1 1
 15               1 1   R14
 16                 1
 17
 18

RED (on the original slide) indicates that, if we look at the F field, we can bypass on this cycle.
ADDIU writes with two-cycle latency. Structural hazard: instruction 6 waits one extra cycle in I so that its W does not collide with the W of instruction 3 in cycle 15.
25
Early Commit Point?
Cycle               0  1  2  3  4  5  6  7
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 /
1 ADDIU R11,R10,1      F  D  I  X0 W     /
2 MUL R5, R1, R4          F  D  I  I  I  /
3 MUL R7, R5, R6             F  D  D  D  /
4 ADDIU R12,R11,1               F  F  F  /
5 ADDIU R13,R12,1                        /
6 ADDIU R14,R12,2

• Limits certain types of exceptions.

26
I2OI: In-order Frontend/Issue, Out-of-
order Writeback, In-order Commit
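[Figure: F, D and I (Issue) feed the X0, L0 L1, S0 and Y0 Y1 Y2 Y3 pipes, followed by W and then C (Commit). The SB, PRF, ROB, FSB and ARF are drawn alongside the pipeline.]

Structure accesses, in pipeline order:
ARF  W
SB   R/W  W
PRF  R  W
ROB  R/W  W  R/W
FSB  W  R/W

PRF = Physical Register File (Future File), ROB = Reorder Buffer, FSB = Finished Store Buffer (1 entry)
27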
Reorder Buffer (ROB)
State  S  ST  V  Preg
--
P      1
F      1
P      1
P
F
P
P
--
--

State: {Free, Pending, Finished}
S: Speculative
ST: Store bit
V: Physical Register File Specifier Valid
Preg: Physical Register File Specifier
28
Reorder Buffer (ROB)
State  S  ST  V  Preg
--                      Next instruction allocates here in D
P      1                Tail of ROB
F      1                Speculative because branch is in flight
P      1
P
F                       Instruction wrote ROB out of order
P
P                       Head of ROB
--
--

Commit stage is waiting for Head of ROB to be finished.

State: {Free, Pending, Finished}
S: Speculative
ST: Store bit
V: Physical Register File Specifier Valid
Preg: Physical Register File Specifier
29
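
The head/tail discipline of the ROB can be sketched with a bounded queue. This is a minimal model for illustration: a `deque` stands in for the circular buffer, only the State and S fields are kept, and the class and method names are assumptions made for the example.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self, size):
        self.size = size
        self.entries = deque()  # leftmost = head (oldest), rightmost = tail

    def allocate(self, dest, speculative=False):
        """Allocate at the tail in Decode; returns None if the ROB is full (stall)."""
        if len(self.entries) == self.size:
            return None
        entry = {"dest": dest, "state": "P", "S": speculative}
        self.entries.append(entry)
        return entry

    def finish(self, entry):
        """Writeback marks the entry Finished; this may happen out of order."""
        entry["state"] = "F"

    def commit(self):
        """Commit only from the head, and only once the head is Finished."""
        if self.entries and self.entries[0]["state"] == "F":
            return self.entries.popleft()["dest"]
        return None
```

Usage: if R11 finishes before the older R1, `commit()` keeps returning `None` until R1 is finished, after which R1 and then R11 leave the buffer in program order.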
Finished Store Buffer (FSB)
V Op Addr Data
--

• Only need one entry if we only support one memory instruction in flight at a time.
• Single-entry FSB makes allocation trivial.
• If we support more than one memory instruction, we need to worry about Load/Store address aliasing.

30
Cycle               0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W  C
1 ADDIU R11,R10,1      F  D  I  X0 W  r        C
2 MUL R5, R1, R4          F  D  I  I  I  Y0 Y1 Y2 Y3 W  C
3 MUL R7, R5, R6             F  D  D  D  I  I  I  I  Y0 Y1 Y2 Y3 W  C
4 ADDIU R12,R11,1               F  F  F  D  D  D  D  I  X0 W  r        C
5 ADDIU R13,R12,1                        F  F  F  F  D  I  X0 W  r        C
6 ADDIU R14,R12,2                                    F  D  I  I  X0 W  r     C

Cyc  D  I   ROB 0   ROB 1   ROB 2   ROB 3
  0
  1  0
  2  1  0   R1
  3  2  1           R11
  4                         R5
  5
  6  3  2           (R11)
  7                                 R7
  8         (R1)
  9
 10  4  3
 11  5  4   R12
 12  6  5           R13     (R5)
 13                         R14
 14     6   (R12)
 15                 (R13)
 16                                 (R7)
 17                         (R14)
 18
 19

Legend:
• The table shows the state of the ROB at the beginning of each cycle; empty = free entry in ROB
• Register name = pending entry in ROB
• (Register name) = finished (cycle after W); drawn as a circle on the original slide
• The cycle in the C stage is the last cycle before the entry is freed from the ROB; the entry becomes free and is freed on the next cycle
31
What if First Instruction Causes an
Exception?
Cycle               0  1  2  3  4  5  6  7  8  9  10 11
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W  /
1 ADDIU R11,R10,1      F  D  I  X0 W  r  -- /
2 MUL R5, R1, R4          F  D  I  I  I  Y0 /
3 MUL R7, R5, R6             F  D  D  D  I  /
4 ADDIU R12,R11,1               F  F  F  D  /
                                               F  D  I  . . .

32
What About Branches?
Option 1: Squash instructions earlier. Has more complexity. ROB needs many ports.
Cycle               0  1  2  3  4  5  6
0 BEQZ R1, target   F  D  I  X0 W  C
1 ADDIU R11,R10,1      F  D  I  -
2 ADDIU R5, R1, R4        F  D  -
3 ADDIU R7, R5, R6           F  -
T ADDIU R12,R11,1               F  D  I  . . .

Option 2: Squash instructions in ROB when Branch commits.
Cycle               0  1  2  3  4  5  6
0 BEQZ R1, target   F  D  I  X0 W  C
1 ADDIU R11,R10,1      F  D  I  X0 /
2 ADDIU R5, R1, R4        F  D  I  /
3 ADDIU R7, R5, R6           F  D  /
T ADDIU R12,R11,1               F  D  I  . . .

Option 3: Wait for speculative instructions to reach the Commit stage and squash in Commit stage.
Cycle               0  1  2  3  4  5  6  7  8  9
0 BEQZ R1, target   F  D  I  X0 W  C
1 ADDIU R11,R10,1      F  D  I  X0 W  /
2 ADDIU R5, R1, R4        F  D  I  X0 W  /
3 ADDIU R7, R5, R6           F  D  I  X0 W  /
T ADDIU R12,R11,1               F  D  I  X0 W  C
33
What About Branches?
• Three possible designs with decreasing
complexity based on when to squash speculative
instructions and de-allocate ROB entry:
1. As soon as branch resolves
2. When branch commits
3. When speculative instructions reach commit

• Base design only allows one branch at a time. Second branch stalls in decode. Can add more bits to track multiple in-flight branches.

34
Avoiding Stalling Commit on Store
Miss
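[Figure: back end of the pipeline, W, C, R, with the PRF, ROB and FSB ahead of C and the ARF and CSB after it.]
CSB = Committed Store Buffer

Without Retire stage (commit stalls on the store miss):
Cycle   0  1  2  3  4  5  6  7  8  9  10 11 12
0 OpA   F  D  I  X0 W  C
1 SW       F  D  I  S0 W  C  C  C  C
2 OpB         F  D  I  X0 W  W  W  W  C
3 OpC            F  D  I  X  X  X  X  W  C
4 OpD               F  D  I  I  I  I  X  W  C

With Retire Stage:
Cycle   0  1  2  3  4  5  6  7  8  9
0 OpA   F  D  I  X0 W  C
1 SW       F  D  I  S0 W  C  R  R  R
2 OpB         F  D  I  X0 W  C
3 OpC            F  D  I  X  W  C
4 OpD               F  D  I  X  W  C
35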
IO3: In-order Frontend, Out-of-order
Issue/Writeback/Commit
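[Figure: F and D feed the issue queue (IQ) and the Issue stage I, which feed the X0, M0 M1 and Y0 Y1 Y2 Y3 pipes sharing the W stage. The SB and ARF are drawn alongside the pipeline.]

Structure accesses, in pipeline order:
ARF  R  W
SB   R  R/W  W
IQ   W  R/W  W
36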
Issue Queue (IQ)
Op  Imm  S  V Dest  V P Src0  V P Src1

Op: Opcode
Imm.: Immediate
S: Speculative Bit
V: Valid (Instruction has corresponding Src/Dest)
P: Pending (Waiting on operands to be produced)

Instruction Ready = (!Vsrc0 || !Psrc0) && (!Vsrc1 || !Psrc1) && no structural hazards

• For high performance, factor in bypassing
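
The ready condition above translates directly into code. In this sketch the `IQEntry` field names mirror the V and P bits of the two source operands, and the structural-hazard check is passed in as a plain boolean; both are simplifications made for the example, and bypassing is not modelled.

```python
from dataclasses import dataclass


@dataclass
class IQEntry:
    v_src0: bool  # instruction has a Src0 operand
    p_src0: bool  # Src0 is still pending
    v_src1: bool  # instruction has a Src1 operand
    p_src1: bool  # Src1 is still pending


def instruction_ready(e: IQEntry, structural_hazard: bool = False) -> bool:
    """(!Vsrc0 || !Psrc0) && (!Vsrc1 || !Psrc1) && no structural hazards."""
    src0_ok = (not e.v_src0) or (not e.p_src0)
    src1_ok = (not e.v_src1) or (not e.p_src1)
    return src0_ok and src1_ok and not structural_hazard


# ADDIU R12,R11,1 has one register source; it is ready once R11 is produced.
waiting = IQEntry(v_src0=True, p_src0=True, v_src1=False, p_src1=False)
woken = IQEntry(v_src0=True, p_src0=False, v_src1=False, p_src1=False)
```

An operand that is not valid (the instruction has no such source) never blocks issue, whatever its P bit says.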

37
Centralized vs. Distributed Issue Queue
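[Figure: Centralized (left): F and D feed a single issue queue (IQ) and Issue stage I, which feed the X0, M0 and Y0 pipes. Distributed (right): F and D feed separate issue queues (IQ A, IQ B, ...), each with its own Issue stage I in front of its functional-unit pipe (X0, M0, Y0).]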

38
Advanced Scoreboard
        P   Data Avail.
            4 3 2 1 0
R1
R2
R3
…
R31

P: Pending, write to destination in flight
Data Avail.: Where the write data is in the pipeline and which functional unit

• Data Avail. now contains the functional unit identifier
• A non-empty value in column zero means that, in that cycle, that functional unit is in the Writeback stage
• Bits in the Data Avail. field shift right every cycle.

39
Cycle               0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W
1 ADDIU R11,R10,1      F  D  I  X0 W
2 MUL R5, R1, R4          F  D  i     I  Y0 Y1 Y2 Y3 W
3 MUL R7, R5, R6             F  D  i              I  Y0 Y1 Y2 Y3 W
4 ADDIU R12,R11,1               F  D  i  I  X0 W
5 ADDIU R13,R12,1                  F  D  i  I  X0 W
6 ADDIU R14,R12,2                     F  D  i        I  X0 W

Cyc  D  I   IQ entries 0 to 2, shown as Dest/Src0/Src1
  0
  1  0
  2  1  0   R1/R2/R3
  3  2  1   R11/R10
  4  3      R5/R1/R4
  5  4      R7/R5/R6
  6  5  2   R12/R11
  7  6  4   R13/R12
  8     5   R14/R12
  9
 10     3
 11     6   R14/R12
 12
 13
 14

Notes from the slide:
• Circle denotes value present in ARF
• Value bypassed so no circle, present bit
• Value set present by Instruction 1 in cycle 5, W Stage
40
Assume All Instructions in Issue Queue
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 MUL R1, R2, R3 F D i I Y0 Y1 Y2 Y3 W
1 ADDIU R11,R10,1 F D i I X0 W
2 MUL R5, R1, R4 F D i I Y0 Y1 Y2 Y3 W
3 MUL R7, R5, R6 F D i I Y0 Y1 Y2 Y3 W
4 ADDIU R12,R11,1 F D i I X0 W
5 ADDIU R13,R12,1 F D i I X0 W
6 ADDIU R14,R12,2 F D i I X0 W

• Better performance than previous?

41
IO2I: In-order Frontend, Out-of-order
Issue/Writeback, In-order Commit
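[Figure: F and D feed the issue queue (IQ) and the Issue stage I, which feed the X0, L0 L1, S0 and Y0 Y1 Y2 Y3 pipes, followed by W and then C (Commit). The SB, PRF, ROB, FSB and ARF are drawn alongside the pipeline.]

Structure accesses, in pipeline order:
ARF  W
SB   R/W  W
PRF  R  W
ROB  R/W  W  R/W
FSB  W  R/W
IQ   W  R/W
42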
Cycle               0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
0 MUL R1, R2, R3    F  D  I  Y0 Y1 Y2 Y3 W  C
1 ADDIU R11,R10,1      F  D  I  X0 W  r        C
2 MUL R5, R1, R4          F  D  i     I  Y0 Y1 Y2 Y3 W  C
3 MUL R7, R5, R6             F  D  i              I  Y0 Y1 Y2 Y3 W  C
4 ADDIU R12,R11,1               F  D  i  I  X0 W  r                    C
5 ADDIU R13,R12,1                  F  D  i  I  X0 W  r                    C
6 ADDIU R14,R12,2                     F  D  i        I  X0 W  r              C

43
Out-of-order 2-Wide Superscalar
with 1 ALU
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
0 MUL R1, R2, R3 F D I Y0 Y1 Y2 Y3 W C
1 ADDIU R11,R10,1 F D I X0 W r C
2 MUL R5, R1, R4 F D i I Y0 Y1 Y2 Y3 W C
3 MUL R7, R5, R6 F D i I Y0 Y1 Y2 Y3 W C
4 ADDIU R12,R11,1 F D I X0 W r C
5 ADDIU R13,R12,1 F D i I X0 W r C
6 ADDIU R14,R12,2 F D i I X0 W r C

44
Acknowledgements
• These slides contain material developed and copyrighted by:
– Arvind (MIT)
– Krste Asanovic (MIT/UCB)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)
– Christopher Batten (Cornell)

• MIT material derived from course 6.823
• UCB material derived from course CS252 & CS152
• Cornell material derived from course ECE 4750
