Slot15 CH14 ProcessorStructureAndFunction 42 Slots


Chapter 14: Processor Structure and Function
William Stallings, Computer Organization and Architecture, 9th Edition

Objectives

CLO8: Explain in detail processor structure and function, and the
operations of Reduced Instruction Set Computers.

After studying this chapter, you should be able to:

 Distinguish between user-visible and control/status registers, and
discuss the purposes of registers in each category.
 Summarize the instruction cycle.
 Discuss the principle behind instruction pipelining and how it works in
practice.
 Compare and contrast the various forms of pipeline hazards.

Contents
 14.1 Processor Organization
 14.2 Register Organization
 14.3 Instruction Cycle
 14.4 Instruction Pipelining

10 Exercises
 14.1 What general roles are performed by processor registers?
 14.2 What categories of data are commonly supported by user-visible
registers?
 14.3 What is the function of condition codes?
 14.4 What is a program status word?
 14.5 Why is a two-stage instruction pipeline unlikely to cut the instruction
cycle time in half, compared with the use of no pipeline?
 14.6 List and briefly explain various ways in which an instruction pipeline
can deal with conditional branch instructions (Refer to “Control Hazard”).
 14.7 How are history bits used for branch prediction? (refer to “Branch
Prediction State Diagram”)
10 Exercises (continued)

What would be the value of the following flags: Carry, Zero, Overflow,
Sign, Even Parity, Half-Carry?
 14.8 - If the last operation performed on a computer with an 8-bit
word was an addition in which the two operands were 00000010 and
00000011.
 14.9 - Repeat for the addition of -1 (twos complement) and +1.
 14.10 - Repeat for the subtraction A - B, where A contains 11110000
and B contains 0010100.
Carry flag: did the operation leave a carry out?
Zero flag: is the result of the operation zero?
Overflow flag: did the result overflow (is the destination not wide enough
to hold the result)?
Sign flag: is the result negative?
Even parity flag: does the result contain an even number of 1 bits?
Half-carry flag: the value of the carry after processing half of the bits to be
computed. Example: if the memory unit is 1 byte, half of it is 4 bits.
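As a worked aid for exercises 14.8 and 14.9, the flag definitions above can be sketched in Python. This is a sketch assuming the common x86-style flag semantics; the function name and dictionary keys are ours, not from the text.

```python
def add8_flags(a, b):
    """Add two unsigned 8-bit values and derive the common status flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "carry": total > 0xFF,                       # carry out of bit 7
        "zero": result == 0,                         # result is all zeros
        # signed overflow: both operands share a sign, result's sign differs
        "overflow": ((a ^ result) & (b ^ result) & 0x80) != 0,
        "sign": (result & 0x80) != 0,                # bit 7 = two's-complement sign
        "even_parity": bin(result).count("1") % 2 == 0,  # even count of 1 bits
        "half_carry": ((a & 0x0F) + (b & 0x0F)) > 0x0F,  # carry out of low nibble
    }
    return result, flags

# Exercise 14.8: 00000010 + 00000011
print(add8_flags(0b00000010, 0b00000011))
# Exercise 14.9: -1 (two's complement, 11111111) + 1
print(add8_flags(0b11111111, 0b00000001))
```

For 14.9, note that carry and half-carry are set but overflow is not: -1 + 1 = 0 fits comfortably in a signed byte.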
14.1- Processor Organization

Processor Requirements:

 Fetch instruction (from memory (register, cache, main memory))
 Interpret instruction (what action is required)
 Fetch data (data from memory or an I/O module)
 Process data (performing some operations on data)
 Write data (writing result to memory or an I/O module)

 In order to do these things the processor needs to store some data
temporarily and therefore needs a small internal memory
Figures: The CPU with the system bus; CPU internal structure
14.2- Register Organization

 Within the processor there is a set of registers that function as a level
of memory above main memory and cache in the hierarchy.
 The registers in the processor perform two roles:

User-Visible Registers
 Enable the machine or assembly language programmer to minimize
main memory references by optimizing use of registers.

Control and Status Registers
 Used by the control unit to control the operation of the processor and
by privileged operating system programs to control the execution of
programs.
User-Visible Registers

Referenced by means of the machine language that the processor
executes. Categories:
• General purpose
• Can be assigned to a variety of functions by the programmer
• Data
• May be used only to hold data and cannot be employed in the
calculation of an operand address
• Address
• May be somewhat general purpose or may be devoted to a
particular addressing mode
• Examples: segment pointers, index registers, stack pointer
• Condition codes
• Also referred to as flags
• Bits set by the processor hardware as the result of operations

Table 14.1: Condition Codes


Control and Status Registers

Four registers are essential to instruction execution:

 Program counter (PC)
 Contains the address of an instruction to be fetched

 Instruction register (IR)
 Contains the instruction most recently fetched

 Memory address register (MAR)
 Contains the address of a location in memory

 Memory buffer register (MBR)
 Contains a word of data to be written to memory or the word most
recently read
Program Status Word (PSW)

Register or set of registers that contain status information.

Common fields or flags include:
• Sign
• Zero
• Carry
• Equal
• Overflow
• Interrupt Enable/Disable
• Supervisor

Status information is used to make branching decisions.
Example Microprocessor Register Organizations
14.3- Instruction Cycle

The instruction cycle includes the following stages:

 Fetch: read the next instruction from memory into the processor.
 Execute: interpret the opcode and perform the indicated operation.
 Interrupt: if interrupts are enabled and an interrupt has occurred, save
the current process state and service the interrupt.

Figure: Instruction Cycle (the loop is due to additional memory accesses)

Instruction Cycle State Diagram

Fetch cycle, Indirect cycle, Interrupt cycle

Data Flow, Fetch Cycle

 Fetch cycle for the next instruction (the instruction address is in the PC).
 MAR: Memory Address Register; MBR: Memory Buffer Register.
 The CU examines the contents of the IR to determine whether it contains
an operand specified by indirect addressing; if so, the indirect cycle is
used (the data address is in the MBR).
Data Flow, Interrupt Cycle

(1) Store the PC (the return point after executing the interrupt routine).
(2) Store the current state (register values before running the interrupt routine).
(3) A fetch cycle is used to load the interrupt routine.
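The fetch and execute data flows above can be illustrated with a toy fetch-decode-execute loop. The three-instruction ISA below is invented purely for illustration; only the register roles (PC, IR, MAR, MBR) mirror the text.

```python
# Toy machine: memory holds (opcode, operand) pairs; local variables mirror
# the CPU's PC, MAR, MBR, and IR registers described above.
def run(memory):
    pc, acc = 0, 0
    while True:
        mar = pc                  # fetch: MAR gets the address of the next instruction
        mbr = memory[mar]         # MBR gets the word read from memory
        ir = mbr                  # IR holds the instruction most recently fetched
        pc += 1
        opcode, operand = ir      # decode: extract opcode and operand specifier
        if opcode == "LOAD":      # execute: perform the indicated operation
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc

program = [("LOAD", 5), ("ADD", 7), ("HALT", 0)]
print(run(program))
```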
14.4- Instruction Pipelining

Pipelining Strategy

 A way to improve performance is to perform jobs in a parallel manner.
 Similar to the use of an assembly line in a manufacturing plant: an
assembly line in which some operations are performed concurrently.
 To apply this concept to instruction execution we must recognize that
an instruction has a number of stages.
 New inputs are accepted at one end before previously accepted inputs
appear as outputs at the other end.

Two-Stage Instruction Pipeline
Additional Stages

 Fetch instruction (FI)
 Read the next expected instruction into a buffer
 Decode instruction (DI)
 Determine the opcode and the operand specifiers
 Calculate operands (CO)
 Calculate the effective address of each source operand
 This may involve displacement, register indirect, indirect, or other
forms of address calculation
 Fetch operands (FO)
 Fetch each operand from memory
 Operands in registers need not be fetched
 Execute instruction (EI)
 Perform the indicated operation and store the result, if any, in the
specified destination operand location
 Write operand (WO)
 Store the result in memory
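Under the usual idealized assumption of one stage per clock cycle and no stalls, a k-stage pipeline completes n instructions in k + (n - 1) cycles, since a new instruction finishes every cycle once the pipeline is full. A small sketch:

```python
def pipeline_cycles(n_instructions, n_stages):
    """Clock cycles to finish n instructions on a k-stage pipeline, no stalls."""
    # k cycles for the first instruction, then one more per remaining instruction
    return n_stages + (n_instructions - 1)

# Six-stage pipeline (FI, DI, CO, FO, EI, WO), nine instructions:
print(pipeline_cycles(9, 6))   # far fewer than the 9 * 6 = 54 unpipelined cycles
```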
Timing Diagram for Instruction Pipeline Operation

Legend: I: Instruction, O: Operand, F: Fetch, C: Calculate, E: Execute,
W: Write
The Effect of a Conditional Branch on Instruction Pipeline Operation

 Suppose that instruction 3 is a branch to instruction 15.
 At time 7, instruction 3 executes and instruction 15 is loaded.
 The work already done on the instructions fetched after the branch is
wasted.
Six-Stage Instruction Pipeline

 Figure 14.12 indicates the logic needed for pipelining to account for
branches and interrupts.
Alternative Pipeline Depiction

 I3 is a conditional branch to I15.
Speedup Factors with Instruction Pipelining

 n: the number of instructions that are executed without a branch.
 The larger the number of pipeline stages, the greater the potential for
speedup, but at a higher cost.
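The trade-off above follows from the standard pipeline speedup formula: for n branch-free instructions on a k-stage pipeline, S = nk / (k + n - 1), which approaches k as n grows. A quick sketch:

```python
def speedup(n, k):
    """Speedup of a k-stage pipeline over no pipeline, n branch-free instructions."""
    # unpipelined time n*k divided by pipelined time k + (n - 1)
    return (n * k) / (k + n - 1)

for k in (2, 6, 12):
    # with enough instructions, speedup approaches the stage count k
    print(k, round(speedup(100, k), 2))
```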
Pipeline Hazards

 Occur when the pipeline, or some portion of the pipeline, must stall
because conditions do not permit continued execution.
 Also referred to as a pipeline bubble.
 There are three types of hazards:
• Resource conflict
• Data dependency
• Control instructions
Resource Hazards

 A resource hazard occurs when two or more instructions that are
already in the pipeline need the same resource.
 The result is that the instructions must be executed in serial rather
than parallel for a portion of the pipeline.
 A resource hazard is sometimes referred to as a structural hazard.
 In the example figure, FO is accessing memory, so the fetch step is idle.
Data Hazards

 A data hazard occurs when there is a conflict in the access of an
operand location.
 x86 example (a RAW hazard): while one instruction is executing and
writing to the register EAX, a following instruction cannot yet read it.
Types of Data Hazard

 Read after write (RAW), or true dependency
 An instruction modifies a register or memory location
 Succeeding instruction reads data in memory or register location
 Hazard occurs if the read takes place before write operation is complete

 Write after read (WAR), or antidependency
 An instruction reads a register or memory location
 Succeeding instruction writes to the location
 Hazard occurs if the write operation completes before the read operation
takes place

 Write after write (WAW), or output dependency
 Two instructions both write to the same location
 Hazard occurs if the write operations take place in the reverse order of
the intended sequence
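The three dependency types can be checked mechanically from each instruction's read and write sets. A minimal sketch (the register names and helper function are illustrative, not from the text):

```python
def classify_hazards(writes1, reads1, writes2, reads2):
    """Classify potential data hazards between instruction 1 and a later
    instruction 2, given the sets of locations each reads and writes."""
    hazards = []
    if writes1 & reads2:
        hazards.append("RAW")   # true dependency: 2 reads what 1 writes
    if reads1 & writes2:
        hazards.append("WAR")   # antidependency: 2 overwrites what 1 reads
    if writes1 & writes2:
        hazards.append("WAW")   # output dependency: both write the same place
    return hazards

# x86-style example: ADD EAX, EBX ; SUB ECX, EAX  ->  RAW on EAX
print(classify_hazards({"EAX"}, {"EAX", "EBX"}, {"ECX"}, {"ECX", "EAX"}))
```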
Control Hazard

 Also known as a branch hazard
 Occurs when the pipeline makes the wrong decision on a branch
prediction
 Brings instructions into the pipeline that must subsequently be
discarded
 Dealing with branches:
 Multiple streams
 Prefetch branch target
 Loop buffer
 Branch prediction
 Delayed branch
Multiple Streams

 A simple pipeline suffers a penalty for a branch instruction because it
must choose one of two instructions to fetch next and may make the
wrong choice.
 A brute-force (exhaustive) approach is to replicate the initial portions
of the pipeline and allow the pipeline to fetch both instructions, making
use of two streams.
 Drawbacks:
• With multiple pipelines there are contention delays for access to the
registers and to memory
• Additional branch instructions may enter the pipeline before the
original branch decision is resolved
Prefetch Branch Target

 When a conditional branch is recognized, the target of the branch is
prefetched, in addition to the instruction following the branch.
 The target is then saved until the branch instruction is executed.
 If the branch is taken, the target has already been prefetched.
 The IBM 360/91 uses this approach.
Loop Buffer

 Small, very-high-speed memory maintained by the instruction fetch
stage of the pipeline and containing the n most recently fetched
instructions, in sequence.
 Similar in principle to a cache dedicated to instructions. Differences:
• The loop buffer only retains instructions in sequence
• Is much smaller in size and hence lower in cost
 Benefits:
 Instructions fetched in sequence will be available without the usual
memory access time
 If a branch occurs to a target just a few locations ahead of the address
of the branch instruction, the target will already be in the buffer
 This strategy is particularly well suited to dealing with loops
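A minimal sketch of the loop-buffer idea (buffer size, class name, and addresses are all invented for illustration): a short backward branch hits because its target is among the most recently fetched addresses.

```python
from collections import deque

class LoopBuffer:
    """Keeps the addresses of the n most recently fetched instructions."""
    def __init__(self, size=8):
        self.addresses = deque(maxlen=size)   # oldest entry drops off automatically

    def fetch(self, addr):
        self.addresses.append(addr)

    def hit(self, target):
        # a branch to a recently fetched address avoids a memory access
        return target in self.addresses

buf = LoopBuffer(size=4)
for addr in (100, 101, 102, 103):
    buf.fetch(addr)
print(buf.hit(101), buf.hit(99))   # backward branch to 101 hits; 99 misses
```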
Branch Prediction

 Various techniques can be used to predict whether a branch will be
taken:
1. Predict never taken
2. Predict always taken
3. Predict by opcode
4. Taken/not taken switch
5. Branch history table

 Approaches 1-3 are static: they do not depend on the execution history
up to the time of the conditional branch instruction.
 Approaches 4-5 are dynamic: they depend on the execution history;
the states of some recent branches (a few bits) must be stored.
Branch Prediction Flow Chart

 If only one history bit is stored, a loop may cause two mispredictions:
once on entering and once on exiting.
 If two bits are stored, the prediction is changed only after two
successive wrong guesses (Figure 14.18).
Branch Prediction State Diagram

 The decision process can be represented more compactly by a
finite-state machine.
 A finite-state machine is a way to express a processing mechanism in
which each piece of input determines a step of the process.
 History bits are stored: 0 = not taken, 1 = taken. A history might look
like 01110.
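The two history bits can be kept as a saturating counter. This sketch follows a common two-bit scheme, close to (though not necessarily bit-for-bit identical with) the state diagram referenced above; the class name and state encoding are ours.

```python
# States 0-1 predict "not taken", 2-3 predict "taken". Starting from a
# strong state, the prediction flips only after two successive mispredictions.
class TwoBitPredictor:
    def __init__(self, state=3):
        self.state = state            # 3 = "strongly taken"

    def predict(self):
        return self.state >= 2        # True = predict taken

    def update(self, taken):
        # saturate at the ends so one stray outcome cannot flip a strong state
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
for taken in (True, True, False, True):   # e.g. a loop branch with one exit
    print(p.predict(), taken)
    p.update(taken)
```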
Dealing With Branches: Branch History Table

 Each prefetch triggers a lookup in the branch history table.
 No match: fetch the next sequential address.
 Match: a prediction is made based on the state of the instruction;
either the next sequential address or the branch target address is fed
to the select logic.
Delayed Branch

 It is possible to improve pipeline performance by automatically
rearranging instructions within a program, so that branch instructions
occur later than actually desired. This intriguing approach is examined
in Chapter 15.
Intel 80486 Pipelining

 Fetch
 Objective is to fill the prefetch buffers with new data as soon as the old
data have been consumed by the instruction decoder
 Operates independently of the other stages to keep the prefetch buffers full

 Decode stage 1
 All opcode and addressing-mode information is decoded in the D1 stage
 3 bytes of instruction are passed to the D1 stage from the prefetch buffers
 D1 decoder can then direct the D2 stage to capture the rest of the instruction

 Decode stage 2
 Expands each opcode into control signals for the ALU
 Also controls the computation of the more complex addressing modes

 Execute
 Stage includes ALU operations, cache access, and register update

 Write back
 Updates registers and status flags modified during the preceding execute stage
80486 Instruction Pipeline Examples
Summary: Processor Structure and Function (Chapter 14)

 Processor organization
 Register organization
 User-visible registers
 Control and status registers
 Instruction cycle
 The indirect cycle
 Data flow
 Instruction pipelining
 Pipelining strategy
 Pipeline performance
 Pipeline hazards
 Dealing with branches
 Intel 80486 pipelining