
Unit-III

Instruction-Level Parallelism:
Concepts and Challenges
Introduction
• All processors since about 1985 have used pipelining to
overlap the execution of instructions and improve
performance.
• This potential overlap among instructions is called
instruction-level parallelism (ILP), because the instructions
can be evaluated in parallel.
• There are two largely separable approaches to exploiting ILP:
(1) an approach that relies on hardware to help discover and exploit the
parallelism dynamically, and
(2) an approach that relies on software technology to find parallelism
statically at compile time.
• Processors using the dynamic, hardware-based approach, including all recent Intel and many ARM processors, dominate in the desktop and server markets.
• The value of the CPI (cycles per instruction) for a
pipelined processor is the sum of the base CPI and all
contributions from stalls:
Pipeline CPI = Ideal pipeline CPI + Structural stalls +
Data hazard stalls + Control stalls

• The ideal pipeline CPI is a measure of the maximum performance attainable by the implementation.
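• As a quick worked example (the stall magnitudes here are hypothetical, chosen only for illustration):
Pipeline CPI = 1.0 + 0.10 + 0.25 + 0.15 = 1.50
so structural, data hazard, and control stalls together raise the CPI 50% above the ideal pipeline CPI of 1.0.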
Data Dependences and Hazards
• Determining how one instruction depends on another is critical
to determining how much parallelism exists in a program and
how that parallelism can be exploited.
• In particular, to exploit instruction-level parallelism, we must
determine which instructions can be executed in parallel.
Case (i): If two instructions are parallel, they can execute simultaneously in a pipeline of arbitrary depth without causing any stalls, assuming the pipeline has sufficient resources (and thus no structural hazards exist).
Case (ii): If two instructions are dependent, they are not parallel and must be executed in order, although they may often be partially overlapped.
• The key in both cases is to determine whether an instruction is dependent on another instruction.
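• For illustration, a minimal RISC-V sketch (registers chosen arbitrarily): the first two instructions share no registers, so they are parallel; the third reads the results of both, so it is dependent and must follow them.
add x5,x1,x2 //no registers shared with the sub below
sub x6,x3,x4 //can execute in parallel with the add
mul x7,x5,x6 //reads x5 and x6: dependent on both instructions above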
Data Dependences
• There are three different types of dependences:
1. data dependences (also called true data dependences),
2. name dependences, and
3. control dependences.
• An instruction j is data-dependent on instruction i if either of the following holds:
– Instruction i produces a result that may be used by instruction j.
– Instruction j is data-dependent on instruction k, and instruction k is data-dependent on instruction i (i.e., the dependence is transitive through a chain of dependences).
(i) Data Dependences
• For example, consider the following RISC-V code sequence that increments a vector of values in memory (starting at 0(x1) and with the last element at 8(x2)) by a scalar in register f2.
Loop: fld f0,0(x1) //f0=array element
fadd.d f4,f0,f2 //add scalar in f2
fsd f4,0(x1) //store result
addi x1,x1,-8 //decrement pointer 8 bytes
bne x1,x2,Loop //branch if x1 != x2
The data dependences in this code sequence involve both floating-point data:
Loop: fld f0,0(x1) //f0=array element
fadd.d f4,f0,f2 //add scalar in f2
fsd f4,0(x1) //store result
and integer data:
addi x1,x1,-8 //decrement pointer 8 bytes (per doubleword)
bne x1,x2,Loop //branch if x1 != x2
• A data dependence conveys three things:
(1) the possibility of a hazard,
(2) the order in which results must be calculated, and
(3) an upper bound on how much parallelism can
possibly be exploited
(ii) Name Dependences
• A name dependence occurs when two instructions use the same register or memory location, called a name, but there is no flow of data between the instructions associated with that name.
• There are two types of name dependences between an
instruction i that precedes instruction j in program order:
– An antidependence between instruction i and instruction j
occurs when instruction j writes a register or memory location
that instruction i reads. The original ordering must be preserved
to ensure that i reads the correct value. In the following
example, there is an antidependence between fsd and addi on
register x1.
fsd f4,0(x1) //store result
addi x1,x1,-8 //decrement pointer 8 bytes
– An output dependence occurs when instruction i and instruction
j write the same register or memory location. The ordering
between the instructions must be preserved to ensure that the
value finally written corresponds to instruction j.
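In the following sketch (the instruction pairing is hypothetical, chosen for illustration), there is an output dependence between fdiv.d and fadd.d on register f4:
fdiv.d f4,f0,f2 //instruction i writes f4
fadd.d f4,f6,f8 //instruction j also writes f4; j's value must be the final one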
Data Hazards
• Consider two instructions i and j, with i preceding j in program order. The possible data hazards are the following (short RISC-V sketches of each appear after this list):
RAW (read after write):
o j tries to read a source before i writes it, so j incorrectly gets the old value.
o This hazard is the most common type and corresponds to a true data
dependence.
WAW (write after write):
o j tries to write an operand before it is written by i.
o The writes end up being performed in the wrong order, leaving the value
written by i rather than the value written by j in the destination. This hazard
corresponds to an output dependence.
WAR (write after read):
o j tries to write a destination before it is read by i, so i incorrectly gets the new
value. This hazard arises from an antidependence (or name dependence).
o WAR hazards cannot occur in most static issue pipelines, because such pipelines read all operands early and write results late in the pipeline.
o A WAR hazard occurs either when there are some instructions that write
results early in the instruction pipeline and other instructions that read a
source late in the pipeline, or when instructions are reordered.
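For illustration, here are minimal RISC-V sketches of each hazard; the RAW and WAR pairs reuse instructions from the loop above, while the WAW pair is a hypothetical pairing:
fld f0,0(x1) //i writes f0
fadd.d f4,f0,f2 //j reads f0: RAW (true data dependence)
fsd f4,0(x1) //i reads x1
addi x1,x1,-8 //j writes x1: WAR (antidependence)
fld f0,0(x1) //i writes f0
fadd.d f0,f6,f8 //j also writes f0: WAW (output dependence)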
(iii) Control Dependences
• The last type of dependence is a control dependence. A control dependence
determines the ordering of an instruction, i, with respect to a branch instruction so
that instruction i is executed in correct program order and only when it should be.
if p1
{ S1; };
if p2
{ S2; }
S1 is control-dependent on p1, and
S2 is control-dependent on p2 but not on p1.
In general, two constraints are imposed by control dependences:
1. An instruction that is control-dependent on a branch cannot be moved before the branch so that its execution is no longer controlled by the branch. For example, we cannot take an instruction from the then portion of an if statement and move it before the if statement (see the sketch after this list).
2. An instruction that is not control-dependent on a branch cannot be moved after the branch so that its execution is controlled by the branch. For example, we cannot take a statement before the if statement and move it into the then portion.
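A minimal RISC-V sketch of the first constraint (the label and registers are hypothetical, chosen for illustration):
beq x1,x0,skip //implements "if p1": branch around S1 when p1 is false
ld x5,0(x6) //S1: control-dependent on the branch; hoisting it above
//the beq could, for example, fault when p1 is false
skip: addi x7,x7,1 //not control-dependent on the branch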
What Is Instruction-Level Parallelism?
• Basic Block: The amount of parallelism available within a basic
block—a straight-line code sequence with no branches in
except to the entry and no branches out except at the exit—is
quite small.
• The simplest and most common way to increase the ILP is to
exploit parallelism among iterations of a loop. This type of
parallelism is often called loop-level parallelism.
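• For example, every iteration of the vector-increment loop shown earlier updates a different array element, so the iterations are independent of one another. A minimal sketch of two consecutive iterations unrolled (the second iteration's registers are chosen arbitrarily):
fld f0,0(x1) //iteration k
fadd.d f4,f0,f2
fsd f4,0(x1)
fld f6,-8(x1) //iteration k+1: different address and registers,
fadd.d f8,f6,f2 //so no data dependence on iteration k
fsd f8,-8(x1)
Because the two chains are independent, their instructions can be overlapped, which is exactly the loop-level parallelism a compiler or processor tries to exploit.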
