Sec5-Fpga - Part2

The document discusses FPGA architecture and technology mapping. It describes the basic components of FPGAs including logic blocks, interconnects, and programmable elements. It also covers mapping algorithms and issues like fanout, reconvergence, and decomposition when mapping circuits to LUT-based FPGA structures.

Uploaded by

Diriba Gobena

FPGA Architecture

Abstract Architecture

[Figure: block diagram with LOGIC, INSTR, and Interconnect + Storage]

• Three components of all computing elements
  – Control
  – Compute elements
  – Communication
Custom Hardware
• No control store
• Not general purpose
• All “computing” is through spatial connections

[Figure: block diagram with LOGIC and Interconnect + Storage]
Traditional P

•Control store only


controls logic
•Communication is in LOGIC
time INSTR
•Registers, memory
etc
Interconnect
+ Storage
Programmable Devices
• Prefabricated Silicon
• Logic implemented by programming the
basic cells and the interconnect
• Very fast turnaround time
• Limited design flexibility
• Low development time and cost
FPGA
• Combines PLDs and MPGAs
• Densities: 2K to 1000K+ gates
• Array of logic blocks and programmable interconnect
• Logic block
  – Universal gates, multiplexors, RAMs, etc.
• Programmable element
  – SRAM, EEPROM, or antifuse

[Figure: 2-input LUT with inputs I1 and I2, programming bits P1 to P4, and output Out; each programming bit is a storage cell with Data, Read or Write, and Q terminals]
Where are FPGAs Used

                 Time to market   Price      Volume
Emulation        Very high        X          Low
Prototyping      Very high        X          Low
Pre-production   Very high        Critical   Moderate
Production       Very high        Critical   High

Changing Market
FPGA
Generic 2D FPGA
SRAM Based FPGA - XILINX
Xilinx 4000 CLB
The Basic Building Block
• Logic Block
  – Lookup table based
    • Xilinx
  – Multiplexor based
    • Actel
  – Transistor based
  – Universal gate based
LUT Mapping
• An N-LUT is a direct implementation of a truth table: it can realize any function of N inputs.
• An N-LUT requires 2^N storage elements (latches).
• The N inputs select one latch location (like a memory address).
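
As a minimal sketch of the points above (an illustration, not a description of any vendor's circuit), an N-input LUT can be modeled as a list of 2^N stored bits that the inputs address like a memory. The class name `LUT` and its `read` method are hypothetical names chosen for this example.

```python
from itertools import product


class LUT:
    """Software model of an N-input lookup table (illustrative only)."""

    def __init__(self, n, func):
        self.n = n
        # "Program" the LUT: store one bit per input combination (2**n bits).
        # The first input is treated as the most significant address bit.
        self.bits = [int(bool(func(*inputs)))
                     for inputs in product((0, 1), repeat=n)]

    def read(self, *inputs):
        """Use the input values as an address into the stored bits."""
        if len(inputs) != self.n:
            raise ValueError(f"expected {self.n} inputs, got {len(inputs)}")
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.bits[address]


# A 2-input LUT programmed as XOR, and a 3-input LUT programmed as AND.
xor2 = LUT(2, lambda a, b: a ^ b)
and3 = LUT(3, lambda a, b, c: a & b & c)
```

Changing the function only changes the stored bits; the lookup structure stays the same, which is why one LUT can realize any function of its inputs.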
Implementing Combinational Logic

Two 4-input functions with register outputs and one 2-input function
5 input Function
Single Port RAM

http://www.xilinx.com/bvdocs/publications/4000.pdf, page 9
Platform Computing
The Virtex Architecture
• CLBs
• IOBs
• General Routing Matrix (GRM)
• BRAMs
• DLL
Virtex II Architecture
Virtex II CLB

V2 CLB Configuration

V2 Slice Configuration
Virtex II CLB (Half Slice)
Adder
Carry Chain
Other Features
The latest entry – Virtex II Pro
• Embedded high-speed serial transceivers enable data bit rates up to 3.125 Gb/s per channel (RocketIO) or 10.3125 Gb/s (RocketIO X).
• Embedded IBM PowerPC 405 RISC processor blocks provide performance up to 400 MHz.
• SelectIO-Ultra blocks provide the interface between package pins and the internal configurable logic. Most popular and leading-edge I/O standards are supported by the programmable IOBs.
• Configurable Logic Blocks (CLBs) provide functional elements for combinatorial and synchronous logic, including basic storage elements. BUFTs (3-state buffers) associated with each CLB element drive dedicated segmentable horizontal routing resources.
• Block SelectRAM+ memory modules provide large 18 Kb storage elements of True Dual-Port RAM.
• Embedded multiplier blocks are 18-bit x 18-bit dedicated multipliers.
• Digital Clock Manager (DCM) blocks provide self-calibrating, fully digital solutions for clock distribution delay compensation, clock multiplication and division, and coarse- and fine-grained clock phase shifting.
FPGA Technology Mapping
Outline
• Technology mapping
– Definition & Examples
– Algorithms
• FPGA structure & simple mapping
• FPGA technology mapping
– Issues
– Algorithms
Definition
Technology mapping is also referred to as library binding.

Given a Boolean network and a characterized cell library, generate a mapping of the network components onto cell library components with the objective of cost optimization or delay optimization.
Input & Library
• Input: Boolean network, that is, a technology-independent optimized network; typically a multi-level network
• Library:
  – Characterization in terms of area, delay and power
  – Enumerated or implicit library cells
Typical Library
A typical simple library cell:
• a single-output combinational logic function
• cost in terms of area
• delay in terms of propagation delays for each input/output pair and as a function of load and/or fanout. Sometimes only the worst-case values are stored.
• power in terms of average current
Network Covering
Network covering replaces sub-networks of the original network with cell library instances. Covering entails recognizing that a library cell is equivalent to an identified sub-network, and selecting enough such cells to cover the whole network.
Example 1
Cell library consists of two- and three-input gates

Example: First Mapping
Cell library consists of two- and three-input gates

Example: Second Mapping
Cell library consists of two- and three-input gates

Example 2
Cell library consists of

Component   Area   Delay
AND2        3      2
OR2         3      2
OA21        5      3

Example: First Mapping
Cell library consists of

Component   Area   Delay
AND2        3      2
OR2         3      2
OA21        5      3

Area = 9, Delay = 4

Example: Second Mapping
Cell library consists of

Component   Area   Delay
AND2        3      2
OR2         3      2
OA21        5      3

Area = 10, Delay = 3


Example
Cell library consists of

Component   Area   Delay
AND2        3      2
OR2         3      2
OA21        5      3

[Figure: network annotated with candidate matches m1, m2, m3, m4, m5]

(m1 + m4 + m5)(m2 + m4)(m3 + m5)(m2’ + m1)(m3’ + m1) = 1
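
The covering expression above can be explored by brute force, since there are only five match variables. The sketch below enumerates every selection of matches that makes the expression true and ranks the selections by total area. The area given to each match is an assumption: the slide does not state which cell each match uses, so the sketch treats m1, m2, m3 as 2-input gates (area 3) and m4, m5 as OA21 cells (area 5). The names `CLAUSES`, `AREA`, and `feasible_covers` are hypothetical.

```python
from itertools import product

# One list of literals per clause; a literal is (match, required value).
# (m1 + m4 + m5)(m2 + m4)(m3 + m5)(m2' + m1)(m3' + m1)
CLAUSES = [
    [("m1", True), ("m4", True), ("m5", True)],
    [("m2", True), ("m4", True)],
    [("m3", True), ("m5", True)],
    [("m2", False), ("m1", True)],
    [("m3", False), ("m1", True)],
]

# ASSUMPTION (not stated in the source): which library cell each match uses.
AREA = {"m1": 3, "m2": 3, "m3": 3, "m4": 5, "m5": 5}
MATCHES = sorted(AREA)


def satisfies(assign):
    """True if every clause has at least one literal that holds."""
    return all(any(assign[name] == value for name, value in clause)
               for clause in CLAUSES)


def feasible_covers():
    """Return all satisfying selections as (total area, chosen matches)."""
    covers = []
    for bits in product([False, True], repeat=len(MATCHES)):
        assign = dict(zip(MATCHES, bits))
        if satisfies(assign):
            chosen = [name for name in MATCHES if assign[name]]
            covers.append((sum(AREA[name] for name in chosen), chosen))
    return sorted(covers)


covers = feasible_covers()
best_area, best_cover = covers[0]
```

Under these assumed areas the cheapest feasible selection is m1, m2, m3 with area 9, while m4, m5 is also feasible with area 10. These figures agree with the areas quoted for the two mappings of Example 2, although the source does not confirm that correspondence. Exhaustive enumeration is only practical for tiny examples like this one.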


FPGA Structures & Mapping
FPGA Structures
• Multiplexer based (ACTEL)
– Mapping techniques similar to library based
– Library is created by enumerating all possible
“patterns”
• LUT based (XILINX)
– Significantly different mapping techniques
LUT Based FPGAs
In LUT-based FPGAs (for example, XILINX FPGAs) the building blocks are LUTs and flip-flops. An n-input LUT can implement any function of n variables.
The FPGA itself is composed of CLBs, each containing multiple LUTs and flip-flops, which makes the technology mapping problem more complex.
XC3000 CLB
[Figure: XC3000 CLB with a 2x4 or 1x5 LUT and two flip-flops (FF)]
XC4000 CLB
[Figure: XC4000 CLB with two 4-input LUTs, one 3-input LUT, and two flip-flops (FF)]
Mapping Objectives
• Cost optimal mapping
– Minimizing the number of LUTs
– Minimizing the number of CLBs
• Delay optimal mapping
– Minimizing the number of LUT levels
– Minimizing the delays (including routing
delays)
Cost Optimal Mapping
Mapping onto k-input LUTs can be cast as a bin-packing problem: we have to minimize the number of bins, each with a capacity of k inputs.
Assume the starting point is a gate-level netlist in which each gate has at most k inputs.
Each gate can then be packed into one bin.
Example: Simple Mapping
Sum of Products: Bin Packing
• Select the product term with the most variables and place it in any table where it fits; if it does not fit anywhere, add a new table
• The table with the fewest unused inputs is declared final
• Associate the output of the final table with the first table that can accept it
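
The three steps above can be sketched as a first-fit-decreasing packing loop. This is one possible reading of the slide, not a confirmed implementation of the author's algorithm. It assumes that a shared variable occupies only one table input, that ties are broken by table order, and that intermediate outputs are named t1, t2, and so on. The function name `pack_terms` is hypothetical. The input lists are taken from the overlapping-inputs and decomposition examples in this section; because the figures did not survive extraction, the LUT counts are what this sketch produces rather than values read from the slides.

```python
def pack_terms(terms, k):
    """Pack product terms (iterables of variable names) into k-input tables.

    Returns a list of (output name, sorted inputs), one entry per LUT.
    """
    if not terms:
        return []
    bins = []
    # Step 1: largest term first; first table where it fits, else a new table.
    for term in sorted((set(t) for t in terms), key=len, reverse=True):
        if len(term) > k:
            raise ValueError(f"term {sorted(term)} needs more than {k} inputs")
        for table in bins:
            if len(table | term) <= k:   # shared variables count once
                table |= term
                break
        else:
            bins.append(set(term))

    luts = []
    while len(bins) > 1:
        # Step 2: the table with the fewest unused inputs becomes final.
        i = max(range(len(bins)), key=lambda j: len(bins[j]))
        final = bins.pop(i)
        out = f"t{len(luts) + 1}"
        luts.append((out, sorted(final)))
        # Step 3: feed its output to the first table that can accept it.
        for table in bins:
            if len(table) < k:
                table.add(out)
                break
        else:
            bins.append({out})
    luts.append(("f", sorted(bins[0])))  # last table drives the node output
    return luts


overlap = pack_terms([("a", "b", "c"), ("a", "d"), ("e", "f", "g")], k=4)
disjoint = pack_terms([("a", "b", "c"), ("h", "d"), ("e", "f", "g")], k=4)
```

With the overlapping inputs the sketch uses two tables, because a is shared between the first two terms; with h in place of a it uses three.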
Example: 4-input LUT
Example: Overlapping Inputs

[Figure: gates with inputs (a, b, c), (a, d), and (e, f, g); K = 4]
Example: Decomposition

[Figure: gates with inputs (a, b, c), (h, d), and (e, f, g); K = 4]
Example: 3 input LUT
FPGA Technology Mapping: Issues
LUT Mapping
Starting from a technology-independent optimized circuit, produce a minimal LUT cover for the circuit. The problem is complicated by the following:
• Fanout nodes
• Reconvergence
• Node decomposition and packing
Area vs. Delay
Decomposition
Decomposition
Fanout: Replication

DAG, not a tree


Fanout: Replication
Fanout: Reconvergence
Fanout: Reconvergence
CLB Mapping
Directly mapping a technology-independent circuit onto CLBs would involve function decomposition.
Alternatively, one can start from a circuit already mapped onto LUTs and then pack the LUTs into CLBs.
Thank You
