A Time-Multiplexed: Steve Trimberger, Dean Carberry, Anders Johnson, Jennifer Wong
Abstract
This paper describes the architecture of a time-multiplexed
FPGA. Eight configurations of the FPGA are stored in on-chip memory. This inactive on-chip memory is distributed
around the chip, and accessible so that the entire configuration of the FPGA can be changed in a single cycle of the
memory. The entire configuration of the FPGA can be loaded
from this on-chip memory in 30ns. Inactive memory is accessible as block RAM for applications. The FPGA is based on
the Xilinx XC4000E FPGA, and includes extensions for
dealing with state saving and forwarding and for increased
routing demand due to time-multiplexing the hardware.
This paper is divided into three sections. First is an introduction to the method of time multiplexing used in this architecture. The second section discusses modes of operation of a
time-multiplexed device. The third section describes the
device itself and highlights the device features and how they
support the modes of operation.
Background
Modes of Operation
A rapidly-reconfigurable FPGA is a mere curiosity without a
model of use that can be used as a design target and automated. We envision three modes of operation of the device:
logic engine mode, time-share mode and static mode.
Static Mode
In static mode, the FPGA, or part of it, does not appear to be
reconfiguring at all. Static mode is used to build logic that
must always be resident and active -- for example the logic
that controls the time share or logic engine sequencing, or
asynchronous logic. We implement static logic by programming the memory cells that hold a configuration bit to the same value in every configuration. When the new configuration is loaded, the logic operates
identically. We ensure that reconfiguration does not glitch
control points if they are unchanged, so the emulated logic
operates without interruption.
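The idea of programming a bit identically in every configuration can be sketched as follows. This is a minimal illustration only: the plane count of eight comes from the abstract, while the list-of-lists model and the names `make_planes`, `program_static` and `is_static` are assumptions made for this example, not part of the device.

```python
NUM_PLANES = 8  # eight stored configurations, per the abstract


def make_planes(num_bits):
    """Return NUM_PLANES configuration planes, all bits initially 0."""
    return [[0] * num_bits for _ in range(NUM_PLANES)]


def program_static(planes, bit_index, value):
    """Write the same value to one bit position in every plane."""
    for plane in planes:
        plane[bit_index] = value


def is_static(planes, bit_index):
    """A bit behaves statically if every plane holds the same value."""
    return len({plane[bit_index] for plane in planes}) == 1


planes = make_planes(num_bits=4)
program_static(planes, bit_index=0, value=1)  # bit 0 controls static logic
planes[3][2] = 1                              # bit 2 differs in plane 3 only
```

In this model, switching between any two planes leaves bit 0 unchanged, which is the condition under which the logic it controls appears not to reconfigure.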
In logic engine mode, micro registers store intermediate values and flip flop values. This is a critically important capability. Although combinational logic can be multiplexed
among several functions, state storage cannot. As a result,
although eight LUTs in a design may share a single physical
LUT, eight flip flops must all be provided for subsequent
logic to access the results. The storage for these flip flops is
in the micro registers.
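A small behavioral sketch may help show why the results of each configuration need their own storage. It assumes a single physical LUT that is reconfigured once per step, with each result kept in a micro register so that later configurations can read it. The two-input truth tables, the `("in", i)` / `("reg", k)` source encoding and the function names are hypothetical, chosen only for illustration.

```python
def lut_eval(truth_table, inputs):
    """Evaluate a LUT: the input bits form an index into the truth table."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return truth_table[index]


def run_logic_engine(configs, primary_inputs):
    """Step one physical LUT through all of its configurations.

    configs: list of (truth_table, sources), one entry per configuration.
    Each source is ("in", i) for primary input i, or ("reg", k) for the
    micro register written by earlier configuration k.
    """
    micro_regs = []
    for truth_table, sources in configs:
        inputs = [
            primary_inputs[i] if kind == "in" else micro_regs[i]
            for kind, i in sources
        ]
        # The result must be stored: the LUT is about to be reused.
        micro_regs.append(lut_eval(truth_table, inputs))
    return micro_regs


AND = [0, 0, 0, 1]
XOR = [0, 1, 1, 0]
configs = [
    (AND, [("in", 0), ("in", 1)]),   # configuration 0: a AND b
    (XOR, [("reg", 0), ("in", 2)]),  # configuration 1: (a AND b) XOR c
]
result = run_logic_engine(configs, [1, 1, 0])
```

Here the one LUT is shared by two functions, but two storage elements are still needed, one per result.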
Micro Registers
A micro register stores the CLB output (either the combinational or sequential output) when the FPGA changes configuration. This state saving is automatic, though each micro
register can use the CLB's clock signal as a clock enable,
which can be used to control state saving. The state-save-enable feature is used to support multiple clocks in logic
engine mode.
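The state-save behavior described above might be modeled as below. This is a sketch under stated assumptions: the class name `MicroRegister`, the `use_clock_as_enable` flag and the choice that a high clock level enables the save are all hypothetical, since the text does not give the polarity or the programming interface.

```python
class MicroRegister:
    """Holds a CLB output captured when the configuration changes."""

    def __init__(self, use_clock_as_enable=False):
        # When False, state is saved on every configuration change.
        self.use_clock_as_enable = use_clock_as_enable
        self.value = 0

    def on_reconfigure(self, clb_output, clb_clock):
        """Capture the CLB output unless the clock enable gates it off."""
        # Assumption: clock level 1 means "save enabled".
        if not self.use_clock_as_enable or clb_clock:
            self.value = clb_output
        return self.value


automatic = MicroRegister()
gated = MicroRegister(use_clock_as_enable=True)
```

With gating enabled, a register in a slower clock domain can hold its previous value across configuration changes in which its clock is inactive.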
Configuration Controller
To support time-share mode, flash reconfiguration can be initiated by an external or internal signal. The address of the
new configuration plane can come from an internal or external source. Therefore, there is no restriction on which configuration is the next to be run. The reconfiguration
operation proceeds as follows:
Interconnect
Like CLB configuration cells, all configuration cells that control interconnect are backed by eight inactive memory cells.
User Memory
User memory access (#3) and pre-fetching the next configuration (#4) can be pipelined with the operation of the logic
in the configuration (#2). Steps #3 and #4 are both memory
operations, and both use the configuration memory bus.
Therefore they must be serialized. However, since each
memory operation takes about 5ns, the delay for these operations can be completely hidden.
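The timing argument can be checked with a short calculation. The 5ns memory operation time and the two serialized operations come from the text; the logic evaluation time is not stated here, so `logic_time_ns` is an assumed parameter and the function name is hypothetical.

```python
def memory_ops_hidden(logic_time_ns, num_memory_ops=2, memory_op_ns=5.0):
    """Return True if the serialized memory operations fit in the logic time.

    The operations share one configuration memory bus, so their times add.
    """
    return num_memory_ops * memory_op_ns <= logic_time_ns


# Two serialized 5ns operations need 10ns in total.
fits_in_10ns = memory_ops_hidden(10.0)
fits_in_8ns = memory_ops_hidden(8.0)
```

Under these assumptions, the delay is fully hidden whenever the logic in the current configuration runs for at least 10ns.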
Notice the CLB flip flop values are not restored. The CLB
flip flop is not used in LE mode -- flip flop values are stored
in micro registers.
Static Mode
Since static mode is supported by programming some bits
identically, the controller has no special sequencing to support it. The control over restoring flip flops is actually stored
as bits in each CLB, so state restoration is permitted on a
per-CLB basis. This flexibility is required to allow static flip
flops to be distributed among time-shared logic. The CLBs
containing static flip flops are not overwritten during reconfiguration, while those in time-shared logic are overwritten.
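Per-CLB control over state restoration can be sketched as a simple selection. The function name and the list-based model are assumptions for illustration; only the existence of a per-CLB control bit, and the rule that static flip flops are not overwritten, come from the text.

```python
def reconfigure_flip_flops(current_values, saved_values, restore_enable):
    """Return the flip-flop values after a reconfiguration.

    current_values: value held by each CLB flip flop before reconfiguration.
    saved_values:   stored state belonging to the incoming configuration.
    restore_enable: per-CLB control bit; False marks a static flip flop.
    """
    return [
        saved if enable else current
        for current, saved, enable in zip(
            current_values, saved_values, restore_enable
        )
    ]


# CLB 0 holds a static flip flop; CLBs 1 and 2 belong to time-shared logic.
after = reconfigure_flip_flops([1, 0, 1], [0, 1, 0], [False, True, True])
```

Because the decision is made per CLB, static and time-shared flip flops can sit side by side in the same array.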
Power Consumption
[Figure labels: Configuration/User Memory Data; Latch; to CLB/Interconnect Control]
Layout
As shown in Figure 7, the design consists of columns of memory blocks, interleaved with logic. Each memory block contains 16 bits of memory, eight of which control field-programmable
logic on the left and eight on the right. The area
between the memory cell columns is used to build the field-programmable logic: lookup tables, micro registers and programmable interconnect.
The layout was done in 0.5µm CMOS. The chip has not yet
been fabricated.
Figure 7. Chip Layout Floorplan, Upper Right Corner. (Labels: memory columns, CLB, IOB.)
References
N. Bhat, K. Chaudhary, E.S. Kuh, Performance-Oriented Fully Routable Dynamic Architecture for a Field Programmable Logic Device, M93/42, U.C. Berkeley, 1993.
S. Trimberger, Mapping Large Designs into a Time-Multiplexed FPGA, private communication, 1997.
Xilinx, The Programmable Logic Data Book, 1996.
Summary
This paper describes an architecture and modes of operation
of a time-multiplexed FPGA. Modes of operation include:
Logic engine mode, where the device emulates a single
large FPGA.
Time-share mode, where the device emulates several communicating FPGAs.
Static mode, where the logic remains active and unchanged
during configuration.