Spartan 6 FPGA Configurable Logic Block
User Guide
Revision History
The following table shows the revision history for this document.
Spartan-6 FPGA CLB User Guide www.xilinx.com UG384 (v1.1) February 23, 2010
Additional Documentation
The following documents are also available for download at
http://www.xilinx.com/support/documentation/spartan-6.htm.
• Spartan-6 Family Overview
This overview outlines the features and product selection of the Spartan-6 family.
• Spartan-6 FPGA Data Sheet: DC and Switching Characteristics
This data sheet contains the DC and switching characteristic specifications for the
Spartan-6 family.
• Spartan-6 FPGA Packaging and Pinout Specifications
This specification includes the tables for device/package combinations and maximum
I/Os, pin definitions, pinout tables, pinout diagrams, mechanical drawings, and
thermal specifications.
• Spartan-6 FPGA Configuration User Guide
This all-encompassing configuration guide includes chapters on configuration
interfaces (serial and parallel), multi-bitstream management, bitstream encryption,
boundary-scan and JTAG configuration, and reconfiguration techniques.
• Spartan-6 FPGA SelectIO Resources User Guide
This guide describes the SelectIO™ resources available in all Spartan-6 devices.
• Spartan-6 FPGA Clocking Resources User Guide
This guide describes the clocking resources available in all Spartan-6 devices,
including the DCMs and PLLs.
• Spartan-6 FPGA Block RAM Resources User Guide
This guide describes the Spartan-6 device block RAM capabilities.
• Spartan-6 FPGA GTP Transceivers User Guide
This guide describes the GTP transceivers available in the Spartan-6 LXT FPGAs.
[Figure 1: a CLB containing Slice(0) and Slice(1), connected to the switch matrix, with CIN and COUT carry connections. (ug384_01_042309)]
The Xilinx tools designate slices with the following definitions. An “X” followed by a
number identifies the position of each slice in a pair as well as the column position of the
slice. The “X” number counts slices starting from the left in sequence 0, 1 (the first CLB
column); 2, 3 (the second CLB column); etc. A “Y” followed by a number identifies a row of
slices. The number remains the same within a CLB, but counts up in sequence from one
CLB row to the next CLB row, starting from the bottom. Figure 2 shows four CLBs located
in the bottom-left corner of the die.
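[Figure 2: four CLBs in the bottom-left corner of the die. Bottom CLB row: slices X0Y0 and X1Y0 (first CLB column), X2Y0 and X3Y0 (second CLB column). Next CLB row: X0Y1 and X1Y1, X2Y1 and X3Y1. Slices X1 and X3 are labeled SLICEX; slices X0 and X2 carry the CIN and COUT connections. (ug384_02_042309)]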
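The naming rule can be sketched in a few lines of Python. This is an illustrative helper only, not part of the Xilinx tools; the function name and its zero-based CLB column and row arguments are assumptions made for the example.

```python
def slice_names(clb_col, clb_row):
    """Return the names of the two slices in the CLB at (clb_col, clb_row).

    clb_col and clb_row are hypothetical zero-based CLB indices.
    """
    x_first = 2 * clb_col       # X counts two slices per CLB column
    x_second = x_first + 1
    # Y is the same for both slices of a CLB and counts CLB rows
    return (f"X{x_first}Y{clb_row}", f"X{x_second}Y{clb_row}")


# The four CLBs in the bottom-left corner of the die, top row first
for row in (1, 0):
    print([slice_names(col, row) for col in (0, 1)])
```

Printed top row first, the result matches the slice labels in Figure 2.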
Slice Description
Every slice contains four logic-function generators (or look-up tables, LUTs) and eight
storage elements. These elements are used by all slices to provide logic and ROM functions
(Table 1). SLICEX is the basic slice. Some slices, called SLICELs, also contain an arithmetic
carry structure that can be concatenated vertically up through the slice column, and wide-
function multiplexers. The SLICEMs contain the carry structure and multiplexers, and add
the ability to use the LUTs as 64-bit distributed RAM and as variable-length shift registers
(maximum 32-bit).
Each column of CLBs contains two slice columns. One column is a SLICEX column; the
other column alternates between SLICELs and SLICEMs. Thus, approximately 50% of the
available slices are of type SLICEX, while 25% each are SLICELs and SLICEMs. The
XC6SLX4 does not have SLICELs (Table 3).
SLICEM (shown in Figure 3) represents a superset of elements and connections found in
all slices. SLICEL is shown in Figure 4. SLICEX is shown in Figure 5. All eight SR, CE, and
CLK inputs are driven by common control inputs.
[Figure 3: SLICEM. Four LUTs (A to D), each configurable as LUT, RAM, ROM, DPRAM64/DPRAM32, SPRAM64/SPRAM32, or SRL16/SRL32; CARRY4 carry logic from CIN to COUT; F7 and F8 wide-function multiplexers; four FF/LATCH elements (AQ to DQ, also usable as AND2L/OR2L) and four flip-flops (A5Q to D5Q), each with SRINIT0/SRINIT1; outputs A to D and AMUX to DMUX; bypass inputs AX to DX and data inputs AI to DI; common CLK, CE, SR, and WE. (ug384_03_042309)]
[Figure 4: SLICEL. Four LUTs (A to D); CARRY4 carry logic from CIN to COUT; F7 and F8 wide-function multiplexers; four FF/LATCH elements (AQ to DQ, also usable as AND2L/OR2L) and four flip-flops (A5Q to D5Q), each with SRINIT0/SRINIT1; outputs A to D and AMUX to DMUX; bypass inputs AX to DX; common CLK, CE, and SR. (ug384_04_042309)]
[Figure 5: SLICEX. Four LUTs (A to D); four FF/LATCH elements (AQ to DQ, also usable as AND2L/OR2L) and four flip-flops (A5Q to D5Q), each with SRINIT0/SRINIT1; outputs A to D and AMUX to DMUX; bypass inputs AX to DX; common CLK, CE, and SR. No carry logic or wide-function multiplexers. (ug384_05_121108)]
Notes:
1. SLICEM only. SLICEL and SLICEX do not have distributed RAM or shift registers.
2. SLICEM and SLICEL only.
Table 3 shows the available CLB resources for the Spartan-6 FPGAs. The ratio between the
number of 6-input LUTs and logic cells is 1.6. This reflects the increased capability of the
new 6-input LUT architecture compared to traditional 4-input LUTs.
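As a quick arithmetic sketch, the per-slice counts given in Slice Description (four LUTs and eight storage elements) and the 1.6 ratio can be combined as follows. The slice count of 600 is only an illustrative input, and the function name is hypothetical; consult Table 3 for actual device figures.

```python
LUTS_PER_SLICE = 4   # four function generators per slice
FFS_PER_SLICE = 8    # eight storage elements per slice


def clb_resources(num_slices):
    """Derive LUT, flip-flop, and logic-cell counts from a slice count."""
    luts = num_slices * LUTS_PER_SLICE
    flip_flops = num_slices * FFS_PER_SLICE
    # 1.6 logic cells per 6-input LUT, computed in integers (8 / 5 = 1.6)
    logic_cells = luts * 8 // 5
    return {"luts": luts, "flip_flops": flip_flops, "logic_cells": logic_cells}


# Illustrative slice count, not taken from Table 3
print(clb_resources(600))
```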
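[Figure 6: LUT6. Two LUT5s share inputs A[5:1]; A6 selects between them to form output O6, and one LUT5 also drives output O5. Both O6 and O5 can feed a storage element (D Q). (UG384_06new_021210)]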
In addition to the basic LUTs, SLICEL and SLICEM contain three multiplexers (F7AMUX,
F7BMUX, and F8MUX). These multiplexers combine up to four function generators to
provide any function of seven or eight inputs in a slice. F7AMUX and F7BMUX generate
seven-input functions from LUTs A and B, or C and D, while F8MUX combines all four
LUTs to generate eight-input functions. Functions with more than eight inputs can be
implemented using multiple slices. There are no direct connections between slices to form
function generators with more than eight inputs, whether within a CLB or between CLBs,
but CLB outputs can be routed through the switch matrix and directly back into the CLB
inputs.
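The way four function generators and the three multiplexers form an eight-input function can be sketched behaviorally in Python. This is a simplified model, not a description of the silicon: the helper names are hypothetical, and the assignment of multiplexer inputs (select Low picks LUT A or C, select High picks LUT B or D) is an assumption of the sketch.

```python
def lut6(init, inputs):
    """Read one bit of a 64-bit truth table; inputs[0] is A1 ... inputs[5] is A6."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> address) & 1


def mux2(i0, i1, sel):
    return i1 if sel else i0


def function8(inits, inputs):
    """inits: truth tables for LUTs A, B, C, D; inputs: eight bits, LSB first."""
    low6, sel7, sel8 = inputs[:6], inputs[6], inputs[7]
    a, b, c, d = (lut6(init, low6) for init in inits)
    f7a = mux2(a, b, sel7)          # F7AMUX: seven-input function from LUTs A and B
    f7b = mux2(c, d, sel7)          # F7BMUX: seven-input function from LUTs C and D
    return mux2(f7a, f7b, sel8)     # F8MUX: eight-input function


def build_inits(func):
    """Split an eight-input function (int -> bit) into four 64-bit truth tables."""
    inits = []
    for upper in range(4):          # upper = inputs[7:6]
        init = 0
        for address in range(64):
            init |= func((upper << 6) | address) << address
        inits.append(init)
    return inits


def parity(value):
    return bin(value).count("1") & 1


inits = build_inits(parity)
print(function8(inits, [1, 0, 0, 0, 0, 0, 0, 1]))  # parity of two ones: prints 0
```

Any eight-input function can be split the same way, because the two upper inputs only choose which of the four 64-entry truth tables is read.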
Storage Elements
Each slice has eight storage elements. There are four storage elements in a slice that can be
configured as either edge-triggered D-type flip-flops or level-sensitive latches. The D input
can be driven directly by a LUT output via AFFMUX, BFFMUX, CFFMUX or DFFMUX, or
by the BYPASS slice inputs bypassing the function generators via AX, BX, CX, or DX input.
When configured as a latch, the latch is transparent when the CLK is Low.
In Spartan-6 devices, there are four additional storage elements that can only be configured
as edge-triggered D-type flip-flops. The D input can be driven by the O5 output of the LUT.
When the original four storage elements are configured as latches, these four additional
storage elements cannot be used.
Figure 7 shows both the register-only and the register/latch configurations in a slice. Both
are available.
[Figure 7: register-only (CFF, AFF) and register/latch (CFF/LATCH, AFF/LATCH) configurations in a slice, shown for LUT C and LUT A. The storage elements take their D input from the LUT O5 or O6 output or from the CX/AX input, have CE, CK, and SR inputs with an SRINIT0/SRINIT1 option, and drive CQ and AQ. (ug384_06_042309)]
The control signals clock (CLK), clock enable (CE), and set/reset (SR) are common to all
storage elements in one slice. When one flip-flop in a slice has SR or CE enabled, the other
flip-flops used in the slice also have SR or CE enabled by the common signal. Only the
CLK signal has selectable polarity, and the selected polarity applies to all eight storage
elements. Any inverter placed on the clock signal is automatically absorbed. The CE and SR
signals are active High. All flip-flop and latch primitives have CE and non-CE versions.
The SR signal always has priority over CE.
Initialization
The SR signal forces the storage element into the initial state specified by SRINIT1 or
SRINIT0. SRINIT1 forces a logic High at the storage element output when SR is asserted,
while SRINIT0 forces a logic Low at the storage element output (see Table 4).
SRINIT0 and SRINIT1 can be set individually for each storage element in a slice. The choice
of synchronous (SYNC) or asynchronous (ASYNC) set/reset (SRTYPE) is common to all
eight storage elements and cannot be set individually for each storage element in a slice.
The initial state after configuration or global initial state is also defined by the same SRINIT
option. The initial state is set whenever the Global Set/Reset (GSR) signal is asserted. The
GSR signal is always asserted during configuration, and can be controlled after
configuration by using the STARTUP_SPARTAN6 primitive. To maximize design
flexibility and utilization, use the GSR and avoid local initialization signals.
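As a rough behavioral sketch (not a Xilinx simulation model), the following Python class shows SR taking priority over CE, and the same SRINIT value defining both the state after configuration or GSR and the state forced by the local SR. It assumes a synchronous SRTYPE; the class and method names are hypothetical.

```python
class SliceFlipFlop:
    """Behavioral sketch of one slice flip-flop with synchronous set/reset."""

    def __init__(self, srinit=0):
        self.srinit = srinit    # 0 models SRINIT0, 1 models SRINIT1
        self.q = srinit         # state after configuration (GSR asserted)

    def gsr(self):
        """Global Set/Reset: return to the SRINIT state."""
        self.q = self.srinit

    def clock_edge(self, d, ce, sr):
        """Apply one active clock edge and return the new output."""
        if sr:                  # SR always has priority over CE
            self.q = self.srinit
        elif ce:                # CE High: capture D
            self.q = d
        return self.q           # CE Low and SR Low: hold


ff = SliceFlipFlop(srinit=1)
print(ff.clock_edge(d=0, ce=1, sr=0))   # captures D
print(ff.clock_edge(d=0, ce=0, sr=1))   # SR acts even though CE is Low
```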
The initial state of any storage element (SRINIT) is defined in the design either by the INIT
attribute or by the use of a set or reset. If both methods are used, they must both be 0 or
both be 1. INIT = 0 or a reset selects SRINIT0, and INIT = 1 or a set selects SRINIT1.
The storage element must be initialized to the same value both by the global power-up or
GSR signal, and by the local SR input to the slice. A storage element cannot have both set
and reset, unless one is defined as a synchronous function so that it can be placed in the
LUT. Avoid instantiating primitives with the control input while specifying the INIT
attribute in an opposite state, for example, an FDRE with a reset input and the INIT
attribute set to 1. Care should be taken when re-targeting designs from another technology
to the Spartan-6 architecture. If converting an existing FPGA design, avoid primitives that
use both set and reset, such as the FDCPE primitive.
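The consistency rule between the INIT attribute and a local set or reset can be expressed as a small check. This is a hypothetical helper written for illustration, not a tool feature; in particular, falling back to SRINIT0 when neither INIT nor a control is given is an assumption of the sketch.

```python
def resolve_srinit(init=None, control=None):
    """Return 0 (SRINIT0) or 1 (SRINIT1) for one storage element.

    init:    value of the INIT attribute (0, 1, or None if not specified)
    control: "set", "reset", or None if no local SR is used
    """
    from_control = {"reset": 0, "set": 1, None: None}[control]
    if init is not None and from_control is not None and init != from_control:
        # For example, an FDRE (reset input) with INIT set to 1
        raise ValueError(f"INIT = {init} conflicts with a local {control}")
    if init is not None:
        return init
    if from_control is not None:
        return from_control
    return 0  # assumption of this sketch: SRINIT0 when nothing is specified


print(resolve_srinit(init=1, control="set"))
print(resolve_srinit(control="reset"))
```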
Each of the eight flip-flops in a slice must use the same SR input, although they can be
initialized to different values. A second initialization control will require implementation
in a separate slice, so minimize the number of initialization signals. The SR could be turned
off for all flip-flops in a slice and implemented independently for each flip-flop by
implementing it synchronously in the LUT.
The SR signal is available to the flip-flop, independent of whether the LUT is used as a
distributed RAM or shift register, which supports a registered read from distributed RAM
or an additional pipeline stage in a shift register while still allowing initialization.
The configuration options for the set and reset functionality of a register or the four storage
elements capable of functioning as a latch are as follows:
• No set or reset
• Synchronous set
• Synchronous reset
• Asynchronous set (preset)
• Asynchronous reset (clear)
Notes:
1. S = single-port configuration; D = dual-port configuration; Q = quad-port configuration; SDP = simple
dual-port configuration.
2. RAM32M is the associated primitive for this configuration.
3. RAM64M is the associated primitive for this configuration.
For single-port configurations, distributed RAM has a common address port for
synchronous writes and asynchronous reads. For dual-port configurations, distributed
RAM has one port for synchronous writes and asynchronous reads, and another port for
asynchronous reads. In simple dual-port configuration, there is no data out (read port)
from the write port. For quad-port configurations, distributed RAM has one port for
synchronous writes and asynchronous reads, and three additional ports for asynchronous
reads.
In single-port mode, read and write addresses share the same address bus. In dual-port
mode, one function generator is connected with the shared read and write port address.
The second function generator has the A inputs connected to a second read-only port
address and the WA inputs shared with the first read/write port address.
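The port behavior described above (synchronous write, asynchronous read, and a second read-only address in dual-port mode) can be modeled in a few lines. This is a behavioral sketch under simplifying assumptions: a 64 x 1-bit memory, zero-initialized contents, and hypothetical class and method names. The SPO, DPO, and DPRA names follow the RAM64X1D figure.

```python
class DualPortDistributedRam:
    """Behavioral sketch of a 64 x 1-bit dual-port distributed RAM."""

    def __init__(self, depth=64):
        self.mem = [0] * depth   # assumption: contents start at zero

    def write_clock_edge(self, addr, data, we):
        """Synchronous write through the read/write port."""
        if we:
            self.mem[addr] = data & 1

    def read_spo(self, addr):
        """Asynchronous read at the shared read/write port address."""
        return self.mem[addr]

    def read_dpo(self, dpra):
        """Asynchronous read at the second, read-only port address."""
        return self.mem[dpra]


ram = DualPortDistributedRam()
ram.write_clock_edge(addr=5, data=1, we=1)
print(ram.read_spo(5), ram.read_dpo(5), ram.read_dpo(6))
```

Both read methods look at the same memory cells; only the address source differs, which matches the shared WA inputs described above.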
Figure 8 through Figure 16 illustrate various example distributed RAM configurations
occupying one SLICEM. When using x2 configuration (RAM32X2Q), A6 and WA6 are
driven High by the software to keep O5 and O6 independent.
[Figure 8: RAM32X2Q, quad-port 32 x 2-bit distributed RAM. Four DPRAM32 LUTs (D, C, B, A) with data inputs DID[1:0] on DI1 and DI2, address inputs ADDRD, ADDRC, ADDRB, and ADDRA[4:0], common WCLK and WED, and two-bit outputs DOD, DOC, DOB, and DOA on O6 and O5. (ug384_07_042309)]
[Figure 9: RAM32X6SDP, simple dual-port 32 x 6-bit distributed RAM. LUT D takes the write address WADDR[5:1] (WADDR[6] = 1) with its data inputs unused; LUTs C, B, and A take DATA[1] to DATA[6] and the read address RADDR[5:1] (RADDR[6] = 1), and drive O[1] to O[6]; common WCLK and WED. (ug384_08_042309)]
[Figure 10: RAM64X1S, single-port 64 x 1-bit distributed RAM. One SPRAM64 LUT with data input D (DX), address A[5:0], WCLK, and WE; output O, optionally registered. (ug384_09_042309)]
If four single-port 64 x 1-bit modules are built, the four RAM64X1S primitives can occupy
a SLICEM, as long as they share the same clock, write enable, and shared read and write
port address inputs. This configuration equates to 64 x 4-bit single-port distributed RAM.
[Figure 11: RAM64X1D, dual-port 64 x 1-bit distributed RAM. Two DPRAM64 LUTs: the read/write port (D on DX, A[5:0], WCLK, WE) drives SPO, and the read-only port (DPRA[5:0]) drives DPO; both outputs are optionally registered. (ug384_10_042309)]
If two dual-port 64 x 1-bit modules are built, the two RAM64X1D primitives can occupy a
SLICEM, as long as they share the same clock, write enable, and shared read and write port
address inputs. This configuration equates to 64 x 2-bit dual-port distributed RAM.
[Figure 12: RAM64X1Q, quad-port 64 x 1-bit distributed RAM. Four DPRAM64 LUTs with data input DID (DX), address inputs ADDRD, ADDRC, ADDRB, and ADDRA, common WCLK and WE, and outputs DOD, DOC, DOB, and DOA, each optionally registered. (ug384_11_042309)]
[Figure 13: RAM64X3SDP, simple dual-port 64 x 3-bit distributed RAM. LUT D takes the write address WADDR[6:1] with its data inputs unused; LUTs C, B, and A take DATA[1] to DATA[3] and the read address RADDR[6:1], and drive O[1] to O[3]; common WCLK and WED. (ug384_12_042309)]
Implementation of distributed RAM configurations with depth greater than 64 requires the
usage of wide-function multiplexers (F7AMUX, F7BMUX, and F8MUX).
[Figure 14: RAM128X1S, single-port 128 x 1-bit distributed RAM. Two SPRAM64 LUTs with data input D (DX), address A[6:0], WCLK, and WE; A6 (CX) selects between them through F7BMUX; output optionally registered. (ug384_13_042309)]
If two single-port 128 x 1-bit modules are built, the two RAM128X1S primitives can occupy
a SLICEM, as long as they share the same clock, write enable, and shared read and write
port address inputs. This configuration equates to 128 x 2-bit single-port distributed RAM.
[Figure 15: RAM128X1D, dual-port 128 x 1-bit distributed RAM. Four DPRAM64 LUTs: two with address A[6:0] are combined through F7BMUX (select A6 on CX) to drive SPO, and two with read address DPRA[6:0] are combined through F7AMUX (select on AX) to drive DPO; outputs optionally registered. (ug384_14_042309)]
[Figure 16: RAM256X1S, single-port 256 x 1-bit distributed RAM. Four SPRAM64 LUTs with address A[7:0]; A6 (CX and AX) selects through F7BMUX and F7AMUX, and A7 (BX) selects through F8MUX; output O optionally registered. (ug384_15_042309)]
Distributed RAM configurations larger than the examples provided in Figure 8 through
Figure 16 require more than one SLICEM. There are no direct connections to form larger
distributed RAM configurations within a CLB or between slices.
Using distributed RAM for memory depths of 64 bits or less is generally more efficient
than block RAM in terms of resources, performance, and power. For depths greater than 64
bits but less than or equal to 128 bits, use the following guidelines:
• To conserve LUT resources, use any extra block RAM
• For asynchronous read capability, use distributed RAM
• For widths greater than 16 bits, use block RAM
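[Figure 17: SRLC32E, 32-bit shift register configuration. One SRL32 LUT with SHIFTIN (D) on DI1, address A[4:0] on A[6:2], and SHIFTOUT (Q31) on MC31. (ug384_16_042309)]
[Figure 18: shift register logic. The shift chain is clocked by CLK and enabled by WE; the address A[4:0] drives a multiplexer that selects output Q, and the end of the chain drives SHIFTOUT (Q31). (ug384_17_042309)]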
Figure 19 shows two 16-bit shift registers. The example shown can be implemented in a
single LUT.
[Figure 19: two 16-bit shift registers in one LUT. Two SRL16 elements share address A[3:0] (on A[5:2]), CLK, and CE (on WE); SHIFTIN1 (AX) enters on DI1 and SHIFTIN2 (AI) on DI2; outputs on O5 and O6. (ug384_18_042309)]
[Figure 20: 64-bit shift register configuration. Two SRL32 LUTs cascaded through MC31 with address A[5:0]; A5 (AX) selects output Q through F7AMUX; SHIFTOUT is Q63; output optionally registered (AQ). (ug384_19_042309)]
[Figure 21: 96-bit shift register configuration. SRL32 LUTs cascaded through MC31 with address A[6:0]; A5 (CX and AX) and A6 (BX) select output Q through F7BMUX, F7AMUX, and F8MUX; one LUT input is marked Not Used; output optionally registered (BQ). (UG384_20_042309)]
[Figure 22: 128-bit shift register configuration. Four SRL32 LUTs cascaded through MC31 with address A[6:0]; A5 (CX and AX) and A6 (BX) select output Q through F7BMUX, F7AMUX, and F8MUX; SHIFTOUT is Q127; output optionally registered (BQ). (ug384_21_042309)]
It is possible to create shift registers longer than 128 bits across more than one SLICEM.
However, there are no direct connections between slices to form these shift registers.
Multiplexers
Function generators and associated multiplexers in SLICEL or SLICEM can implement the
following:
• 4:1 multiplexers using one LUT
• 8:1 multiplexers using two LUTs
• 16:1 multiplexers using four LUTs
These wide-input multiplexers are implemented in one level of logic (or LUT) using the
dedicated F7AMUX, F7BMUX, and F8MUX multiplexers. These multiplexers allow LUT
combinations of up to four LUTs in a slice. Dedicated multiplexers can be automatically
inferred from the design, or the specific primitives can be instantiated. See WP309:
Targeting and Retargeting Guide for Spartan-6 FPGAs White Paper.
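A 4:1 multiplexer fits in one LUT because two select bits plus four data bits use exactly six inputs. The Python sketch below generates such a truth table and composes a 16:1 multiplexer from four of them. It is an illustration only: the pin assignment (select on the two lowest address bits, data on the upper four) and the function names are assumptions, not the mapping chosen by the tools.

```python
def mux4_init():
    """64-bit truth table for a 4:1 mux: address bits [1:0] select, [5:2] data."""
    init = 0
    for address in range(64):
        sel = address & 0b11
        data = address >> 2
        init |= ((data >> sel) & 1) << address
    return init


def lut6(init, address):
    return (init >> address) & 1


def mux16(data, sel):
    """16:1 mux: four LUT-based 4:1 muxes, then F7AMUX/F7BMUX, then F8MUX."""
    init = mux4_init()
    lut_out = [
        lut6(init, (((data >> (4 * k)) & 0xF) << 2) | (sel & 0b11))
        for k in range(4)
    ]
    sel_f7 = (sel >> 2) & 1
    sel_f8 = (sel >> 3) & 1
    f7a = lut_out[1] if sel_f7 else lut_out[0]   # 8:1 mux of data[7:0]
    f7b = lut_out[3] if sel_f7 else lut_out[2]   # 8:1 mux of data[15:8]
    return f7b if sel_f8 else f7a                # 16:1 mux


print([mux16(0x8001, s) for s in range(16)])
```

The intermediate f7a and f7b values correspond to the 8:1 case (two LUTs each), and the final selection to the 16:1 case (four LUTs).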
[Figure 23: four 4:1 multiplexers in a slice. Each LUT (A to D) takes SEL [1:0] and DATA [3:0] on A[6:1] and drives a 4:1 MUX output on O6, optionally registered. (ug384_22_042309)]
[Figure 24: two 8:1 multiplexers in a slice. Pairs of LUTs, each taking SEL [1:0] and DATA [3:0], are combined by the F7 multiplexers, selected by SELF7(1) on CX and SELF7(2) on AX; the F7AMUX output (AMUX) is 8:1 MUX Output (2), optionally registered (AQ). (ug384_23_042309)]
[Figure 25: 16:1 multiplexer in a slice. Four LUTs, each taking SEL [1:0] and DATA [3:0], are combined by the F7 multiplexers (SELF7 on AX) and the F8 multiplexer (SELF8 on BX). (ug384_24_042309)]
It is possible to create multiplexers wider than 16:1 across more than one SLICEM.
However, there are no direct connections between slices to form these wide multiplexers.
[Figure 26: carry chain and associated elements. In each stage, the LUT O6 output drives the S input of a MUXCY and the LUT O5 output or the AX/BX/CX input drives DI; the chain starts at CIN (from the previous slice) and produces sum outputs O0, O1, O2 and carry outputs CO0, CO1, CO2 on AMUX/AQ, BMUX/BQ, and CMUX/CQ, with optional registers. Note: the carry outputs can be used if unregistered/registered outputs are free. (ug384_25_042309)]
The carry chain provides carry lookahead logic together with the function generators.
There are ten independent inputs (S inputs – S0 to S3, DI inputs – DI0 to DI3, CYINIT and
CIN) and eight independent outputs (O outputs – O0 to O3, and CO outputs – CO0 to CO3).
The S inputs are used for the “propagate” signals of the carry lookahead logic. The
“propagate” signals are sourced from the O6 output of a function generator. The DI inputs
are used for the “generate” signals of the carry lookahead logic. The “generate” signals are
sourced from either the O5 output of a function generator or the BYPASS input (AX, BX,
[Figure 27: AND2B1L and OR2L symbols. (ug384_27_012710)]
As shown in Figure 28, the data and SR inputs and Q output of the latch are used when the
AND2B1L and OR2L primitives are instantiated, and the CK gate and CE gate enables are
held active High. The AND2B1L combines the latch data input (the inverted input on the
gate, DI) with the asynchronous clear input (SRI). The OR2L combines the latch data input
with an asynchronous preset. Generally, the latch data input comes from the output of a
LUT within the same slice, extending the logic capability to another external input. Since
there is only one SR input per slice, using more than one AND2B1L or OR2L per slice
requires a shared common external input.
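[Figure 28: OR2L implemented in a slice latch. The LUT6 O6 output (inputs IN[6:1]) drives the latch D input, CE and CK are tied to VCC, and SRI drives SR, with SRINIT1 and RESET TYPE = ASYNC; the latch Q is the gate output. (ug384_28_021610)]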
The AND2B1L and OR2L two-input gates save LUT resources and are initialized to a
known state on power-up and on GSR assertion. Using these primitives can reduce logic
levels and increase logic density of the device by trading register/latch resources for logic.
However, due to the static inputs required on the clock and clock enable inputs, specifying
one or more AND2B1L or OR2L primitives can cause register packing and density issues in
a slice, preventing the use of the remaining registers and latches.
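The latch-based description above can be turned into a small behavioral sketch: with the CK and CE gates held active the latch is transparent, so the output follows the data input unless SR forces the SRINIT value. The function names are hypothetical, and the sketch models only the steady-state logic, not timing or GSR.

```python
def transparent_latch(d, sr, srinit):
    """Slice latch with CK and CE held active: Q follows D unless SR is asserted."""
    return srinit if sr else d


def and2b1l(di, sri):
    # Asynchronous clear (SRINIT0): SRI High forces the output Low
    return transparent_latch(di, sri, srinit=0)


def or2l(di, sri):
    # Asynchronous preset (SRINIT1): SRI High forces the output High
    return transparent_latch(di, sri, srinit=1)


for di in (0, 1):
    for sri in (0, 1):
        print(di, sri, and2b1l(di, sri), or2l(di, sri))
```

In this model the AND2B1L output is High only when DI is High and SRI is Low, and the OR2L output is High when either input is High.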
Interconnect Resources
Interconnect is the programmable network of signal pathways between the inputs and
outputs of functional elements within the FPGA, such as IOBs, CLBs, DSP slices, and block
RAM. Interconnect, also called routing, is segmented for optimal connectivity. The Xilinx
Place and Route (PAR) tool within the ISE Design Suite software exploits the rich
interconnect array to deliver optimal system performance and the fastest compile times.
Most of the interconnect features are transparent to FPGA designers. Knowledge of the
interconnect details can be used to guide design techniques but is not necessary for
efficient FPGA design. Only selected types of interconnect are under user control. These
include the clock routing resources, which are selected by using clock buffers, and
discussed in more detail in the Spartan-6 FPGA Clocking Resources User Guide. Two global
control signals, GTS and GSR, are selected by using the STARTUP_SPARTAN6 primitive,
which is described in Global Controls. Knowledge of the general-purpose routing
resources is helpful when considering floorplanning the layout of a design.
The various types of routing in the Spartan-6 architecture are primarily defined by their
length (Figure 30). Longer routing elements are faster for longer distances.
Fast Interconnects
Fast connects route block outputs back to block inputs. Along with the larger size of the
CLB, fast connects provide higher performance for simpler functions.
Single Interconnects
Singles route signals to neighboring tiles, both vertically and horizontally.
Double Interconnects
Doubles connect to every other tile, both horizontally and vertically, in all four directions,
and to the diagonally adjacent tiles.
Quad Interconnects
Quads connect to one out of every four tiles, horizontally and vertically, and diagonally to
tiles two rows and two columns distant. Quad lines provide more flexibility than the
single-channel long lines of earlier generations.
[Figure 30: types of routing by length: fast, single, double, and quad interconnects. (UG384_30_012710)]
Use the GSR control in a design instead of a separate global reset signal to make CLB
inputs available, which results in a smaller, more efficient design. The GSR signal must
always re-initialize every flip-flop. The GSR signal is asserted automatically during the
FPGA configuration process, guaranteeing that the FPGA starts up in a known state.
Using GSR and GTS does not use any general-purpose routing resources.
STARTUP_SPARTAN6 Primitive
The GSR and GTS signal sources are defined and connected using the
STARTUP_SPARTAN6 primitive. This primitive allows the user to define the source of
these dedicated nets. GSR and GTS are always active during configuration, and connecting
signals to them on the STARTUP primitive defines how they are controlled after
configuration. By default, they are disabled after configuration on a selected clock cycle of
the start-up phase, enabling the flip-flops and I/Os in the device. The STARTUP primitive
also includes other signals used specifically during configuration. For more information,
read the Spartan-6 FPGA Configuration User Guide.
Interconnect Summary
The flexible interconnect resources of the Spartan-6 family allow efficient implementation
of almost any configuration of logic and I/O resources. The ISE software automatically
places and routes designs to take best advantage of these general-purpose resources.
Dedicated resources for clocks are used when clock buffers are used in a design. Dedicated
resources for global set/reset and global 3-state are controlled by using the
STARTUP_SPARTAN6 primitive.
Floorplanning
Floorplanning is the process of specifying user-placement constraints. Floorplanning can
be done either before or after automatic place and route, but running automatic place and
route first is always recommended before specifying user floorplanning. The PlanAhead
Design Analysis tool provides a graphical view of placement, and helps the designer make
choices between RTL coding and synthesis and implementation, with extensive design
exploration and analysis features. More information on the PlanAhead tool is available at:
http://www.xilinx.com/tools/planahead.htm.
For floorplanning and design analysis it is important to understand the general layout and
naming designations for the CLB resources. As shown in Figure 2, CLBs each contain two
[Figure 31: device floorplan view showing GTP transceivers, the integrated block for PCI Express, IOB banks, IOB cells, IOI cells, the memory controller block, block RAM columns, DSP columns, and clock management tile columns. (UG384_31_012710)]
[Figure 32: simplified slice. Four LUTs (A to D) with six inputs each and outputs O6 and O5; F7AMUX, F7BMUX, and F8MUX; outputs A to D and AMUX to DMUX; FF/LAT elements driving AQ to DQ; bypass inputs AX to DX; common CE, CLK, and SR. (ug384_26_042309)]
Notes:
1. This parameter includes a LUT configured as two five-input functions.
2. TXXCK = Setup Time (before clock edge), and TCKXX = Hold Time (after clock edge).
Timing Characteristics
Figure 33 illustrates the general timing characteristics of a Spartan-6 FPGA slice.
[Figure 33: general slice timing characteristics. Waveforms for CLK (clock events 1 to 3), CE (TCEO), AX/BX/CX/DX data (TDICK), SR reset (TSRCK), and AQ/BQ/CQ/DQ outputs (TCKO). (ug384_27_042309)]
• At time TCEO before clock event (1), the clock-enable signal becomes valid-high at the
CE input of the slice register.
• At time TDICK before clock event (1), data from the AX, BX, CX, or DX input
becomes valid-high at the D input of the slice register and is reflected on the AQ,
BQ, CQ, or DQ pin at time TCKO after clock event (1).
• At time TSRCK before clock event (3), the SR signal (configured as synchronous reset)
becomes valid-high, resetting the slice register. This is reflected on the AQ, BQ, CQ, or
DQ pin at time TCKO after clock event (3).
[Figure 34: simplified SLICEM distributed RAM. Four RAM LUTs (A to D), each with data inputs DI1 (AX to DX) and DI2 (AI to DI), address inputs A and WA, common CLK and WE, and outputs O6 (A to D) and O5 (AMUX to DMUX). (ug384_29_042309)]
Notes:
1. This parameter includes a LUT configured as a two-bit distributed RAM.
2. TXXCK = Setup Time (before clock edge), and TCKXX = Hold Time (after clock edge).
3. Parameter includes AI/BI/CI/DI configured as a data input (DI2).
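[Figure 35: slice distributed RAM timing. Waveforms over clock events 1 to 7 for CLK (TMCP, TMPW), address A/B/C/D (TAS), data input AX/BX/CX/DX (TDS), WE (TWS), and data output (TSHCKO, TILO). Operations shown: WRITE, READ, WRITE, WRITE, WRITE, READ at addresses 2, F, 3, 4, 5, E, with data inputs 1, X, 0, 1, 0, X and outputs 1, MEM(F), 0, 1, 0, MEM(E). (ug384_29_021610)]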
[Figure 36: simplified SLICEM shift register. Four SRL LUTs (A to D), each with data input DI1 (AX to DX), address input A, common CLK and WE, output O6 (A to D), and cascade output MC31 (to DMUX). (ug384_30_042309)]
Notes:
1. This parameter includes a LUT configured as a two-bit shift register.
2. TXXCK = Setup Time (before clock edge), and TCKXX = Hold Time (after clock edge).
3. Parameter includes AI/BI/CI/DI configured as a data input (DI2) or two bits with a common shift.
[Figure 37: slice shift register timing. Waveforms over clock events 1 to 32 for CLK, write enable WE (TWS), Shift_In DI (TDS), address A/B/C/D, data out A/B/C/D (TREG, TILO), and MSB on MC31/DMUX (TREG). (ug384_31_042309)]
Notes:
1. TXXCK = Setup Time (before clock edge), and TCKXX = Hold Time (after clock edge).
[Figure 38: slice carry-chain timing. Waveforms for CLK (clock events 1 to 3), CIN data (TCINCK), SR reset (TSRCK), and AQ/BQ/CQ/DQ outputs (TCKO). (ug384_32_042309)]
• At time TCINCK before clock event 1, data from CIN input becomes valid-high at the D
input of the slice register. This is reflected on any of the AQ/BQ/CQ/DQ pins at time
TCKO after clock event 1.
• At time TSRCK before clock event 3, the SR signal (configured as synchronous reset)
becomes valid-high, resetting the slice register. This is reflected on any of the
AQ/BQ/CQ/DQ pins at time TCKO after clock event 3.
The input and output data are 1-bit wide (with the exception of the 32-bit RAM).
Figure 39 shows generic single-port, dual-port, and quad-port distributed RAM
primitives. The A, ADDR, and DPRA signals are address buses.
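[Figure 39: single-port, dual-port, and quad-port distributed RAM primitives. Surviving labels: read port DPRA[#:0] with output DPO, and read port ADDRC[#:0] with output DOC[#:0]. (ug384_33_042309)]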
Wide memory blocks can be implemented by instantiating several distributed RAM
primitives.
Port Signals
Each distributed RAM port operates independently of the other while reading the same set
of memory cells.
Clock – WCLK
The clock is used for the synchronous write. The data and the address input pins have
setup times referenced to the WCLK pin.
Enable – WE/WED
The enable pin affects the write functionality of the port. An inactive write enable prevents
any writing to memory cells. An active write enable causes the clock edge to write the data
input signal to the memory location pointed to by the address inputs.
Data In – D, DID[#:0]
The data input D (for single-port and dual-port) and DID[#:0] (for quad-port) provide the
new data value to be written into the RAM.
[Figure 40: SRLC32E primitive with inputs D, A[4:0], CE, and CLK and outputs Q and Q31. (ug384_34_042309)]
Instantiating several 32-bit shift registers with dedicated multiplexers (F7AMUX, F7BMUX,
and F8MUX) allows a cascadable shift register chain of up to 128 bits in a slice. Figure 20
through Figure 22 in the Shift Registers (SLICEM only) section of this document illustrate
the various implementations of cascadable shift registers greater than 32 bits.
Port Signals
Clock – CLK
Either the rising edge or the falling edge of the clock is used for the synchronous shift
operation. The data and clock enable input pins have setup times referenced to the chosen
edge of CLK.
Data In – D
The data input provides new data (one bit) to be shifted into the shift register.
Clock Enable – CE
The clock enable pin affects shift functionality. An inactive clock enable pin does not shift
data into the shift register and does not write new data. Activating the clock enable allows
the data in (D) to be written to the first location and all data to be shifted by one location.
When available, new data appears on output pins (Q) and the cascadable output pin (Q31).
Address – A[4:0]
The address input selects the bit (range 0 to 31) to be read. The nth bit is available on the
output pin (Q). Address inputs have no effect on the cascadable output pin (Q31). It is
always the last bit of the shift register (bit 31).
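The port descriptions above can be summarized in a short behavioral sketch. It is not a Xilinx simulation model: the class and method names are hypothetical, the contents are assumed to start at zero, and the clock edge is abstracted into a single method call.

```python
class Srlc32e:
    """Behavioral sketch of a 32-bit addressable shift register."""

    def __init__(self):
        # bits[0] is the most recently shifted-in bit, bits[31] the oldest
        self.bits = [0] * 32     # assumption: contents start at zero

    def clock_edge(self, d, ce):
        """On the active clock edge, shift only when CE is active."""
        if ce:
            self.bits = [d & 1] + self.bits[:31]

    def q(self, a):
        """Addressable output: bit a (0 to 31), read without a clock."""
        return self.bits[a]

    def q31(self):
        """Cascadable output: always bit 31, independent of the address."""
        return self.bits[31]


srl = Srlc32e()
srl.clock_edge(d=1, ce=1)        # write a 1 into the first location
for _ in range(31):
    srl.clock_edge(d=0, ce=1)    # shift it down to the last location
print(srl.q(0), srl.q(31), srl.q31())
```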
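[Figure 41: synchronous shift register output. The SRLC32E output Q drives the D input of a flip-flop that shares CE (write enable) and CLK; the flip-flop Q is the synchronous output, and Q31 remains available. (ug384_35_042309)]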
This configuration provides a better timing solution and simplifies the design. Because the
flip-flop must be considered to be the last register in the shift-register chain, the static or
dynamic address should point to the desired length minus one. If needed, the cascadable
output can also be registered in a flip-flop.
[Figure 42: two 72-bit shift registers, each built from three SRLC32E LUTs cascaded from Q31 to D. Without an output flip-flop, the last SRLC32E uses A[4:0] = 00111; with an output flip-flop, it uses A[4:0] = 00110. (ug384_36_042309)]
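The address rule, and the two 72-bit examples in the figure above (A[4:0] = 00111 without the output flip-flop, 00110 with it), can be reproduced with a small calculation. The function below is a hypothetical helper for illustration; it assumes every SRLC32E except the last is used at its full 32-bit length through Q31.

```python
def srl_chain(length, registered_output=False):
    """Return (number of SRLC32E primitives, A[4:0] value of the last one)."""
    minimum = 2 if registered_output else 1
    if length < minimum:
        raise ValueError("length too short for this configuration")
    # An output flip-flop counts as the last register in the chain
    srl_bits = length - 1 if registered_output else length
    full, address = divmod(srl_bits - 1, 32)
    return full + 1, address


count, address = srl_chain(72)
print(count, format(address, "05b"))
count, address = srl_chain(72, registered_output=True)
print(count, format(address, "05b"))
```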
Multiplexer Primitives
Two primitives (MUXF7 and MUXF8) are available for access to the dedicated F7AMUX,
F7BMUX and F8MUX in each SLICEM or SLICEL. Combined with LUTs, these multiplexer
primitives are also used to build larger width multiplexers (from 8:1 to 16:1). The
Designing Large Multiplexers section provides more information on building larger
multiplexers.
Port Signals
Data In – I0, I1
The data input provides the data to be selected by the select signal (S).
Control In – S
The select input signal determines the data input signal to be connected to the output O.
Logic 0 selects the I0 input, while logic 1 selects the I1 input.
Data Out – O
The data output O provides the data value (one bit) selected by the control inputs.
Port Signals
Sum Outputs – O[3:0]
The sum outputs provide the final result of the addition/subtraction.
Carry In – CI
The carry-in input is used to cascade slices to form a longer carry chain. To create a longer
carry chain, the CO[3] output of another CARRY4 is simply connected to this pin.
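To illustrate how the sum and carry outputs relate to the propagate (S) and generate (DI) inputs, the sketch below models one four-bit carry stage and uses it as an adder. It is a behavioral approximation with hypothetical function names: the CYINIT input is not modeled, and using one operand as the generate signal is the usual adder mapping, assumed here rather than taken from this guide.

```python
def carry4(s, di, ci=0):
    """s, di: lists of four bits, index 0 first. Returns (o, co) as lists."""
    o, co = [], []
    carry = ci
    for i in range(4):
        o.append(s[i] ^ carry)               # sum: propagate XOR incoming carry
        carry = carry if s[i] else di[i]     # MUXCY: pass carry or take generate
        co.append(carry)
    return o, co


def add4(a, b, ci=0):
    """Four-bit addition built on carry4; returns (sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    s = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate, as from LUT O6
    o, co = carry4(s, a_bits, ci)                 # generate: one operand bit
    return sum(bit << i for i, bit in enumerate(o)), co[3]


low, carry_out = add4(0b1011, 0b0110)
high, _ = add4(0b0001, 0b0010, ci=carry_out)   # CO[3] feeds the next CI
print(low, carry_out, high)
```

The second call shows the cascade described above: the CO[3] value of one stage is passed as the carry in of the next.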