Vlsi Script0607
Design of Integrated
Circuits
Wintersemester 2006/2007
http://www.microelectronic.e-technik.tu-darmstadt.de
Institute of
Microelectronic
Systems
Organizational (I)
• This lecture is intended for students of the following subjects:
– Industrial Engineering with Electrical Engineering (FB1, from the 5th semester)
– Electrical Engineering and Information Technology (FB18, from the 5th semester)
– Computer Science (FB20, after the Vordiplom)
– Intern. Master Program Information & Communication Engineering
– Master Program „Informations- und Kommunikationstechnik“
Organizational (II)
Lecture:
Monday 14:25 - 16:05 in room S3|06/051 (former 48/051)
Wednesday 11:40 - 13:20 in room S3|06/053 (former 48/053)
Practice:
The exercises will take place within the lecture hours (Mon. or Wed.)
Attending Staff:
Dr.-Ing. Thomas Hollstein, Room S4|04/209, Tel. 16-4038
Dipl.-Ing. Heiko Hinkelmann, Room S4|04/207, Tel. 16-4238
Dipl.-Ing. Petru Bacinschi, Room S4|04/201, Tel. 16-4439
(Building “Sitte”, Karlstr. 15)
Consultation hours:
On request
Exam
Diploma Exam:
Overview
Literature
[1] John P. Uyemura: Fundamentals of MOS Digital Integrated
Circuits, Addison Wesley, 1988
1. Introduction
Transistors/cm^2: 40 M, 100 M
DRAM bits/chip: 17.2 G, 275 G
Number of wiring levels: 7-8, 9
[Figure: scaling trends, including Vt, versus MOSFET channel length (0.02 to 1 µm)]
ASIC Outlook 1997: Semiconductor and Electronic
Equipment Sales Trends (1992 - 2001)
Interconnect
Technology Requirements:
• Inductive effects will become increasingly important
• Additional metal patterns or ground planes for inductive shielding
• Thinner metallization
• Lower line-to-line capacitance
• Increasing pitch and thickness at each conductor level to alleviate the impact of interconnect delay
[Figure: interconnect cross-section with global, intermediate and local wiring levels; labels: passivation, dielectric, etch stop layer, dielectric diffusion barrier, copper conductor with metal barrier liner, pre-metal dielectric, tungsten contact plug]
Source: SIA Roadmap 1999
Productivity Gap: Technology vs. CAD
Productivity Gap: Beyond 2008
Design flow (figure labels): RTL-Synthesis (Synopsys) → Gate-Level Netlist → Placement & Routing (Cadence/Mentor) → ASIC Layout → Production
RTL example (VHDL):
begin
  delay_register:
  process(reset,clk)
  begin
    if reset='1' then
      x_q <= (others => '0');
    elsif (clk'event and clk='1') then
      x_q <= x_in;
    end if;
  end process;
Challenge: System-on-a-Chip Design ?
[Figure: design complexity rising from gates over RTL synthesis to reuse of IP cores and system on a chip]
[Figure: IC categories, including custom IC (all layers customised), programmable circuits (with fuse, antifuse or memory that can be programmed) and ASIP (application specific)]
SoC: Silicon Components Categories
Silicon components
• Discrete devices
• Integrated circuits and optoelectronics
  – Analog and mixed signal
  – Logic: Logic, Gate arrays, Cell based, FPLDs, SoC, Other
  – Memory: DRAMs, SRAMs, Flash, Other
  – Microcomponents: Microprocessors, Microcontrollers, Microperipherals
Area Examples:
-> Domain Specific Computing
Multimedia
Mobile Communication SoC
Automotive
...
Application: Single-Chip Integrated CMOS Radio
Berkeley Wireless Centre
[Figure: block diagram of the single-chip radio. Receiver: low noise amplifier, filter, mixer, oscillator, demodulator, AD/DA converter. Digital baseband: memory and CMOS logic. Transmitter: AD/DA converter, modulator, mixer, oscillator, filter, power amplifier.]
2. Repetition Transistor Models
Structure of MOSFET
[Figure: NMOS cross-section with terminals Gate (G), Source (S), Drain (D) and Body (B); n+ source and drain regions, channel region and p-type substrate; terminal voltages vS, vG, vD, vB and currents iS, iG, iD, iB; circuit symbol]
Inversion
Ohmic region
• Increasing VDS to a value VDS > 0 leads to a current ID.
• Near the drain the voltage responsible for the inversion is (VGS - VT) - VDS and thus smaller than near the source.
• The channel acts like a linear resistor - that’s why this region of operation is called ohmic.
[Figure: drain-source current (A), 0 to 8e-4, versus drain-source voltage (V), 0 to 0.8, for VGS = 2 V, 3 V, 4 V and 5 V]
Pinch - off
Saturation
Output Characteristics
[Figure: drain-source current (A), 0 to 2.2e-4, versus drain-source voltage (V), 0 to 12, for VGS = 2 V, 3 V, 4 V, 5 V and VGS < 1 V; the pinchoff locus separates the linear region from the saturation region]
• VT = 1V
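The three regions of the output characteristics can be sketched numerically. This is a minimal sketch that assumes the standard long-channel square-law model, which the extracted slides do not spell out. The function name `drain_current` and the value `k = 2e-5 A/V^2` are illustrative assumptions; only VT = 1 V is taken from the plot.

```python
def drain_current(vgs, vds, vt=1.0, k=2e-5):
    """Long-channel NMOS drain current in A (square-law model, assumed).

    k = K' * W/L in A/V^2 is an illustrative value, not from the slides.
    """
    vov = vgs - vt                      # gate overdrive
    if vov <= 0:
        return 0.0                      # cutoff: no inversion channel
    if vds < vov:
        # linear (ohmic) region, left of the pinch-off locus VDS = VGS - VT
        return k * (vov - vds / 2.0) * vds
    # saturation region (channel-length modulation neglected)
    return 0.5 * k * vov ** 2


for vgs in (2, 3, 4, 5):
    currents = [drain_current(vgs, vds) for vds in (0, 2, 4, 8, 12)]
    print(vgs, ["%.1e" % i for i in currents])
```

The pinch-off locus corresponds to the branch condition `vds < vov`: both branches give the same current at VDS = VGS - VT, so the curve is continuous there.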
Transfer Characteristics and Depletion Mode
MOSFET
[Figure: PMOS transistor cross-section (n-type substrate, channel length L; vG < 0, vD < 0, vB > 0) and output characteristics: current (A) versus source-drain voltage (V), -2 to 12, for VSG = 2 V, 3 V, 4 V, 5 V (VGS = -2 V to -5 V) and VSG < 1 V (VGS > -1 V)]
IEEE Standard MOS Transistor Circuit
Symbols
(a) NMOS enhancement-mode device (b) PMOS enhancement-mode device
(c) NMOS depletion-mode device (d) PMOS depletion-mode device
(e) Three-terminal NMOS transistor (f) Three-terminal PMOS transistor
[Figure: NMOS device in the linear region: n-type channel between the n+ regions in the p-type substrate, with capacitances CSB and CDB to the bulk]
MOS Capacitances - Cutoff
The gate-bulk capacitance consists of the gate capacitance in series with the
depletion capacitance of the depletion region.
[Figure: gate, source and drain with overlap capacitances C'OL, gate-bulk capacitance CGB and junction capacitances CSB and CDB]
[Figure: MOSFET as a two-port with vgs, vds, ig and id, and its small-signal model with the controlled source gm·vgs and the output resistance ro between D and S]
[Figure: small-signal model including the body terminal B: controlled sources gm·vgs and gmb·vbs and output resistance ro]
High-Frequency MOSFET
Small Signal Model
[Figure: small-signal model with gm·vgs, gmb·vbs and ro, extended by the capacitances CGD, CGS, CGB, CBD, CBS and the series resistances RD and RS between the internal nodes D, S and the terminals D*, S*]
3. Short Channel Effects on MOS Transistors
Overview
• Short Channel Devices
• Velocity Saturation Effect
• Threshold Voltage Variations
• Hot Carrier Effects
• Process Variations
Short Channel Devices.
$Q_i(x) = -C_{OX}[V_{GS} - V(x) - V_T]$ (1)
$I_D = -v_n(x) Q_i(x) W$ (2)
[Figure: channel of length L along the x axis on a p-substrate]
Velocity Saturation Effect (II)
[Figure: carrier velocity vn (m/s)]
Velocity Saturation Effect (IV)
$v = \frac{\mu_n E}{1 + E/E_C}$ for $E \le E_C$ (6)
$v = v_{sat}$ for $E \ge E_C$
Reevaluating (1) and (2) using (6):
• For large values of L or small values of VDS, κ approaches 1 and (7) reduces to (5).
• For short channel devices κ<1 and the current is smaller than what would be expected.
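The velocity model of eq. (6) can be written as a small function. This is a minimal sketch: the name `carrier_velocity` and the numerical values are illustrative assumptions. The two branches meet continuously at E = E_C only if E_C = 2·v_sat/µn, and the example values are chosen to satisfy that relation (an assumption added here, not stated on the slide).

```python
def carrier_velocity(e_field, mu_n, e_crit, v_sat):
    """Piecewise velocity model of eq. (6), SI units."""
    if e_field <= e_crit:
        # below the critical field: mobility-limited with gradual roll-off
        return mu_n * e_field / (1.0 + e_field / e_crit)
    # above the critical field: velocity saturated
    return v_sat


# Illustrative values (assumptions): mu_n in m^2/Vs, v_sat in m/s
MU_N = 0.05
V_SAT = 1.0e5
E_CRIT = 2.0 * V_SAT / MU_N   # makes both branches meet at E = E_C

for e in (1e3, 1e5, 1e6, 4e6, 1e7):
    print(e, carrier_velocity(e, MU_N, E_CRIT, V_SAT))
```

At low fields the result approaches µn·E, which is the long-channel assumption behind eq. (1) and (2).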
[Figure: ID versus VDS for a short-channel device, with VDSAT and VGS-VT marked on the VDS axis]
$V_{DSAT} = L E_C = \frac{L v_{sat}}{\mu_n}$ (11)
Under these conditions the equation for the current in the linear
region remains unchanged from the long channel model. The
value for IDSAT is found by substituting eq. (11) in (5).
Simplified model for hand calculations (II)
$I_{DSAT} = \mu_n C_{OX} \frac{W}{L} \left[ (V_{GS} - V_T) V_{DSAT} - \frac{V_{DSAT}^2}{2} \right]$
$I_{DSAT} = v_{sat} C_{OX} W \left[ (V_{GS} - V_T) - \frac{V_{DSAT}}{2} \right]$ (12)
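The two forms of eq. (12) can be checked against each other numerically. This is a minimal sketch: the function names and all device values are illustrative assumptions (only W/L = 1.5 echoes a ratio used elsewhere in this chapter).

```python
def vdsat(length, v_sat, mu_n):
    """Eq. (11): VDSAT = L * E_C = L * v_sat / mu_n."""
    return length * v_sat / mu_n


def idsat_mobility_form(vgs, vt, width, length, mu_n, cox, v_sat):
    """First form of eq. (12): linear-region current evaluated at VDS = VDSAT."""
    vd = vdsat(length, v_sat, mu_n)
    return mu_n * cox * (width / length) * ((vgs - vt) * vd - vd ** 2 / 2.0)


def idsat_velocity_form(vgs, vt, width, length, mu_n, cox, v_sat):
    """Second form of eq. (12): the same current written with v_sat."""
    vd = vdsat(length, v_sat, mu_n)
    return v_sat * cox * width * ((vgs - vt) - vd / 2.0)


# Illustrative device values in SI units (assumptions, not from the slides)
params = dict(vgs=2.5, vt=0.43, width=0.375e-6, length=0.25e-6,
              mu_n=0.04, cox=6e-3, v_sat=1e5)

print(vdsat(params["length"], params["v_sat"], params["mu_n"]))
print(idsat_mobility_form(**params), idsat_velocity_form(**params))
```

Both functions return the same value because substituting VDSAT = L·v_sat/µn into the first form cancels the factor µn/L.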
ID-VGS characteristic for long- and short
channel devices both with W/L=1.5
$V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$ (11)
• Eq. (11) states that the threshold voltage is only a function of the technology and the applied body bias VSB
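The body-bias dependence of the threshold voltage can be evaluated directly. This is a minimal sketch that assumes the square-root form of the body-effect equation; the function name `threshold_voltage` is hypothetical, `two_phi_f` stands for the magnitude |2φF|, and the defaults (VT0 = 1 V, γ = 0.5 V^1/2, 2φF = 0.6 V) are borrowed from the NMOS column of the CMOS transistor parameter table in chapter 5.

```python
import math


def threshold_voltage(vsb, vt0=1.0, gamma=0.5, two_phi_f=0.6):
    """NMOS threshold voltage with body bias VSB >= 0 (all values in V)."""
    return vt0 + gamma * (math.sqrt(two_phi_f + vsb) - math.sqrt(two_phi_f))


for vsb in (0.0, 1.0, 2.0, 5.0):
    print(vsb, round(threshold_voltage(vsb), 3))
```

For VSB = 0 the bracket vanishes and the function returns VT0; a larger source-to-body voltage raises the threshold.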
Threshold Voltage Variations (II)
[Figure: threshold voltage VT as a function of channel length L and as a function of VDS]
Process Variations.
Impact of Device Variations.
[Figure: two plots of delay (nsec), 1.50 to 2.10, with horizontal axes from 1.10 to 1.60 and from -0.90 to -0.50]
4. SPICE LEVEL 1 MOSFET MODEL
Layout and cross section of an n-well CMOS technology.
$I_{DS} = 0$ for $V_{GS} \le V_{TH}$
$I_{DS} = \frac{KP}{2} \frac{W}{L_{eff}} (V_{GS} - V_{TH})^2 (1 + LAMBDA \cdot V_{DS})$ for $0 \le V_{GS} - V_{TH} \le V_{DS}$
$V_{TH} = VT0 + GAMMA \left( \sqrt{2 \cdot PHI - V_{BS}} - \sqrt{2 \cdot PHI} \right)$
and the channel length:
$L_{eff} = L - 2 \cdot LD$
Where L is the length of the polysilicon gate and LD is the gate
overlap of the source and drain.
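The Level 1 drain current with the effective channel length can be sketched as follows. This is a minimal sketch: the function name `ids_level1` and the example values are assumptions, the threshold voltage is passed in directly, and the linear-region branch uses the usual Level 1 expression, which is not shown in the extracted slide text.

```python
def ids_level1(vgs, vds, vth, kp, width, length, ld, lam):
    """Level 1 style drain current (A) for an NMOS device, vds >= 0."""
    leff = length - 2.0 * ld            # effective channel length
    vov = vgs - vth
    if vov <= 0:
        return 0.0                      # cutoff: VGS <= VTH
    if vds >= vov:
        # saturation: 0 <= VGS - VTH <= VDS
        return 0.5 * kp * (width / leff) * vov ** 2 * (1.0 + lam * vds)
    # linear region (assumed standard form, not in the extracted slide)
    return kp * (width / leff) * (vov - vds / 2.0) * vds * (1.0 + lam * vds)


# Illustrative values (assumptions): W = 10 um, L = 2 um, LAMBDA = 0.1/L(um)
print(ids_level1(vgs=3.0, vds=5.0, vth=1.0, kp=50e-6,
                 width=10e-6, length=2e-6, ld=0.0, lam=0.05))
```

A nonzero LD shortens Leff and therefore increases the current for the same drawn gate length.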
The elements in the large signal MOSFET model are shown in
the following figure.
Parameter name                                   SPICE symbol   Equation symbol   Units
Lateral diffusion / gate-source overlap          LD             LD                m
Transconductance parameter                       KP             µnCOX             A/V^2
Threshold voltage / zero-bias threshold          VTO            VTO               V
Channel-length modulation parameter              LAMBDA         λn                V^-1
Bulk threshold / backgate effect parameter       GAMMA          γn                V^1/2
Surface potential / depletion drop in inversion  PHI            -φP               V
Specifying MOSFET Geometry in SPICE.
LEVEL 1 MOSFET MODEL PARAMETERS.
$C_{BD}(V_{BD}) = \frac{CBD}{(1 - V_{BD}/PB)^{MJ}}$
$C_{BS}(V_{BS}) = \frac{CBS}{(1 - V_{BS}/PB)^{MJ}}$
LEVEL 1 MOSFET MODEL PARAMETERS.
$C_{BD}(V_{BD}) = \frac{CJ \cdot AD}{(1 - V_{BD}/PB)^{MJ}} + \frac{CJSW \cdot PD}{(1 - V_{BD}/PB)^{MJSW}}$
$C_{BS}(V_{BS}) = \frac{CJ \cdot AS}{(1 - V_{BS}/PB)^{MJ}} + \frac{CJSW \cdot PS}{(1 - V_{BS}/PB)^{MJSW}}$
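The area and sidewall contributions of the junction capacitance can be computed separately and added. This is a minimal sketch: the function name `junction_capacitance` and every numerical value below are illustrative assumptions, not parameters of a real process.

```python
def junction_capacitance(v, area, perimeter, cj, cjsw, pb, mj, mjsw):
    """Bulk junction capacitance (F) at junction voltage v (V, negative for reverse bias).

    area      -- AD or AS in m^2
    perimeter -- PD or PS in m
    """
    bottom = cj * area / (1.0 - v / pb) ** mj
    sidewall = cjsw * perimeter / (1.0 - v / pb) ** mjsw
    return bottom + sidewall


# Illustrative values (assumptions): 10 um x 10 um diffusion region
AREA = 1e-10          # m^2
PERIMETER = 4e-5      # m
MODEL = dict(cj=1e-4, cjsw=5e-10, pb=0.8, mj=0.5, mjsw=0.33)

for v in (0.0, -1.0, -3.2):
    print(v, junction_capacitance(v, AREA, PERIMETER, **MODEL))
```

With increasing reverse bias both terms shrink, so the zero-bias value is the largest capacitance the reverse-biased junction presents.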
Bottom=ABCD
Sidewall=ABEF+BCFG+DCGH+ADEH
Example of MOSFET model parameter values.
Parameter name                               N-channel MOSFET   P-channel MOSFET   Units
Transconductance parameter KP                50 x 10^-6         25 x 10^-6         A/V^2
Channel-length modulation parameter LAMBDA   0.1/L (L in µm)    0.1/L (L in µm)    V^-1
Gate-drain overlap capacitance CGDO          5 x 10^-10         5 x 10^-10         F/m
Gate-source overlap capacitance CGSO         5 x 10^-10         5 x 10^-10         F/m
5. CMOS Inverter
Overview
• Logic levels
• Noise Margin
• CMOS Inverter
– static behaviour
– dynamic behaviour
Inverter as simplest logic gate
[Figure: inverter with input vI and output vO between the supplies V+ and V-; inverter realisations with load resistor R and a switch, a MOS transistor M (supply VDD) or a bipolar transistor Q (supply VCC)]
NML: Noise margin associated with a low input level
NMH = VOH - VIH
[Figure: logic levels "1" and "0" of vO and vI between V+ and V-, with VOH, VIH and VOL and the noise margins NMH and NML]
• NMOS switching device MS designed to force vO to VOL
• Resistor load R to pull the output up toward the power supply VDD
• VOH = VDD (driver in cut off ⇒ iD = 0)
• VOL determined by W/L ratio of MS
[Figure: resistor-load NMOS inverter with VDD = 5 V, load R, driver MS, input vI, output vO = vDS and drain current iD]
Example
[Figure: (a) vI = VOL < VTH: MS is off, iDD = 0 and vO = VOH = 5 V; (b) vI = VOH = 5 V: MS with W/L = 2.06/1 carries 50 µA through R = 95 kΩ and vO = VOL with vDS = 0.25 V; VDD = 5 V in both cases]
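On - Resistance
[Figure: (a) vI = VOL, output VOH; (b) vI = VOH, output VOL; the driver is represented by its on-resistance Ron in series with the load R]
$R_{on} = \frac{v_{DS}}{i_D} = \frac{1}{K'_n \frac{W}{L} \left( v_{GS} - V_{TN} - \frac{v_{DS}}{2} \right)}$
$V_{OL} = V_{DD} \frac{R_{on}}{R_{on} + R} = V_{DD} \frac{1}{1 + \frac{R}{R_{on}}}$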
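The on-resistance and the resulting low output level can be checked with the numbers of the preceding example (R = 95 kΩ, W/L = 2.06, vDS = 0.25 V, VDD = 5 V). This is a minimal sketch: the function names are hypothetical, and K'n = 25 µA/V^2 and VTN = 1 V are assumed from the CMOS transistor parameter table later in this chapter.

```python
def on_resistance(kn_prime, w_over_l, vgs, vtn, vds):
    """On-resistance (ohm) of the NMOS driver in the linear region."""
    return 1.0 / (kn_prime * w_over_l * (vgs - vtn - vds / 2.0))


def output_low(vdd, r_load, r_on):
    """VOL from the voltage divider formed by the load R and Ron."""
    return vdd * r_on / (r_on + r_load)


r_on = on_resistance(kn_prime=25e-6, w_over_l=2.06, vgs=5.0, vtn=1.0, vds=0.25)
v_ol = output_low(vdd=5.0, r_load=95e3, r_on=r_on)
print(r_on, v_ol)
```

The computed VOL comes out close to the 0.25 V assumed for vDS, so the example values are self-consistent.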
(a) NMOS inverter with gate of the load device connected to its source
(b) NMOS inverter with gate of the load device grounded
[Figure: both inverters with load device ML, switching device MS, supply VDD, input vI and output vO; a gate bias VGG is also labelled]
CMOS Inverter Technology
[Figure: cross-section of the CMOS inverter between VSS (0 V) and VDD (5 V): NMOS transistor in the p-type substrate, PMOS transistor in the n-well, ohmic contacts to substrate and well, input vI and output vo]
CMOS Transistor Parameters
        NMOS Device    PMOS Device
VTO     1 V            -1 V
γ       0.50 V^1/2     0.75 V^1/2
2φF     0.60 V         0.70 V
K'      25 µA/V^2      10 µA/V^2
• Inverter with resistive load ⇒ power dissipation when the input is high.
• If an NMOS and a PMOS transistor are used ⇒ CMOS.
• One transistor is always off while the other is on ⇒ no static power consumption.
[Figure: CMOS inverter with VDD = 5 V, PMOS transistor MP and NMOS transistor MN, and its switch model with the on-resistances Ronp and Ronn]
CMOS voltage transfer Characteristic
[Figure: vo versus vI, both from 0 V to 5 V, with VIL and VIH marked and the boundary lines vo = vI - VTP and vo = vI - VTN]
Regions of operation:
1: MN off
2: MN saturated, MP linear
3: MN and MP saturated
4: MP saturated, MN linear
5: MP off
What happens, if the inverter is not
symmetrical?
[Figure: voltage transfer characteristics with the line vO = vI; left: VDD = 2 V, 3 V, 4 V and 5 V; right: KR = 0.2, 1 and 5]
Calculation of VIL
The derivative condition (dVout / dVin) = -1 has to be evaluated for IDn(Vin, Vout) = IDp(Vin):
$V_{IH} \left( 1 + \frac{K_p}{K_n} \right) = 2 V_{out} + V_{Tn} + \frac{K_p}{K_n} (V_{DD} - V_{Tp})$
This equation forms together with the first equation a quadratic in VIH which has to be solved.
Calculation of Vth
Solving for Vth yields:
$V_{th} = \frac{V_{Tn} + \sqrt{K_p / K_n} \, (V_{DD} - V_{Tp})}{1 + \sqrt{K_p / K_n}}$
[Figure: voltage transfer characteristic with Vth in region 3, where MN and MP are both saturated]
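The switching threshold can be evaluated for different transistor ratios. This is a minimal sketch under the assumption that both transistors are saturated at Vin = Vout = Vth, so that Kn/2·(Vth - VTn)^2 = Kp/2·(VDD - Vth - |VTp|)^2, which leads to a square root of Kp/Kn. The function name `switching_threshold` is hypothetical and `vtp_abs` denotes the magnitude of the PMOS threshold.

```python
import math


def switching_threshold(vdd, vtn, vtp_abs, kp, kn):
    """Inverter threshold Vth where Vin = Vout (both devices saturated)."""
    r = math.sqrt(kp / kn)
    return (vtn + r * (vdd - vtp_abs)) / (1.0 + r)


# Symmetrical inverter and two asymmetrical designs (VDD = 5 V, |VT| = 1 V)
for ratio in (0.2, 1.0, 5.0):
    print(ratio, round(switching_threshold(5.0, 1.0, 1.0, kp=ratio, kn=1.0), 3))
```

For Kp = Kn and equal threshold magnitudes the result is VDD/2; a stronger PMOS moves Vth toward VDD and a stronger NMOS moves it toward ground.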
Design of CMOS inverter (I)
• KR = Kp / Kn
• Remember: $K_n = K'_n \left( \frac{W}{L} \right)_n$ and $K_p = K'_p \left( \frac{W}{L} \right)_p$
⇒ Influence of the symmetry via W/L of transistors!
[Figure: noise margins NMH and NML, 0.5 to 3.0, versus KR, 0 to 11]
Summary
For the given input, MN goes from cutoff through saturation into the nonsaturation region.
The border between saturation and nonsaturation is reached at the time tx, when the output voltage is Vout = VOH - VTn.
[Figure: CMOS inverter with VDD = 5 V and load C; vI steps from 0 V to 5 V, vO(0+) = 5 V; vO falls from VOH = 5 V toward VOL = 0 V, with MN saturated between t1 and tx (vO above Vin - VTn) and nonsaturated between tx and t2]
High to Low Output Transition (II)
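Saturation:
$t_x - t_1 = -C_{OUT} \int_{V_{DD}}^{V_{DD} - V_{Tn}} \frac{dV_{OUT}}{\frac{K_n}{2} (V_{DD} - V_{Tn})^2} = \frac{2 C_{OUT} V_{Tn}}{K_n (V_{DD} - V_{Tn})^2}$
Nonsaturation:
$t_2 - t_x = -C_{OUT} \int_{V_{DD} - V_{Tn}}^{V_0} \frac{dV_{OUT}}{\frac{K_n}{2} \left[ 2 (V_{DD} - V_{Tn}) V_{OUT} - V_{OUT}^2 \right]} = -\frac{2 C_{OUT}}{K_n} \frac{1}{2 (V_{DD} - V_{Tn})} \ln \left( \frac{V_{OUT}}{2 (V_{DD} - V_{Tn}) - V_{OUT}} \right) \Bigg|_{V_{DD} - V_{Tn}}^{V_0}$
We have used the following integral: $\int \frac{dx}{x (a + b x^n)} = \frac{1}{a n} \ln \left( \frac{x^n}{a + b x^n} \right)$
In our case: n = 1, b = −1: $\int \frac{dx}{a x - x^2} = \frac{1}{a} \ln \left( \frac{x}{a - x} \right)$
$t_{HL} = (t_x - t_1) + (t_2 - t_x)$
therefore: $t_{HL} = \tau \left[ \frac{2 V_{Tn}}{V_{DD} - V_{Tn}} + \ln \left( \frac{2 (V_{DD} - V_{Tn})}{V_0} - 1 \right) \right]$
where $\tau = \frac{C_{OUT}}{K_n (V_{DD} - V_{Tn})}$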
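The closed-form fall time can be evaluated directly. This is a minimal sketch: the function name `t_hl` and the example values (1 pF load, Kn = 50 µA/V^2, V0 = 0.5 V as the end point of the transition) are illustrative assumptions.

```python
import math


def t_hl(c_out, kn, vdd, vtn, v0):
    """High-to-low transition time (s) from VDD down to v0, with v0 < VDD - VTn."""
    tau = c_out / (kn * (vdd - vtn))
    saturation_part = 2.0 * vtn / (vdd - vtn)
    nonsaturation_part = math.log(2.0 * (vdd - vtn) / v0 - 1.0)
    return tau * (saturation_part + nonsaturation_part)


# Illustrative values (assumptions): C = 1 pF, Kn = 50 uA/V^2, VDD = 5 V
print(t_hl(c_out=1e-12, kn=50e-6, vdd=5.0, vtn=1.0, v0=0.5))
```

The first term in the bracket is the time spent in saturation (constant current), the logarithm is the time spent in the nonsaturation region.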
Low to high output transition
From symmetry (VTn → VTp; Kn → Kp) follows for the low to high transition time:
$t_{LH} = \frac{C_{OUT}}{K_p (V_{DD} - |V_{Tp}|)} \left[ \frac{2 |V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln \left( \frac{2 (V_{DD} - |V_{Tp}|)}{V_0} - 1 \right) \right]$
[Figure: CMOS inverter with VDD = 5 V and load C; vI steps from 5 V to 0 V at t = 0, vO(0+) = 0 V and vO rises toward 5 V]
[Figure: inverter sizing examples with device ratios MP 5/1 and MN 2/1 (load C), MP 32.5/1 and MN 13/1, MP 20/1 and MN 8/1; load capacitances 1 pF and 2 pF; panels (a) and (b)]
The power P(t) = VDD i(t), and because VDD is a constant,
$E_D = \int_0^\infty V_{DD} \, i(t) \, dt = V_{DD} \int_0^\infty i(t) \, dt$
The current supplied by source VDD is also equal to the current in capacitor C, and so
$E_D = V_{DD} \int_0^\infty C \frac{dv_C}{dt} \, dt = C V_{DD} \int_{v_C(0)}^{v_C(\infty)} dv_C$
Dynamic Power Dissipation (II)
Thus, every time a logic gate goes through a complete switching cycle, the
transistors within the gate dissipate an energy equal to ETD. Logic gates
normally switch states at some relatively high frequency (switching
events/second), and the dynamic power PD dissipated by the logic gate is
then
$P_D = C V_{DD}^2 f$
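A quick numerical example shows the order of magnitude of the dynamic power. This is a minimal sketch: the function name `dynamic_power` and the example values (1 pF, 5 V, 100 MHz) are illustrative assumptions.

```python
def dynamic_power(c_load, vdd, freq):
    """Dynamic power (W) of a gate switching a load c_load (F) at freq (Hz)."""
    return c_load * vdd ** 2 * freq


# Illustrative values (assumptions): 1 pF load, 5 V supply, 100 MHz switching
print(dynamic_power(1e-12, 5.0, 100e6))
print(dynamic_power(1e-12, 2.5, 100e6))   # halving VDD quarters the power
```

Because the supply voltage enters quadratically, lowering VDD is the most effective lever on dynamic power for a given load and frequency.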
Dynamic Power Dissipation (IV)
• Power dissipation due to the “short circuit current” (when both transistors
are on during transition)
• The short circuit current reaches a peak for Vin = Vout = VDD/2
[Figure: switch model of the inverter with Ronp and Ronn at VDD = 5 V; input vI and output vout (0 to 5 V) and supply current iDD (0 to 30 uA) versus time (0 to 16 ns); the current peaks where Vin = Vout = VDD/2]
Summary
Let’s repeat:
• What is the dynamic behaviour of the inverter?
• What do we need it for?
• What kind of power dissipation is there?
• What kind of power dissipation is dominant with CMOS logic?
[Figure: output voltage (0 to 6 V) and drain current (0 to 40 uA) versus vI (0 to 6 V)]
$P_D = C V_{DD}^2 f$
6. CMOS Technology
CMOS Technology
Wafer Terminology
The number of steps in IC fabrication flow depends upon the technology process
and the complexity of the circuit
Example:
CMOS n-Well process - 30 major steps, and each major step may involve up to
15 substeps
Only three basic operations are performed on the wafer:
• Layering
• Patterning
• Doping
Layering
Techniques for adding layers: thermal oxidation, Chemical Vapor Deposition (CVD), evaporation, sputtering
Natural oxide: silicon will readily grow an oxide (5-10nm) if exposed to oxygen in the air!
The range for useful oxide thickness: 25nm (MOS gates) - 1500nm (field oxide)
Dry oxidation
Si + O2 → SiO2 (900-1200°C)
700nm oxide: 10hours (1200°C)
Good oxide quality: gate oxide
Wet oxidation (water vapor or steam)
Si + 2H2O → SiO2 + 2H2 (900-1200°C)
700nm oxide: 0.65hours (1200°C)
Poor oxide quality: field oxide
[Figure: O2 reacting at the silicon surface to form an SiO2 layer]
Layering - Chemical Vapor Deposition (CVD)
Deposited materials:
• Insulators & Dielectrics: SiO2, Si3N4, Phosphorus Silicate Glass (PSG), Doped Oxide
• Semiconductors: Si
• Conductors: Al, Cu, Ni, Au, Pt, Ti, W, Mo, Cr, Silicides (WSi2, MoSi2), doped polysilicon
Layering - Evaporation
Used to deposit conductive layers (metallization): Al, Al/Si, Al/Cu, Au, Mo, Pt
When temperature is raised high enough, atoms of solid material (Al) will melt and “evaporate”
into the atmosphere and deposit on to the wafer
External energy needed to evaporate the metal is provided by:
1. A current flowing through a filament
[Figure: evaporation chamber under high vacuum (10^-5 - 10^-7 torr) with wafer, magnet and crucible holding Al]
Patterning
• Patterning = Lithography = Masking
• Selective removal of the top layer(s) on the wafers
• Ex.: Process steps required for patterning SiO2
  2. Photoresist deposition
  3. UV exposure through the mask (leaves soluble and insoluble photoresist)
  5. SiO2 etching
  6. Photoresist etching
Doping
Thermal diffusion:
- heat the wafer to the vicinity of 1000°C
- expose the wafer to vapors containing the desired dopant
- the dopant atoms diffuse into the wafer surface creating a p/n region
Ion implantation:
- room temperature
- dopant atoms are accelerated to a high speed and “shot” into the wafer surface
- an annealing (heating) step is necessary to reorder the crystal structure damaged by implant
Si Substrate (p)
Oxidation (Layering)
NMOS Transistor Fabrication - process flow (2)
Oxidation (Layering)
n type
n+ n+
Oxidation (Layering)
NMOS Transistor Fabrication - process flow (4)
Oxide etching (Patterning)
Contact windows
n+ n+
Al evaporation
n+ n+
G
n+ n+
Si Substrate (p)
Local Oxidation of Silicon (LOCOS) (1)
More planar surface topology
Selectively growing the field oxide in certain regions - process flow:
1) grow a thin pad oxide (SiO2) on the silicon surface
2) define active area: deposit and pattern a silicon nitride (Si3N4) layer
The thin pad oxide protects the silicon surface from stress caused by the nitride
3) channel stop implant: p-type regions that surround the transistors
[Figure: Si3N4 on SiO2 on the silicon substrate, with p+ channel stop regions]
Field oxide is partially recessed into the surface (oxidation consumes some of the silicon)
Field oxide forms a lateral extension under the nitride layer - bird's beak region
Bird's beak region limits device scaling and device density in VLSI circuits!
5) Etch the nitride layer and the thin oxide pad layer
[Figure: active areas separated by field oxide]
n-Well CMOS Technology - simplified process sequence
• Process starts with a moderately doped (10^15 cm^-3) p-type substrate (wafer)
• An initial oxide layer is grown on the entire surface (barrier oxide)
1. n-Well mask - defines the n-Well regions
• Pattern the oxide
• Implant n-type impurity atoms (phosphorus) - 10^16 cm^-3
• Drive-in the impurities (vertical but also lateral redistribution - limits the density)
2. Active area mask - defines the regions in which MOS devices will be created
• LOCOS process to isolate NMOS and PMOS transistors
• lateral penetration of bird's beak region ~ oxide thickness
• channel stop p+ implants (boron)
• Grow gate oxide (dry oxidation) - only in the open area of active region
3. Polysilicon mask - defines the gates of the MOS transistors
• Polysilicon is deposited over the entire wafer (CVD process) and doped (typically n-type)
• Pattern the polysilicon in the dry (plasma) etching process
• Etch the gate oxide
5. Complement of the n-select mask - defines the p+ source/drain regions of PMOS transistors
• Define the ohmic contacts to the substrate
• Implant p-type impurity atoms (boron)
• Polysilicon layer protects transistor channel regions from the boron dopant
6. Contact mask - defines the contact cuts in the insulating layer
• Contacts to polysilicon must be made outside the gate region (avoid metal spikes through the poly and the thin gate oxide)
• The final step: the entire surface is passivated (overglass layer)
• Protect the surface from contaminants and scratches
• Then openings are etched to the bond pads to allow for wire bonding
[Figure: cross-section and layout of the CMOS inverter with GND, In, VDD and Out; poly, metal, SiO2 and gate oxide layers; N-channel transistor in the p substrate and P-channel transistor in the n-well]
Design Rules
Intra-Layer Design Rules (λ)
[Figure: minimum dimensions and distances, in units of λ, for the well, active, select, polysilicon, metal1, metal2 and contact/via hole layers]
[Figure: transistor design rules, including the spacing to the well boundary]
Inter-Layer Design Rules - Contact and Via (λ)
[Figure: design rules for the Metal1 to Metal2 contact (via between m1 and m2), the Metal to Poly contact and the Metal to Active contact (m1 over poly or n+)]
[Figure: design rules for the select layer around the contact to the well and the contact to the substrate]
CMOS Inverter Layout
[Figure: layout and cross-section of the CMOS inverter with GND, In, VDD and Out; poly, metal, SiO2 and gate oxide; N-channel transistor in the p substrate and P-channel transistor in the n-well]
CMOS Latchup
[Figure: CMOS inverter cross-section between VSS (0 V) and VDD (5 V) with the parasitic npn and pnp transistors and the resistances Rn (n-well) and Rp (p-type substrate)]
7. CMOS Logic
CMOS NOR Gate
[Figure: two-input CMOS NOR gate (VDD = 5 V) with PMOS ratios 10/1 and NMOS ratios 2/1, next to the reference inverter with MP 5/1 and MN 2/1]
A B  Output
0 0  1
0 1  0
1 0  0
1 1  0
Goal: To keep the delay times equal to those of the reference inverter design under the worst-case input conditions
CMOS NAND Gate
[Figure: two-input CMOS NAND gate (VDD = 5 V, inputs A and B) with NMOS ratios 4/1, next to the reference inverter with MP 5/1 and MN 2/1]
A B  Z
0 0  1
0 1  1
1 0  1
1 1  0
$Y = \overline{ABCDE}$
[Figure: five-input CMOS NAND gate with inputs A to E and output Y: five parallel PMOS transistors (5/1) and five NMOS transistors in series (10/1)]
Why should one prefer a NAND gate rather than a NOR gate?
Steps in Constructing Graphs for NMOS and
PMOS Networks (I)
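[Figure: CMOS gate with +5 V supply, PMOS switch network driven by A, B, C, D and NMOS network in which MA is in parallel with MB in series with the parallel pair MC, MD; sub-expressions C + D, B (C + D) and A + B (C + D)]
$Y = \overline{A + B (C + D)}$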
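The complementary behaviour of the two switch networks can be verified by enumerating all input combinations. This is a minimal sketch: the function names are hypothetical, and it assumes the gate output is the complement of A + B (C + D), as for any static CMOS gate whose NMOS network implements that expression.

```python
from itertools import product


def pull_down(a, b, c, d):
    """NMOS network: MA in parallel with (MB in series with (MC parallel MD))."""
    return a or (b and (c or d))


def pull_up(a, b, c, d):
    """PMOS network (dual graph): A in series with (B parallel (C series D)).

    PMOS switches conduct for a low input, hence the negated inputs.
    """
    return (not a) and ((not b) or ((not c) and (not d)))


for a, b, c, d in product((False, True), repeat=4):
    up, down = pull_up(a, b, c, d), pull_down(a, b, c, d)
    assert up != down                   # exactly one network conducts
    print(int(a), int(b), int(c), int(d), "Y =", int(up))
```

Series connections in the NMOS network become parallel connections in the PMOS network and vice versa, which is what the graph construction on these slides produces.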
[Figure: (a) NMOS network with MA (2/1), MB (4/1), MC (4/1) and MD (4/1) between output Y and ground; (b) NMOS graph with nodes 0, 1, 2 and arcs A, B, C, D; (c) NMOS graph with the new nodes 3, 4 and 5 added]
Steps in Constructing Graphs for NMOS and
PMOS Networks (III)
Final CMOS Circuit
[Figure: graph with PMOS arcs added (nodes 0 to 5) and the final CMOS circuit with +5 V supply: PMOS devices A 15/1, B 7.5/1, C 15/1, D 15/1; NMOS devices MA 2/1, MB 4/1, MC 4/1, MD 4/1; output Y]
Summary
• AND - serially connected FET
• OR - parallel connected FET
• NMOS network implements “zeros”
• PMOS network implements “ones”
• W/L ratio has to be determined as a design parameter
[Figure: final CMOS circuit with +5 V supply: PMOS devices A 15/1, B 7.5/1, C 15/1, D 15/1; NMOS devices MA 2/1, MB 4/1, MC 4/1, MD 4/1; output Y]
CMOS Gate Design: Minimum Size Vs.
Performance (I)
CMOS circuit with only minimum size transistors: considerable savings in chip area, but increased logic delay
Example:
$\tau_P = \frac{\tau_{PHL} + \tau_{PLH}}{2} = \frac{2 \tau_{PHLI} + 7.5 \tau_{PLHI}}{2} = \frac{9.5 \tau_{PLHI}}{2} = 4.75 \tau_{PLHI}$
The minimum size gate will be 4.75 times slower than the reference inverter when driving the same load capacitance
Power-Delay Product (PDP)
The PDP is an important figure of merit for a logic technology
PDP = PAV τ P
For CMOS: $P_{AV} = C V_{DD}^2 f$ with $f = \frac{1}{T}$
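The power-delay product can be computed from the quantities just defined. This is a minimal sketch: the function name `power_delay_product` and the example values are illustrative assumptions.

```python
def power_delay_product(c_load, vdd, period, tau_p):
    """PDP (J) = P_AV * tau_P with P_AV = C * VDD^2 * f and f = 1/T."""
    p_av = c_load * vdd ** 2 / period
    return p_av * tau_p


# Illustrative values (assumptions): 1 pF, 5 V, T = 10 ns, tau_P = 1 ns
print(power_delay_product(c_load=1e-12, vdd=5.0, period=10e-9, tau_p=1e-9))
```

The result has the unit of energy, which is why the PDP is often read as the energy spent per switching operation scaled by the ratio of delay to period.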
7b. Passtransistor and Transmission Gate Logic
Idea: a switch between Vin and Vout (control: 0=open, 1=closed)
Vin control Vout
1   0       x
1   1       1
0   0       x
0   1       0
Implementation: [Figure: transistor between Vin and Vout with its gate driven by control]
Passtransistor Logic: NEXOR Realisation
A B OUT
0 0 1
0 1 0
1 0 0
1 1 1
[Figure: passtransistor NEXOR circuit with inputs A and B and output OUT]
NMOS passtransistor, charging process:
• Vctrl(t < 0) = 0, Vctrl(t >= 0) = VDD
• Vin = VDD, Vout(t = 0) = 0
• Transistor is in Saturation during Charging Process
• Vout(t) rises to VDD − VT(VSB)
[Figure: NMOS passtransistor with gate voltage Vctrl(t), gate-source voltage VGS and load Cout; Vout(t) over time]
Passtransistor Cascades
NMOS passtransistor, discharging process:
• Vctrl(t < 0) = 0, Vctrl(t >= 0) = VDD
• Vin = 0, Vout(t = 0) = VDD − VT(VSB)
• Transistor is always in Nonsaturation during Discharging Process
• NMOS Passtransistor: Discharging faster than Charging, since Device Impedance is lower in NSat than in Sat
[Figure: NMOS passtransistor with gate voltage Vctrl(t), gate-source voltage VGS and load Cout; Vout(t) falling from VDD − VT(VSB) over time t]
Passtransistor: Charging Characteristics
PMOS Charging Process:
• Vctrl(t < 0) = VDD, Vctrl(t >= 0) = 0
• Vin = VDD, Vout(t = 0) = 0
• The output is charged to VDD (Transistor is initially saturated and goes into nonsaturated mode)
[Figure: PMOS passtransistor with gate voltage Vctrl(t), gate-source voltage VGS and load Cout]
Transmitted logic levels:
Logic Level   NMOS        PMOS   CMOS
Logic 0       0           VTP    0
Logic 1       VDD − VTN   VDD    VDD
CMOS Transmission Gate
$I_{DN} + I_{DP} = C_{out} \frac{dV_{out}}{dt}$
[Figure: CMOS transmission gate circuit and symbol with Vin, Vout and Vctrl; operating regions while charging from an initial voltage of 0: Mn saturated up to VDD − VTN and then cut-off, Mp saturated up to VTP and then nonsaturated]
$R_{EQ} = \frac{R_{onP} R_{onN}}{R_{onP} + R_{onN}}$
On-resistance of a transmission gate, including body effect
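The equivalent resistance is the parallel combination of the two device resistances. This is a minimal sketch: the function name `transmission_gate_resistance` and the resistance values are illustrative assumptions, and a device in cut-off is represented by an infinite resistance.

```python
import math


def transmission_gate_resistance(r_on_p, r_on_n):
    """Equivalent resistance (ohm) of the PMOS and NMOS devices in parallel."""
    if math.isinf(r_on_p):
        return r_on_n                   # PMOS cut off: only the NMOS conducts
    if math.isinf(r_on_n):
        return r_on_p                   # NMOS cut off: only the PMOS conducts
    return r_on_p * r_on_n / (r_on_p + r_on_n)


print(transmission_gate_resistance(10e3, 10e3))
print(transmission_gate_resistance(12e3, math.inf))
```

Because at least one of the two devices conducts over the whole output range, the parallel combination stays finite even where one transistor is cut off.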
CMOS Transmission Gate (III)
$V_F = \frac{C_{BIG} V_{BIG} + C_{SMALL} V_{SMALL}}{C_{BIG} + C_{SMALL}}$
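The final voltage after two capacitors are connected follows from charge conservation. This is a minimal sketch: the function name `charge_sharing_voltage` and the capacitance values are illustrative assumptions.

```python
def charge_sharing_voltage(c_big, v_big, c_small, v_small):
    """Final voltage VF (V) after connecting two precharged capacitors."""
    total_charge = c_big * v_big + c_small * v_small
    return total_charge / (c_big + c_small)


# Illustrative values (assumptions): 9 fF node at 5 V shares charge with 1 fF at 0 V
print(charge_sharing_voltage(9e-15, 5.0, 1e-15, 0.0))
```

The larger capacitor dominates the result, so a stored level is only preserved when the storing node is much larger than the node it is connected to.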
[Figure: transmission gate circuits with select signal S, inputs A and B and output F, including F = A⊕B]
Function Implementation with Passtransistor Logic
Step 1: find a minimum decomposition such that each selected field depends on only one variable, or on constant 0 or constant 1 (in our case: decompose with combinations of the literals b and d).
Karnaugh map of F (variables a, b, c, d):
1 0 0 1
0 0 1 0
1 0 1 1
1 1 1 1
a
F
b b d d
8. Combinational MOS Logic Circuits
Introduction
• Combinational logic circuits, or gates, which perform Boolean operations on multiple
input variables and determine the output as Boolean functions of the inputs, are the
basic building blocks of all digital systems.
• We will examine the static and dynamic characteristics of various combinational
MOS logic circuits. It will be seen that many of the basic principles used in the
design and analysis of MOS inverters can be directly applied to the combinational
logic circuit as well.
• In its most general form, a combinational logic circuit, or gate, performing a Boolean
function can be represented as a multiple-input single-output system.
Calculation of VOH
When both input voltages VA and VB are lower than the corresponding driver
threshold voltage, the driver transistors are turned off and conduct no drain current.
Consequently, the load device, which operates in the linear region, also has zero
drain current. In particular, its linear region current equation becomes
$$I_{D,load} = \frac{k_{n,load}}{2}\left[2\,|V_{T,load}(V_{OH})|\,(V_{DD}-V_{OH}) - (V_{DD}-V_{OH})^2\right] = 0$$
The solution of this equation gives VOH = VDD.
Calculation of VOL
To calculate the output voltage VOL, we must consider three different cases, i.e.,
three different input voltage combinations, which produce a conduction path from
the output node to the ground. These cases are
(i) VA = VOH, VB = VOL
(ii) VA = VOL, VB = VOH
(iii) VA = VOH, VB = VOH
For the first two cases the NOR circuit reduces to a simple nMOS depletion-load
inverter. Assuming that the threshold voltages of the two enhancement-type driver
transistors are identical (VT0,A=VT0,B=VT0), the driver-to-load ratio of the
corresponding inverter can be found as follows.
(i)
$$k_R = \frac{k_{driver,A}}{k_{load}} = \frac{k'_{n,driver}\left(\frac{W}{L}\right)_A}{k'_{n,load}\left(\frac{W}{L}\right)_{load}}$$
(ii)
$$k_R = \frac{k_{driver,B}}{k_{load}} = \frac{k'_{n,driver}\left(\frac{W}{L}\right)_B}{k'_{n,load}\left(\frac{W}{L}\right)_{load}}$$
The output low voltage level VOL in both cases is found as follows:
$$V_{OL} = V_{OH} - V_{T0} - \sqrt{(V_{OH}-V_{T0})^2 - \left(\frac{k_{load}}{k_{driver}}\right)|V_{T,load}(V_{OL})|^2}$$
The output low voltage (VOL) values calculated for case (i) and (ii) will be identical.
In case (iii), where both driver transistors are turned on, the saturated load current
is the sum of the two linear-mode driver currents.
$$I_{D,load} = I_{D,driverA} + I_{D,driverB}$$
$$\frac{k_{load}}{2}\,|V_{T,load}(V_{OL})|^2 = \frac{k_{driver,A}}{2}\left[2(V_A - V_{T0})V_{OL} - V_{OL}^2\right] + \frac{k_{driver,B}}{2}\left[2(V_B - V_{T0})V_{OL} - V_{OL}^2\right]$$
Since the gate voltages of both driver transistors are equal (VA=VB=VOH), we
can devise an equivalent driver-to-load ratio for the NOR structure:
$$k_R = \frac{k_{driver,A} + k_{driver,B}}{k_{load}} = \frac{k'_{n,driver}\left[\left(\frac{W}{L}\right)_A + \left(\frac{W}{L}\right)_B\right]}{k'_{n,load}\left(\frac{W}{L}\right)_{load}}$$
Thus, the NOR gate with both of its inputs tied to a logic-high voltage is replaced
with an nMOS depletion-load circuit with the driver-to-load ratio given by the above
equation. The output voltage level in this case is:
$$V_{OL} = V_{OH} - V_{T0} - \sqrt{(V_{OH}-V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,A} + k_{driver,B}}\right)|V_{T,load}(V_{OL})|^2}$$
The VOL is lower than the VOL values calculated for case (i) and for case (ii), when
only one input is logic-high. This also suggests a simple design strategy for NOR
gates. Usually, we have to achieve a certain maximum VOL for the worst case, i.e.,
when only one input is high. Thus, we assume that one input (either VA or VB) is
logic-high and determine the driver-to-load ratio of the resulting inverter. Then set
kdriver , A = kdriver , B = kRkload
This design choice yields two identical driver transistors, which guarantee the
required value of VOL in the worst case. When both inputs are logic-high, the
output voltage is even lower than the required maximum VOL, thus the design
constraint is satisfied.
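As a numerical illustration of this design strategy, the following Python sketch evaluates the VOL expression for one and for two conducting drivers. It assumes, for simplicity, that |VT,load| is a constant (its body-effect dependence on VOL is ignored). The function name and all numeric values are hypothetical and not taken from the lecture.

```python
import math

def nor_vol(v_oh, v_t0, v_t_load, k_load, k_driver_total):
    """VOL of a depletion-load NOR gate.

    k_driver_total is the summed transconductance of all conducting
    drivers. Assumption: |VT,load| is treated as a constant.
    """
    overdrive = v_oh - v_t0
    return overdrive - math.sqrt(
        overdrive ** 2 - (k_load / k_driver_total) * v_t_load ** 2
    )

# Hypothetical example values
V_OH, V_T0, V_T_LOAD = 5.0, 1.0, 3.0   # volts, |VT,load| = 3 V
K_LOAD, K_DRIVER = 25e-6, 100e-6       # A/V^2, i.e. kR = 4

vol_one_input = nor_vol(V_OH, V_T0, V_T_LOAD, K_LOAD, K_DRIVER)
vol_both_inputs = nor_vol(V_OH, V_T0, V_T_LOAD, K_LOAD, 2 * K_DRIVER)

print(f"one input high:   VOL = {vol_one_input:.3f} V")
print(f"both inputs high: VOL = {vol_both_inputs:.3f} V")
```

With these values the worst case (one input high) gives the larger VOL, so sizing both drivers for that case also covers the case with both inputs high.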
Generalized NOR Structure with Multiple Inputs
Parasitic device capacitances in the NOR2 gate and the lumped equivalent
load capacitance. The gate-to-source capacitances of the driver transistors
are included in the load of the previous stages driving the inputs A and B.
It can easily be seen that the drain currents of all transistors in the circuit are
equal to each other.
$$I_{D,load} = I_{D,driverA} = I_{D,driverB}$$
$$\frac{k_{load}}{2}\,|V_{T,load}(V_{OL})|^2 = \frac{k_{driver,A}}{2}\left[2(V_{GS,A}-V_{T,A})V_{DS,A} - V_{DS,A}^2\right] = \frac{k_{driver,B}}{2}\left[2(V_{GS,B}-V_{T,B})V_{DS,B} - V_{DS,B}^2\right]$$
The gate-to-source voltages of both driver transistors can be assumed to be
approximately equal to VOH. ( VGS , A = VOH − VDS , B ≈ VOH , since VDS low in NSAT)
The drain-to-source voltages of both transistors can be solved:
$$V_{DS,A} = V_{OH} - V_{T0} - \sqrt{(V_{OH}-V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,A}}\right)|V_{T,load}(V_{OL})|^2}$$
$$V_{DS,B} = V_{OH} - V_{T0} - \sqrt{(V_{OH}-V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,B}}\right)|V_{T,load}(V_{OL})|^2}$$
Let the two driver transistors be identical, i.e., kdriver,A=kdriver,B=kdriver. Noting that the
output voltage VOL is equal to the sum of the drain-to-source voltages of both
drivers, we obtain:
$$V_{OL} \approx 2\left(V_{OH} - V_{T0} - \sqrt{(V_{OH}-V_{T0})^2 - \left(\frac{k_{load}}{k_{driver}}\right)|V_{T,load}(V_{OL})|^2}\right)$$
The following analysis gives a better and more accurate view of the operation of
two series-connected driver transistors. Consider the two identical enhancement-type
nMOS transistors with their gate terminals connected. At this point, the only
simplifying assumption will be VT,A=VT,B=VT0. When both driver transistors are in
the linear region, the drain currents can be written as:
$$I_{D,A} = \frac{k_{driver}}{2}\left[2(V_{GS,A} - V_{T0})V_{DS,A} - V_{DS,A}^2\right]$$
$$I_{D,B} = \frac{k_{driver}}{2}\left[2(V_{GS,B} - V_{T0})V_{DS,B} - V_{DS,B}^2\right]$$
Since ID = ID,A = ID,B, the two equations can be combined. With VGS = VGS,B and VDS = VDS,A + VDS,B this gives
$$I_D = \frac{k_{driver}}{4}\left[2(V_{GS} - V_{T0})V_{DS} - V_{DS}^2\right]$$
i.e., the two series-connected transistors behave like a single transistor with half the transconductance.
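The equivalent-transistor result ID = (kdriver/4)[2(VGS − VT0)VDS − VDS²] can be checked numerically. The sketch below solves for the intermediate node voltage of two identical series nMOS transistors by bisection and compares the resulting current with that of a single device with kdriver/2. Body effect is neglected (VT,A = VT,B = VT0), and all names and numeric values are hypothetical.

```python
def linear_current(k, v_gs, v_ds, v_t):
    """Drain current of an nMOS transistor in the linear region."""
    return 0.5 * k * (2.0 * (v_gs - v_t) * v_ds - v_ds ** 2)

def series_pair_current(k, v_g, v_ds, v_t, iterations=80):
    """Current through two identical series nMOS devices with common gate voltage.

    The intermediate node voltage v_x is found by bisection so that both
    devices carry the same current (body effect neglected).
    """
    low, high = 0.0, v_ds
    for _ in range(iterations):
        v_x = 0.5 * (low + high)
        i_bottom = linear_current(k, v_g, v_x, v_t)            # source at ground
        i_top = linear_current(k, v_g - v_x, v_ds - v_x, v_t)  # source at v_x
        if i_bottom < i_top:
            low = v_x
        else:
            high = v_x
    return linear_current(k, v_g, 0.5 * (low + high), v_t)

# Hypothetical example values (both devices stay in the linear region)
K_DRIVER, V_G, V_T0, V_DS = 100e-6, 5.0, 1.0, 0.4

i_series = series_pair_current(K_DRIVER, V_G, V_DS, V_T0)
i_equivalent = linear_current(0.5 * K_DRIVER, V_G, V_DS, V_T0)

print(f"series pair:       {i_series * 1e6:.3f} uA")
print(f"single, k_driver/2: {i_equivalent * 1e6:.3f} uA")
```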
Generalized NAND Structure with Multiple Inputs
Transient Analysis of NAND Gate
As in the inverter case, we can combine the capacitances into one capacitance,
connected between the output node and ground. The value of the lumped
capacitance Cload depends on the input voltage conditions.
For example, the input VA is equal to VOH and the other input VB is switching from
VOH to VOL. In this case, both the output voltage Vout and the internal node voltage
Vx will rise, resulting in:
Cload = Cgd , load + Cgd , A + Cgd , B + Cgs , A
+ Cdb , A + Cdb , B + Csb , A + Csb , load + Cwire
Note that this value is quite conservative and fully reflects the internal node
capacitances into the lumped output capacitance Cload. In reality, only a fraction
of the internal node capacitances is reflected into Cload.
9. Memory Elements and Dynamic Logic
RS Flipflop
RS-Flipflops
There are two ways to implement an RS flip-flop:
• based on NOR gates: positive logic (active-high inputs)
• based on NAND gates: negative logic (active-low inputs)
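The behaviour of the NOR-based variant can be sketched with a small gate-level model. The code below is only an illustration under the assumption of ideal, zero-delay gates; it iterates the two cross-coupled NOR gates until the outputs settle. The function names are hypothetical.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))

def rs_latch_nor(s, r, q, q_bar):
    """Return the settled (Q, Q_bar) of a NOR-based RS latch.

    s, r are the active-high set/reset inputs; q, q_bar is the old state.
    The forbidden input S = R = 1 is not handled specially.
    """
    for _ in range(4):
        q_new = nor(r, q_bar)
        q_bar_new = nor(s, q_new)
        if (q_new, q_bar_new) == (q, q_bar):
            break
        q, q_bar = q_new, q_bar_new
    return q, q_bar

state = (0, 1)                               # start with Q = 0
set_state = rs_latch_nor(1, 0, *state)       # S = 1: set
hold_state = rs_latch_nor(0, 0, *set_state)  # S = R = 0: hold
reset_state = rs_latch_nor(0, 1, *hold_state)  # R = 1: reset

print(set_state, hold_state, reset_state)
```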
Clocked RS-Latch
To achieve a synchronous
operation, we can add a clock
signal
D-Latch
• Clock= 0: Q unchanged
• Clock= 1: Q= D
Clocked JK-Latch
Transmission Gate JK- Flipflop
It is also possible to build a JK flip-flop from transmission gates as an
edge-triggered flip-flop.
The output state can then change only at the rising edge of the clock signal.
Dynamic D-Flipflop
Dynamic Shift Register
Dynamic RAM
A special kind of memory is dynamic RAM. The major advantage is
the low transistor count: DRAM requires only one transistor and
one (small) capacitor per bit.
The first disadvantage is the destructive read. After reading a cell,
the read value must be written back to keep the data in the RAM.
The second disadvantage is the limited duration of storage. After
some milliseconds the cell must be refreshed (read and written
back).
Dynamic RAM
Clocking
Clock Signal:
• used to synchronize data flow through
a digital network
⇒ clocked static or dynamic circuits
• problems: clock skew (delay caused by
clock distribution wires)
Single and Multiple Clock Signals
⇒ For nonoverlapping clock phases φ and φ̄, fine-tuned and well-designed
delay lines (realized as transmission gates) have to be inserted in order to
avoid overlapping of φ and φ̄.
TG delay circuit
Pseudo 2-φ clocking
Shift register
Time constant for charging and discharging:
$$\tau_{TG} = R_{TG} C_L$$
where
$$C_L = C_{TG} + C_{in} + C_{line}, \qquad C_{in} = C_{ox}\left[(WL)_n + (WL)_p\right]$$
VA = VDD (Vin(0) = 0):
$$V_{in}(t) \cong V_{DD}\left[1 - e^{-t/\tau_{TG}}\right]$$
The inverter switches when Vin = VIH, which occurs after
$$t_{\varphi 1} \cong -\tau_{TG}\,\ln\left[1 - \frac{V_{IH}}{V_{DD}}\right]$$
VA = 0 (Vin(0) = VDD):
$$V_{in}(t) \cong V_{DD}\,e^{-t/\tau_{TG}}$$
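A short numerical sketch of the charging case may help: it computes τTG from RTG and the three capacitance contributions and then the time at which Vin reaches VIH. All component values are hypothetical, chosen only to give round numbers.

```python
import math

def tg_time_constant(r_tg, c_tg, c_in, c_line):
    """tau_TG = R_TG * C_L with C_L = C_TG + C_in + C_line."""
    return r_tg * (c_tg + c_in + c_line)

def v_in_charging(t, tau_tg, v_dd):
    """Inverter input voltage while charging from 0 V towards VDD."""
    return v_dd * (1.0 - math.exp(-t / tau_tg))

def time_to_vih(tau_tg, v_ih, v_dd):
    """Time after which the charging input reaches V_IH."""
    return -tau_tg * math.log(1.0 - v_ih / v_dd)

# Hypothetical values: 5 kOhm TG resistance, 50 fF total load
tau = tg_time_constant(r_tg=5e3, c_tg=10e-15, c_in=20e-15, c_line=20e-15)
t_switch = time_to_vih(tau, v_ih=3.0, v_dd=5.0)

print(f"tau_TG = {tau * 1e12:.0f} ps, switching after {t_switch * 1e12:.0f} ps")
```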
Charge leakage problem in CMOS TG
With
$$I_L = I_{Ln} - I_{Lp}$$
$$\frac{dQ_{store}}{dt} = I_{Lp} - I_{Ln}$$
$$C_{store} = \frac{dQ_{store}}{dV}$$
Assuming that the leakage currents ILp and ILn are constant and that the node
charge-voltage relation is linear, of the form
$$Q_{store} = C_{store} V$$
it follows (because Cstore is constant) that
$$C_{store}\,\frac{dV}{dt} = I_{Lp} - I_{Ln}$$
The solution of this equation is
$$V(t) = \frac{I_{Lp} - I_{Ln}}{C_{store}}\,t + V(0)$$
If ∆V is the maximum allowed voltage change:
$$t_{max} = \frac{C_{store}\,\Delta V}{I_L}$$
With Tmax = 2 tmax (the longest allowed clock period), the minimum frequency follows as
$$f_{min} \cong \frac{1}{2\,t_{max}} \cong \frac{I_L}{2\,C_{store}\,\Delta V}$$
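To get a feeling for the orders of magnitude, the two relations for tmax and fmin can be evaluated directly. The sketch below assumes a constant net leakage current; the numeric values are hypothetical.

```python
def max_hold_time(c_store, delta_v, i_leak):
    """t_max = C_store * dV / I_L for a constant net leakage current."""
    return c_store * delta_v / i_leak

def min_clock_frequency(c_store, delta_v, i_leak):
    """f_min = 1 / (2 * t_max) = I_L / (2 * C_store * dV)."""
    return 1.0 / (2.0 * max_hold_time(c_store, delta_v, i_leak))

# Hypothetical values: 50 fF storage node, 1 V allowed droop, 1 pA leakage
t_max = max_hold_time(c_store=50e-15, delta_v=1.0, i_leak=1e-12)
f_min = min_clock_frequency(c_store=50e-15, delta_v=1.0, i_leak=1e-12)

print(f"t_max = {t_max * 1e3:.0f} ms, f_min = {f_min:.0f} Hz")
```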
So the storage capacitance can be estimated by voltage averaging of this
expression:
$$C_{store} \cong C_G + C_{line} + C_{ols} + C_{old} + K(0, V_{DD})\left[C_{SBp} + C_{DBn}\right]$$
For a realistic analysis of the charge leakage problem, the dependence of the
leakage currents on the reverse bias voltage has to be taken into consideration.
Charge Sharing
$$V_1(t) = V_2(0) + \left[V_1(0) - V_2(0)\right]\frac{C_1 + C_2\,e^{-t/\tau}}{C_1 + C_2}$$
$$V_2(t) = V_2(0) + \left[V_1(0) - V_2(0)\right]\left(\frac{C_1}{C_1 + C_2}\right)\left[1 - e^{-t/\tau}\right]$$
where the time constant is given by
$$\tau = R_{TG}\,C_{eq} \quad\text{with}\quad C_{eq} = \frac{C_1 C_2}{C_1 + C_2}$$
In the limit t → ∞, V1 = V2 = Vf:
$$V_f = \frac{C_1}{C_1 + C_2}\,V_1(0) + \frac{C_2}{C_1 + C_2}\,V_2(0)$$
This agrees with the result from simple charge conservation by noting that the
final charge distributes according to
QT = ( C1 + C 2 )Vf
Charge sharing among N TG-connected capacitors
Initial charge:
$$Q_T = \sum_{i=1}^{N} C_i V_i(0)$$
After connecting nodes:
$$Q_T = \left(\sum_{i=1}^{N} C_i\right) V_f$$
Final voltage:
$$V_f = \frac{\sum_{i=1}^{N} C_i V_i(0)}{\sum_{i=1}^{N} C_i}$$
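The final-voltage formula is a capacitance-weighted average and is easy to evaluate. The following sketch applies it to a precharged output node that shares its charge with two discharged internal nodes; the capacitance and voltage values are hypothetical.

```python
def final_voltage(capacitances, initial_voltages):
    """Final common voltage after N capacitors are connected (charge conservation)."""
    total_charge = sum(c * v for c, v in zip(capacitances, initial_voltages))
    return total_charge / sum(capacitances)

# Hypothetical example: 100 fF output node precharged to 5 V,
# two internal nodes of 10 fF each, initially at 0 V
caps = [100e-15, 10e-15, 10e-15]
volts = [5.0, 0.0, 0.0]
v_final = final_voltage(caps, volts)

print(f"Vf = {v_final:.3f} V")
```

The larger the output capacitance is relative to the internal node capacitances, the smaller the voltage drop caused by charge sharing.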
Dynamic Logic
• The pull-up (pull-down) network of static CMOS is replaced by a single precharge
(discharge) transistor.
The remaining network then conditionally discharges (charges up) the output
in a second operation phase
• One logic level is held by dynamic charge storage
• Transistor count is reduced from 2n (static CMOS) to n+2 for dynamic
precharged CMOS (but now: 2 phases of operation)
If Vin = 0 then
$$\tau_{ch} = \frac{C_{out}}{\beta_p\,(V_{DD} - |V_{Tp}|)} = R_p C_{out}$$
Evaluation Phase
If M1 is switched on and M1 and Mn are designed with identical channel width W,
the discharge time constant is given by
$$\tau_{dis} = \frac{(L_1 + L_n)\,C_{out}}{k'_n W\,(V_{DD} - V_{Tn})}$$
Evaluation discharge network
$$t_{dis} = \tau_{dis}\left[\frac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\frac{2(V_{DD} - V_{Tn})}{V_0} - 1\right)\right]$$
φ=1 Precharge
φ=0 Evaluate
Dynamic Cascades
pMOS blocks and nMOS blocks have to alternate in the cascade in order to avoid
glitches
Dynamic cascades
Domino CMOS Logic
• Domino Logic: design method for glitch-free cascading of nMOS logic blocks
• Each stage is driven by φ
- Precharge during φ = 0
- Evaluation when φ = 1
• A domino logic block consists of a precharge/evaluation block and an output
inverter
Precharge Phase: The gate output is precharged to logic 1, so the inverter output
goes to logic 0. Logic transmission errors are avoided by providing a
logic 0 at the inverter output (which avoids discharging the next logic stage).
Evaluation Phase: Depending on the actual input values, the inverter output either
stays at logic 0 or is set to logic 1. The correct result signal is provided at the
end of the domino cascade after all stages have stabilized.
Domino AND gate
Domino timing
Cascaded domino circuit with fanout = 2
CX=C0+CT. C0 represents the capacitance due to M0, while CT is the total of all
other contributions.
Evaluate
If all inputs Ai are set to logic 1, the worst case delay time can be estimated by
$$t_D \cong R_n C_n + (R_n + R_3)C_3 + (R_n + R_3 + R_2)C_2 + (R_n + R_3 + R_2 + R_1)C_1 + (R_n + R_3 + R_2 + R_1 + R_0)C_X$$
with
$$R_j \cong \frac{1}{k'_n\,(W/L)_j\,(V_{DD} - V_{Tn})}$$
Charge Leakage and
Charge Sharing
Cout,1>>Cx1+Cx2
Use of feedback to control a pull-up MOSFET for charge sharing problem
NORA Logic
(NORA = NO RAce)
NORA Properties
• NORA is very insensitive to clock delay
• one clock signal and the inverted clock signal, both with short rise and fall times,
are sufficient
• no inverter is needed between the logic stages, because of alternate use of
n-type and p-type blocks
• the last stage is a clocked inverter, a C2MOS latch
• ideal to clock pipelined logic systems
The Signal Race Problem
The signal race problem can be seen: a signal race can arise, when both
transmission gates conduct at the same time. If the new input from TG1 reaches
the input of TG2 while TG2 is still transmitting the output, the output information
will be lost. Imperfect TG synchronization occurs because of normal transmission
intervals or clock skew.
tp>>tr,tf → no problems
Clock skew
φ=0 Precharge
φ=1 Evaluate
NORA Structuring
clk2
NORA structuring
NORA φ and φ̄ sections
φ=1 Precharge
φ=0 Evaluate
C2MOS latch
φ
NORA φ and φ̄ sections
0V
φ = 0: P P locked E E transp.
φ = 1: E E transp. P P locked
φ
NORA φ and φ̄ sections
0V
φ = 0: P P locked E E transp.
φ = 1: E E transp. P P locked
The C²MOS latch is locked during the clock skew period!
NORA φ and φ̄ sections
Precharged
to 0V
φ = 0: P P locked E E transp.
φ = 1: E E transp. P P locked
And the other way round:
The duration for which the logical output value is provided to the next stage will
eventually be extended.
NORA φ and φ̄ sections
10. Performance
Summary
Interconnect Parameters
Modern Interconnect
Full Wire Model
Assume that all wires in a bus network are implemented in a single interconnect layer (Al),
isolated from the silicon substrate and from each other by a layer of dielectric material (SiO2):
Schematic view
Physical view
Wire Parallel-Plate Capacitance
The capacitance of a wire is a function of:
• shape of the wire
• environment
• distance to substrate
• distance to surrounding wires
Simple model - the parallel-plate capacitance:
$$C_{wire} = C_{pp} = \frac{\varepsilon_{ox}}{t_{ox}}\,W L$$
Cwire is the total capacitance of the wire (pF)
True for W >> tox ⇒ electric field lines are orthogonal to the capacitor plates
[Figure: wire of width W, length L, and height H above the substrate, separated from it by SiO2 of thickness tox; current flow along L, electrical-field lines between wire and substrate]
[Figure: fringing-field capacitance model with Cfringe and Cpp between wire and substrate, SiO2 thickness tox]
$$c_{wire} = c_{pp} + c_{fringe} \approx \frac{(W - H/2)\,\varepsilon_{ox}}{t_{ox}} + \frac{2\pi\varepsilon_{ox}}{\log(t_{ox}/H)}$$
cwire is the wire capacitance per unit length (pF/cm)
For W/H large: cfringe < cpp, cwire ~ cpp
For W/H < 1.5 ⇒ cfringe > cpp
Interwire Capacitance
In multilevel interconnect technologies the wires are not completely isolated
Wiring Capacitances
Top plate (rows) over bottom plate (columns):
                           Field   Active   Poly   Al1
Poly   Cplate (aF/µm2)     88
       Cfringe (aF/µm)     54
Al1    Cplate (aF/µm2)     30      41       57
       Cfringe (aF/µm)     40      47       54
Al2    Cplate (aF/µm2)     13      15       17     36
       Cfringe (aF/µm)     25      27       29     45
Plate and fringe capacitance values for a typical 0.25 µm CMOS process
Wire Resistance
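$$R = \frac{\rho L}{H W} = R_{\square}\,\frac{L}{W}$$
R□ = ρ/H - Sheet Resistance
[Figure: wire of length L, width W, and height H; two squares of different size have the same resistance, R1 ≡ R2]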
Other Resistive Effects
(1) Contact resistance
• Extra resistance added by transition between routing layers
• Can be reduced by making the contact holes larger
• Current crowding puts an upper limit on the useful size of the contact
(3) Electromigration
• Limits the DC currents to 1mA/µm
Wire inductance
At switching frequencies in the GHz range, the wire inductance must be considered.
A changing current passing through an inductor generates a voltage drop:
$$\Delta v = L\,\frac{di}{dt}$$
On-chip inductance effects are:
• reflection of signals due to impedance mismatch
• inductive coupling between lines
• ringing effects
• switching noise due to L di/dt voltage drops
It is possible to compute the wire inductance directly from its geometry and its environment.
A simpler approximation is given by the following relation:
$$c\,l = \varepsilon\mu$$
where c is the capacitance per unit length, l the inductance per unit length, ε the electric permittivity and
µ the magnetic permeability of the surrounding dielectric.
Ex.: in a 0.25 µm technology, a 0.4 µm wide Al wire routed on top of the field oxide (SiO2) has
c = 92 aF/µm, l = 0.47 pH/µm
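The example values can be cross-checked with the relation cl = εµ. The sketch below assumes a relative permittivity of 3.9 for SiO2 and a non-magnetic dielectric (µ = µ0); both are assumptions of this illustration, not values stated on the slide.

```python
import math

EPS_0 = 8.854e-12          # vacuum permittivity in F/m
MU_0 = 4.0e-7 * math.pi    # vacuum permeability in H/m

def inductance_per_length(c_per_m, eps_r, mu_r=1.0):
    """Inductance per unit length (H/m) from c * l = eps * mu."""
    return eps_r * EPS_0 * mu_r * MU_0 / c_per_m

# c = 92 aF/um expressed in F/m; eps_r = 3.9 is an assumed value for SiO2
c_wire = 92e-18 / 1e-6
l_wire = inductance_per_length(c_wire, eps_r=3.9)

# 1 H/m corresponds to 1e6 pH/um
l_wire_ph_per_um = l_wire * 1e6
print(f"l = {l_wire_ph_per_um:.2f} pH/um")
```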
Example: Intel 0.25 micron Process
Conditions:
• resistive component of the wire is small
• consider only the capacitive component
• switching frequencies are in medium range
The wire still represents an equipotential region and does not introduce any delay
The distributed capacitance is lumped into a single capacitor
The only impact on performance:
• loading effect of Clumped on the driving gate
The Lumped RC Model
Metal wires of few mm length have a significant resistance and the equipotential assumption is
no longer adequate!
New model:
• Lumps the total resistance of the wire into a single resistor R
• Combines the global capacitance of the wire into a single capacitor C
The estimated wire delay: τ = RC
This model is pessimistic and inaccurate for long interconnect wires!
Assume that each node of the network is initially discharged and a step input is applied at t=0
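The Elmore delay at node i, for a network with N nodes, is given by:
$$\tau_{Di} = \sum_{k=1}^{N} C_k R_{ik}$$
where Rik is the resistance shared by the paths from the source to node i and from the source to node k.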
Ex.: τDi = R1C1 + R1C2 + (R1 + R3)C3 + (R1 + R3)C4 + (R1 + R3 + Ri)Ci
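The Elmore sum can be evaluated mechanically for any RC tree. The sketch below uses a hypothetical parent-pointer representation and assumes the tree topology implied by the example expression: R1 feeds node 1, R2 and R3 branch off node 1, and R4 and Ri branch off node 3. The unit component values are chosen only so that the result is easy to check by hand.

```python
def path_to_source(parent, node):
    """Nodes on the path from `node` back to the source (node itself included)."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def elmore_delay(parent, resistance, capacitance, target):
    """Elmore delay at `target`: sum over all nodes k of C_k * R_ik.

    resistance[n] is the resistor between node n and its parent;
    R_ik is the resistance shared by the source->target and source->k paths.
    """
    target_path = set(path_to_source(parent, target))
    delay = 0.0
    for k, c_k in capacitance.items():
        shared = sum(resistance[n] for n in path_to_source(parent, k)
                     if n in target_path)
        delay += c_k * shared
    return delay

# Assumed topology (parent of each node; None marks the node next to the source)
parent = {"1": None, "2": "1", "3": "1", "4": "3", "i": "3"}
resistance = {"1": 1.0, "2": 2.0, "3": 3.0, "4": 4.0, "i": 5.0}
capacitance = {"1": 1.0, "2": 1.0, "3": 1.0, "4": 1.0, "i": 1.0}

tau_i = elmore_delay(parent, resistance, capacitance, "i")
print(tau_i)  # R1*C1 + R1*C2 + (R1+R3)*C3 + (R1+R3)*C4 + (R1+R3+Ri)*Ci
```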
The RC Chain Model
RC chain - a special case of the RC-tree network:
[Figure: chain Vin - R1 - node 1 - R2 - node 2 - ... - Ri - node i - ... - node N (VN), with capacitor Ck from node k to ground]
$$\tau_{DN} = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j = \sum_{i=1}^{N} C_i R_{ii}$$
Ex.: τDi = C1R1 + C2(R1 + R2) + ... + Ci(R1 + ... + Ri)
Assume that a wire of length L is modeled by N equal-length segments, each having Ri = rL/N
and Ci = cL/N (r, c are resistance and capacitance per unit length):
$$\tau_{DN} = \left(\frac{L}{N}\right)^2 (rc + 2rc + \dots + Nrc) = rcL^2\,\frac{N(N+1)}{2N^2} = RC\,\frac{N+1}{2N}$$
For N large, the RC chain model approaches the distributed RC line model:
$$\tau_{DN} = \frac{RC}{2} = \frac{rcL^2}{2}$$
(1) The delay of a wire is a quadratic function of its length
(2) The delay of the RC chain model is 1/2 of the delay predicted by the lumped RC model!
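The convergence towards RC/2 can be observed by evaluating the chain sum for an increasing number of segments. This is a minimal sketch with normalized total resistance and capacitance (R = C = 1); the function name is hypothetical.

```python
def rc_chain_delay(r_total, c_total, n_segments):
    """Elmore delay at the end of an RC chain with n equal segments."""
    r_seg = r_total / n_segments
    c_seg = c_total / n_segments
    # Capacitor i sees the resistance of the first i segments
    return sum(c_seg * (i * r_seg) for i in range(1, n_segments + 1))

for n in (1, 2, 4, 10, 100, 1000):
    print(f"N = {n:4d}: tau = {rc_chain_delay(1.0, 1.0, n):.4f} * RC")
```

For N = 1 the result is the lumped RC value; for large N it approaches 0.5 RC, in line with the closed form RC(N+1)/(2N).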
$$\tau(out) = \frac{rcL^2}{2}$$
The Distributed RC Line Model (2)
Voltage range    Lumped RC network    Distributed RC network
0 → 63% (τ)      RC                   0.5RC
0 → 90%          2.3RC                RC
(2) A distributed RC model should be used if the rise (fall) time at the line input is smaller
than the rise (fall) time of the line.
tr < RC
Otherwise, a simple lumped C model suffices.
$$\frac{\partial^2 v}{\partial x^2} = lc\,\frac{\partial^2 v}{\partial t^2} = \frac{1}{\nu^2}\,\frac{\partial^2 v}{\partial t^2}$$
$\nu = \frac{1}{\sqrt{lc}}$ - propagation speed along the line
Lossless Transmission Lines Parameters (2)
Characteristic impedance: the impedance presented by the wire
$$Z_0 = \sqrt{\frac{l}{c}} = l\nu = \frac{1}{c\nu}$$
Z0 = 100 to 500 Ω for typical wires
The behavior of the transmission line is influenced by the termination of the line.
The termination determines how much of the wave is reflected upon arrival at the wire end:
$$\rho = \frac{V_{refl}}{V_{inc}} = \frac{I_{refl}}{I_{inc}} = \frac{R - Z_0}{R + Z_0}$$
ρ - Reflection coefficient
R - the termination resistance
R = Z0: ρ = 0
R = ∞: ρ = 1
R = 0: ρ = −1
Zs VSource Z0 VDest
Vin
ZL
VSource = (Z0/(Z0+Zs))Vin
ρs = (Zs-Z0)/(Zs+Z0)
Lattice Diagram
Vin = 5 V, RS = 5 Z0, RL = ∞
ρS = (Zs − Z0)/(Zs + Z0) = 0.66, ρD = 1
t = 0 ... tflight:
V1S = (Z0/(Z0 + Zs)) Vin = 0.83 V
V1D = V1S + Vr,1D; Vr,1D = ρD V1S = 0.83 V
V1D = 0.83 V + 0.83 V = 1.66 V
t = tflight ... 2 tflight:
V2S = V1S + Vr,1D + Vr,1S; Vr,1S = ρS Vr,1D = 0.55 V
V2S = 2.22 V
V2D = V1D + Vr,1S + Vr,2D; Vr,2D = ρD Vr,1S = 0.55 V
V2D = 2.77 V
...
Lattice diagram values (time increases downwards in steps of tflight = L/ν):
t/tflight   VSource     VDest       Waves added at this end
0           0.8333 V                + 0.8333
1                       1.6666 V    + 0.8333 + 0.8333
2           2.2222 V                + 0.8333 + 0.5556
3                       2.7778 V    + 0.5556 + 0.5556
4           3.1482 V                + 0.5556 + 0.3704
5                       3.5186 V    + 0.3704 + 0.3704
6           3.7655 V                + 0.3704 + 0.2469
7                       4.0124 V    + 0.2469 + 0.2469
Conclusion: in order to avoid ringing or slow propagation delay the transmission line
should be terminated both at the source (series termination) and at the destination (parallel
termination) with a resistance equal to Z0
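The lattice-diagram values for Vin = 5 V, RS = 5 Z0, RL = ∞ can be reproduced by following the travelling wave back and forth. The sketch below is a simple bookkeeping model for an ideal lossless line driven by a voltage step; the function name and data layout are hypothetical.

```python
def lattice(v_in, z0, r_source, r_load, round_trips):
    """Voltages at both line ends after each wave arrival (step input)."""
    rho_s = (r_source - z0) / (r_source + z0)
    rho_d = 1.0 if r_load == float("inf") else (r_load - z0) / (r_load + z0)

    wave = v_in * z0 / (z0 + r_source)    # initial wave launched into the line
    v_source, v_dest = wave, 0.0
    source_levels, dest_levels = [v_source], []

    for _ in range(round_trips):
        reflected = rho_d * wave          # wave arrives at the destination
        v_dest += wave + reflected
        dest_levels.append(v_dest)

        wave = rho_s * reflected          # reflection arrives back at the source
        v_source += reflected + wave
        source_levels.append(v_source)
    return source_levels, dest_levels

src, dst = lattice(v_in=5.0, z0=1.0, r_source=5.0, r_load=float("inf"),
                   round_trips=4)
print("VSource:", [round(v, 4) for v in src])
print("VDest:  ", [round(v, 4) for v in dst])
```

Setting r_source equal to z0 in this model makes the source reflection coefficient zero, so the destination settles after a single flight time.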
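Criteria:
• The rise (fall) time of the input signal, tr, must be smaller than the propagation delay through
the wire. Otherwise, a lumped model suffices.
$$t_r < \frac{t_{flight}}{2} = \frac{l_w}{2}\sqrt{lc}$$
• The wire resistance R (damping factor ξ) may not be too large; otherwise a distributed RC model
is sufficient.
$$R = r\,l_w < 2 Z_0 = 2\sqrt{\frac{l}{c}} \quad\text{or}\quad \xi = \frac{r\,l_w}{2}\sqrt{\frac{c}{l}} < 1$$
• In conclusion, inductance is important for wire lengths lw in the range
$$\frac{2 t_r}{\sqrt{lc}} < l_w < \frac{2}{r}\sqrt{\frac{l}{c}}$$
[Figure: wire length (cm) on logarithmic axes from 0.01 to 10.00; regions: 1. large input rise time, 2. high attenuation, 1. & 2., and the region where inductance is important]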
Scaling (2)
Influence of first-order scaling on MOS device
Parameter                                 Scaling Factor
Device parameters:
Length; L                                 1/α
Width; W                                  1/α
Gate oxide thickness; tox                 1/α
Junction depth; Xj                        1/α
Substrate doping; Na or Nd                α
Supply voltage; VDD                       1/α
Resultant influence:
Electric field across gate oxide; E       1
Depletion layer thickness; d              1/α
Parasitic capacitance; WL/tox             1/α
Gate delay; VC/I                          1/α
DC power dissipation; Ps                  1/α²
Dynamic power dissipation; Pd             1/α²
Power speed product                       1/α³
Gate area                                 1/α²
Power density; VI/A                       1
Current density; I/A                      α
Transconductance; gm                      1
Scaling (3)
Interconnect layer scaling
Parameter                                 Scaling Factor
Conductor line width; W                   1/α
Conductor line length; L                  1/α
Conductor line thickness; t               1/α
Line cross-section; A                     1/α²
Line resistance; r                        α
Line response time; rc                    1
Normalized line response time             α
Line voltage drop; Vd                     1
Normalized line voltage drop              α
Current density; J                        α
Normalized contact voltage drop; Vc/V     α²
The scaled line resistance is:
$$r' = \frac{\rho}{t/\alpha}\left[\frac{L/\alpha}{W/\alpha}\right] = \alpha r$$
The voltage drop along the scaled line is:
$$V_d' = (I/\alpha)(\alpha r) = I r = \text{const}$$
The scaled line response time is:
$$\tau_s' = (\alpha r)(C/\alpha) = rC = \text{const}$$
For a constant chip size, many of the signal paths do not scale down! Therefore:
• Voltage drops along the lines are larger by a factor of α than the scaled line voltage drop
• The line response time is larger by a factor of α than the scaled line response (see table)
Problems: distribution and organization of clocking signals, electromigration, the increase of
the wire capacitance (affects the gate delay)
Power Distribution
Process with 1 Level of metal :
• VDD and ground (VSS) are routed in interdigitated trees
• Crossunders are very difficult (low resistance interconnect)
Power distribution is much easier for technologies with 2 (or
more) levels of metal
Cautions:
• Parts of the chip that are likely to switch simultaneously
are routed separately!
• Separate power pins might be used for the
output drivers!
Clock and Timing Circles (1)
The clock
• synchronizes machine operations and data transfer
• is a global control technique that provides the “glue” for system operation
System level timing can be described using circular timing charts
Clock Generation Circuits (1)
2-phase clock generator with transmission gate delay
$$R_{TG} = \frac{1}{\beta_n (V_{DD} - V_{Tn}) + \beta_p (V_{DD} - |V_{Tp}|)}$$
Clock Drivers and Distribution Techniques (1)
Input Protection Circuits (1)
Excessive electrical charge on the gate of the MOS transistor can destroy the device!
Protection circuits drain this excessive charge and avoid static burnout!
$$C_g = C_{ox} W L, \qquad E_{ox} \approx \frac{V_G}{x_{ox}}, \qquad E_{BD} \sim 7.5 \cdot 10^{6}\ \text{V/cm}$$
If Eox>EBD, the oxide insulating properties break down and charge is transported through
the material - destruction of the device!
The max gate voltage VGmax is a relatively small number
Static electricity during handling could easily reach a few kV
Protection circuits allow for alternate charge flow paths when the input voltage is too large
Diode structures are very useful in this application because:
• have relatively low breakdown voltages which can be controlled
• reverse breakdown in a pn junction is non-destructive
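How small the maximum gate voltage is follows directly from Eox ≈ VG/xox with Eox = EBD. The sketch below evaluates this for a hypothetical oxide thickness of 10 nm; only the breakdown field of 7.5·10^6 V/cm is taken from the text.

```python
E_BD = 7.5e6  # oxide breakdown field in V/cm (value from the text)

def max_gate_voltage(x_ox_cm, e_bd=E_BD):
    """Largest gate voltage before the oxide field reaches E_BD."""
    return e_bd * x_ox_cm

# Hypothetical oxide thickness: 10 nm = 1e-6 cm
v_g_max = max_gate_voltage(1e-6)
print(f"VGmax = {v_g_max:.1f} V")
```

Compared with electrostatic voltages of a few kV during handling, a limit of a few volts shows why a protection circuit is needed at every input pad.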
Input protection circuits introduce parasitic RC time constants into the network!
Static Gate Sizing (1)
Problem - determine the values of Sj for j = 2, ... that minimize the total propagation delay
through the inverter chain
$$t_{D,j} = \left(\frac{R}{S_j}\right)\left(C_{o,j} + C_{i,j+1} + C_{w,j+1}\right) = \left(\frac{R}{S_j}\right)\left[S_j C_o + S_{j+1}(C_i + C_w)\right]$$
$$T_D = \sum_{j=1}^{N} \frac{R}{S_j}\left[S_j C_o + S_{j+1}(C_i + C_w)\right]$$
To minimize TD we differentiate with respect to Sj and look for zero slope points:
$$\frac{\partial T_D}{\partial S_j} = 0$$
This results in the recursion relation:
$$\frac{S_{j+1}}{S_j} = \frac{S_j}{S_{j-1}} \quad\text{for } j = 2, 3, \dots, N$$
If this is to hold for arbitrary values of j, then:
$$\frac{S_{j+1}}{S_j} = K = \text{const}$$
Forming the product:
$$\frac{S_2}{S_1}\cdot\frac{S_3}{S_2}\cdot\frac{S_4}{S_3}\cdots\frac{S_{N+1}}{S_N} = K^N = \frac{C_L}{C_i}$$
We obtain the scaling ratio in the form:
$$K = \left(\frac{C_L}{C_i}\right)^{1/N}$$
Static Gate Sizing (3)
Explicitly, the scaling factors are given by: S1 = 1, S2 = K, S3 = K^2, ..., SN = K^(N-1)
The minimum delay is then:
$$T_{D,min} = \sum_{j=1}^{N} R\left[C_o + K(C_i + C_w)\right] = N R\left[C_o + K(C_i + C_w)\right]$$
The equation K = Sj+1/Sj says that the minimum delay occurs when every stage has the
same individual delay time tD
The number of stages that optimizes the delay is obtained by differentiating TD (replacing K
with its N-dependent equation) with respect to N and setting the result to 0:
$$R C_o + R (C_i + C_w)\left(\frac{C_L}{C_i}\right)^{1/N}\left[1 - \frac{\ln(C_L/C_i)}{N}\right] = 0$$
If Co is small:
$$N = \ln\left(\frac{C_L}{C_i}\right)$$
N is chosen as the nearest integer for given values of Ci and CL
With
$$K^N = \frac{C_L}{C_i} \Leftrightarrow N \ln K = \ln\left(\frac{C_L}{C_i}\right) \Rightarrow N \ln K = N \Leftrightarrow \ln K = 1 \Leftrightarrow K = e^1 = e$$
the optimum scaling ratio equals e!
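The sizing procedure can be summarized in a few lines of code: choose N as the integer nearest to ln(CL/Ci), then compute K and the scaling factors Sj = K^(j-1). The sketch below assumes that Co is negligible, as in the derivation; the function name and the capacitance values are hypothetical.

```python
import math

def size_inverter_chain(c_load, c_in):
    """Return (N, K, [S1..SN]) for a tapered inverter chain, Co neglected."""
    n_stages = max(1, round(math.log(c_load / c_in)))
    k_ratio = (c_load / c_in) ** (1.0 / n_stages)
    scales = [k_ratio ** j for j in range(n_stages)]
    return n_stages, k_ratio, scales

# Hypothetical example: load 1000 times larger than the input capacitance
n_stages, k_ratio, scales = size_inverter_chain(c_load=10e-12, c_in=10e-15)

print(f"N = {n_stages}, K = {k_ratio:.3f}")
print("S_j:", [round(s, 2) for s in scales])
```

Because N must be an integer, the resulting K is close to, but generally not exactly, e.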
Double-Inverter Off-Chip Driver Circuit
The simplest off-chip driver circuit: an inverter chain designed to handle a large capacitive load
$$\left(\frac{W}{L}\right)_{n2} = \frac{C_{out}}{\tau_n\,k'_n\,(V_{DD} - V_{Tn})}$$
$$\left(\frac{W}{L}\right)_{p2} = \frac{C_{out}}{\tau_p\,k'_p\,(V_{DD} - |V_{Tp}|)}$$
Cout is large ⇒ Mn2 and Mp2 are large! ⇒ obtained using parallel connected transistors to aid in
layout and parasitic control
Mn1 and Mp1 can be sized using the previously presented sizing theory
The actual values of the fall and rise time can be estimated from:
⎡ 2VTn ⎛ 2(VDD − VTn ) ⎞⎤
⎡ 2 VTp
t LH = τ p ⎢ + ln⎜
(
⎛ 2 VDD − VTp ⎞⎤
− 1⎟⎥
)
t HL = τ n ⎢ + ln⎜⎜ − 1⎟⎟⎥ ⎜ ⎟⎥
⎣VDD − VTn ⎝ V0 ⎠⎦ ⎢VDD − VTp ⎝
V0
⎠⎦
⎣
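As a rough illustration of the sizing and timing formulas, the sketch below evaluates them for one set of values. All numbers are hypothetical, the function names are not from the lecture, and V0 is treated as a free parameter (here 10% of VDD) because the slide does not define it.

```python
import math

def aspect_ratio(c_out, tau, k_prime, vdd, vt):
    """(W/L) = C_out / (tau * k' * (VDD - |VT|))."""
    return c_out / (tau * k_prime * (vdd - abs(vt)))

def transition_time(tau, vdd, vt, v0):
    """t = tau * [2|VT| / (VDD - |VT|) + ln(2 (VDD - |VT|) / V0 - 1)]."""
    vt = abs(vt)
    return tau * (2 * vt / (vdd - vt) + math.log(2 * (vdd - vt) / v0 - 1))

# Hypothetical values: 10 pF load, 1 ns time constant, k' = 100 uA/V^2.
wl_n = aspect_ratio(c_out=10e-12, tau=1e-9, k_prime=100e-6, vdd=5.0, vt=1.0)
t_hl = transition_time(tau=1e-9, vdd=5.0, vt=1.0, v0=0.5)
print(round(wl_n, 1), t_hl)
```

The same two functions cover the PMOS device when the magnitude of VTp and the PMOS values of tau and k' are passed in.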
Example
Tri-State Off-Chip Driver Circuit
The input signal is split so that each output transistor is controlled individually
The high-impedance state is obtained by driving both NMOS and PMOS output devices into
cutoff
Normal operation:
Z = 1 ⇒ Mp1 and Mp2 off, Mn on
High-impedance state:
Z = 0 ⇒ Mp1 and Mp2 on, Mn off
⇒ Vp = VDD, Vn = 0
⇒ the output transistors are in cutoff
Packaging Technology (1)
Package types (numbers refer to the figure):
1. Bare die
2. Dual-In-line Package (DIP)
3. Pin Grid Array (PGA)
4. Small-outline IC
5. Quad flat pack
6. Plastic Leaded Package (PLCC)
7. Leadless carrier
Packaging Technology (3)
Example: parasitic effects of the bond-wire inductance
[Figure: SUPPLY connected to the CHIP via board wiring and bonding wire, with decoupling capacitor Cd]
Packaging Technology (5)
[Figure labels: substrate, die, pad, lead frame]
Packaging Technology (7)
1-b: Tape-automated bonding (TAB)
[Figure labels: sprocket hole, test pads, die, lead frame, substrate, polymer film]
• The die is attached to a metal lead frame that is printed on a polymer film
• The connection between chip pads and polymer film wires is made using solder bumps
• Highly automated process
• Improved electrical performance (L ≈ 0.5 nH, C ≈ 0.3 pF)
[Figure labels: die, solder bumps, interconnect layers, substrate]
• Flip the die upside-down and attach it directly to the substrate using solder bumps
• Superior electrical performance
• Pads can be placed at any position on the chip (not only on the die boundary)
• A possible solution for power and clock distribution problems
Packaging Technology (9)
11. CAD & Design Flow
[Figure labels: Moore's Law, ???, efficiency, platform-based design, schematic entry, layout editor]
Example for Complex Systems: Embedded SoC
Embedded "System-on-Chip"
Properties
• Potentially consisting of a large number of components
• Specialised to an application domain
• Reactive
• Real-time capability
Constraints
• Costs
• Power consumption
• Latency
• Required flexibility
Design Tasks
• Definition of a communication architecture which is adequate to the application's structure
• Mapping of the system specification onto available implementation components
[Figure labels: sensors, I/O module, microcontroller, memory, ASIC, DSP, RF transceiver, actuators]
[Figure labels: experiences, new requirements → feedback for future platform generations; applications, specific blocks, DSP core, CPU core, memory, bus, API, OS, drivers; product, cost analysis, validation, quality assurance; system level: design of system architecture, system integration; implementation level: HW/SW component implementation, HW and SW IP database]
Hardware/Software Co-Design
Specification
Co-Simulation
HW/SW-Partitioning
Communication Synth.
HW-Specification SW-Specification
Synthesis Compilation
Placement/Routing Real-Time OS
O.k., let‘s go
bottom-up now Heterogeneous HW-/SW-System
Classes of CAD Tools
• Design Entry:
– Graphical Editor (drawing schematic diagrams, physical layout, stick
layout diagrams, ...)
– Language based circuit capture tools (for hardware description
languages like VHDL, Verilog, EDIF)
• Design Validation:
– Physical design verification tools (design rule checker, extractor,
LVS, schematic and electrical rule checker)
– Design Simulation:
• analog simulation: circuit level; behavioural level
• digital simulations: circuit level, switch level, logic level, register transfer
level, architectural level, behavioural level;
• thermal simulation: displaying heat dissipation on chip
– Formal Verification Methods
• Design Implementation:
– Layout Compilers (stick2layout, macrocell generators, datapath
compilers)
– Layout Structuring & Optimization:
• Layout Compaction
• Placement and Routing
– Logic Synthesis
– Finite State Machine (FSM) Synthesis
– Architectural Synthesis
Full Custom Design: Design Entry
Full Custom Design
• The layout is specified in textual form giving either the position and layer of rectangles
(similar to hand crafted layout) or lines (as in stick diagrams).
Full Custom Design: Design Entry
B x y dx dy   Box with length dx, width dy, and lower left-hand corner placed at (x,y)
L n           Layout level (layer) for the box definitions that follow
M n           Start of macro definition n
E             End of macro definition
C n x y m     Call for macro number n with translation x,y and orientation m
Q             End of layout file
1 n-diffusion n-diffusion
2 p-diffusion ion implant
3 polysilicon polysilicon
4 metal metal
5 contact contact
8 n-well --
9 overglass overglass
Cell Orientations:
Orientation   Description
1 no rotation
2 rotate 90° counterclockwise
3 rotate 180° counterclockwise
4 rotate 270° counterclockwise
5 mirror about y-axis
6 rotate 90° counterclockwise and mirror about y-axis
7 rotate 180° counterclockwise and mirror about y-axis
8 rotate 270° counterclockwise and mirror about y-axis
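The eight orientation codes can be expressed as a small coordinate transform. This is a sketch under two assumptions that the table does not state explicitly: rotations are taken about the origin, and for codes 6 to 8 the rotation is applied before the mirroring. The function name `orient` is hypothetical.

```python
def orient(point, m):
    """Apply orientation code m (1..8) to a point (x, y)."""
    if m not in range(1, 9):
        raise ValueError("orientation code must be 1..8")
    x, y = point
    # Codes 1-4 and 5-8 use 0, 1, 2, 3 quarter turns counterclockwise.
    for _ in range((m - 1) % 4):
        x, y = -y, x              # rotate 90 degrees counterclockwise
    if m >= 5:
        x = -x                    # mirror about the y-axis
    return x, y

# Example: the point (1, 2) under every orientation code.
for code in range(1, 9):
    print(code, orient((1, 2), code))
```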
Full Custom Design: Design Entry
[Figure: full custom design flow. Blocks: schematic entry; stick2layout converter and compactor; layout editor; floorplanning, placement & routing; mask layout data; simulation netlist extraction and simulation (SPICE); design analysis (DRC, ERC, circuit extraction, LVS)]
Cell-based design approaches rely on layout components predefined and provided
by a silicon foundry. Several implementation styles can be distinguished:
• Standard Cells:
– layout blocks predefined by the silicon foundry
– the full process sequence (all mask layers) is required for chip fabrication
• Gate Arrays (discussed later in this lecture):
– Linear Gate Arrays:
• pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• fixed-size interconnect areas (channels)
– Sea-of-Gates Arrays:
• pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• variable-size interconnect areas (channels) over unused transistors
[Figure: cell-based design flow. Blocks: schematic entry (graphical data); cell library (simulation models, layout data); simulation netlist extraction; logic simulation, fault simulation, timing analysis, test pattern generation; placement (standard cells, macro cells, I/O cells); routing (channel generation, global routing, detailed routing); place & route optimization with backannotation of parasitic wire capacitances / delay; mask layout data; design analysis (DRC, ERC, circuit extraction, LVS)]
Standard Cell Full Custom Design
Design Verification
Physical Design Rule Check:
• Minimum width
• Minimum spacing
• Overlapping
• Extension
Design Verification
Extraction:
• Circuit Level Extraction: can be used to create a netlist for circuit level simulations
(e.g. SPICE). The netlist consists of MOS transistors (including geometrical
parameters such as W/L, and parasitic capacitances), resistors, capacitances, diodes, ...
• Switch Level Extraction: can be used to create a netlist which can be processed by a
switch level simulator. The resulting netlist consists of MOS transistors and parasitic
capacitances (to model storage effects in MOS circuits).
• Parasitics Extraction: is used in conjunction with cell based design techniques. Since
wire delay is dependent on the parasitic capacitance of a wire, parasitic capacitances of
nets and input capacitances of other gates connected to an output can be used to
estimate the extrinsic delays (Note: intrinsic delays [i.e. the delay of unloaded gates] are
fetched from the cell library's simulation model data).
Design Verification
LVS:
The layout-versus-schematic (LVS) comparison tool checks the equivalence of the layout and its schematic.
The tool can be used to find wrong connections or parameter mismatch (as W/L of transistors, ...) between
a schematic and its physical layout representation.
To verify schematics used e.g. in cell based designs, a schematic rulechecker can find schematic rule
violations (like the following examples):
• Warnings:
• unconnected (floating) wire segments
• open outputs
• exceeded fanout
• Errors:
• open inputs (undefined input value!)
• number of bits differ for 2 buses connected together
• number of input/output pins in a schematic differs from its symbol representation ( --> pins are
not accessible / not present at higher levels of schematic hierarchy)
• more than one active driver connected to a net at the same time
Simulation
Goal of Simulation:
Simulator Classification:
Simulation: Models
Signal Modelling:
• values which exist in real circuits (0, 1, high impedance, oscillation, ...)
• values which exist only in the simulator (unknown, transition, ...)
• boolean logic set not sufficient
3-valued Logic:
logic zero = 0
logic one = 1
unknown = U
Example:
AND | 0  1  U
----+---------
 0  | 0  0  0
 1  | 0  1  U
 U  | 0  U  U
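The AND table of the 3-valued logic can be written as a short function. This is a minimal sketch; representing the values as the characters '0', '1' and 'U' is an assumption made for illustration.

```python
def and3(a, b):
    """AND in 3-valued logic over the values '0', '1' and 'U' (unknown)."""
    if a == '0' or b == '0':
        return '0'        # a controlling 0 decides the output
    if a == '1' and b == '1':
        return '1'
    return 'U'            # at least one unknown and no controlling 0

# Print the full table, row by row.
for a in '01U':
    print(a, [and3(a, b) for b in '01U'])
```

Note that a 0 on either input forces the output to 0 even if the other input is unknown.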
Problems:
Simulation: Models
Timing Models:
Simulation: Models
Advanced Logic Simulators:
• Introduction of signal strength additional to logic values for driver and bus modelling
Simulation
www.modelsim.com
Simulation: Techniques
Simulation Techniques:
• Compiler-driven technique:
– Problems:
• Feedbacks
• Sorting of gate netlist
• Zero delay model
• Entire circuit is simulated
Switch-Level Simulation:
[Figure: MOS transistor (gate, drain, source) modelled as a resistance REFF between drain and source]
Logic (Gate) | n-Channel Enhancement | p-Channel Enhancement | Depletion
1            | REFF                  | infinity              | REFF
0            | infinity              | REFF                  | REFF
X            | [REFF, infinity]      | [REFF, infinity]      | REFF
Remarks:
• In the linear model, node capacitance and device resistance are used to compute output
logic levels and transition time
• Ratio errors can be detected
Executable Specifications: VHDL
VHDL: Very High Speed Integrated Circuit Hardware Description Language
begin
  delay_register:
  process(reset, clk)
  begin
    if reset = '1' then
      x_q <= (others => '0');
    elsif (clk'event and clk = '1') then
      x_q <= x_in;
    end if;
  end process;

[Figure: design flow from the VHDL description via RTL synthesis (Synopsys) to a gate-level netlist, then placement & routing (Cadence/Mentor) to the ASIC layout, then production]
Future Outlook: Networks-on-Chip
– Regular platform integrating independent subsystems
• combine structures of today's SoC complexity
– Separation between communication and computation
[Figure labels: µP, ASIC, FPGA, MEM, generic interface, router, high-speed interconnect]
Co-Simulation Implementation
HW/SW-Partitioning
SW Library HW Library
Communication Synth.
NoC Mapping
Dynamic
Allocation/Re-
HW-Specification SW-Specification Mapping during
NoC Placement
Operation
Synthesis Compilation
Placement/Routing Real-Time OS
Heterogeneous HW-/SW-System
Application Scenario: Mobile Video Terminal
Different Configurations for:
• High Quality (Resolution) Downstreaming
• Low-Power Mode (Quality Reduction)
• Image Compression and Upstreaming
• Multi-Stream Modes
DISPLAY
Displ.
CTRL
12. Digital Subsystem Design
Weinberger Structuring
Weinberger Structuring (2)
Weinberger structuring:
3-to-8 decoder (2)
Example 2
F =U +V +W + X +Y
Example 2 (2)
Example 2 (3)
Gate matrix layout is a character based layout style for custom CMOS
circuitry. It is a regular design style employing a matrix of intersecting
transistor diffusion rows and poly-silicon columns such that intersections
are potential transistor sites.
Creating a gate matrix: start from a representational line drawing or stick figure
using the available levels of interconnection (e.g. in poly-silicon gate
technology: poly-silicon, metal, diffusion).
– First draw a series of parallel poly lines corresponding to the
number of inputs to the circuit (there may be more if an output is chosen to
be poly-silicon)
– Subsequent transistor placements are determined by two factors: the
input column and the serial or parallel association among transistors
– After row definition, further interconnections may be made with horizontal
and vertical metal interconnection tracks
– Final improvements
Gate matrix layout (2)
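$C = AB = \overline{\overline{AB}}$
$S = A\overline{B} + \overline{A}B = (\overline{A} + \overline{B})\,B + (\overline{A} + \overline{B})\,A$
$\;\; = \overline{AB}\,B + \overline{AB}\,A = \overline{\overline{\overline{AB}\,B} \cdot \overline{\overline{AB}\,A}}$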
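The half adder equations lead to a realization that uses only NAND gates. The sketch below checks that realization against sum = A xor B and carry = A and B for all input combinations; the helper names are hypothetical, and the gate structure is the one implied by the complemented form of the equations (overbars were lost in the slide text, so this reading is an assumption).

```python
from itertools import product

def nand(a, b):
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)

def half_adder_nand(a, b):
    """Half adder built only from NAND gates; returns (sum, carry)."""
    n1 = nand(a, b)                        # NOT(AB), shared by both outputs
    s = nand(nand(n1, b), nand(n1, a))     # NOT(NOT(NOT(AB)*B) * NOT(NOT(AB)*A))
    c = nand(n1, n1)                       # NOT(NOT(AB)) = AB
    return s, c

# Exhaustive check against the reference functions.
for a, b in product((0, 1), repeat=2):
    print(a, b, half_adder_nand(a, b))
```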
Half adder realizations
N n-channel transistor
P p-channel transistor
+ metal-poly or metal-diffusion crossover
* contact
| poly-silicon or n-diffusion wire
! p-diffusion wire
: vertical metal
- horizontal metal
Character definitions (cont.)
Rules
The following rules summarize the gate-matrix technique:
– Poly-silicon runs only in one direction and is of constant width and pitch
– Diffusion wires (of constant width) may run vertically between poly-silicon
columns.
– Metal may run horizontally and vertically. Any pitch departures from a
minimum (e.g. power rails) are manually specified.
– Transistors can only exist on poly-silicon columns.
Wide transistors may be specified by abutting two or more N or P
symbols.
Summary of gate matrix properties
optimal
EXOR implementation
Complex gates (2)
In the following, the consideration is limited to AND/OR networks realized in
complex gate CMOS by means of series/parallel connections of transistors. The
topology of the NMOS network and the PMOS network are assumed to be dual.
The delay of a complex CMOS cell mainly depends on the maximum number of
series transistors between VDD or VSS and the cell output, which is called level
of the complex cell. This quantity has a direct influence on the charging or
discharging resistance of the cell. Generally, cells with less than four levels are
desirable. The number of cells with parallel/serial topology is given by the
following table:
Basic layout strategy
Optimized layout
An even more sophisticated layout arrangement which reduces the
required area is shown in (b)
Optimal layout
The best layout is achieved by the following transistor arrangement,
logically equivalent to the previous figures:
Graph theoretical algorithm
The p-side and the n-side of the circuit can be modelled as graphs,
defined as:
$G_P = (V_P, E_P)$: p-side network
$G_N = (V_N, E_N)$: n-side network
Graph properties:
– the graphs are series/parallel graphs (CMOS complex gate
property/assumption)
– every source/drain potential is represented by a vertex V
– every transistor is represented by an edge E, connecting the vertices
representing source and drain
– edges are labeled by the corresponding transistor gate input signal
– GP and GN are dual
If there exist Euler paths for GN and GP then all transistors can be chained
by diffusion areas. Otherwise the graphs have to be partitioned into
sub-graphs which have Euler paths.
It is necessary to find a pair of paths for GP and GN with the same
sequence of labels, because p- and n-type transistors corresponding to
the same input have to be positioned at the same horizontal position
(poly line).
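The search for a common label sequence can be illustrated on a small gate. The sketch below enumerates all Euler paths of each graph by exhaustive search (only feasible for small cells) and intersects the resulting label sequences. The example gate F = NOT(A·B + C), the node names and the function name are assumptions chosen for illustration.

```python
def euler_label_sequences(edges):
    """All label sequences of Euler paths in a small multigraph.

    edges: list of (vertex_u, vertex_v, label); each edge is one transistor.
    """
    vertices = {v for u, w, _ in edges for v in (u, w)}
    found = set()

    def walk(vertex, used, labels):
        if len(labels) == len(edges):          # every edge used exactly once
            found.add(tuple(labels))
            return
        for i, (u, w, label) in enumerate(edges):
            if i in used:
                continue
            if u == vertex:
                walk(w, used | {i}, labels + [label])
            elif w == vertex:
                walk(u, used | {i}, labels + [label])

    for start in vertices:
        walk(start, frozenset(), [])
    return found

# n-side: A and B in series, in parallel with C (between output and ground).
g_n = [("out", "n1", "A"), ("n1", "gnd", "B"), ("out", "gnd", "C")]
# p-side (dual): A parallel to B, in series with C (between VDD and output).
g_p = [("vdd", "p1", "A"), ("vdd", "p1", "B"), ("p1", "out", "C")]

common = sorted(euler_label_sequences(g_n) & euler_label_sequences(g_p))
print(common)   # each entry is a gate ordering without diffusion breaks
```

Any sequence in `common` gives a poly-line ordering for which both transistor rows can be chained by diffusion without a separation.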
Graph theoretical algorithm (3)
General algorithm:
– enumerate all possible decompositions of the graph model to find the
minimum number of Euler paths that cover the graph
– chain the gates by means of a diffusion area according to the order of the
edges in each Euler path and
– if more than two Euler paths are necessary to cover the graph model,
then provide a separation area between each pair of chains
Result: Search of minimal number of Euler paths is NP-complete.
Problem reduction:
An odd number of series or parallel edges can be reduced to a single edge:
Problem reduction
Problem reduction (2)
If there are gates in the logic diagram with an even number of inputs, additional
“pseudo” inputs have to be introduced in order to guarantee an odd number of
inputs. It is guaranteed by the second previously given theorem, that there exists
an Euler path for this modified problem. But the pseudo edges in the Euler path
have to be removed afterwards and then they can cause diffusion separations.
An algorithm for minimizing separations caused by pseudo edges is given in the
next section ( minimal interlace of normal and pseudo inputs).
Application of reduction rule
This heuristic algorithm does not necessarily give the optimal layout, but if
the resulting sequence has no separation areas, it is the real optimal
solution.
Algorithm for calculating minimal interlace
[Flowchart; the figure also shows an example of a line. Decisions in order, each with a Yes and a No branch:]
start
1. Any white triangle left? Yes: put it in the line.
2. Any black-white triangle left? Yes: put it in the line, and set the white part on top.
3. Any black triangle left? Yes: put it in the line.
4. Any black-white triangle left? Yes: put it in the line, and set the black part on top.
5. Any white triangle left? No: stop.
Example: carry look-ahead
This topology
does have Euler path!
Comparison of space
Example: synchronous counter
Programmable Logic Arrays (2)
Architectures (1)
Architectures (2)
Example (1)
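• PROM implementation realizes all of the 8 product terms

x0 x1 x2 | z0 z1
 0  0  0 |  1  1
 0  0  1 |  1  1
 0  1  0 |  0  0
 0  1  1 |  0  0
 1  0  0 |  0  0
 1  0  1 |  0  0
 1  1  0 |  1  0
 1  1  1 |  0  1

$z_0 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 \overline{x_2} = \overline{x_0}\,\overline{x_1} + x_0 x_1 \overline{x_2}$
$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 x_2 = \overline{x_0}\,\overline{x_1} + x_0 x_1 x_2$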
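The minimized sum-of-products forms can be cross-checked against the truth table of the example. In this sketch the complemented literals are written out explicitly; their placement was inferred from the truth table, since the overbars did not survive in the slide text. The names `TRUTH`, `z0` and `z1` are chosen for illustration.

```python
# Truth table of the example: (x0, x1, x2) -> (z0, z1)
TRUTH = {
    (0, 0, 0): (1, 1), (0, 0, 1): (1, 1),
    (0, 1, 0): (0, 0), (0, 1, 1): (0, 0),
    (1, 0, 0): (0, 0), (1, 0, 1): (0, 0),
    (1, 1, 0): (1, 0), (1, 1, 1): (0, 1),
}

def z0(x0, x1, x2):
    """z0 = (not x0)(not x1) + x0 x1 (not x2)."""
    return int((not x0 and not x1) or (x0 and x1 and not x2))

def z1(x0, x1, x2):
    """z1 = (not x0)(not x1) + x0 x1 x2."""
    return int((not x0 and not x1) or (x0 and x1 and x2))

mismatches = [x for x, z in TRUTH.items() if (z0(*x), z1(*x)) != z]
print("mismatches:", mismatches)
```

A PLA needs only the three distinct product terms of the minimized forms, while the PROM stores all 8 input combinations.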
Example (2)
$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 x_2 = \overline{x_0}\,\overline{x_1} + x_0 x_1 x_2$
Floor Plan for PLA
A AND plane programming cell
O OR plane programming cell
AO AND-OR communication cell
IN AND plane input cell
OUT OR plane output cell
LA left AND plane cell
RO right OR plane cell
BL bottom left cell
BM bottom middle cell
BR bottom right cell
TL top left cell
TA top AND cell
TM top middle cell
TO top OR cell
TR top right cell
[Figure: PLA generic floor plan]
INV-NOR-NOR-INV Structure (1)
Example:
General structure:
INV-NOR-NOR-INV Structure (3)
Properties:
• high static power dissipation
• small area
• useful if high speed is not required
INV-NOR-NOR-INV Structure (5)
NAND-NAND Structure (1)
Example:
Properties:
• NAND-NAND approach not recommended:
• decreasing performance at increasing number of inputs (because
of series connection of nMOS transistors)
• high static power dissipation
Static CMOS PLA (1)
Properties:
• no static power dissipation
• area increase becomes unacceptable for large PLAs
• working fast
Static CMOS PLA Layout
Dynamic CMOS PLA (2)
Noise in PLA circuits (1)
Optimization of PLAs – Logic Minimization
Row-folded PLA
Column-folded PLA
Optimization of PLAs – Multi Sided Access
• Delay is determined by
– (W/L) of the AND/OR load
– (W/L) of the AND/OR cells
• Minimum Delay:
– large load current Iload
– (W/L)ORplane = e*(W/L)ANDplane
• Limitations:
– Iload limited by:
• the total power of the PLA
• the internal logical ‘0’: (I * RnMOS = ‘0’) < VT !
– the stage sizing factor e for successive stages can not always be
realized due to the floorplan
Automatic PLA Layout Generation (1)
logical optimization
AND_BEGIN
P1 := I1 * I2;
P2 := I1 * I3;
P3 := I2 * I3;
P4 := I1 * I2' * I3';
P5 := I1' * I2 * I3';
P6 := I1' * I2' * I3;
P7 := I1 * I2 * I3;
AND_END
OR_BEGIN
O1 := P1 + P2 + P3;
O2 := P4 + P5 + P6 + P7;
OR_END

Resulting personality matrix (one row per product term):
     I1 I2 I3 | O1 O2
P1    1  1  X |  1  0
P2    1  X  1 |  1  0
P3    X  1  1 |  1  0
P4    1  0  0 |  0  1
P5    0  1  0 |  0  1
P6    0  0  1 |  0  1
P7    1  1  1 |  0  1
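The AND/OR description above (product terms P1 to P7, outputs O1 and O2) can be evaluated directly from its personality matrix. The sketch below is an illustration, not the generator's actual data format: encoding the rows as strings and the names `AND_PLANE`, `OR_PLANE` and `evaluate` are assumptions. For these equations O1 behaves as the carry and O2 as the sum of a full adder.

```python
from itertools import product

# One row per product term P1..P7: '1' = true literal, '0' = complemented, 'X' = not used.
AND_PLANE = ["11X", "1X1", "X11", "100", "010", "001", "111"]
# One row per product term: which outputs (O1, O2) the term is connected to.
OR_PLANE = ["10", "10", "10", "01", "01", "01", "01"]

def evaluate(inputs):
    """Evaluate the PLA for inputs (I1, I2, I3); returns (O1, O2)."""
    products = [
        all(c == "X" or int(c) == bit for c, bit in zip(row, inputs))
        for row in AND_PLANE
    ]
    return tuple(
        int(any(p and row[k] == "1" for p, row in zip(products, OR_PLANE)))
        for k in range(2)
    )

for bits in product((0, 1), repeat=3):
    print(bits, evaluate(bits))
```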
13. ASIC Design Concepts:
Gate Arrays
Cost Issues
• Design Costs
• Non-recurring Engineering Costs (NRE)
• Manufacturing Costs
Design Costs + NRE Costs = Fixed Costs
[Figure: plots of total costs and of costs per chip]
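The relation between fixed costs, total costs and costs per chip can be sketched with a simple linear model. The slide only names the cost components, so the model below (constant manufacturing cost per chip, fixed costs spread over the production volume) and all numbers are assumptions for illustration.

```python
def total_cost(volume, fixed_cost, unit_cost):
    """Fixed costs (design + NRE) plus manufacturing cost for all chips."""
    return fixed_cost + volume * unit_cost

def cost_per_chip(volume, fixed_cost, unit_cost):
    """Total cost divided by the number of chips produced."""
    return total_cost(volume, fixed_cost, unit_cost) / volume

# Hypothetical numbers: 100000 fixed costs, 5 per manufactured chip.
for volume in (1_000, 10_000, 100_000):
    print(volume, cost_per_chip(volume, fixed_cost=100_000.0, unit_cost=5.0))
```

In this model the cost per chip approaches the manufacturing cost as the volume grows, which is why the fixed-cost share dominates at low volumes.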
Cost Issues: Design Costs
Synthesis:
• High-level Synthesis (allocation, scheduling, binding)
• Logic Synthesis (RTL to logic translation, FSM synthesis, logic optimisation, retiming)
• Layout Synthesis (module generators, PLA generators, Place & Route)
ASIC
Cell-based Array-based
Gate Arrays – Introduction (1)
Gate Arrays – Introduction (3)
IMI Grid Structure (2)
From the figure on the previous slide the following features can be
seen:
• Cells containing transistors are clustered around the VDD and VSS
buses
• In each cell four horizontal bars (crossing VDD and VSS) can be
seen. The thick bar represents a poly underpass while the three
thin bars are common poly input lines to an nMOS/pMOS
transistor pair
• Between cell columns a column of short horizontal poly
underpasses is placed
IMI Grid Structure (6)
IMI Grid Structure (8)
IMI Grid Structure (10)
Personalization of
IMI and CDI gate
arrays for an
inverter :
a) schematic
b) IMI layout
c) IMI layout
d) CDI layout
Personalization Examples (2)
Qualification of Gate Array Design Style
• Advantages:
– Lower number of individual masks needed
– Higher number of pieces for uncustomized master (cost reduction)
– Many others for masters, second source fabrication, libraries and
design systems
• Disadvantages:
– Area overhead (by unused transistor cells)
– Overdimensioned routing channels
– Larger cell size
14. Programmable Logic Devices
Overview
• Introduction
• Programming Technologies
• Basic Programmable Logic Device (PLD) Concepts
• Complex PLD
• Field Programmable Gate Array (FPGA)
• CAD (Computer Aided Design) for FPGAs
• Design flow for Xilinx FPGAs
• Economical Considerations
• Logic design Alternatives
Introduction
Programming Technologies
Programmable logic devices can be programmed in two ways:
1. Mask programming (in a few cases)
2. Field programming (typical)
1.) Mask programming: the device is programmed at the mask level.
+ good timing performance due to internal connections hardwired during
manufacture
+ cheap at high volume production
- programmed by manufacturer
- development cycle = weeks or months
- not re-programmable
Programming Technologies (II)
2.) Field programming: programming of the device is done by the user. The
programming technologies are of two types.
Basic PLD Concepts (II)
2.) Memory based: Device with fixed AND array and programmable OR array
• output of OR gate has fixed connection with input of AND gates
• PROM, EPROM and EEPROM are memory-based PLD devices
• GAL:
- has an array of programmable AND gates and OLMCs (Output
Logic Macro Cells)
- EEPROM-based programming technology
- programmable output polarity
- device can be configured in dedicated input and output modes
Figure 2:
Combinational PAL
device, AMD PAL16L8
Figure 3:
Sequential PAL devices,
AMD PAL16R8
Figure 4:
Arithmetic PAL
device, AMD
PAL16A4
• GAL16V8 has 8 configurable OLMCs (Output Logic Macro Cells)
• each OLMC has a programmable XOR to get an active-low or active-high output signal
• there is a feedback from output to input
Complex PLD (CPLD)
• is a combination of multiple PAL- or GAL-type devices on a single chip
• CPLD architectures consist of
- Macrocells
- configurable flip-flop (D, T, JK or SR)
- Output enable/clock select
- Feedback select
• CPLDs have predictable time delay because of hierarchical interconnection
• easy to route, very fast turnaround
• performance independent of netlist
• devices are erasable and programmable with non-volatile EPROM or
EEPROM configuration
• wide designer acceptance
• have more logic density than any classical PLD device
• relatively mature technology, but some innovation still ongoing
Figure 6:
Complex PLD device
Altera EP1800
Erasable CPLD
• EP1800 is an erasable PLD device and has 48 macrocells, 16 dedicated
input pins and 48 I/O pins
• the device is divided into four quadrants; each contains 12 macrocells and
has a local bus with 24 lines and a local clock
• out of the 12 macrocells, 8 are “local” macrocells and 4 are “global”
macrocells
Electrically Erasable PLD (III)
•“The routing resources are both the greatest strength and weakness
of the FPGA’s”
Field Programmable Gate Array (II)
a) pass transistor   b) transmission gate   c) multiplexer
Figure 16: SRAM based programming technology
Anti-fuse Programming
Figure 20:
EEPROM programming
technology
Commercially available FPGAs
Xilinx FPGA
• The Xilinx architecture comprises a two-dimensional array of logic blocks called CLBs.
• They are interconnected via horizontal and vertical routing channels.
• I/O blocks are user-configurable to provide an interface between the external package pins
and the internal logic.
Xilinx FPGA (IV)
• The interconnects of the XC4000 device are arranged in horizontal and vertical channels.
• Each channel contains a number of wire segments of the following types:
Single length lines:
• they span a single CLB
• provide the highest interconnect flexibility and offer fast routing
• incur a delay whenever the line passes through a switch matrix
• they are not suitable for routing signals over long distances
Double length lines:
• they span two CLBs, so that each line is twice as long as a single length line
• provide faster signal routing over intermediate distances
Longlines:
• Longlines form a grid of metal interconnect segments that run the entire length or width of the array
• they are intended for high fan-out nets and nets with critical delay
Xilinx, Virtex-II Pro™ FPGA family (II)
• The Virtex-II Pro FPGA consists
of the following components:
Xilinx, Virtex-II Pro™ FPGA family (IV)
• IOB blocks include six storage elements, as shown in the figure.
• Each storage element can be configured either as an edge-triggered D-type flip-flop or as a level-sensitive latch.
• On the input, output, and 3-state path, one or two DDR (Double Data Rate) registers can be used.
• Double data rate is directly accomplished by the two registers on each path, clocked by the rising edges (or falling edges) from two different clock nets.
Actel/TI FPGA architecture (II)
Act-1 Logic Module:
• The Act-1 logic module is a logic circuit with 8 inputs and 1 output
• it contains only combinational logic
• The logic module can implement the four basic functions NAND, AND, NOR and OR
C module
Figure 33: Design flow for FPGA: Logic Optimization → Technology Mapping → Placement → Routing → Programming Unit → Configured FPGA
Design flow for Xilinx FPGA
• Design Entry
• Design validation
• Device Selection
• DESIGN IMPLEMENTATION: Mapping, Placement, Routing (with design validation)
• Design validation / Back Annotation
FPGA
1. Cost per chip is lower for low volumes (low fixed cost)
2. Short turnaround time
3. Design flexibility is high and the cost of re-designing is low
4. Speed is relatively slow because of the resistance and capacitance of the programmable switches
5. Programmable switches and the configuration network require chip area, which results in a decreased logic density
MPGA
1. Lower cost per chip for high volumes
2. Fabrication is done with a hardwired metal connection layer, which results in fast operation
3. High logic density
4. Very high costs for low volumes (high fixed cost)
5. No redesign flexibility
Logic design Alternatives
CPLDs and FPGAs
Chapter 15
Arithmetic Units
In the following chapter, basic arithmetic units like adders, subtracters, or multipliers are
discussed. These components are widely used in VLSI circuits e. g. for the digital signal
processing application domain. More detailed descriptions on arithmetic units can be found
e. g. in [4] or [1].
A circuit realizing the equations
C = A1 A2 (15.1)
S = A1 ⊕ A2 (15.2)
is called a half–adder and can be used to calculate the sum S of two bits A1 and A2 . A possible carry is set at the C output.
Full Adder For adding binary numbers with a bitwidth of more than a single bit, the concept of the half–adder has to be extended. The carry output of the less significant bits in the addition process has to be taken into account in the more significant bits. For that, a new circuit structure called full–adder is used, which is based on the following functional equations:
These equations can be realized either by logic gates (AND, OR, XOR) or by two half–adders
and an OR gate.
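The realization by two half–adders and an OR gate can be sketched in a few lines of Python. The sketch assumes the standard full–adder behaviour (for single-bit inputs, 2·Cout + S equals the sum of the three input bits); the names `half_adder`, `full_adder` and `cin` are illustrative and not taken from the script.

```python
def half_adder(a1, a2):
    """Half-adder per Eqs. (15.1) and (15.2): returns (carry, sum)."""
    return a1 & a2, a1 ^ a2


def full_adder(a1, a2, cin):
    """Full-adder built from two half-adders and an OR gate.

    Assumption: standard full-adder behaviour, i.e. 2*cout + s
    equals a1 + a2 + cin for single-bit inputs.
    """
    c1, s1 = half_adder(a1, a2)   # add the two operand bits
    c2, s = half_adder(s1, cin)   # add the incoming carry
    return c1 | c2, s             # c1 and c2 are never set together
```

Checking all eight input combinations against integer addition is enough to validate such a model.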
Adders / Subtracters
The following section introduces the basic arithmetic components used in VLSI designs. First, adder and subtracter architectures are discussed. Since addition and subtraction of binary numbers can be calculated by almost the same hardware (by selecting the appropriate complement representation first), the term “adder” is used as a synonym for both adder and subtracter in the following section.
Serial Adders
[Figure 15.1: Serial adder with operand shift registers, full–adder, carry register and sum shift register (outputs Sum and Cout)]
At the beginning of the operation, the two n–bit operands A and B are loaded into the shift registers. The carry register is cleared or set to the value of the carry input. During the next n clock cycles (assuming a wordlength of n bits for each operand), the operands are added bitwise in the full–adder and the result is stored in the sum register. For that, the operand shift registers apply their least significant bit to the full–adder inputs, whereas the sum shift register reads the current sum output of the full–adder at its serial input and shifts its contents by one bit to the right each clock cycle. The carry output of an addition is stored in the carry register for use in the next clock cycle. The n-bit sum and the carry output are available after (n+1) clock cycles [1 operand load, n calculation].
The serial adder has the smallest hardware complexity, which is independent of the wordlength (if the shift registers are not considered), but requires the highest computation time of all adder implementations.
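The cycle-by-cycle behaviour described above can be modelled as follows. This is a behavioural sketch only; the function name `serial_add` and the cycle counter are illustrative assumptions, and the registers are represented by plain integers.

```python
def serial_add(a, b, n, cin=0):
    """Bit-serial addition of two n-bit operands.

    Returns (sum, carry_out, clock_cycles).
    """
    cycles = 1                # one cycle to load the operand registers
    carry = cin               # carry register preset to the carry input
    sum_reg = 0
    for _ in range(n):
        abit, bbit = a & 1, b & 1                 # LSBs go to the full-adder
        s = abit ^ bbit ^ carry
        carry = (abit & bbit) | (carry & (abit ^ bbit))
        # The sum register shifts right and takes the new bit at its serial input.
        sum_reg = (sum_reg >> 1) | (s << (n - 1))
        a >>= 1                                   # operand registers shift right
        b >>= 1
        cycles += 1
    return sum_reg, carry, cycles
```

For example, `serial_add(11, 6, 4)` adds 1011 and 0110 and needs 4 + 1 cycles; the result 17 appears as sum 0001 with carry out 1.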
Parallel Adders
Ripple Carry Adder Chained full–adders which form an adder of the required wordlength
are called ripple carry adder since during addition the carry “ripples” through the whole chain
from the least significant to the most significant bit as shown in Fig. 15.2:
The addition time is therefore dependent on the wordlength of the operands.
[Figure 15.2: Ripple carry adder (outputs Cout, Sum[n-1], ..., Sum[1], Sum[0])]
Carry Lookahead Adder To speed up the addition process, lookahead methods can be applied to reduce the time associated with carry propagation. The carry input of a stage i is calculated directly from the inputs of the preceding stages i − 1, i − 2, . . . i − k rather than allowing carries to ripple from stage to stage. To perform that task, the carry outputs (cout) of ordinary full–adders are replaced by the generate and propagate signals defined by
gi = ai bi (15.5)
pi = ai + bi . (15.6)
As can be seen in the equations above, the carry lookahead logic circuits can be realized by a
two level logic implementation, that means the whole addition is performed in constant time
(without influence of wordlength). The implementation of the carry lookahead corresponding
to the above equations is shown in Fig. 15.3.
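A minimal Python sketch of the lookahead principle follows. It assumes the standard carry recurrence c[i+1] = g[i] + p[i]·c[i], expanded into a sum of products so that every carry depends only on the g and p signals and on the carry input; the function name `cla_add` is illustrative.

```python
def cla_add(a, b, n, c0=0):
    """Add two n-bit integers using flattened (two-level) carry logic.

    Returns (sum, carry_out).
    """
    abits = [(a >> i) & 1 for i in range(n)]
    bbits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(abits, bbits)]   # generate, Eq. (15.5)
    p = [x | y for x, y in zip(abits, bbits)]   # propagate, Eq. (15.6)
    carries = [c0]
    for i in range(n):
        # c[i+1] = g[i] + p[i]g[i-1] + ... + p[i]...p[1]g[0] + p[i]...p[0]c0
        c, prod = g[i], p[i]
        for k in range(i - 1, -1, -1):
            c |= prod & g[k]
            prod &= p[k]
        carries.append(c | (prod & c0))
    total = 0
    for i in range(n):
        total |= (abits[i] ^ bbits[i] ^ carries[i]) << i
    return total, carries[n]
```

Note that the number of product terms (and the fan-in of the gates) grows with the bit position, which is the practical limit on the wordlength of a single lookahead block.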
[Figure 15.3: 4-bit carry lookahead adder with inputs A[3:0], B[3:0] and Cin, lookahead carries Cin[3:0], and outputs Sum[3:0] and Cout]
The number of gate inputs is restricted due to technological constraints. That means the wordlength of a carry lookahead adder cannot be increased arbitrarily. For that reason, adders for a large wordlength are split into smaller groups processed by single carry lookahead adders with reasonable wordlengths, as shown in Fig. 15.4.
[Figure 15.4: Adder split into carry lookahead groups (outputs Cout, Sum[15:12], Sum[11:8], Sum[7:4], Sum[3:0])]
The carry signal produced by a group is forwarded to the next group so that, if the group is
considered as a single block, the carry ripples through different blocks as in the carry ripple
adder. Alternatively, a hierarchical approach might be chosen in a way, that for each group
a group-generate as well as a group-propagate signal are generated which are evaluated by a
second level carry lookahead circuit.
Carry Select Adder In the carry select adder, the wordlength of the operands is again subdivided into clusters (see Fig. 15.5). The cluster subwordlength is chosen to balance the time required for the intra-cluster carry ripple additions and the carry calculation of the preceding clusters. All additions are performed in parallel for both possible cases: the carry in of a cluster is ’0’ or ’1’. The results (cluster carry out and partial sum C/Sum[i : j]) are forwarded to multiplexors which select the appropriate value depending on the carry output of the preceding stages. Since the time to switch a multiplexor is almost negligible compared to the time required for the carry ripple additions, the overall addition time is almost independent of the wordlength.
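The selection scheme can be sketched as below. The cluster adders are modelled behaviourally with integer addition (an assumption made for brevity; in hardware they are carry ripple adders), and the names `cluster_add` and `carry_select_add` are illustrative.

```python
def cluster_add(a, b, cin, width):
    """Behavioural model of one carry ripple cluster: returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n=16, cluster=4, cin=0):
    """Carry select addition of two n-bit integers: returns (sum, carry_out)."""
    mask = (1 << cluster) - 1
    result, carry = 0, cin
    for lo in range(0, n, cluster):
        xa, xb = (a >> lo) & mask, (b >> lo) & mask
        s0, c0 = cluster_add(xa, xb, 0, cluster)   # result assuming carry in = 0
        s1, c1 = cluster_add(xa, xb, 1, cluster)   # result assuming carry in = 1
        # Multiplexor: the carry of the preceding cluster selects the result.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << lo
    return result, carry
```

In this software model both candidate results are computed for every cluster, including the least significant one; the hardware in Fig. 15.5 needs only a single adder there because Cin is known from the start.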
[Figure 15.5: 16-bit carry select adder built from 4-bit carry ripple (CR) adders. Each cluster except the least significant one is computed for carry in 0 and carry in 1; multiplexors controlled by C[3], C[7] and C[11] select Sum[7:4], Sum[11:8], Sum[15:12] and Cout]
Since the carry select adder requires two carry ripple adder chains for each cluster (except in
the least significant), the hardware amount is almost twice that of a simple ripple carry adder.
It is slower than a carry lookahead adder but compared to that type it has a higher regularity
and is for that reason better suited for VLSI implementation.
Carry Save Adder For the addition of very many addends (e. g. in parallel multipliers), the time required for full carry propagation, even if carry lookahead adders are used, might be too high for some applications. To achieve constant addition time complexity, the propagation of computed carry results within the same stage is avoided, and both the S and the Cout vectors are connected to the correct adder in the succeeding stage. This concept requires a final addition to merge the sum and the carry vector of the final stage into a single sum vector, which can be realized using any of the adders discussed above (in Fig. 15.6 a carry ripple adder has been chosen for simplicity). In a carry save adder, the adder delay increases by one full-adder delay for each additional operand.
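A word-level Python sketch of the carry save principle: each stage reduces three vectors to a sum vector and a carry vector without propagating carries, and only the last step performs a carry propagating addition. The function names are illustrative, and Python's `+` stands in for the final adder.

```python
def carry_save_stage(x, y, w):
    """One row of full-adders working bitwise on three operands.

    Returns (sum_vector, carry_vector); the carry vector is already shifted
    one position to the left, so sum_vector + carry_vector == x + y + w.
    """
    s = x ^ y ^ w
    c = ((x & y) | (x & w) | (y & w)) << 1
    return s, c


def carry_save_sum(operands):
    """Add a list of non-negative integers with a chain of carry save stages."""
    s, c = operands[0], 0
    for op in operands[1:]:
        s, c = carry_save_stage(s, c, op)   # one full-adder delay per operand
    return s + c                            # final carry propagation
```

Each additional operand adds one stage, which mirrors the delay statement above.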
[Figure 15.6: Carry save adder array: rows of full-adders for the operands X, Y, W and V, followed by a final carry propagation stage (carry ripple adder) producing Cout and Sum[n+1] ... Sum[0]]
15.2 Multipliers
Shift and Add Multiplier The most common multiplier is the Shift and Add Multiplier (SAA Mult.). Two binary unsigned integer words X and Y of bit-size N_x and N_y, respectively, can be written using their binary representation:
X = \sum_{i=0}^{N_x-1} x_i 2^i, \qquad Y = \sum_{j=0}^{N_y-1} y_j 2^j (15.12)
Z = \sum_{i=0}^{N_x-1} x_i Y 2^i = (\ldots((x_{N_x-1} Y) 2 + x_{N_x-2} Y) 2 + \ldots) 2 + x_0 Y (15.13)
In each step of the recurrence, one bit of X is multiplied with Y (a simple AND operation) and added to the intermediate result Di, which is shifted by one bit. Figure 15.7 shows the general structure of the Shift and Add Multiplier with bit-sizes Nx and Ny.
For this multiplier type it takes Nx clock cycles to complete the multiplication, since one bit of X is processed in each step. The delay of the combinational circuit (which determines the maximum clock frequency) is approximately Ny δFA (δFA is the delay of a full adder; the register delays are not considered).
The cost of a Shift and Add Multiplier is (3Ny + 2Nx) γFA (the cost of a full adder γFA is assumed to be equal to the cost of a register).
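The recurrence of Eq. (15.13) and the cost and delay figures quoted above can be written down directly. The sketch below is behavioural; the function names are illustrative, and cost and delay are returned in units of γFA and δFA.

```python
def shift_add_multiply(x, y, nx):
    """Multiply per Eq. (15.13): Horner scheme over the bits of X, MSB first.

    Returns (product, clock_cycles).
    """
    d = 0                                   # intermediate result D
    cycles = 0
    for i in range(nx - 1, -1, -1):
        xi = (x >> i) & 1
        d = (d << 1) + (y if xi else 0)     # shift D, add x_i AND Y
        cycles += 1
    return d, cycles


def saa_cost(nx, ny):
    """Cost in units of one full-adder (register cost assumed equal)."""
    return 3 * ny + 2 * nx


def saa_delay(ny):
    """Combinational delay per clock cycle in units of one full-adder delay."""
    return ny
```

For two 4-bit operands this gives 4 clock cycles, a cost of 20 γFA and a per-cycle delay of about 4 δFA.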
Carry Save Multiplier In contrast to the SAA Multiplier, the Carry Save Multiplier (CSM) calculates the result in one step. Every bit of the first argument is multiplied with every bit of the second argument concurrently. The results are added up according to the position of the source bits.
The CSM consists of combinatorial logic only. The multiplication of two 4-bit binary numbers
can be written as
                    X3   X2   X1   X0
                    Y3   Y2   Y1   Y0
                    ------------------
                    P30  P20  P10  P00
               P31  P21  P11  P01
          P32  P22  P12  P02
     P33  P23  P13  P03
--------------------------------------
Z7   Z6   Z5   Z4   Z3   Z2   Z1   Z0
where Pij = Xi ∧ Yj . The addition of all Pij terms can be done in an array of full adders.
Figure 15.8 shows the general structure of a Carry Save Multiplier assuming Nx ≥ Ny . Part II is omitted if Nx and Ny are of the same size. The Carry In of the full adder is supplied in the upper right corner. Not every full adder needs a Carry In; for some positions half adders are sufficient. The adder Carry Out is depicted in the lower left corner.
The delay of this type of multiplier is (Nx + Ny − 2) δFA . The cost is (Nx − 1) Ny γFA plus (2Ny + 2Nx) γFA if the X, Y and Z registers are accounted for as in the shift and add case above.
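The partial product scheme and the quoted delay and cost expressions can be checked with a short sketch. The summation is done here with ordinary integer addition instead of an adder array, and the function names are illustrative.

```python
def partial_product_multiply(x, y, nx, ny):
    """Form all P_ij = X_i AND Y_j and add them according to their weight 2^(i+j)."""
    z = 0
    for i in range(nx):
        for j in range(ny):
            pij = ((x >> i) & 1) & ((y >> j) & 1)
            z += pij << (i + j)
    return z


def csm_delay(nx, ny):
    """Delay in units of one full-adder delay: Nx + Ny - 2."""
    return nx + ny - 2


def csm_cost(nx, ny):
    """Cost in full-adder units: array (Nx - 1) * Ny plus registers 2Ny + 2Nx."""
    return (nx - 1) * ny + 2 * ny + 2 * nx
```

For Nx = Ny = 4 the expressions give a delay of 6 δFA and a cost of 28 γFA, compared with 4 cycles of about 4 δFA each and 20 γFA for the shift and add multiplier.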
Block Multiplier A combination of the fully parallel Carry Save Multiplier and the serial
Shift and Add Multiplier leads to a flexible architecture which can be configured from working
fully serial to working fully parallel. Many combinations in between are possible, thus allowing
the adaptation to given specifications and restrictions.
The basic idea of the block multiplication is to divide each argument into blocks of the same
size. Each block of the first argument is multiplied with each block of the second argument in
a fast Carry Save Multiplier. All calculated block products are added up taking into account
the positions of the current argument blocks. Therefore, as in the Shift and Add Multiplier,
the arguments and the intermediate result have to be shifted in an appropriate way.
[Figure 15.9: Block multiplier architecture: the X register (block size nx) and the Y register (block size ny) feed a Carry Save Multiplier; its (nx + ny)-bit block product is accumulated by an adder into the Z register under the control of a controller]
Figure 15.9 shows the architecture of the block multiplier. The argument registers and the
Carry Hold Register are simple shift registers. The intermediate result has to be shifted in
both directions, thus requiring a bidirectional shift register. Signals for controlling the shift
directions are generated by a controller, which can be realized using a simple counter.
The multiplier can be configured by varying the block sizes of the arguments. With increasing block sizes the multiplier becomes more parallel, thus reducing the number of clock cycles needed to perform a multiplication. Larger block sizes, however, require a larger Carry Save Multiplier, which increases the area needed to realize the multiplier. Assuming that the first argument is separated into kx blocks of size nx and the second argument into ky blocks of size ny , the multiplier needs kx · ky clock cycles to perform a multiplication. The delay of the multiplier is determined by the size of the ripple carry adder, which has a width of nx + ny bits.
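The block scheme can be illustrated with the following behavioural sketch. Each loop iteration stands for one clock cycle in which the Carry Save Multiplier forms one block product; the shifting of registers is replaced by computing the block weight directly, and all names are illustrative.

```python
def block_multiply(x, y, kx, nx, ky, ny):
    """Multiply X (kx blocks of nx bits) by Y (ky blocks of ny bits).

    Returns (product, clock_cycles).
    """
    z = 0
    cycles = 0
    for i in range(kx):
        xb = (x >> (i * nx)) & ((1 << nx) - 1)      # i-th block of X
        for j in range(ky):
            yb = (y >> (j * ny)) & ((1 << ny) - 1)  # j-th block of Y
            block_product = xb * yb                 # done by the CSM, nx + ny bits
            z += block_product << (i * nx + j * ny) # accumulate at block position
            cycles += 1
    return z, cycles
```

With 8-bit operands, `block_multiply(x, y, 2, 4, 2, 4)` takes 4 cycles, while `block_multiply(x, y, 1, 8, 8, 1)` takes 8 cycles with a much smaller multiplier array.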
Chapter 16
Microarchitectures
The term microarchitecture describes the domain between the macroarchitecture (the lowest-level hardware visible to the user) and the implementation technology (MOS VLSI) [9]. For better analysis, microarchitectures are usually divided into 3 parts: the data path, which performs the data manipulations and calculations; the control path, which applies the correct sequences of control signals to the data path; and the input/output unit, which provides access from/to the external world (see Fig. 16.1).
[Figure 16.1: Microarchitecture: the control path sends control signals to the data path, the data path returns status flags, and the input/output unit exchanges external I/O data]
The control path which can be interpreted as a more or less complex finite state machine
(FSM) can be either hardwired (used in fixed applications like a controller for the serial
adder in Fig. 15.1) or programmable (microprocessor with downloadable microcode). The
microarchitecture scheme as shown in Fig. 16.1 can represent quite simple circuits (like a
traffic light controller) as well as complex microprocessors.
Datapath Design
In the datapath of a microarchitecture, the operations and data manipulations are performed.
For that, control signals are generated by the control path depending on the operation(s) to
be executed. By forwarding information about the status of the data path (e. g. exceptional
conditions, underflow, overflow, division by zero, . . . ), the control path is able to react in a
correct way to the actual needs. The state signals (flags) can be used to enable conditional
branching depending on the state of the data path. Data processing is usually performed by
typical components like ALUs, shifters, register files, . . . .
The following section shows how datapath structures are usually implemented in larger VLSI
designs. For that, we assume the following simple datapath structure:
[Figure: datapath with inputs Ain, Bin and Cin, output Rout, control signals (Clock, OP-Sel, Sel, Shift, Clock) and status signals (status flags)]
Figure 16.2: Datapath example
The datapath consists of 2 input registers for the input operands Ain and Bin, an arithmetic-
logic unit (ALU), a multiplexor to select between the Cin input and the ALU output, a
shifter unit, and an output register. The datapath structure could be implemented based on
standard cells, where basic library cells (like gates, muxes, registers, . . . ) are selected and
interconnected, or, if a datapath compiler is used, based on a set of several layout tiles as
shown in Fig. 16.3.
A datapath compiler creates a regular layout depending on the wordlength of the operands by
stacking the appropriate number of tiles in the layout. The horizontal structure consisting of
a set of tiles performing all functions for a single bit is called bit slice. If we apply vertical cuts
to the layout structure, the whole layout will be subdivided in layout blocks corresponding to
a single function implemented. These layout stripes are called functional slices.
As an example for a discrete datapath implementation the 2901 bit-slice will be discussed in
the following section (→ [3]).
The 2901 integrated circuit contains besides of a 16 word register set, a Q register (used
[Figure 16.3: Datapath layout composed of tiles: bit slices Bit[0] ... Bit[n-1], functional slices, control signal buffers and status buffers]
Figure 16.4: 2901 4-bit ALU slice
Figure 16.5: 2901 µ-OPs
The 2901 IC has been widely used for applications in digital signal processing and for minicom-
puters. It is available as stand-alone IC and some silicon manufacturers also provide macrocells
with the functionality of the 2901 (for different wordlengths) that might be included to ASIC
designs.
Controller Implementations
Controllers are used to apply a sequence of control signals to the datapath components. These
control signals are chosen to perform the desired operation(s) within the datapath. The
datapath is able to interact with the controller unit by sending appropriate status signals
(e. g. overflow flag when an addition is performed, equal flag as a result of a comparison, . . . ).
The controller can be designed to change the sequence of control signals depending on these
flags (used e. g. in microprocessors to perform conditional branches).
The general structure of such a controller can be found in Fig. 16.7.
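[Figure 16.7: Controller structure: environmental inputs → combinational logic → state register → control outputs, with feedback from the state register to the combinational logic]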
It consists of a combinational logic block and a register. From the input signals (e. g. an instruction word defining the sequence of control signals to be generated, state flags, . . . ) and parts of the previous register content, the combinational logic block generates the control output signals as well as the information which step in the sequence of control signals is to be executed in the next cycle. The controller can be seen as a realization of the abstract model of a finite state machine.
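The structure of a combinational logic block plus state register can be mimicked in Python as follows. The example controller (states IDLE, LOAD, ADD; inputs `start` and `done`; outputs `load` and `shift`) is purely hypothetical and loosely modelled on a controller for a serial adder; none of these names appear in the script.

```python
def combinational_logic(state, start, done):
    """Return (next_state, control_outputs) for the current state and inputs."""
    idle = {"load": 0, "shift": 0}
    if state == "IDLE":
        return ("LOAD", {"load": 1, "shift": 0}) if start else ("IDLE", idle)
    if state == "LOAD":
        return "ADD", {"load": 0, "shift": 1}
    if state == "ADD":
        return ("IDLE", idle) if done else ("ADD", {"load": 0, "shift": 1})
    raise ValueError(f"unknown state {state!r}")


def run_controller(inputs, state="IDLE"):
    """Clock the controller once per (start, done) pair and record the trace."""
    trace = []
    for start, done in inputs:
        # The logic block computes outputs and next state; assigning `state`
        # corresponds to the state register taking over the new value.
        state, outputs = combinational_logic(state, start, done)
        trace.append((state, outputs))
    return trace
```

Here `done` plays the role of a status flag from the datapath: the control sequence stays in ADD until the flag is raised.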
To get a high level of regularity in the design of a controller, regular layout structures (like ROMs or PLAs) are very often used to implement the combinational logic block rather than directly implementing the logic functions in separate gates (random logic). The random logic approach was chosen in the control unit of many early microprocessors (≤ 8 bit) and in RISC (Reduced Instruction Set Computer) processors, whereas regular layout structures are used in CISC (Complex Instruction Set Computer) processors to simplify their controller design. Regular structures simplify the design process because, if modifications in the control sequences are required, only the contents of a PLA or a ROM have to be redefined instead of redesigning a whole combinational gate network. Since the design process for the regular-structure approach resembles programming memory contents rather than circuit design, that approach is called microprogramming and will be considered in detail in the sequel.
[Figure: ROM-based controller: the address decoder of the ROM is driven by the NA register field and the environmental inputs; the ROM word contains the Control and NA fields and delivers the control outputs]
[Figure: PLA-based controller: the AND and OR planes of the PLA are driven by the NA register field and the environmental inputs; the PLA output contains the Control and NA fields and delivers the control outputs]
Depending on the generation of the control signals, two types of microinstructions can be
distinguished:
[Figure: microinstruction word whose bits drive the control lines directly]
[Figure: encoded microinstruction word passed through a control bit decoder that drives the control lines]
In controller design, one can proceed one step further: if a microinstruction itself can be represented as a sequence of ‘sub’microinstructions (so-called nanoinstructions), the structure shown in Fig. 16.12 can be used. The simplest approach, which has already been mentioned under vertical microcode, is a single-step ‘sequence’ of nanoinstructions, namely the decoding of the control outputs from an encoded control vector read from the microcode control memory.
If feedback is introduced in the decoder PLA (via the NNA [nanocode next address] register),
control sequences can be generated by the nanocode PLA. As long as a nanocode sequence
is running, the MNA [microcode next address] register is halted. In the case that many
microinstructions use the same nanocode sequences, significant savings in implementation
area for the whole controller can be reached.
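The two-level scheme can be sketched as follows. The control words, the sequence names FETCH and ADD and the example microprogram are hypothetical; the point is only that the microcode counter (MNA) waits while a nanocode sequence (stepped by NNA) is running, and that shared nanocode sequences are stored once.

```python
# Hypothetical nanocode: each sequence is a list of control words
# (one bit per control line).
NANOCODE = {
    "FETCH": [0b1000, 0b0100],
    "ADD":   [0b0010, 0b0001],
}
# Hypothetical microprogram: each microinstruction names a nanocode sequence.
MICROCODE = ["FETCH", "ADD", "FETCH", "ADD", "FETCH"]


def run(microcode, nanocode):
    """Return the stream of control words generated by the two-level controller."""
    control_outputs = []
    mna = 0                                  # microcode next address
    while mna < len(microcode):
        sequence = nanocode[microcode[mna]]
        nna = 0                              # nanocode next address
        while nna < len(sequence):           # MNA is halted during this loop
            control_outputs.append(sequence[nna])
            nna += 1
        mna += 1
    return control_outputs


def stored_control_words(microcode, nanocode):
    """Wide control words stored: (single-level store, nanocode store)."""
    single_level = sum(len(nanocode[m]) for m in microcode)
    two_level = sum(len(seq) for seq in nanocode.values())
    return single_level, two_level
```

In this example a single-level control store would hold 10 wide control words, whereas the nanocode store holds 4; the microcode store additionally needs 5 entries, but these only have to encode which sequence to run.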
[Figure 16.12: Controller with microcode PLA (AND and OR planes) and MNA register, driven by the environmental inputs and generating the control outputs]
17. Semiconductor Memories
Institute of
Microelectronic
Systems
Overview
• Introduction
• Read Only Memory (ROM)
• Nonvolatile Read/Write Memory (RWM)
• Static Random Access Memory (SRAM)
• Dynamic Random Access Memory (DRAM)
• Summary
Institute of
Microelectronic
17: Semiconductor Memories Systems 2
Semiconductor Memory Classification
Memory array
• Memory storage cells
• Address decoders
• Simple combinatorial Boolean network which produces a specific output for each input
combination (address)
• ”1“ bit stored - absence of an active transistor
• ”0“ bit stored - presence of an active transistor
• Organized in arrays of 2^N words
• Typical applications:
• store the microcoded instruction set of a microprocessor
• store a portion of the operating system for PCs
• store the fixed programs for microcontrollers (firmware)
• Each column Ci (NOR gate) corresponds to one bit of the stored word
• A word is selected by raising the corresponding wordline to “1”
• All the wordlines are “0” except the selected wordline, which is “1”
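The read-out of a NOR ROM can be modelled with a small Python sketch. It assumes that each column behaves as a NOR gate over the wordlines that have an active transistor in that column, so that a present transistor reads as “0” and an absent one as “1”; the matrix `transistors` and the function name are illustrative.

```python
def nor_rom_read(transistors, row):
    """Read one word from a NOR ROM.

    transistors[r][c] is 1 if an active transistor connects wordline r
    to column c, else 0. Returns the list of column outputs.
    """
    n_rows = len(transistors)
    wordlines = [1 if r == row else 0 for r in range(n_rows)]  # one-hot select
    word = []
    for c in range(len(transistors[0])):
        # The column is pulled low if any selected wordline drives a transistor.
        pulled_low = any(wordlines[r] and transistors[r][c] for r in range(n_rows))
        word.append(0 if pulled_low else 1)
    return word
```

With this convention the stored word is simply the bitwise complement of the transistor pattern in the selected row.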
Mask Programmable NOR ROM (2)
[Figure: layout of two NOR ROM transistors (terminals D, G, S) sharing a common ground line]
• “1” bit stored - the drain/source connection (or the gate electrode) is omitted in the final metallization step
• “0” bit stored - the drain of the corresponding transistor is connected to the metal bit line
Idea: deactivation of the NMOS transistors by raising their threshold voltage above the VOH
level through channel implants
• “1” bit stored - the corresponding transistor is turned off through channel implant
• “0” bit stored - non-implanted (normal) transistors
Advantage: higher density (smaller area)!
Implant Mask Programmable NAND ROM (1)
• Each column Ci (NAND gate) corresponds to one bit of the stored word
• A word is selected by putting to “0“ the corresponding wordline Ri
• All the wordlines Ri are “1“ except the selected wordline which is “0“
Normally on transistors: have a lower threshold voltage (channel implant)
NOR Row Address Decoder for a NOR ROM Array
NOR ROM
Array
A1 A2 R1 R2 R3 R4
0 0 1 0 0 0
0 1 0 1 0 0
1 0 0 0 1 0
1 1 0 0 0 1
• The decoder must select one row by raising its voltage to logic “1”
• Different combinations of the address bits A1A2 select the desired row
• The NOR decoder array and the NOR ROM array are fabricated as two adjacent arrays,
using the same layout strategy
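The truth table above can be reproduced with one NOR gate per row. The sketch below assumes that each row line is the NOR of the address literals that must be “0” for that row to be selected; the helper names are illustrative.

```python
def nor(*inputs):
    """Logical NOR of any number of single-bit inputs."""
    return 0 if any(inputs) else 1


def nor_row_decoder(a1, a2):
    """Return [R1, R2, R3, R4] for the address bits A1, A2."""
    n1, n2 = 1 - a1, 1 - a2          # complemented address bits
    return [
        nor(a1, a2),                 # R1 selected for A1A2 = 00
        nor(a1, n2),                 # R2 selected for A1A2 = 01
        nor(n1, a2),                 # R3 selected for A1A2 = 10
        nor(n1, n2),                 # R4 selected for A1A2 = 11
    ]
```

Exactly one row is “1” for every address, which is what the NOR ROM array expects on its wordlines.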
• The decoder has to lower the voltage level of the selected row to logic “0” while keeping all the other rows at logic “1”
• The NAND row decoder of the NAND ROM array is implemented using the same layout
strategy as the memory itself
NOR Column Address Decoder for a NOR ROM Array
Method of erasing:
• ultraviolet light - EPROMs
• electrically - EEPROMs
EPROM (1)
The floating gate avalanche-injection MOS (FAMOS) transistor:
• extra polysilicon strip is inserted between the gate and the channel - floating gate
• impact: double the gate oxide thickness, reduce the transconductance, increase the
threshold voltage
• threshold voltage is programmable by trapping electrons on the floating gate through avalanche injection
EPROM (2)
• Electrons acquire sufficient energy to become “hot” and traverse the first oxide insulator (100 nm), so that they get trapped on the floating gate
• Electron accumulation on the floating gate is a self-limiting process that increases the
threshold voltage (~7V)
• The trapped charge can be stored for many years
• The erasure is performed by shining strong ultraviolet light on the cells through a
transparent window in the package
• The UV radiation renders the oxide conductive by direct generation of electron-hole pairs
EPROM (3)
EEPROM
Flash Memories
Combines the density of the EPROM with the versatility of EEPROM structures
• Programming: avalanche hot-electron-injection
• Erasure: Fowler-Nordheim tunneling (as for EEPROM cells)
• Difference: erasure is performed in bulk for the complete (or subsection of) memory chip -
reduction in flexibility!
• Extra access transistor of the EEPROM is eliminated because the global erasure process
allows a careful monitoring of the device characteristics and control of the threshold
voltage!
• High integration density
[Figure: (a), (b) cross-coupled inverter pair with its two stable states; voltage transfer characteristic vO versus vI with two stable Q-points (at VOH and VOL) and one unstable Q-point on the line vO = vI]
Static Random Access Memory - SRAM (2)
Resistive Load SRAM Cell - Operation Principle (2)
• Low-power SRAM Cell: the static power dissipation is limited by the leakage current during a
switching event
• The pMOS pull-up transistors allow the column voltage to reach full VDD level
• High noise immunity due to larger noise margins
• Lower power supply voltages than resistive-load SRAM cell
• Drawback: large area!
CMOS SRAM Cell Design Strategy (1)
Layout of the resistive-load SRAM cell Layout of the CMOS SRAM cell
• RS = 0: M3, M4-off;
• RS = 1: M3-saturation; M4, M1-linear
VC decreases, V1 increases slowly
Condition - M2 must remain turned off during the data reading operation:
V_{1,max} ≤ V_{T,2}; I_{M3} = I_{M1} ⇒
Design rule: \frac{(W/L)_3}{(W/L)_1} < \frac{2 (V_{DD} - 1.5 V_{T,n}) V_{T,n}}{(V_{DD} - 2 V_{T,n})^2}
A symmetrical rule is valid also for M2 and M4
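The read condition bounds the ratio (W/L)3 / (W/L)1 by 2(VDD − 1.5 VT,n)VT,n / (VDD − 2VT,n)². A short sketch evaluates this bound numerically; the supply and threshold values in the usage note are assumed for illustration only and are not taken from the slides.

```python
def read_stability_bound(vdd, vtn):
    """Upper bound for (W/L)_3 / (W/L)_1 from the read design rule.

    vdd: supply voltage in volts, vtn: nMOS threshold voltage in volts.
    """
    if vdd <= 2.0 * vtn:
        raise ValueError("the rule assumes VDD > 2 * VT,n")
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


def satisfies_read_rule(wl3, wl1, vdd, vtn):
    """True if the access/driver sizing keeps M2 off during a read."""
    return wl3 / wl1 < read_stability_bound(vdd, vtn)
```

With the assumed values VDD = 5 V and VT,n = 1 V the bound is 7/9 ≈ 0.78, so the access transistor M3 has to be weaker than the driver transistor M1.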
CMOS SRAM Cell Design Strategy (3)
(2) The cell should allow modification of the stored information during the data write phase
Consider the write “0” operation, assuming that “1” is stored in the cell (V1 = 1, V2 = 0: M1, M6-off; M2, M5-linear)
• RS = 0: M3, M4-off;
• RS = 1: M3, M4 saturation, M5-linear
In order to change the stored information: V1 = 0, V2 = 1 ⇒ M1 on and M2 off!
But V2 < VT1 (previous design condition) ⇒ M1 cannot be switched on! ⇒ M2 must be switched off ⇒ V1 must be reduced below VT2
V_1 ≤ V_{T,2}; I_{M3} = I_{M5} ⇒
Design rule: \frac{(W/L)_5}{(W/L)_3} < \frac{\mu_n}{\mu_p} \cdot \frac{2 (V_{DD} - 1.5 V_{T,n}) V_{T,n}}{(V_{DD} + V_{T,p})^2}
A symmetrical rule is valid also for M6 and M4
W DATA WB WB Operation
0 1 1 0 M1-off, M2-on, VC high, VC low
0 0 0 1 M1-on, M2-off, VC low, VC high
1 X 0 0 M1, M2 off, VC, VC high
The write operation is performed by forcing the voltage level of either column (bit line) to “0”
SRAM Read Circuitry
$$\frac{\partial (V_{o1} - V_{o2})}{\partial (V_C - V_{\bar{C}})} = -R\,g_m, \quad \text{where } g_m = \frac{\partial I_D}{\partial V_{GS}} = \sqrt{2\,k_n\,I_D}$$
• Eliminates wait states for the processors during data read operations
• Problems can occur if:
• two processors attempt to write data simultaneously onto the same cell
• one processor attempts to read while other writes data onto the same cell
• Solution: contention arbitration logic
Dynamic Random Access Memories - DRAM
SRAM drawbacks (1)
• large area: 4-6 transistors/bit + 4 lines connections
• static power dissipation (exception CMOS SRAM)
DRAM
• binary data is stored as charge in a capacitor
• requires periodic refreshing of the stored data
• no static power dissipation
Three-Transistor DRAM Cell (1)
Three-Transistor DRAM Cell (3)
WRITE 1 operation:
READ 1 operation:
WRITE 0 operation:
• Precharge: C2, C3
• DATA = 1, MD on; WS = 1, M1 on ⇒ C2, C1 pulled to 0 through M1 and MD;
• After write operation WS = 0, M1 off; C2 is discharged to 0, M2 off
READ 0 operation:
• Precharge: C2, C3
• RS = 1, M3 on; M2 off
• C3 does not discharge - the 1 logic level is interpreted as a stored 0
• 1 transistor M1
• 1 explicit capacitor C1: 30-100 fF, (C1<<C2)
Charge sharing between C2 and C1 has a key role in the r/w operations
• Data WRITE:
“1” - D = 1, R/W = 1: M1 on; C1 charges up to the 1 level
“0” - D = 0, R/W = 1: M1 on; C1 discharges to the 0 level
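The role of charge sharing between C1 and C2 can be sketched with simple charge conservation. This is an illustrative calculation, not taken from the slides: the function name, the 500 fF column capacitance and the 5 V precharge level are assumptions; only the 30-100 fF range for C1 and the relation C1 << C2 come from the text.

```python
def shared_voltage(c1: float, v1: float, c2: float, v2: float) -> float:
    """Final voltage after two capacitors are connected (charge conservation)."""
    return (c1 * v1 + c2 * v2) / (c1 + c2)


if __name__ == "__main__":
    c1 = 50e-15    # storage capacitor, within the 30-100 fF range given above
    c2 = 500e-15   # column capacitance (assumed value, C1 << C2)
    v_pre = 5.0    # assumed precharge level of the column in volts

    for stored in (0.0, 5.0):
        v_final = shared_voltage(c1, stored, c2, v_pre)
        print(f"stored {stored:.1f} V -> column settles at {v_final:.3f} V")
```

Because C1 is much smaller than C2, a stored 0 shifts the column voltage only slightly, which is why a sense amplifier is needed on the column.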
Figure labels:
(1) Data
(2) Gate
(3) Drain area
(4) Source area
(5) Field oxide
(6) Capacitor plate (Poly Si)
(7) Capacitor insulator
(8) Storage node electrode (Poly Si)
(9) Substrate (Si)
Additional label: Refilling Poly
Data Read Example (1)
• Precharge devices are turned on, CD and CDbar are charged up to the “1” level
• The dummy nodes X and Y are pulled to “0” level
• During this phase all other signals are inactive
Data Read Example (3)
• One of the 256 word lines is raised to “1” (cell R128 is selected)
• The corresponding dummy cell on the other side is also selected (right)
• Charge sharing between the selected cell and CD (depending on the value stored by cell
“0” or “1”) and between dummy cell and CD
• Voltage level is detected through the charge sharing
DRAM Architectures
Name   | Feature                        | Die size increase | Frequency (system level)      | Application
DRAM   | Fast page mode                 | -                 | 25 MHz                        | Main memory
VRAM   | DRAM + SAM                     | 50%               | 40 MHz                        | Video display buffer
EDO    | DRAM with modified CAS         | 0%                | 40-50 MHz                     | Main memory, low-end graphic memory
SDRAM  | Sync. DRAM + register (latch)  | 0-10%             | 60-150 MHz                    | Main memory in workstations, high-end PCs, middle-range graphic memory
SGRAM  | SDRAM + block write + WPB      | 10%               | 60-150 MHz, 3 Gb/s            | High-end memory
CDRAM  | Sync. DRAM + SRAM + DTB        | 7-10%             | 66 MHz                        | Low-end PC
RDRAM  | Sync. DRAM + Rambus I/O        | 12-15%            | 250 MHz                       | High-end PC, graphic memory
3D-RAM | Sync. DRAM + SRAM + SAM + ALU  | ?                 | 400 Mb/s ext., 1.6 Gb/s int.  | High-end graphic memory
EDRAM  | DRAM + SRAM                    | ?                 | ?                             | Low-end PC
SVRAM  | Sync. DRAM + SAM               | 50%               | 100 MHz                       | High-end graphic memory
Summary
• the memory architecture has a major impact on the ease of use of the memory, its
reliability and yield, its performance and power consumption;
• memories are organized as arrays of cells; an individual cell is addressed by a
column and row address;
• the memory cells should be designed so that a maximum signal is obtained in a
minimum area; the cell design is dominated by technological considerations and
most of the improvement in density results from scaling and advanced
manufacturing processes;
• we have discussed cells for read-only memories (NOR and NAND ROM), nonvolatile
memories (EPROM, EEPROM and FLASH) and read-write memories (SRAM and
DRAM)
• the peripheral circuitry is very important to operate the memory in a reliable way
and with reasonable performance; decoders, sense amplifiers and I/O buffers are
an integral part of every memory design;
18. ASIC Design Guidelines
Institute of
Microelectronic
Systems
Introduction
Synchronous Circuits (1)
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving clock input of another Flip-flop:
Synchronous Circuits (3)
• NON-RECOMMENDED CIRCUITS:
– Gated clock line:
– Clock skew caused by gating the clock line (e.g. multiplexer in clock
line)
• NON-RECOMMENDED CIRCUITS:
– Double-edged clocking:
Synchronous Circuits (5)
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving asynchronous reset of another Flip-flop:
• NON-RECOMMENDED CIRCUITS:
– Unequal depth of clock buffering:
Clock Buffering (2)
• NON-RECOMMENDED CIRCUITS:
– Unbalanced fanout of clock buffers:
• Recommended circuits:
– Balanced clock tree buffering
Clock Buffering (4)
• Recommended circuits:
– Combined geometric/tree buffering
• NON-RECOMMENDED CIRCUITS:
– Multiplexer on clock line:
– Signal change at multiplexer input can cause a glitch at the clk input
(FF captures invalid data)
– Gating the clock line introduces clock skew
Gated Clocks (2)
• Recommended circuits:
1) Enabled (E-type) flip-flop: 2) Toggle (T-type) flip-flop:
• NON-RECOMMENDED CIRCUITS:
– Pipelined logic with double-edged clocking:
Double-edged Clocking (2)
• Recommended circuits:
– Pipelined logic with single-edged clocking:
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving the asynchronous reset of another flip-flop:
Asynchronous Resets (2)
• Recommended circuits:
– Global asynchronous reset by external signal:
• Recommended circuits:
– Flip-flop driving the synchronous reset of another flip-flop:
Shift Registers (1)
• NON-RECOMMENDED CIRCUITS:
– Shift register with forward or reverse chain of clock buffers:
• Recommended circuits:
– Shift register with balanced tree of clock buffers:
Asynchronous Inputs (1)
• NON-RECOMMENDED CIRCUITS:
– Circuits with complicated feedback loops to capture asynchronous
inputs (very sensitive to noise, and functionality can be influenced
by placement and routing delays)
• Recommended circuits:
– Chain of two or more D-type flip-flops for capturing an asynchronous
input:
Asynchronous Inputs (3)
• Recommended circuits:
– Use of 4-bit register as shift register for capturing an asynchronous
input:
• Recommended circuits:
– Asynchronous handshake circuit:
Asynchronous Inputs (5)
Asynchronous Inputs (7)
• NON-RECOMMENDED CIRCUITS:
– In general, it cannot be recommended to build circuits with a
functionality that relies on delays.
Delay Lines and Monostables (2)
• NON-RECOMMENDED CIRCUITS:
– Pulse generator using flip-flop:
– Multivibrator:
• Recommended circuits:
– Synchronous pulse generator:
Bistable Elements (1)
• NON-RECOMMENDED CIRCUITS:
– Cross-coupled flip-flops and RS-flip-flops
– Bistable storing elements formed by cross-coupled NAND or NOR
gates:
• NON-RECOMMENDED CIRCUITS:
– Asynchronous RS-flip-flop:
Bistable Elements (3)
• Recommended circuits:
– Use D-types with set/reset
– Use latch configured as RS flip-flop:
RAMs and ROMs in Synchronous Circuits 2
• Recommended circuits:
– Interfacing RAM into synchronous circuit: ME and WEbar generation
• Recommended circuits:
– Using flip-flop for WEbar generation: timing scheme
RAMs and ROMs in Synchronous Circuits 4
• Recommended circuits:
– Avoiding floating RAM/DPRAM output propagation
Tristates (1)
• NON-RECOMMENDED CIRCUITS:
– Tristate bus with non-central enable control:
Tristates (2)
• Recommended circuits:
– Tristate bus with central control of all tristate enable signals and one
additional driver that is activated on non-controlled states
Tristates: Multiplexer:
Parallel Signals
• NON-RECOMMENDED CIRCUITS:
– Wired-OR part used to create higher fanout:
• Recommended Circuits:
– High-fanout buffer replacing wired OR part
Fanout (1)
• NON-RECOMMENDED CIRCUITS:
– Excessive fanout on
control signals:
Fanout (2)
• Recommended circuits:
– Geometric buffering
on control signal:
Fanout (3)
• Recommended circuits:
– Tree buffering
on control signal:
Design for Speed (1)
• Use AOI logic (complex cells from standard cell library) where
possible. The figure below shows a multiplexer using AOI logic:
Design for Testability (2)
• Recommended circuit:
– Insert test inputs and outputs
• NON-RECOMMENDED CIRCUITS:
– Chain of counters: first counter is not directly observable and
second counter is not directly controllable
Design for Testability (4)
• Recommended circuit:
– Break long counter / shift register chains
– Chain of counters broken by test input tc and output signals:
• NON-RECOMMENDED CIRCUITS:
– Counter with closed feedback loop: initial state is not known
Design for Testability (6)
• Recommended circuit:
– Open feedback loops
– Counter with feedback loop opened by test control tr and output
signals:
• Recommended circuits:
– Use BIST (Built-In-Self-Test) with compiled megacells
– Compiled megacell with compiled inputs/outputs:
Design for Testability (8)
• Recommended circuits:
– Scan path testing
– E-type scan path flip-flop (right):
– Circuit with scan path (below):
• Recommended circuits:
– Use of JTAG boundary scan path
– JTAG test circuitry:
19. Testing and
Design for Testability
Institute of
Microelectronic
Systems
Motivation
phases of design
Economical Considerations (1)
$$AQL = \frac{\#\,\text{defective parts}}{\#\,\text{accepted parts}}$$
Economical Considerations (3)
$$DL = 1 - Y^{(1 - T)}$$
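The defect-level relation can be explored numerically with a short sketch. It assumes the usual reading of the symbols, Y as the process yield and T as the fault coverage of the test (both between 0 and 1); the slide text that survives does not define them, and the function name and sample values are illustrative only.

```python
def defect_level(yield_: float, coverage: float) -> float:
    """Defect level DL = 1 - Y**(1 - T).

    yield_:   assumed meaning: fraction of good parts produced (0 < Y <= 1)
    coverage: assumed meaning: fault coverage of the test set (0 <= T <= 1)
    """
    if not (0.0 < yield_ <= 1.0) or not (0.0 <= coverage <= 1.0):
        raise ValueError("expected 0 < Y <= 1 and 0 <= T <= 1")
    return 1.0 - yield_ ** (1.0 - coverage)


if __name__ == "__main__":
    # Hypothetical yield of 50 % at several fault coverages
    for t in (0.0, 0.9, 0.99, 1.0):
        print(f"T = {t:.2f}: DL = {defect_level(0.5, t):.4f}")
```

Under these assumptions, complete fault coverage drives the defect level to zero regardless of yield, while without any testing the defect level equals 1 − Y.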
Design Flow: Testing (1)
Manufacturing Process
Fundamental Definitions
Fault Models (2)
PHYSICAL LOGICAL
(analog) (digital)
Fault Models for Gates (2)
• Issue: complexity
– as 1 model
• 12 faults
– as 12 gates
• 30 (collapsed) faults
• 12x larger netlist
• → 30x computation
– as 60 transistors
• 90 (collapsed) faults
• 60 transistors
• → 400x computation
• The controversy:
– IBM: comprehensive stuck-at → no empirical need for MOS fault
models
– UNISYS: MOS model required for < 1% AQL
Fault Models for Gates (4)
– test b 1 1 0 1
c 1 1 1 0
Fault detection by
duplication with
complementary logic
Fault Tolerant Design (2)
Reconfigured array
Test Pattern Generation (1)
• manually
• pseudo random (leads up to 60% fault coverage)
• algorithmic
• special test patterns for RAMs
The D-Algorithm (2)
The D-Algorithm (6)
Construction of the
singular cover of a
logic module
D-Algorithm Example (1)
D-Algorithm Example (3)
D-Algorithm Example (5)
5) Now the consistency phase is started and a value for line 4 has to
be found. From the singular cover table it can be seen that a 0 on
line 10 implies both line 7 and line 8 to be 1. In cube m line 7 is a D
(and also line 5 which is connected to 7 by j), and this D must now
be set to 1, which is a contradiction that disables the path
sensitization 5 → 6/7 → 9 → 11.
D-Algorithm Example (7)
7) From the singular cover table we get the information that a 1 on line
8 is the same as a 0 on line 4. Additionally, it can be seen that the 0
on line 9 can be obtained by a 1 on line 1.
8) This yields the final cube:
1110DDD10DD
9) ⇒ A test vector for line 5/0 is given by:
1110
Fault Simulation
• Improved Algorithms:
– Parallel Fault Simulation
– Concurrent Fault Simulation
→ discussed in CAD lecture
Design for Testability (1)
Testability:
• controllability
• observability
• → additional chip area required
• → shorter design cycle
Design for testability: complex gate (a) not testable with stuck-at model;
(b) fully testable with stuck-at model
Design for Testability (3)
• Ad-Hoc Techniques:
– developed for special design
– less silicon area
– design automation almost impossible
– partitioning (test of circuit components by use of dedicated
multiplexers)
Design for Testability (5)
Ad-hoc techniques:
insertion of register in order to limit logic depth to a given maximum value
Ad-hoc techniques :
test shift registers for PLA test (increasing PLA area)
Scan-Path Methods (1)
Scan-Path Methods (3)
• Advantages:
– Testability of clocked circuits is improved and guaranteed at design
stage
– Consistent with good VLSI design practice (rules, abstraction,
modularity, ...)
– Does not require special CAD
• Disadvantages:
– Wastes silicon
– Constrains the designer to design according to given conditions
– Additional complexity
• Overhead:
– ≈ 2% for a fundamentally ‘structured’ design
– ≈ 30% for ‘wild’ logic
Built-In Tests (2)
$$x_i(t) = x_{i-1}(t-1) \quad \text{for } 2 \le i \le n$$
$$x_1(t) = \sum_{i=1}^{n} k_i \, x_i(t-1) \pmod 2$$
$$K(x) = k_n x^n + k_{n-1} x^{n-1} + \dots + k_1 x + k_0$$
Built-In Tests (4)
$$K(x) = x^4 + x + 1$$
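A linear feedback shift register for this polynomial can be sketched in a few lines. The sketch assumes that stage 1 receives the modulo-2 sum of all stages i with k_i = 1 and that every other stage takes the value of its predecessor; for K(x) = x^4 + x + 1 that means feedback from stages 1 and 4. The function names and the seed are illustrative.

```python
def lfsr_step(state, taps):
    """Advance the LFSR by one clock.

    state: tuple of bits (x_1, ..., x_n)
    taps:  1-based stage indices i with k_i = 1 (feedback into stage 1)
    """
    feedback = 0
    for i in taps:
        feedback ^= state[i - 1]
    # x_1 gets the feedback bit, every other stage copies its predecessor.
    return (feedback,) + tuple(state[:-1])


def lfsr_period(taps, seed):
    """Number of clocks until the seed state recurs (None if it never does)."""
    start = tuple(seed)
    state = lfsr_step(start, taps)
    steps = 1
    limit = 2 ** len(start)  # more steps than distinct states means no return
    while state != start and steps <= limit:
        state = lfsr_step(state, taps)
        steps += 1
    return steps if state == start else None


if __name__ == "__main__":
    taps = (1, 4)          # k_1 = k_4 = 1 for K(x) = x^4 + x + 1
    seed = (1, 0, 0, 0)    # any non-zero seed; the all-zero state is a lock-up state
    print(lfsr_period(taps, seed))
```

Under these assumptions the register runs through all 2^4 − 1 = 15 non-zero states before repeating, which is what makes it usable as a pseudo-random test pattern generator.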
$$F \approx 1 - \frac{1}{\sqrt{m \cdot \pi}}$$
Evaluation of Testing Data (2)
• Signature analysis
– Communication technique: coding theory
– Code words: data stream D, polynomial P(x), division modulo 2
$$\frac{D}{P} = Q + \frac{R}{P}$$
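The division D / P = Q + R / P over modulo-2 arithmetic can be sketched as follows; the remainder R plays the role of the signature. Encoding polynomials as Python integers (bit i is the coefficient of x^i) and the sample data stream are assumptions made for this example.

```python
def gf2_divmod(d: int, p: int):
    """Divide polynomial d by polynomial p with modulo-2 coefficients.

    Both are encoded as integers: bit i holds the coefficient of x**i.
    Returns (quotient, remainder).
    """
    if p == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    r = d
    p_len = p.bit_length()
    while r.bit_length() >= p_len:
        shift = r.bit_length() - p_len
        q |= 1 << shift      # record the quotient term x**shift
        r ^= p << shift      # subtracting is XOR in modulo-2 arithmetic
    return q, r


if __name__ == "__main__":
    p = 0b10011              # P(x) = x^4 + x + 1
    d = 0b10011101           # hypothetical data stream D
    q, r = gf2_divmod(d, p)
    print(bin(q), bin(r))
```

A fault that changes D is masked only if the faulty stream leaves the same remainder, which is the case analysed in the fault-detection probability F.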
Evaluation of Testing Data (4)
• Interpretation:
– all faults recognized if m < n (trivial)
– for long sequences only n is important
– n = 16 bit → F ≈ 99.9985 %
• Parallel signature register with k inputs:
$$F = 1 - \frac{2^{mk-n} - 1}{2^{mk} - 1}$$
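The detection probability of a signature register can be evaluated exactly with rational arithmetic. The sketch below uses the parallel formula with k inputs and treats k = 1 as the serial case, which is an assumption that agrees with the n = 16 figure quoted above; m is taken to be the sequence length and n the register length.

```python
from fractions import Fraction


def detection_probability(m: int, n: int, k: int = 1) -> Fraction:
    """F = 1 - (2**(m*k - n) - 1) / (2**(m*k) - 1) for an n-bit signature register.

    m: length of the data sequence, n: register length, k: number of parallel inputs.
    """
    bits = m * k
    if bits <= n:
        # No compression takes place, so every fault is visible (trivial case).
        return Fraction(1)
    return 1 - Fraction(2 ** (bits - n) - 1, 2 ** bits - 1)


if __name__ == "__main__":
    f = detection_probability(m=1000, n=16)
    print(f"{float(f) * 100:.4f} %")
```

For long sequences the result approaches 1 − 2^(−n), so only the register length n matters.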
Built-in Logic Block Observation (1)
• Advantages:
– Versatility
• Normal operation
• Scan-path test: enhances testability
• Test vector generation via LFSR
• Data compression via LFSR
• Combined scan-path/self-test using LFSRs
• Disadvantages:
– silicon area
• BILBO latch can be ≈ 50% larger than an ordinary latch
Built-in Logic Block Observation (3)
[Figure labels: feedback disconnect (open in test mode), decoder, binary up-counter, go / no go output, test clock, pass gate, red LED, green LED]
For clarity, mode control lines, normal system clocks, and preset/clear facilities
have been omitted
Chapter 20
Boundary-Scan Architecture –
JTAG Standard
• later North American companies joined the group (→ Joint Test Action Group = JTAG)
VLSI Design
Course 20-1
Darmstadt University of Technology
Institute of Microelectronic Systems 0
Classical Board Test Approaches
• increased density
Introduction to Boundary Scan
Input         | Expected output | Actual output
x1x1x0xxxxxx  | xxxxxxxx01x1    | xxxxxxxx11x0
x0x0x1xxxxxx  | xxxxxxxx10x0    | xxxxxxxx11x0
• self-testing ICs: boundary scan can be used to trigger the self-test procedure
The IEEE Standard 1149.1
• TAP Controller: responds to the control sequences supplied through the test access port
(TAP) and generates the clocks and control signals required for the operation of the other
circuit blocks
• Instruction Register: shift register which is serially loaded with instruction for test
• Test Data Registers: Bank of shift registers. The stimuli values required for a test are
serially loaded into a test register selected by the current instruction. After execution
the results can be shifted out for examination
• Test Clock Input (TCK): independent of the system clock; used for synchronization of
test operations between various chips on a board
• Test Mode Select Input (TMS): Input for controlling the test logic
• Test Data Input (TDI): Serial input for instruction and test register data
• Test Data Output (TDO): Serial output of instruction or test register data (source
selected by TMS code)
Figure 20.11: Use of bus master chip to control IEEE Std 1149.1 chips
20.3.3 TAP-Controller
• 16-state FSM which controls data register (DR) and instruction register (IR) operations
• input signals:
– TRST∗
– TCK
– TMS
– last state (stored in internal FFs)
• output signals:
– Reset*
– Select
– Enable
– ShiftIR
– ClockIR
– UpdateIR
– ShiftDR
– ClockDR
– UpdateDR
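The 16-state TAP controller can be modelled as a lookup table that maps each state to its successor for TMS = 0 and TMS = 1. The state names and transitions below follow the state diagram of IEEE Std 1149.1; the Python encoding and function names are illustrative, and neither the asynchronous TRST* input nor the generated output signals are modelled.

```python
# state -> (next state if TMS = 0, next state if TMS = 1), evaluated on each TCK
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def tap_run(state: str, tms_bits) -> str:
    """Apply a sequence of TMS values (one per TCK cycle) and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


if __name__ == "__main__":
    # From reset, TMS = 0, 1, 0, 0 leads into the data register shift state.
    print(tap_run("Test-Logic-Reset", [0, 1, 0, 0]))
```

A useful property of this state diagram is that holding TMS at 1 for five TCK cycles returns the controller to Test-Logic-Reset from any state.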
Bypass Register
Analog Signal Processing
Chapter 21
as shown in Fig.21.1
The aim of development is to integrate all these functions on a single chip.
Figure 21.3: Signal bandwidths that can be processed by present day (1989)
technologies
Fig. 21.4 illustrates how analog-to-digital (A/D) and digital-to-analog (D/A) converters are
used in data systems. In general, an A/D conversion process will convert a sampled and
held analog signal to a digital word that is a representative of the analog signal. The D/A
conversion process is essentially the inverse of the A/D process. Digital words are applied to
the input of the D/A converter to create from a reference voltage an analog output signal that
is a representative of the digital word.
Figure 21.4: Converters in signal processing systems: (a) A/D, (b) D/A
Digital-To-Analog Converters
Figure 21.5: (a) Conceptual block diagram of a D/A converter, (b) Clocked
D/A converter
In most cases, the digital input of the D/A converter is synchronously clocked. It is therefore
necessary to provide a latch to hold the word for conversion and a sample-and-hold circuit at
the output, as shown in Fig. 21.5(b).
The basic architecture of the D/A converter without an output sample-and-hold circuit is
shown in Fig. 21.7. Fig. 21.8 shows the ideal input-output characteristics for such a D/A
converter.
The output Voltage of a current-scaling D/A converter as shown in Fig. 21.9 can be expressed
as
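$$V_{out} = -\frac{R}{2} I_0 = -\frac{R}{2}\left(\frac{b_1}{R} + \frac{b_2}{2R} + \frac{b_3}{4R} + \dots + \frac{b_N}{2^{N-1}R}\right) V_{ref} \tag{21.4}$$
$$V_{out} = -V_{ref}\left(b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + \dots + b_N 2^{-N}\right) \tag{21.5}$$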
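Eq. 21.5 translates directly into a few lines of code. In this sketch the input word is a list of bits with b1 (the MSB) first; the function name and the reference voltage are assumptions for illustration.

```python
def dac_output(bits, v_ref: float) -> float:
    """Ideal output of the current-scaling D/A converter, Eq. 21.5.

    bits: sequence (b_1, ..., b_N) with b_1 as the most significant bit.
    """
    weighted_sum = sum(b * 2.0 ** -i for i, b in enumerate(bits, start=1))
    return -v_ref * weighted_sum


if __name__ == "__main__":
    v_ref = 5.0  # hypothetical reference voltage in volts
    for word in ([0, 0, 0], [1, 0, 0], [1, 1, 1]):
        print(word, dac_output(word, v_ref))
```

The MSB alone contributes half of the reference voltage, and the all-ones word stays one LSB below full scale.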
Figure 21.6: (a) Sample-and-hold circuit, (b) Waveforms illustrating the op-
eration of the sample-and-hold circuit
The major disadvantage of this approach is the large ratio of component values. For example,
the ratio of the resistor for the MSB to the resistor for the LSB is given by
$$\frac{R_{MSB}}{R_{LSB}} = \frac{1}{2^{N-1}} \tag{21.6}$$
Thus, the output voltage of the R-2R D/A converter is given by Eq. 21.5.
A voltage-scaling D/A converter is shown in Fig. 21.11. Its output voltage at any tap i can
be expressed as
$$V_i = \frac{V_{ref}}{8}\,(i - 0.5) \tag{21.8}$$
The output voltage of the D/A converter is then determined by the values of the inputs b1 ,
b2 and b3 .
The structure of this voltage-scaling D/A converter is very regular and thus well suited for
MOS technology. A problem with this type of D/A converters is the accuracy requirements
of the resistors used. This makes it difficult to build D/A converters of this type with more
than 8 bit resolution.
Analog-To-Digital Converters
The objective of an A/D converter is the determination of the digital word corresponding to
the analog input signal. Usually a sample-and-hold circuit (see Fig. 21.6) is required at the
input of the A/D converter because it is not possible to convert a changing analog signal. A
block diagram of a general A/D converter is shown in Fig. 21.12. The ideal input-output
characteristics for an A/D converter are shown in Fig. 21.13.
Two possible implementations of serial A/D converters are single-slope and dual-slope A/D
converters. Neither is discussed in detail here. The main advantage of these converters
is their simplicity; their main disadvantage is the long conversion time required.
A successive approximation A/D converter converts an analog input into an N-bit digital word in N clock
cycles. Consequently, the conversion time is less than for the serial converters without much
increase in the complexity of the circuit. Fig. 21.14 shows an example of a successive approx-
imation A/D converter architecture.
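The successive approximation procedure can be sketched as a binary search with one bit decided per clock cycle. This is a behavioural model only: the comparison against an ideal internal D/A converter without offset, the function name and the sample voltages are assumptions and do not describe the circuit of Fig. 21.14.

```python
def sar_convert(v_in: float, v_ref: float, n_bits: int) -> int:
    """Return the N-bit code found by successive approximation.

    One loop iteration corresponds to one clock cycle: the current bit is set
    tentatively and kept only if the internal D/A output does not exceed v_in.
    """
    code = 0
    for bit in range(n_bits - 1, -1, -1):       # MSB first
        trial = code | (1 << bit)
        v_dac = trial * v_ref / (1 << n_bits)   # ideal D/A converter (assumed)
        if v_in >= v_dac:
            code = trial
    return code


if __name__ == "__main__":
    # Hypothetical 3-bit converter with a 4 V reference
    print(format(sar_convert(2.6, 4.0, 3), "03b"))
```

The loop runs exactly N times, matching the N clock cycles stated above.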
In many applications, it is necessary to have a smaller conversion time than is possible with
the previously described A/D converter architectures. Parallel A/D converters, also known as
flash A/D converters, typically require down to one clock cycle for conversion. An architecture
of a 3-bit parallel A/D converter is shown in Fig. 21.16.
Parallel A/D converters can typically reach conversion rates of up to 20 MHz in CMOS technology. The
sample-and-hold time may, however, be larger than 50 ns and could prevent this conversion rate from
being realised. Another problem is that the number of comparators required is $2^N - 1$. For N
greater than 8, too much area is required.
One method of achieving small system conversion times is to use slower A/D converters in
parallel, which is called time-interleaving and is shown in Fig. 21.17. Here M successive
approximation A/D converters are used in parallel to complete the N -bit conversion of one
analog signal per clock cycle. The sample-and-hold circuits consecutively sample and apply
the input analog signal to their respective A/D converters. N clock cycles later, the A/D
converter provides a digital word output. If M = N , then a digital word is given out every
clock cycle. If one examines the chip area for an N -bit A/D converter using the parallel A/D
converter architecture (M = 1) compared with the time-interleaved architecture for M = N ,
the minimum area will occur for a value of M between 1 and N .
Introduction
The basic structure of a sigma-delta converter is shown in Fig. 21.18. The sigma-delta con-
verter can be referred to as an oversampling converter, although oversampling is just one of
the techniques contributing to the performance of a sigma-delta converter. The sigma-delta
converter shown in Fig. 21.18 quantizes an analog signal with very low resolution (1 bit) and
a very high sampling rate (2 MHz). With the use of oversampling techniques and digital
filtering, the sampling rate is reduced (8 kHz) and the resolution is increased (16 bits).
A more detailed block diagram of the sigma-delta modulator is shown in Fig. 21.19. It consists
of an integrator, a quantizer (comparator for 1 bit) and a feedback loop with a D/A converter
(switch for 1 bit). The output of the sigma-delta modulator is shown in Fig.21.20 for a sine
wave input. The single-bit conversion will result in an output which is either ’1’ or ’0’. When
the signal is near plus full scale, the output is positive during most of the clock cycles. The
opposite is true for near minus full scale signals. When the output is followed by a digital
filter as shown in Fig. 21.18 which can perform sophisticated averaging functions, the 1-bit
sequence is transformed into a much more meaningful signal.
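The behaviour described above can be reproduced with a small discrete-time model of a first-order modulator: an integrator, a 1-bit quantizer and a 1-bit D/A converter in the feedback path. The normalisation of full scale to ±1, the plain averaging used in place of the digital filter, and all names are assumptions of this sketch.

```python
def sigma_delta_modulate(samples):
    """First-order sigma-delta modulator; returns one output bit per input sample.

    samples: input values normalised to the range -1 (minus full scale) to +1.
    """
    integrator = 0.0
    feedback = 0.0           # output of the 1-bit D/A converter (+1 or -1)
    bits = []
    for x in samples:
        integrator += x - feedback            # summing node followed by integrator
        bit = 1 if integrator >= 0.0 else 0   # comparator as 1-bit quantizer
        feedback = 1.0 if bit else -1.0       # 1-bit D/A converter
        bits.append(bit)
    return bits


def average_decode(bits):
    """Crude stand-in for the digital filter: mean of the bit stream mapped to -1..+1."""
    return sum(2 * b - 1 for b in bits) / len(bits)


if __name__ == "__main__":
    stream = sigma_delta_modulate([0.3] * 4096)   # hypothetical constant input
    print(sum(stream), average_decode(stream))
```

For a constant input the density of ones tracks the input level, so averaging many 1-bit samples recovers the value with a resolution that grows with the number of samples averaged.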
Noise Shaping
One feature that makes the sigma-delta converter so powerful is its noise shaping capability.
To understand how this works, the analysis of the sigma-delta modulator in the frequency
domain is appropriate. Fig.21.21 shows the frequency domain linearized model of a sigma-
delta modulator.
The integrator is represented as an analog filter. For an integrator, the transfer function has
an amplitude which is inversely proportional to the input frequency (1/f relationship). The
quantizer is modelled as a gain stage followed by the addition of quantization noise.
Thus, the output y of the sigma-delta converter can be expressed by

$$ y = \frac{1}{f}\,(x - y) + q \qquad (21.9) $$

where (x − y) is the difference signal from the summing node at the input and q is the
quantization noise. Applying some algebraic rearrangement yields

$$
\begin{aligned}
y &= \frac{x}{f} - \frac{y}{f} + q \\
\left(1 + \frac{1}{f}\right) y &= \frac{x}{f} + q \\
y &= \frac{x/f}{1 + 1/f} + \frac{q}{1 + 1/f} \\
y &= \frac{x}{f + 1} + \frac{q\,f}{f + 1} \qquad (21.10)
\end{aligned}
$$
At a frequency f = 0, the output signal equals x with no noise element q. At higher frequencies,
the value of x is reduced and the influence of q increases. In essence, the sigma-delta modulator
has a low pass effect on the signal and a high pass effect on the noise. As a result of this,
the modulator can be thought of as a noise shaping filter where noise in the signal pass band
is reduced and noise energy is pushed into the higher frequency region. The effect of this
procedure on normally equally distributed (white) quantization noise is shown in Fig. 21.22.
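The low-pass effect on the signal and the high-pass effect on the noise can be read directly from the two weights in equation (21.10). The following sketch simply evaluates them; the function names are illustrative, and f is the normalized frequency variable of the simplified model, not a frequency in Hz.

```python
def signal_weight(f):
    """Weight of the input x in equation (21.10): 1 / (f + 1)."""
    return 1.0 / (f + 1.0)


def noise_weight(f):
    """Weight of the quantization noise q in equation (21.10): f / (f + 1)."""
    return f / (f + 1.0)


if __name__ == "__main__":
    for f in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(f"f = {f:6.1f}   signal: {signal_weight(f):.3f}   noise: {noise_weight(f):.3f}")
```

At f = 0 the signal weight is 1 and the noise weight is 0. As f grows, the weights move towards 0 and 1 respectively, and they always sum to 1.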
Digital Filtering
The sigma-delta modulator described so far produces a stream of single-bit digital values at
a very high rate. The modulator’s output bit stream is fed into the converter’s digital filter,
which performs several different functions. All of these functions, however, are integrated into
a single filter implementation. The functions of the filter are:
The sampling rate reduction is done by averaging over a number of cycles of the input bit
stream. This produces an output data stream that is reduced in sampling rate, but increased
in resolution (i.e. number of bits per sample).
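A minimal sketch of sampling rate reduction by averaging is given below. It uses a plain block average as a stand-in for the converter's digital filter, which is an assumption; practical filters are usually more elaborate. The decimation ratio of 250 corresponds to the 2 MHz and 8 kHz rates mentioned in the introduction, and the function name is illustrative.

```python
def decimate_by_averaging(bits, ratio):
    """Average consecutive blocks of `ratio` bits into one output sample.

    bits:  sequence of 0/1 values from the modulator.
    ratio: number of input bits per output sample (e.g. 2 MHz / 8 kHz = 250).
    Returns samples scaled to the range -1.0 ... +1.0.
    """
    samples = []
    for start in range(0, len(bits) - ratio + 1, ratio):
        block = bits[start:start + ratio]
        # Map the density of ones (0 ... 1) to a signed value (-1 ... +1).
        samples.append(2.0 * sum(block) / ratio - 1.0)
    return samples


if __name__ == "__main__":
    # A bit stream with three ones in every four bits averages to +0.5.
    stream = [1, 1, 1, 0] * 500
    print(decimate_by_averaging(stream, 250)[:4])
```

Each output sample can take ratio + 1 distinct values, which illustrates how averaging trades sampling rate for resolution.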
• Sigma-delta converters are a complete conversion and filtering system; additional digital
filtering functions may easily be implemented in the digital output filter of the converter.
• Very low-cost and high-performance conversion is possible, as the analog part of the
converter is very simple and need not be as accurate as in other A/D converters. The
main part of the converter is the digital filter, which can be integrated more easily in
MOS technology.
22. VLSI in Communications
State-of-the-art
[Figure: interaction between "RF Design, Communications and DSP Algorithms Design" and "VLSI Design"]
• Power-Area-Speed optimization
• Performance improvement
• Flexibility
• Risk minimization
Trends
[Figure: "RF Design, Communications and DSP Algorithms Design" and "VLSI Design", linked by the trade-off during the development (interdisciplinary issue)]
Challenge
Towards complex on-chip wireless system design
• The increasing demands of communication and multimedia processing can be met by
the high integration density of microelectronic circuits
1. Key ingredients
- maximize digital components
- minimize analog and passive elements (i.e. simplify the design
requirements of analog components by moving to digital processing
as early as possible)
- low power design techniques
Design Flow: Overview
[Figure: design flow overview.
System Design: description in a modeling language (e.g. Matlab, C, C++, SDL, ...) or in a graphical environment; dataflow-oriented (e.g. signal processing; tools: Cossap, Simulink, SPW, ...) or controlflow-oriented (e.g. protocols, state diagram; tools: Statemate); followed by system simulation and analysis (frequency spectrum, eye pattern, bit error rate versus C/N [dB] over an AWGN channel for 6, 9, 12, 18, 27, 36 and 54 Mbps) and optimization.
Digital Baseband Processing, software: code generation, compiler.
Digital Baseband Processing, hardware: digital hardware description (VHDL, Verilog), synthesis (RTL, high level), placement & routing.
Analog & RF Design: analog description (VHDL-AMS, Spice, ...), simulation (SPECTRE, ...), layout generation.
Goal: one chip containing RF, analog, ADC, DAC, data path, control logic, cores (DSP, RISC), RAM and ROM.]
[Figure: transceiver block diagram.
Analog part: duplexer, LNA, mixer, VCO, IF mixer, IF PLL, demodulator (I/Q), ADC; DAC (I/Q), modulator, transmit PLL, VCO, power amplifier.
Digital part (digital baseband processing):
- Diversity reception
- Equalization (RLS, Viterbi)
- Channel coding/decoding
- Voice coding/decoding
- Interleaving/deinterleaving
- Encryption/decryption]
IC Technologies
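Comparison of three IC technologies (1999; the technology names heading the columns are missing):

             | Column 1                         | Column 2                       | Column 3
Features     | fT up to 30 GHz,                 | fT up to 30 GHz,               | fT up to 80 GHz,
             | fMAX up to 40 GHz (0.15 µm)      | fMAX up to 40 GHz              | fMAX up to 75 GHz
Application  | • Digital baseband               | • Intermediate frequency (IF)  | • IF and RF modules
             | • Trends: RF, IF and             |   modules                      |
             |   analog baseband                |                                |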
CMOS is currently the best IC technology for single chip solutions (analog +
digital) for communication applications
CMOS technologies
Advantages: Mature technology, high integration density, cost-effective
Drawbacks: Bad noise figure, bad linearity, substrate parasitics
IC Technologies (cont’d)
Silicon germanium (SiGe): HBT, HEMT
IC Technologies (cont’d)
Gallium arsenide (GaAs): MESFET, HEMT, HBT
23. Digital Baseband Design
[Figure: design levels and synthesis steps. A behavioral-level description in VHDL or Verilog (e.g. "For I=0 to I=15: Sum = Sum + array[I]") is transformed by behavioral synthesis into an architectural-level description (memory, control, state machine), followed by architectural synthesis and RTL synthesis down to the clocked circuit (Clk, Gnd).]
[Figure: gain versus input parameter for floating point, 8 bits and 4 bits. E.g. constraint: gain loss < 0.5 dB.]
Behavioral or High-Level Synthesis
1. Resource allocation
2. Scheduling (! Introduction of timing information !)
3. Resource assignment
TOOLS:
SYNOPSYS Behavioral Compiler, Virtual Artist
[Figure: behavioral synthesis example. The behavioral description "For I=0 to I=15: Sum = Sum + array[I]" (clocked or untimed, inputs Data[15], output Sum) is mapped by behavioral synthesis to a clocked register-level structure (e.g. 20% technology dependent) with a memory MEM, an address input, a sum register, Clock and Clear.]
Low Power Design
• Simple example:
– 4 bit counter
Gray-Counter – Schematic
Average power consumption: 76 µW @ 20 MHz

$$ \frac{76\,\mu\mathrm{W}}{102\,\mu\mathrm{W}} \approx 75\,\% $$
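One well-known reason why a Gray counter can consume less dynamic power than a binary counter is that exactly one output bit changes per clock cycle. The sketch below counts output bit toggles over one full cycle of a 4-bit counter. It only looks at the counter outputs and is not a model of the measured power figures above, which also include the internal logic and the clock network; all function names are illustrative.

```python
def binary_sequence(width):
    """Counting sequence of a binary counter with `width` bits."""
    return list(range(2 ** width))


def gray_sequence(width):
    """Counting sequence of a Gray counter (reflected binary Gray code)."""
    return [n ^ (n >> 1) for n in range(2 ** width)]


def toggles_per_cycle(sequence):
    """Total number of output bit changes over one full wrap-around cycle."""
    total = 0
    for i, value in enumerate(sequence):
        next_value = sequence[(i + 1) % len(sequence)]
        total += bin(value ^ next_value).count("1")
    return total


if __name__ == "__main__":
    print("binary:", toggles_per_cycle(binary_sequence(4)))
    print("gray:  ", toggles_per_cycle(gray_sequence(4)))
```

For four bits the binary counter produces 30 output toggles per cycle and the Gray counter 16.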
Bibliography
[2] European Silicon Structures (ES2), Zone Industrielle, 13106 Rousset, France. Solo 2030
User Guide, e02a02 edition, June 1992.
[3] John P. Hayes. Computer Architecture and Organization. McGraw-Hill, Inc., 1988.
[4] Kai Hwang. Computer Arithmetic – Principles, Architectures, and Design. John Wiley
and Sons, 1979.
[6] Jan M. Rabaey. Digital Integrated Circuits – A Design Perspective. Prentice Hall,
http://bwrc.eecs.berkeley.edu/Classes/IcBook/index.html.
[7] John P. Uyemura. Fundamentals of MOS Digital Integrated Circuits. Addison Wesley,
1988.
[8] John P. Uyemura. Circuit Design for CMOS VLSI. Kluwer Academic Publishers, 1992.
[9] Neil Weste and Kamran Eshraghian. Principles of CMOS VLSI design. Addison-Wesley
Publishing Company, 1985.
Appendix
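Derivation of current equations of short channel devices
[Figure: cross-section of an n-channel MOSFET with source (S), gate (G), drain (D) and body (B) terminals, terminal voltages vS, vG, vD, vB, currents iD and iB, n+ source and drain regions, channel region of length L along the x axis, p-type substrate.]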
There is a negative charge induced by the voltage VGS − VT under the oxide. The charge
per area can therefore be expressed as:
For simplification the electron velocity v(x) with velocity saturation effect is assumed as:

$$ v(x) = \frac{-\mu_n E(x)}{1 + \left| \dfrac{E(x)}{E_c} \right|} \qquad (3) $$

$$ \text{with: } E(x) = -\frac{dV}{dx} \qquad (4) $$
Combining equations (2) and (3) yields:

$$ I_D = W \cdot C''_{ox} \cdot \frac{\mu_n \dfrac{dV}{dx}}{1 + \dfrac{1}{E_c}\dfrac{dV}{dx}} \cdot \left( V_{GS} - V_T - V(x) \right) \qquad (5) $$
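Equivalence conversion:

$$ I_D + \frac{I_D}{E_c}\frac{dV}{dx} = \mu_n \cdot W \cdot C''_{ox} \cdot \left( V_{GS} - V_T - V(x) \right) \frac{dV}{dx} \qquad (6) $$

$$ I_D\,dx + \frac{I_D}{E_c}\,dV = \mu_n \cdot W \cdot C''_{ox} \cdot \left( V_{GS} - V_T - V(x) \right) dV \qquad (7) $$

$$ I_D\,dx = -\frac{I_D}{E_c}\,dV + \mu_n \cdot W \cdot C''_{ox} \cdot (V_{GS} - V_T)\,dV - \mu_n \cdot W \cdot C''_{ox} \cdot V(x)\,dV \qquad (8) $$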
Ec
Integrating both sides yields:
ZL VDSµ
Z ¶
ID 00 00
ID dx = − + µn · W · Cox · (VGS − VT ) − µn · W · Cox · V (x) dV
Ec
0 0
ID 00 1 00 2
ID L = − VDS + µn · W · Cox · (VGS − VT )VDS − µn · W · Cox · VDS
Ec 2
µ ¶ µ ¶
VDS 00 1 2
ID L + = µn · W · Cox · (VGS − VT )VDS − VDS
Ec 2
For an NMOS device we get for the drain current including velocity saturation:

$$ I_D = \frac{1}{1 + \dfrac{V_{DS}}{E_c L}} \cdot \mu_n \cdot C''_{ox} \cdot \frac{W}{L} \cdot \left( (V_{GS} - V_T) V_{DS} - \frac{1}{2} V_{DS}^2 \right) \qquad (9) $$
This equation is true for both the NMOS and the PMOS device.
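To get a feeling for equation (9), the following sketch compares the drain current with and without the velocity saturation factor 1 / (1 + VDS / (Ec L)). All device parameters in the example are hypothetical values chosen for illustration (SI units), and the function name is not taken from the script.

```python
def drain_current(v_gs, v_ds, v_t, mu_n, c_ox, w, l, e_c=None):
    """Drain current according to equation (9), in SI units.

    mu_n: mobility in m^2/(V s), c_ox: oxide capacitance per area in F/m^2,
    w, l: channel width and length in m, e_c: critical field in V/m.
    With e_c=None the velocity saturation factor is omitted.
    """
    current = mu_n * c_ox * (w / l) * ((v_gs - v_t) * v_ds - 0.5 * v_ds ** 2)
    if e_c is None:
        return current
    return current / (1.0 + v_ds / (e_c * l))


if __name__ == "__main__":
    # Hypothetical device: values are assumptions for illustration only.
    params = dict(v_gs=2.5, v_ds=0.5, v_t=0.5, mu_n=0.05, c_ox=3.45e-3,
                  w=1.0e-6, l=0.25e-6)
    without = drain_current(**params)
    with_sat = drain_current(e_c=1.5e6, **params)
    print(f"without velocity saturation: {without * 1e6:.1f} uA")
    print(f"with velocity saturation:    {with_sat * 1e6:.1f} uA")
```

For long channels Ec L becomes large compared to VDS, the factor approaches 1, and the expression reduces to the equation without velocity saturation.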
Exercises
1. Exercise: Short channel MOSFETs
1. Problem: Short channel MOSFETs
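Formulas:

$$ I_{DS} = \kappa(V_{DS})\,\mu \cdot C_{OX}\,\frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right] $$

$$ \kappa(V_{DS}) = \frac{1}{1 + V_{DS} / (E_C L)} $$

$$ R_{on} = \left. \frac{1}{g_{DS}} \right|_{V_{DS} \to 0}, \qquad g_{DS} = \frac{\partial I_{DS}}{\partial V_{DS}} $$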
Exercises 2
1. NMOS Inverter
[Figure: NMOS inverter variants supplied from VDD, with resistive load RL and with load transistor QL]
• Calculate VOH
• Calculate VOL
• Calculate VIH
Hints:
• The body effect (influence of the bulk- source voltage) of the load
transistor must be taken into account when determining its
threshold voltage. Therefore the following equation for the
threshold voltage can be used:
$$ V_{TH} = V_{T0} + \gamma \left( \sqrt{2|\phi_F| + V_{SB}} - \sqrt{2|\phi_F|} \right) $$
• An equation of type x = f(x) can be solved numerically by starting
at any value for x and iteratively calculating f(x) until the result
reaches the desired precision.
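The iterative procedure from the second hint can be sketched as follows. The example solves VOH = VDD − VTH(VSB = VOH), which assumes an enhancement load whose gate is tied to VDD; this circuit assumption and all numerical parameter values are hypothetical and only serve as illustration. The iteration converges when the slope of f(x) near the solution is smaller than 1 in magnitude, which holds for this example.

```python
import math


def threshold_voltage(v_sb, v_t0, gamma, two_phi_f):
    """Threshold voltage including the body effect (equation from the hint)."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def solve_fixed_point(f, x0, tolerance=1e-9, max_iterations=100):
    """Solve x = f(x) by repeated substitution, starting at x0."""
    x = x0
    for _ in range(max_iterations):
        x_next = f(x)
        if abs(x_next - x) < tolerance:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    # Hypothetical parameters: VDD = 5 V, VT0 = 1 V, gamma = 0.4 V^0.5, 2|phi_F| = 0.6 V.
    v_dd, v_t0, gamma, two_phi_f = 5.0, 1.0, 0.4, 0.6

    def f(v_out):
        # The source of the load transistor sits at v_out, so V_SB = v_out.
        return v_dd - threshold_voltage(v_out, v_t0, gamma, two_phi_f)

    print("V_OH =", solve_fixed_point(f, x0=0.0))
```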
[Figure: pull-up characteristics, Vout (V) versus Iout (mA)]
[Figure: output characteristics ID (mA) versus VDS (V) for VGS = 1.5 V to 5.0 V]
Determination of Voltage Transfer Characteristic (VTC)
[Figure: pull-up characteristic of the enhancement load drawn into the output characteristics ID (mA) versus VDS (V) for VGS = 1.5 V to 5.0 V]
[Figure: VTC of the NMOS inverter, Vout (V) versus Vin (V)]
3. Exercise: NMOS and CMOS Inverter
Problem 1
The figure below shows the layout of a CMOS inverter, whose dimensions
are given in micrometers. The inverter is realized in an n-well CMOS process.
The oxide capacitance is Cox = 69.1 nF/cm² for both n- and p-channel
transistors. The drain-bulk and source-bulk depletion capacitances of the
transistors are given by the following parameters:

Parameter         NMOS     PMOS
Cj0 [fF/µm²]      0.0975   0.0298
Cjsw0 [fF/µm]     0.107    0.362
φ0 [V]            0.879    0.939
φ0sw [V]          0.921    0.985
Although not explicitly shown in the figure, an overlap L0 = 0.3 µm is assumed
and must be included in the calculations. The supply voltage is VDD = 5 V.
a) Compute the maximum value of CGDn and CGDp.
b) Determine the zero-bias value of Cdbn and Cdbp.
Take the sidewall and the bottom regions into account separately.

$$ C_{bottom} = \frac{C_{j0} \cdot \text{area}}{\sqrt{1 + V_r/\phi_0}} ; \qquad C_{sidewall} = \frac{C_{jsw0} \cdot \text{perimeter}}{\sqrt{1 + V_r/\phi_{0sw}}} $$

c) Compute K(VOH, VOL) for the inverter and use it to determine Cout,
i.e. ignore the interconnecting wires and CG.

$$ C_{db,average} = K(V_{OL}, V_{OH}) \cdot C_{db,max} $$ (average for V between VOL and VOH)

d) Compute tHL and tLH for the inverter, using the value of Cout
determined above. Use the following parameters for the transistors:
NMOS: VT0n = +0.8 V, kn = 40 µA/V²; PMOS: VT0p = −0.8 V, kp = 16 µA/V²
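The averaging factor K used in part c) can be sketched as follows, assuming a step junction, i.e. a capacitance that varies as 1 / sqrt(1 + Vr/φ0), and taking K as the mean of this voltage dependence between two reverse voltages. Both the closed form and the function names are assumptions for illustration; the numerical integration only serves as a cross-check of the closed form.

```python
import math


def k_factor(v_low, v_high, phi0):
    """Mean of 1/sqrt(1 + V/phi0) for V between v_low and v_high (closed form)."""
    return (2.0 * phi0 / (v_high - v_low)) * (
        math.sqrt(1.0 + v_high / phi0) - math.sqrt(1.0 + v_low / phi0))


def k_factor_numeric(v_low, v_high, phi0, steps=10000):
    """Same mean value, computed with the midpoint rule as a cross-check."""
    width = (v_high - v_low) / steps
    total = 0.0
    for i in range(steps):
        v = v_low + (i + 0.5) * width
        total += 1.0 / math.sqrt(1.0 + v / phi0)
    return total * width / (v_high - v_low)


if __name__ == "__main__":
    # Bottom junction of the NMOS (phi0 = 0.879 V), reverse voltage from 0 V to 5 V.
    print("closed form:", k_factor(0.0, 5.0, 0.879))
    print("numeric:    ", k_factor_numeric(0.0, 5.0, 0.879))
```

The factor is always between 0 and 1, so the average capacitance is smaller than the zero-bias value.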
Problem 2
The figure below shows the layout of two cascaded CMOS inverters, each
stage being identical to the one analysed in Problem 1. Capacitances of
the connecting wires are now taken into account. Let Cp-f = 0.0576 fF/µm²
and Cm-f = 0.0345 fF/µm².
a) Compute the metal-field capacitance from the output of the first stage to
the metal-poly contact of the second stage. Consider only the metal-field
regions, ignoring the regions in which metal overlaps n+, p+ or poly.
b) Determine the input capacitance of the second stage, as seen from the
beginning of the poly line. Determine the sum Cline + Cg, using the value
of Cout calculated in Problem 1. Is one of the two capacitances dominating?
Problem 3
Let’s consider a CMOS inverter with βn = βp = 35 µA/V² and VT0n = 0.9 V,
VT0p = −0.8 V. The output capacitance is Cout = 125 fF and the supply voltage
is VDD = 5 V.
a) Compute tHL and tLH for the inverter.
b) Determine the propagation delay time tp.
4. Exercise: CMOS and Pass Transistor Logic
b)
2. Problem: CMOS Logic
Synthesize the CMOS circuit for a parity generator with four inputs:
Z = A⊕ B ⊕C ⊕ D
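Before synthesizing the circuit it can help to tabulate the function. The following sketch lists the truth table of the four-input parity function; the function name is illustrative.

```python
from itertools import product


def parity4(a, b, c, d):
    """Z = A xor B xor C xor D for single-bit inputs (0 or 1)."""
    return a ^ b ^ c ^ d


if __name__ == "__main__":
    print("A B C D | Z")
    for a, b, c, d in product((0, 1), repeat=4):
        print(a, b, c, d, "|", parity4(a, b, c, d))
```

Z is 1 exactly when an odd number of inputs is 1, which is the case for 8 of the 16 input combinations.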
6. Problem: Pass Transistor Logic
Exercise 5: Dynamic Logic
Draw the transistor level circuit of a dynamic ripple carry full adder,
whose logic functions are the following.

$$ C_{n+1} = A_n \cdot B_n + C_n (A_n + B_n) $$

$$ S_n = \overline{C_{n+1}}\,(A_n + B_n + C_n) + A_n \cdot B_n \cdot C_n $$
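The two logic functions can be checked exhaustively against ordinary binary addition. The sketch below assumes the sum is formed from the complemented carry, S = not(C_{n+1}) · (A + B + C) + A · B · C, which is the form that agrees with A + B + C modulo 2; function names are illustrative.

```python
from itertools import product


def carry_out(a, b, c):
    """C_{n+1} = A*B + C*(A + B) for single-bit inputs."""
    return (a & b) | (c & (a | b))


def sum_from_carry(a, b, c):
    """S_n = not(C_{n+1}) * (A + B + C) + A*B*C."""
    not_carry = 1 - carry_out(a, b, c)
    return (not_carry & (a | b | c)) | (a & b & c)


def matches_binary_addition():
    """True if both functions agree with A + B + C for all 8 input combinations."""
    return all(
        2 * carry_out(a, b, c) + sum_from_carry(a, b, c) == a + b + c
        for a, b, c in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("full adder equations consistent:", matches_binary_addition())
```

Reusing the carry signal for the sum is what allows the sum gate to share logic with the carry gate.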
Problem 2: Charge Sharing
The function:
Z = A (B + C + D + E + F )
All input variables in the above circuit come from domino logic blocks, so
that immediately after the precharge we have: A = B = C = D = F = 0 V.
For which possible 0 → 1 transitions does the charge sharing effect have the
greatest influence? The capacitances are:
CX1 = CX2 = 10 fF, Cout,1 = 185 fF
Calculate the voltage Vout,1. Repeat the calculation for CX1 = CX2 = 40 fF.
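Charge sharing can be estimated from charge conservation: the charge stored on the precharged output node redistributes over all capacitances that become connected to it. The sketch below assumes that the output node is precharged to VDD = 5 V (a value not given in this problem), that both internal nodes start at 0 V, and that threshold voltage drops across the transistors are ignored. The function name is illustrative.

```python
def shared_voltage(v_precharge, c_out, internal_caps, v_internal=0.0):
    """Voltage after charge redistribution between c_out and internal nodes.

    v_precharge:   initial voltage on the output node.
    c_out:         capacitance of the output node.
    internal_caps: capacitances of the internal nodes that get connected.
    v_internal:    initial voltage on the internal nodes.
    """
    c_internal = sum(internal_caps)
    total_charge = c_out * v_precharge + c_internal * v_internal
    return total_charge / (c_out + c_internal)


if __name__ == "__main__":
    v_dd = 5.0  # assumed supply voltage
    for c_x in (10e-15, 40e-15):
        v = shared_voltage(v_dd, 185e-15, [c_x, c_x])
        print(f"CX1 = CX2 = {c_x * 1e15:.0f} fF -> Vout,1 = {v:.2f} V")
```

The larger the internal capacitances are relative to Cout,1, the further the output voltage drops.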
Exercises 6
Problem 2: Inverter Chain
[Figure: inverter chain between the input "In" and the load CL. The input capacitances of the stages are Cg, S·Cg, ..., S^(M−1)·Cg, with S^M·Cg = CL.]
Exercise 6: Line Propagation Delay, Microelectronic
Buffer Stages Systems 3
Institute of
Exercise 6: Line Propagation Delay, Microelectronic
Buffer Stages Systems 4
Exercises 7
Gate-Matrix, Stick-Diagrams, Euler Graphs
Let’s consider a full adder, whose input signals are A, B and Cin.
The outputs are S and Cout.
A) Draw the logic table for the full adder and determine the
equations for S and Cout.
B) Show the stick-diagram of the full adder
Problem 2: Barrel Shifter
Exercises 8
PLA Structures
Draw the stick diagram of a NMOS PLA that implements a full adder
stage. The input and the output registers are clocked using φ1 and
φ2 respectively.
Problem 2: FSM implementation with PLA
Design and implement with PLA a traffic light controller for the
crossroad below. The farm road has sensors for detecting waiting
cars.
There is also a timer available, which is triggered by the rising edge of a ‘Start’
signal and provides two output signals:
TShort - during the yellow phase
TLong - for timing the green phase
[Figure: timing diagram of the signals Start, TShort and TLong]
[Figure: state diagram of the traffic light controller; one transition is labelled "S = 0 or TLong = 0"]
First, draw the schematics of the controller, showing the PLA, the
timer and the traffic lights.
Problem 1: Unconfigured PLA
[Figure: unconfigured PLA template with supply Vdd, input columns for A, B and C (true and complemented), clocks Phi1 and Phi2, inputs Ai, Bi, Ci and outputs Si, Ci+1. Legend: active, polysilicon, metal, transistor.]
Problem 2: Unconfigured PLA
[Figure: unconfigured PLA template with supply Vdd, input columns for S, T, Y1 and Y0 (true and complemented), product term rows P2 to P9, clocks Phi1 and Phi2, inputs S, T, Y1, Y0 and outputs Y1', Y0', FL1, St, HL1. Legend: active, polysilicon, metal, transistor.]
Appendix: IMI Unprogrammed Array
Appendix: CDI Unprogrammed Array