
VLSI

Design of Integrated
Circuits
Winter Term 2006/2007

Dr.-Ing. Thomas Hollstein

Dipl.-Ing. Heiko Hinkelmann


Dipl.-Ing. Petru Bacinschi

Darmstadt University of Technology


Institute of Microelectronic Systems
Prof. Dr. Dr. h.c. mult. Manfred Glesner
Karlstr. 15, 64283 Darmstadt, Germany
http://www.microelectronic.e-technik.tu-darmstadt.de


Organizational (I)
• This lecture is intended for students of the following programs:
– Wirtschaftsingenieurwesen Elektrotechnik (Industrial Engineering, Electrical Engineering track; FB1, from the 5th semester)
– Elektrotechnik und Informationstechnik (Electrical Engineering and Information Technology; FB18, from the 5th semester)
– Informatik (Computer Science; FB20, after the Vordiplom)
– International Master Program Information & Communication Engineering
– Master Program "Informations- und Kommunikationstechnik" (Information and Communication Technology)

• Requirements: basics of electronics and communications (i.e. the lecture "Grundlagen der Elektronik", Fundamentals of Electronics)

• Courses that build directly on this lecture:
– VLSI-Design Lab. (SS)
– Microelectronics CAD Lab. (2 weeks, full day course, WS)
– VHDL-course (1 week) and VHDL-laboratory (2 weeks) (full day course, SS)
– Computer Aided Design for Integrated Circuits (RSE1, SS)
– Advanced Methods of Computer Aided Design for Integrated Circuits (RSE2, WS)
Organizational (II)
Lecture:
Monday 14:25 - 16:05 in room S3|06/051 (former 48/051)
Wednesday 11:40 - 13:20 in room S3|06/053 (former 48/053)

Practice:
The exercises will take place within the lecture hours (Mon. or Wed.)

Attending Staff (building "Sitte", Karlstr. 15):
Dr.-Ing. Thomas Hollstein, room S4|04/209, Tel. 16-4038
Dipl.-Ing. Heiko Hinkelmann, room S4|04/207, Tel. 16-4238
Dipl.-Ing. Petru Bacinschi, room S4|04/201, Tel. 16-4439

Consultation hours:
On request

Exam

Diploma Exam:

Type: written exam
Date: will be announced by the FB18 examination office
Duration: 90 minutes
Permitted aids: none
Relevant topics: topics of lectures and exercises
Overview

• Introduction
• Repetition MOS Devices
• CMOS Inverter
• CMOS Technology
• Static CMOS Logic
• Synchronous Logic
• Basic Sequential Circuits
• Performance
• CAD - Design Flow
• Digital Subsystem Design
• ASIC Design Concepts
• Arithmetic Units
• Micro Architectures
• Memories
• ASIC Design Guidelines
• Design for Testability
• VLSI in Signal Processing
• VLSI in Communications
• Digital Baseband Design

Literature
[1] John P. Uyemura: Fundamentals of MOS Digital Integrated Circuits, Addison Wesley, 1988

[2] John P. Uyemura: Circuit Design for CMOS VLSI, Kluwer Academic Publishers, 1992

[3] Neil Weste and Kamran Eshraghian: Principles of CMOS VLSI Design, Addison Wesley

[4] W. Maly: Atlas of IC Technologies: An Introduction to VLSI Processes, The Benjamin/Cummings Publishing Company, 1987

[5] Jan M. Rabaey: Digital Integrated Circuits - A Design Perspective, Prentice Hall
    http://bwrc.eecs.berkeley.edu/Classes/IcBook/index.html
1. Introduction


Status of Microelectronics Technology


Future VLSI chip            2005          2011
CMOS feature size           0.1 µm        0.05 µm
Core voltage                0.9-1.2 V     0.5-0.6 V
Chip size                   520 mm²       750 mm²
Transistors/cm²             40 M          100 M
DRAM bits/chip              17.2 G        275 G
Number of wiring levels     7-8           9

(Source: International Technology Roadmap for Semiconductors, 1998 update)

[Figure: power supply Vdd and threshold voltage Vt (V), and gate oxide thickness tOX (nm), versus MOSFET channel length (0.02 µm to 1 µm)]
ASIC Outlook 1997: Semiconductor and Electronic Equipment Sales Trends (1992 - 2001)

Interconnect

Technology Requirements:
• Inductive effects will become increasingly important
• Additional metal patterns or ground planes for inductive shielding
• Thinner metallization
• Lower line-to-line capacitance
• Increasing pitch and thickness at each conductor level to alleviate the impact of interconnect delay

[Figure: cross-section of the interconnect stack with global, intermediate and local wiring levels. Labels: passivation, dielectric, etch stop layer, dielectric diffusion barrier, copper conductor with metal barrier liner, pre-metal dielectric, tungsten contact plug]

Source: SIA Roadmap 1999
Productivity Gap: Technology vs. CAD


Productivity Gap: Technology vs. CAD

Need to increase designers' productivity in order to make use of new technologies

SIA Roadmap for the Design Technology Requirements (near term)
Productivity Gap: Beyond 2008

SIA Roadmap for the Design Technology Requirements (far term)

EDA: High-Level Design


Design flow: VHDL description → RTL synthesis (Synopsys) → gate-level netlist → placement & routing (Cadence/Mentor) → ASIC layout → production

VHDL description (excerpt as shown on the slide):

    architecture structural of first_tap is
      signal x_q, red : std_logic_vector(bitwidth-1 downto 0);
      signal mult     : std_logic_vector(2*bitwidth-1 downto 0);
    begin
      -- delay register with asynchronous reset
      delay_register:
      process(reset, clk)
      begin
        if reset='1' then
          x_q <= (others => '0');
        elsif (clk'event and clk='1') then
          x_q <= x_in;
        end if;
      end process;

      -- multiply the delayed sample by the tap coefficient
      mult <= signed(coef)*signed(x_q);
Challenge: System-on-a-Chip Design ?

[Figure: "Chasing the design gap": design complexity and design productivity versus time (1975 to 2000). Labels, from top to bottom: System on a Chip; Reuse, IP Cores; RTL; Synthesis; Gates; Place & Route; Transistors; Masks; Polygons]

Traditional ASIC market


ASICs are customer-specific ICs
If application-specific processor: ASIP
The product is made only once an application is found

[Figure: classification of non-standard ICs: ASIC (customer specific), either semicustom (one or more customised layers) or custom (all layers customised); ASIP (application specific); programmable (circuit with fuse, antifuse or memory that can be programmed)]
SoC: Silicon Components Categories

Silicon components
– Discrete devices and optoelectronics
– Integrated circuits
   – Analog and mixed signal
   – Logic: logic, gate arrays, cell based, FPLDs, SoC, other
   – Memory: DRAMs, SRAMs, Flash, other
   – Microcomponents: microprocessors, microcontrollers, microperipherals

Modern SoCs can integrate different components

Market for Systems-on-a-Chip


Source: Hugo De Man, EIS'99, Darmstadt

[Figure: systems-on-a-chip market. Labels: Services; Broadband Network; 100 Mb/s WLAN; RF; 20 Gop/s at <1 Watt; Java; WWW; LAN; Configurable Multi-Standard Info Plug; MPEG 4-7; 100 Gop/s; 5 Gtr/s; 10 Watt; SoC]

Area examples (→ domain-specific computing):
• Multimedia
• Mobile Communication
• Automotive
• ...
Application: Single-Chip Integrated CMOS Radio
Berkeley Wireless Centre

• Research into technology and design methodologies for CMOS single-chip radios
• Exploring future applications of wireless technology, 4th generation and beyond

[Figure: conventional cellular phone solution]

Application Example: Transceiver Design

[Figure: transceiver block diagram.
Receiver blocks: Low Noise Amplifier, Filter, Mixer, Oscillator, Demodulator, AD/DA Converter.
Digital baseband blocks: Memory, CMOS Logic.
Transmitter blocks: AD/DA Converter, Modulator, Mixer, Oscillator, Filter, Power Amplifier]
2. Repetition Transistor Models


Structure of MOSFET
[Figure: cross-section of an n-channel MOSFET with terminals Source (S), Gate (G), Drain (D) and Body (B): n+ source and drain regions, channel region, p-type substrate; terminal voltages vS, vG, vD, vB and currents iS, iG, iD, iB; four-terminal circuit symbol]

MOSFET: the current through the channel region is controlled by the gate voltage vG
Inversion

• The bulk has to have the lowest potential to ensure reverse-biased pn-junctions (no current must flow between drain/source and bulk!)
• VSB = 0 → in the following we relate all voltages to the source voltage
• VGS > VT → n-channel is induced (blue area between drain and source).
• White area → depletion region
• A current can flow between drain and source if VDS > 0
• Because the MOSFET is a symmetrical device, source and drain have to be defined: the source always has a lower potential than the drain for an n-channel FET!

Ohmic region
• Increasing VDS to a value VDS > 0 leads to a current ID.
• Near the drain the voltage responsible for the inversion is (VGS - VT) - VDS and thus smaller than near the source.
• The channel acts like a linear resistor - that's why this region of operation is called ohmic.

In this region: iDS ∼ vDS ⇒ Ron, with 0.5 kΩ < Ron < 10 kΩ

[Figure: drain-source current (A) versus drain-source voltage (0 V to 0.8 V) for VGS = 2 V, 3 V, 4 V and 5 V]
Pinch-off

• If VDS rises to the point where it is VGS - VT, there is no voltage near the drain to induce an inversion layer - the channel is pinched off at the drain.

Saturation

• Further increasing VDS causes the pinch-off point to move in the direction of the source.
• The voltage at the pinch-off point is always VGS - VT.
• When the electrons coming from the source reach the pinch-off point, they are injected into the depleted region, and the electric field in this region sweeps the electrons from the pinch-off point to the drain.
Output Characteristics
[Figure: drain-source current (A) versus drain-source voltage (0 V to 12 V) for VGS < 1 V, VGS = 2 V, 3 V, 4 V and 5 V, with VT = 1 V; the pinch-off locus separates the linear region from the saturation region]

Channel Length Modulation

Transfer Characteristics and Depletion Mode MOSFET

• Transfer characteristics: plot of drain current versus gate-source voltage for a fixed drain-source voltage
• If the threshold voltage of an NMOS transistor is negative → depletion mode MOSFET (there exists an implanted n-type channel region)

[Figure: drain-source current (µA) versus gate-source voltage (-4 V to 6 V) for a depletion-mode device (VTN = -2 V) and an enhancement-mode device (VTN = +2 V); cross-section of a depletion-mode NMOS transistor with an implanted n-type channel region of length L between the n+ source and drain regions in a p-type substrate]

P-channel MOSFET (PMOS)


[Figure: cross-section of a PMOS transistor (p+ source and drain regions, channel region of length L, n-type substrate; vG < 0, vD < 0, vB > 0) and its output characteristics: source-drain current (A) versus source-drain voltage (-2 V to 12 V) for VSG < 1 V (VGS > -1 V), VSG = 2 V (VGS = -2 V), 3 V (-3 V), 4 V (-4 V) and 5 V (-5 V)]

                     NMOS Device    PMOS Device
Enhancement-mode     VTN > 0        VTP < 0
Depletion-mode       VTN < 0        VTP > 0
IEEE Standard MOS Transistor Circuit Symbols

[Figure: circuit symbols with terminals D, G, S and, for the four-terminal devices, B]
(a) NMOS enhancement-mode device
(b) PMOS enhancement-mode device
(c) NMOS depletion-mode device
(d) PMOS depletion-mode device
(e) Three-terminal NMOS transistor
(f) Three-terminal PMOS transistor

Summary of MOS Equations

[Figure: NMOS symbol with current iDS flowing from drain to source and PMOS symbol with current iSD flowing from source to drain]

From NMOS to PMOS: Signs of all voltages change
MOS Capacitances - Linear Region
The channel shields the bulk electrode from the gate since the inversion layer
acts as conductor between drain and source.

[Figure: NMOS device in the linear region: overlap capacitances C'OL at source and drain, oxide capacitances C"OX from the gate to the n-type channel, and junction capacitances CSB and CDB to the p-type substrate (bulk)]

MOS Capacitances - Saturation


The channel shields the bulk electrode from the gate since the inversion layer acts as conductor between drain and source. The channel is pinched off and does not contact the drain n+ region.

[Figure: NMOS device in saturation: overlap capacitances C'OL at source and drain, oxide capacitances C"OX from the gate to the n-type channel, and junction capacitances CSB and CDB to the p-type substrate (bulk)]
MOS Capacitances - Cutoff
The gate-bulk capacitance consists of the gate capacitance in series with the
depletion capacitance of the depletion region.
[Figure: NMOS device in cutoff: overlap capacitances C'OL at source and drain, gate-bulk capacitance CGB across the depletion region, and junction capacitances CSB and CDB to the p-type substrate (bulk)]

Small-Signal Models for Field-Effect Transistors (I)

- Considering the MOSFET as a three-terminal device.
- The small-signal model of the MOSFET is based on the y-parameter two-port network.

[Figure: the MOSFET represented as a two-port network with input vgs, ig and output vds, id]
Small-Signal Models for Field-Effect
Transistors

[Figure: small-signal model for the three-terminal MOSFET: voltage-controlled current source gm·vgs in parallel with output resistance ro between drain and source; input vgs, ig at the gate, output vds, id at the drain]

Body Effect in the Four-Terminal MOSFET

A second voltage-controlled current source has been added to model the back-gate transconductance gmb.

[Figure: small-signal model for the four-terminal MOSFET: current sources gm·vgs and gmb·vbs in parallel with ro between drain and source; terminals G, D, B, S with voltages vgs, vds, vbs]
High-Frequency MOSFET
Small Signal Model
[Figure: high-frequency small-signal model: the four-terminal model (gm·vgs, gmb·vbs, ro) extended by the capacitances CGD, CGS, CGB, CBD, CBS and the series resistances RD (between D and D*) and RS (between S and S*)]

High-Frequency MOSFET
Small Signal Model

        Cutoff         Ohmic                         Saturation
CGD     COX·W·LD       COX·W·LD + (1/2)·W·L·COX      COX·W·LD
CGS     COX·W·LD       COX·W·LD + (1/2)·W·L·COX      COX·W·LD + (2/3)·W·L·COX
CBG     COX·W·L        0                             0
CBD     CBD1           CBD1 + CBC1/2                 CBD1
CBS     CBS1           CBS1 + CBC1/2                 CBS1 + (2/3)·CBC1

LD: overlap of gate to drain or source due to underdiffusion
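The capacitance table for the three operating regions can be sketched as a small lookup function. This is only an illustration of the table: the function name, argument names and the numbers in the usage lines are assumptions made for this sketch, not part of the lecture.

```python
def mos_capacitances(region, cox, w, l, ld, cbd1, cbs1, cbc1):
    """Return (CGD, CGS, CBG, CBD, CBS) for one operating region.

    cox is the oxide capacitance per unit area; w, l and ld are the gate
    width, gate length and gate overlap. cbd1, cbs1 and cbc1 are the
    junction and bulk-channel capacitances named in the table.
    """
    overlap = cox * w * ld   # COX*W*LD, present in every region
    channel = cox * w * l    # W*L*COX, the full gate-to-channel capacitance
    if region == "cutoff":
        return overlap, overlap, channel, cbd1, cbs1
    if region == "ohmic":
        # channel capacitance split equally between drain and source
        return (overlap + channel / 2, overlap + channel / 2, 0.0,
                cbd1 + cbc1 / 2, cbs1 + cbc1 / 2)
    if region == "saturation":
        # pinched-off channel: two thirds go to the source, none to the drain
        return (overlap, overlap + 2 * channel / 3, 0.0,
                cbd1, cbs1 + 2 * cbc1 / 3)
    raise ValueError("region must be 'cutoff', 'ohmic' or 'saturation'")


# Illustrative (made-up) normalised values, chosen for easy checking.
sat = mos_capacitances("saturation", 1.0, 3.0, 2.0, 0.5, 1.0, 1.0, 3.0)
ohm = mos_capacitances("ohmic", 1.0, 3.0, 2.0, 0.5, 1.0, 1.0, 3.0)
```

Note that the sum CGD + CGS + CBG drops from the cutoff value when the device saturates, because one third of the channel capacitance no longer appears at the gate.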
3. Short Channel Effects on MOS Transistors

Overview.

• Short Channel Devices.
• Velocity Saturation Effect.
• Threshold Voltage Variations.
• Hot Carrier Effects.
• Process Variations.

(Source: Jan M. Rabaey, Digital Integrated Circuits)
Short Channel Devices.

• As technology scaling reaches channel lengths of less than a micron (L < 1 µm), second-order effects that were ignored in devices with long channel length (L > 1 µm) become very important.
• MOSFETs with these dimensions are called "short channel devices".
• The main second-order effects are: velocity saturation, threshold voltage variations and hot carrier effects.

[Figure: cross-section of an NMOS transistor with L < 1 µm: polysilicon gate, gate oxide, field oxide (SiO2), n+ source and drain, p+ stopper, p-substrate, bulk contact]

Velocity Saturation Effect (I)

• Review of the classical derivation of the drain current, for VGS > VT and VDS << VGS.

• Induced channel charge at V(x):

$$ Q_i(x) = -C_{OX}\,[V_{GS} - V(x) - V_T] \qquad (1) $$

• The current is given as a product of the drift velocity of the carriers vn and the available charge:

$$ I_D = -v_n(x)\,Q_i(x)\,W \qquad (2) $$

[Figure: MOS transistor and its bias conditions: n+ source and drain in a p-substrate, channel of length L along x with local potential V(x), terminals S, G, D, B, drain current ID]
Velocity Saturation Effect (II)

• The electron velocity is related to the electric field through the mobility:

$$ v_n = -\mu_n E(x) = \mu_n \frac{dV}{dx} \qquad (3) $$

• Combining (1) and (3) in (2):

$$ I_D\,dx = \mu_n C_{OX} W\,(V_{GS} - V(x) - V_T)\,dV \qquad (4) $$

• Integrating (4) from 0 to L yields the voltage-current relation of the transistor:

$$ I_D = \mu_n C_{OX} \frac{W}{L} \left[ (V_{GS} - V_T)\,V_{DS} - \frac{V_{DS}^2}{2} \right] \qquad (5) $$

• The behavior of the short channel devices deviates considerably from this model.
• Eq. (3) assumes the mobility µn as a constant independent of the value of the electric field E.
• At high electric fields carriers fail to follow this linear model.
• This is due to the velocity saturation effect.

Velocity Saturation Effect (III)

• When the electric field reaches a critical value $E_C$ ($1.5 \times 10^6$ V/m for p-type silicon), the velocity of the carriers tends to saturate ($10^5$ m/s for silicon) due to scattering effects.

[Figure: carrier velocity vn (m/s) versus electric field E (V/µm): constant mobility (slope = µ) below Ec = 1.5 V/µm, constant velocity vsat = 10^5 m/s above]
Velocity Saturation Effect (IV)

• The impact of this effect on the drain current of a MOSFET operating in the linear region is obtained as follows:
• The velocity as a function of the electric field, plotted in the last figure, can be approximated by:

$$ v = \frac{\mu_n E}{1 + E/E_C} \ \text{ for } E \le E_C, \qquad v = v_{sat} \ \text{ for } E \ge E_C \qquad (6) $$

Reevaluating (1) and (2) using (6):

$$ I_D = \kappa(V_{DS})\,\mu_n C_{OX} \frac{W}{L} \left[ (V_{GS} - V_T)\,V_{DS} - \frac{V_{DS}^2}{2} \right] \qquad (7) $$

with:

$$ \kappa(V_{DS}) = \frac{1}{1 + V_{DS}/(E_C L)} $$

• For large values of L or small values of VDS, κ approaches 1 and (7) reduces to (5).
• For short channel devices κ < 1 and the current is smaller than what would be expected.
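To get a feeling for how strongly κ reduces the linear-region current, eq. (7) can be compared numerically with the long-channel eq. (5). The sketch below is only an illustration: the function names are made up, and the bias point, W/L = 1.5, L = 0.25 µm, k' = µn·COX = 115 µA/V² and VT = 0.43 V are assumed example values.

```python
def kappa(v, ec, l):
    """Velocity-saturation factor kappa(V) = 1 / (1 + V / (E_C * L))."""
    return 1.0 / (1.0 + v / (ec * l))


def id_linear(vgs, vds, vt, k_prime, w_over_l, ec=None, l=None):
    """Linear-region drain current.

    Without ec and l this is the long-channel eq. (5); with them the
    result is scaled by kappa(V_DS) as in eq. (7). k_prime = mu_n * C_OX.
    """
    current = k_prime * w_over_l * ((vgs - vt) * vds - vds ** 2 / 2)
    if ec is None or l is None:
        return current
    return kappa(vds, ec, l) * current


# Assumed example values (not prescribed by the slide).
EC = 1.5e6       # critical field in V/m
L = 0.25e-6      # channel length in m
id_long = id_linear(2.5, 0.2, 0.43, 115e-6, 1.5)
id_short = id_linear(2.5, 0.2, 0.43, 115e-6, 1.5, ec=EC, l=L)
print(f"long-channel: {id_long * 1e6:.1f} uA, with kappa: {id_short * 1e6:.1f} uA")
```

With these numbers E_C·L is only 0.375 V, so even VDS = 0.2 V already lowers the current noticeably.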

Velocity Saturation Effect (V)

• When increasing the drain-source voltage, the electric field reaches the value $E_C$, and the carriers at the drain become velocity saturated. Assuming that the drift velocity is saturated, from (4) with $\mu_n\,dV/dx = v_{sat}$ the drain current is:

$$ I_{DSAT} = v_{sat}\,C_{OX}\,W\,(V_{GT} - V_{DSAT}) \qquad (8) $$

• Here VGT is a short notation for VGS - VT.

Evaluating (7) with VDS = VDSAT:

$$ I_{DSAT} = \kappa(V_{DSAT})\,\mu_n C_{OX} \frac{W}{L} \left[ V_{GT} V_{DSAT} - \frac{V_{DSAT}^2}{2} \right] \qquad (9) $$

• Equating (8) and (9) and solving for VDSAT:

$$ V_{DSAT} = \kappa(V_{GT})\,V_{GT} \qquad (10) $$

• For a short channel device and large enough values of VGT, κ(VGT) is smaller than 1, hence the device enters saturation before VDS reaches VGS - VT.
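Eq. (10) can be checked numerically by inserting it into (8) and (9). The sketch below assumes v_sat = µn·E_C/2, which is the value eq. (6) reaches at E = E_C; that assumption, the helper names and the example numbers are not taken from the slide.

```python
def kappa(v, ec_l):
    """kappa(V) = 1 / (1 + V / (E_C * L)); ec_l is the product E_C * L in volts."""
    return 1.0 / (1.0 + v / ec_l)


def vdsat(vgt, ec_l):
    """Eq. (10): V_DSAT = kappa(V_GT) * V_GT."""
    return kappa(vgt, ec_l) * vgt


def idsat_eq8(vgt, vd, mu_cox_w, ec):
    """Eq. (8) under the ASSUMPTION v_sat = mu_n * E_C / 2."""
    return 0.5 * mu_cox_w * ec * (vgt - vd)


def idsat_eq9(vgt, vd, mu_cox_w, ec, l):
    """Eq. (9): eq. (7) evaluated at V_DS = V_DSAT."""
    return kappa(vd, ec * l) * (mu_cox_w / l) * (vgt * vd - vd ** 2 / 2)


# Assumed example values: E_C = 1.5e6 V/m, L = 0.25 um, W = 1.5 * L,
# mu_n * C_OX = 115 uA/V^2, V_GT = 2.07 V.
EC, L = 1.5e6, 0.25e-6
MU_COX_W = 115e-6 * 1.5 * L
VGT = 2.07
vd = vdsat(VGT, EC * L)
i8 = idsat_eq8(VGT, vd, MU_COX_W, EC)
i9 = idsat_eq9(VGT, vd, MU_COX_W, EC, L)
print(f"VDSAT = {vd:.3f} V, eq. (8): {i8 * 1e6:.1f} uA, eq. (9): {i9 * 1e6:.1f} uA")
```

Under this assumption both expressions give the same current, and VDSAT comes out far below VGT, which is the early saturation described in the last bullet.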
Velocity Saturation Effect (VI)

[Figure: ID versus VDS at VGS = VDD for a long-channel and a short-channel device; the short-channel device saturates at VDSAT, the long-channel device at VGS - VT]

Short channel devices display an extended saturation region due to velocity saturation

Simplified model for hand calculations (I)

A substantially simpler model can be obtained by making two assumptions:
• Velocity saturates abruptly at $E_C$ and is approximated by:

$$ v = \mu_n E \ \text{ for } E \le E_C, \qquad v = v_{sat} = \mu_n E_C \ \text{ for } E \ge E_C $$

• VDSAT, at which $E_C$ is reached, is constant and has the value:

$$ V_{DSAT} = L\,E_C = \frac{L\,v_{sat}}{\mu_n} \qquad (11) $$

Under these conditions the equation for the current in the linear region remains unchanged from the long channel model. The value for IDSAT is found by substituting eq. (11) in (5).
Simplified model for hand calculations (II)

$$ I_{DSAT} = \mu_n C_{OX} \frac{W}{L} \left[ (V_{GS} - V_T)\,V_{DSAT} - \frac{V_{DSAT}^2}{2} \right] $$

$$ I_{DSAT} = v_{sat}\,C_{OX}\,W \left[ (V_{GS} - V_T) - \frac{V_{DSAT}}{2} \right] \qquad (12) $$

This model is truly first order and empirical and causes substantial deviations in the transition zone between the linear and velocity saturated regions. However, it shows a linear dependence of the saturation current on VGS for short channel devices.
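The linear dependence on VGS is easy to confirm with a few numbers. The sketch below evaluates the first form of the simplified saturation current; the function name is made up, and the NMOS values (k' = 115 µA/V², VT = 0.43 V, VDSAT = 0.63 V, W/L = 1.5) are assumed example values for a 0.25 µm process.

```python
def idsat_simplified(vgs, vt, vdsat, k_prime, w_over_l):
    """Velocity-saturated current of the simplified model.

    I_DSAT = k' * (W/L) * [(V_GS - V_T) * V_DSAT - V_DSAT^2 / 2],
    with k' = mu_n * C_OX. Assumes V_GS - V_T >= V_DSAT, i.e. the
    device is velocity saturated.
    """
    vgt = vgs - vt
    if vgt <= 0:
        return 0.0  # device is off
    return k_prime * w_over_l * (vgt * vdsat - vdsat ** 2 / 2)


# Assumed example values.
K_PRIME, VT, VDSAT, W_OVER_L = 115e-6, 0.43, 0.63, 1.5
currents = [idsat_simplified(v, VT, VDSAT, K_PRIME, W_OVER_L)
            for v in (1.5, 2.0, 2.5)]
steps = [b - a for a, b in zip(currents, currents[1:])]
print([f"{i * 1e6:.1f} uA" for i in currents])
```

Equal steps in VGS give equal steps in IDSAT, in contrast to the quadratic dependence of the long-channel model.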

I-V characteristics of long- and short-channel MOS transistors, both with W/L = 1.5
ID-VGS characteristic for long- and short-channel devices, both with W/L = 1.5

Threshold Voltage Variations (I)

• For a long channel NMOS transistor the threshold voltage is given by:

$$ V_T = V_{T0} + \gamma \left( \sqrt{\left| -2\phi_F + V_{SB} \right|} - \sqrt{\left| -2\phi_F \right|} \right) \qquad (11) $$

• Eq. (11) states that the threshold voltage is only a function of the technology and the applied body bias VSB.

• For short channel devices this model becomes inaccurate and the threshold voltage becomes a function of L, W and VDS.
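A short numerical sketch of the long-channel threshold equation shows the size of the body effect. The parameter values (VT0 = 0.43 V, γ = 0.4 V^0.5, 2φF = -0.6 V) and the function name are assumptions chosen for illustration.

```python
import math


def threshold_voltage(vsb, vt0, gamma, two_phi_f):
    """V_T = V_T0 + gamma * (sqrt(|-2*phi_F + V_SB|) - sqrt(|-2*phi_F|))."""
    return vt0 + gamma * (math.sqrt(abs(-two_phi_f + vsb))
                          - math.sqrt(abs(-two_phi_f)))


# Assumed NMOS example values; 2*phi_F is negative for a p-type substrate.
VT0, GAMMA, TWO_PHI_F = 0.43, 0.4, -0.6
for vsb in (0.0, 0.5, 1.0, 2.0):
    vt = threshold_voltage(vsb, VT0, GAMMA, TWO_PHI_F)
    print(f"VSB = {vsb:.1f} V -> VT = {vt:.3f} V")
```

With zero body bias the threshold equals VT0; a source-bulk voltage of 1 V raises it by roughly 0.2 V for these values.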
Threshold Voltage Variations (II)

[Figure, left: threshold VT as a function of the channel length L (for low VDS), approaching the long-channel threshold for large L]
[Figure, right: drain-induced barrier lowering: VT as a function of VDS (for low L), starting from the low-VDS threshold]

Hot Carrier Effects (I)

• During the last decades transistor dimensions were scaled down, but not the power supply.
• The resulting increase in the electric field strength causes an increasing energy of the electrons.
• Some electrons are able to leave the silicon and tunnel into the gate oxide.
• Such electrons are called "hot carriers".
• Electrons trapped in the oxide change the VT of the transistors.
• This leads to a long-term reliability problem.
• For an electron to become hot, an electric field of $10^4$ V/cm is necessary.
• This condition is easily met with channel lengths below 1 µm.
Hot Carrier Effects (II)

Hot carrier effects cause the I-V characteristics of an NMOS transistor to degrade from extensive usage.

Process Variations.

Device parameters vary between runs and even on the same die!

• Variations in the process parameters, such as impurity concentration densities, oxide thicknesses, and diffusion depths. These are caused by non-uniform conditions during the deposition and/or the diffusion of the impurities. This introduces variations in the sheet resistances and in transistor parameters such as the threshold voltage.
• Variations in the dimensions of the devices, mainly resulting from the limited resolution of the photolithographic process. This causes (W/L) variations in MOS transistors and mismatches in the emitter areas of bipolar devices.
Impact of Device Variations.

[Figure, left: delay (ns, 1.50 to 2.10) versus Leff (1.10 to 1.60 µm)]
[Figure, right: delay (ns, 1.50 to 2.10) versus VTp (-0.90 V to -0.50 V)]

Delay of an adder circuit as a function of variations in L and VT

Parameter values for a 0.25 µm CMOS process (minimum length devices)

         VT0 (V)    γ (V^0.5)    VDSAT (V)    k' (A/V²)      λ (V^-1)
NMOS     0.43       0.4          0.63         115 × 10^-6    0.06
PMOS     -0.4       -0.4         -1           -30 × 10^-6    -0.1
4. SPICE LEVEL 1 MOSFET MODEL

Four-mask layout and cross section of an n-channel MOS transistor.
Layout and cross section of an n-well CMOS technology.

Equations for the different operation regions

$$ I_{DS} = 0 \qquad (V_{GS} \le V_{TH}) $$

$$ I_{DS} = \frac{KP}{2}\,\frac{W}{L_{eff}}\,V_{DS}\,[2(V_{GS} - V_{TH}) - V_{DS}]\,(1 + LAMBDA \cdot V_{DS}) \qquad (0 \le V_{DS} \le V_{GS} - V_{TH}) $$

$$ I_{DS} = \frac{KP}{2}\,\frac{W}{L_{eff}}\,(V_{GS} - V_{TH})^2\,(1 + LAMBDA \cdot V_{DS}) \qquad (0 \le V_{GS} - V_{TH} \le V_{DS}) $$

where the threshold voltage is given by:

$$ V_{TH} = V_{T0} + GAMMA \left( \sqrt{2 \cdot PHI - V_{BS}} - \sqrt{2 \cdot PHI} \right) $$

and the channel length by:

$$ L_{eff} = L - 2 \cdot LD $$

Here L is the length of the polysilicon gate and LD is the gate overlap of the source and drain.
The elements in the large signal MOSFET model are shown in the following figure.
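The three operating regions translate directly into a small function. The sketch below follows the equations exactly as written on this slide (including the 2·PHI term in the threshold expression); it is not a reimplementation of any particular SPICE simulator, and the function name and the example values in the usage lines are assumptions.

```python
import math


def level1_ids(vgs, vds, vbs, w, l, kp, vto, gamma, phi, lam, ld):
    """Drain current of the LEVEL 1 equations as given on the slide.

    Voltages in V, geometry in m, kp in A/V^2. Assumes vds >= 0 and
    vbs <= 0 (NMOS with reverse-biased bulk junctions).
    """
    leff = l - 2 * ld                                   # effective channel length
    vth = vto + gamma * (math.sqrt(2 * phi - vbs) - math.sqrt(2 * phi))
    vov = vgs - vth                                     # gate overdrive
    if vov <= 0:
        return 0.0                                      # cutoff
    beta_half = kp / 2 * w / leff
    if vds <= vov:                                      # linear region
        return beta_half * vds * (2 * vov - vds) * (1 + lam * vds)
    return beta_half * vov ** 2 * (1 + lam * vds)       # saturation


# Assumed example values: KP = 50 uA/V^2, VTO = 1 V, GAMMA = 0.6, PHI = 0.8,
# LAMBDA = 0 and LD = 0, square device with W = L = 10 um.
params = dict(w=10e-6, l=10e-6, kp=50e-6, vto=1.0, gamma=0.6, phi=0.8,
              lam=0.0, ld=0.0)
i_sat = level1_ids(3.0, 5.0, 0.0, **params)
i_lin = level1_ids(3.0, 1.0, 0.0, **params)
i_off = level1_ids(0.5, 5.0, 0.0, **params)
```

The linear and saturation expressions agree at VDS = VGS - VTH, so the current is continuous across the region boundary.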

MOSFET SPICE PARAMETERS.


Parameter name                                     SPICE symbol    Analytical symbol    Units
Channel length                                     Leff            L                    m
Poly gate length                                   L               Lgate                m
Lateral diffusion / gate-source overlap            LD              LD                   m
Transconductance parameter                         KP              µnCOX                A/V²
Threshold voltage / zero-bias threshold            VTO             VTO                  V
Channel-length modulation parameter                LAMBDA          λn                   V^-1
Bulk threshold / backgate effect parameter         GAMMA           γn                   V^1/2
Surface potential / depletion drop in inversion    PHI             -φP                  V
Specifying MOSFET Geometry in SPICE.

Mname D G S B MODname L= W= AD= AS= PD= PS= NRD= NRS=


LEVEL 1 MOSFET MODEL PARAMETERS.

.MODEL MODname NMOS/PMOS VTO= KP= GAMMA= PHI= LAMBDA= RD= RS= RSH= CBD= CBS= CJ= MJ= CJSW= MJSW= PB= IS= CGDO= CGSO= CGBO= TOX= LD=

where:
NMOS/PMOS- MOSFET type
VTO- Threshold voltage (V)
KP- Transconductance parameter (A/V²)
GAMMA- Bulk threshold parameter (V^1/2)
PHI- Surface potential (V)
LAMBDA- Channel length modulation parameter (V^-1)
RD- Drain resistance (Ω)
LEVEL 1 MOSFET MODEL PARAMETERS.

RS- Source resistance (Ω)
RSH- Sheet resistance of the drain/source diffusions (Ω/square)
CBD- Zero bias drain-bulk junction capacitance (F)
CBS- Zero bias source-bulk junction capacitance (F)
MJ- Bulk junction grading coefficient (dimensionless)
PB- Built-in potential for the bulk junction (V)
• With CBD, CBS, MJ and PB, SPICE computes the voltage dependence of the drain-bulk and source-bulk capacitances:

$$ C_{BD}(V_{BD}) = \frac{CBD}{(1 - V_{BD}/PB)^{MJ}} \qquad C_{BS}(V_{BS}) = \frac{CBS}{(1 - V_{BS}/PB)^{MJ}} $$

Large-signal, charge-storage capacitors of the MOS device.
LEVEL 1 MOSFET MODEL PARAMETERS.

CJ- Zero bias planar bulk junction capacitance (F/m²)
CJSW- Zero bias sidewall bulk junction capacitance (F/m)
MJSW- Sidewall junction grading coefficient (dimensionless)
• If CJ, CJSW, and MJSW are given, a more accurate simulation of these capacitances is performed using the following equations:

$$ C_{BD}(V_{BD}) = \frac{CJ \cdot AD}{(1 - V_{BD}/PB)^{MJ}} + \frac{CJSW \cdot PD}{(1 - V_{BD}/PB)^{MJSW}} $$

$$ C_{BS}(V_{BS}) = \frac{CJ \cdot AS}{(1 - V_{BS}/PB)^{MJ}} + \frac{CJSW \cdot PS}{(1 - V_{BS}/PB)^{MJSW}} $$
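The area/perimeter form of the junction capacitance can be evaluated with a few lines of code. This is a sketch under assumed example values (CJ = 10^-4 F/m², CJSW = 5 × 10^-10 F/m, PB = 0.95 V, MJ = 0.5, MJSW = 0.33, a 10 µm × 10 µm diffusion); the function name is made up.

```python
def junction_capacitance(v, cj, area, cjsw, perimeter, pb, mj, mjsw):
    """Bulk junction capacitance: bottom (area) term plus sidewall (perimeter) term.

    v is the junction voltage (negative for reverse bias) and must stay
    below the built-in potential pb.
    """
    if v >= pb:
        raise ValueError("junction voltage must be below PB")
    bottom = cj * area / (1 - v / pb) ** mj
    sidewall = cjsw * perimeter / (1 - v / pb) ** mjsw
    return bottom + sidewall


# Assumed example values for a 10 um x 10 um drain diffusion.
AD = 10e-6 * 10e-6      # drain area in m^2
PD = 4 * 10e-6          # drain perimeter in m
c_zero = junction_capacitance(0.0, 1e-4, AD, 5e-10, PD, 0.95, 0.5, 0.33)
c_rev = junction_capacitance(-2.0, 1e-4, AD, 5e-10, PD, 0.95, 0.5, 0.33)
print(f"0 V: {c_zero * 1e15:.1f} fF, -2 V: {c_rev * 1e15:.1f} fF")
```

Reverse bias widens the depletion region, so the capacitance at -2 V is smaller than the zero-bias value.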

Bottom and sidewall components of the bulk junction capacitors.

Bottom = ABCD
Sidewall = ABEF + BCFG + DCGH + ADEH
LEVEL 1 MOSFET MODEL PARAMETERS.

IS- Saturation current of the junction diode (A)
CGDO- Overlap capacitance of the gate with drain, per unit channel width (F/m)
CGSO- Overlap capacitance of the gate with source, per unit channel width (F/m)
CGBO- Overlap capacitance of the gate with bulk, per unit channel length (F/m)
TOX- Gate oxide thickness (m)
LD- Lateral diffusion (m)

Overlap capacitances of an MOS transistor.

(a) Top view showing the overlap between the source or drain and the gate. (b) Side view.
Example of MOSFET model parameter values.

Parameter name                                            N-channel MOSFET    P-channel MOSFET    Units
Gate oxide thickness TOX                                  150                 150                 Angstroms
Transconductance parameter KP                             50 × 10^-6          25 × 10^-6          A/V²
Threshold voltage                                         1.0                 -1.0                V
Channel-length modulation parameter LAMBDA                0.1/L (L in µm)     0.1/L (L in µm)     V^-1
Bulk threshold parameter GAMMA                            0.6                 0.6                 V^1/2
Surface potential PHI                                     0.8                 0.8                 V
Gate-drain overlap capacitance CGDO                       5 × 10^-10          5 × 10^-10          F/m
Gate-source overlap capacitance CGSO                      5 × 10^-10          5 × 10^-10          F/m
Zero-bias planar bulk depletion capacitance CJ            10^-4               3 × 10^-4           F/m²
Zero-bias sidewall bulk depletion capacitance CJSW        5 × 10^-10          3.5 × 10^-10        F/m
Bulk junction potential PB                                0.95                0.95                V
Planar bulk junction grading coefficient MJ               0.5                 0.5
Sidewall bulk junction grading coefficient MJSW           0.33                0.33
5. CMOS Inverter


Overview

• Logic levels
• Noise Margin
• CMOS Inverter
– static behaviour
– dynamic behaviour

Courtesy Quiller Electronics Limited

Inverter as simplest logic gate
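[Figure: inverter symbol with input vI and output vO; ideal inverter between the supplies V+ and V-; MOS inverter with load resistor R and switching transistor MS at supply VDD; bipolar inverter with load resistor R and switching transistor QS at supply VCC]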

Logic Voltage Levels

VOL: Nominal voltage corresponding to a low logic state at the output of a logic gate for vI = VOH. Generally V- ≤ VOL.

VOH: Nominal voltage corresponding to a high logic state at the output of a logic gate for vI = VOL. Generally VOH ≤ V+.

VIL: Maximum input voltage that will be recognised as a low input logic level.

VIH: Minimum input voltage that will be recognised as a high input logic level.

[Figure: voltage transfer characteristic vO versus vI between V- and V+, with VOL, VIL, VIH and VOH marked; VIL and VIH are the points where the slope is -1; NML and NMH are indicated on the vI axis]
Noise Margins

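NML: Noise margin associated with a low input level

NML = VIL - VOL

NMH: Noise margin associated with a high input level

NMH = VOH - VIH

[Figure: output levels vO of the driving gate next to the input levels vI of the driven gate, between V- and V+: "1" above VOH (output) and above VIH (input), "0" below VOL (output) and below VIL (input), undefined logic state between VIL and VIH; NMH and NML are the gaps VOH - VIH and VIL - VOL]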
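The two definitions can be written down as a tiny helper. The numbers in the usage lines are assumed example levels for a 5 V gate, not values from the slide.

```python
def noise_margins(vol, voh, vil, vih):
    """Return (NML, NMH) with NML = VIL - VOL and NMH = VOH - VIH."""
    return vil - vol, voh - vih


# Assumed example logic levels in volts.
nml, nmh = noise_margins(vol=0.0, voh=5.0, vil=2.125, vih=2.875)
print(f"NML = {nml:.3f} V, NMH = {nmh:.3f} V")
```

Both margins must be positive for the gate to regenerate logic levels; a gate with equal margins tolerates noise on low and high inputs equally well.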

Dynamic Response of Logic Gates


• Rise time tr: time required for the transition from V10% to V90%.
• Fall time tf: time required for the transition from V90% to V10%.

V10% = VOL + 0.1(VOH - VOL)
V90% = VOL + 0.9(VOH - VOL)

• Propagation delay τP: difference in time between the input and output signals reaching V50%.

V50% = (VOH + VOL)/2

$$ \tau_P = \frac{\tau_{PLH} + \tau_{PHL}}{2} $$

[Figure: switching waveforms for an idealised inverter, with the 10%, 50% and 90% levels between VOL and VOH marked. (a) Input voltage signal vI with rise time tr and fall time tf. (b) Output voltage waveform vO with fall time tf, rise time tr and the propagation delays τPHL and τPLH]
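As a quick numerical illustration of these definitions, the reference levels and the average propagation delay can be computed as follows. The function names and the example delays are assumptions for this sketch.

```python
def reference_levels(vol, voh):
    """Return (V10%, V50%, V90%) of the logic swing between VOL and VOH."""
    swing = voh - vol
    return vol + 0.1 * swing, vol + 0.5 * swing, vol + 0.9 * swing


def propagation_delay(tau_plh, tau_phl):
    """Average propagation delay tau_P = (tau_PLH + tau_PHL) / 2."""
    return (tau_plh + tau_phl) / 2


# Assumed example: 0 V to 5 V swing, 1 ns low-to-high and 3 ns high-to-low delay.
v10, v50, v90 = reference_levels(0.0, 5.0)
tau_p = propagation_delay(1e-9, 3e-9)
```

Rise and fall times are then measured between the v10 and v90 crossings, and the two propagation delays between the v50 crossings of input and output.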
MOS Inverter with Resistive Load

• NMOS switching device MS designed to force vO to VOL
• Resistor load R to pull the output up toward the power supply VDD
• VOH = VDD (driver in cut off ⇒ iD = 0)
• VOL determined by the W/L ratio of MS

[Figure: NMOS inverter with resistive load: resistor R between VDD = 5 V and the output vO, switching transistor MS with input vI, drain current iD and drain-source voltage vDS]

Example

[Figure: resistive-load inverter with VDD = 5 V, R = 95 kΩ and a switching transistor MS with W/L = 2.06/1.
(a) vI = VOL < VTH: MS is off, iD = 0 and vO = VOH = 5 V.
(b) vI = VOH = 5 V: MS conducts, iDD = 50 µA and vO = VOL = vDS = 0.25 V]
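On-Resistance

[Figure: the switching transistor replaced by a switch with resistance Ron. (a) vI = VOL: switch open, output VOH. (b) vI = VOH: switch closed, output VOL]

$$ R_{on} = \frac{v_{DS}}{i_D} = \frac{1}{K'_n \dfrac{W}{L} \left( v_{GS} - V_{TN} - \dfrac{v_{DS}}{2} \right)} $$

$$ V_{OL} = V_{DD}\,\frac{R_{on}}{R_{on} + R} = V_{DD}\,\frac{1}{1 + \dfrac{R}{R_{on}}} $$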
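The two formulas can be checked against the resistive-load example (VDD = 5 V, R = 95 kΩ, W/L = 2.06, VOL = 0.25 V). The sketch below additionally assumes K'n = 25 µA/V² and VTN = 1 V; these two values and the function names are assumptions of the sketch.

```python
def on_resistance(k_n, w_over_l, vgs, vtn, vds):
    """R_on = 1 / (K'_n * (W/L) * (v_GS - V_TN - v_DS / 2)) in the linear region."""
    return 1.0 / (k_n * w_over_l * (vgs - vtn - vds / 2))


def output_low(vdd, r_on, r_load):
    """V_OL from the voltage divider formed by R_on and the load resistor."""
    return vdd * r_on / (r_on + r_load)


# Assumed device values: K'n = 25 uA/V^2, VTN = 1 V.
r_on = on_resistance(25e-6, 2.06, vgs=5.0, vtn=1.0, vds=0.25)
v_ol = output_low(5.0, r_on, 95e3)
print(f"Ron = {r_on:.0f} Ohm, VOL = {v_ol:.3f} V")
```

The result is an on-resistance of about 5 kΩ and an output low level of about 0.25 V, consistent with the 50 µA drain current in the example.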

Transistor Alternatives to the Load Resistor


[Figure: four NMOS inverters, each with switching transistor MS (input vI, output vO) and load transistor ML connected to VDD]
(a) NMOS inverter with gate of the load device connected to its source
(b) NMOS inverter with gate of the load device grounded
(c) Saturated load inverter
(d) Linear load inverter (gate of the load device connected to VGG)
CMOS Inverter Technology
[Figure: cross-section of a CMOS inverter in n-well technology: NMOS transistor (n+ source and drain, p+ ohmic contact B) in the p-type substrate, connected to VSS (0 V); PMOS transistor (p+ source and drain, n+ ohmic contact B) in the n-well, connected to VDD (5 V); common input vI and output vO at the two drains]

CMOS Transistor Parameters
         NMOS Device    PMOS Device
VTO      1 V            -1 V
γ        0.50 √V        0.75 √V
2φF      0.60 V         0.70 V
K'       25 µA/V²       10 µA/V²

Complementary MOS (CMOS) Logic Design

• Inverter with resistive load ⇒ power dissipation when the input is high.
• If an NMOS and a PMOS transistor are used ⇒ CMOS.
• One transistor is always off while the other is on ⇒ no static power consumption.

[Figure: CMOS inverter (VDD = 5 V) with PMOS transistor MP and NMOS transistor MN, and its switch model with on-resistances Ronp and Ronn]

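CMOS Voltage Transfer Characteristic

[Figure: voltage transfer characteristic vO versus vI (0 V to 5 V) with VIL and VIH marked, divided into five regions by the lines vO = vI - VTP and vO = vI - VTN]

Region 1: MN off, MP linear
Region 2: MN saturated, MP linear
Region 3: MN and MP saturated
Region 4: MN linear, MP saturated
Region 5: MN linear, MP off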

Regions of Operation of Transistors in a Symmetrical Inverter

Region   Input Voltage vI                Output Voltage vO   NMOS Transistor   PMOS Transistor
1        vI ≤ VTN                        VOH = VDD           Cutoff            Linear
2        VTN < vI ≤ vO + VTP             High                Saturation        Linear
3        vI ≈ VDD/2                      VDD/2               Saturation        Saturation
4        vO + VTN < vI ≤ (VDD + VTP)     Low                 Linear            Saturation
5        vI ≥ (VDD + VTP)                VOL = 0             Linear            Cutoff

What happens if the inverter is not symmetrical?

[Figure, left: voltage transfer characteristics of a symmetrical inverter (Kn = Kp) for VDD = 2 V, 3 V, 4 V and 5 V, together with the line vO = vI]
[Figure, right: voltage transfer characteristics of asymmetrical inverters (KR = Kn / Kp) for KR = 0.2, 1 and 5, together with the line vO = vI]


Calculation of VIL

Equating currents for saturated nMOS and nonsaturated pMOS device (Region 2):

$$\frac{K_n}{2}\left(V_{in} - V_{Tn}\right)^2 = \frac{K_p}{2}\left[2\left(V_{DD} - V_{in} - V_{Tp}\right)\left(V_{DD} - V_{out}\right) - \left(V_{DD} - V_{out}\right)^2\right]$$

The derivative condition (dVout / dVin) = -1 has to be evaluated for IDn(Vin) = IDp(Vin, Vout):

$$\frac{dV_{out}}{dV_{in}} = \frac{\left(dI_{Dn}/dV_{in}\right) - \left(\partial I_{Dp}/\partial V_{in}\right)}{\partial I_{Dp}/\partial V_{out}} = -1$$

Evaluating the derivative gives:

$$V_{IL}\left(1 + \frac{K_n}{K_p}\right) = 2V_{out} + \frac{K_n}{K_p}V_{Tn} - V_{DD} - V_{Tp}$$

This equation has to be solved together with the first equation ⇒ VIL
Calculation of VIH
At the point VIH the NMOS device is nonsaturated and the PMOS transistor is saturated (region 4):

$$\frac{K_n}{2}\left[2\left(V_{IH} - V_{Tn}\right)V_{out} - V_{out}^2\right] = \frac{K_p}{2}\left(V_{DD} - V_{IH} - V_{Tp}\right)^2$$

The derivative condition (dVout / dVin) = -1 has to be evaluated for IDn(Vin, Vout) = IDp(Vin):

$$\frac{dV_{out}}{dV_{in}} = \frac{\left(dI_{Dp}/dV_{in}\right) - \left(\partial I_{Dn}/\partial V_{in}\right)}{\partial I_{Dn}/\partial V_{out}} = -1$$

which gives:

$$V_{IH}\left(1 + \frac{K_p}{K_n}\right) = 2V_{out} + V_{Tn} + \frac{K_p}{K_n}\left(V_{DD} - V_{Tp}\right)$$

Together with the first equation, this forms a quadratic in VIH which has to be solved.
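The two pairs of equations for VIL and VIH can also be solved numerically. The Python sketch below is one possible way to do it, not the method of the lecture: it substitutes the slope condition into the current equation and finds the root by bisection. It assumes that VTp enters the formulas as a magnitude (as the current equations suggest) and uses Kn = Kp = 50 µA/V², which corresponds to the K' values of this chapter with W/L = 2/1 and 5/1. All function names are hypothetical.

```python
def bisect_root(f, lo, hi, steps=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def v_il(v_dd, v_tn, v_tp, k_n, k_p):
    """VIL: NMOS saturated, PMOS nonsaturated (region 2); v_tp is a magnitude."""
    r = k_n / k_p

    def residual(v_in):
        # Slope condition VIL (1 + Kn/Kp) = 2 Vout + (Kn/Kp) VTn - VDD - VTp, solved for Vout
        v_out = (v_in * (1 + r) - r * v_tn + v_dd + v_tp) / 2
        x = v_dd - v_out
        return k_n / 2 * (v_in - v_tn) ** 2 - k_p / 2 * (2 * (v_dd - v_in - v_tp) * x - x ** 2)

    return bisect_root(residual, v_tn, v_dd - v_tp)


def v_ih(v_dd, v_tn, v_tp, k_n, k_p):
    """VIH: NMOS nonsaturated, PMOS saturated (region 4); v_tp is a magnitude."""
    q = k_p / k_n

    def residual(v_in):
        # Slope condition VIH (1 + Kp/Kn) = 2 Vout + VTn + (Kp/Kn)(VDD - VTp), solved for Vout
        v_out = (v_in * (1 + q) - v_tn - q * (v_dd - v_tp)) / 2
        return k_n / 2 * (2 * (v_in - v_tn) * v_out - v_out ** 2) - k_p / 2 * (v_dd - v_in - v_tp) ** 2

    return bisect_root(residual, v_tn, v_dd - v_tp)


vil = v_il(5.0, 1.0, 1.0, 50e-6, 50e-6)
vih = v_ih(5.0, 1.0, 1.0, 50e-6, 50e-6)
print(f"VIL = {vil:.3f} V, VIH = {vih:.3f} V, NML = {vil:.3f} V, NMH = {5.0 - vih:.3f} V")
```

For this symmetrical case the result is VIL = 2.125 V and VIH = 2.875 V, so both noise margins come out equal.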

Calculation of Vth

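For Vth = Vin = Vout both transistors are saturated (λ is assumed to be 0):

$$\frac{K_n}{2}\left(V_{th} - V_{Tn}\right)^2 = \frac{K_p}{2}\left(V_{DD} - V_{th} - V_{Tp}\right)^2$$

Solving for Vth yields:

$$V_{th} = \frac{V_{Tn} + \sqrt{K_p/K_n}\,\left(V_{DD} - V_{Tp}\right)}{1 + \sqrt{K_p/K_n}}$$

[Figure: voltage transfer characteristic with regions 1 to 5, VIL and VIH; Vth is the intersection with the line Vin = Vout in region 3, where MN and MP are saturated]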

Design of CMOS inverter (I)

• NMH = VOH - VIH = VDD - VIH
• NML = VIL - VOL = VIL - 0 = VIL
• KR = Kp / Kn
• Remember: Kn = K'n (W/L)n, Kp = K'p (W/L)p
⇒ Influence of the symmetry via W/L of transistors!

[Figure: noise margins NMH and NML in volts (0.5 to 4.0) versus KR (0 to 11)]


Design of CMOS inverter (II)


The ratio (W/L) in CMOS design is used to set the level of Vth.

$$\frac{K_p}{K_n} = \frac{\mu_p\,(W/L)_p}{\mu_n\,(W/L)_n}$$

The ratio required to establish a given inverter threshold voltage is:

$$\sqrt{\frac{K_n}{K_p}} = \frac{V_{DD} - V_{th} - V_{Tp}}{V_{th} - V_{Tn}}$$

To get a symmetrical voltage transfer curve, Vth is set to VDD/2:

$$\sqrt{\frac{K_n}{K_p}} = \frac{\frac{1}{2}V_{DD} - V_{Tp}}{\frac{1}{2}V_{DD} - V_{Tn}}$$

If in a process |VTp| = VTn, the device aspect ratios for a symmetrical inverter are related by:

$$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p}$$

Since µn / µp ≈ 2.5, a minimum area CMOS inverter will have (W/L)n ≈ 1 and (W/L)p ≈ 2.5. In this case the voltage transfer function is completely symmetric.
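The relation between the transconductance ratio and the inverter threshold can be tried out with a few lines of Python. This is a minimal sketch under the assumptions that both transistors are saturated at Vth, that λ = 0, and that VTp is entered as a magnitude; the function names are made up for illustration.

```python
import math


def inverter_threshold(v_dd, v_tn, v_tp, k_n, k_p):
    """Vth from equating the two saturation currents (v_tp is a magnitude)."""
    s = math.sqrt(k_p / k_n)
    return (v_tn + s * (v_dd - v_tp)) / (1 + s)


def required_kn_over_kp(v_dd, v_tn, v_tp, v_th):
    """Kn/Kp needed to place the inverter threshold at v_th."""
    return ((v_dd - v_th - v_tp) / (v_th - v_tn)) ** 2


# Symmetrical case: Kn = Kp places Vth at VDD/2
print(inverter_threshold(5.0, 1.0, 1.0, 50e-6, 50e-6))
# A stronger NMOS (Kn = 4 Kp) pulls the threshold down
print(inverter_threshold(5.0, 1.0, 1.0, 200e-6, 50e-6))
# Inverse question: which ratio gives Vth = 2.0 V?
print(required_kn_over_kp(5.0, 1.0, 1.0, 2.0))
```

The two functions are inverses of each other, which gives a quick consistency check of the formulas.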

Summary

So what did we accomplish until now?
• We know how a CMOS inverter works.
• VOL, VOH - do you still remember them?
• We know how to set the W/L ratios of the transistors to get optimal noise margins.
• So we make every inverter the same, that is to say minimal - or?

[Figure: CMOS voltage transfer characteristic with regions 1 to 5, VIL and VIH marked]


Dynamic Behavior of the CMOS Inverter


High to Low Output Transition (I)

MN goes from cutoff through saturation into the nonsaturation region for the given input.
The border between saturation and nonsaturation is reached at the time tx, when the output voltage is Vout = VOH - VTn.

[Figure: CMOS inverter (VDD = 5 V) with load capacitance C; the input vI steps from 0 V to 5 V at t = 0, vO(0+) = 5 V. The output falls from VOH = 5 V to VOL = 0 V: MN saturated between t1 and tx (down to Vin - VTn), MN nonsaturated between tx and t2]

High to Low Output Transition (II)

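In order to simplify the final expressions, the integrations below for computing tHL are done with the limits from VDD to V0 (V1 = 0.9 VDD, V0 = 0.1 VDD)

$$i = \frac{dQ}{dt} = C_{OUT}\frac{dV_{OUT}}{dt} \quad\Rightarrow\quad \int dt = C_{OUT}\int \frac{dV_{OUT}}{i}$$

Saturation:

$$t_x - t_1 = -C_{OUT}\int_{V_{DD}}^{V_{DD}-V_{Tn}} \frac{dV_{OUT}}{\frac{K_n}{2}\left(V_{DD} - V_{Tn}\right)^2} = \frac{2C_{OUT}V_{Tn}}{K_n\left(V_{DD} - V_{Tn}\right)^2}$$

Nonsaturation:

$$t_2 - t_x = -C_{OUT}\int_{V_{DD}-V_{Tn}}^{V_0} \frac{dV_{OUT}}{\frac{K_n}{2}\left[2\left(V_{DD} - V_{Tn}\right)V_{OUT} - V_{OUT}^2\right]} = -\frac{2C_{OUT}}{K_n}\,\frac{1}{2\left(V_{DD} - V_{Tn}\right)}\,\ln\left(\frac{V_{OUT}}{2\left(V_{DD} - V_{Tn}\right) - V_{OUT}}\right)\Bigg|_{V_{DD}-V_{Tn}}^{V_0}$$

$$= \frac{C_{OUT}}{K_n\left(V_{DD} - V_{Tn}\right)}\,\ln\left(\frac{2\left(V_{DD} - V_{Tn}\right)}{V_0} - 1\right)$$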

High to Low Output Transition (III)

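We have used the following integral:

$$\int \frac{dx}{x\left(a + bx^n\right)} = \frac{1}{an}\ln\left(\frac{x^n}{a + bx^n}\right)$$

In our case n = 1, b = −1:

$$\int \frac{dx}{ax - x^2} = \frac{1}{a}\ln\left(\frac{x}{a - x}\right)$$

$$t_{HL} = \left(t_x - t_1\right) + \left(t_2 - t_x\right)$$

therefore:

$$t_{HL} = \tau\left[\frac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\frac{2\left(V_{DD} - V_{Tn}\right)}{V_0} - 1\right)\right] \quad\text{where}\quad \tau = \frac{C_{OUT}}{K_n\left(V_{DD} - V_{Tn}\right)}$$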
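The closed-form result for tHL is easy to evaluate. The Python sketch below does this for an inverter with COUT = 1 pF, VDD = 5 V, VTn = 1 V and Kn = 50 µA/V² (assumed here as K'n = 25 µA/V² with W/L = 2/1). The function name is illustrative, and the formula is only meaningful for V0 ≤ VDD − VTn.

```python
import math


def fall_time(c_out, k_n, v_dd, v_tn, v_0):
    """tHL = tau * [2 VTn / (VDD - VTn) + ln(2 (VDD - VTn) / V0 - 1)]."""
    tau = c_out / (k_n * (v_dd - v_tn))
    saturation_part = 2 * v_tn / (v_dd - v_tn)
    nonsaturation_part = math.log(2 * (v_dd - v_tn) / v_0 - 1)
    return tau * (saturation_part + nonsaturation_part)


t_10 = fall_time(1e-12, 50e-6, 5.0, 1.0, v_0=0.5)   # down to 0.1 VDD
t_50 = fall_time(1e-12, 50e-6, 5.0, 1.0, v_0=2.5)   # down to 0.5 VDD
print(f"tHL(10 %) = {t_10 * 1e9:.2f} ns, tHL(50 %) = {t_50 * 1e9:.2f} ns")
```

With V0 = 0.1 VDD the expression gives about 16 ns; evaluated down to the 50 % point it gives about 6.4 ns.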

Low to high output transition

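From symmetry (VTn → VTp; Kn → Kp), the low to high transition time follows as:

$$t_{LH} = \frac{C_{OUT}}{K_p\left(V_{DD} - |V_{Tp}|\right)}\left[\frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\frac{2\left(V_{DD} - |V_{Tp}|\right)}{V_0} - 1\right)\right]$$

[Figure: CMOS inverter (VDD = 5 V) with load capacitance C; the input vI steps from 5 V to 0 V at t = 0, vO(0+) = 0 V, and the output rises from 0 V to 5 V]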


Dynamic Behavior of the CMOS Inverter (cont’d)
• The choice of size of the NMOS and PMOS transistors can be dictated by the desired average propagation delay τP
• For a symmetrical inverter (K'n ≈ 2.5 K'p): τP = (tPHL + tPLH)/2 = tPHL = tPLH, tr = tf = 2τP

Example (VDD = 5 V, |VTP| = VTN = 1 V):
Symmetrical reference inverter: MP = 5/1, MN = 2/1, C = 1 pF: τP = 6.4 ns, tr = tf = 12.8 ns
Scaled inverters:
(a) MP = 32.5/1, MN = 13/1, C = 1 pF: τP = 1 ns
(b) MP = 20/1, MN = 8/1, C = 2 pF: τP = 3.2 ns
Power Dissipation
• Two kinds of power dissipation in digital electronics:
– static power dissipation (logic gate output is stable)
– dynamic power dissipation (during switching of logic gate)
• With CMOS nearly no static power dissipation!

[Figure: output voltage (0 V to 6 V) and drain current (0 to 40 µA) of the CMOS inverter versus vI (0 V to 6 V)]


Dynamic Power Dissipation (I)


Power dissipation due to charge and discharge of capacitances

[Figure: source VDD charging capacitor C through a non-linear resistor; the switch closes at t = 0, vC(0) = 0]

The total energy ED delivered by the source is given by

$$E_D = \int_0^\infty P(t)\,dt$$

The power P(t) = VDD i(t), and because VDD is a constant,

$$E_D = \int_0^\infty V_{DD}\,i(t)\,dt = V_{DD}\int_0^\infty i(t)\,dt$$

The current supplied by source VDD is also equal to the current in capacitor C, and so

$$E_D = V_{DD}\int_0^\infty C\,\frac{dv_C}{dt}\,dt = C V_{DD}\int_{V_C(0)}^{V_C(\infty)} dv_C$$

Dynamic Power Dissipation (II)

Integrating from t = 0 to t = ∞, with VC(0) = 0 and VC(∞) = VDD, results in

$$E_D = C V_{DD}^2$$

We know that the energy ES stored in capacitor C is given by

$$E_S = \frac{C V_{DD}^2}{2}$$

and thus the energy EL lost in the resistive element must be

$$E_L = E_D - E_S = \frac{C V_{DD}^2}{2}$$

The total energy ETD dissipated in the process of first charging and then discharging the capacitor is equal to

$$E_{TD} = \left(\frac{C V_{DD}^2}{2}\right)_{Charge} + \left(\frac{C V_{DD}^2}{2}\right)_{Discharge} = C V_{DD}^2$$

Dynamic Power Dissipation (III)

Thus, every time a logic gate goes through a complete switching cycle, the transistors within the gate dissipate an energy equal to ETD. Logic gates normally switch states at some relatively high frequency f (switching events/second), and the dynamic power PD dissipated by the logic gate is then

$$P_D = C V_{DD}^2 f$$

In effect, an average current equal to (C VDD f) is supplied from the source VDD.
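That the energy lost while charging is C VDD²/2 regardless of the resistive element can be checked with a small simulation. The Python sketch below charges C through a non-linear resistor using forward Euler steps. The current law i(v), the step size and the switching frequency of 10 MHz are hypothetical values chosen only for illustration.

```python
def charge_capacitor(c, v_dd, current, dt, steps):
    """Forward-Euler charging of C from 0 V through an element with i = current(VDD - vC)."""
    v_c = 0.0
    e_source = 0.0  # energy delivered by the source VDD
    e_lost = 0.0    # energy dissipated in the resistive element
    for _ in range(steps):
        v_r = v_dd - v_c
        i = current(v_r)
        e_source += v_dd * i * dt
        e_lost += v_r * i * dt
        v_c += i * dt / c
    return v_c, e_source, e_lost


def nonlinear_resistor(v):
    """Hypothetical non-linear i(v) law, not taken from any device model."""
    return 1e-4 * v + 1e-5 * v ** 3


C, V_DD, F = 1e-12, 5.0, 10e6  # F is an assumed switching frequency

v_end, e_d, e_l = charge_capacitor(C, V_DD, nonlinear_resistor, dt=1e-11, steps=20000)
p_dynamic = C * V_DD ** 2 * F

print(f"ED = {e_d * 1e12:.2f} pJ, EL = {e_l * 1e12:.2f} pJ, PD = {p_dynamic * 1e6:.0f} uW")
```

The simulated ED is close to C VDD² = 25 pJ and EL is close to half of that; the small remaining difference comes from the Euler step size.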

Dynamic Power Dissipation (IV)

• Power dissipation due to the “short circuit current” (when both transistors are on during transition)
• The short circuit current reaches a peak for Vin = Vout = VDD/2

[Figure: switch model of the CMOS inverter (VDD = 5 V) with Ronp and Ronn; input vI and output vO (0 V to 5 V) and supply current iDD (0 to 30 µA) versus time (0 to 16 ns), with the current peak at Vin = Vout = VDD/2]

Summary
Let’s repeat:
• What is the dynamic behaviour of the inverter?
• What do we need it for?
• What kind of power dissipation is there?
• What kind of power dissipation is dominant with CMOS logic?

PD = C VDD² f

[Figure: output voltage and drain current of the CMOS inverter versus vI]
6. CMOS Technology

Institute of
6: CMOS Technology Microelectronic
Systems 1

CMOS Technology

• Basic Fabrication Operations


• Steps for Fabricating a NMOS Transistor
• LOCOS Process
• n-Well CMOS Technology
• Layout Design Rules
• CMOS Inverter Layout Design
• Circuit Extraction, Electrical Process Parameters
• Layout Tool Demonstration
• Appendix: MOSIS, EUROPRACTICE

Wafer Terminology

1. Chip = Die = Microchip = Bar


2. Scribe Lines
3. Engineering Test Die
4. Edge Die
5. Crystal Planes
6. Wafer Flats


Basic Wafer Fabrication Operations

The number of steps in IC fabrication flow depends upon the technology process
and the complexity of the circuit
Example:
CMOS n-Well process - 30 major steps, and each major step may involve up to
15 substeps
Only three basic operations are performed on the wafer:
• Layering
• Patterning
• Doping

Layering

Grow or deposit thin layers of different materials on the wafer surface

Layers Technique
Thermal Chemical Vapor Evaporation Sputtering
oxidation Deposition (CVD)

Insulators Silicon Dioxide Silicon Dioxide (SiO2) Silicon Dioxide (SiO2)


(SiO2)
Silicon Nitrides (Si3N4) Silicon Monoxide (SiO)

Semiconductors Epitaxial Silicon


Poly Silicon
Doped polysilicon Metals Metals
Conductors Metals Alloys Alloys
Al/Si Alloys
Silicides


Layering - Thermal Oxidation


SiO2 functions:

Surface passivation Diffusion barrier Field oxide MOS Gate oxide

Natural oxide: silicon will readily grow an oxide (5-10nm) if exposed to oxygen in the air!
The range for useful oxide thickness: 25nm (MOS gates) - 1500nm (field oxide)

Dry oxidation
Si + O2 → SiO2 (900-1200°C)
700nm oxide: 10 hours (1200°C)
Good oxide quality: gate oxide

Wet oxidation (water vapor or steam)
Si + 2H2O → SiO2 + 2H2 (900-1200°C)
700nm oxide: 0.65 hours (1200°C)
Poor oxide quality: field oxide

[Figure: O2 reacting at the silicon surface to form an SiO2 layer]
Layering - Chemical Vapor Deposition (CVD)
Deposited materials:
• Insulators & Dielectrics: SiO2, Si3N4, Phosphorus Silicate Glass (PSG), Doped Oxide
• Semiconductors: Si
• Conductors: Al, Cu, Ni, Au, Pt, Ti, W, Mo, Cr, Silicides (WSi2, MoSi2), doped polysilicon

Basic CVD processing:


• a gas containing an atom(s) of the material to be deposited reacts with another gas
liberating the desired material
• the freed material (atom or molecular form) “deposits” on the substrate
• the unwanted products of the chemical reaction leave the reaction chamber
Example: CVD of silicon from silicon tetrachloride

SiCl4 + 2H2 → Si + 4HCl↑

wafer


Layering - Evaporation
Used to deposit conductive layers (metallization): Al, Al/Si, Al/Cu, Au, Mo, Pt
When the temperature is raised high enough, atoms of the solid material (Al) melt, “evaporate” into the atmosphere and deposit onto the wafer
The external energy needed to evaporate the metal is provided by:
1. A current flowing through a filament
2. Flash system
3. Electron beam
The evaporation takes place in an evacuated chamber (high vacuum, 10^-5 to 10^-7 torr); otherwise Al would combine with oxygen in the air to form Al2O3

[Figure: evaporation system with wafer, crucible, heater, evaporation source (Al, Al/Si alloy), magnet and vacuum pump]
Layering - Sputtering

Used to deposit thin metal/alloy films and insulators: Al, Ti, Mo, Al/Si, Al/Cu, SiO2
Sputtering process:
• ionized argon atoms (+) are introduced into an evacuated chamber
• the target (Al) is maintained at negative potential
• the argon ions are accelerated towards the negative charge
• following the impact, some of the target material atoms tear off
• the liberated material settles on everything in the chamber, including the wafers
The material to be sputtered does not have to be heated


Patterning
• Patterning = Lithography = Masking
• Selective removal of the top layer(s) on the wafers
• Ex.: Process steps required for patterning SiO2

1. Initial structure: SiO2 layer on the Si substrate (wafer)
2. Photoresist deposition
3. UV exposure through a mask, leaving regions of soluble and insoluble photoresist
4. Soluble photoresist etching
5. SiO2 etching (chemical/dry etch)
6. Photoresist etching

Doping

• Change conductivity type and resistivity on selected regions of wafer


• Doping takes place to the wafer through the holes patterned in the surface layer
• Two techniques are used:
• Thermal diffusion
• Ion implantation

Thermal diffusion:
- heat the wafer to the vicinity of 1000°C
- expose the wafer to vapors containing the desired dopant
- the dopant atoms diffuse into the wafer surface creating a p/n region

Ion implantation:
- room temperature
- dopant atoms are accelerated to a high speed and “shot” into the wafer surface
- an annealing (heating) step is necessary to reorder the crystal structure damaged by implant


NMOS Transistor Fabrication - process flow (1)

Si Substrate (p)

Oxidation (Layering)

SiO2 Field Oxide (Thick Oxide)

Oxide etching (Patterning)

NMOS Transistor Fabrication - process flow (2)
Oxidation (Layering)

SiO2 Gate Oxide (Thin Oxide)

Polysilicon deposition (Layering)

Polysilicon etching (Patterning)


NMOS Transistor Fabrication - process flow (3)


Oxide etching (Patterning)

Ion implantation (Doping)

n type
n+ n+

Oxidation (Layering)

SiO2 Insulated Oxide


n+ n+

NMOS Transistor Fabrication - process flow (4)
Oxide etching (Patterning)

Contact windows
n+ n+

Metal deposition (Layering)

Al evaporation

n+ n+

S D Metal etching (Patterning)

G
n+ n+

Si Substrate (p)

Device Isolation Techniques


MOS transistors must be electrically isolated from each other in order to:
• prevent unwanted conduction paths between devices
• avoid creation of inversion layers outside the channel regions
• reduce the leakage currents
Each device is created in dedicated regions - active areas
Each active area is surrounded by a field oxide barrier, created with one of the following techniques:
A) Etched field-oxide isolation
1) grow a field oxide over the entire surface of the chip
2) pattern the oxide and define active areas
Drawbacks: - large oxide steps at the boundaries between active areas and field regions!
- cracking of subsequently deposited polysilicon/metal layers!
Not used!
B) Local Oxidation of Silicon (LOCOS)

Local Oxidation of Silicon (LOCOS) (1)
More planar surface topology
Selectively growing the field oxide in certain regions - process flow:
1) grow a thin pad oxide (SiO2) on the silicon surface
2) define active area : deposition and patterning a silicon nitride (Si3N4) layer
Si3N4
SiO2

Silicon substrate

The thin pad oxide - protect the silicon surface from stress caused by nitride
3) channel stop implant: p-type regions that surround the transistors

p+ p+ p+


Local Oxidation of Silicon (LOCOS) (2)


4) Grow a thick field oxide

The field oxide is partially recessed into the surface (oxidation consumes some of the silicon)
The field oxide forms a lateral extension under the nitride layer - the bird's beak region
The bird's beak region limits device scaling and device density in VLSI circuits!

5) Etch the nitride layer and the thin oxide pad layer

Active Active
area area

n-Well CMOS Technology - simplified process sequence

Creating n-well regions (PMOS transistors) and channel


stop regions

Grow field oxide and gate oxide

Deposit and pattern polysilicon layer

Implant source and drain regions, substrate contacts

Create contact windows, deposit and pattern metal layer


n-Well CMOS Technology - Inverter Example

• Process starts with a moderately doped (10^15 cm^-3) p-type substrate (wafer)
• An initial oxide layer is grown on the entire surface (barrier oxide)

SiO2

Si (p)

1. n-Well mask - defines the n-Well regions
• Pattern the oxide
• Implant n-type impurity atoms (phosphorus) - 10^16 cm^-3
• Drive-in the impurities (vertical but also lateral redistribution - limits the density )

SiO2

n-well
Si (p)


2. Active area mask - define the regions in which MOS devices will be created
• LOCOS process to isolate NMOS and PMOS transistors
• lateral penetration of bird’s beak region ~ oxide thickness
• channel stop p+ implants (boron)
• Grow gate oxide (dry oxidation) - only in the open area of active region

SiO2
p+
n-well
Si (p)

3. Polysilicon mask - define the gates of the MOS transistors
• Polysilicon is deposited over the entire wafer (CVD process) and doped (typically n-type)
• Pattern the polysilicon in the dry (plasma) etching process
• Etch the gate oxide

Polysilicon gate

SiO2
p+
n-well
Si (p)


4. n-Select mask - define the n+ source/drain regions of NMOS transistors


• Define an ohmic contact to the n-well
• Implant n-type impurity atoms (arsenic)
• Polysilicon layer protects transistor channel regions from the arsenic dopant

n-well ohmic contact

SiO2
S n+ n+ D n+
p+
n-well
Si (p)

5. Complement of the n-select mask - define the p+ source/drain regions of PMOS transistors
• Define the ohmic contacts to the substrate
• Implant p-type impurity atoms (boron)
• Polysilicon layer protects transistor channel regions from the boron dopant

substrate ohmic contact

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
n-well
Si (p)


• In the n-well two p+ and one n+ regions are created


• After source/drain implantation a short thermal process is performed (annealing):
• moderate temperature
• drive the impurities deeper into the substrate
• repair some of the crystal structure damage
• lateral diffusion under the gate: overlap capacitances
• Next the SiO2 insulated layer is deposited over the entire wafer area using a CVD technique
• The surface becomes nonplanar: impact on the metal deposition step

SiO2

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
n-well
Si (p)

6. Contact mask - define the contact cuts in the insulating layer
• Contacts to polysilicon must be made outside the gate region (avoid metal spikes through
the poly and the thin gate oxide)

Contact window
SiO2

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
n-well
Si (p)


7. Metallization mask - define the interconnection pattern


• Aluminum is deposited over the entire wafer (evaporation) and selectively etched
• The step coverage in this process is most critical (nonplanarity of the wafer surface)

Metal
SiO2

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
n-well
Si (p)

• The final step: the entire surface is passivated (overglass layer)
• Protect the surface from contaminants and scratches
• Then openings are etched to the bond pads to allow for wire bonding


GND In VDD

Out
Poly
Metal
SiO2

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
Gate oxide n-well
Si (p) N-channel transistor P-channel transistor

In

GND VDD
Out
Design Rules

• Interface between designer and process engineer


• Guidelines for constructing process masks
• Unit dimension: minimum line width
• Scalable design rules - lambda (λ) parameter:
– define all rules as a function of a single parameter λ
– scaling of the minimum dimension: change the value of λ - linear scaling!
– linear scaling is only possible over a limited range of dimensions (1-3µm)
– are conservative: they have to represent the worst case rules for the whole set
– for small projects are a flexible and versatile design methodology
• Micron rules - absolute dimensions:
– can exploit the features of a given process to a maximum degree
– scaling and porting designs between technologies is more demanding: manually or
using advanced CAD tools!
• Ex.: Scalable CMOS design rules


CMOS Process Layers

Layer Color Representation

Well (p,n) Yellow


Active Area (n+,p+) Green
Select (p+,n+) Green
Polysilicon Red
Metal1 Blue
Metal2 Magenta
Contact To Poly Black
Contact To Diffusion Black
Via Black

Intra-Layer Design Rules (λ)

Same Potential Different Potential

Well Polysilicon
6 9 2
10 2

Active
3 Metal1
3
3
3

Select 2 Metal2
4
2
3
Contact/Via
hole Minimum dimensions and distances

2

Inter-Layer Design Rules - Transistor Layout (λ)

poly active (n+)

Transistor
1

3 2

Well boundary

Inter-Layer Design Rules - Contact and Via (λ)

2
m2 4
Via
1 1
m1 5
Metal1 to Metal to
Metal2 contact Poly contact
1

Metal to 3 2
Active contact Via
m2 m1
2 m1

2 2
poly
n+


Select Layer (λ)

Select
2
Contact to Contact to
well substrate
2
Select 1

3 3

2
5

Well
Substrate

CMOS Inverter Layout

GND In VDD

Out
Poly
Metal
SiO2

p+ S n+ n+ D SiO2 D p+ p+ S n+
p+
Gate oxide n-well
Si (p) N-channel transistor P-channel transistor


CMOS Latchup
[Figure: cross-section of the n-well CMOS inverter (VSS = 0 V, VDD = 5 V, output vO) showing the parasitic npn and pnp bipolar transistors together with the resistances Rn (n-well) and Rp (p-type substrate)]

• The parasitic bipolar transistors can destroy the CMOS circuitry
• The bipolar devices are normally inactive
• The collector of each bipolar transistor is connected to the base of the other in a positive feedback structure
• The latchup effect can occur when:
1. Both bipolar transistors conduct
2. The product of the gains of the two transistors in the feedback loop exceeds unity (βPβN > 1)
7. Complementary MOS (CMOS)
Logic Design

Institute of
Microelectronic
Systems

Basic CMOS Logic Gate Structure

• PMOS and NMOS switching networks are complementary
⇒ Either the PMOS or the NMOS network is on while the other is off
⇒ No static power dissipation

[Figure: the logic inputs drive a PMOS switching network between VDD and the output Y and an NMOS switching network between Y and ground]

CMOS NOR Gate
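[Figure: two-input CMOS NOR gate (VDD = 5 V) with two series PMOS transistors of W/L = 10/1 and two parallel NMOS transistors of W/L = 2/1, next to the reference inverter with MP = 5/1 and MN = 2/1]

NOR Gate Truth Table
A B   Z = NOT(A + B)
0 0   1
0 1   0
1 0   0
1 1   0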


Transistor Sizing for CMOS Gates: Review

Goal: To keep the delay times equal to those of the reference inverter design under the worst-case input conditions

Example: 2-input CMOS NOR gate

- Each transistor of the NMOS network is capable of discharging the load capacitance C individually ⇒ Same size as NMOS transistor of reference inverter
- PMOS network conducts only when AB = 00 (transistors in series) ⇒ Each PMOS must be twice as large (on-resistance proportional to (W/L)^-1)
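The sizing rule can be written down as a small helper. This Python sketch assumes the reference inverter of this lecture with (W/L)n = 2/1 and (W/L)p = 5/1 and simply scales every transistor of an n-transistor series chain by n; the function name and interface are hypothetical.

```python
def size_gate(kind, n_inputs, wl_n_ref=2.0, wl_p_ref=5.0):
    """Per-transistor W/L so that the worst-case path matches the reference inverter.

    On-resistance is taken as proportional to 1/(W/L), so n devices in series
    each need n times the reference W/L; parallel devices keep the reference size.
    """
    if kind == "nor":    # NMOS in parallel, PMOS in series
        return {"nmos": wl_n_ref, "pmos": n_inputs * wl_p_ref}
    if kind == "nand":   # NMOS in series, PMOS in parallel
        return {"nmos": n_inputs * wl_n_ref, "pmos": wl_p_ref}
    raise ValueError("kind must be 'nor' or 'nand'")


print(size_gate("nor", 2))    # 2-input NOR
print(size_gate("nand", 2))   # 2-input NAND
print(size_gate("nand", 5))   # 5-input NAND
```

Because the PMOS reference device is already 2.5 times wider than the NMOS one, the series PMOS chain of a NOR gate costs considerably more area than the series NMOS chain of a NAND gate with the same number of inputs.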

CMOS NAND Gate

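NAND Gate Truth Table
A B   Z = NOT(AB)
0 0   1
0 1   1
1 0   1
1 1   0

[Figure: two-input CMOS NAND gate (VDD = 5 V) with two parallel PMOS transistors of W/L = 5/1 and two series NMOS transistors (inputs A, B) of W/L = 4/1, next to the reference inverter with MP = 5/1 and MN = 2/1]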


Multi-Input NAND Gate


Y = NOT(ABCDE)

[Figure: five-input CMOS NAND gate (VDD = 5 V) with five parallel PMOS transistors of W/L = 5/1 and five series NMOS transistors (inputs A to E) of W/L = 10/1]

Why should one prefer a NAND gate rather than a NOR gate?
Steps in Constructing Graphs for NMOS and
PMOS Networks (I)
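[Figure: gate with inputs A, B, C, D (supply +5 V). The NMOS network consists of MA in parallel with the series connection of MB and the parallel pair MC, MD; the PMOS switch network is still to be determined]

Sub-functions of the NMOS network: C + D, B (C + D), A + B (C + D)
Output: Y = NOT(A + B (C + D))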


Steps in Constructing Graphs for NMOS and PMOS Networks (II)

[Figure: (a) gate with the NMOS network MA, MB, MC, MD and the PMOS switch network still to be determined; (b) NMOS graph with one arc per transistor A, B, C, D; (c) NMOS graph with new nodes added; (d) graph with PMOS arcs added]
Steps in Constructing Graphs for NMOS and PMOS Networks (III)

[Figure: graph with PMOS arcs added, and the final CMOS circuit derived from it]

Final CMOS circuit (W/L): PMOS A = 15/1, B = 7.5/1, C = 15/1, D = 15/1; NMOS MA = 2/1, MB = 4/1, MC = 4/1, MD = 4/1


Summary

• AND - serially connected FET
• OR - parallel connected FET
• NMOS network implements “zeros”
• PMOS network implements “ones”
• W/L ratio has to be determined as a design parameter

[Figure: final CMOS circuit (+5 V) with PMOS A = 15/1, B = 7.5/1, C = 15/1, D = 15/1 and NMOS MA = 2/1, MB = 4/1, MC = 4/1, MD = 4/1]
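The complementary behaviour of the two networks can be verified exhaustively. The Python sketch below models the example gate Y = NOT(A + B (C + D)) with the NMOS network as drawn and the PMOS network as its dual (series and parallel exchanged), which is an assumption about the final circuit; the function names are illustrative.

```python
from itertools import product


def nmos_conducts(a, b, c, d):
    """Pull-down path: MA in parallel with (MB in series with (MC parallel MD))."""
    return a or (b and (c or d))


def pmos_conducts(a, b, c, d):
    """Pull-up path of the dual network; a PMOS device conducts for a 0 at its gate.

    A in series with (B parallel (C in series with D)).
    """
    return (not a) and ((not b) or ((not c) and (not d)))


ones = 0
for a, b, c, d in product([False, True], repeat=4):
    down = nmos_conducts(a, b, c, d)
    up = pmos_conducts(a, b, c, d)
    # Exactly one network conducts: no floating output, no static current path
    assert up != down
    y = up  # the output is 1 when the PMOS network connects it to the supply
    ones += y
    print(int(a), int(b), int(c), int(d), "->", int(y))

print("input combinations with Y = 1:", ones)
```

The NMOS network decides where the output is 0 and the PMOS network where it is 1, and the two sets never overlap.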

CMOS Gate Design: Minimum Size Vs.
Performance (I)
CMOS circuit with only Considerable savings in chip area,
minimum size transistors but increased logic delay

Example:


CMOS Gate Design: Minimum Size Vs.


Performance (II)
(W/L) for PMOS network = 2/3:

$$\tau_{PLH} = \frac{\left(5/1\right)}{\left(2/3\right)}\,\tau_{PLHI} = 7.5\,\tau_{PLHI}$$

τPLHI = τPLH of reference inverter

For NMOS network: τPHL = 2 τPHLI

The average propagation delay of the minimum size logic gate is:

$$\tau_P = \frac{\tau_{PHL} + \tau_{PLH}}{2} = \frac{2\,\tau_{PHLI} + 7.5\,\tau_{PLHI}}{2} = \frac{9.5\,\tau_{PLHI}}{2} = 4.75\,\tau_{PLHI}$$

The minimum size gate will be 4.75 times slower than the reference inverter when driving the same load capacitance
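The factors 7.5 and 2 follow from the effective W/L of the worst-case series paths. The Python sketch below reproduces them, assuming that every minimum size transistor has W/L = 2/1, that the worst-case paths are three PMOS devices and two NMOS devices in series, and that τPHLI = τPLHI for the reference inverter (W/L = 2/1 and 5/1). The helper names are made up.

```python
def series_wl(*ratios):
    """Effective W/L of transistors in series (on-resistances add)."""
    return 1.0 / sum(1.0 / r for r in ratios)


def delay_factor(wl_reference, wl_path):
    """Delay relative to the reference inverter for the same load capacitance."""
    return wl_reference / wl_path


# Minimum size gate: all transistors 2/1
plh_min = delay_factor(5.0, series_wl(2.0, 2.0, 2.0))   # PMOS path A-C-D
phl_min = delay_factor(2.0, series_wl(2.0, 2.0))        # NMOS path B-C (or B-D)
print(plh_min, phl_min, (plh_min + phl_min) / 2)

# Sized gate: PMOS 15/1 (B: 7.5/1), NMOS B, C, D 4/1
plh_sized = delay_factor(5.0, series_wl(15.0, 15.0, 15.0))
phl_sized = delay_factor(2.0, series_wl(4.0, 4.0))
print(plh_sized, phl_sized)
```

For the sized gate both factors return to 1, which is the design goal of matching the reference inverter.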
Power-Delay Product (PDP)
The PDP is an important figure of merit for a logic technology

$$PDP = P_{AV}\,\tau_P$$

For CMOS:

$$P_{AV} = C V_{DD}^2 f \quad\text{with}\quad f = \frac{1}{T}$$

[Figure: CMOS switching waveform]



Power-Delay Product (cont’d)

• The period T must satisfy: T ≥ tr + ta + tf + tb
• Assumptions: At high frequencies ta → 0 and tb → 0, tr and tf account for approximately 80 % of the total transition time

For symmetrical inverter:

$$T \ge \frac{2t_r}{0.8} = \frac{2\left(2\tau_P\right)}{0.8} = 5\tau_P$$

$$PDP \le \frac{C V_{DD}^2}{5\tau_P}\,\tau_P = \frac{C V_{DD}^2}{5}$$

7b. Passtransistor and
Transmission Gate Logic

Institute of
Microelectronic
Systems

Passtransistor Logic: Basic Principle

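Idea: a switch between Vin and Vout, operated by a control signal (0 = open, 1 = closed)
Implementation: a pass transistor between Vin and Vout with the control signal at its gate

Vin   control   Vout
1     0         x
1     1         1
0     0         x
0     1         0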

Passtransistor Logic: NEXOR Realisation

[Figure: passtransistor realisation of the NEXOR function with inputs A, B and output OUT]

A B   OUT
0 0   1
0 1   0
1 0   0
1 1   1


Passtransistor: Charging Characteristics

NMOS: Vctrl(t < 0) = 0, Vctrl(t >= 0) = VDD
Vin = VDD, Vout(t = 0) = 0
The transistor is in saturation during the charging process.
Vout(t) rises towards VDD − VT(VSB).

[Figure: NMOS passtransistor charging Cout, and Vout(t) approaching VDD − VT(VSB)]

Passtransistor Cascades
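Series connection with all gates at VDD (Vin = VDD): every internal node and the output settle at the same level
Vmax = VDD − VT(Vmax)

Gate of the second transistor driven by the output of the first (Vin = VDD):
Vmax,1 = VDD − VT(Vmax,1)
Vmax,2 = Vmax,1 − VT(Vmax,2) ≈ VDD − 2VT

[Figure: passtransistor cascades loaded by Cout]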
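Because VT itself depends on the source voltage, Vmax has to be found iteratively. The Python sketch below uses the usual body-effect expression VT = VT0 + γ(√(2φF + VSB) − √(2φF)), which is not spelled out on this slide and is therefore an assumption, together with the NMOS parameters VT0 = 1 V, γ = 0.5 V^0.5 and 2φF = 0.6 V from the CMOS inverter chapter. Function names are illustrative.

```python
import math


def threshold(v_sb, v_t0=1.0, gamma=0.5, two_phi_f=0.6):
    """Body-effect threshold voltage (assumed standard expression)."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def max_output(v_gate, iterations=60):
    """Fixed-point solution of Vmax = Vgate - VT(Vmax); intended for Vgate well above VT0."""
    v = v_gate
    for _ in range(iterations):
        v = v_gate - threshold(v)
    return v


v_max1 = max_output(5.0)      # gate at VDD = 5 V
v_max2 = max_output(v_max1)   # gate driven by the degraded level Vmax,1
print(f"Vmax,1 = {v_max1:.3f} V, Vmax,2 = {v_max2:.3f} V")
```

With these parameters the first stage reaches about 3.39 V and the second about 1.97 V, which shows why the output of one passtransistor should not drive the gate of another.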


Passtransistor: Discharging Characteristics

NMOS: Vctrl(t < 0) = 0, Vctrl(t >= 0) = VDD
Vin = 0, Vout(t = 0) = VDD − VT(VSB)
The transistor is always in nonsaturation during the discharging process.
NMOS passtransistor: discharging is faster than charging, since the device impedance is lower in nonsaturation than in saturation.

[Figure: NMOS passtransistor discharging Cout, and Vout(t) falling from VDD − VT(VSB)]
Passtransistor: Charging Characteristics
PMOS Charging Process:
Vctrl(t < 0) = VDD, Vctrl(t >= 0) = 0
Vin = VDD, Vout(t = 0) = 0
The output is charged to VDD (the transistor is initially saturated and then goes into nonsaturated mode)

PMOS Discharging Process:
Vctrl(t < 0) = VDD, Vctrl(t >= 0) = 0
Vin = 0, Vout(t = 0) = VDD
The output is discharged to VT (the transistor is saturated and finally goes into cut-off mode)


From Passtransistors to Transmission Gates

Logic level   NMOS         PMOS   CMOS
Logic 0       0            VTP    0
Logic 1       VDD − VTN    VDD    VDD

CMOS Transmission Gate:

$$I_{DN} + I_{DP} = C_{out}\,\frac{dV_{out}}{dt}$$

[Figure: CMOS transmission gate (NMOS and PMOS transistor in parallel between Vin and Vout, controlled by Vctrl, load Cout) and the CMOS transmission gate symbol]

• Bidirectional resistive connection between the input and output terminals
• Useful in both analog (e.g. for relay contacts) and in digital design (e.g. for multiplexers)
Transmission Gate: Operation States
Operation states which the transistors pass through while the output is charged from 0 (initial voltage) to VDD (final voltage):

Output voltage range     Mn          Mp
0 to VTP                 saturated   saturated
VTP to VDD − VTN         saturated   nonsaturated
VDD − VTN to VDD         cut-off     nonsaturated


CMOS Transmission Gate: On-Resistance

$$R_{EQ} = \frac{R_{onP}\,R_{onN}}{R_{onP} + R_{onN}}$$

[Figure: on-resistance of a transmission gate, including body effect]

VTON = 0.75 V, VTOP = −0.75 V, γ = 0.5 V^0.5, 2φF = 0.6 V, Kp = 20 µA/V², Kn = 50 µA/V²

CMOS Transmission Gate (III)

• Charge sharing problem

$$V_F = \frac{C_{BIG}V_{BIG} + C_{SMALL}V_{SMALL}}{C_{BIG} + C_{SMALL}}$$

Example: CSMALL = 0.02 pF, VSMALL = 5 V, VBIG = 0 V
CBIG = 0.2 pF (about 10 standard loads in a 0.5 µm CMOS process)
VF = 0.45 V ⇒ The 'big' capacitor has forced node A to a voltage close to a '0'
Node A has to be isolated from node Z by including a buffer (e.g. an inverter) between the two nodes if node A is not strong enough to overcome the 'big' capacitor
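The final voltage follows directly from charge conservation, as the short Python sketch below shows for the numbers of the example. The function name is illustrative.

```python
def shared_voltage(c_big, v_big, c_small, v_small):
    """Final voltage after two charged capacitors are connected (charge conservation)."""
    return (c_big * v_big + c_small * v_small) / (c_big + c_small)


v_f = shared_voltage(c_big=0.2e-12, v_big=0.0, c_small=0.02e-12, v_small=5.0)
print(f"VF = {v_f:.2f} V")
```

The small capacitor holds only one eleventh of the total capacitance, so its 5 V level collapses to about 0.45 V.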

Transmission Gate Logic


Multiplexer: $F = A S + B \bar{S}$
Equivalence (NEXOR): $F = A B + \bar{A}\,\bar{B} = \overline{A \oplus B}$
Figure: transmission gate multiplexer, equivalence gate, and an alternate equivalence logic circuit.
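The two transmission gate functions can be modeled behaviorally by treating each transmission gate as an ideal switch. This is only a sketch: the function names are hypothetical, and the assumption that A is passed while S = 1 is an illustrative choice, since the polarity of the select signal depends on the actual wiring.

```python
from itertools import product


def tg_mux(a, b, s):
    """Two ideal transmission gates: one passes a while s = 1,
    the other passes b while s = 0 (assumed polarity)."""
    return a if s else b


def tg_xnor(a, b):
    """Equivalence gate: pass a when b = 1, pass the complement of a when b = 0."""
    return a if b else 1 - a


# Print the truth table of the multiplexer and of the equivalence gate.
for a, b, s in product((0, 1), repeat=3):
    print(f"A={a} B={b} S={s} -> F={tg_mux(a, b, s)}")
for a, b in product((0, 1), repeat=2):
    print(f"A={a} B={b} -> F={tg_xnor(a, b)}")
```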

Function Implementation with Passtransistor Logic

F = bd + abd + abd + bcd


Karnaugh map of F (rows and columns labeled with a, b, c, d in the original figure):

1 0 0 1
0 0 1 0
1 0 1 1
1 1 1 1

Step 1: Find a minimum decomposition such that each selected field depends on one variable, constant 0, or constant 1 only (in our case: decompose with combinations of the literals b and d).


Function Implementation with Passtransistor Logic


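Step 2: Attach the decomposition variables to the selection lines.
Step 3: Determine the line input signals (implement the inverted function to compensate for the output inverter).

Figure: pass-transistor network with selection lines for b and d, line inputs including a and c, a sustainer transistor connected to VDD, and output F.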
8. Combinational MOS Logic Circuits

Institute of
Microelectronic
Systems

Introduction
• Combinational logic circuits, or gates, which perform Boolean operations on multiple
input variables and determine the output as Boolean functions of the inputs, are the
basic building blocks of all digital systems.
• We will examine the static and dynamic characteristics of various combinational
MOS logic circuits. It will be seen that many of the basic principles used in the
design and analysis of MOS inverters can be directly applied to the combinational
logic circuit as well.
• In its most general form, a combinational logic circuit, or gate, performing a Boolean
function can be represented as a multiple-input single-output system.

General combinational logic circuit (gate)


Institute of
8: Combinational MOS Microelectronic
Systems 2
Logic Circuits
MOS Logic Circuits with Depletion nMOS Loads
Two-Input NOR Gate

A two-input depletion-load NOR gate, its logic symbol, and the corresponding truth table

Calculation of VOH
When both input voltages VA and VB are lower than the corresponding driver
threshold voltage, the driver transistors are turned off and conduct no drain current.
Consequently, the load device, which operates in the linear region, also has zero
drain current. In particular, its linear-region current equation becomes
$$ I_{D,load} = \frac{k_{n,load}}{2}\left[ 2\,|V_{T,load}(V_{OH})|\,(V_{DD} - V_{OH}) - (V_{DD} - V_{OH})^2 \right] = 0 $$
The solution of this equation gives VOH = VDD.

Calculation of VOL

To calculate the output voltage VOL, we must consider three different cases, i.e.,
three different input voltage combinations, which produce a conduction path from
the output node to the ground. These cases are

(i) VA=VOH VB=VOL


(ii) VA=VOL VB=VOH
(iii) VA=VOH VB=VOH

For the first two cases the NOR circuit reduces to a simple nMOS depletion-load
inverter. Assuming that the threshold voltages of the two enhancement-type driver
transistors are identical (VT0,A=VT0,B=VT0), the driver-to-load ratio of the
corresponding inverter can be found as follows.
(i)
$$ k_R = \frac{k_{driver,A}}{k_{load}} = \frac{k'_{n,driver}\left(\frac{W}{L}\right)_A}{k'_{n,load}\left(\frac{W}{L}\right)_{load}} $$

(ii)
$$ k_R = \frac{k_{driver,B}}{k_{load}} = \frac{k'_{n,driver}\left(\frac{W}{L}\right)_B}{k'_{n,load}\left(\frac{W}{L}\right)_{load}} $$

The output low voltage level VOL in both cases is found as follows:
$$ V_{OL} = V_{OH} - V_{T0} - \sqrt{(V_{OH} - V_{T0})^2 - \left(\frac{k_{load}}{k_{driver}}\right) |V_{T,load}(V_{OL})|^2} $$
The output low voltage (VOL) values calculated for case (i) and (ii) will be identical.
In case (iii), where both driver transistors are turned on, the saturated load current
is the sum of the two linear-mode driver currents.
$$ I_{D,load} = I_{D,driverA} + I_{D,driverB} $$
$$ \frac{k_{load}}{2}\,|V_{T,load}(V_{OL})|^2 = \frac{k_{driver,A}}{2}\left[ 2(V_A - V_{T0})V_{OL} - V_{OL}^2 \right] + \frac{k_{driver,B}}{2}\left[ 2(V_B - V_{T0})V_{OL} - V_{OL}^2 \right] $$


Since the gate voltages of both driver transistors are equal (VA=VB=VOH), we
can devise an equivalent driver-to-load ratio for the NOR structure:
$$ k_R = \frac{k_{driver,A} + k_{driver,B}}{k_{load}} = \frac{k'_{n,driver}\left[\left(\frac{W}{L}\right)_A + \left(\frac{W}{L}\right)_B\right]}{k'_{n,load}\left(\frac{W}{L}\right)_{load}} $$
Thus, the NOR gate with both of its inputs tied to a logic-high voltage is replaced
with an nMOS depletion-load circuit with the driver-to-load ratio given by the above
equation. The output voltage level in this case is:
$$ V_{OL} = V_{OH} - V_{T0} - \sqrt{(V_{OH} - V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,A} + k_{driver,B}}\right) |V_{T,load}(V_{OL})|^2} $$
The VOL is lower than the VOL values calculated for case (i) and for case (ii), when
only one input is logic-high. This also suggests a simple design strategy for NOR
gates. Usually, we have to achieve a certain maximum VOL for the worst case, i.e.,
when only one input is high. Thus, we assume that one input (either VA or VB) is
logic-high and determine the driver-to-load ratio of the resulting inverter. Then set
kdriver , A = kdriver , B = kRkload
This design choice yields two identical driver transistors, which guarantee the
required value of VOL in the worst case. When both inputs are logic-high, the
output voltage is even lower than the required maximum VOL, thus the design
constraint is satisfied.
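The comparison between one and two conducting drivers can be illustrated numerically. The sketch below evaluates the VOL expression under the simplifying assumption that the load threshold voltage is a constant (no body effect), so the implicit dependence VT,load(VOL) is ignored. All device values and names are hypothetical and chosen only for illustration.

```python
import math


def v_ol(k_driver_total, k_load, v_oh, v_t0, v_t_load):
    """Output low voltage of a depletion-load NOR gate.

    k_driver_total is the sum of the transconductances of all conducting
    drivers. v_t_load is treated as a constant (body effect neglected).
    """
    ratio = k_load / k_driver_total
    return v_oh - v_t0 - math.sqrt((v_oh - v_t0) ** 2 - ratio * v_t_load ** 2)


# Hypothetical values: two identical drivers, VOH = 5 V, VT0 = 1 V, VT,load = -3 V.
K_LOAD = 25e-6      # A/V^2
K_DRIVER = 100e-6   # A/V^2 per driver

one_high = v_ol(K_DRIVER, K_LOAD, 5.0, 1.0, -3.0)       # worst case, one input high
both_high = v_ol(2 * K_DRIVER, K_LOAD, 5.0, 1.0, -3.0)  # both inputs high
print(f"VOL (one input high)   = {one_high:.3f} V")
print(f"VOL (both inputs high) = {both_high:.3f} V")
```

With these assumed values the two-input case gives the lower VOL, which matches the argument that sizing for the one-input case is sufficient.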
Generalized NOR Structure with Multiple Inputs

Generalized n-input NOR gate

The combined pull-down current can then be expressed as follows:

$$ I_D = \sum_{k(on)} I_{D,k} = \begin{cases} \sum_{k(on)} \frac{\mu_n C_{ox}}{2}\left(\frac{W}{L}\right)_k \left[ 2(V_{GS,k} - V_{T0})V_{out} - V_{out}^2 \right] & \text{linear} \\ \sum_{k(on)} \frac{\mu_n C_{ox}}{2}\left(\frac{W}{L}\right)_k (V_{GS,k} - V_{T0})^2 & \text{saturation} \end{cases} $$
Assuming that the input voltages of all driver transistors are identical,
VGS , k = VGS for k = 1,2 ,..., n

The pull-down current expression can be rewritten as


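$$ I_D = \begin{cases} \frac{\mu_n C_{ox}}{2}\left(\sum_{k(on)} \left(\frac{W}{L}\right)_k\right) \left[ 2(V_{GS} - V_{T0})V_{out} - V_{out}^2 \right] & \text{linear} \\ \frac{\mu_n C_{ox}}{2}\left(\sum_{k(on)} \left(\frac{W}{L}\right)_k\right) (V_{GS} - V_{T0})^2 & \text{saturation} \end{cases} $$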

Equivalent inverter circuit corresponding to the n-input NOR gate

The (W/L) ratio of the driver transistor here is:


$$ \left(\frac{W}{L}\right)_{equivalent} = \sum_{k(on)} \left(\frac{W}{L}\right)_k $$
Transient analysis of NOR Gate

Parasitic device capacitances in the NOR2 gate and the lumped equivalent
load capacitance. The gate-to-source capacitances of the driver transistors
are included in the load of the previous stages driving the inputs A and B.

The value of the combined load capacitance can be found:


Cload = Cgd , A + Cgd , B + Cgd , load + Cdb , A + Cdb , B + Csb , load + Cwire

Two-input NAND Gate

A two-input depletion-load NAND gate, its logic symbol, and the corresponding truth table.

It can easily be seen that the drain currents of all transistors in the circuit are
equal to each other.
ID , load = ID , driverA = ID , driverB

$$ \frac{k_{load}}{2}\,|V_{T,load}(V_{OL})|^2 = \frac{k_{driver,A}}{2}\left[ 2(V_{GS,A} - V_{T,A})V_{DS,A} - V_{DS,A}^2 \right] = \frac{k_{driver,B}}{2}\left[ 2(V_{GS,B} - V_{T,B})V_{DS,B} - V_{DS,B}^2 \right] $$
The gate-to-source voltages of both driver transistors can be assumed to be
approximately equal to VOH (VGS,A = VOH − VDS,B ≈ VOH, since VDS is low in the nonsaturated region).
The drain-to-source voltages of both transistors can then be solved:
$$ V_{DS,A} = V_{OH} - V_{T0} - \sqrt{(V_{OH} - V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,A}}\right) |V_{T,load}(V_{OL})|^2} $$
$$ V_{DS,B} = V_{OH} - V_{T0} - \sqrt{(V_{OH} - V_{T0})^2 - \left(\frac{k_{load}}{k_{driver,B}}\right) |V_{T,load}(V_{OL})|^2} $$
Let the two driver transistors be identical, i.e., kdriver,A=kdriver,B=kdriver. Noting that the
output voltage VOL is equal to the sum of the drain-to-source voltages of both
drivers, we obtain:


$$ V_{OL} \approx 2\left( V_{OH} - V_{T0} - \sqrt{(V_{OH} - V_{T0})^2 - \left(\frac{k_{load}}{k_{driver}}\right) |V_{T,load}(V_{OL})|^2} \right) $$


The following analysis gives a better and more accurate view of the operation of
two series-connected driver transistors. Consider two identical enhancement-type
nMOS transistors with their gate terminals connected. At this point, the only
simplifying assumption will be VT,A = VT,B = VT0. When both driver transistors are in
the linear region, the drain currents can be written as:
$$ I_{D,A} = \frac{k_{driver}}{2}\left[ 2(V_{GS,A} - V_{T0})V_{DS,A} - V_{DS,A}^2 \right] $$
$$ I_{D,B} = \frac{k_{driver}}{2}\left[ 2(V_{GS,B} - V_{T0})V_{DS,B} - V_{DS,B}^2 \right] $$
Since ID,A = ID,B, this current can also be expressed as
$$ I_D = I_{D,A} = I_{D,B} = \frac{I_{D,A} + I_{D,B}}{2} $$
Using VGS,A = VGS,B − VDS,B yields
$$ I_D = \frac{k_{driver}}{4}\left[ 2(V_{GS,B} - V_{T0})(V_{DS,A} + V_{DS,B}) - (V_{DS,A} + V_{DS,B})^2 \right] $$
Now let VGS = VGS,B and VDS = VDS,A + VDS,B. The drain-current expression can be
written as follows.
$$ I_D = \frac{k_{driver}}{4}\left[ 2(V_{GS} - V_{T0})V_{DS} - V_{DS}^2 \right] $$
Generalized NAND Structure with Multiple Inputs

The generalized NAND2 structure and its inverter equivalent


Neglecting the substrate-bias effect, and assuming that the threshold voltages of
all transistors are equal to VT0, the driver current ID can be derived:
$$ I_D = \frac{\mu_n C_{ox}}{2} \left( \frac{1}{\sum_{k(on)} \frac{1}{\left(\frac{W}{L}\right)_k}} \right) \begin{cases} \left[ 2(V_{in} - V_{T0})V_{out} - V_{out}^2 \right] & \text{linear} \\ (V_{in} - V_{T0})^2 & \text{saturation} \end{cases} $$

Hence, the (W/L) ratio of the equivalent driver transistor is


$$ \left(\frac{W}{L}\right)_{equivalent} = \frac{1}{\sum_{k(on)} \frac{1}{\left(\frac{W}{L}\right)_k}} $$
If the series-connected transistors are identical, i.e., (W/L)1 = (W/L)2 = ... = (W/L), the
width-to-length ratio of the equivalent transistor becomes
$$ \left(\frac{W}{L}\right)_{equivalent} = \frac{1}{n}\left(\frac{W}{L}\right) $$
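The equivalent-inverter rules for parallel drivers (NOR: the ratios add) and series drivers (NAND: the reciprocals add) can be written compactly. This is a small sketch with hypothetical function names; substrate-bias effects are neglected, as in the derivation.

```python
def wl_parallel(ratios):
    """Equivalent (W/L) of conducting transistors in parallel (NOR pull-down)."""
    return sum(ratios)


def wl_series(ratios):
    """Equivalent (W/L) of conducting transistors in series (NAND pull-down)."""
    return 1.0 / sum(1.0 / r for r in ratios)


# Three identical drivers with (W/L) = 4 (illustrative value).
drivers = [4.0, 4.0, 4.0]
print("parallel:", wl_parallel(drivers))  # n times the single ratio
print("series:  ", wl_series(drivers))    # single ratio divided by n
```

To keep the same pull-down strength as a reference inverter, each transistor in an n-input series stack therefore needs roughly n times the width of the inverter driver.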

Transient Analysis of NAND Gate

Parasitic device capacitances in the NAND2 gate


As in the inverter case, we can combine the capacitances into one capacitance,
connected between the output node and ground. The value of the lumped
capacitance Cload depends on the input voltage conditions.

For example, the input VA is equal to VOH and the other input VB is switching from
VOH to VOL. In this case, both the output voltage Vout and the internal node voltage
Vx will rise, resulting in:
Cload = Cgd , load + Cgd , A + Cgd , B + Cgs , A
+ Cdb , A + Cdb , B + Csb , A + Csb , load + Cwire
Note that this value is quite conservative and fully reflects the internal node
capacitances into the lumped output capacitance Cload. In reality, only a fraction
of the internal node capacitances is reflected into Cload.

9. Memory Elements and Dynamic Logic

Institute of
Microelectronic
Systems

RS Flipflop

The RS-flipflop is a bistable element with two inputs:
• Reset (R), resets the output Q to 0
• Set (S), sets the output Q to 1

Institute of
9: Memory Elements & Microelectronic
Systems 2
Dynamic Logic
RS-Flipflops
There are two ways to implement an RS-flipflop:
• based on NOR-gates: positive logic
• based on NAND-gates: negative logic


Clocked RS-Latch

To achieve synchronous operation, we can add a clock signal:
• Clock= 0: R and S have no influence upon the state of the circuit
• Clock= 1: R and S can change the state of the circuit

D-Latch

For storing data it is more convenient to have a data input. This is realized by
using the data input as set signal and the inverted data input as reset signal.

• Clock= 0: Q unchanged
• Clock= 1: Q= D


Transmission Gate D-Latch

An alternative way to build a D-latch is to use transmission gates,
thus reducing the complexity (transistor count) of the circuit.

• Load= 0: Latch stores data
• Load= 1: Latch is transparent (output= input)

Clocked JK-Latch

Another extension of a simple RS-flipflop is the JK-latch:
• J: enables/disables the low-to-high transition of the latch
• K: enables/disables the high-to-low transition of the latch


Edge Triggered Logic


If the previously presented D-latch were used in a synchronous circuit, e.g. a
counter, it would malfunction:
While the clock is low, the latches hold the state Q(n) and the feedback network
applies the state Q(n+1) to the inputs of the latches. When the clock goes high,
the latches change to the new state Q(n+1). The feedback logic now calculates the
state Q(n+2). But the clock is still high, so the latches falsely change to the
state Q(n+2).
So what we need is a storage element which changes only once per clock cycle:
this is edge-triggered logic.
Edge Triggered JK-Flipflop
A straightforward way to implement an edge-triggered JK-flipflop is
to use a master-slave flipflop.
• Clock= 1: The master (left latch) is changeable, the slave (right
latch) is locked and holds the output at the current state
• Clock= 0: The master is locked and the slave changes its state
if necessary
→ The output value is the state of the master at the falling edge of
the clock signal


Edge Triggered TG D-Flipflop

Circuitry of an edge-triggered flipflop


• Clk= 0: First stage is loaded, second stage is locked and stores data
• Clk= 1: First stage is locked, second stage is loaded
→ With the rising edge (low-to-high transition) the new value is
available at the output
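The two-stage behavior described above can be sketched as a purely behavioral model: two level-sensitive latches with opposite load conditions. The class names are hypothetical, and the model ignores all electrical effects (delays, charge storage); it only illustrates why the output changes once per rising clock edge.

```python
class Latch:
    """Level-sensitive latch: transparent while load is true, otherwise holds."""

    def __init__(self):
        self.q = 0

    def update(self, d, load):
        if load:
            self.q = d
        return self.q


class EdgeTriggeredDFF:
    """First stage is loaded while clk = 0, second stage while clk = 1."""

    def __init__(self):
        self.first = Latch()
        self.second = Latch()

    def update(self, d, clk):
        self.first.update(d, load=(clk == 0))
        return self.second.update(self.first.q, load=(clk == 1))


ff = EdgeTriggeredDFF()
# (d, clk) pairs: d = 1 is captured at the first rising edge, d = 0 at the second.
stimulus = [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]
outputs = [ff.update(d, clk) for d, clk in stimulus]
print(outputs)
```

Note that changing d while clk = 1 (third step) has no effect on the output, because the first stage is locked during that time.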

Transmission Gate JK- Flipflop

It is also possible to build a JK-flipflop with transmission gates as
an edge-triggered flipflop.
This ensures that the output state can only change at the rising edge
of the clock signal.


Dynamic D-Flipflop

Dynamic logic utilizes the parasitic capacitances of transistors and
interconnect to store the current state. This reduces the transistor
count but forbids static operation. An application of dynamic
circuits is the dynamic D-flipflop.

Dynamic Shift Register

Another application is the dynamic shift register. It also has a lower
transistor count but requires a non-overlapping two-phase clock,
which is expensive to generate.


Dynamic Chain Latch

Dynamic RAM
A special kind of memory is dynamic RAM. The major advantage is
the low transistor count: DRAM requires only one transistor and
one (small) capacitor per bit.
The first disadvantage is the destructive read. After reading a cell,
the read value must be written back to keep the data in the RAM.
The second disadvantage is the limited duration of storage. After
some milliseconds the cell must be refreshed (read and written
back).


Clocking
Clock Signal:
• used to synchronize data flow through
a digital network
⇒ clocked static or dynamic circuits
• problems: clock skew (delay caused by
clock distribution wires)

Condition for nonoverlapping clock signals φ1(t) and φ2(t):
$$ \phi_1(t)\,\phi_2(t) = 0 \quad \forall t $$
Figure: ideal nonoverlapping 2-phase clocks
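The nonoverlap condition can be checked on sampled clock waveforms. The sketch below assumes that both phases are given as equally long lists of 0/1 samples taken at the same time points; the function name and the sample waveforms are made up for illustration.

```python
def non_overlapping(phi1, phi2):
    """True if phi1(t) * phi2(t) = 0 at every sampled time point."""
    return all(a * b == 0 for a, b in zip(phi1, phi2))


# Hypothetical sampled waveforms with a gap between the two phases.
phi1 = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
phi2 = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0]
print(non_overlapping(phi1, phi2))

# A skewed copy of phi2 that arrives two samples early overlaps with phi1.
phi2_skewed = phi2[2:] + phi2[:2]
print(non_overlapping(phi1, phi2_skewed))
```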


Basic 2-phase clocking

Single and Multiple Clock Signals

Single clock 2-phase timing

⇒ For nonoverlapping clock phases φ and φ̄, fine-tuned and well-designed
delay lines (realized as transmission gates) have to be inserted in order to
avoid overlapping of φ and φ̄.


Generation of inverted clock phase

TG delay circuit

Pseudo 2-φ clocking


Clocked Static Logic


⇒ Synchronized data transfer

Shift register

1) Upper Frequency Limitation: Charging and Discharging Times

Clocked shift register circuit

Time constant for charging and discharging:
$$ \tau_{TG} = R_{TG} C_L $$
where
$$ C_L = C_{TG} + C_{in} + C_{line}, \qquad C_{in} = C_{ox}\left[ (WL)_n + (WL)_p \right] $$
VA = VDD (Vin(0) = 0):
$$ V_{in}(t) \cong V_{DD}\left[ 1 - e^{-t/\tau_{TG}} \right] $$
The inverter switches when Vin = VIH, which occurs after
$$ t_1 \cong -\tau_{TG} \ln\left[ 1 - \frac{V_{IH}}{V_{DD}} \right] $$
VA = 0 (Vin(0) = VDD):
$$ V_{in}(t) \cong V_{DD}\, e^{-t/\tau_{TG}} $$
The time until Vin reaches VIL is given by
$$ t_0 \cong \tau_{TG} \ln\left[ \frac{V_{DD}}{V_{IL}} \right] $$
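The two switching times follow directly from the exponential charging and discharging curves. The following sketch evaluates them for hypothetical values of RTG, CL, VIH, and VIL; none of these numbers are taken from the lecture.

```python
import math


def t_charge(tau, v_ih, v_dd):
    """Time for Vin(t) = VDD*(1 - exp(-t/tau)) to rise from 0 to VIH."""
    return -tau * math.log(1.0 - v_ih / v_dd)


def t_discharge(tau, v_il, v_dd):
    """Time for Vin(t) = VDD*exp(-t/tau) to fall from VDD to VIL."""
    return tau * math.log(v_dd / v_il)


# Hypothetical values: R_TG = 10 kOhm, C_L = 50 fF, VDD = 5 V, VIH = 3 V, VIL = 2 V.
tau_tg = 10e3 * 50e-15
t1 = t_charge(tau_tg, v_ih=3.0, v_dd=5.0)
t0 = t_discharge(tau_tg, v_il=2.0, v_dd=5.0)
print(f"tau = {tau_tg * 1e9:.2f} ns, t1 = {t1 * 1e9:.3f} ns, t0 = {t0 * 1e9:.3f} ns")
```

The larger of the two times limits how short a clock phase can be, which is the origin of the upper frequency limit.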


2) Lower Frequency Limitation: Charge Leakage

Leakage path in a CMOS TG

The load capacitance seen by the transmission gate (TG) is
$$ C_L = C_{TG} + C_{line} + C_{in} $$
The depletion capacitance contributions to CL are due to the reverse-biased pn
junctions in the MOS transistors. As shown in the figure above, a leakage current
flows across the reverse-biased pn junctions. The influence of this leakage
current on the charge stored in CL depends on the values of ILp and ILn.

Charge leakage problem in CMOS TG

With
$$ I_L = I_{Ln} - I_{Lp} $$
the leakage current influence on Vin is given by
$$ C_L \frac{dV_{in}}{dt} = -I_L $$
If ILp > ILn the capacitance is charged by the leakage current; otherwise it is
discharged, or remains constant when the ideal condition ILp = ILn holds.
$$ \frac{dQ_{store}}{dt} = I_{Lp} - I_{Ln}, \qquad C_{store} = \frac{dQ_{store}}{dV} $$
Assuming that the leakage currents ILp and ILn are constant and that the node
charge-voltage relation is linear, of the form
$$ Q_{store} = C_{store} V $$

it follows (because Cstore is constant) that
$$ C_{store} \frac{dV}{dt} = I_{Lp} - I_{Ln} $$
The solution of this equation is
$$ V(t) = \frac{I_{Lp} - I_{Ln}}{C_{store}}\, t + V(0) $$
If ∆V is the maximum allowed voltage change:
$$ t_{max} = \frac{C_{store}\,\Delta V}{|I_L|} $$

Charge leakage circuit


With Tmax = 2 tmax (the longest allowed clock period), the minimum frequency follows as
$$ f_{min} \cong \frac{1}{2\,t_{max}} \cong \frac{I_L}{2\,C_{store}\,\Delta V} $$
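The lower frequency limit is easy to evaluate once the net leakage current, the storage capacitance, and the tolerable voltage change are known. The sketch below uses made-up values for all three quantities and assumes a constant leakage current, as in the derivation.

```python
def t_max(c_store, delta_v, i_leak):
    """Longest hold time before the node voltage drifts by delta_v."""
    return c_store * delta_v / abs(i_leak)


def f_min(c_store, delta_v, i_leak):
    """Minimum clock frequency, with Tmax = 2 * tmax."""
    return 1.0 / (2.0 * t_max(c_store, delta_v, i_leak))


# Hypothetical values: C_store = 50 fF, delta_V = 1 V, net leakage I_L = 1 pA.
hold = t_max(50e-15, 1.0, 1e-12)
freq = f_min(50e-15, 1.0, 1e-12)
print(f"tmax = {hold * 1e3:.1f} ms, fmin = {freq:.1f} Hz")
```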

The transmission gate capacitance is
$$ C_T \cong C_G + C_{line} + C_{ols} + C_{old} + C_{SBp}(V) + C_{DBn}(V) $$
Figure: transmission gate capacitance

So the storage capacitance can be estimated by voltage averaging of this
expression:
$$ C_{store} \cong C_G + C_{line} + C_{ols} + C_{old} + K(0, V_{DD})\left[ C_{SBp} + C_{DBn} \right] $$
For a realistic analysis of the charge leakage problem, the dependence of the
leakage currents on the reverse bias voltage has to be taken into consideration.


Charge Sharing

Basic charge sharing circuit

t < 0 (TG switched off):
$$ V_1(t<0) = V_{DD}, \qquad V_2(t<0) = 0, \qquad Q_T = C_1 V_{DD} $$
t > 0 (TG switched on):
$$ Q_T = (C_1 + C_2) V_f $$
$$ V_f = V_1(t>0) = V_2(t>0) = \frac{C_1}{C_1 + C_2} V_{DD} = \frac{1}{1 + (C_2/C_1)} V_{DD} $$
If we design a circuit with C1 = C2, then Vf = VDD/2, i.e. the voltage drops to half
its initial value. A reliable forward transfer of a logic 1 state from C1 to C2
requires C1 >> C2 to ensure that Vf ≈ VDD.
Let us specify arbitrary initial conditions V1(0) and V2(0) on the capacitors,
giving the system a total charge of
$$ Q_T = C_1 V_1(0) + C_2 V_2(0) $$
Applying basic circuit analysis gives the time-dependent voltages as
$$ V_1(t) = V_2(0) + \left[ V_1(0) - V_2(0) \right] \frac{C_1 + C_2\, e^{-t/\tau}}{C_1 + C_2} $$
$$ V_2(t) = V_2(0) + \left[ V_1(0) - V_2(0) \right] \left( \frac{C_1}{C_1 + C_2} \right) \left[ 1 - e^{-t/\tau} \right] $$
where the time constant is given by
$$ \tau = R_{TG} C_{eq} \quad \text{with} \quad C_{eq} = \frac{C_1 C_2}{C_1 + C_2} $$
In the limit t → ∞, V1 = V2 = Vf:
$$ V_f = \frac{C_1}{C_1 + C_2} V_1(0) + \frac{C_2}{C_1 + C_2} V_2(0) $$


This agrees with the result from simple charge conservation by noting that the
final charge distributes according to
QT = ( C1 + C 2 )Vf
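The transient behavior of two capacitors connected through the TG resistance can be sketched as follows. The expressions are written in the form that starts at the initial conditions and conserves the total charge at every instant; the function name and all numeric values are assumptions for illustration.

```python
import math


def charge_sharing(t, c1, c2, v1_0, v2_0, r_tg):
    """Voltages on C1 and C2 at time t after the TG (resistance r_tg) closes."""
    tau = r_tg * c1 * c2 / (c1 + c2)   # tau = R_TG * C_eq
    decay = math.exp(-t / tau)
    dv = v1_0 - v2_0
    v1 = v2_0 + dv * (c1 + c2 * decay) / (c1 + c2)
    v2 = v2_0 + dv * (c1 / (c1 + c2)) * (1.0 - decay)
    return v1, v2


# Hypothetical values: C1 = 100 fF charged to 5 V, C2 = 25 fF at 0 V, R_TG = 10 kOhm.
C1, C2, R = 100e-15, 25e-15, 10e3
for t in (0.0, 0.2e-9, 1.0e-9, 10e-9):
    v1, v2 = charge_sharing(t, C1, C2, 5.0, 0.0, R)
    print(f"t = {t * 1e9:5.1f} ns: V1 = {v1:.3f} V, V2 = {v2:.3f} V")
```

For large t both voltages approach C1/(C1 + C2) · VDD, which is 4 V for the values assumed here.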

Transient voltage behavior for initial conditions of V1(0)=VDD and V2(0)=0

Charge sharing among N TG-connected capacitors

Initial charge:
$$ Q_T = \sum_{i=1}^{N} C_i V_i(0) $$
After connecting nodes:
$$ Q_T = \left( \sum_{i=1}^{N} C_i \right) V_f $$
Final voltage:
$$ V_f = \frac{\sum_{i=1}^{N} C_i V_i(0)}{\sum_{i=1}^{N} C_i} $$


Dynamic Logic
• The pull-up (pull-down) network of static CMOS is replaced by a single precharge
(discharge) transistor.
The remaining network then conditionally discharges (charges up) the output
in a second operation phase
• One logic level is held by dynamic charge storage
• Transistor count is reduced from 2n (static CMOS) to n+2 for dynamic
precharged CMOS (but now: 2 phases of operation)

Dynamic nMOS Inverter (Single clock, 2 phases)

Basic dynamic nMOS inverter


Precharge Phase

If Vin = 0 then
$$ \tau_{ch} = \frac{C_{out}}{\beta_p (V_{DD} - |V_{Tp}|)} = R_p C_{out} $$
Worst case (Vin = VDD):
$$ \tau_{ch,max} = R_p (C_{out} + C_n) $$
$$ t_{ch,max} = \tau_{ch,max} \left[ \frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left( \frac{2(V_{DD} - |V_{Tp}|)}{V_0} - 1 \right) \right] $$
Figure: dynamic nMOS inverter, precharge and evaluate

Evaluation Phase

If M1 is switched on and M1 and Mn are designed with the same channel width W,
the discharge time constant is given by
$$ \tau_{dis} = \frac{(L_1 + L_n)\, C_{out}}{k'_n W (V_{DD} - V_{Tn})} $$

Precharge network for worst case

Evaluation discharge network

$$ t_{dis} = \tau_{dis} \left[ \frac{2 V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left( \frac{2(V_{DD} - V_{Tn})}{V_0} - 1 \right) \right] $$

Maximum clock frequency
$$ t_M = \max(t_{ch,max},\, t_{dis}) $$
$$ f_{max} \cong \frac{1}{2\,t_M} $$
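The precharge and discharge times share the same functional form, so one helper is enough to estimate the maximum clock frequency. The sketch below uses hypothetical time constants and voltages, and passes the threshold voltage as a magnitude for both the pMOS and the nMOS case.

```python
import math


def switching_time(tau, v_dd, v_t, v_0):
    """tau * [2*VT/(VDD - VT) + ln(2*(VDD - VT)/V0 - 1)], with VT as a magnitude."""
    return tau * (2.0 * v_t / (v_dd - v_t) + math.log(2.0 * (v_dd - v_t) / v_0 - 1.0))


def f_max(t_ch_max, t_dis):
    """Each clock half period must cover the slower of precharge and discharge."""
    return 1.0 / (2.0 * max(t_ch_max, t_dis))


# Hypothetical values: VDD = 5 V, |VTp| = VTn = 1 V, V0 = 0.5 V.
t_ch = switching_time(tau=0.10e-9, v_dd=5.0, v_t=1.0, v_0=0.5)
t_dis = switching_time(tau=0.15e-9, v_dd=5.0, v_t=1.0, v_0=0.5)
print(f"tch,max = {t_ch * 1e9:.3f} ns, tdis = {t_dis * 1e9:.3f} ns")
print(f"fmax = {f_max(t_ch, t_dis) / 1e9:.2f} GHz")
```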

Dynamic pMOS Inverter

φ=1 Precharge
φ=0 Evaluate

Basic dynamic pMOS inverter

Dynamic CMOS Properties and Conditions


• single phase clock
• input should change during precharge only
• input must be stable at the end of the precharge phase
• in the evaluation phase the output remains HIGH (LOW) or is optionally
discharged (charged)
Complex Logic

Complex dynamic logic


Dynamic Cascades
pMOS blocks and nMOS blocks have to alternate in a cascade in order to avoid
glitches

Cascaded nMOS-nMOS glitch problem

Dynamic cascades
Domino CMOS Logic

Basic domino logic circuit


• Domino logic: design method for glitch-free cascading of nMOS logic blocks
• Each stage is driven by φ
- Precharge during φ = 0
- Evaluation when φ = 1
• A domino logic block consists of a precharge/evaluation block and an output
inverter
Precharge phase: The gate output is precharged to logic 1 and the inverter output
goes to logic 0. Logic transmission errors are avoided by providing a
logic 0 at the inverter output (avoiding discharge of the next logic stage).
Evaluation phase: Depending on the actual input values, the inverter output stays
at logic 0 or is set to logic 1. The correct result signal is available at the
end of the domino cascade after all stages have stabilized.

Domino AND gate

Cascaded domino logic


Visualization of domino effect

Domino timing
Cascaded domino circuit with fanout = 2


Domino Logic Properties

Cascaded domino logic

• Domino logic consists of either n-type or p-type blocks
• small load capacitance to be driven by the logic (one inverter only) ⇒ small
transistor dimensions
• only one clock signal required
• only positive logic realizations possible because of the inverter in every stage ⇒ domino
logic is noninverting
Functions as

cannot be directly realized in a domino chain


Analysis

Domino AND4 gate

CX=C0+CT. C0 represents the capacitance due to M0, while CT is the total of all
other contributions.


Precharge (φ = 0: Mp1 conducting, Mn1 in cutoff)

Mp1 conducting → CX is charged → VX > VIH (= logic 1)
Minimum precharge time, starting from VX(0) = 0:
$$ t_{ch} \cong \tau_{ch} \left[ \frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left( \frac{2(V_{DD} - |V_{Tp}|)}{V_{DD} - V_{IH}} - 1 \right) \right] $$
$$ \tau_{ch} = \frac{C_X}{\beta_p (V_{DD} - |V_{Tp}|)} $$
$$ C_X = C_0 + C_T \cong (C_{GDn1} + C_{BDn1}) + (C_{GDp1} + C_{BDp1}) + C_G + C_{line} $$

Evaluate
If all inputs Ai are set to logic 1, the worst-case delay time can be estimated by
$$ t_D \cong R_n C_n + (R_n + R_3) C_3 + (R_n + R_3 + R_2) C_2 + (R_n + R_3 + R_2 + R_1) C_1 + (R_n + R_3 + R_2 + R_1 + R_0) C_X $$
with
$$ R_j \cong \frac{1}{k'_n (W/L)_j (V_{DD} - V_{Tn})} $$
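The worst-case delay expression has the structure of an RC-ladder sum: every node capacitance is multiplied by the total resistance between that node and ground. A minimal sketch follows, with a hypothetical function name and made-up element values; the lists are ordered from the ground side (Rn, Cn) up to the output node (R0, CX).

```python
def domino_delay(resistances, capacitances):
    """Sum over nodes of (resistance from ground up to the node) * (node capacitance).

    Both lists are ordered from the grounded evaluation transistor upward.
    """
    total = 0.0
    r_path = 0.0
    for r, c in zip(resistances, capacitances):
        r_path += r          # resistance accumulated from ground to this node
        total += r_path * c  # contribution of this node
    return total


# Hypothetical AND4 values: Rn, R3, R2, R1, R0 in ohms and Cn, C3, C2, C1, CX in farads.
R = [2e3, 2e3, 2e3, 2e3, 2e3]
C = [5e-15, 5e-15, 5e-15, 5e-15, 30e-15]
print(f"tD = {domino_delay(R, C) * 1e9:.3f} ns")
```

Because CX sees the full stack resistance, the output node usually dominates the sum, which is one reason long series stacks are avoided.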
Charge Leakage and
Charge Sharing

Domino stage with pull-up MOSFET



Cout,1>>Cx1+Cx2

Charge sharing in a domino chain

Use of feedback to control a pull-up MOSFET for charge sharing problem


NORA Logic
(NORA = NO RAce)

NORA Properties
• NORA is very insensitive to clock delay
• one clock signal and the inverted clock signal with short rise and fall times are
sufficient
• no inverter is needed between the logic stages, because of alternate use of
n-type and p-type blocks
• the last stage is a clocked inverter, a C2MOS latch
• ideal to clock pipelined logic systems

The Signal Race Problem

Signal race problem

The signal race problem: a signal race can arise when both
transmission gates conduct at the same time. If the new input from TG1 reaches
the input of TG2 while TG2 is still transmitting the output, the output information
will be lost. Imperfect TG synchronization occurs because of normal transmission
intervals or clock skew.


tp >> tr, tf → no problems
tskew = tp → race result critical

Figure: clock skew

φ=0 Precharge
φ=1 Evaluate

Accept data when φ=0, hold data when φ=1

Dynamic latch operation


NORA Structuring

clk2

NORA structuring

NORA φ and φ̄ sections

φ=1 Precharge
φ=0 Evaluate

C2MOS latch

NORA pipelined logic


φ = 0: φ section: P, P, latch locked; φ̄ section: E, E, latch transparent
φ = 1: φ section: E, E, latch transparent; φ̄ section: P, P, latch locked
(P = precharge, E = evaluate)

Figure: NORA φ and φ̄ sections

C²MOS latch: locked during the clock skew period!

Effect of clock skew:
• The duration of the initial value of the evaluation phase (VDD) is extended.
• And the other way round (node precharged to 0 V): the duration for which the
logical output value is provided to the next stage is eventually extended.
10. Performance

Institute of
Microelectronic
Systems

Summary

Interconnect Parameters: Capacitance, Resistance, Inductance


Electrical Wire Models
• Lumped C model
• Lumped RC model
• RC chain model
• Distributed RC line model
• Transmission line model
Technology Scaling
Power and Clock Distribution
Input Protection Circuits
Static Gate Sizing
Off-Chip Driver Circuits
Packaging Technology

Institute of
Microelectronic
10: Performance Systems 2
Interconnect Parameters

Interconnection choices in an actual CMOS process:


• multiple layers of Aluminum (up to 7)
• polysilicon layer (at least one)
• possibility of using the heavily doped n+ and p+ layers
The wiring forms a complex geometry that introduces parasitics:
• capacitive
• resistive
• inductive
Parasitic effects reduce the performance and the reliability by:
• increasing the propagation delay
• affecting the energy dissipation and the power distribution
• introducing extra noise sources


Modern Interconnect

Full Wire Model
Assume that all wires in a bus network are implemented in a single interconnect layer (Al),
isolated from the silicon substrate and from each other by a layer of dielectric material (SiO2):

Schematic view
Physical view

Full wire circuit model:


• Consider parasitic capacitance, resistance and inductance
• Parasitics are distributed over the length of the wire
• Inter-wire parasitics: coupling effects


Simplified (Only Capacitance) Wire Model

A simplified capacitance-only model can be used if:


• the wires are short
• the wire cross-section is large or the wire material has a
low resistivity (small resistance)

Other simplified models can be obtained


1) Neglecting the inductive effects, valid when:
• the resistance of the wire is large (long Al wires with a
small cross-section)
• trise and tfall of the signals are large (slow signals)
2) Neglecting the inter-wire capacitance, valid when:
• the separation between neighboring wires is large
• the wires run together for a short distance

Wire Parallel-Plate Capacitance
The capacitance of a wire is a function of:
• shape of the wire
• environment
• distance to substrate
• distance to surrounding wires
Simple model, the parallel-plate capacitance:
$$ C_{wire} = C_{pp} = \frac{\varepsilon_{ox}}{t_{ox}}\, W L $$
Cwire is the total capacitance of the wire (pF).
Figure: wire of width W, length L, and height H over the substrate, separated by SiO2 of thickness tox, with the current flow direction and the electrical-field lines indicated.

True for W >> tox ⇒ electric field lines are orthogonal to the capacitor plates
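As a quick numeric illustration of the parallel-plate model, the sketch below computes the capacitance of a wire over SiO2. The relative permittivity of 3.9 for SiO2 and the wire dimensions are assumed values; they are not given in the slides.

```python
EPS_0 = 8.854e-12        # vacuum permittivity in F/m
EPS_R_SIO2 = 3.9         # assumed relative permittivity of SiO2


def parallel_plate_capacitance(width, length, t_ox, eps_r=EPS_R_SIO2):
    """C_pp = (eps_ox / t_ox) * W * L, all dimensions in meters."""
    return eps_r * EPS_0 * width * length / t_ox


# Hypothetical wire: W = 1 um, L = 1 mm, t_ox = 1 um.
c_pp = parallel_plate_capacitance(width=1e-6, length=1e-3, t_ox=1e-6)
print(f"C_pp = {c_pp * 1e15:.1f} fF")
```

Since this model ignores fringing fields, it underestimates the capacitance of narrow wires.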


Wire Fringing Capacitance


• Advanced processes have a reduced W/H ratio (< 1)
• The capacitance between the side walls of the wire and the substrate (fringing capacitance)
must be considered!
Figure: wire cross-section (width W, height H, oxide thickness tOX) decomposed into a parallel-plate part of width W − H/2 (Cpp) and a fringing part (Cfringe).

$$ c_{wire} = c_{pp} + c_{fringe} $$
$$ c_{wire} \approx \frac{(W - H/2)\,\varepsilon_{ox}}{t_{ox}} + \frac{2\pi\,\varepsilon_{ox}}{\log(t_{ox}/H)} $$
cwire is the wire capacitance per unit length (pF/cm)
For large W/H: cfringe < cpp, cwire ≈ cpp
For W/H < 1.5 ⇒ cfringe > cpp
Interwire Capacitance
In multilevel interconnect technologies the wires are not completely isolated

Each wire is coupled to the:
• substrate (grounded capacitor)
• neighboring wires on the same layer (floating capacitor)
• neighboring wires on adjacent layers (floating capacitor)

[Figure: wires on Level1 and Level2 with the capacitances Cfringe and Cparallel]

Assuming that oxide thickness (tox = 1µm) and metal thickness (H = 1µm) are held constant while scaling the
other dimensions ⇒ for W < 1.75H, Cinterwire dominates!


Wiring Capacitances

Layer  Parameter          Field  Active  Poly  Al1  Al2  Al3  Al4

Poly   Cplate (aF/µm2)    88
       Cfringe (aF/µm)    54

Al1    Cplate (aF/µm2)    30     41      57
       Cfringe (aF/µm)    40     47      54

Al2    Cplate (aF/µm2)    13     15      17    36
       Cfringe (aF/µm)    25     27      29    45

Al3    Cplate (aF/µm2)    8.9    9.4     10    15   41
       Cfringe (aF/µm)    18     19      20    27   49

Al4    Cplate (aF/µm2)    6.5    6.8     7     8.9  15   35
       Cfringe (aF/µm)    14     15      15    18   27   45

Al5    Cplate (aF/µm2)    5.2    5.4     5.4   6.6  9.1  14   38
       Cfringe (aF/µm)    12     12      12    14   19   27   52

Plate and fringe capacitance values for a typical 0.25 µm CMOS process
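
The table values can be combined into a wire capacitance estimate. The sketch below assumes that the plate value applies to the wire area and that the fringe value applies to both edges of the wire (hence the factor 2); the wire dimensions are example values and the function name is illustrative.

```python
# Table entries for an Al1 wire over field oxide (0.25 um CMOS table above)
C_PLATE_AL1_FIELD = 30.0    # aF/um^2
C_FRINGE_AL1_FIELD = 40.0   # aF/um

def wire_cap_fF(width_um, length_um, c_plate, c_fringe):
    """Plate part over the wire area plus fringe part along both edges (assumption)."""
    plate_aF = c_plate * width_um * length_um
    fringe_aF = 2.0 * c_fringe * length_um
    return (plate_aF + fringe_aF) * 1e-3   # aF -> fF

# Assumed example: 1 um wide, 1000 um long Al1 wire over field oxide
c_total_fF = wire_cap_fF(1.0, 1000.0, C_PLATE_AL1_FIELD, C_FRINGE_AL1_FIELD)
print(f"C_wire = {c_total_fF:.0f} fF")
```

For this narrow wire the fringe part exceeds the plate part, in line with the statement that fringing dominates for small W/H.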

Wire Resistance
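$R = \dfrac{\rho L}{H W} = R_\square \dfrac{L}{W}$

$R_\square = \rho / H$ - sheet resistance

[Figure: two square wire segments of equal height H but different side length have the same resistance: R1 ≡ R2]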
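
A short numeric sketch of the sheet-resistance formulation follows. The aluminum resistivity (2.7e-8 Ωm) and the wire height of 1 µm are assumed example values; the function names are illustrative.

```python
RHO_AL = 2.7e-8   # assumed resistivity of aluminum in ohm*m

def sheet_resistance(rho, height_m):
    """R_square = rho / H, in ohms per square."""
    return rho / height_m

def wire_resistance(r_sheet, length, width):
    """R = R_square * L / W; only the ratio L/W (number of squares) matters."""
    return r_sheet * length / width

r_sq = sheet_resistance(RHO_AL, 1e-6)          # 1 um thick Al layer (assumed)
r_wire = wire_resistance(r_sq, 1000.0, 1.0)    # 1000 um long, 1 um wide
print(f"R_square = {r_sq:.3f} ohm/sq, R_wire = {r_wire:.1f} ohm")
```

Because only L/W enters, any square of the same layer has the same resistance, which is the statement R1 ≡ R2.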


Dealing With Resistance

• Selective technology scaling


• Use better interconnect materials (silicides,
bypasses)
• More interconnect layers (reduce average wire
length)

Polycide gate MOSFET

Silicides: WSi2, TiSi2, PtSi2, TaSi


Conductivity: 8-10 times better than Poly

Other Resistive Effects
(1) Contact resistance
• Extra resistance added by transitions between routing layers
• Can be reduced by making the contact holes larger
• Current crowding puts an upper limit on the useful size of the contact

(2) Skin effect
• High-frequency (GHz) currents tend to flow on the surface of a conductor
• Resistance becomes frequency-dependent (it increases as the frequency increases)
• Affects only wider wires

(3) Electromigration
• Limits the DC current to 1mA/µm


Wire inductance
At switching frequencies in the GHz range the wire inductance must be considered
A changing current passing through an inductor generates a voltage drop: $\Delta v = L\,\dfrac{di}{dt}$
On-chip inductance effects are:
• reflection of signals due to impedance mismatch
• inductive coupling between lines
• ringing effects
• switching noise due to L di/dt voltage drops
It is possible to compute the wire inductance directly from its geometry and its environment
A simpler approximation is given by the following relation:
$c\,l = \varepsilon\mu$
where c is the capacitance per unit length, l the inductance per unit length, ε the electric permittivity and
µ the magnetic permeability of the surrounding dielectric
Ex.: in a 0.25 µm technology, a 0.4 µm wide Al wire routed on top of the field oxide (SiO2) has
c = 92aF/µm, l = 0.47pH/µm
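
The example value for l can be reproduced from the relation c·l = εµ. This is a minimal sketch assuming εr = 3.9 for SiO2 and µr = 1; the function name is illustrative.

```python
import math

EPS0 = 8.854e-12            # vacuum permittivity in F/m
MU0 = 4 * math.pi * 1e-7    # vacuum permeability in H/m

def inductance_per_length(c_per_m, eps_r, mu_r=1.0):
    """Inductance per unit length (H/m) from c * l = eps * mu."""
    return eps_r * EPS0 * mu_r * MU0 / c_per_m

c = 92e-18 / 1e-6                          # 92 aF/um expressed in F/m
l = inductance_per_length(c, eps_r=3.9)    # assumed eps_r of SiO2
l_pH_per_um = l * 1e12 * 1e-6              # H/m -> pH/um
print(f"l = {l_pH_per_um:.2f} pH/um")
```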

Example: Intel 0.25 micron Process


The Lumped C Model

Conditions:
• resistive component of the wire is small
• consider only the capacitive component
• switching frequencies are in medium range
The wire still represents an equipotential region and does not introduce any delay
The distributed capacitance is lumped into a single capacitor
The only impact on performance:
• loading effect of Clumped on the driving gate

The Lumped RC Model

Metal wires a few mm long have a significant resistance, and the equipotential assumption is
no longer adequate!
New model:
• Lumps the total resistance of the wire into a single resistor R
• Combines the global capacitance of the wire into a single capacitor C
The estimated wire delay: τ = RC
This model is pessimistic and inaccurate for long interconnect wires!


The Elmore Delay

Consider the following RC-tree network:


• the network has a single input node (s)
• all capacitors are between a node and the ground
• the network does not contain any resistive loops

The shared path resistance Rik is the resistance shared among the paths from the source node s to the nodes k and i:

$R_{ik} = \sum R_j$, where $R_j \in [\,path(s \to i) \cap path(s \to k)\,]$

Ex.: Ri4 = R1 + R3; Ri2 = R1

Assume that each node of the network is initially discharged and a step input is applied at t=0
The Elmore delay at node i, for a network with N nodes, is given by:

$\tau_{Di} = \sum_{k=1}^{N} C_k R_{ik}$

Ex.: τDi = R1C1 + R1C2 + (R1 + R3)C3 + (R1 + R3)C4 + (R1 + R3 + Ri)Ci
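
The Elmore sum can be sketched in a few lines. The tree topology below (node 1 fed by R1 from the source, nodes 2 and 3 branching from node 1, nodes 4 and i branching from node 3) is an assumption inferred from the example expression above, since the figure is not reproduced here; all element values are arbitrary.

```python
def elmore_delay(parent, res, cap, target):
    """Elmore delay at `target`: sum over all nodes k of C_k * R_(target,k).

    parent[n] is the upstream node of n (None for the node fed by the source),
    res[n] is the resistor between n and its parent, cap[n] the capacitor at n.
    """
    def path(node):
        # Nodes on the path source -> node; each node stands for the resistor feeding it
        nodes = set()
        while node is not None:
            nodes.add(node)
            node = parent[node]
        return nodes

    target_path = path(target)
    delay = 0.0
    for k in cap:
        shared = target_path & path(k)            # resistors common to both paths
        delay += cap[k] * sum(res[n] for n in shared)
    return delay

# Assumed topology and arbitrary values (ohms, farads scaled to 1)
parent = {"1": None, "2": "1", "3": "1", "4": "3", "i": "3"}
res = {"1": 1.0, "2": 2.0, "3": 3.0, "4": 4.0, "i": 5.0}
cap = {"1": 1.0, "2": 2.0, "3": 3.0, "4": 4.0, "i": 5.0}

tau_i = elmore_delay(parent, res, cap, "i")
print(tau_i)
```

With these values the result equals R1C1 + R1C2 + (R1+R3)C3 + (R1+R3)C4 + (R1+R3+Ri)Ci evaluated by hand.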

The RC Chain Model
RC chain - a special case of the RC-tree network:
[Figure: RC chain from Vin through R1, R2, ..., Ri, ... to node N (output VN), with capacitors C1, C2, ..., Ci, ... from each node to ground]

$\tau_{DN} = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j = \sum_{i=1}^{N} C_i R_{ii}$

Ex.: τDi = C1R1 + C2(R1 + R2) + ... + Ci(R1 + ... + Ri)

Assume that a wire of length L is modeled by N equal-length segments, each having Ri = rL/N
and Ci = cL/N (r, c are resistance and capacitance per unit length)

$\tau_{DN} = \left(\dfrac{L}{N}\right)^{2}(rc + 2rc + \dots + Nrc) = rcL^{2}\,\dfrac{N(N+1)}{2N^{2}} = RC\,\dfrac{N+1}{2N}$

For N large, the RC chain model approaches the distributed RC line model:

$\tau_{DN} = \dfrac{RC}{2} = \dfrac{rcL^{2}}{2}$

(1) The delay of a wire is a quadratic function of its length
(2) The delay of the RC chain model is 1/2 of the delay predicted by the lumped RC model!
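
The convergence towards RC/2 can be checked numerically. The sketch below sums the chain term by term and compares it with the closed form; the function names and the normalized values r = c = L = 1 are chosen for illustration only.

```python
def chain_elmore(r_per_len, c_per_len, length, n):
    """Elmore delay at the end of a wire modeled by n equal RC segments."""
    r_seg = r_per_len * length / n
    c_seg = c_per_len * length / n
    delay = 0.0
    r_upstream = 0.0
    for _ in range(n):
        r_upstream += r_seg            # total resistance between source and this node
        delay += c_seg * r_upstream    # contribution C_i * R_ii
    return delay

def chain_closed_form(r_per_len, c_per_len, length, n):
    """RC * (N + 1) / (2N) with R = r*L and C = c*L."""
    return r_per_len * c_per_len * length ** 2 * (n + 1) / (2 * n)

for n in (1, 2, 10, 100):
    print(n, chain_elmore(1.0, 1.0, 1.0, n))
```

For n = 1 the result is the lumped RC estimate; for growing n it approaches half of that value.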


The Distributed RC Line Model (1)


L - total length of the wire
r - resistance per unit length
c - capacitance per unit length

The voltage at node i is given by the following partial differential equation:

$c\,\Delta L\,\dfrac{\partial V_i}{\partial t} = \dfrac{(V_{i+1} - V_i) - (V_i - V_{i-1})}{r\,\Delta L}$

For ∆L -> 0, we obtain the diffusion equation:

$rc\,\dfrac{\partial V}{\partial t} = \dfrac{\partial^{2} V}{\partial x^{2}}$

V - the voltage at a particular point in the wire
x - the distance between this point and the signal source

The diffusion equation is difficult to use for circuit analysis

However, the distributed RC line can be approximated by a lumped RC chain network, and:

$\tau(out) = \dfrac{rcL^{2}}{2}$
The Distributed RC Line Model (2)

• The step input waveform diffuses from the start to the end of the wire
• The waveform rapidly degrades: delay for long wires

Voltage range      Lumped RC network   Distributed RC network
0 → 50% (tp)       0.69RC              0.38RC
0 → 63% (τ)        RC                  0.5RC
10% → 90% (tr)     2.2RC               0.9RC
0 → 90%            2.3RC               RC

Step response of lumped and distributed RC networks: points of interest


Institute of
Microelectronic
10: Performance Systems 21

Figures of Merit for RC Interconnect


Criteria:
(1) RC delays should be considered if the interconnect delay, tp(RC), is comparable to the driving
gate delay, tp(gate). A critical length can be defined using the step-response table from slide
21 (The Distributed RC Line Model (2)), by setting 0.38 rc L² equal to tp(gate):

$L_{crit} = \sqrt{\dfrac{t_{p(gate)}}{0.38\,rc}}$

where r and c are the wire resistance and capacitance per unit length; Lcrit depends upon the
sizing of the driving gate and the chosen interconnect material

(2) A distributed RC model should be used if the rise (fall) time at the line input is smaller
than the rise (fall) time of the line:

$t_r < RC$

Otherwise, a simple lumped C model suffices.

Example: Driving an RC line (Rs - source resistance of the driver; Rw = rw L, Cw = cw L)

$\tau_D = R_s C_w + \dfrac{R_w C_w}{2} = R_s C_w + 0.5\,r_w c_w L^{2}$

$t_p = 0.69\,R_s C_w + 0.38\,R_w C_w$
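
The two delay expressions and the critical length can be sketched as follows. The driver resistance, the per-length wire parameters and the gate delay used in the example call are assumed values, not data from the lecture; the critical length follows from equating 0.38·r·c·L² with the gate delay.

```python
import math

def driven_line_delays(r_s, r_per_len, c_per_len, length):
    """Return (tau_D, t_p) for a driver with source resistance r_s and an RC wire."""
    r_w = r_per_len * length
    c_w = c_per_len * length
    tau_d = r_s * c_w + 0.5 * r_w * c_w            # Elmore time constant
    t_p = 0.69 * r_s * c_w + 0.38 * r_w * c_w      # 50% propagation delay
    return tau_d, t_p

def critical_length(t_p_gate, r_per_len, c_per_len):
    """Wire length at which 0.38 * r * c * L^2 equals the gate delay."""
    return math.sqrt(t_p_gate / (0.38 * r_per_len * c_per_len))

# Assumed example: 1 kOhm driver, r = 0.075 ohm/um, c = 0.11 fF/um, 5 mm wire
tau_d, t_p = driven_line_delays(1e3, 0.075, 0.11e-15, 5000.0)
l_crit_um = critical_length(100e-12, 0.075, 0.11e-15)   # assumed 100 ps gate delay
print(f"tau_D = {tau_d * 1e12:.0f} ps, t_p = {t_p * 1e12:.0f} ps, L_crit = {l_crit_um:.0f} um")
```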


Transmission Lines
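When the inductance of the wire dominates the delay behavior - transmission line effects!
Model: a distributed RLC wire
Signals propagate as a wave - alternately transferring energy from the electric to the magnetic field

[Figure: distributed line between Vin and Vout along x; each segment has series elements r and l and shunt elements g and c]

The wave propagation equation:

$\dfrac{\partial^{2} v}{\partial x^{2}} = rc\,\dfrac{\partial v}{\partial t} + lc\,\dfrac{\partial^{2} v}{\partial t^{2}}$

r, c, l - resistance, capacitance and inductance per unit length
g ~ 0 - the leakage conductance

The ideal wave propagation equation (for a lossless transmission line, r = 0):

$\dfrac{\partial^{2} v}{\partial x^{2}} = lc\,\dfrac{\partial^{2} v}{\partial t^{2}} = \dfrac{1}{\nu^{2}}\,\dfrac{\partial^{2} v}{\partial t^{2}}$

$\nu = \dfrac{1}{\sqrt{lc}}$ - propagation speed along the line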

Lossless Transmission Lines Parameters (1)


Propagation speed: only a function of the surrounding medium

$\nu = \dfrac{1}{\sqrt{lc}} = \dfrac{1}{\sqrt{\varepsilon\mu}} = \dfrac{c_0}{\sqrt{\varepsilon_r \mu_r}}$

c0 - speed of light in vacuum
ε - electric permittivity of insulator
µ - magnetic permeability of insulator
εr - relative permittivity with respect to vacuum
µr - relative permeability with respect to vacuum
tflight = L/ν - the time it takes for the wave to propagate from one end of the wire to the other

[Table: dielectric constant and wave-propagation speed for various materials used in IC technology]

Lossless Transmission Lines Parameters (2)
Characteristic impedance: impedance presented by the wire

$Z_0 = \sqrt{\dfrac{l}{c}} = l\nu = \dfrac{1}{c\nu}$ (100 to 500Ω for typical wires)

The behavior of the transmission line is influenced by the termination of the line
The termination determines how much of the wave is reflected upon arrival at the wire end

$\rho = \dfrac{V_{refl}}{V_{inc}} = \dfrac{I_{refl}}{I_{inc}} = \dfrac{R - Z_0}{R + Z_0}$

ρ - reflection coefficient
R - the termination resistance

R = Z0 ⇒ ρ = 0
R = ∞ ⇒ ρ = 1
R = 0 ⇒ ρ = -1


Transmission Lines with Terminating Impedances Zs and ZL

Consider the case: ZL = ∞, ρ = 1

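[Figure: source Vin with series impedance Zs driving a line of characteristic impedance Z0 (node VSource) that is terminated by ZL (node VDest)]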

VSource = (Z0/(Z0+Zs))Vin

ρs = (Zs-Z0)/(Zs+Z0)

Lattice Diagram

Vin = 5V, RS = 5Z0, RL = ∞
ρs = (Zs-Z0)/(Zs+Z0) = 0.66
ρD = 1

t = 0 ... tflight
V1S = (Z0/(Z0+Zs))Vin = 0.83V
V1D = V1S + Vr,1D; Vr,1D = ρD V1S = 0.83V
V1D = 0.83V + 0.83V = 1.66V

t = tflight ... 2tflight
V2S = V1S + Vr,1D + Vr,1S; Vr,1S = ρS Vr,1D = 0.55V
V2S = 2.22V
V2D = V1D + Vr,1S + Vr,2D; Vr,2D = ρD Vr,1S = 0.55V
V2D = 2.77V
...

Lattice diagram (time t runs downwards, each traversal of the line takes L/ν):

VSource      travelling wave    VDest
0.8333 V     + 0.8333 →
             ← + 0.8333         1.6666 V
2.2222 V     + 0.5556 →
             ← + 0.5556         2.7778 V
3.1482 V     + 0.3704 →
             ← + 0.3704         3.5186 V
3.7655 V     + 0.2469 →
             ← + 0.2469         4.0124 V
...

Conclusion: in order to avoid ringing or slow propagation delay the transmission line
should be terminated both at the source (series termination) and at the destination (parallel
termination) with a resistance equal to Z0
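
The lattice diagram values can be reproduced by following the wave as it bounces between the two ends. The following is a minimal sketch for resistive terminations; the function name and argument names are illustrative, and the line is assumed lossless.

```python
import math

def lattice(v_in, z0, z_s, z_l, round_trips):
    """Return (source voltages, destination voltages) after each traversal of the line."""
    rho_s = (z_s - z0) / (z_s + z0)
    rho_d = 1.0 if math.isinf(z_l) else (z_l - z0) / (z_l + z0)

    wave = v_in * z0 / (z0 + z_s)    # first wave launched into the line
    v_src, v_dst = wave, 0.0
    src_hist, dst_hist = [], []
    for _ in range(round_trips):
        src_hist.append(v_src)
        refl_d = rho_d * wave        # reflection at the destination
        v_dst += wave + refl_d
        dst_hist.append(v_dst)
        refl_s = rho_s * refl_d      # reflection at the source
        v_src += refl_d + refl_s
        wave = refl_s                # wave travelling towards the destination again
    return src_hist, dst_hist

# Values of the lattice diagram: Vin = 5 V, Rs = 5*Z0, open line end
src, dst = lattice(5.0, 1.0, 5.0, math.inf, 4)
for s, d in zip(src, dst):
    print(f"VSource = {s:.4f} V   VDest = {d:.4f} V")
```

Setting z_s equal to z0 (series termination) makes ρs zero, so the destination reaches its final value after a single traversal.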

Figures of Merit for RLC Interconnect

Criteria:

• Rise (fall) time of the input signal, tr, must be smaller than the propagation delay through the
wire. Otherwise, a lumped model suffices.

$t_r < \dfrac{t_{flight}}{2} = \dfrac{l_w}{2}\sqrt{lc}$

• Wire resistance R / damping factor ξ may not be too large, otherwise a distributed RC model is
sufficient

$R = r\,l_w < 2Z_0 = 2\sqrt{\dfrac{l}{c}}$, i.e. $l_w < \dfrac{2}{r}\sqrt{\dfrac{l}{c}}$

or $\xi = \dfrac{r\,l_w}{2}\sqrt{\dfrac{c}{l}} < 1$

• In conclusion:

$\dfrac{2t_r}{\sqrt{lc}} < l_w < \dfrac{2}{r}\sqrt{\dfrac{l}{c}}$

(lw - length of the wire; r, l, c - resistance, inductance and capacitance per unit length)

[Plot: wire length (cm, 0.01 to 10.00) versus transition time (ns, 0.01 to 10.00) with the regions "1. Large input rise time", "2. High attenuation", "1. & 2." and "Inductance is important"]
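
The two criteria for RLC interconnect define a window of wire lengths in which inductance matters. A minimal sketch, assuming the conditions lw > 2·tr/√(lc) and lw < (2/r)·√(l/c) with r, l, c per unit length; the numeric values in the example call (in particular r) are assumptions.

```python
import math

def inductance_window(t_r, r, l, c):
    """Return (l_min, l_max): wire lengths between which inductance is important."""
    l_min = 2.0 * t_r / math.sqrt(l * c)      # below this, the rise time is too large
    l_max = (2.0 / r) * math.sqrt(l / c)      # above this, attenuation is too high
    return l_min, l_max

def damping_factor(r, length, l, c):
    """xi = (r * l_w / 2) * sqrt(c / l); xi < 1 means the line is underdamped."""
    return 0.5 * r * length * math.sqrt(c / l)

# Assumed example in SI units per metre: c = 92 pF/m, l = 0.47 uH/m, r = 2 kOhm/m, tr = 50 ps
l_min, l_max = inductance_window(50e-12, 2e3, 0.47e-6, 92e-12)
print(f"{l_min * 100:.2f} cm < l_w < {l_max * 100:.2f} cm")
```

If l_min exceeds l_max for a given technology and rise time, no wire length satisfies both criteria and an RC model is adequate.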
Scaling (1)
VLSI integration depends on the smallest-size feature permitted by the technology
The size of the transistors has to be as small as possible!
The internal operating physics of the down-scaled MOS transistor changes
First order scaling theory :
• Estimates the improvements that can be expected as technology is scaled
• Scaled MOS device is obtained by applying a dimensionless scaling factor α to:
• all dimensions (L, W, junction depth, oxide thickness, etc.)
• device voltages
• impurities concentration densities
• The characteristics of the scaled MOS device are similar to that of the original one
• A number of parameters such as voltage drop, line propagation delay, current density,
contact resistance exhibit significant degradation with scaling!


Scaling (2)
Influence of first-order scaling on MOS device
Parameter                                  Scaling Factor

Device parameters
  Length; L                                1/α
  Width; W                                 1/α
  Gate oxide thickness; tox                1/α
  Junction depth; Xj                       1/α
  Substrate doping; Na or Nd               α
  Supply voltage; VDD                      1/α

Resultant influence
  Electric field across gate oxide; E      1
  Depletion layer thickness; d             1/α
  Parasitic capacitance; WL/tox            1/α
  Gate delay; VC/I                         1/α
  DC power dissipation; Ps                 1/α2
  Dynamic power dissipation; Pd            1/α2
  Power speed product                      1/α3
  Gate area                                1/α2
  Power density; VI/A                      1
  Current density; I/A                     α
  Transconductance; gm                     1

Scaling (3)
Interconnect layer scaling
Parameter                                  Scaling Factor
Conductor line width; W                    1/α
Conductor line length; L                   1/α
Conductor line thickness; t                1/α
Line cross-section; A                      1/α2
Line resistance; r                         α
Line response time; rc                     1
Normalized line response time              α
Line voltage drop; Vd                      1
Normalized line voltage drop               α
Current density; J                         α
Normalized contact voltage drop; Vc/V      α2

The scaled line resistance is:

$r' = \dfrac{\rho}{t/\alpha}\left[\dfrac{L/\alpha}{W/\alpha}\right] = \alpha r$

The voltage drop along the scaled line is:

$V_d' = (I/\alpha)(\alpha r) = I r = const$

The scaled line response time is:

$\tau_s' = (\alpha r)(C/\alpha) = rC = const$

For a constant chip size many of the signal paths do not scale down! Therefore:
• Voltage drops along these lines are larger by a factor of α than the scaled line voltage drop
• The line response time is larger by a factor of α than the scaled line response time (see table)
Problems: distribution and organization of clocking signals, electromigration, the increase of
the wire capacitance (affects the gate delay)
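
The interconnect scaling relations can be checked with a few lines of arithmetic. This sketch applies a first-order scaling factor α to a line whose length scales with the devices; the function name and the example numbers are illustrative assumptions.

```python
def scale_interconnect(r, current, cap, alpha):
    """First-order scaling of a line whose W, L and t all shrink by 1/alpha.

    Returns (scaled resistance, scaled voltage drop, scaled response time).
    """
    r_scaled = alpha * r                      # r' = alpha * r
    v_drop = (current / alpha) * r_scaled     # Vd' = (I/alpha)(alpha r) = I r
    tau = r_scaled * (cap / alpha)            # tau' = (alpha r)(C/alpha) = r C
    return r_scaled, v_drop, tau

# Assumed example: r = 10 ohm, I = 2 mA, C = 3 pF, alpha = 2
r_s, v_d, tau_s = scale_interconnect(10.0, 2e-3, 3e-12, 2.0)
print(r_s, v_d, tau_s)
```

The voltage drop and the response time stay constant while supply voltage and gate delay shrink by 1/α, which is why the normalized quantities grow by α.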

Power Distribution
Process with 1 Level of metal :
• VDD and ground (VSS) are routed in interdigitated trees
• Crossunders are very difficult (low resistance interconnect)
Power distribution is much easier for technologies with 2 (or
more) levels of metal

Cautions:
• Parts of the chip that are likely to switch simultaneously
are routed separately!
• Separate power pins might be used for the
output drivers!

Clock and Timing Circles (1)

The clock
• synchronizes machine operations and data transfer
• is a global control technique that provides the “glue” for system operation
System level timing can be described using circular timing charts

Ideal pseudo 2-phase clocking chart:


• φ1(t)φ2(t) = 0, ∀t
• φ1=1 during first half-period
• φ2=1 during the last half-period

• time increases in a counter-clockwise direction


• one full rotation corresponds to a clock period T


Clock and Timing Circles (2)


Overlapping pseudo 2-phase clocking chart:
• φ1(t)φ2(t) = 0, except during the transition times
• mutually-exclusive clock periods provide timing
intervals for logical operations
• overlapped segments must be avoided
• transition times can be made small by proper
clock generator design

Clock skew is represented by rotating one of the clocks!
• φ1(t)φ2(t) = 1 defines the skew time, ts
• ts indicates the possibility of unwanted
simultaneous bit transfer
• skew is caused by the clock driving circuit or by
the distribution arrangement

Clock Generation Circuits (1)
2-phase clock generator with transmission gate delay

• The Mp1, Mn1 inverter acts as the first driver for the chain
• Transmission gate (TG) is used as delay element to minimize clock skew
• TG is modeled as an equivalent resistance RTG and introduces a delay tD = RTGCin
• tP - the propagation delay through an inverter
• Choosing tD ~ tP the delay between the two branches is the same
• Thus clocking skew can be controlled by adjusting the size of the TG transistors (β)

$R_{TG} = \dfrac{1}{\beta_n (V_{DD} - V_{Tn}) + \beta_p (V_{DD} - |V_{Tp}|)}$
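
The equivalent resistance and the resulting delay can be estimated as follows. This is a sketch under assumed device values: the process parameters, the aspect ratios and the input capacitance are example numbers, β is taken as k'·(W/L), and the magnitude of the PMOS threshold is used.

```python
def r_tg(beta_n, beta_p, v_dd, v_tn, v_tp):
    """Equivalent transmission-gate resistance; v_tp may be given as a negative value."""
    return 1.0 / (beta_n * (v_dd - v_tn) + beta_p * (v_dd - abs(v_tp)))

def tg_delay(r, c_in):
    """Delay introduced by the transmission gate: t_D = R_TG * C_in."""
    return r * c_in

# Assumed example: k'n = 55 uA/V^2, k'p = 25 uA/V^2, (W/L)n = 10, (W/L)p = 22
beta_n = 55e-6 * 10
beta_p = 25e-6 * 22
r = r_tg(beta_n, beta_p, v_dd=5.0, v_tn=0.9, v_tp=-0.75)
t_d = tg_delay(r, 50e-15)   # assumed input capacitance of 50 fF
print(f"R_TG = {r:.0f} ohm, t_D = {t_d * 1e12:.1f} ps")
```

Widening the TG transistors increases β and lowers tD, which is the tuning knob mentioned in the last bullet.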


Clock Generation Circuits (2)


2-phase clock generator with RS latch

To ensure proper operation of the circuit two items should be checked:


• tP through the inverter must be small compared to the clock period (CLK has time to enter
the latch)
• the output capacitance in both branches should be equal for equal switching delays; but
capacitances are sensitive to the layout and interconnect geometry!

Clock Drivers and Distribution Techniques (1)

The clock driver must be able to handle large capacitive loads at the required clock frequency
Clock skew originates mostly from:
• unbalanced loads at the driver
• unequal distribution line delays (RC) - see figure

Distribution networks approaches:


• cascaded chain of inverting buffers that matches the clock generator to the distribution line
• balanced tree network with multiple fanouts
• symmetrical geometries (like H-tree) for the clock distribution lines


Clock Drivers and Distribution Techniques (2)

Balanced tree network with multiple fanouts:
• identical drivers can be used within a given stage
• the drive requirements of the output circuits are reduced from the single inverter design
since the fanout has been split into groups

H-tree network:
• each clock distribution point O is at the same distance from the driver D, giving
equal delay times

Input Protection Circuits (1)
Excessive electrical charge on the gate of the MOS transistor can destroy the device!
Protection circuits drain this excessive charge and avoid static burnout!
$C_g = C_{ox} W L$, $E_{ox} \approx \dfrac{V_G}{x_{ox}}$, $E_{BD} \sim 7.5 \cdot 10^{6}$ V/cm

If Eox > EBD, the oxide insulating properties break down and charge is transported through
the material - destruction of the device!
The max gate voltage VGmax is a relatively small number
Static electricity during handling could easily reach a few kV

$V_{Gmax} \cong E_{BD} \cdot x_{ox} = 7.5 \cdot 10^{6}\,\text{V/cm} \cdot 35 \cdot 10^{-7}\,\text{cm} = 26.25\,\text{V}$ (for xox = 35 nm)

Protection circuits allow for alternate charge flow paths when the input voltage is too large
Diode structures are very useful in this application because:
• have relatively low breakdown voltages which can be controlled
• reverse breakdown in a pn junction is non-destructive
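
The maximum gate voltage follows directly from the breakdown field and the oxide thickness. A small sketch using the values of the slide (EBD = 7.5·10^6 V/cm, xox = 35 nm); the function name is illustrative.

```python
E_BD_V_PER_CM = 7.5e6   # oxide breakdown field from the slide

def max_gate_voltage(x_ox_nm, e_bd_v_per_cm=E_BD_V_PER_CM):
    """V_Gmax = E_BD * x_ox, with the oxide thickness converted from nm to cm."""
    x_ox_cm = x_ox_nm * 1e-7
    return e_bd_v_per_cm * x_ox_cm

v_g_max = max_gate_voltage(35.0)
print(f"V_Gmax = {v_g_max:.2f} V")
```

Thinner gate oxides lower this limit proportionally, which makes input protection more important as technologies scale.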


Input Protection Circuits (2)

Diode input protection circuit:
• D1...4 are reverse biased
• R reduces the voltage that reaches D3, D4 and increases the level of protection
• D1, D2 and D3, D4 undergo breakdown for positive or negative voltage sources

Thick oxide MOSFET protection circuit:
• the transistor has the threshold voltage > VDD and is in cutoff during normal operation
• If Vin > VT,f the transistor conducts, providing a path to ground to drain off the excessive
charge

Input protection circuits introduce parasitic RC time constants into the network!

Static Gate Sizing (1)
Problem - determine the values of Sj for j = 2,... which minimizes the total propagation delay
through the inverter chain

• Sj - sizing factor, S1 = 1; Sj >1 for j>1


• βj - conduction factor, β1=k’(W/L)1; βj=Sjβ1
• Cw - wiring contribution of gate 1
• Ci, Co - in/out capacitances of gate 1
• Co,j = SjCo - output capacitance from gate j
• Ci,j = SjCi - input capacitance to gate j
• Cw,j = SjCw - wiring capacitance of gate j

The time delay through gate j is, tD,j:

$t_{D,j} = \dfrac{R}{S_j}\left(C_{o,j} + C_{i,j+1} + C_{w,j+1}\right) = \dfrac{R}{S_j}\left[S_j C_o + S_{j+1}(C_i + C_w)\right]$


Static Gate Sizing (2)


Suppose that there are N stages in the chain, the total time delay is given by:

$T_D = \sum_{j=1}^{N} \dfrac{R}{S_j}\left[S_j C_o + S_{j+1}(C_i + C_w)\right]$

To minimize TD we differentiate with respect to Sj and look for zero slope points: $\dfrac{\partial T_D}{\partial S_j} = 0$

This results in the recursion relation: $\dfrac{S_{j+1}}{S_j} = \dfrac{S_j}{S_{j-1}}$ for j = 2,3,...N

If this is to hold for arbitrary values of j, then: $\dfrac{S_{j+1}}{S_j} = K = const$

The boundary conditions of the problem are: S1 = 1, SN+1 = CL/Ci

Forming the product: $\dfrac{S_2}{S_1} \cdot \dfrac{S_3}{S_2} \cdot \dfrac{S_4}{S_3} \cdots \dfrac{S_{N+1}}{S_N} = K^{N} = \dfrac{C_L}{C_i}$

We obtain the scaling ratio in the form: $K = \left(\dfrac{C_L}{C_i}\right)^{1/N}$

Static Gate Sizing (3)
Explicitly, the scaling factors are given by: $S_1 = 1,\ S_2 = K,\ S_3 = K^{2},\ \dots,\ S_N = K^{N-1}$

The minimum delay is then: $T_{D,min} = \sum_{j=1}^{N} R\left[C_o + K(C_i + C_w)\right] = N R\left[C_o + K(C_i + C_w)\right]$

The equation K = Sj+1/Sj says that the minimum delay occurs when every stage has the
same individual delay time tD

The number of stages that optimizes the delay is obtained by differentiating TD (replacing K
with its N-dependent equation) with respect to N and setting the result to 0:

$R C_o + R (C_i + C_w)\left(\dfrac{C_L}{C_i}\right)^{1/N}\left[1 - \dfrac{\ln(C_L/C_i)}{N}\right] = 0$

If Co is small: $N = \ln\left(\dfrac{C_L}{C_i}\right)$; N is chosen as the nearest integer for given values of Ci and CL

With $K^{N} = \dfrac{C_L}{C_i} \Leftrightarrow N \ln K = \ln\left(\dfrac{C_L}{C_i}\right) \Rightarrow N \ln K = N \Leftrightarrow \ln K = 1 \Leftrightarrow K = e^{1} = e$

The optimum scaling ratio equals e !!!
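
The sizing procedure can be sketched for a given load. The function below assumes that Co is small, picks N as the integer nearest to ln(CL/Ci) (at least 1), and then recomputes K so that the boundary condition K^N = CL/Ci holds exactly; names and example capacitances are assumptions.

```python
import math

def size_chain(c_load, c_in):
    """Return (N, K, [S_1, ..., S_N]) for an inverter chain driving c_load."""
    ratio = c_load / c_in
    n = max(1, round(math.log(ratio)))       # N = ln(CL/Ci), nearest integer
    k = ratio ** (1.0 / n)                   # K = (CL/Ci)^(1/N)
    sizes = [k ** j for j in range(n)]       # S_j = K^(j-1)
    return n, k, sizes

def chain_delay(n, k, r, c_o, c_i, c_w):
    """T_D,min = N * R * [C_o + K * (C_i + C_w)]."""
    return n * r * (c_o + k * (c_i + c_w))

# Assumed example: 50 pF pad load driven from a 50 fF minimum-size inverter input
n, k, sizes = size_chain(50e-12, 50e-15)
print(n, round(k, 3), [round(s, 1) for s in sizes])
```

Because N must be an integer, the resulting K is generally close to, but not exactly, e.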

Off-Chip Driver Circuits

Off-chip driver circuits are critical to the overall chip design


Some important problems must be addressed:
• efficient buffer circuitry between internal and off-chip drivers
• minimization of transmission line effects
• fast switching
• static charge protection
• interface specific items, such as CMOS-TTL level converter, etc.
An inverter circuit can be used as a basic off-chip driver
Performance factors are :
• the transient switching times tLH and tHL
• transmission line effects

Double-Inverter Off-Chip Driver Circuit
The simplest off-chip driver circuit: an inverter chain designed to handle a large capacitive load

The sizes of Mn2 and Mp2 can be estimated using the high-to-low time constant τn and the
low-to-high time constant τp:

$\left(\dfrac{W}{L}\right)_{n2} = \dfrac{C_{out}}{\tau_n\,k'_n\,(V_{DD} - V_{Tn})}$

$\left(\dfrac{W}{L}\right)_{p2} = \dfrac{C_{out}}{\tau_p\,k'_p\,(V_{DD} - |V_{Tp}|)}$

Cout is large ⇒ Mn2 and Mp2 are large! ⇒ obtained using parallel connected transistors to aid in
layout and parasitic control
Mn1 and Mp1 can be sized using the previously presented sizing theory

The actual values of the fall and rise time can be estimated from:

$t_{HL} = \tau_n\left[\dfrac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\dfrac{2(V_{DD} - V_{Tn})}{V_0} - 1\right)\right]$

$t_{LH} = \tau_p\left[\dfrac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\dfrac{2(V_{DD} - |V_{Tp}|)}{V_0} - 1\right)\right]$

where V0 is the 10% voltage point

Example

Consider a process characterized by the nominal values:


k’n = 55[µA/V2] VT0n = 0.9[V]
k’p = 25[µA/V2] VT0p = -0.75[V]
and VDD = 5[V]
The requirements for off-chip driver circuits are tLH = tHL = 20[ns] with a maximum load of
Cout = 50[pF]

Using the previous equations we can compute the time constants:

τn = 6.45[ns]
τp = 6.58[ns]

The aspect ratios are: $\left(\dfrac{W}{L}\right)_{n2} \cong 35$, $\left(\dfrac{W}{L}\right)_{p2} \cong 72$
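
The aspect ratios of the example follow from the sizing equations of the double-inverter driver when the quoted time constants are inserted. This sketch takes τn and τp as given above and uses the magnitude of the PMOS threshold voltage; the function name is illustrative.

```python
import math

def aspect_ratio(c_out, tau, k_prime, v_dd, v_t):
    """(W/L) = C_out / (tau * k' * (V_DD - |V_T|))."""
    return c_out / (tau * k_prime * (v_dd - abs(v_t)))

C_OUT = 50e-12    # maximum load, 50 pF
V_DD = 5.0

wl_n = aspect_ratio(C_OUT, 6.45e-9, 55e-6, V_DD, 0.9)     # NMOS output transistor
wl_p = aspect_ratio(C_OUT, 6.58e-9, 25e-6, V_DD, -0.75)   # PMOS output transistor

# Round up so that the resulting time constants do not exceed the targets
wl_n_rounded = math.ceil(wl_n)
wl_p_rounded = math.ceil(wl_p)
print(wl_n_rounded, wl_p_rounded)
```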

Tri-State Off-Chip Driver Circuit
The input signal is split and individually controls each output transistor
The high-impedance state is obtained by driving both NMOS and PMOS output devices into
cutoff

Normal operation:
Z = 1 ⇒ Mp1 and Mp2 off, Mn on

High-impedance state:
Z = 0 ⇒ Mp1 and Mp2 on, Mn off
⇒ Vp = VDD, Vn = 0
⇒ the output transistors are in cutoff


Bidirectional Off-Chip Driver Circuit

The tri-state section is a non-inverting buffer with an enable control E


E = 0 gives the high-Z state

Packaging Technology (1)

Package types
1. Bare die
2. Dual-In-line Package (DIP)
3. Pin Grid Array (PGA)
4. Small-outline IC
5. Quad flat pack
6. Plastic Leaded Package (PLCC)
7. Leadless carrier

[Figure: the package types, numbered as in the list]


Packaging Technology (2)


Package has an important functionality in IC technology
• provides a means of bringing signal and supply wires in/out of the circuit
• removes the heat generated by the circuit
• protects the die against environmental conditions such as humidity
• provides mechanical support
At the same time, packaging technology has a tremendous impact on performance ⇒ up to 50%
of the delay of a high-performance computer is due to packaging delays!
Packages generate parasitic inductance and capacitance:

Package Type          Capacitance (pF)   Inductance (nH)
68-pin plastic DIP    4                  35
68-pin ceramic DIP    7                  20
256-PGA               1-5                2-15
Wire bond             0.5-1              1-2
Solder bump           0.1-0.5            0.01-0.1

Packaging Technology (3)
Example: parasitic effects of the bond-wire inductance

A transient current is sourced/sunk from/into the supply rails to charge/discharge CL

Inductive coupling between external (VDDext) and internal (VDDint) supply voltage (bonding wires)

A changing current passing through an inductor generates a voltage drop:

$\Delta v = L\,\dfrac{di}{dt}$

∆v - the difference between VDDext and VDDint:
• affects the logic levels
• reduces the noise margin

[Figure: inverter (Vin, Vout) with load CL; bond-wire inductances L in the supply and ground connections, supply current i(t) flowing from VDDext to VDDint]
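
The size of the supply disturbance can be estimated from ∆v = L di/dt. The sketch below approximates di/dt by a current step ∆i over a switching time ∆t and assumes that several drivers share one bond wire; the inductance is taken from the wire-bond range of the package table, while the current, switching time and driver count are assumed example values.

```python
def supply_bounce(l_bond, delta_i, delta_t, n_drivers=1):
    """Voltage drop across a bond wire: L * (n * delta_i) / delta_t."""
    return l_bond * n_drivers * delta_i / delta_t

# Assumed example: 2 nH bond wire, 10 mA per driver, 1 ns switching time
dv_single = supply_bounce(2e-9, 10e-3, 1e-9)
dv_bus = supply_bounce(2e-9, 10e-3, 1e-9, n_drivers=16)   # 16 drivers switching together
print(f"single driver: {dv_single * 1e3:.0f} mV, 16 drivers: {dv_bus * 1e3:.0f} mV")
```

The linear growth with the number of simultaneously switching drivers motivates separate or multiple power pins for the output drivers.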


Packaging Technology (4)


Design techniques:
• Separate power pins for I/O pads and chip core
• Multiple power and ground pins
• Careful selection of the position of the power and ground pins on the package
• Adding decoupling capacitance on the board
• Increase the rise and fall times
• Use advanced packaging technologies

[Figure: SUPPLY connected to the CHIP through board wiring and bonding wire, with a decoupling capacitor Cd next to the chip]

Packaging Technology (5)

Packaging Technology Requirements:


• Electrical: low parasitics (L, C, R)
• Mechanical: reliable and robust
• Thermal: efficient heat removal
• Economical: inexpensive

Two interconnect levels:


(1) Die-to-Package-Substrate
(2) Package substrate to PCB


Packaging Technology (6)

1-a: Wire bonding

[Figure: die mounted on the substrate, bonding wires running from the pads to the lead frame]

• Wires must be attached serially


• Bonding wires have inferior electrical properties (L, C)
• Difficult to predict the exact value of parasitics (irregular)

Packaging Technology (7)
1-b: Tape-automated bonding (TAB)

[Figure: polymer film with sprocket holes carrying the lead-frame pattern and test pads; the die is connected to the film by solder bumps and sits on the substrate]

• The die is attached to a metal lead frame that is printed on a polymer film
• The connection between chip pads and polymer film wires is made using solder bumps
• Highly automated process
• Improved electrical performance (L ~ 0.5nH, C ~ 0.3pF)

Packaging Technology (8)


1-c: Flip-chip mounting

[Figure: die flipped onto the substrate and connected through solder bumps to the interconnect layers of the substrate]

• Flip the die upside-down and attach it directly to the substrate using solder bumps
• Superior electrical performance
• Pads can be placed at any position on the chip (not only on the die boundary)
• A possible solution for power and clock distribution problems

Packaging Technology (9)

2-a: Through-hole mounting


• mechanically reliable connections
• limits packaging density

2-b: Surface mounting


• increase package density:
• through holes are eliminated
• the lead pitch is reduced
• both sides of the board can be used
• the on-the-surface connection is weaker
• more expensive equipment needed
• testing on board is more complex


Packaging Technology (10)


Multi-Chip-Modules (MCM) - Die-to-Board
(avionics processor module - Rabaey96)

Mount the die directly on the substrate


• increase the packaging density
• increase the performance
• reduce power consumption
• expensive technology

11. CAD & Design Flow


Motivation: Microelectronics Design


[Figure: design efficiency versus year (1970 to 2010) compared with Moore's Law; successive design methods: Layout Editor, Schematic Entry, Logic and Architectural Synthesis, Platform-based Design, ???]

Achieving required productivity by system-level design methodologies

Institute of
Microelectronic
11: CAD & Design Flow Systems 2
Example for Complex Systems: Embedded SoC
Embedded „System-on-Chip“

[Figure: system-on-chip between sensors and actuators, containing I/O module, microcontroller, memory, ASIC, DSP and RF transceiver]

Properties
• Potentially consisting of a large number of components
• Specialised to an application domain
• Reactive
• Real-time capability

Constraints
• Costs
• Power consumption
• Latency
• Required flexibility

Design Tasks
• Definition of a communication architecture which is adequate to the application‘s structure
• Mapping of the system specification on available implementation components

Platform-Based System Design: Platform Life-Cycle


[Figure: platform life-cycle. Easy implementation: generic platform (CPU core, DSP core, memory, bus, OS, API) + application-specific additions (specific blocks, drivers) ⇒ multiple devices with similar basic functions. Experiences, applications and new requirements → feedback for future platform generations]


Project Management: System Design: V Model
[Figure: V model of system design.
Descending branch: System Properties and Constraints → Analysis of System Requirements → Design of System Architecture (Abstract Interfaces) → Analysis of HW/SW Component Requirements → HW and SW Component Implementation (HW/SW Co-Design, IP Database).
Ascending branch: Implemented HW/SW Modules → HW/SW Integration → System Integration (Prototype Generation and/or Manufacturing) → Product Delivery (Customer Application).
Both branches are linked by Validation and Quality Assurance on four levels: System Requirements Level, System Architecture Level, HW/SW Component Level, HW/SW Implementation Level. Further labels: Cost Analysis, Product Validation]

Hardware/Software Co-Design
[Figure: co-design flow. Specification → HW/SW-Partitioning (with Co-Simulation and Communication Synthesis) → HW-Specification → Synthesis → Placement/Routing, and SW-Specification → Compilation → Real-Time OS; result: Heterogeneous HW-/SW-System]

O.k., let‘s go bottom-up now

Classes of CAD Tools
• Design Entry:
– Graphical Editor (drawing schematic diagrams, physical layout, stick
layout diagrams, ...)
– Language based circuit capture tools (for hardware description
languages like VHDL, Verilog, EDIF)

• Design Validation:
– Physical design verification tools (design rule checker, extractor,
LVS, schematic and electrical rule checker)
– Design Simulation:
• analog simulation: circuit level; behavioural level
• digital simulations: circuit level, switch level, logic level, register transfer
level, architectural level, behavioural level;
• thermal simulation: displaying heat dissipation on chip
– Formal Verification Methods


Classes of CAD Tools

• Design Implementation:
– Layout Compilers (stick2layout, macrocell generators, datapath
compilers)
– Layout Structuring & Optimization:
• Layout Compaction
• Placement and Routing
– Logic Synthesis
– Finite State Machine (FSM) Synthesis
– Architectural Synthesis

• Management of Design Projects:


– Design Databases:
• keep different versions (current, backup 1, ..., backup n) and views of a
design object (schematic, simulation netlist, stick diagram, physical
layout, ...) in database

Full Custom Design: Design Entry
Full Custom Design

With Full Custom Design techniques, the designer is able to individually specify the
geometrical layout of the integrated circuit (transistor size [channel length, channel width,
shape, ...], transistor placement, wire width, ...).
The designer has the option to manually optimize the layout: the most dense/area-efficient
layouts can be generated using the full custom design styles.

[Figure: Layout Editor and Design Rule Check (www.tanner.com)]

Hand-Crafted Layout:
• The layout is drawn in the form of rectangles and polygons on different layers using a graphics
editor.
• The designer has to know a large set of process-dependent design rules.
• The mask layout is generated as drawn on the screen: direct influence on component
placement and on important parameters such as W and L of transistors, wire widths, ...


Full Custom Design: Design Entry

Tool internal Design Representation: Geometrical Specification Language

• The layout is specified in textual form giving either the position and layer of rectangles
(similar to hand crafted layout) or lines (as in stick diagrams).

• Programming language constructs such as
– parameterized macros (to be used for layout segments such as cells, ...),
– loops (while, repeat, for, ...), and
– conditional statements (if, case, ...)
may be available. Parameterized layouts (e.g. a generic transistor with W and L as parameters, cells for
different bit widths, ...) can therefore be described using geometrical specification languages.

• Used in a large number of macrocell compilers.

Full Custom Design: Design Entry

Example for a simplified geometrical specification language:

B x y dx dy    Box with length dx and width dy, lower left-hand corner placed at (x,y)
L n            Layout level (layer) for the box definitions that follow
M n            Start of macro definition n
E              End of macro definition
C n x y m      Call of macro number n with translation x,y and orientation m
Q              End of layout file

MOS Layer definitions:

Layer CMOS NMOS

1 n-diffusion n-diffusion
2 p-diffusion ion implant
3 polysilicon polysilicon
4 metal metal
5 contact contact
8 n-well --
9 overglass overglass


Full Custom Design: Design Entry

Cell Orientations:

Orientation Description

1 no rotation
2 rotate 90° counterclockwise
3 rotate 180° counterclockwise
4 rotate 270° counterclockwise
5 mirror about y-axis
6 rotate 90° counterclockwise and mirror about y-axis
7 rotate 180° counterclockwise and mirror about y-axis
8 rotate 270° counterclockwise and mirror about y-axis
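
The commands and orientations above can be sketched as a small interpreter that flattens a layout file into a list of boxes. This is only an illustration under several assumptions that the slides do not state: tokens are separated by white space, a macro is defined before it is called, rotation is taken about the origin and applied before mirroring, and the translation is applied last. The names `Box`, `place` and `flatten` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    layer: int
    x: int
    y: int
    dx: int
    dy: int


def transform_point(px, py, orientation):
    """Rotate counterclockwise in 90 degree steps, then mirror (assumed order)."""
    for _ in range((orientation - 1) % 4):
        px, py = -py, px          # one 90 degree counterclockwise rotation
    if orientation >= 5:
        px = -px                  # mirror about the y-axis
    return px, py


def place(box, x, y, orientation):
    """Apply orientation and translation to a box, keeping (x, y) the lower left corner."""
    x0, y0 = transform_point(box.x, box.y, orientation)
    x1, y1 = transform_point(box.x + box.dx, box.y + box.dy, orientation)
    return Box(box.layer, min(x0, x1) + x, min(y0, y1) + y, abs(x1 - x0), abs(y1 - y0))


def flatten(text):
    """Expand all macro calls and return the boxes of the top-level layout."""
    macros, top = {}, []
    target, layer = top, None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        cmd, args = tokens[0], [int(t) for t in tokens[1:]]
        if cmd == "L":
            layer = args[0]
        elif cmd == "B":
            target.append(Box(layer, *args))
        elif cmd == "M":
            target = macros.setdefault(args[0], [])
        elif cmd == "E":
            target = top
        elif cmd == "C":
            n, x, y, m = args
            target.extend(place(b, x, y, m) for b in macros[n])
        elif cmd == "Q":
            break
    return top


example = """
M 1
L 3
B 0 0 2 6
E
C 1 10 0 1
C 1 0 0 2
Q
"""
for box in flatten(example):
    print(box)
```

The example defines one polysilicon box as macro 1 and calls it twice, once unrotated and once rotated by 90° counterclockwise.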

Full Custom Design: Design Entry

[Figure: full custom layout (hand crafted or generated out of a stick diagram or a layout description)
and the corresponding geometrical specification file and schematic diagram]


Full Custom Design: Design Entry


Stick Diagram:
• The layout is drawn in the form of lines and polygons on different layers using a
graphics editor.
• A stick-to-layout converter, together with a compactor and a description of the
process design rules, is then used to generate the rectangle-based layout.
• The designer can draw symbolic layouts that are almost independent of process and design
rules. Process adaptation is done by the converter/compactor.
• Converter constraints (cell dimensions, channel widths / lengths of transistors, ...)
can be specified.

• Stick Diagram Conventions:


– Diffusion Areas: green (b/w: dotted line)
– Polysilicon Lines: red (b/w: dashed line)
– Metal Lines: blue (b/w: solid line)
– Contacts: black

Example: Stick Diagram of a Transistor:


Full Custom Design: Stick Diagrams

[Figure: memory cell schematic and corresponding stick diagram]


Full Custom Design: Design Flow


[Figure: full custom design flow]

Layout path:
• Stick diagram editor, stick2layout converter and compactor
• Layout editor
• Cells, block layout
• Floorplanning, placement & routing
• Mask layout data
• Fabrication

Verification path:
• Symbol generation, schematic entry
• Simulation netlist extraction and circuit simulation (SPICE)
• Timing analysis, test pattern generation
• Design analysis: DRC, ERC, circuit extraction, LVS
• Fabrication test pattern



Cell Based Design

Cell based design approaches rely on layout components predefined and provided
by a silicon foundry. Several implementation styles can be distinguished:

• Standard Cells:
– layout blocks predefined by silicon foundry
– full process sequence (all mask layers) required for chip fabrication

• Gate Arrays (discussed later in this lecture):
– Linear Gate Arrays:
• pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• fixed size interconnect areas (channels)
– Sea of Gates Arrays:
• pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• variable size interconnect areas (channels) over unused transistors


Cell based Full Custom Design: Design Flow


[Figure: cell based design flow]

• Design entry: macrocell specification/compilation, symbol generation, schematic entry
• Cell library: graphical data, layout data, simulation models
• Simulation netlist extraction: logic simulation, fault simulation, timing analysis, test pattern generation
• Placement: standard cells, macro cells, I/O cells
• Routing: channel generation, global routing, detailed routing
• Place & route optimization: parasitic wire capacitances / delay backannotation
• Design analysis: DRC, ERC, circuit extraction, LVS
• Mask layout data, fabrication, fabrication test pattern


Standard Cell Full Custom Design


Design Verification
Physical Design Rule Check:

Physical design rule checks (DRCs) are performed to guarantee the conformity of a
layout design to the silicon vendor's set of design rules. Design rules are defined
between objects on the same layer (minimum width, minimum spacing) as well as for
objects on different layers (minimum spacing, overlapping, extension).

• Minimum width
• Minimum spacing
• Overlapping
• Extension

Design rule violations are usually reported in the physical layout using a graphics
editor. Sometimes a tabular report indicating the location and type of each design rule
violation can also be generated.
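
As an illustration of the two same-layer rules, the following sketch checks minimum width and minimum spacing for axis-parallel rectangles and returns a tabular report. It is a simplification and not the method of any particular DRC tool: the rule values are hypothetical, spacing is measured as Euclidean edge-to-edge distance, and touching or overlapping rectangles on the same layer are assumed to form one connected shape.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rect:
    layer: str
    x: int
    y: int
    dx: int
    dy: int


def gap(a, b):
    """Edge-to-edge distance between two rectangles (0 if they touch or overlap)."""
    gx = max(b.x - (a.x + a.dx), a.x - (b.x + b.dx), 0)
    gy = max(b.y - (a.y + a.dy), a.y - (b.y + b.dy), 0)
    return (gx * gx + gy * gy) ** 0.5


def check(rects, min_width, min_spacing):
    """Return a list of violations: ("width", rect) or ("spacing", rect_a, rect_b)."""
    violations = []
    for r in rects:
        if min(r.dx, r.dy) < min_width[r.layer]:
            violations.append(("width", r))
    for a, b in combinations(rects, 2):
        if a.layer == b.layer and 0 < gap(a, b) < min_spacing[a.layer]:
            violations.append(("spacing", a, b))
    return violations


# Hypothetical rule set and layout, for illustration only.
MIN_WIDTH = {"metal": 3, "poly": 2}
MIN_SPACING = {"metal": 3, "poly": 2}
layout = [
    Rect("metal", 0, 0, 3, 10),
    Rect("metal", 5, 0, 3, 10),   # only 2 units away from the first metal wire
    Rect("poly", 0, 0, 1, 10),    # narrower than the minimum poly width
]
for violation in check(layout, MIN_WIDTH, MIN_SPACING):
    print(violation)
```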

Design Verification

Extraction:

• Circuit Level Extraction can be used to create a netlist for circuit level simulations
(e.g. SPICE, ...). The netlist consists of MOS transistors (including geometrical
parameters as W / L, parasitic capacitances), resistors, capacitances, diodes, ...

• Switch Level Extraction: can be used to create a netlist which can be processed by a
switch level simulator. The resulting netlist consists of MOS transistors and parasitic
capacitances (to model storage effects in MOS circuits).

• Parasitics Extraction: is used in conjunction with cell based design techniques. Since
wire delay is dependent on the parasitic capacitance of a wire, parasitic capacitances of
nets and input capacitances of other gates connected to an output can be used to
estimate the extrinsic delays (Note: intrinsic delays [i.e. the delay of unloaded gates] are
fetched from the cell library's simulation model data).

• Schematic Extraction: is executed to generate the connectivity data out of a
graphical representation (schematic diagram) of a circuit module. The connectivity data is
forwarded to a netlister which provides the information required e.g. by simulation tools
(the simulators cannot operate on graphical data, they require netlists in a textual
format). This kind of extraction is usually required in pre-layout design specification
phases.


Design Verification

LVS:

The layout-versus-schematic (LVS) comparison tool checks the equivalence of a layout and its schematic.
The tool can be used to find wrong connections or parameter mismatches (such as W/L of transistors, ...) between
a schematic and its physical layout representation.

Schematic / Electrical Rule Check (SRC / ERC):

To verify schematics used e.g. in cell based designs, a schematic rule checker can find schematic rule
violations such as the following:

• Warnings:
• unconnected (floating) wire segments
• open outputs
• exceeded fanout

• Errors:
• open inputs (undefined input value!)
• number of bits differs between two buses connected together
• number of input/output pins in a schematic differs from its symbol representation ( --> pins are
not accessible / not present at higher levels of schematic hierarchy)
• more than one active driver connected to a net at the same time
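
A few of these checks can be sketched on a flat pin list. The sketch below is an assumption-laden illustration: the pin list format and the function name `check_nets` are hypothetical, and every net with more than one output pin is flagged, since a static check cannot tell which drivers are active at the same time.

```python
from collections import defaultdict


def check_nets(pins):
    """pins: iterable of (net, instance, direction), direction being "in" or "out"."""
    drivers, loads = defaultdict(list), defaultdict(list)
    for net, instance, direction in pins:
        (drivers if direction == "out" else loads)[net].append(instance)
    report = []
    for net in sorted(set(drivers) | set(loads)):
        if not drivers[net]:
            report.append(("error", net, "open input (no driver)"))
        elif len(drivers[net]) > 1:
            report.append(("error", net, "more than one driver"))
        if not loads[net]:
            report.append(("warning", net, "open output (no load)"))
    return report


# Hypothetical schematic connectivity, for illustration only.
pins = [
    ("n1", "g1", "out"), ("n1", "g2", "in"),
    ("n2", "g2", "out"),
    ("n3", "g3", "in"),
    ("n4", "g1", "out"), ("n4", "g3", "out"), ("n4", "g2", "in"),
]
for entry in check_nets(pins):
    print(entry)
```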

Simulation

Goal of Simulation:

• Validation of the system, logic, timing, and electrical behaviour
• Verify testability aspects
• Software development based on hardware simulation models

Simulator Classification:

Level        Primitives                          Observable values       Timing model

RT           registers, user coded primitives,   bit strings, vectors    discrete time set
             busses, etc.
Gate         gates                               bits                    continuous or discrete
Switch       transistors, capacitors             bits                    continuous or discrete
Electrical   capacitors, resistors, inductors,   real values             continuous time set
             diodes etc.


Simulation: Models
Signal Modelling:

• values which exist in real circuits (0, 1, high impedance, oscillation, ...)
• values which exist only in the simulator (unknown, transition, ...)
• boolean logic set not sufficient

3-valued Logic:

logic zero = 0
logic one = 1
unknown = U

Example: AND 0 1 U
0 0 0 0
1 0 1 U
U 0 U U

Problems:

• Pessimism of U value (for example: circuit initialisation, spikes)


• Logic values are often not sufficient (value strength needed)
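
The AND table above and the pessimism of the U value can be reproduced with a few lines. The function names `and3` and `not3` are chosen for this sketch only; the inverter table is the usual extension of NOT to three values and is not shown on the slide.

```python
def and3(a, b):
    """3-valued AND over the values "0", "1" and "U" (unknown)."""
    if a == "0" or b == "0":
        return "0"          # a controlling 0 decides the output
    if a == "1" and b == "1":
        return "1"
    return "U"


def not3(a):
    """3-valued inverter: an unknown input stays unknown."""
    return {"0": "1", "1": "0", "U": "U"}[a]


# Reprint the AND table row by row.
for a in "01U":
    print(a, [and3(a, b) for b in "01U"])

# Pessimism: a AND (NOT a) is 0 in any real circuit, but the simulator
# cannot see that both inputs carry the same unknown and reports U.
print(and3("U", not3("U")))
```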

Simulation: Models

Circuit and Delay Modelling:

• Circuit is built up from simulator primitives
• Modelling of the timing/delay behaviour:

∆: basic time unit
τ(n) = n * ∆: delay of the gate
t_1, t_2, t_3, ...: clock times of a synchronous circuit
t_(ν+1) - t_ν = ∆t = m * ∆: interval between successive clock times

Timing Models:

• Zero Delay: ∆ = 0
• Unit Delay: τ(n) = constant
• Nominal Delay: τ(n) = user-specified


Simulation: Models
Advanced Logic Simulators:

• Introduction of signal strength in addition to logic values for driver and bus modelling:

A : active, e.g. low impedance driver
P : passive, e.g. high impedance driver (depletion load)
S : storing, e.g. capacitive stored state
X : active indeterminate (e.g. active or storing)
Y : passive indeterminate (e.g. passive or storing)
Z : high impedance

• Instead of simple logical values, signals are used for simulation. A signal consists of a logical value and a
strength.
• Logical values = {0,1,X}
• 16 states

Overview on signal combinations (the table is symmetric, only the upper triangle is shown):

     A0 A1 AX P0 P1 PX S0 S1 SX X0 X1 XX Y0 Y1 YX ZZ
A0   A0 AX AX A0 A0 A0 A0 A0 A0 A0 AX AX A0 A0 A0 A0
A1      A1 A1 A1 A1 A1 A1 A1 A1 AX A1 AX A1 A1 A1 A1
AX         AX AX AX AX AX AX AX AX AX AX AX AX AX AX
P0            P0 PX PX P0 P0 P0 X0 XX XX P0 PX PX P0
P1               P1 PX P1 P1 P1 XX X1 XX PX P1 PX P1
PX                  PX PX PX PX XX XX XX PX PX PX PX
S0                     S0 SX SX X0 XX XX Y0 YX YX S0
S1                        S1 SX XX X1 XX YX Y1 YX S1
SX                           SX XX XX XX YX YX YX SX
X0                              X0 XX XX X0 X0 XX X0
X1                                 X1 XX X1 XX XX X1
XX                                    XX XX XX XX XX
Y0                                       Y0 YX YX Y0
Y1                                          Y1 YX Y1
YX                                             YX YX
ZZ                                                ZZ
Simulation: Models

Example: Driver Modelling:

Competing Drivers at a Bus


Simulation

www.modelsim.com

Simulation: Techniques

Simulation Techniques:

• Compiler-driven technique:
– Problems:
• Feedbacks
• Sorting of gate netlist
• Zero delay model
• Entire circuit is simulated

• Event-driven simulation ...
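
In contrast to the compiler-driven technique, an event-driven simulator evaluates only those gates whose inputs have changed. The following sketch shows the idea for a unit delay model. The netlist format, the gate set and the function name `simulate` are assumptions made for this illustration, not the interface of a real simulator.

```python
import heapq

# Gate functions of the (hypothetical) primitive library.
GATES = {
    "AND": lambda ins: int(all(ins)),
    "OR": lambda ins: int(any(ins)),
    "NOT": lambda ins: 1 - ins[0],
    "NAND": lambda ins: 1 - int(all(ins)),
}


def simulate(netlist, values, stimuli, delay=1, max_time=100):
    """Event-driven simulation with one common gate delay.

    netlist: {output_net: (gate_type, [input_nets])}
    values:  dict with the initial value of every net (updated in place)
    stimuli: list of (time, net, value) events on primary inputs
    Returns the list of value changes as (time, net, value).
    """
    fanout = {}
    for out, (_, ins) in netlist.items():
        for net in ins:
            fanout.setdefault(net, []).append(out)
    queue = list(stimuli)
    heapq.heapify(queue)
    trace = []
    while queue and queue[0][0] <= max_time:
        now = queue[0][0]
        changed = set()
        # First apply all events of the current time step ...
        while queue and queue[0][0] == now:
            _, net, value = heapq.heappop(queue)
            if values[net] != value:
                values[net] = value
                trace.append((now, net, value))
                changed.add(net)
        # ... then evaluate only the gates driven by a changed net.
        for out in sorted({o for net in changed for o in fanout.get(net, [])}):
            kind, ins = netlist[out]
            result = GATES[kind]([values[net] for net in ins])
            heapq.heappush(queue, (now + delay, out, result))
    return trace


netlist = {"y": ("NAND", ["a", "b"]), "z": ("NOT", ["y"])}
values = {"a": 0, "b": 0, "y": 1, "z": 0}
print(simulate(netlist, values, [(0, "a", 1), (0, "b", 1)]))
```

The `max_time` limit is needed because a circuit with feedback may keep generating events forever.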

Switch-Level Simulation:

• well suited to simulate digital MOS circuits
• no fixed direction of signal flow
• transistor modeled as a switch with three states: open, closed, unknown
• algebraic or RC models


Simulation: MOS Transistor Model


Ideal Switch Transistor Model:

Logic    n-Channel      p-Channel
(Gate)   Enhancement    Enhancement    Depletion
1        Closed         Open           Weak
0        Open           Closed         Weak
X        Unknown        Unknown        Weak

Remarks:
• Switch transition time is assumed to be zero or some nominal value
• Unknown states can cause problems

Linear Switch Transistor Model:

Logic    n-Channel          p-Channel
(Gate)   Enhancement        Enhancement        Depletion
1        REFF               infinity           REFF
0        infinity           REFF               REFF
X        [REFF, infinity]   [REFF, infinity]   REFF

Remarks:
• In the linear model, node capacitance and device resistance are used to compute output
logic levels and transition time
• Ratio errors can be detected
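
How the linear model yields logic levels and transition times can be illustrated with a ratioed inverter. This is a first-order sketch under assumptions not given on the slide: the output low level is taken from a resistive divider between pull-up and pull-down REFF, the transition time is approximated by the RC time constant, and all component values are hypothetical.

```python
def output_low_level(vdd, r_pullup, r_pulldown):
    """Output voltage when pull-up and pull-down conduct at the same time."""
    return vdd * r_pulldown / (r_pullup + r_pulldown)


def transition_time(r_eff, c_node):
    """First-order estimate: RC time constant of the switching node."""
    return r_eff * c_node


def has_ratio_error(vdd, r_pullup, r_pulldown, v_low_max):
    """A ratio error occurs if the low level is too high to be read as logic 0."""
    return output_low_level(vdd, r_pullup, r_pulldown) > v_low_max


# Hypothetical values: 5 V supply, 40 kOhm load, 10 kOhm driver, 50 fF node.
print(output_low_level(5.0, 40e3, 10e3))       # low level of the inverter
print(transition_time(40e3, 50e-15))           # rising edge through the load
print(has_ratio_error(5.0, 10e3, 10e3, 1.5))   # 1:1 ratio gives 2.5 V
```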
Executable Specifications: VHDL
VHDL: Very High Speed Integrated Circuit Hardware Description Language

Different types of modelling:
• Data Flow
• Behaviour
• Structure

VHDL is used for:
• Modelling
• Simulation
• Hardware Synthesis

Example (excerpt):

architecture structural of first_tap is
  signal x_q, red : std_logic_vector(bitwidth-1 downto 0);
  signal mult     : std_logic_vector(2*bitwidth-1 downto 0);
begin
  -- input register with asynchronous reset
  delay_register:
  process(reset, clk)
  begin
    if reset = '1' then
      x_q <= (others => '0');
    elsif (clk'event and clk = '1') then
      x_q <= x_in;
    end if;
  end process;

  -- product of the coefficient and the registered input
  mult <= signed(coef) * signed(x_q);


Design Flow: IC Design with High-Level-Entry


[Figure: IC design flow with high-level entry]

• VHDL description (the first_tap architecture excerpt shown on the previous slide)
• RTL synthesis (Synopsys)
• Gate-level netlist
• Placement & routing (Cadence/Mentor)
• ASIC layout
• Production

Future Outlook: Networks-on-Chip
– Regular platform integrating independent subsystems
• combine structures of today's SoC complexity
– Separation between communication and computation

[Figure: NoC platform; labels: µP, ASIC, FPGA, MEM, generic interface, router, high-speed interconnect]


NoC-based design flow: Hardware/Software Co-Design

[Figure: classical flow compared with NoC-based flow]

• Classical flow: specification, co-simulation, HW/SW partitioning, communication synthesis,
HW specification (synthesis, placement/routing) and SW specification (compilation, real-time OS),
heterogeneous HW/SW system
• NoC-based flow: specification, implementation with SW library and HW library, NoC mapping,
NoC placement, dynamic allocation/re-mapping during operation
Application Scenario: Mobile Video Terminal
Different Configurations for:
• High Quality (Resolution) Downstreaming
• Low-Power Mode (Quality Reduction)
• Image Compression and Upstreaming
• Multi-Stream Modes

[Figure: mobile service base station(s) and single-chip mobile terminal; terminal blocks: RF,
central CTRL, display CTRL, display]

12. Digital Subsystem Design

Institute of
Microelectronic
Systems

Weinberger Structuring

Weinberger structuring is a structured approach that simplifies structural layout and improves
layout density. The method was presented by Weinberger in 1967.
Weinberger Arrays:
• Are created by placing transistors on the chip in a geometrically
regular manner. Horizontal and vertical interconnect patterns are used
to wire the devices together.
• Using one type of gate (e.g. NOR), complex NMOS circuits can be
realized.
• The regularity of Weinberger arrays is very suitable for automatic layout
generation.

Institute of
Microelectronic
12: Digital Design Systems 2
Weinberger Structuring (2)

Example of NOR gate reduction for Weinberger structuring:

$F = \overline{A + B + C}$

• Empty squares = input connections
• Filled squares = output connections

Example: 3-to-8 decoder

Weinberger structuring:

3-to-8 decoder (2)


3-to-8 decoder (3)

Example 2

F =U +V +W + X +Y

Random logic implementation

Weinberger NOR array representation


Example 2 (2)

Weinberger stick diagram

Example 2 (3)

Weinberger array structure: (a) schematic (b) layout



Gate matrix layout

Gate matrix layout is a character based layout style for custom CMOS
circuitry. It is a regular design style employing a matrix of intersecting
transistor diffusion rows and poly-silicon columns such that intersections
are potential transistor sites.
Creating a gate matrix: start from a representational line drawing (stick figure)
that uses the available levels of interconnection, e.g. poly-silicon, metal and
diffusion in a poly-silicon gate technology.
– First draw a series of parallel poly lines corresponding to the
number of inputs to the circuit (there may be more if an output is chosen to
be poly-silicon).
– Subsequent transistor placements are determined by two factors:
the input column and the serial or parallel association among transistors.
– After row definition, further interconnections may be made with horizontal
and vertical metal interconnection tracks.
– Final improvements.

Gate matrix layout (2)

Gate matrix layout:


(a) Schematic
(b) Layout
(c) Optimized layout of N part

Example: half adder

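$C = AB = \overline{\overline{AB}}$

$S = A\overline{B} + \overline{A}B = (\overline{A} + \overline{B})\,B + (\overline{A} + \overline{B})\,A$

$\quad = \overline{AB}\,B + \overline{AB}\,A = \overline{\overline{\overline{AB}\,B} \cdot \overline{\overline{AB}\,A}}$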
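
The NAND-only form of the half adder can be checked exhaustively against the sum and carry definitions. The short sketch below does this; the helper names are chosen for this illustration only.

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


def half_adder_nand(a, b):
    """Half adder built from NAND gates only, following the equations above."""
    n1 = nand(a, b)                      # NOT(A·B), shared by sum and carry
    carry = nand(n1, n1)                 # NAND used as inverter: C = A·B
    total = nand(nand(n1, b), nand(n1, a))
    return total, carry


for a, b in product((0, 1), repeat=2):
    s, c = half_adder_nand(a, b)
    print(a, b, "->", s, c)
```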

Half adder realizations

(a) Standard cell


(b) Gate matrix


Character definitions for symbolic layout

N n-channel transistor
P p-channel transistor
+ metal-poly or metal-diffusion crossover
* contact
| poly-silicon or n-diffusion wire
! p-diffusion wire
: vertical metal
- horizontal metal

Character definitions (cont.)


Rules
The following rules summarize the gate-matrix technique:
– Poly-silicon runs only in one direction and is of constant width and pitch
– Diffusion wires (of constant width) may run vertically between poly-silicon
columns.
– Metal may run horizontally and vertically. Any pitch departures from a
minimum (e.g. power rails) are manually specified.
– Transistors can only exist on poly-silicon columns.
– Wide transistors may be specified by abutting two or more N or P
symbols.

Summary of gate matrix properties

☺ regular design style
☺ technology updateable
☺ modularity is encouraged by the block nature of the layout style
☺ circuit extraction may be done at the symbolic level or at the mask
level by conventional circuit extraction
☹ the character-based symbolic description is not hierarchical: modules must
be assembled in their entirety and ''pasted'' together at the mask
level
☹ no freedom to locally optimize geometry, e.g. transistor size


Optimal CMOS complex gate layout

In MOS circuit design, complex functional cells can be used to achieve better
performance. In this section, the implementation of a random logic function on
an array of CMOS transistors is discussed. The method was presented by
Uehara and van Cleemput in 1981. A graph theoretical approach for
systematic and efficient layout generation minimizes the required chip
area.

EXOR implementation

(a) Logic diagram


(b) Circuit
(c) Layout


CMOS Functional cells (Complex gates)

Advantages of complex-gate approach:


– better performance
– smaller size

Complex gates (2)
In the following, the consideration is limited to AND/OR networks realized in
complex gate CMOS by means of series/parallel connections of transistors. The
topologies of the NMOS network and the PMOS network are assumed to be dual.
The delay of a complex CMOS cell mainly depends on the maximum number of
series transistors between VDD or VSS and the cell output, which is called the level
of the complex cell. This quantity has a direct influence on the charging or
discharging resistance of the cell. Generally, cells with fewer than four levels are
desirable. The number of cells with parallel/serial topology is given by the
following table:

It is reasonable to use mainly cells with three levels and only sometimes cells
with four levels in order to get sufficient performance.


Alternative EXOR implementation

Basic layout strategy


Layout strategy (2)


Layout properties:
– two rows of transistors, for the PMOS and NMOS parts of the circuit
– equal number of transistors in both rows
Optimizations: If the metal connections between adjacent transistors are
replaced by diffusion (designer should be careful in doing this for high-
speed circuits) the following layout (a) is achieved.

Optimized layout
An even more sophisticated layout arrangement which reduces the
required area is shown in (b).

area = width * height

with
height = const.
width = basic grid size * (#inputs + #separations + 1)

A separation is required when there is no connection between physically adjacent
transistors.
An optimal layout is obtained by reducing the number of separations.
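
The width formula can be evaluated directly once the transistor chains are known. In the sketch below the number of separations is taken as the number of diffusion chains minus one, i.e. one separation between each pair of neighbouring chains. The grid size and the chains are hypothetical values chosen for illustration.

```python
def layout_width(grid, chains):
    """Width of a two-row complex gate layout.

    grid:   basic grid size (one poly column pitch)
    chains: list of transistor chains, each a list of gate input names;
            transistors within a chain share diffusion, chains are separated.
    """
    n_inputs = sum(len(chain) for chain in chains)
    n_separations = len(chains) - 1
    return grid * (n_inputs + n_separations + 1)


GRID = 8  # hypothetical grid size in layout units

print(layout_width(GRID, [["a", "b", "c", "d", "e"]]))      # single chain
print(layout_width(GRID, [["a", "b"], ["c", "d"], ["e"]]))  # three chains
```

For the same five inputs, the single chain needs 6 grid units while the three-chain arrangement needs 8.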


Optimal layout
The best layout is achieved by the following transistor arrangement,
logically equivalent to the previous figures:

Graph theoretical algorithm
The p-side and the n-side of the circuit can be formulated as graphs,
defined as:
GP = (VP, EP)    p-side network
GN = (VN, EN)    n-side network
Graph properties:
– the graphs are series/parallel graphs (CMOS complex gate
property/assumption)
– every source/drain potential is represented by a vertex V
– every transistor is represented by an edge E, connecting the vertices
representing source and drain
– edges are labeled by the corresponding transistor gate input signal
– GP and GN are dual


Graph theoretical algorithm (2)


If two edges Ei and Ej are adjacent in the graph model, then it is possible
to place the corresponding gates in physically adjacent positions of an
array and hence connect them by a diffusion area. In order to minimize
the number of separations, a minimum-size set of paths has to be found,
which corresponds to chains of transistors in the array.

Definition 1: An Euler path is a single (uninterrupted) path on a graph
that covers every edge of the graph exactly once.

If there exist Euler paths for GN and GP, then all transistors can be chained
by diffusion areas. Otherwise the graphs have to be partitioned into sub-
graphs which have Euler paths.
It is necessary to find a pair of paths for GP and GN with the same
sequence of labels, because p- and n-type transistors corresponding to
the same input have to be positioned at the same horizontal position
(poly line).
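
For small gates, the search for a common label sequence can be sketched by brute force: enumerate the label sequences of all Euler paths of GN and of GP and intersect the two sets. This is only an illustration with exponential running time, not the algorithm of the lecture. The example gate F = NOT(A·B + C) and the vertex names are assumptions made for this sketch.

```python
def euler_label_sequences(edges):
    """Return the label sequences of all Euler paths of an undirected multigraph.

    edges: list of (label, u, v); every transistor is one edge between its
    source and drain vertices, labelled with its gate input signal.
    """
    found = set()
    used = [False] * len(edges)

    def extend(vertex, labels):
        if len(labels) == len(edges):
            found.add(tuple(labels))     # every edge covered exactly once
            return
        for i, (label, u, v) in enumerate(edges):
            if used[i] or vertex not in (u, v):
                continue
            used[i] = True
            labels.append(label)
            extend(v if vertex == u else u, labels)
            labels.pop()
            used[i] = False

    for start in {x for _, u, v in edges for x in (u, v)}:
        extend(start, [])
    return found


# n-side: A and B in series, in parallel with C, between OUT and GND.
GN = [("A", "OUT", "n1"), ("B", "n1", "GND"), ("C", "OUT", "GND")]
# p-side (dual): A parallel to B, in series with C, between VDD and OUT.
GP = [("A", "VDD", "p1"), ("B", "VDD", "p1"), ("C", "p1", "OUT")]

common = sorted(euler_label_sequences(GN) & euler_label_sequences(GP))
print(common)
```

Every sequence in `common` is an input ordering for which both transistor rows can be chained by diffusion without a separation.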
Graph theoretical algorithm (3)
General algorithm:
– enumerate all possible decompositions of the graph model to find the
minimum number of Euler paths that cover the graph
– chain the gates by means of a diffusion area according to the order of the
edges in each Euler path
– if two or more Euler paths are necessary to cover the graph model,
provide a separation area between each pair of chains
Result: The search for the minimal number of Euler paths is NP-complete.
Problem reduction:
An odd number of series or parallel edges can be reduced to a single edge:


Problem reduction

Definition 2: The reduced graph is obtained by iteratively replacing an
odd number of series (parallel) edges by a single edge, until no further
reduction is possible.
Theorem 1: If there is an Euler path in the reduced graph, then there
exists an Euler path in the original graph.
Proof: It is possible to reconstruct an Euler path in the original graph by
replacing each edge of the Euler path in the reduced graph by a sequence
of the original odd number of edges.
Theorem 2: If the number of inputs to every AND/OR element is odd,
then:
– the corresponding graph model has a single Euler path
– there exists a graph model such that the sequence of edges on an Euler
path corresponds to the vertical order of inputs on a planar
representation of the logic diagram.

Problem reduction (2)

If there are gates in the logic diagram with an even number of inputs, additional
“pseudo” inputs have to be introduced in order to guarantee an odd number of
inputs. It is guaranteed by the second previously given theorem, that there exists
an Euler path for this modified problem. But the pseudo edges in the Euler path
have to be removed afterwards and then they can cause diffusion separations.
An algorithm for minimizing separations caused by pseudo edges is given in the
next section ( minimal interlace of normal and pseudo inputs).


Problem reduction (3)

The heuristic algorithm for generating an Euler path is given by:


1. To every gate with an even number of inputs a “pseudo” input is added
2. Add this new input to the gate such that the planar representation of the
logic diagram shows a minimal interlace of “pseudo” and real inputs. It
should be noted that a “pseudo” input at the top or at the bottom of the
logic diagram does not contribute to the separation areas.
3. Construct the graph model such that the sequence of edges corresponds
to the vertical order of inputs on the planar logic diagram.
4. Chain together the gates by means of diffusion areas, as indicated by the
sequence of edges on the Euler path. “Pseudo” edges indicate separation
areas.
5. The final circuit topology can be derived by deleting “pseudo” edges in
parallel with other edges and by contracting “pseudo” edges in series with
other edges.

Application of reduction rule

(a) Logic diagram


(b) Graph model and
its reduction
(c) Reconstruction of
an Euler path


Application of heuristic algorithm

This heuristic algorithm does not necessarily give the optimal layout, but if
the resulting sequence has no separation areas, it is the real optimal
solution.

(a) New inputs p1 and p2 are added


(b) Optimal sequence of inputs without the interlace of p1 and p2
(c) Circuit with the dual path {p1,2,3,1,4,5,p2}

Algorithm for calculating minimal interlace
[Flowchart: algorithm for calculating minimal interlace, with an example of a line]

start
1. Any white triangle left? Yes: put it in the line. No: go to the next question.
2. Any black/white triangle left? Yes: put it in the line, and set the white part on top. No: go to the next question.
3. Any black triangle left? Yes: put it in the line. No: go to the next question.
4. Any black/white triangle left? Yes: put it in the line, and set the black part on top. No: go to the next question.
5. Any white triangle left? No: stop.

Application example for minimal interlace algorithm

Example: carry look-ahead

This implementation has no Euler path!



Alternative carry look-ahead topology

This topology does have an Euler path!

Comparison of space

(a) Functional cell realization


(b) Conventional NAND realization


Standard cell layout

Example: synchronous counter


Programmable Logic Arrays (1)

• Map a set of Boolean functions in canonical, two-level sum-of-products
form into a geometrical structure
• Consist of an AND-plane and an OR-plane
• For every input variable in the Boolean equations, there is an
input signal to the AND-plane
• The AND plane produces a set of product terms by performing an
AND operation
• The OR plane generates output signals by performing an OR
operation on the product terms fed by the AND plane

Programmable Logic Arrays (2)


Programmable Logic Arrays (3)

• PLA (Programmable Logic Array):


– AND and OR array are programmable
– every product term of the AND array can be connected to any of the
OR output gates
• PAL (Programmable Array Logic):
– AND array is programmable
– OR array has fixed connection points (OR gates)
• PROM (Programmable Read Only Memory):
– AND array hardwired
– OR array programmable
– Set of all possible product terms is realized

Architectures (1)


Architectures (2)

Example (1)

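• PROM implementation realizes all of the 8 product terms

x0 x1 x2 | z0 z1
 0  0  0 |  1  1
 0  0  1 |  1  1
 0  1  0 |  0  0
 0  1  1 |  0  0
 1  0  0 |  0  0
 1  0  1 |  0  0
 1  1  0 |  1  0
 1  1  1 |  0  1

$z_0 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 \overline{x_2} = \overline{x_0}\,\overline{x_1} + x_0 x_1 \overline{x_2}$

$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 x_2 = \overline{x_0}\,\overline{x_1} + x_0 x_1 x_2$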


Example (2)

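• PLA implementation needs only 3 product terms

x0 x1 x2 | z0 z1
 0  0  X |  1  1
 1  1  0 |  1  0
 1  1  1 |  0  1

$z_0 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 \overline{x_2} = \overline{x_0}\,\overline{x_1} + x_0 x_1 \overline{x_2}$

$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0 x_1 x_2 = \overline{x_0}\,\overline{x_1} + x_0 x_1 x_2$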
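
That the three product terms of the PLA are sufficient can be checked by evaluating the AND plane and the OR plane for all eight input combinations and comparing with the truth table of the PROM example. The string encoding of the planes and the function name `evaluate_pla` are choices made for this sketch.

```python
from itertools import product

# AND plane: one row per product term, one column per input (x0, x1, x2).
# "1" = true literal, "0" = complemented literal, "X" = input not used.
AND_PLANE = ["00X", "110", "111"]
# OR plane: one row per product term, one column per output (z0, z1).
OR_PLANE = ["11", "10", "01"]

# Truth table of the example: (x0, x1, x2) -> (z0, z1).
TRUTH_TABLE = {
    (0, 0, 0): (1, 1), (0, 0, 1): (1, 1), (0, 1, 0): (0, 0), (0, 1, 1): (0, 0),
    (1, 0, 0): (0, 0), (1, 0, 1): (0, 0), (1, 1, 0): (1, 0), (1, 1, 1): (0, 1),
}


def evaluate_pla(inputs):
    """Evaluate the AND plane, then OR the selected product terms per output."""
    terms = [
        all(c == "X" or int(c) == x for c, x in zip(row, inputs))
        for row in AND_PLANE
    ]
    n_outputs = len(OR_PLANE[0])
    return tuple(
        int(any(t and row[k] == "1" for t, row in zip(terms, OR_PLANE)))
        for k in range(n_outputs)
    )


for x in product((0, 1), repeat=3):
    print(x, evaluate_pla(x), evaluate_pla(x) == TRUTH_TABLE[x])
```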

Floor Plan for PLA
A AND plane programming cell
O OR plane programming cell
AO AND-OR communication cell
IN AND plane input cell
OUT OR plane output cell
LA left AND plane cell
RO right OR plane cell
BL bottom left cell
BM bottom middle cell
BR bottom right cell
TL top left cell
TA top AND cell
TM top middle cell
TO top OR cell
TR top right cell

[Figure: PLA generic floor plan]


Static nMOS and Pseudo-nMOS PLA

• nMOS PLA: Pull-up network realized by a single nMOS depletion transistor
• Pseudo nMOS PLA: Pull-up by a high resistance pMOS transistor
with permanently grounded gate input

• But: AND-OR structure not suited to MOS circuit technology
• Therefore: AND and OR planes are implemented through NOR or
NAND gate structures
• The transformation is based on deMorgan's law
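
The equivalence of the AND-OR form and the INV-NOR-NOR-INV form can be verified exhaustively for a small example. The function z = a·b + c·d and the helper names below are chosen for this sketch only.

```python
from itertools import product


def inv(x):
    return 1 - x


def nor(*xs):
    return int(not any(xs))


def and_or(a, b, c, d):
    """Two-level sum of products: z = a·b + c·d."""
    return (a & b) | (c & d)


def inv_nor_nor_inv(a, b, c, d):
    """Same function with inverted inputs, two NOR planes and an output inverter."""
    p1 = nor(inv(a), inv(b))     # NOR of inverted inputs equals a·b (deMorgan)
    p2 = nor(inv(c), inv(d))     # equals c·d
    return inv(nor(p1, p2))      # inverted NOR equals OR of the product terms


mismatches = [
    v for v in product((0, 1), repeat=4) if and_or(*v) != inv_nor_nor_inv(*v)
]
print("mismatches:", mismatches)
```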

INV-NOR-NOR-INV Structure (1)

Transformation according to deMorgan’s law:


INV-NOR-NOR-INV Structure (2)

Example:

General structure:

INV-NOR-NOR-INV Structure (3)

Properties:
• high static power dissipation
• small area
• useful if high speed is not required


INV-NOR-NOR-INV Structure (4)

Pseudo nMOS NOR-NOR PLA circuit

INV-NOR-NOR-INV Structure (5)

PLA implementation in pseudo nMOS logic


INV-NOR-NOR-INV Structure (6)

Stick diagram of an nMOS PLA

NAND-NAND Structure (1)

Transformation according to deMorgan’s law:

Example:
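
The figure for this slide is not reproduced here. As a hedged worked instance, applying De Morgan's law to a generic two-term sum of products gives the NAND-NAND form below; the symbols a, b, c, d are placeholders and are not taken from the slide.

```latex
\begin{aligned}
z &= a\,b + c\,d \\
  &= \overline{\overline{a\,b + c\,d}} \\
  &= \overline{\overline{a\,b}\cdot\overline{c\,d}}
\end{aligned}
```

The inner complemented products correspond to the NAND gates of the first plane, and the outer complement over their product corresponds to the NAND gate of the second plane.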


NAND-NAND Structure (2)

Properties:
• NAND-NAND approach not recommended:
  – performance degrades as the number of inputs increases (because
    of the series connection of nMOS transistors)
  – high static power dissipation

Static CMOS PLA (1)

• NOR gates with a large number of inputs should be avoided in
  CMOS (because the p-channel devices are in series)
• Static CMOS PLAs are usually realized in a NAND-INV-INV-NAND
  structure in order to avoid long chains of pMOS transistors

Properties:
• no static power dissipation
• area increase becomes unacceptable for large PLAs
• fast operation


Static CMOS PLA (2)

PLA NAND-INV-INV-NAND implementation

Static CMOS PLA Layout


Dynamic CMOS PLA (1)

• smaller area than static CMOS
• fast
• 2-phase clocking
• Phase Φ1 = 1:
  – no path to ground
  – inputs change
  – both NOR planes are precharged
• Phase Φ1 = 0:
  – first NOR plane discharges
  – dummy row: worst-case discharge (prevents the second NOR plane from
    discharging prematurely)
  – after the first NOR plane has settled, the second plane evaluates

Dynamic CMOS PLA (2)

• Φ2 is used to latch the second stage
• An intermediate clock is required to precharge the OR plane
  – generated by the cells TL, TA and TM
  – uses a dummy product row that discharges at the worst-case rate
    according to the loading of the AND array


Dynamic CMOS PLA (3)

Dynamic 2-phase PLA circuit

Noise in PLA circuits (1)

• Noise problems on switched supply lines in dynamic PLAs
• The discharge current generates transients on the power supply
  bus
• To reduce noise: ground the PLA locally; use metal lines for the
  power supply whenever possible (reduced impedance)


Noise in PLA circuits (2)

Optimization of PLAs – Logic Minimization

• optimization (minimization) of Boolean equations in order to
  reduce the number of product terms or literals
• decoder in front of the AND plane to generate combined input
  variables
• if a term is needed both positive and negative, a reduction can
  sometimes be achieved by using negative logic

Example: z = x1 + x0x1’x2’ + x0’x1’x2                3 product terms

         z’ = (x1 + x0x1’x2’ + x0’x1’x2)’
            = x1’(x0x1’x2’)’(x0’x1’x2)’
            = x1’(x0’ + x1 + x2)(x0 + x1 + x2’)
            = (x0’x1’ + x1’x2)(x0 + x1 + x2’)
            = x0x1’x2 + x0’x1’x2’                    2 product terms
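
A complement-based reduction like this is easy to get wrong by hand, so a truth-table check is a cheap safeguard. The sketch below lists the input combinations where z = 0 and confirms that the two product terms x0 x1' x2 and x0' x1' x2' cover exactly those combinations. Function names are illustrative assumptions.

```python
from itertools import product


def z(x0, x1, x2):
    """Original function: z = x1 + x0 x1' x2' + x0' x1' x2."""
    return x1 | (x0 & (1 - x1) & (1 - x2)) | ((1 - x0) & (1 - x1) & x2)


def z_complement(x0, x1, x2):
    """Two-term cover of the complement: z' = x0 x1' x2 + x0' x1' x2'."""
    return (x0 & (1 - x1) & x2) | ((1 - x0) & (1 - x1) & (1 - x2))


# Input combinations (x0, x1, x2) for which z evaluates to 0.
ZEROS = [x for x in product((0, 1), repeat=3) if z(*x) == 0]

# The cover is valid if it is 1 exactly where z is 0.
IS_COMPLEMENT = all(
    z_complement(*x) == 1 - z(*x) for x in product((0, 1), repeat=3)
)
print(ZEROS, IS_COMPLEMENT)
```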


Optimization of PLAs – Folding

Row-folded PLA

PLA before folding

Column-folded PLA

Optimization of PLAs – Multi Sided Access

Multi sided input/output access

• An advantage of multi-sided access and folding is the decreased
  layout area, but the layout structure changes and the wiring
  becomes more difficult.


Timing & Power Dissipation of a Static PLA

• Delay is determined by
  – (W/L) of the AND/OR load
  – (W/L) of the AND/OR cells
• Minimum delay:
  – large load current Iload
  – (W/L) of the OR plane = e · (W/L) of the AND plane
• Limitations:
  – Iload is limited by:
    • the total power of the PLA
    • the internal logic ‘0’ level: Iload · RnMOS must stay below VT
  – the stage sizing factor e for successive stages cannot always be
    realized due to the floorplan

Automatic PLA Layout Generation (1)

Input: Boolean equations
  ↓ logical optimization
Truth table = matrix
  ↓ floorplanner (structure of PLA)
Output: layout with mask data

Cells used by the floorplanner:
• input/output buffer
• clock driver
• VDD/VSS cells
• Schmitt trigger
• …


Automatic PLA Layout Generation (2)

Example: PLA generator input file

PLA adderpla;
INPUT: I1,I2,I3;
OUTPUT: O1,O2;
PRODUCT: P1,P2,P3,P4,P5,P6,P7;

AND_BEGIN
P1 := I1 * I2;
P2 := I1 * I3;
P3 := I2 * I3;
P4 := I1 * I2' * I3';
P5 := I1' * I2 * I3';
P6 := I1' * I2' * I3;
P7 := I1 * I2 * I3;
AND_END

OR_BEGIN
O1 := P1 + P2 + P3;
O2 := P4 + P5 + P6 + P7;
OR_END

Truth table matrix (optimized intermediate result):

I1 I2 I3   O1 O2
 1  1  X    1  0
 1  X  1    1  0
 X  1  1    1  0
 1  0  0    0  1
 0  1  0    0  1
 0  0  1    0  1
 1  1  1    0  1
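
The truth table matrix of this example can be evaluated directly, which shows how the AND plane and the OR plane work together. The sketch below stores one row per product term P1..P7 and checks that the PLA behaves as a 1-bit full adder (O1 = carry, O2 = sum). The row encoding and the function name are illustrative assumptions, not the format of an actual PLA generator.

```python
from itertools import product

# One row per product term: (AND-plane part, OR-plane part).
# AND plane: '1' = true literal, '0' = complemented literal, 'X' = not connected.
# OR plane:  '1' = the product term is connected to that output.
ROWS = [
    ("11X", "10"),  # P1 = I1 * I2
    ("1X1", "10"),  # P2 = I1 * I3
    ("X11", "10"),  # P3 = I2 * I3
    ("100", "01"),  # P4 = I1 * I2' * I3'
    ("010", "01"),  # P5 = I1' * I2 * I3'
    ("001", "01"),  # P6 = I1' * I2' * I3
    ("111", "01"),  # P7 = I1 * I2 * I3
]


def evaluate_pla(rows, inputs):
    """Return the output tuple of the PLA for one input combination."""
    outputs = [0] * len(rows[0][1])
    for and_part, or_part in rows:
        # A product term is active if every connected literal matches.
        active = all(c == "X" or int(c) == bit for c, bit in zip(and_part, inputs))
        if active:
            for k, c in enumerate(or_part):
                if c == "1":
                    outputs[k] = 1
    return tuple(outputs)


# A full adder produces carry = sum // 2 and sum bit = sum % 2.
IS_FULL_ADDER = all(
    evaluate_pla(ROWS, bits) == (sum(bits) // 2, sum(bits) % 2)
    for bits in product((0, 1), repeat=3)
)
print(IS_FULL_ADDER)
```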

13. ASIC Design Concepts:
Gate Arrays


Cost Issues

• Design Costs
• Non-recurring Engineering Costs (NRE)
• Manufacturing Costs

[Figure: total costs and costs per chip versus number of manufactured chips;
design costs + NRE costs = fixed costs]

Cost Issues: Design Costs

Design costs are reduced by:
• raising the level of abstraction
• re-use
• powerful synthesis methods

Cost-affecting decisions:
• System level:
  – System architecture
  – Communication architecture
• Block level:
  – appropriate modeling of control-dominated and data-path-oriented
    components

Synthesis:
• High-level Synthesis (allocation, scheduling, binding)
• Logic Synthesis (RTL to logic translation, FSM synthesis, logic optimisation, retiming)
• Layout Synthesis (module generators, PLA generators, Place & Route)


Cost Issues: Manufacturing Costs

...depending on Design Style:

ASIC
  Semi Custom
    Cell-based
      Standard Cells
      (synthesized) Macro Cells
    Array-based
      Gate Arrays
      FPGAs/PLDs
  Full Custom

Gate Arrays – Introduction (1)

Gate Arrays (Masterslices):


• Prefabricated active elements (master)
• Construction of logic functions by personalization (wiring macros
from a cell library, intra-cell routing)
• Connection of functional blocks by inter-cell routing in 1...3 layers
plus contact/via layers
• Arrangement of gate arrays:
– row structure
– island structure
– matrix of structures (= sea of gates)
• Mixed analog/digital gate arrays


Gate Arrays – Introduction (2)

Gate array floor plan with row structure

Gate Arrays – Introduction (3)

Floor plan for a sea of gates array


IMI Grid Structure (1)

IMI gate array structure

IMI Grid Structure (2)

The figure on the previous slide shows the basic structure of
gate arrays of International Microcircuits Inc. (IMI) (single metal
layer). The real circuit has 1440 cells. In the figure, a reduced
number of 40 cells is drawn in order to improve the clarity of the
representation.

The gate array consists of the following elements:


• Pad (connection to outside world)
• Buffer devices (drive off-chip load capacitances)
• Distributed power and ground buses
• Underpasses to cross under the power and ground buses without
contacting them
• Each point represents a contact (potential interconnection point)


IMI Grid Structure (3)

Corner of IMI gate array die


IMI Grid Structure (4)

From the figure on the previous slide the following features can be
seen:
• Cells containing transistors are clustered around the VDD and VSS
buses
• In each cell four horizontal bars (crossing VDD and VSS) can be
seen. The thick bar represents a poly underpass while the three
thin bars are common poly input lines to an nMOS/pMOS
transistor pair
• Between cell columns a column of short horizontal poly
underpasses is placed


IMI Grid Structure (5)

Grid representation of IMI gate array

IMI Grid Structure (6)

Explanation of the grid:
a) basic cell (VSS = GND)
b) internal interconnects
   - internal gates = short horizontal poly lines
   - internal diffusion = short horizontal diffusion lines
c) basic cell and crossover (poly) block


IMI Grid Structure (7)


[Figure labels: metal power bars; polysilicon underpass; diffusion areas with
3 nMOS transistors and 3 pMOS transistors with common drain/source terminal;
routing channel with horizontal polysilicon underpasses (possibility of
underpassing (vertical) metal lines in this area)]

IMI Grid Structure (8)

Explanation of the grid (continued):


d) XR = transistor
- adjacent nMOS and pMOS
transistors have a common
drain/source connection
- contacts for the nMOS source and
drain connections are on both sides
of the VSS bus (same for pMOS
transistors and VDD bus)
e) crossover block interconnects


IMI Grid Structure (9)

Symbolic IMI cell structure representation

IMI Grid Structure (10)

CMOS matrix cell


CDI Grid Structure

CDI single metal layer gate array structure


Gate Array Design Flow


Personalization Examples (1)

Personalization of IMI and CDI gate arrays for an inverter:
a) schematic
b) IMI layout
c) IMI layout
d) CDI layout

Personalization Examples (2)

NOR gate on IMI


Personalization Examples (3)

Layout of transmission gates (TG):


a) single TG
b) pair of TGs with common output

Qualification of Gate Array Design Style

• Advantages:
– Lower number of individual masks needed
– Higher number of pieces for uncustomized master (cost reduction)
– Many others for masters, second source fabrication, libraries and
design systems
• Disadvantages:
– Area overhead (by unused transistor cells)
– Overdimensioned routing channels
– Larger cell size

→ Advantages dominate for smaller production volumes


Costs: Full Custom vs. Gate Array

[Figure: total costs and costs per chip versus number of manufactured chips
for gate array and full custom; design costs + NRE costs = fixed costs]

• Gate Arrays: Reduction of fixed costs (reduced mask costs)
• Increased per-piece costs, since the utilisation of transistors is not optimal;
  the resulting larger chip area and lower yield imply higher costs
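
The trade-off between fixed costs and per-piece costs can be made concrete with a simple linear cost model, total = fixed + volume · unit cost. The sketch below computes the production volume at which gate array and full custom cost the same. All cost figures are hypothetical values chosen only for illustration; they are not taken from the lecture.

```python
def cost_per_chip(fixed_cost, unit_cost, volume):
    """Cost per chip when the fixed cost is spread over `volume` chips."""
    return fixed_cost / volume + unit_cost


def break_even_volume(fixed_a, unit_a, fixed_b, unit_b):
    """Volume n where fixed_a + n * unit_a equals fixed_b + n * unit_b."""
    return (fixed_b - fixed_a) / (unit_a - unit_b)


# Hypothetical numbers: the gate array has low fixed but high per-piece cost.
GATE_ARRAY = {"fixed": 50_000.0, "unit": 12.0}
FULL_CUSTOM = {"fixed": 500_000.0, "unit": 3.0}

N_STAR = break_even_volume(
    GATE_ARRAY["fixed"], GATE_ARRAY["unit"],
    FULL_CUSTOM["fixed"], FULL_CUSTOM["unit"],
)
print(N_STAR)
```

Below the break-even volume the gate array is cheaper per chip; above it the full custom design wins, which matches the qualitative shape of the cost curves.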

14. Programmable Logic Devices


Overview

• Introduction
• Programming Technologies
• Basic Programmable Logic Device (PLD) Concepts
• Complex PLD
• Field Programmable Gate Array (FPGA)
• CAD (Computer Aided Design) for FPGAs
• Design flow for Xilinx FPGAs
• Economical Considerations
• Logic design Alternatives

Introduction

• A Programmable Logic Device is an integrated circuit with internal logic
  gates and interconnects. These gates can be connected to obtain the
  required logic configuration.
• The term “programmable” means changing either the hardware or the software
  configuration of the internal logic and interconnects.
• The configuration of the internal logic is done by the user.
• PROM, EPROM, PAL, GAL etc. are examples of Programmable Logic
  Devices.


Programming Technologies
A Programmable Logic Device can be programmed in two ways:
1. Mask programming (in a few cases)
2. Field programming (typical)
1.) Mask programming: programming of the device is done at the mask level.
+ good timing performance because internal connections are hardwired during
manufacture
+ cheap at high-volume production
- programmed by manufacturer
- development cycle = weeks or months
- not re-programmable

Programming Technologies (II)
2.) Field programming: Programming of the device is done by the user. The
programming technologies are of two types:

Permanent type (Non-volatile):
• Fuse (normally on) - ‘CLOSED (intact)’ → ‘OPEN (blown)’
• Anti-fuse (normally off) - just the opposite of a fuse
• EPROM
• EEPROM

Nonpermanent type (Volatile):
• n-MOS pass transistor driven by SRAM
• NOTE: When the power of the device is switched off, the content of the SRAM is lost.


Basic PLD Concepts


1.) PLA (Programmable Logic Array):
• both the AND array and the OR array are programmable
• product term sharing: every product term of the AND array can be
connected to the input of any OR gate
• unidirectional input/output pins

Figure 1: PLA device

Basic PLD Concepts (II)
2.) Memory based: Device with fixed AND array and programmable OR array
• the AND array is a fixed decoder of the inputs; only the connections from
  the AND gate outputs to the OR gate inputs are programmable
• PROM, EPROM and EEPROM are memory-based PLD devices

3.) PAL/GAL (Programmable Array Logic / Generic Array Logic):
The AND array is programmable and the OR array has fixed connections to the
outputs of the AND gates. PAL/GAL devices may have bi-directional I/O pins.
There are three different types of PAL/GAL devices:

• combinational PAL devices are used for the implementation of logic
  functions
• sequential PAL devices are used for the implementation of sequential
  logic (finite state machines)
• arithmetic PAL devices: sums of product terms may be combined by XOR
  gates at the input of the macrocell D flip-flop


Basic PLD Concepts (IV)

Additional features of PAL/GAL devices


• PAL:
- EPROM - based programming Technology

• GAL:
- has array of programmable AND gates and OLMC (Output
Logic Macro Cell)
- EEPROM - based programming Technology
- programmable output polarity
- device can be configured as dedicated input and output mode

Figure 2: Combinational PAL device, AMD PAL16L8


Figure 3: Sequential PAL device, AMD PAL16R8

Figure 4: Arithmetic PAL device, AMD PAL16A4


• GAL16V8 has 8 configurable OLMCs (Output Logic Macro Cells)
• each OLMC has a programmable XOR to get an active-low or active-high
  output signal
• there is a feedback from output to input

Figure 5: GAL device, GAL 16V8

Complex PLD (CPLD)
• is a combination of multiple PAL- or GAL-type devices on a single chip
• CPLD architectures consist of
  - Macrocells
  - configurable flip-flop (D, T, JK or SR)
  - Output enable/clock select
  - Feedback select
• a CPLD has predictable time delay because of hierarchical interconnection
• easy to route, very fast turnaround
• performance independent of netlist
• devices are erasable and programmable with non-volatile EPROM or
  EEPROM configuration
• wide designer acceptance
• has higher logic density than any classical PLD
• relatively mature technology, but some innovation still ongoing


Complex PLD (II)

Figure 6: Complex PLD device Altera EP1800

Erasable CPLD
• EP1800 is an erasable PLD device and has 48 macrocells, 16 dedicated
  input pins and 48 I/O pins.
• the device is divided into four quadrants; each contains 12 macrocells and
  has a local bus with 24 lines and a local clock
• out of the 12 macrocells, 8 are “local” macrocells and 4 are “global”
  macrocells

Figure 7: Local macrocell
Figure 8: Global macrocell


Erasable CPLD (II)


• global bus has 64 lines and runs through all of the four quadrants (true
  and complement signals of 12 inputs (=24 lines) + true and
  complement of 4 clocks (=8 lines) + true and complement of I/O pins of
  the 4 global macrocells in each quadrant (=32 lines))
• macrocells: combinational or registered data output; the flip-flop is
  configurable as D, T, JK or SR type.

Figure 9: Synchronous clock, output enable by product term
Figure 10: Asynchronous clock, output permanently enabled

Electrically Erasable PLD
• MAX 7000 is an EEPROM-based programmable logic device
• its architecture includes the following elements:
  - Logic Array Blocks (LABs)
  - Macrocells
  - Programmable Interconnect Array (PIA)
  - I/O control blocks
• pin-to-pin delay is about 5 ns
• predictable delay because of the hierarchical routing structure of the PIA

Figure 11: Block diagram of Altera MAX 7000 family


Electrically Erasable PLD (II)

• each Logic Array Block (LAB) has 16 macrocells
• each macrocell consists of a logic array, a product term select matrix
  and a programmable register
• the product term select matrix allocates product terms from the logic
  array to use them as either primary logic inputs to the OR and XOR gates
  or secondary inputs to the clear, preset, clock and clock enable control
  functions of the macrocell register

Figure 12: MAX 7000 device, macrocell

Electrically Erasable PLD (III)

• logic is routed among LABs via the PIA.
• dedicated inputs, I/O pins, and macrocell outputs feed the PIA, which
  makes the signals available throughout the entire device
• only the signals required by each LAB are actually routed from the PIA
  into the LAB
• the selection of a signal from the PIA to a LAB is done by an EEPROM cell

Figure 13: MAX 7000 device, Programmable Interconnect Array (PIA)


Field Programmable Gate Array

• An FPGA is a general-purpose, multi-level programmable logic device
• An FPGA is composed of
  - logic blocks to implement combinational and sequential logic circuits
  - programmable interconnect wires to connect the inputs and outputs of
    the logic blocks
  - I/O blocks: logic blocks at the periphery of the device for
    external connections

• “The routing resources are both the greatest strength and weakness
  of the FPGAs”

Field Programmable Gate Array (II)

Figure 14: Symmetrical array architecture of FPGAs


Field Programmable Gate Array (III)

• There are four main categories of FPGAs available commercially:
  - symmetrical array
  - row-based
  - hierarchical PLD
  - sea of gates
• They differ from each other in their interconnections and in how they
  are programmed

Figure 15: Categories of different FPGAs

Programming Technologies
• Currently, there are four programming technologies for FPGAs:
  - static RAM cells
  - anti-fuse
  - EPROM transistor
  - EEPROM transistor

Static RAM programming technology:

a) pass transistor   b) transmission gate   c) multiplexer

Figure 16: SRAM-based programming technology


SRAM Programming technology

• completely reusable - no limit concerning re-programmability


• pass gate closes when a “1” is stored in the SRAM cell
• allows iterative prototyping
• volatile memory - power must be maintained
• large area - five transistor SRAM cell plus pass gate
• memory cells distributed throughout the chip
• fast re-programmability (tens of milliseconds)
• only standard CMOS process required

Anti-fuse Programming

• An anti-fuse is the opposite of a normal fuse.
• Anti-fuses are made with a modified CMOS process having an extra step
• This step creates a very thin insulating layer which separates two
  conducting layers
• That thin insulating layer is fused by applying a high voltage across the
  conducting layers
• Such a high voltage can be destructive for CMOS logic circuits
• Non-volatile (permanent)
• Requires extra programming circuitry, including a programming
  transistor


Actel PLICE Anti-fuse programming technology


• The Actel PLICE anti-fuse consists of a layer of heavily doped silicon (n+
  diffusion), a layer of dielectric (oxide-nitride-oxide, ONO) and a layer of
  polysilicon
• it is programmed by placing a relatively high voltage (18 V) across the
  anti-fuse terminals, which results in a current of about 5 mA through it
• typical resistance of a fused contact is 300 to 500 Ω
• manufactured by adding 3 masks to a normal CMOS process

Figure 17: Actel PLICE anti-fuse structure

Quicklogic ViaLink Anti fuse programming technology
• amorphous silicon is used as an insulating layer
• direct metal-to-metal contact results in a path resistance below 50 Ω
• 10 V terminal voltage is required to fuse the amorphous silicon

Figure 18: Four-layer metal ViaLink structure
Figure 19: ViaLink element


EEPROM programming technology


• static charge on floating gate turns the transistor permanently off
• re-programmable
• non-volatile
• external permanent memory is not required
• slow re-configuration time
• floating-gate FET has relatively high on resistance
• higher static power consumption due to pull up resistor

Figure 20: EEPROM programming technology

Commercially available FPGAs


Xilinx FPGA

• The Xilinx architecture comprises a two-dimensional array of logic
  blocks called CLBs.
• They are interconnected via horizontal and vertical routing channels
• I/O blocks are user-configurable to provide an interface between the
  external package pins and the internal logic
• I/O can be configured as input, output or bi-directional signal

Figure 21: General architecture of Xilinx FPGA

Xilinx FPGA (II)
• Xilinx XC4000 is an SRAM-based FPGA
• each CLB has three LUTs (Look-Up Tables) and two flip-flops
• the truth table of the combinatorial logic is stored in 16x1 SRAM LUTs
• LUTs can also be used as RAM
• the combinatorial results of a CLB are passed to the interconnect
  network, or can be stored in flip-flops and then passed to the
  interconnect network
• with two stages of LUTs, two functions of 4 variables or one function
  of 5 variables can be implemented

Figure 22: Xilinx XC4000 CLB

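
A look-up table is simply a small memory addressed by the logic inputs. The sketch below models a 16x1 LUT as a Python list and shows how two 4-input LUTs plus a third LUT acting as a multiplexer realize one function of 5 variables. It assumes the usual XC4000 arrangement of two 4-input function generators (called F and G here) feeding a 3-input one (called H here); the helper names and the choice of 5-input parity as the target function are illustrative assumptions.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Fill a (2**n_inputs x 1) memory with the truth table of func."""
    table = []
    for address in range(2 ** n_inputs):
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def read_lut(table, *bits):
    """Use the input bits as the memory address (bit 0 = first input)."""
    address = sum(bit << i for i, bit in enumerate(bits))
    return table[address]


def parity5(a, b, c, d, e):
    """Target function of 5 variables."""
    return a ^ b ^ c ^ d ^ e


# Shannon expansion on e: F holds the e = 0 half, G the e = 1 half,
# and H selects between them.
F = make_lut(lambda a, b, c, d: parity5(a, b, c, d, 0), 4)
G = make_lut(lambda a, b, c, d: parity5(a, b, c, d, 1), 4)
H = make_lut(lambda f, g, e: g if e else f, 3)


def clb_function(a, b, c, d, e):
    """Evaluate the 5-variable function through the three LUTs."""
    return read_lut(H, read_lut(F, a, b, c, d), read_lut(G, a, b, c, d), e)


MATCHES = all(
    clb_function(*x) == parity5(*x) for x in product((0, 1), repeat=5)
)
print(len(F), len(H), MATCHES)
```

Reprogramming the device only means rewriting the table contents; the structure of the logic block stays the same.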

Xilinx FPGA (III)

Figure 23: Programmable interconnect associated with XC4000 series CLB
(labels: horizontal longlines, single length lines, double length lines)

Figure 24: Switch matrix

Xilinx FPGA (IV)
• the interconnects of the XC4000 device are arranged in horizontal and
  vertical channels
• each channel contains some number of wire segments
• They are:
Single length lines:
• they span a single CLB
• provide the highest interconnect flexibility and offer fast routing
• incur delay whenever a line passes through a switch matrix
• they are not suitable for routing signals over long distances
Double length lines:
• they span two CLBs, so that each line is twice as long as a single length
  line
• provide faster signal routing over intermediate distances
Longlines:
• Longlines form a grid of metal interconnect segments that run the entire
  length or width of the array
• they are intended for high fan-out nets and nets with critical delay


Xilinx Virtex-II Pro™ FPGA family

• The Virtex-II Pro Platform FPGA is the most technically sophisticated
  silicon and software product development in the history of the
  programmable logic industry.
• The Virtex-II Pro FPGAs are manufactured in a 0.13-micron process.
• It is capable of implementing high-performance System-on-a-Chip
  designs with low development cost
• It can be used in applications such as system architectures in
  networking, deeply embedded systems and digital signal
  processing systems.
• Virtex-II Pro devices incorporate one to four PowerPC 405 processor
  cores. The PowerPC 405 cores are fully embedded within the FPGA,
  where all processor nodes are controlled by the FPGA routing
  resources.
• Each PowerPC 405 core is capable of more than 300 MHz clock
  frequency.

Xilinx Virtex-II Pro™ FPGA family (II)
• The Virtex-II Pro FPGA consists of the following components:
  - Embedded Rocket I/O™ Multi-Gigabit Transceivers (MGTs)
  - Processor Blocks containing embedded IBM® PowerPC® 405 RISC CPU
    (PPC405) cores and integration circuitry
  - FPGA fabric based on the Virtex-II architecture

Figure 25: Virtex-II Pro Generic Architecture Overview


Xilinx Virtex-II Pro™ FPGA family (III)

• Each CLB (Configurable Logic Block) includes four slices and two
  3-state buffers
• Each slice is equivalent and contains:
  • Two function generators (F & G)
  • Two storage elements
  • Arithmetic logic gates
  • Large multiplexers
  • Wide function capability
  • Fast carry look-ahead chain
  • Horizontal cascade chain (OR gate)

Figure 26: CLB (Configurable Logic Block) of Virtex-II Pro FPGA

Xilinx Virtex-II Pro™ FPGA family (IV)
• IOB blocks include six storage elements, as shown in Figure 27.
• Each storage element can be configured either as an edge-triggered
  D-type flip-flop or as a level-sensitive latch.
• On the input, output, and 3-state path, one or two DDR (Double Data
  Rate) registers can be used.
• Double data rate is directly accomplished by the two registers on each
  path, clocked by the rising edges (or falling edges) from two
  different clock nets.

Figure 27: IOB block of Virtex-II Pro FPGA


Actel/TI FPGA architecture

• Actel offers three main families:
  - Act 1, Act 2, Act 3
• programmable logic blocks are arranged in rows
• horizontal routing channels are arranged between adjacent rows
• Actel FPGAs are based on anti-fuse technology
• instead of LUTs, the logic blocks use multiplexers

Figure 28: General architecture of Actel FPGA

Actel/TI FPGA architecture (II)
Act-1 Logic Module:
• The Act-1 logic module is a logic circuit with 8 inputs and 1 output
• it contains only combinatorial logic
• The logic module can implement the four basic
  functions NAND, AND, NOR and OR

Figure 29: Act-1 logic module
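
Figure 29 is not reproduced here. As a hedged sketch, the code below assumes the commonly described Act-1 structure: two 2:1 input multiplexers feeding a third 2:1 multiplexer whose select signal is the OR of two inputs, which gives 8 inputs and 1 output. The pin names (a0, a1, sa, b0, b1, sb, s0, s1) follow that assumption and are not taken from the slide. Tying inputs to constants yields the four basic functions.

```python
from itertools import product


def mux2(select, in0, in1):
    """2:1 multiplexer: in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0


def act1_module(a0, a1, sa, b0, b1, sb, s0, s1):
    """Assumed Act-1 module: output mux selected by (s0 OR s1)."""
    return mux2(s0 | s1, mux2(sa, a0, a1), mux2(sb, b0, b1))


def and2(a, b):
    # a ? b : 0
    return act1_module(0, 0, 0, 0, 1, b, a, 0)


def or2(a, b):
    # (a OR b) ? 1 : 0, using the OR gate on the select inputs
    return act1_module(0, 0, 0, 1, 1, 0, a, b)


def nor2(a, b):
    # (a OR b) ? 0 : 1
    return act1_module(1, 1, 0, 0, 0, 0, a, b)


def nand2(a, b):
    # a ? b' : 1, where the lower input mux produces b'
    return act1_module(1, 1, 0, 1, 0, b, a, 0)


PAIRS = list(product((0, 1), repeat=2))
TABLES = {
    "AND": [and2(a, b) for a, b in PAIRS],
    "OR": [or2(a, b) for a, b in PAIRS],
    "NOR": [nor2(a, b) for a, b in PAIRS],
    "NAND": [nand2(a, b) for a, b in PAIRS],
}
print(TABLES)
```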


Actel/TI FPGA architecture (III)


Act-2 Logic Module:
• The Act-2 family has a two-module architecture, consisting of the C module
  (combinatorial) and the S module (sequential)
• the logic module is optimized for both combinatorial and sequential
  designs

Figure 30: Act-2 logic module (C module and S module)

Actel/TI FPGA architecture (IV)
Act-3 Logic Module:

• it comprises an AND and an OR gate that are connected to a
  multiplexer-based circuit block.
• The multiplexer circuit is arranged such that, in combination with
  the two logic gates, a very wide range of functions can be realized
  in a single logic block
• about half of the logic blocks in an Act-3 device also contain a
  flip-flop

Figure 31: Act-3 logic module


Actel/TI FPGA architecture (V)

Figure 32: Act-1 programmable interconnection architecture


CAD for FPGAs

Initial Design Entry
  ↓
Logic Optimization
  ↓
Technology Mapping
  ↓
Placement
  ↓
Routing
  ↓
Programming Unit
  ↓
Configured FPGA

Figure 33: Design flow for FPGA


Design flow for Xilinx FPGA

Design Entry
Design validation
Device Selection

DESIGN IMPLEMENTATION
Design Synthesis Optimization
Design validation
Mapping
Placement
Routing
Design validation / Back Annotation

Bit Stream generation
Download to Xilinx FPGA

Economical Considerations

Figure 34: Cost per Chip


Economical Considerations (I)

FPGA
1. Cost per chip is lower for low volumes (low fixed cost)
2. Short turnaround time
3. Design flexibility is high and the cost of re-designing is low
4. Speed is relatively slow because of the resistance and
   capacitance of the programmable switches
5. Programmable switches and the configuration network require
   chip area, which results in decreased logic density

MPGA
1. Lower cost per chip for high volumes
2. Fabrication is done with a hardwired metal connection layer, which
   results in fast operation
3. High logic density
4. Very high costs for low volumes (high fixed cost)
5. No redesign flexibility

Logic design Alternatives

                          SSI and    PLDs             Programmable     Gate           Custom
                          MSI ICs                     gate arrays      arrays         ICs
Integration in gates      100s       < 500k           10k – 1M         100 – 10M      1M – 100M
Speed                     Fast       Slow to medium   Slow to medium   Slow to fast   Fast
Function defined by user  No         Yes              Yes              Yes            Yes
Time to customize         -          Seconds          Seconds          Months         Year
User programmable         No         Yes              Yes              No             No


Logic design Alternatives (I)

Figure 35: Relative merits of various ASIC implementation styles

CPLDs and FPGAs

               Complex Programmable Logic    Field-Programmable Gate Array
               Device (CPLD)                 (FPGA)
Architecture   More combinational            Gate array-like;
                                             more registers + RAM
Density        Low-to-medium;                Medium-to-high;
               0.5-10K logic gates           1K to 3.2M system gates
Performance    Predictable timing;           Application dependent;
               up to 250 MHz today           up to 200 MHz today
Interconnect   “Crossbar switch”             Incremental

Chapter 15

Arithmetic Units

In the following chapter, basic arithmetic units like adders, subtracters, or multipliers are
discussed. These components are widely used in VLSI circuits e. g. for the digital signal
processing application domain. More detailed descriptions on arithmetic units can be found
e. g. in [4] or [1].

15.1 Adders / Subtracters

15.1.1 Basic Adder Cells

Half Adder The circuit realizing the function

C = A1 A2 (15.1)
S = A1 ⊕ A2 (15.2)

is called a half–adder and can be used to calculate the sum S of two bits A1 and A2 . A possible
carry is set at the C output.

Full Adder For adding binary numbers having a bitwidth of more than one single bit, the
concept of the half–adder has to be extended. The carry outputs of less significant bits in the
addition process have to be taken into account in the more significant bits. For that, a new
circuit structure called full–adder is used, which is based on the following functional equations:

Cout = Cin (A1 + A2 ) + A1 A2 (15.3)


Sout = A1 ⊕ A2 ⊕ Cin (15.4)

These equations can be realized either by logic gates (AND, OR, XOR) or by two half–adders
and an OR gate.
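
The two realizations mentioned above can be compared bit by bit. The following sketch builds a full–adder from two half–adders and an OR gate and checks it against equations (15.3) and (15.4) for all eight input combinations. The function names are illustrative assumptions.

```python
from itertools import product


def half_adder(a1, a2):
    """Return (C, S) according to equations (15.1) and (15.2)."""
    return a1 & a2, a1 ^ a2


def full_adder(a1, a2, c_in):
    """Full-adder built from two half-adders and an OR gate."""
    c_first, s_first = half_adder(a1, a2)
    c_second, s_out = half_adder(s_first, c_in)
    return c_first | c_second, s_out


def full_adder_equations(a1, a2, c_in):
    """Direct evaluation of equations (15.3) and (15.4)."""
    c_out = (c_in & (a1 | a2)) | (a1 & a2)
    s_out = a1 ^ a2 ^ c_in
    return c_out, s_out


EQUIVALENT = all(
    full_adder(*bits) == full_adder_equations(*bits)
    for bits in product((0, 1), repeat=3)
)
print(EQUIVALENT)
```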

15.1.2 Adders / Subtracters for Binary Coded Integers

The following section introduces the basic arithmetic components used in VLSI designs. First,
adder and subtracter architectures are discussed. Since addition and subtraction for binary


numbers can be calculated by almost the same hardware (by selecting the appropriate comple-
ment representation first), the term “adder” is used as synonym for both adder and subtracter
in the following section.

Serial Adders

The principle of serial adders is shown in Fig. 15.1:


[Figure: two n-bit shift registers (operand A, operand B) feed the X and Y inputs of a single full-adder; its Cout is fed back to Cin through the carry register, and its S output is shifted into an n-bit sum shift register. Outputs: Sum and Cout.]

Figure 15.1: Serial adder principle

At the beginning of the operation, the two n–bit operands A and B are loaded into the shift
registers. The carry register is cleared or set to the value of the carry input. During the
next n clock cycles (assuming a wordlength of n bits for each operand), the operands are
added bitwise in the full–adder and the result is stored in the sum register. To do so, the operand shift
registers apply their least significant bit to the full–adder inputs, whereas the sum shift register
reads the current sum output of the full–adder at its serial input; the registers shift their contents
by one bit to the right each clock cycle. The carry output of each bit addition is stored in the carry
register for use in the next clock cycle. The n-bit sum and the carry output are available after
(n+1) clock cycles [1 operand load, n calculation].
The serial adder has the smallest hardware complexity which is wordlength independent (if
the shift registers are not considered) but requires the highest computation time of all adder
implementations.
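
The sequence described above can be followed in a small cycle-by-cycle simulation. This is a sketch under simplifying assumptions: registers are modelled as Python integers, the function name `serial_add` is hypothetical, and the sum register is assumed to take its serial input at the most significant end.

```python
def serial_add(a, b, n, cin=0):
    """Add two n-bit operands bit-serially; return (sum, cout, clock_cycles)."""
    mask = (1 << n) - 1
    reg_a = a & mask          # cycle 1: load operand shift registers
    reg_b = b & mask
    carry = cin               # carry register cleared or preset with the carry input
    reg_sum = 0
    cycles = 1
    for _ in range(n):        # n calculation cycles
        x = reg_a & 1         # least significant bits are applied to the full-adder
        y = reg_b & 1
        s = x ^ y ^ carry
        carry = (x & y) | (carry & (x | y))   # stored for the next clock cycle
        reg_a >>= 1           # all registers shift one bit to the right
        reg_b >>= 1
        reg_sum = (reg_sum >> 1) | (s << (n - 1))
        cycles += 1
    return reg_sum, carry, cycles
```

For example, `serial_add(11, 6, 4)` models 1011 + 0110 with n = 4 and needs n + 1 = 5 clock cycles; the returned carry carries the weight 2^n.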

Parallel Adders

Ripple Carry Adder Chained full–adders which form an adder of the required wordlength
are called ripple carry adder since during addition the carry “ripples” through the whole chain
from the least significant to the most significant bit as shown in Fig. 15.2:
The addition time is therefore dependent on the wordlength of the operands.
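
A behavioural sketch of the full-adder chain makes the dependency on the wordlength visible: each stage has to wait for the carry of its less significant neighbour. The helper names are chosen for this example only, and bit lists are assumed to be ordered least significant bit first.

```python
def to_bits(value, n):
    """Return the n least significant bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Chain of full-adders; returns (sum_bits, cout), bits LSB first."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):          # stage i uses the carry of stage i-1
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a | b))
    return sum_bits, carry
```

In the worst case (e. g. 1111 + 0001) the carry generated in the least significant stage passes through every stage, so the critical path contains n full-adder carry delays.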

Carry Lookahead Adder To speed up the addition process, lookahead methods can be
applied to reduce the time associated with carry propagation. The carry input of a stage
i is calculated directly from the input of the preceding stages i − 1, i − 2, . . . i − k rather


[Figure: chain of full-adders with inputs A[i], B[i] and outputs Sum[i]; Cin enters at bit 0, Cout[i] of each stage feeds the next more significant stage, and Cout leaves stage n-1.]

Figure 15.2: Ripple carry adder principle

than allowing carries to ripple from stage to stage. To perform that task, the cout outputs of ordinary
full–adders are replaced by the generate and propagate signals defined by

$$ g_i = a_i b_i \tag{15.5} $$
$$ p_i = a_i + b_i . \tag{15.6} $$

The carry input signal of stage i + 1 is defined by the equation

$$ c_{in,i+1} = c_i = g_i + p_i c_{i-1} \tag{15.7} $$

and by recursive substitution in an example of a 4 bit adder

$$ c_0 = c_{in,1} = g_0 + p_0 c_{in} \tag{15.8} $$
$$ c_1 = c_{in,2} = g_1 + p_1 g_0 + p_1 p_0 c_{in} \tag{15.9} $$
$$ c_2 = c_{in,3} = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_{in} \tag{15.10} $$
$$ c_3 = c_{out} = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_{in} . \tag{15.11} $$

As can be seen in the equations above, every carry can be computed directly from the g, p, and cin
signals by a two level logic implementation. That means the whole addition is performed in constant time
(without influence of the wordlength). The implementation of the carry lookahead corresponding
to the above equations is shown in Fig. 15.3.
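
Equations (15.5) to (15.11) can be checked with a direct behavioural model of a 4 bit adder. The following is a sketch only; the function name `cla4` is chosen for this example, and every carry is written as the flat sum-of-products expression from the equations rather than as a chain.

```python
def cla4(a, b, cin=0):
    """4-bit carry lookahead addition; returns (sum, cout)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]        # generate, Eq. (15.5)
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]      # propagate, Eq. (15.6)

    # Eqs. (15.8)-(15.11): each carry depends only on g, p and cin
    c0 = g[0] | (p[0] & cin)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin)
    c2 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & cin)
    c3 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & cin))

    carry_in = [cin, c0, c1, c2]                           # carry input of stage i
    total = 0
    for i in range(4):
        s = ((a >> i) ^ (b >> i) ^ carry_in[i]) & 1
        total |= s << i
    return total, c3
```

Note how the product terms grow with the bit position: the expression for c3 already contains a five-input AND and a five-input OR.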
[Figure: four adder cells with inputs A[i], B[i], carry inputs Cin[i] and outputs Sum[i]; each cell delivers g[i] and p[i] to the carry lookahead circuit, which also receives Cin and produces the cell carry inputs and Cout.]

Figure 15.3: Carry lookahead adder for 4 bits

The number of gate inputs is restricted due to technological constraints. That means the
wordlength of a carry lookahead adder cannot be increased arbitrarily. For that reason,
adders for a large wordlength are split into smaller groups processed by single carry lookahead
adders with reasonable wordlengths as shown in Fig. 15.4.


[Figure: four 4 bit CLA adders for the operand slices [3:0], [7:4], [11:8] and [15:12]; Cin enters the least significant block, the block carries C[3], C[7], C[11] link neighbouring blocks, and C[15] forms Cout.]

Figure 15.4: Clustered carry lookahead adder for 16 bits

The carry signal produced by a group is forwarded to the next group so that, if each group is
considered as a single block, the carry ripples through the blocks as in the ripple carry
adder. Alternatively, a hierarchical approach might be chosen in which, for each group,
a group-generate as well as a group-propagate signal are generated, which are evaluated by a
second level carry lookahead circuit.

Carry Select Adder In the following adder type, the wordlength of the operands is again
subdivided into clusters (see Fig. 15.5). The cluster subwordlength is chosen to balance the
time required for the intra-cluster carry ripple additions and the carry calculation of the preceding
clusters. The additions are all performed in parallel for both possible cases: the carry input
of a cluster is '0' or it is '1'. The results (cluster carry out and partial sum C/Sum[i : j]) are
forwarded to multiplexors which select the appropriate value depending on the carry output of
the preceding stage. Since the time to switch a multiplexor is almost negligible compared to
the time required for the carry ripple additions, the overall addition time is almost independent
of the wordlength.
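
The selection scheme can be sketched behaviourally as follows. The function names are hypothetical, and the cluster adder is modelled with Python's integer addition as a stand-in for a 4 bit carry ripple adder; only the precompute-and-select structure is meant to be illustrated.

```python
def ripple_block(a, b, cin, width):
    """Stand-in for one carry ripple cluster: returns (partial_sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, cin=0, width=16, cluster=4):
    """Carry select addition; returns (sum, cout)."""
    mask = (1 << cluster) - 1
    result = 0
    carry = cin
    for shift in range(0, width, cluster):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        if shift == 0:
            # least significant cluster: only one adder, fed by the real Cin
            s, carry = ripple_block(a_blk, b_blk, cin, cluster)
        else:
            # both cases are computed (in hardware: in parallel) ...
            s0, c0 = ripple_block(a_blk, b_blk, 0, cluster)
            s1, c1 = ripple_block(a_blk, b_blk, 1, cluster)
            # ... and the multiplexor picks one, controlled by the preceding carry
            s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << shift
    return result, carry
```

In the model the clusters are visited one after the other, but each pair (s0, c0), (s1, c1) depends only on the operands, so in hardware only the multiplexor chain remains on the carry path between clusters.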
[Figure: for the clusters [7:4], [11:8] and [15:12], two 4 bit CR-adders each, one with carry input 0 (results C/Sum0) and one with carry input 1 (results C/Sum1); the cluster [3:0] uses a single CR-adder fed by Cin. Multiplexors controlled by C[3], C[7] and C[11] select Sum[7:4], Sum[11:8], Sum[15:12] and Cout.]

Figure 15.5: Carry select adder for 16 bits

Since the carry select adder requires two carry ripple adder chains for each cluster (except in
the least significant), the hardware amount is almost twice that of a simple ripple carry adder.
It is slower than a carry lookahead adder but compared to that type it has a higher regularity
and is for that reason better suited for VLSI implementation.


Carry Save Adder For the addition of many addends (e. g. in parallel multipliers),
the time required for full carry propagation might be too high for some applications, even if
carry lookahead adders are used. To achieve constant addition time complexity, the
propagation of computed carry results within the same stage is avoided, and both the S and the
Cout vectors are connected to the correct adders in the succeeding stage. This concept requires
a final addition to merge the sum and the carry vector of the final stage into a single sum
vector, which can be realized using any of the adders discussed above (in Fig. 15.6 a carry ripple
adder has been chosen for simplicity). In a carry save adder, each additional operand increases
the adder delay by one full-adder delay.
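
The principle can be illustrated with a word-level sketch in which each stage is a row of independent full-adders. The function names are chosen for this example, Python integers stand in for the bit vectors, and the final carry-propagating adder is modelled by integer addition.

```python
def carry_save_stage(x, y, z):
    """One row of independent full-adders: three vectors in, (S, C) out."""
    s = x ^ y ^ z                                # sum bit of every position
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries move one position left
    return s, c                                  # no carry ripples inside the row


def carry_save_sum(operands):
    """Sum several operands; returns (result, number_of_carry_save_stages)."""
    s, c = operands[0], 0
    stages = 0
    for op in operands[1:]:
        s, c = carry_save_stage(s, c, op)        # one full-adder delay per operand
        stages += 1
    return s + c, stages                         # final carry-propagating addition
```

For four operands, as in Fig. 15.6, the sketch uses three carry save rows followed by one conventional addition; a fifth operand would add exactly one more row.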
[Figure: carry save adder array of three full-adder rows; the first row adds X[i] and Y[i] (with Cin), the following rows add W[i] and V[i] to the sum and carry outputs of the row above. A final carry propagation row of full-adders produces Sum[0] ... Sum[n+1] and Cout; the additional high-order stages are required to evaluate the carry outputs of preceding stages.]

Figure 15.6: Carry save adder for summation of 4 operands (V, W, X, Y)


15.2 Multipliers

Shift and Add Multiplier The most common multiplier is the Shift and Add Multiplier
(SAA Mult.). Two binary unsigned integer words X and Y of bit-size Nx and Ny , respectively,
can be written using their binary representation:

$$ X = \sum_{i=0}^{N_x-1} x_i 2^i \qquad Y = \sum_{j=0}^{N_y-1} y_j 2^j \tag{15.12} $$

The product Z = X ∗ Y can now be computed:

$$ Z = \sum_{i=0}^{N_x-1} x_i Y 2^i = (\ldots((x_{N_x-1} Y)\,2 + x_{N_x-2} Y)\,2 + \ldots)\,2 + x_0 Y \tag{15.13} $$

The following recurrence can be derived from formula 15.13:

$$ D_0 = 0 \qquad D_{i+1} = D_i\,2^{-1} + x_i Y \qquad Z = D_{N_x}\,2^{N_x-1} \tag{15.14} $$

In each step of the recurrence one bit of X is multiplied (a simple AND-operation) with Y and
added to the intermediate result Di which is shifted one bit. Figure 15.7 shows the general
structure of the Shift and Add multiplier with bit-sizes Nx and Ny .

Figure 15.7: Structure of SAA multipliers

For this multiplier type it takes Nx clock cycles to complete the multiplication, since one bit
of X is processed in each step. The delay of the combinational circuit (which determines the
maximum clock frequency) is approximately Ny δFA (δFA is the delay of a full adder; the
register delays are not considered).
The cost of a Shift and Add Multiplier is (3Ny + 2Nx)γFA (the cost of a full adder γFA is
assumed to be equal to the cost of a register).
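
The recurrence (15.14) can be evaluated step by step in software. The sketch below follows the recurrence literally and therefore uses exact fractions for the halving step; a hardware implementation would shift a register instead. The function name `saa_multiply` is chosen for this example.

```python
from fractions import Fraction


def saa_multiply(x, y, nx):
    """Multiply unsigned x (nx bits) by y using the recurrence of Eq. (15.14)."""
    d = Fraction(0)                      # D_0 = 0
    for i in range(nx):                  # one bit of X per clock cycle
        x_i = (x >> i) & 1
        d = d / 2 + x_i * y              # D_{i+1} = D_i * 2^-1 + x_i * Y
    z = d * 2 ** (nx - 1)                # Z = D_Nx * 2^(Nx-1)
    assert z.denominator == 1            # the rescaled result is always an integer
    return int(z)
```

For X = 13 (1101) and Y = 11 the intermediate values are D1 = 11, D2 = 5.5, D3 = 13.75 and D4 = 17.875, which gives Z = 17.875 · 8 = 143.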

Carry Save Multiplier In contrast to the SAA multiplier, the Carry Save Multiplier
(CSM) calculates the result in one step. Every bit of the first argument is multiplied with
every bit of the second argument concurrently. The results are added up according to the
position of the source bits.


The CSM consists of combinatorial logic only. The multiplication of two 4-bit binary numbers
can be written as

                    X3   X2   X1   X0
                ×   Y3   Y2   Y1   Y0
                    ------------------
                    P30  P20  P10  P00
               P31  P21  P11  P01
          P32  P22  P12  P02
     P33  P23  P13  P03
--------------------------------------
Z7   Z6   Z5   Z4   Z3   Z2   Z1   Z0

where Pij = Xi ∧ Yj. The addition of all Pij terms can be done in an array of full adders.
Figure 15.8 shows the general structure of a Carry Save Multiplier assuming Nx ≥ Ny. Part
II is omitted if Nx and Ny have the same size. The Carry In of each full adder is supplied in
the upper right corner. Not every full adder needs a Carry In; for some positions half adders
are sufficient. The adder Carry Out is depicted in the lower left corner.

Figure 15.8: Structure of CSM multipliers

The delay of this type of multipliers is (Nx + Ny − 2)δF A . The cost is (Nx − 1)Ny γF A plus
(2Ny + 2Nx )γF A , if X, Y and the Z-register are accounted as in the shift and add case above.
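
The partial product scheme and the delay and cost figures quoted above can be put into a short sketch. The function names are hypothetical; the adder array itself is not modelled, the weighted partial products are simply accumulated with integer addition.

```python
def csm_multiply(x, y, nx, ny):
    """Form all partial products Pij = Xi AND Yj and add them by weight 2^(i+j)."""
    z = 0
    for i in range(nx):
        for j in range(ny):
            p_ij = ((x >> i) & 1) & ((y >> j) & 1)
            z += p_ij << (i + j)         # position of the source bits
    return z


def csm_delay_in_fa(nx, ny):
    """Delay in units of one full-adder delay, as stated in the text."""
    return nx + ny - 2


def csm_cost_in_fa(nx, ny):
    """Cost in units of one full-adder cost, including X, Y and Z registers."""
    return (nx - 1) * ny + 2 * ny + 2 * nx
```

For Nx = Ny = 4 this yields a delay of 6 δFA and a cost of 28 γFA, compared with (3Ny + 2Nx) γFA = 20 γFA for the Shift and Add Multiplier, which in turn needs Nx = 4 clock cycles.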

Block Multiplier A combination of the fully parallel Carry Save Multiplier and the serial
Shift and Add Multiplier leads to a flexible architecture which can be configured from working
fully serial to working fully parallel. Many combinations in between are possible, thus allowing
the adaptation to given specifications and restrictions.


The basic idea of the block multiplication is to divide each argument into blocks of the same
size. Each block of the first argument is multiplied with each block of the second argument in
a fast Carry Save Multiplier. All calculated block products are added up taking into account
the positions of the current argument blocks. Therefore, as in the Shift and Add Multiplier,
the arguments and the intermediate result have to be shifted in an appropriate way.
[Figure: X register and Y register deliver blocks of nx and ny bits to the Carry Save Multiplier; its (nx + ny) bit product is added by an adder to the intermediate result held in the Z register; a controller generates the shift control signals.]

Figure 15.9: Architecture of the block multiplier

Figure 15.9 shows the architecture of the block multiplier. The argument registers and the
Carry Hold Register are simple shift registers. The intermediate result has to be shifted in
both directions, thus requiring a bidirectional shift register. Signals for controlling the shift
directions are generated by a controller, which can be realized using a simple counter.
The multiplier can be configured by varying the block sizes of the arguments. With increasing
block sizes the multiplier becomes more parallel, thus reducing the number of clock cycles
needed to perform a multiplication. Larger block sizes, however, require a larger Carry Save
Multiplier, which increases the area needed to realize the multiplier. Assuming that the first
argument is split into kx blocks of size nx and the second argument into ky blocks of size
ny, the multiplier needs kx ∗ ky clock cycles to perform a multiplication. The delay of the
multiplier is determined by the size of the ripple carry adder, which has a width of nx + ny
bits.
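
The relation between block sizes and cycle count can be tried out with a behavioural sketch. The function name and argument order are assumptions made for this example; the block product is computed with Python's multiplication as a stand-in for the nx-by-ny Carry Save Multiplier, and the shifting of registers is replaced by explicit bit offsets.

```python
def block_multiply(x, y, nx, ny, kx, ky):
    """Multiply x (kx blocks of nx bits) by y (ky blocks of ny bits).

    Returns (product, clock_cycles).
    """
    mask_x = (1 << nx) - 1
    mask_y = (1 << ny) - 1
    z = 0
    cycles = 0
    for bx in range(kx):
        x_blk = (x >> (bx * nx)) & mask_x
        for by in range(ky):
            y_blk = (y >> (by * ny)) & mask_y
            block_product = x_blk * y_blk             # one CSM evaluation
            z += block_product << (bx * nx + by * ny)  # add at the block position
            cycles += 1                                # one block pair per cycle
    return z, cycles
```

A 16 bit by 16 bit multiplication with nx = 4 and ny = 8 needs 4 · 2 = 8 cycles; with nx = ny = 16 the sketch degenerates to the fully parallel case with a single cycle.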


Chapter 16

Microarchitectures

The term microarchitecture describes the domain between the macroarchitecture (the lowest-
level hardware visible to the user) and the implementation technology (MOS VLSI) [9]. For
better analysis, microarchitectures are usually divided into 3 parts: the data path, which
performs the data manipulations and calculations, the control path, which applies the correct
sequences of control signals to the data path, and the input/output unit, which provides access
from/to the external world (see Fig. 16.1).
[Figure: the control path sends control signals to the data path and receives status flags from it; the data path exchanges external I/O data through the input/output unit.]

Figure 16.1: Microarchitecture blocks

The control path which can be interpreted as a more or less complex finite state machine
(FSM) can be either hardwired (used in fixed applications like a controller for the serial
adder in Fig. 15.1) or programmable (microprocessor with downloadable microcode). The
microarchitecture scheme as shown in Fig. 16.1 can represent quite simple circuits (like a
traffic light controller) as well as complex microprocessors.


16.1 Datapath Design

In the datapath of a microarchitecture, the operations and data manipulations are performed.
For that, control signals are generated by the control path depending on the operation(s) to
be executed. By forwarding information about the status of the data path (e. g. exceptional
conditions, underflow, overflow, division by zero, . . . ), the control path is able to react in a
correct way to the actual needs. The state signals (flags) can be used to enable conditional
branching depending on the state of the data path. Data processing is usually performed by
typical components like ALUs, shifters, register files, . . . .
The following section shows how datapath structures are usually implemented in larger VLSI
designs. For that, we assume the following simple datapath structure:

[Figure: inputs Ain and Bin pass through clocked input registers to the ALU (control signal OP-Sel); a multiplexor (Sel) selects between Cin and the ALU output; a shifter (Shift) feeds the clocked output register Rout; status flags are returned as status signals.]
Figure 16.2: Datapath example

The datapath consists of 2 input registers for the input operands Ain and Bin, an arithmetic-
logic unit (ALU), a multiplexor to select between the Cin input and the ALU output, a
shifter unit, and an output register. The datapath structure could be implemented based on
standard cells, where basic library cells (like gates, muxes, registers, . . . ) are selected and
interconnected, or, if a datapath compiler is used, based on a set of several layout tiles as
shown in Fig. 16.3.
A datapath compiler creates a regular layout depending on the wordlength of the operands by
stacking the appropriate number of tiles in the layout. The horizontal structure consisting of
a set of tiles performing all functions for a single bit is called bit slice. If we apply vertical cuts
to the layout structure, the whole layout will be subdivided in layout blocks corresponding to
a single function implemented. These layout stripes are called functional slices.

16.1.1 Bit-slice ALU AMD 2901

As an example for a discrete datapath implementation the 2901 bit-slice will be discussed in
the following section (→ [3]).
The 2901 integrated circuit contains, besides a 16-word register set, a Q register (used


[Figure: layout scheme with the functional slices AReg, BReg, ALU, MUX, Shifter and RReg as columns and the bit slices Bit[0], Bit[1], ..., Bit[n-1] as rows, together with control signal buffers and status buffers.]

Figure 16.3: Corresponding layout scheme

Figure 16.4: 2901 4-bit ALU slice

Figure 16.5: 2901 µ-OPs


within add-shift multiplications or divisions), an arithmetic-logic unit (ALU), a shifter, and
an instruction decoder (see Fig. 16.4). All operations and the registers are designed for 4 bit
operands. The set of instructions which can be executed by the 2901 IC is shown in
Fig. 16.5. The instructions are encoded in a 9 bit I vector which is provided by an external
microcode controller. The first of these tables shows the selection of the sources for both ALU
inputs (R and S), the second lists the ALU functions, and the third indicates the
destination of the ALU results.
To form an ALU for wordlengths with multiples of 4 bits, the 2901 ICs can be cascaded as
shown in Fig. 16.6. In the example, a simple carry propagation scheme has been selected.
As an additional option, carry-lookahead circuits (AMD 2902) could be used to enhance the
speed for carry propagation.

Figure 16.6: 16-bit bit-sliced ALU

The 2901 IC has been widely used for applications in digital signal processing and for minicomputers.
It is available as a stand-alone IC, and some silicon manufacturers also provide macrocells
with the functionality of the 2901 (for different wordlengths) that can be included in ASIC
designs.


16.2 Controller Implementations

Controllers are used to apply a sequence of control signals to the datapath components. These
control signals are chosen to perform the desired operation(s) within the datapath. The
datapath is able to interact with the controller unit by sending appropriate status signals
(e. g. overflow flag when an addition is performed, equal flag as a result of a comparison, . . . ).
The controller can be designed to change the sequence of control signals depending on these
flags (used e. g. in microprocessors to perform conditional branches).
The general structure of such a controller can be found in Fig. 16.7.

[Figure: a combinational logic block receives the environmental inputs and the fed-back state register contents; it drives the state register and the control outputs.]

Figure 16.7: Basic controller structure

It consists of a combinational logic block and a register. From the input signals (e. g. an
instruction word defining the sequence of control signals to be generated, state flags, . . . )
and parts of the previous register content, the combinational logic block generates the
control output signals as well as the information which step in the sequence of control signals
is to be executed in the next cycle. The controller can be seen as a realization of the abstract
model of a finite state machine.
To get a high level of regularity in the design of a controller, very often regular layout structures
(like ROMs or PLAs) are used to implement the combinational logic block rather than directly
implement the logic functions in separate gates (random logic). The random logic approach
was chosen in the control unit of many early microprocessors (≤ 8 bit) and in RISC (Reduced
Instruction Set Computer) processors whereas the regular layout structures are used in CISC
(Complex Instruction Set Computer) processors to simplify their controller design. Regular
structures simplify the design process due to the fact that if modifications in the control
sequences are required only the contents of a PLA resp. a ROM has to be redefined instead
of designing a whole combinational gate network. Since the design process for the latter
approach can be compared with programming a memory contents instead of circuit design,
that approach is called microprogramming and will be considered in detail in the sequel.


16.2.1 Microprogrammed Controllers

Microprogrammed controllers mainly consist of a control memory and a microinstruction register.
The control memory is implemented using ROM (Fig. 16.8) or PLA (Fig. 16.9) structures.
For special applications, RAM based control memories are also used, e. g. if the instruction
set of a processor has to be changed for special purposes. That flexibility is not available
when using hardwired logic. On the other hand, the price for that flexibility can be extra
hardware cost compared to random logic (due to address decoding in the ROM based controller
and sparse control matrices) and a performance penalty due to larger internal delays in the
PLA or ROM. The control memory contains both the control signals to be forwarded
through the microinstruction register to the datapath and some sequencing information giving
the address (NA, next address) of the subsequent microinstruction. The concatenation of
the control signals and the next address is called a microinstruction.
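
A minimal software model shows how control word and next address work together. Everything in it is hypothetical: the control memory contents, the control word names (IDLE, LOAD, ADD, STORE) and the assumption that the memory address is formed from the NA field and one environmental input bit.

```python
# Hypothetical control memory: address (NA, environmental input) ->
# microinstruction (control word, next address NA).
CONTROL_ROM = {
    (0, 0): ("IDLE", 0),
    (0, 1): ("LOAD", 1),
    (1, 0): ("ADD", 2),
    (1, 1): ("ADD", 2),
    (2, 0): ("STORE", 0),
    (2, 1): ("STORE", 0),
}


def run_controller(env_inputs, start_address=0):
    """Step the controller once per environmental input; return the control words."""
    na = start_address
    trace = []
    for env in env_inputs:
        # ROM lookup; the result is latched in the microinstruction register
        control, na = CONTROL_ROM[(na, env)]
        trace.append(control)        # control part goes to the datapath
    return trace
```

Changing the behaviour of this controller only requires new table entries, which mirrors the statement that microprogramming resembles programming memory contents rather than designing a gate network.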

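[Figure: an address decoder addresses the ROM; the ROM output is latched as the Control and NA fields; the control field drives the control outputs, while NA and the environmental inputs feed the address decoder.]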

Figure 16.8: ROM based controller

[Figure: a PLA with AND plane and OR plane; the PLA output is latched as the Control and NA fields; the control field drives the control outputs, while NA and the environmental inputs feed the AND plane.]

Figure 16.9: PLA based controller

Depending on the generation of the control signals, two types of microinstructions can be
distinguished:


Horizontal Microinstructions. The control word from the microinstruction register is


directly applied to the circuit which is to be controlled (see Fig. 16.10). Each elementary
control point has a corresponding entry in the control word. That results in a very long
control word and therefore big control memories. On the other hand, very specific encoding
and a high degree of parallelism in the operations is possible.

Vertical Microinstructions. That type of microinstruction is based on a different approach:
since most of the $2^n$ configurations possible in an n-bit control word are hardly ever
used by the controller, the wordlength of the control word in the control memory is reduced
by encoding the smaller number M of control vectors actually used into a vector of
$\lceil \log_2 M \rceil$ bits. In a second step, the n-bit control word is fetched from a secondary memory used as
control vector decoder (implemented e. g. as ROM or PLA) and forwarded to the datapath
(see Fig. 16.11). It is also possible to encode the control vector in groups for different
hardware units (one group for ALU control, the next for shifter control, . . . ) which are
decoded group by group instead of using a single large control vector decoder.
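
The saving achieved by the encoding can be estimated with a small sketch. The function name and the example control words are made up for illustration; the code width is computed with integer arithmetic, which for M > 1 equals the rounded-up base-2 logarithm of M.

```python
def build_vertical_encoding(used_control_words):
    """Assign a short code to every distinct control word.

    Returns (code_width, encoder, decoder); decoder[code] models the
    secondary memory that restores the n-bit control word.
    """
    unique = sorted(set(used_control_words))
    m = len(unique)
    code_width = max(1, (m - 1).bit_length())   # ceil(log2 M), at least one bit
    encoder = {word: code for code, word in enumerate(unique)}
    decoder = unique                            # index = code, content = control word
    return code_width, encoder, decoder
```

With five distinct 12-bit control words, for instance, each control memory entry shrinks from 12 to 3 bits, at the price of a decoder holding five 12-bit words and an additional lookup step.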
[Figure: each control bit in the microinstruction drives one control line directly.]

Figure 16.10: Horizontal microinstruction

[Figure: the encoded control bits in the microinstruction feed a control bit decoder, whose outputs drive the control lines.]

Figure 16.11: Vertical microinstruction

In controller design, one can proceed one step further: if a microinstruction itself can be
represented as a sequence of ‘sub’microinstructions (so called nanoinstructions), the structure
shown in Fig. 16.12 can be used. The simplest approach, which has already been mentioned
under vertical microcode, is a single step ‘sequence’ of nanoinstructions, namely the decoding
of the control outputs out of an encoded control vector from the microcode control memory.


If feedback is introduced in the decoder PLA (via the NNA [nanocode next address] register),
control sequences can be generated by the nanocode PLA. As long as a nanocode sequence
is running, the MNA [microcode next address] register is halted. In the case that many
microinstructions use the same nanocode sequences, significant savings in implementation
area for the whole controller can be reached.

[Figure: a microcode PLA (AND and OR planes) with the MNA register and the environmental inputs as inputs; its output addresses a nanocode PLA, which has the NNA register in its feedback path and produces the control field that drives the control outputs.]

Figure 16.12: A microcode/nanocode controller

17. Semiconductor Memories


Overview

• Introduction
• Read Only Memory (ROM)
• Nonvolatile Read/Write Memory (RWM)
• Static Random Access Memory (SRAM)
• Dynamic Random Access Memory (DRAM)
• Summary

Semiconductor Memory Classification

Non-Volatile Memory
• Read Only Memory (ROM)
  – Mask-Programmable ROM
  – Programmable ROM
• Read/Write Memory (RWM)
  – EPROM
  – E2PROM
  – FLASH

Volatile Memory
• Read/Write Memory, Random Access
  – SRAM
  – DRAM
• Read/Write Memory, Non-Random Access
  – FIFO
  – LIFO
  – Shift Register

EPROM - Erasable Programmable ROM
E2PROM - Electrically Erasable Programmable ROM
SRAM - Static Random Access Memory
DRAM - Dynamic Random Access Memory
FIFO - First-In First-Out
LIFO - Last-In First-Out

Random Access Memory Array Organization

Memory array
• Memory storage cells
• Address decoders

Each memory cell


• stores one bit of binary information (”0“ or ”1“ logic)
• shares common connections with other cells: rows, columns
Read Only Memory - ROM

• Simple combinatorial Boolean network which produces a specific output for each input
combination (address)
• "1" bit stored - absence of an active transistor
• "0" bit stored - presence of an active transistor
• Organized in arrays of 2^N words

• Typical applications:
  – store the microcoded instruction set of a microprocessor
  – store a portion of the operating system for PCs
  – store the fixed programs for microcontrollers (firmware)


Mask Programmable NOR ROM (1)

• "1" bit stored - absence of an active transistor
• "0" bit stored - presence of an active transistor

NOR ROM with 4-bit words

• Each column Ci (NOR gate) corresponds to one bit of the stored word
• A word is selected by raising the corresponding wordline to "1"
• All the wordlines are "0" except the selected wordline, which is "1"
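
The read operation described on this slide can be sketched as a logic-level model. The array contents and the function name are invented for the example; a 1 in the `transistors` table marks an active transistor at that wordline/column crossing, so the stored bit is the inverse of the table entry.

```python
def nor_rom_read(transistors, selected_row):
    """Read one word from a NOR ROM model; returns the list of column values."""
    n_rows = len(transistors)
    n_cols = len(transistors[0])
    # exactly one wordline is "1", all others are "0"
    wordlines = [1 if r == selected_row else 0 for r in range(n_rows)]
    word = []
    for c in range(n_cols):
        # column = NOR over all wordlines that have a transistor on this column
        pulled_low = any(wordlines[r] and transistors[r][c] for r in range(n_rows))
        word.append(0 if pulled_low else 1)
    return word


# Hypothetical 4 x 4 array: 1 = active transistor present
EXAMPLE_ARRAY = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
```

Selecting the first row of the example array returns the word 0101: every column with a transistor on the selected wordline is pulled to "0", all others stay at "1".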

Mask Programmable NOR ROM (2)

[Figure: NOR ROM array layout; the cell transistors (drain D, gate G, source S) share a common ground line.]

• "1" bit stored - the drain/source connection (or the gate electrode) is omitted in the final
metallization step
• "0" bit stored - the drain of the corresponding transistor is connected to the metal bit line

Cost efficient, since only few masks have to be manufactured



Implant Mask Programmable NOR ROM

Idea: deactivation of the NMOS transistors by raising their threshold voltage above the VOH
level through channel implants
• “1” bit stored - the corresponding transistor is turned off through channel implant
• “0” bit stored - non-implanted (normal) transistors
Advantage: higher density (smaller area)!

Implant Mask Programmable NAND ROM (1)

• "1" bit stored - presence of a transistor that can be switched off
• "0" bit stored - shorted/normally-on transistor

NAND ROM with 4-bit words

• Each column Ci (NAND gate) corresponds to one bit of the stored word
• A word is selected by setting the corresponding wordline Ri to "0"
• All the wordlines Ri are "1" except the selected wordline, which is "0"
• Normally-on transistors have a lower threshold voltage (channel implant)


Implant-Mask-Programmable NAND ROM (2)

[Figure: 4x4 bit NAND ROM array layout]

• The structure is more compact than NOR array (no contacts)


• The access time is larger than NOR array access time (chain of nMOS)

NOR Row Address Decoder for a NOR ROM Array

A1  A2 | R1  R2  R3  R4
 0   0 |  1   0   0   0
 0   1 |  0   1   0   0
 1   0 |  0   0   1   0
 1   1 |  0   0   0   1

• The decoder must select one row by raising its voltage to logic "1"
• Different combinations of the address bits A1A2 select the desired row
• The NOR decoder array and the NOR ROM array are fabricated as two adjacent arrays,
using the same layout strategy
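
The truth table of the row decoder can be reproduced with one NOR gate per row. This is a logic-level sketch; which address literals feed which NOR gate is an assumption derived from the table, not taken from the slide's schematic.

```python
def nor(*inputs):
    """NOR gate: output is 1 only if all inputs are 0."""
    return 0 if any(inputs) else 1


def nor_row_decoder(a1, a2):
    """Return (R1, R2, R3, R4) for the address bits A1, A2."""
    na1, na2 = 1 - a1, 1 - a2      # complemented address lines
    r1 = nor(a1, a2)               # selected for A1 A2 = 0 0
    r2 = nor(a1, na2)              # selected for A1 A2 = 0 1
    r3 = nor(na1, a2)              # selected for A1 A2 = 1 0
    r4 = nor(na1, na2)             # selected for A1 A2 = 1 1
    return r1, r2, r3, r4
```

Each row line is "1" only when all inputs of its NOR gate are "0", so exactly one row is raised for every address combination.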

NAND Row Address Decoder for a NAND ROM Array

• The decoder has to lower the voltage level of the selected row to logic "0" while keeping all
the other rows at logic "1"
• The NAND row decoder of the NAND ROM array is implemented using the same layout
strategy as the memory itself

NOR Column Address Decoder for a NOR ROM Array

NOR address decoder + 2^M pass transistors
• Large area!

Binary selection tree decoder
• No NOR address decoder needed, but additional inverters are necessary!
• Smaller area
• Drawback - long data access time


Nonvolatile Read-Write Memories

• The architecture is similar to the ROM structure


• Array of transistors placed on a word-line/bit-line grid
• Special transistor that permits its threshold to be altered electrically
• Programming: selectively disabling or enabling some of these transistors
• Reprogramming: erasing the old threshold values and start a new programming cycle

Method of erasing:
• ultraviolet light - EPROMs
• electrically - EEPROMs

EPROM (1)
The floating gate avalanche-injection MOS (FAMOS) transistor:
• an extra polysilicon strip is inserted between the gate and the channel - floating gate
• impact: doubles the gate oxide thickness, reduces the transconductance, increases the
threshold voltage
• the threshold voltage is programmable by trapping electrons on the floating gate through
avalanche injection

Schematic
symbol


EPROM (2)

[Figure: three panels - avalanche injection; removing the programming voltage leaves charge trapped; programming results in higher VT]

• Electrons acquire sufficient energy to become "hot" and traverse the first oxide insulator
(100nm) so that they get trapped on the floating gate
• Electron accumulation on the floating gate is a self-limiting process that increases the
threshold voltage (~7V)
• The trapped charge can be stored for many years
• The erasure is performed by shining strong ultraviolet light on the cells through a
transparent window in the package
• The UV radiation renders the oxide conductive by direct generation of electron-hole pairs

EPROM (3)

• The erasure process is slow (on the order of minutes)
• The erasure procedure is off-system!
• Programming takes several µs per word
• Limited endurance: max. 1000 erase/program cycles
• The cell is very simple and dense: large memories at low cost!
• Suited to applications that do not require regular reprogramming


EEPROM

• Provides an electrical-erasure procedure
• Modified floating-gate device, the floating-gate tunneling oxide (FLOTOX) transistor:
– reduced distance between floating gate and channel near the drain
– Fowler-Nordheim tunneling mechanism (when about 10 V is applied across the thin insulator)
• Reversible programming by reversing the applied voltage (raising and lowering the threshold voltage) → difficult to control the threshold voltage → extra transistor required as access device
• Larger area than EPROM
• More expensive technology than EPROM
• Offers a higher versatility than EPROM
• Can support 10^5 erase/write cycles

Flash Memories
Combines the density of the EPROM with the versatility of EEPROM structures
• Programming: avalanche hot-electron-injection
• Erasure: Fowler-Nordheim tunneling (as for EEPROM cells)
• Difference: erasure is performed in bulk for the complete memory chip (or a subsection of it) -
reduction in flexibility!
• Extra access transistor of the EEPROM is eliminated because the global erasure process
allows a careful monitoring of the device characteristics and control of the threshold
voltage!
• High integration density

ETOX Flash cell - introduced by Intel



Static Random Access Memory - SRAM (1)

• Permits the modification (writing) of stored data bits
• The stored data can be retained indefinitely (as long as power is supplied), without any refresh operation
• Data storage cell: simple latch circuit with 2 stable states
• A sufficiently large voltage disturbance → the latch switches from one stable point to the other stable point
• Two switches are required to access (r/w) the data

[Figure: (a) two-inverter latch; (b) its voltage transfer characteristic (vo versus vI) intersecting the line vo = vI in two stable Q-points (near VOH and VOL) and one unstable Q-point between them]

Static Random Access Memory - SRAM (2)

a) general structure of an SRAM cell based on a two-inverter latch circuit
b) implementation of the SRAM cell
c) resistive-load (undoped polysilicon resistors) SRAM cell
d) depletion-load NMOS SRAM cell
e) full CMOS SRAM cell


Resistive Load SRAM Cell - Operation Principle (1)

• MP1, MP2: pull-up transistors that charge up the large column parasitic capacitances CC and C̄C
• The steady-state column voltage: VC = VDD − VT ≈ 3.5 V
• The memory content is defined to be located at node V1 (V1 and V2 are the two storage nodes of the latch)

The basic operations on SRAM cells

RS = 1 (M3, M4 on)
• Read/Write “1”
• Read/Write “0”
RS = 0 (M3, M4 off)
• data is being held

Resistive Load SRAM Cell - Operation Principle (2)

VC and VC̄ denote the voltages of the two complementary columns C and C̄.

• Write “1” operation (RS = 1: M3, M4 on)
VC̄ is forced to 0 by the data write circuitry, V2 decreases to 0, M1 turns off; V1 increases;
Final state: V1 = 1, V2 = 0

• Read “1” operation (RS = 1: M3, M4 on)
M1 off; M2, M4 on; VC̄ is pulled down; VC > VC̄ is read as a logic “1”

• Write “0” operation (RS = 1: M3, M4 on)
VC is forced to 0 by the data write circuitry, V1 goes to 0, M2 turns off; V2 increases to 1
Final state: V1 = 0, V2 = 1

• Read “0” operation (RS = 1: M3, M4 on)
M2 off; M1, M3 on; VC is pulled down; VC < VC̄ is read as a logic “0”


Full CMOS SRAM Cell

• Low-power SRAM cell: the static power dissipation is limited to the leakage current; significant current is drawn only during a
switching event
• The pMOS pull-up transistors allow the column voltage to reach the full VDD level
• High noise immunity due to larger noise margins
• Can operate at lower power supply voltages than the resistive-load SRAM cell
• Drawback: large area!
CMOS SRAM Cell Design Strategy (1)

[Figures: layout of the resistive-load SRAM cell; layout of the CMOS SRAM cell]


CMOS SRAM Cell Design Strategy (2)


(1) The data read operation should not destroy the stored information
Assume that a logic “0” is stored in the cell (V1 = 0, V2 = 1: M1, M6-linear; M2, M5-off)

• RS = 0: M3, M4-off;
• RS = 1: M3-saturation; M4, M1-linear
VC decreases , V1 increases slowly
Condition - M2 must remain turned off during
the data reading operation:
V1,max ≤ VT,2 ; I_M3 = I_M1 ⇒

Design rule:
$$\frac{(W/L)_3}{(W/L)_1} < \frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} - 2\,V_{T,n})^2}$$

A symmetrical rule is valid also for M2 and M4
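As a rough numerical illustration of the read-stability rule above, the following sketch evaluates the limit on (W/L)3 / (W/L)1. The function name and the values VDD = 5 V and VT,n = 1 V are assumptions chosen for illustration only; they are not taken from the lecture.

```python
def read_ratio_limit(vdd: float, vtn: float) -> float:
    """Limit on (W/L)_3 / (W/L)_1 that keeps V1 below VT,2 during a read."""
    if vdd <= 2.0 * vtn:
        raise ValueError("the expression is only meaningful for VDD > 2*VT,n")
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


# Illustrative (assumed) process values, not from the slides.
limit = read_ratio_limit(vdd=5.0, vtn=1.0)
print(f"(W/L)3 / (W/L)1 must stay below {limit:.3f}")
```

With these assumed values the access transistor M3 has to be weaker than the driver M1, which is the qualitative message of the design rule.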

CMOS SRAM Cell Design Strategy (3)
(2) The cell should allow modification of the stored information during the data write phase

Consider the write “0“ operation, assuming that “1“ is stored in the cell (V1 = 1, V2 = 0: M1,
M6-off; M2, M5-linear)
• RS = 0: M3, M4-off;
• RS = 1: M3, M4 saturation, M5-linear
In order to change the stored information to V1 = 0, V2 = 1, M1 must turn on and M2 must turn off.
But V2 < VT,1 (previous design condition) ⇒ M1 cannot be switched on ⇒ M2 must be
switched off ⇒ V1 must be reduced below VT,2
V1 ≤ VT,2 ; I_M3 = I_M5 ⇒

Design rule:
$$\frac{(W/L)_5}{(W/L)_3} < \frac{\mu_n}{\mu_p} \cdot \frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} + V_{T,p})^2}$$

A symmetrical rule is valid also for M6 and M4
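The right-hand side of the write design rule can be evaluated in the same way. This is only a sketch: the function name, the supply and threshold voltages, and the mobility ratio µn/µp = 2.5 are illustrative assumptions, not values from the lecture.

```python
def write_ratio_rhs(vdd: float, vtn: float, vtp: float, mobility_ratio: float) -> float:
    """Right-hand side of the write rule for (W/L)_5 / (W/L)_3.

    vtp is the pMOS threshold voltage and is negative for an enhancement device.
    mobility_ratio is mu_n / mu_p.
    """
    return mobility_ratio * 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd + vtp) ** 2


# Illustrative (assumed) values only.
rhs = write_ratio_rhs(vdd=5.0, vtn=1.0, vtp=-1.0, mobility_ratio=2.5)
print(f"write rule right-hand side: {rhs:.5f}")
```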

SRAM Write Circuitry

W̄ | DATA | WB | W̄B | Operation
0 | 1 | 1 | 0 | M1 off, M2 on; VC high, VC̄ low
0 | 0 | 0 | 1 | M1 on, M2 off; VC low, VC̄ high
1 | X | 0 | 0 | M1, M2 off; VC and VC̄ both high

The write operation is performed by forcing the voltage level of either column (bit line) to “0”
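The truth table above can be captured in a small behavioral model. The Boolean expressions below are inferred from the table rows (an assumption; the slide only shows the table), with the first column treated as an active-low write enable.

```python
def write_control(w_bar: int, data: int) -> tuple:
    """Return (WB, WB_bar) for an active-low write enable and a data bit.

    Assumed logic, read off the table: a control line is asserted only
    while writing (w_bar == 0), and which one depends on the data bit.
    """
    wb = int(w_bar == 0 and data == 1)
    wb_bar = int(w_bar == 0 and data == 0)
    return wb, wb_bar


for w_bar, data in [(0, 1), (0, 0), (1, 0), (1, 1)]:
    print(w_bar, data, write_control(w_bar, data))
```

When writing is disabled both control lines are 0, so neither column is pulled low and both stay at their precharged high level.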

SRAM Read Circuitry

The read circuitry must detect a very small voltage difference between the two complementary columns (sense amplifier)

$$\frac{\partial (V_{o1} - V_{o2})}{\partial (V_C - V_{\bar{C}})} = -R \cdot g_m, \quad \text{where } g_m = \frac{\partial I_D}{\partial V_{GS}} = \sqrt{2 k_n I_D}$$

The gain can be increased by using


• active loads
• cascode configuration
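To get a feeling for the magnitude of the differential gain −R·g_m, the sketch below evaluates it for one operating point. All numbers (load resistance, transconductance parameter k_n, drain current) are assumed for illustration and do not come from the lecture.

```python
import math


def differential_gain(r_load: float, k_n: float, i_d: float) -> float:
    """Small-signal gain -R * g_m with g_m = sqrt(2 * k_n * I_D)."""
    g_m = math.sqrt(2.0 * k_n * i_d)
    return -r_load * g_m


# Assumed operating point: R = 10 kOhm, k_n = 100 uA/V^2, I_D = 50 uA.
gain = differential_gain(r_load=10e3, k_n=100e-6, i_d=50e-6)
print(f"differential gain = {gain:.2f}")
```

With these assumed values the gain magnitude is only about one, which illustrates why a simple resistive load is usually not sufficient on its own.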

Precharging of the bit lines plays a significant role in the access time!
• The bit lines are equalized prior to each new access (between two access cycles)


Dual Port SRAM Arrays

Allows simultaneous access to the same


location in the memory array (systems
with multiple high speed processors).

• Eliminates wait states for the processors during data read operations
• Problems can occur if:
– two processors attempt to write data simultaneously onto the same cell
– one processor attempts to read while the other writes data onto the same cell
• Solution: contention arbitration logic

Dynamic Random Access Memories - DRAM (1)
SRAM drawbacks
• large area: 4-6 transistors/bit + 4 line connections
• static power dissipation (exception CMOS SRAM)

Need for high density RAM arrays → DRAM

DRAM
• binary data is stored as charge in a capacitor
• requires periodic refreshing of the stored data
• no static power dissipation

4-transistor DRAM cell


• one of the earliest DRAM cells
• derived from 6 transistor SRAM cell
• two storage nodes (parasitic capacitances)
• large area


Dynamic Random Access Memories - DRAM (2)

3-transistor DRAM cell


• 1 transistor - storage device
• 2 transistors for r/w access (switches)
• 2 r/w control lines
• 2 I/O lines

1-transistor DRAM cell


• 1 transistor for r/w access
• 1 explicit capacitor - information storage
• 1 r/w control line
• 1 I/O line

Three-Transistor DRAM Cell (1)

MP1, MP2: pull-up (precharge) transistors
M2: storage transistor (on or off depending on the charge stored in C1)
M1, M3: access switches
C2, C3 >> C1

Two-phase non-overlapping clock scheme
CLK1: precharge events
CLK2: r/w events (CLK1 low)


Three-Transistor DRAM Cell (2)

• Every r/w operation is preceded by a precharge cycle in which C2 and C3 are charged up
• Refresh operation (row): data are read, inverted and written back into the same cell
location every 2-4 ms

Three-Transistor DRAM Cell (3)
WRITE 1 operation:

• Precharge: C2, C3 charged up to the logic 1 level
• DATA = 0 (the line driving MD carries the inverted data), MD off; WS = 1, M1 on ⇒ the charge on
C2 is shared with C1
• After the write operation: WS = 0, M1 off; since C1 is
charged up to 1, M2 is on

READ 1 operation:

• Precharge: C2, C3 charged up to the logic 1 level
• RS = 1, M3 on, M2 on
• C3 discharges through M2 and M3, and the falling
column voltage is interpreted as a stored 1


Three-Transistor DRAM Cell (4)

WRITE 0 operation:
• Precharge: C2, C3 charged up to the logic 1 level
• DATA = 1, MD on; WS = 1, M1 on ⇒ C2 and C1 are
pulled to 0 through M1 and MD
• After the write operation: WS = 0, M1 off; since C1 is
discharged to 0, M2 is off

READ 0 operation:
• Precharge: C2, C3 charged up to the logic 1 level
• RS = 1, M3 on; M2 off
• C3 does not discharge; the remaining logic 1 level is
interpreted as a stored 0

C1 is discharged by the leakage currents of M1, so the data must be periodically read,
inverted and written back!
One-Transistor DRAM Cell (1)

• 1 transistor M1
• 1 explicit capacitor C1: 30-100 fF, (C1<<C2)

Charge sharing between C2 and C1 has a key role in the r/w operations

• Data WRITE:
“1”: D = 1, R/W = 1, M1 on; C1 charges up to the 1 level
“0”: D = 0, R/W = 1, M1 on; C1 discharges to the 0 level

• Data READ (destructive operation):


Precharge C2
R/W = 1 M1-on; charge sharing between C1 and C2
Data refresh operation is required!
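The charge sharing that takes place during a read can be sketched numerically with charge conservation on the two capacitors. The capacitance and voltage values below are assumptions for illustration (C1 inside the 30-100 fF range quoted above, C2 ten times larger); the precharge level in particular is a hypothetical choice.

```python
def shared_voltage(c_cell: float, v_cell: float, c_bit: float, v_bit: float) -> float:
    """Final voltage after connecting two charged capacitors (charge conservation)."""
    return (c_cell * v_cell + c_bit * v_bit) / (c_cell + c_bit)


C1 = 50e-15        # storage capacitor, assumed 50 fF
C2 = 500e-15       # column (bit line) capacitance, assumed 500 fF
V_PRECHARGE = 2.5  # assumed bit-line precharge level in volts

for stored_bit, v_cell in (("1", 5.0), ("0", 0.0)):
    v_final = shared_voltage(C1, v_cell, C2, V_PRECHARGE)
    print(f"stored {stored_bit}: bit line moves by {v_final - V_PRECHARGE:+.3f} V")
```

Because C1 << C2, the bit line moves by only a few hundred millivolts, and the cell voltage ends up close to the precharge level. This is why the read is destructive and must be followed by a refresh.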

One-Transistor DRAM cell (2)

One-transistor DRAM cell with trench capacitor (cross-section). Figure legend:
(1) Data
(2) Gate
(3) Drain area
(4) Source area
(5) Field oxide
(6) Capacitor plate (poly-Si)
(7) Capacitor insulator
(8) Storage node electrode (poly-Si)
(9) Substrate (Si)
Unnumbered label in the figure: refilling poly

Data Read Example (1)

• DRAM with 256 cells per column
• The storage array is split in half
• A cross-coupled dynamic latch is used to restore the signal levels
• The dummy cell has a capacitance equal to half of the storage capacitance value
[Figure: three-stage read-refresh operation]


Data Read Example (2)

Precharge phase (1)

• Precharge devices are turned on; CD and C̄D are charged up to the “1” level
• The dummy nodes X and Y are pulled to “0” level
• During this phase all other signals are inactive

Data Read Example (3)

Row selection phase (2)

• One of the 256 word lines is raised to “1” (cell R128 is selected)
• The corresponding dummy cell on the other side is also selected (right)
• Charge sharing takes place between the selected cell and CD (depending on the value stored by the cell,
“0” or “1”) and between the dummy cell and C̄D
• The stored voltage level is detected through the charge sharing


Data Read Example (4)

Read refresh phase (3)

• Performed during the active phase of the CS (column-select signal)


• The slight voltage difference between the two half-columns is amplified and the
latch forces the two half-columns into opposite states
• The voltage level on the accessed cell is restored

DRAM Architectures

Name | Feature | Die size increase | Frequency (system level) | Application
DRAM | Fast page mode | - | 25 MHz | Main memory
VRAM | DRAM + SAM | 50% | 40 MHz | Video display buffer
EDO | DRAM with modified CAS | 0% | 40-50 MHz | Main memory, low-end graphic memory
SDRAM | Sync. DRAM + Register (Latch) | 0-10% | 60-150 MHz | Main memory in workstations, high-end PCs, middle-range graphic memory
SGRAM | SDRAM + Block write + WPB | 10% | 60-150 MHz, 3 Gb/s | High-end memory
CDRAM | Sync. DRAM + SRAM + DTB | 7-10% | 66 MHz | Low-end PC
RDRAM | Sync. DRAM + Rambus I/O | 12-15% | 250 MHz | High-end PC, graphic memory
3D-RAM | Sync. DRAM + SRAM + SAM + ALU | ? | 400 Mb/s ext, 1.6 Gb/s int | High-end graphic memory
EDRAM | DRAM + SRAM | ? | ? | Low-end PC
SVRAM | Sync. DRAM + SAM | 50% | 100 MHz | High-end graphic memory
WRAM | VRAM with localized SAM | <40% | 66 MHz | Middle to high-end graphic memory

Summary

• the memory architecture has a major impact on the ease of use of the memory, its
reliability and yield, its performance and power consumption;
• memories are organized as arrays of cells; an individual cell is addressed by a
column and row address;
• the memory cells should be designed so that a maximum signal is obtained in a
minimum area; the cell design is dominated by technological considerations and
most of the improvement in density results from scaling and advanced
manufacturing processes;
• we have discussed cells for read-only memories (NOR and NAND ROM), nonvolatile
memories (EPROM, EEPROM and FLASH) and read-write memories (SRAM and
DRAM)
• the peripheral circuitry is very important to operate the memory in a reliable way
and with reasonable performance; decoders, sense amplifiers and I/O buffers are
an integral part of every memory design;

18. ASIC Design Guidelines


Introduction

• The following design guidelines have been adapted from [2]:


European Silicon Structures (ES2), Zone
Industrielle, 13106 France. Solo 2030 User
Guide, e02a02 edition, June 1992
• These recommendations are useful in order to avoid functional
faults and get the desired functionality

Institute of
Microelectronic
18: ASIC Design Guidelines Systems 2
Synchronous Circuits (1)

• All data storage elements are clocked


• The same active edge of a single clock is applied at precisely the
same time to all storage elements


Synchronous Circuits (2)

• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving clock input of another Flip-flop:

– The clock input of the second FF is skewed by the clock-to-q delay
of the first FF and is not activated at every active clock edge (e.g.
ripple counter)

Synchronous Circuits (3)

• NON-RECOMMENDED CIRCUITS:
– Gated clock line:

– Clock skew caused by gating the clock line (e.g. multiplexer in clock
line)


Synchronous Circuits (4)

• NON-RECOMMENDED CIRCUITS:
– Double-edged clocking:

– FFs are clocked on the opposite edges of the clock signal


– Insertion of scan-path impossible
– Difficulties in determining critical path lengths

Synchronous Circuits (5)

• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving asynchronous reset of another Flip-flop:

– Synchronous design principle, that all FFs change state at exactly


the same time is not fulfilled
• Recommended Circuits will be described during the following
sections


Clock Buffering (1)

• NON-RECOMMENDED CIRCUITS:
– Unequal depth of clock buffering:

– causes clock skew

Clock Buffering (2)

• NON-RECOMMENDED CIRCUITS:
– Unbalanced fanout of clock buffers:

– Clock skew by different


load-dependent delays
– Excessive clock fanout
should be avoided (slow edges)


Clock Buffering (3)

• Recommended circuits:
– Balanced clock tree buffering

– Same depth of buffering


– Same fanout
– Limited fanout in order to
achieve sharp clock edges

Clock Buffering (4)

• Recommended circuits:
– Combined geometric/tree buffering

– Using intermediate buffer


of suitable strength at each
fanout point


Gated Clocks (1)

• NON-RECOMMENDED CIRCUITS:
– Multiplexer on clock line:

– Signal change at multiplexer input can cause a glitch at the clk input
(FF captures invalid data)
– Gating the clock line introduces clock skew

Gated Clocks (2)

• Recommended circuits:
1) Enabled (E-type) flip-flop: 2) Toggle (T-type) flip-flop:
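The idea behind both recommended cells is that the clock itself is never gated: every flip-flop sees every clock edge, and an enable or toggle input decides whether the state changes. A minimal behavioral sketch, with class and method names that are purely illustrative:

```python
class EnableFlipFlop:
    """E-type flip-flop: loads d on the clock edge only when en is 1."""

    def __init__(self) -> None:
        self.q = 0

    def clock(self, d: int, en: int) -> int:
        if en:
            self.q = d  # load new data
        return self.q   # otherwise the old state is kept


class ToggleFlipFlop:
    """T-type flip-flop: inverts its state on the clock edge when t is 1."""

    def __init__(self) -> None:
        self.q = 0

    def clock(self, t: int) -> int:
        self.q ^= t
        return self.q


e_ff = EnableFlipFlop()
print([e_ff.clock(d, en) for d, en in [(1, 0), (1, 1), (0, 0), (0, 1)]])
t_ff = ToggleFlipFlop()
print([t_ff.clock(t) for t in [1, 0, 1, 1]])
```

In hardware the E-type behavior is commonly obtained with a multiplexer in front of the D input that feeds back the current output when the enable is inactive.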


Double-edged Clocking (1)

• NON-RECOMMENDED CIRCUITS:
– Pipelined logic with double-edged clocking:

– Not recommended in context with scan-path methods

Double-edged Clocking (2)

• Recommended circuits:
– Pipelined logic with single-edged clocking:


Asynchronous Resets (1)

• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving the asynchronous reset of another flip-flop:

Asynchronous Resets (2)

• Recommended circuits:
– Global asynchronous reset by external signal:


Asynchronous Resets (3)

• Recommended circuits:
– Flip-flop driving the synchronous reset of another flip-flop:

Shift Registers (1)

• NON-RECOMMENDED CIRCUITS:
– Shift register with forward or reverse chain of clock buffers:

– Internal clock skew can cause data fallthrough


Shift Registers (2)

• Recommended circuits:
– Shift register with balanced tree of clock buffers:

Asynchronous Inputs (1)

• NON-RECOMMENDED CIRCUITS:
– Circuits with complicated feedback loops to capture asynchronous
inputs (very sensitive to noise, and functionality can be influenced
by placement and routing delays)


Asynchronous Inputs (2)

• Recommended circuits:
– Chain of two or more D-type flip-flops for capturing an asynchronous
input:

– The probability of propagating a metastable state is decreased with


increasing number of register stages

Asynchronous Inputs (3)

• Recommended circuits:
– Use of 4-bit register as shift register for capturing an asynchronous
input:

– The probability of propagating a metastable state is decreased with


increasing number of register stages


Asynchronous Inputs (4)

• Recommended circuits:
– Asynchronous handshake circuit:

Asynchronous Inputs (5)

• The asynchronous handshake circuit works as follows:

a) The first flip-flop is reset asynchronously when the r input is zero or
when the qb outputs of the second and the third FF both have the
value 0
b) The q-output of the first FF is asynchronously set to high when a
positive edge arises at its ck-input
c) The high output of the first FF is propagated through the second
and the third FF in the two following cycles. The qb-outputs of these
FFs are then zero and the reset logic for the first FF is activated.
Now the first FF is ready to receive another edge at its input.
d) ...


Asynchronous Inputs (6)

d) Three cases of metastability caused by simultaneously rising edges


of the asynchronous input and the system clock:
1) the second FF stabilizes to q=1 before the next rising clock
edge (circuit works as desired)
2) the second FF settles to q=0 and the third FF remains in its
state. Since the output q of the first FF is high, the propagation
of this output works correctly, but it needs one cycle more than
in the first case.
3) The metastable state of the second FF is still there at the next
rising edge of the clock signal. Then the third FF also becomes
metastable. The probability of receiving a metastable d
(internal) signal can be reduced by increasing the length of the
register chain.

Asynchronous Inputs (7)

• Operation of asynchronous handshake circuit:


Delay Lines and Monostables (1)

• NON-RECOMMENDED CIRCUITS:
– In general, it cannot be recommended to build circuits with a
functionality that relies on delays.

– E.g. monostable pulse generator:

Delay Lines and Monostables (2)

• NON-RECOMMENDED CIRCUITS:
– Pulse generator using flip-flop:

– Multivibrator:


Delay Lines and Monostables (3)

• Recommended circuits:
– Synchronous pulse generator:

– Usage of higher clock speed


– Minimum time resolution is given by clock cycle

Bistable Elements (1)

• NON-RECOMMENDED CIRCUITS:
– Cross-coupled flip-flops and RS-flip-flops
– Bistable storing elements formed by cross-coupled NAND or NOR
gates:


Bistable Elements (2)

• NON-RECOMMENDED CIRCUITS:
– Asynchronous RS-flip-flop:

Bistable Elements (3)

• Recommended circuits:
– Use D-types with set/reset
– Use latch configured as RS flip-flop:


RAMs and ROMs in Synchronous Circuits (1)

• Problem: RAMs are double-edge triggered. The address is


latched on the opposite edge to the data
• Timing scheme:

RAMs and ROMs in Synchronous Circuits (2)

• Recommended circuits:
– Interfacing RAM into synchronous circuit: ME and WEbar generation


RAMs and ROMs in Synchronous Circuits (3)

• Recommended circuits:
– Using flip-flop for WEbar generation: timing scheme

RAMs and ROMs in Synchronous Circuits (4)

• Recommended circuits:
– Avoiding floating RAM/DPRAM output propagation


Tristates (1)

• NON-RECOMMENDED CIRCUITS:
– Tristate bus with non-central enable control:

Tristates (2)

• Recommended circuits:
– Tristate bus with central control of all tristate enable signals and one
additional driver that is activated on non-controlled states


Tristates vs. Multiplexer

Tristates:
– large area
– limited buffering
– large routing load → slow

Multiplexer:
– small area
– efficient routing

• Control decoding expense is the same for tristates and multiplexers.
• → Multiplexers are more favourable

Parallel Signals

• NON-RECOMMENDED CIRCUITS:
– Wired-OR part used to create higher fanout:

• Recommended Circuits:
– High-fanout buffer replacing wired OR part


Fanout (1)

• NON-RECOMMENDED CIRCUITS:
– Excessive fanout on
control signals:

Fanout (2)

• Recommended circuits:
– Geometric buffering
on control signal:


Fanout (3)

• Recommended circuits:
– Tree buffering
on control signal:

Design for Speed (1)

• Use a maximum of 2 inputs on all combinational logic gates:

• Use AOI logic (complex cells from standard cell library) where
possible. The figure below shows a multiplexer using AOI logic:


Design for Speed (2)

• Feed late changing inputs late into combinational logic:

• Use shift (Johnson) counters instead of binary counters:

q0 q1 q2 q3
0 0 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 1 1
0 1 1 1
0 0 1 1
0 0 0 1
0 0 0 0
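The state sequence of the 4-bit Johnson counter listed above can be reproduced with a short behavioral model: the register shifts by one position per clock and the inverted last bit is fed back into the first stage. The function name and list representation are illustrative choices.

```python
def johnson_sequence(n_bits: int = 4, steps: int = 9) -> list:
    """Return the first `steps` states (q0..q{n-1}) of a Johnson counter."""
    state = [0] * n_bits
    sequence = []
    for _ in range(steps):
        sequence.append(tuple(state))
        # Shift towards higher indices; feed back the inverted last bit.
        state = [1 - state[-1]] + state[:-1]
    return sequence


for q in johnson_sequence():
    print(*q)
```

Only one flip-flop changes state per clock and the feedback needs just an inverter, so there is no multi-level next-state logic in the critical path. The price is that n flip-flops provide only 2n states instead of 2^n.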
Design for Speed (3)

• Use duplicate logic to reduce fanout:

• Use fast library cells where available


• Reduce length of critical signal paths
• Use Schmitt trigger inputs in noisy environments


Design for Testability (1)

• Testability = Controllability + Observability


• NON-RECOMMENDED CIRCUITS:
– Circuit with inaccessible internal logic: only the first block is
controllable, and only the last block is directly observable

Design for Testability (2)

• Recommended circuit:
– Insert test inputs and outputs


Design for Testability (3)

• NON-RECOMMENDED CIRCUITS:
– Chain of counters: first counter is not directly observable and
second counter is not directly controllable

Design for Testability (4)

• Recommended circuit:
– Break long counter / shift register chains
– Chain of counters broken by test input tc and output signals:


Design for Testability (5)

• NON-RECOMMENDED CIRCUITS:
– Counter with closed feedback loop: initial state is not known

Design for Testability (6)

• Recommended circuit:
– Open feedback loops
– Counter with feedback loop opened by test control tr and output
signals:


Design for Testability (7)

• Recommended circuits:
– Use BIST (Built-In-Self-Test) with compiled megacells
– Compiled megacell with compiled inputs/outputs:

Design for Testability (8)

• Recommended circuits:
– Scan path testing
– E-type scan path flip-flop (right):
– Circuit with scan path (below):


Design for Testability (9)

• Recommended circuits:
– Use of JTAG boundary scan path
– JTAG test circuitry:

19. Testing and
Design for Testability


Motivation

• Stable chip manufacturing costs
• Increasing testing costs:
– Increasing number of gates/device
– Limited number of pins
– → Increasing number of internal states
– → Increasing logical and sequential depth
• Example: testing of a combinational circuit with n inputs (10 MHz, one test per cycle):

n | time for test
25 | 3 s
30 | 107 s
40 | 1 day
50 | 3.5 years
60 | 3656 years

• Testability has to be considered in all phases of design
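The test times in the example are consistent with an exhaustive test, i.e. applying all 2^n input patterns at one pattern per clock cycle. The sketch below recomputes them under that assumption; the function name and the 365.25-day year are illustrative choices.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def exhaustive_test_time_s(n_inputs: int, test_rate_hz: float = 10e6) -> float:
    """Time in seconds to apply all 2**n input patterns at the given rate."""
    return 2 ** n_inputs / test_rate_hz


for n in (25, 30, 40, 50, 60):
    t = exhaustive_test_time_s(n)
    print(f"n = {n}: {t:.3g} s = {t / SECONDS_PER_DAY:.3g} days "
          f"= {t / SECONDS_PER_YEAR:.3g} years")
```

Every additional input doubles the test time, which is why exhaustive testing is only feasible for small combinational blocks.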
Institute of
Microelectronic
19: Testing Systems 2
Economical Considerations (1)

• Average Quality Level (AQL):

$$\mathrm{AQL} = \frac{\#\,\text{defective parts}}{\#\,\text{accepted parts}}$$


Economical Considerations (2)

• Correlation: Fault Coverage and Defective Parts

Economical Considerations (3)

• Correlation: Fault Coverage and Defective Parts

– DL (= AQL): defect level; fraction of defective circuits which have
been classified as working correctly (testing with fault coverage T)
– Y: yield
– T: fault coverage

$$DL = 1 - Y^{\,1-T}$$
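A few sample values make the relation DL = 1 − Y^(1−T) more tangible. The yield and coverage numbers below are illustrative assumptions only.

```python
def defect_level(process_yield: float, fault_coverage: float) -> float:
    """Defect level DL = 1 - Y**(1 - T), with Y and T given as fractions in [0, 1]."""
    if not (0.0 < process_yield <= 1.0 and 0.0 <= fault_coverage <= 1.0):
        raise ValueError("yield must be in (0, 1] and coverage in [0, 1]")
    return 1.0 - process_yield ** (1.0 - fault_coverage)


# Assumed yield of 50 % with increasing fault coverage.
for coverage in (0.0, 0.9, 0.99, 1.0):
    print(f"T = {coverage:.2f}: DL = {defect_level(0.5, coverage):.4f}")
```

With no testing (T = 0) the defect level equals 1 − Y, and it drops to zero only for complete fault coverage.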


Economical Considerations (4)

Defect level as function of yield and fault coverage

Design Flow: Testing (1)


Design Flow: Testing (2)

• Chip Test after Manufacturing:

Manufacturing Process

Parametric Test (current/power dissipation)


(erroneous chips are marked with color points and removed after sawing)

Chip Test on Tester

Fundamental Definitions

• Relationship between faults, errors and failures:

fault → error → failure

• Fault: physical defect, imperfection or flaw which occurs in a
hardware or software component
• Error: manifestation of a fault (erroneous information on a
hardware line or in a program, caused by a fault)
• Failure: malfunction of a system
• Three-universe model of a system:

Physical Universe (faults) → Informational Universe (errors) → External Universe (failures)

Fault Models (1)

• Basis: physical phenomena
• Examples for physical faults:


– Oxide defects
– Missing implants
– Lithographic defects
– Junction defects
– Metal shorts & opens
– Moisture accumulation
– Impurities / Contaminations
– Static discharge

Fault Models (2)


Fault Models for Gates (1)

PHYSICAL (analog) → LOGICAL (digital)

• The GATE model: Stuck-at
– stuck-at-0
– stuck-at-1
– 1 fault at a time (single-stuck)
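The single stuck-at model can be tried out on a two-input NAND gate: for each line stuck at 0 or 1, the detecting test vectors are those for which the faulty output differs from the fault-free output. The line names "a", "b" (inputs) and "z" (output) are chosen here for illustration.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def faulty_nand(a: int, b: int, line: str, stuck_value: int) -> int:
    """NAND output with exactly one line ('a', 'b' or 'z') stuck at stuck_value."""
    if line == "a":
        a = stuck_value
    elif line == "b":
        b = stuck_value
    out = nand(a, b)
    return stuck_value if line == "z" else out


def detecting_vectors(line: str, stuck_value: int) -> list:
    """All input vectors (a, b) whose output reveals the given fault."""
    return [(a, b) for a, b in product((0, 1), repeat=2)
            if nand(a, b) != faulty_nand(a, b, line, stuck_value)]


for line, value in product("abz", (0, 1)):
    print(f"{line} stuck-at-{value}: detected by {detecting_vectors(line, value)}")
```

Several faults share the same detecting vector (for example all stuck-at-0 faults on the inputs and stuck-at-1 on the output), which is the basis of fault collapsing.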

Fault Models for Gates (2)

• Issue: complexity
– as 1 model:
• 12 faults

– as 12 gates:
• 30 (collapsed) faults
• 12x larger netlist
• → 30x computation

– as 60 transistors:
• 90 (collapsed) faults
• 60 transistors
• → 400x computation


Fault Models for Gates (3)

• The controversy:
– IBM: comprehensive stuck-at → no empirical need for MOS fault
models
– UNISYS: MOS model required for < 1% AQL

Fault Models for Gates (4)

• The MOS problem: Gates → Memory

• Example: the output floats
– Fault-free: C always driven
– Fault: C un-driven; assumes last value; sequential!
• → Need 2-pattern test
– set C to opposite
– test

branch | Set (A B) | Test (A B)
a | 0 0, 0 1 or 1 0 (anything works!) | 1 1
b | 1 1 | 0 1
c | 1 1 | 1 0
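The need for a two-pattern test can be reproduced with a switch-level sketch of a two-input CMOS NAND whose output keeps its last value when neither the pull-up nor the pull-down network conducts. The assignment of the branch names is an assumption made here: "a" is the series nMOS pull-down path, "b" the pMOS driven by A, and "c" the pMOS driven by B.

```python
def cmos_nand(a, b, previous, open_branch=None):
    """Output C of a 2-input CMOS NAND with an optional stuck-open branch.

    open_branch: None (fault-free), "a" (nMOS pull-down path),
    "b" (pMOS driven by A) or "c" (pMOS driven by B). Naming is assumed.
    """
    pull_up = (a == 0 and open_branch != "b") or (b == 0 and open_branch != "c")
    pull_down = a == 1 and b == 1 and open_branch != "a"
    if pull_up:
        return 1
    if pull_down:
        return 0
    return previous  # output floats and keeps its last value


def two_pattern_detects(open_branch, set_vec, test_vec):
    """True if applying set_vec and then test_vec exposes the stuck-open fault."""
    good = cmos_nand(*test_vec, previous=cmos_nand(*set_vec, previous=None))
    faulty_set = cmos_nand(*set_vec, previous=None, open_branch=open_branch)
    faulty = cmos_nand(*test_vec, previous=faulty_set, open_branch=open_branch)
    return good != faulty


print(two_pattern_detects("a", (0, 0), (1, 1)))  # set C to 1, then try to pull it down
print(two_pattern_detects("b", (1, 1), (0, 1)))  # set C to 0, then rely on the pMOS of A
print(two_pattern_detects("b", (0, 0), (0, 1)))  # wrong set pattern: fault stays hidden
```

The last call shows why the set pattern matters: if C already holds the value that the test pattern should produce, the floating output looks correct.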


Fault Tolerant Design (1)

• Fault tolerance achieved by redundancy techniques:


– Duplication with Complementary Logic
– Self-Checking Logic
– Reconfigurable Array Structures

Fault detection by
duplication with
complementary logic

Fault Tolerant Design (2)

4-by-4 array with one spare column


Fault Tolerant Design (3)

Reconfigured array

Test Pattern Generation (1)

• manually
• pseudo-random (reaches up to 60% fault coverage)
• algorithmic
• special test patterns for RAMs

• fault coverage sufficient?
→ fault simulation


The D-Algorithm (1)


• Every test generation procedure has to solve the following problems:
– Creation of a change at the faulty line
– Propagation of the change to the primary output line
• In the D-Algorithm the symbols D and D̄ are used to refer to the
changes. D and D̄ are used as follows:
– D: used if a line has the value 1 in the absence of a fault and the value 0
if the fault occurs
– D̄: used if a line has the value 0 if no fault occurs and otherwise the value 1
• The D-algorithm method for path sensitization consists of two principal
phases:
– forward drive (propagation) of a D-value to a primary output
– backward trace (consistency operation)
• These two steps are iterated for different propagation paths of the D-value
from one dedicated internal point i to one dedicated primary output
point o until the backward trace phase finishes without any
contradiction (a test vector for a fault at i has been found) or until all
possible paths from i to o have been examined.
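
The meaning of D and D̄ can be made concrete by treating every signal as a pair (fault-free value, faulty value). The following is only an illustrative sketch of this five-valued calculus for a NAND gate, not the full algorithm; the names `ZERO`, `ONE`, `D`, `DBAR` and `nand` are chosen here for illustration.

```python
# Each signal is a pair: (value in the fault-free circuit, value in the faulty circuit).
ZERO, ONE = (0, 0), (1, 1)
D, DBAR = (1, 0), (0, 1)
NAMES = {ZERO: "0", ONE: "1", D: "D", DBAR: "DBAR"}


def nand(x, y):
    """Evaluate a NAND gate separately in the fault-free and in the faulty circuit."""
    return tuple(1 - (p & q) for p, q in zip(x, y))


print(NAMES[nand(D, ONE)])   # side input 1 sensitizes the gate: the change propagates
print(NAMES[nand(D, ZERO)])  # side input 0 blocks the change: the output is a plain 1
```

Forward drive therefore amounts to choosing side inputs that keep a D or D̄ at the gate output, and the backward trace has to justify those side-input values from the primary inputs.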

The D-Algorithm (2)

Basic concept of D-algorithm


The D-Algorithm (3)

• A primitive D-cube of a failure is a D-cube associated with a fault l/α
on the output line l of a gate G. This produces the value D or D̄ on l, and
the input lines have values which would produce ᾱ (the complement of α) in the
fault-free case.

Primitive D-cube of fault (pdcf) for two-input NAND gate


The D-Algorithm (4)

• A propagation D-cube of a failure specifies the propagation of changes
at one (or more) inputs of a gate G to its output l.

Propagation D-cube (pdc) for two-input NAND gate


The D-Algorithm (5)

• A singular cover of a gate G is a {0, 1, X} truth table representation of G.

Singular cover for two-input NAND gate
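
As the table itself is not reproduced here, the following sketch shows one way to derive a singular cover from a gate function: enumerate all {0, 1, X} input cubes, keep those on which the output is constant, and drop every cube contained in a larger one. The function name `singular_cover` is hypothetical.

```python
from itertools import product


def singular_cover(gate, n_inputs):
    """Return the maximal {0, 1, X} input cubes on which `gate` has a constant output."""
    def expand(cube):
        # All fully specified input combinations contained in the cube
        choices = [(0, 1) if v == "X" else (v,) for v in cube]
        return product(*choices)

    valid = {}
    for cube in product((0, 1, "X"), repeat=n_inputs):
        outputs = {gate(*point) for point in expand(cube)}
        if len(outputs) == 1:
            valid[cube] = outputs.pop()

    def covers(big, small):
        return big != small and all(b == "X" or b == s for b, s in zip(big, small))

    return {cube: out for cube, out in valid.items()
            if not any(covers(other, cube) for other in valid)}


cover = singular_cover(lambda a, b: 1 - (a & b), 2)
for cube, out in cover.items():
    print(cube, "->", out)
```

For the two-input NAND this yields three rows: a 0 on either input forces the output to 1 regardless of the other input, and only 1 1 gives 0.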

The D-Algorithm (6)

Singular covers for several basic logic gates


The D-Algorithm (7)

Construction of the
singular cover of a
logic module

D-Algorithm Example (1)

• In the following, the D-Algorithm is illustrated for the example circuit given below:


D-Algorithm Example (2)

Propagation D-cube table

D-Algorithm Example (3)

Singular cover table


D-Algorithm Example (4)

D-cube intersection table

D-Algorithm Example (5)

• Running the D-Algorithm for generating a test for line 5/0:

1) Start with the D-cube for the fault 5/0:

2) The D of line 5 is automatically propagated to lines 6 and 7 by cube j

3) Now the propagation along path 6 → 9 → 11 is considered: D on
line 6 is propagated to line 9 by cube d. Combining d and k yields
cube l:


D-Algorithm Example (6)

• Running the D-Algorithm (continued):


4) If cube i is used with D̄ instead of D, the propagation to the output
can be done:

5) Now the consistency phase is started and a value for line 4 has to
be found. From the singular cover table it can be seen that a 0 on
line 10 implies both line 7 and line 8 to be 1. In cube m line 7 is a D
(and also line 5 which is connected to 7 by j), and this D must now
be set to 1, which is a contradiction that disables the path
sensitization 5 → 6/7 → 9 → 11.

D-Algorithm Example (7)

• Running the D-Algorithm (continued):


6) Starting the propagation along 5 → 7 → 10 → 11 leads to the
following cube:

7) From the singular cover table we get the information that a 1 on line
8 is the same as a 0 on line 4. Additionally, it can be seen that the 0
on line 9 can be obtained by a 1 on line 1.
8) This yields the final cube:
1110DDD10DD
9) ⇒ A test vector for line 5/0 is given by:
1110


Fault Simulation

• Algorithms: Serial Fault Simulation

• Improved Algorithms:
– Parallel Fault Simulation
– Concurrent Fault Simulation
→ discussed in CAD lecture
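
Serial fault simulation, the simplest of the algorithms listed above, re-simulates the circuit once per fault and compares the result with the fault-free response. The sketch below uses a small hypothetical circuit y = (a AND b) OR c, which is not taken from the lecture, with single stuck-at faults on every line.

```python
from itertools import product

LINES = ["a", "b", "c", "n1", "y"]  # hypothetical netlist: n1 = a AND b, y = n1 OR c


def simulate(pattern, fault=None):
    """Evaluate y for one input pattern; `fault` is (line, stuck_value) or None."""
    values = {}

    def drive(line, value):
        # A stuck-at fault overrides whatever value drives the line
        values[line] = fault[1] if fault and fault[0] == line else value

    for line, value in zip("abc", pattern):
        drive(line, value)
    drive("n1", values["a"] & values["b"])
    drive("y", values["n1"] | values["c"])
    return values["y"]


def fault_coverage(patterns):
    """Serial fault simulation: one simulation run per (fault, pattern) pair."""
    faults = [(line, stuck) for line in LINES for stuck in (0, 1)]
    detected = {f for f in faults for p in patterns if simulate(p, f) != simulate(p)}
    return len(detected) / len(faults), detected


print(fault_coverage([(1, 1, 0)])[0])
print(fault_coverage(list(product((0, 1), repeat=3)))[0])
```

The cost grows with the product of fault count and pattern count, which is what the parallel and concurrent variants are meant to reduce.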

Design for Testability (1)

• Circuit level: restriction of physically possible faults


• Logic level: restrict possibilities of realizations
• System level: restrict size of component and number of states

Testability:
• controllability
• observability
• → additional chip area required
• → shorter design cycle

Methods to improve controllability and observability:


• ad-hoc techniques
• structured approaches


Design for Testability (2)

Design for testability: complex gate (a) not testable with stuck-at model;
(b) fully testable with stuck-at model

Design for Testability (3)

• Ad-Hoc Techniques:
– developed for special design
– less silicon area
– design automation almost impossible
– partitioning (test of circuit components by use of dedicated
multiplexers)


Design for Testability (4)

Ad-hoc techniques: partitioning for testability

Design for Testability (5)

Ad-hoc techniques:
insertion of registers in order to limit the logic depth to a given maximum value


Design for Testability (6)

Ad-hoc techniques :
test shift registers for PLA test (increasing PLA area)

Scan-Path Methods (1)

• Main idea: test of a sequential network is reduced to test of a combinational
network
• for circuits consisting of logic with some feedbacks
• can be realized by reconfiguration of latches as shift registers (two
modes of use)

Feedback logic with scan-path


Scan-Path Methods (2)

• Test scan-path / register function first:
– Flush test ( 0...010...0 ) or
– Shift test ( 00110011... ) (each register transfer is tested by this
combination: 0→0, 0→1, 1→1, 1→0 ).

• Cycle for testing combinational logic function:


1) Scan mode: Preload Y and set PI
2) System operation mode: Wait until inputs of Y are steady. Clock
new state into Y.
3) Shift state out. Compare PO and state values with expected
responses.
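
The three-step cycle above can be sketched for a three-latch scan chain. The combinational block `comb_logic` is a made-up example and the chain ordering is an assumption; only the sequence of preload, single system clock, and shift-out follows the slide.

```python
def comb_logic(pi, state):
    """Hypothetical combinational block: returns (primary output, next state)."""
    y0, y1, y2 = state
    next_state = (pi ^ y2, y0, y1 & pi)
    po = y0 ^ y1 ^ y2
    return po, next_state


def shift(chain, scan_in_bits):
    """Scan mode: shift bits in at the front, return (new chain, bits shifted out)."""
    chain = list(chain)
    out = []
    for bit in scan_in_bits:
        out.append(chain.pop())  # the last latch drives scan-out
        chain.insert(0, bit)     # scan-in feeds the first latch
    return tuple(chain), out


def scan_test(preload, pi):
    chain, _ = shift((0, 0, 0), reversed(preload))  # 1) scan mode: preload Y
    po, captured = comb_logic(pi, chain)            # 2) system mode: one clock
    _, observed = shift(captured, (0, 0, 0))        # 3) scan mode: shift state out
    return po, tuple(reversed(observed))


def transitions(pattern):
    """Set of bit-to-bit transfers exercised by a shift-test pattern."""
    return set(zip(pattern, pattern[1:]))


print(scan_test((1, 0, 1), 1))
print(sorted(transitions("00110011")))
```

The observed state equals what the combinational block computes directly, so the sequential circuit is tested as if it were purely combinational, and the shift-test pattern 00110011 indeed contains all four register transfers.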

Scan-Path Methods (3)

• Advantages:
– Testability of clocked circuits is improved and guaranteed at design
stage
– Consistent with good VLSI design practice (rules, abstraction,
modularity, ...)
– Does not require special CAD
• Disadvantages:
– Wastes silicon
– Constrains the designer to design according to given conditions
– Additional complexity
• Overhead:
– ≈ 2% for a fundamentally ‘structured’ design
– ≈ 30% for ‘wild’ logic

Built-In Tests (1)

• System generates test vectors on its own
• Analysis and evaluation of the test responses is also done automatically
• Compromise: silicon area ↔ testability

Test Pattern Generators:


• Test patterns are generated inside the circuit to be tested
• Short design time, simple test programs, self-test
• Example: Test pattern memories, deterministic generators,
counter

Built-In Tests (2)

Two examples for built-in test pattern generators


Built-In Tests (3)

• Pseudo Random Number Generators:


– used as pseudo random pattern generator

$x_i(t) = x_{i-1}(t-1)$ for $2 \le i \le n$

$x_1(t) = \sum_{i=1}^{n} k_i \, x_i(t-1) \pmod 2$

$K(x) = k_n x^n + k_{n-1} x^{n-1} + \dots + k_1 x + k_0$

Built-In Tests (4)

• Pseudo Random Number Generators:


– Example for pseudo random pattern generator:

$K(x) = x^4 + x + 1$
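
A minimal sketch of such a generator, assuming the shift register recurrence of the previous slide (stage 1 receives the modulo-2 sum of the tapped stages, all other stages shift). For K(x) = x^4 + x + 1 the coefficients k_1 and k_4 are 1; k_0 does not enter the sum. The function name `lfsr_states` is hypothetical.

```python
def lfsr_states(k, seed):
    """Yield the LFSR states starting from `seed` until the seed recurs.

    k = (k_1, ..., k_n) are the feedback coefficients, seed = (x_1, ..., x_n).
    With k_n = 1 the state update is invertible, so the seed always recurs.
    """
    seed = tuple(seed)
    state = seed
    while True:
        yield state
        feedback = sum(ki & xi for ki, xi in zip(k, state)) % 2
        state = (feedback,) + state[:-1]  # x_1 gets the feedback, the rest shifts
        if state == seed:
            return


states = list(lfsr_states(k=(1, 0, 0, 1), seed=(1, 0, 0, 0)))
print(len(states))
for s in states:
    print("".join(map(str, s)))
```

Starting from any non-zero seed, the register runs through all 15 non-zero 4-bit patterns before repeating, which is the maximum possible for four stages; the all-zero state is never reached.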


Evaluation of Testing Data (1)

• Evaluation of testing results inside the circuit


• Counting techniques, signature analysis

Example: Counting techniques for test data evaluation

$F \approx 1 - \frac{1}{\sqrt{m \pi}}$

Evaluation of Testing Data (2)

• Signature analysis
– Communication technique: coding theory
– Code words: data stream D, polynomial P(x), division modulo 2

$\frac{D}{P} = Q + \frac{R}{P}$

– → Evaluation of testing data
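
The division modulo 2 can be sketched as bitwise long division, with the remainder R playing the role of the signature. The bit ordering (most significant coefficient first), the example data stream and the polynomial are assumptions made for illustration only.

```python
def mod2_divide(data_bits, poly_bits):
    """Divide D(x) by P(x) modulo 2 (bits MSB first); return (quotient, remainder)."""
    n = len(poly_bits) - 1  # degree of P(x); poly_bits[0] must be 1
    work = list(data_bits)
    quotient = []
    for i in range(len(work) - n):
        q = work[i]
        quotient.append(q)
        if q:  # subtract (XOR) the aligned polynomial
            for j, p in enumerate(poly_bits):
                work[i + j] ^= p
    return quotient, work[len(work) - n:]


def mod2_multiply_add(q_bits, p_bits, r_bits):
    """Return Q(x) * P(x) + R(x) modulo 2, MSB first (used to check the division)."""
    out = [0] * (len(q_bits) + len(p_bits) - 1)
    for i, q in enumerate(q_bits):
        if q:
            for j, p in enumerate(p_bits):
                out[i + j] ^= p
    for k, r in enumerate(r_bits):
        out[len(out) - len(r_bits) + k] ^= r
    return out


D = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # example data stream
P = [1, 0, 0, 1, 1]                 # P(x) = x^4 + x + 1
Q, R = mod2_divide(D, P)
print("signature:", R)
print("D == Q*P + R:", mod2_multiply_add(Q, P, R) == D)
```

A faulty data stream is detected whenever its remainder differs from the expected one; a single flipped bit always changes the remainder because P(x) has a non-zero constant term.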


Evaluation of Testing Data (3)

Example: Test data evaluation by signature analysis

Evaluation of Testing Data (4)

• Signature analysis: Degree of Fault Recognition


1) Length of sequence: m bit → $2^m$ sequences possible

2) One sequence contains no faults → number of erroneous sequences is $2^m - 1$

3) Length of signature register: n bit → $2^n$ signatures

4) $2^m$ sequences are mapped onto $2^n$ signatures → number of non-detectable
faults is: $\frac{2^m}{2^n} - 1 = 2^{m-n} - 1$

5) Probability of non-detection of an erroneous sequence: number of non-detectable
faults divided by number of possible faults: $N = \frac{2^{m-n} - 1}{2^m - 1}$

6) Fault detection rate: $F = 1 - \frac{2^{m-n} - 1}{2^m - 1}$, $F \approx 1 - 2^{-n}$

Evaluation of Testing Data (5)

• Interpretation:
– all faults recognized if m < n (trivial)
– long sequences: only n is important
– n = 16 bit → F = 99.9985%
• Parallel signature register with k inputs: $F = 1 - \frac{2^{mk-n} - 1}{2^{mk} - 1}$
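
The detection rate can be evaluated exactly with rational arithmetic. This is a small numerical check of the formulas above; the function name `detection_rate` is hypothetical.

```python
from fractions import Fraction


def detection_rate(m, n, k=1):
    """F = 1 - (2^(mk-n) - 1) / (2^(mk) - 1) for an n-bit signature register."""
    bits = m * k
    if bits <= n:
        return Fraction(1)  # every erroneous sequence maps to its own signature
    return 1 - Fraction(2 ** (bits - n) - 1, 2 ** bits - 1)


for m in (20, 100, 1000):
    print(m, f"{100 * float(detection_rate(m, 16)):.4f} %")
```

For sequences much longer than the register, the result no longer depends on m and settles at 1 − 2^(−16), i.e. about 99.9985 %.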

Built-in Logic Block Observation (1)

• A BILBO register is a universal element for use in either a scan-path
environment or a self-test (signature analysis) environment.

BILBO register: 1. full circuit, 2. normal use, 3. scan-path, 4. signature analysis



Built-in Logic Block Observation (2)

• Advantages:
– Versatility
• Normal operation
• Scan-path test: enhances testability
• Test vector generation via LFSR
• Data compression via LFSR
• Combined scan-path/self-test using LFSRs
• Disadvantages:
– silicon area
• BILBO latch can be ≈ 50% larger than an ordinary latch

Built-in Logic Block Observation (3)
[Figure labels: feedback disconnect (open in test mode), decoder, binary up-counter,
go / no go output, test clock, pass gate, red LED, green LED. For clarity, mode control
lines, normal system clocks, and preset/clear facilities have been omitted.]

Example: Self-testing circuit


JTAG Standard

Chapter 20

Boundary-Scan Architecture –
JTAG Standard

• miniaturization of electronic components, multilayer and surface mount techniques make
the test of boards more complicated
⇒ requirement of design-integrated test structures

• 1985 first meeting of small group from European electronics companies

• later North American companies joined the group (→ Joint Test Action Group = JTAG)

• results: IEEE Standard Test Access Port and Boundary-Scan Architecture

VLSI Design
Course 20-1
Darmstadt University of Technology
Institute of Microelectronic Systems 0
Classical Board Test Approaches

20.1 Classical Board Test Approaches

Figure 20.1: In-circuit test using bed-of-nails

Figure 20.2: Functional test using board connector


Figure 20.3: Combined use of in-circuit and functional test

Disadvantages of classical approach:

• high costs for test hardware

• increased density

• not suited for surface mount technology

• modern chip testing techniques such as

– scan path techniques
– built-in self-test techniques (BIST)/BILBO

are not exploited well

Introduction to Boundary Scan

20.2 Introduction to Boundary Scan

Scan-testing at the board-level:

• permits use of automatic test pattern generation tools

• simplification of the hardware of the test equipment

Figure 20.4: Scan design at the board level


Figure 20.5: Testing for interconnection faults

Input           Expected output   Actual output
x1x1x0xxxxxx    xxxxxxxx01x1      xxxxxxxx11x0
x0x0x1xxxxxx    xxxxxxxx10x0      xxxxxxxx11x0
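
Comparing the shifted-out response with the expected one while skipping the don't-care positions (x) can be sketched as follows. The helper `mismatches` is hypothetical; the bit strings are the ones from the table above.

```python
def mismatches(expected, actual):
    """0-based positions where a specified expected bit differs from the observed bit."""
    return [i for i, (e, a) in enumerate(zip(expected, actual))
            if e != "x" and e != a]


RESPONSES = [
    ("xxxxxxxx01x1", "xxxxxxxx11x0"),
    ("xxxxxxxx10x0", "xxxxxxxx11x0"),
]

for expected, actual in RESPONSES:
    print(expected, actual, mismatches(expected, actual))
```

Both patterns fail, and the observed output is identical for both, which is the kind of signature an interconnection fault produces.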


Figure 20.6: Testing on-chip logic

Input Expected Output


x10xxxxx xxxxx1xx
x01xxxxx xxxxx1xx
x11xxxxx xxxxx0xx

Boundary scan application properties and limitations

• each test vector has to be shifted into the scan path
⇒ not very suitable for testing the chips themselves because of the reduced test rate
compared to stand-alone chip testing

• well suited for interconnection testing

• testing of dynamic behaviour impossible

• self-testing ICs: boundary scan can be used to trigger the self-test procedure

The IEEE Standard 1149.1

20.3 The IEEE Standard 1149.1

20.3.1 IEEE Std 1149.1 Architecture

Figure 20.7: IEEE Std 1149.1 test logic

• TAP Controller: responds to the control sequences supplied through the test access port
(TAP) and generates the clocks and control signals required for the operation of the other
circuit blocks

• Instruction Register: shift register which is serially loaded with the instruction for a test

• Test Data Registers: bank of shift registers. The stimulus values required for a test are
serially loaded into the test data register selected by the current instruction. After
execution, the results can be shifted out for examination


Figure 20.8: Test data registers

20.3.2 Test Access Port

• Test Clock Input (TCK): independent of the system clock; used for synchronization of
test operations between various chips on a board

• Test Mode Select Input (TMS): Input for controlling the test logic

• Test Data Input (TDI): Serial input for instruction and test register data

• Test Data Output (TDO): Serial output of instruction or test register data (source
selected by TMS code)

• Optional Test Reset Input (TRST∗): For test initialization


Figure 20.9: Serial connection of IEEE Std 1149.1-compatible ICs

Figure 20.10: Parallel connection of IEEE Std 1149.1-compatible ICs


Control of the test signals

• by external automatic test equipment (ATE) or

• by on-board bus master chip

Figure 20.11: Use of bus master chip to control IEEE Std 1149.1 chips

20.3.3 TAP-Controller

• 16-state FSM which controls data register (DR) and instruction register (IR) operations

• input signals:

– TRST∗
– TCK
– TMS
– last state (stored in internal FFs)


• output signals:

– Reset*
– Select
– Enable
– ShiftIR
– ClockIR
– UpdateIR
– ShiftDR
– ClockDR
– UpdateDR
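
The 16-state controller can be sketched as a transition table indexed by TMS, evaluated once per TCK cycle. The state names and transitions below follow the state diagram of IEEE Std 1149.1 and are not given in this script; the helper `tap_walk` is hypothetical, and the generation of the output signals is left out.

```python
# state: (next state for TMS = 0, next state for TMS = 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def tap_walk(tms_bits, state="Test-Logic-Reset"):
    """Advance the controller by one state per TCK cycle according to TMS."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


print(tap_walk([0, 1, 0, 0]))        # from reset into the data register shift state
print(tap_walk([0, 1, 1, 0, 0]))     # from reset into the instruction register shift state
```

A useful property of this table is that five consecutive TCK cycles with TMS = 1 bring the controller back to Test-Logic-Reset from any state, so the optional TRST∗ input is not strictly required for initialization.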

20.3.4 The Instruction Register

Figure 20.12: Daisy-chain connection of instruction registers

Figure 20.13: Instruction register


Figure 20.14: An example instruction register cell (stage)


20.3.5 Test Data Registers

Test data registers:

• bypass register (mandatory)

• boundary scan register (mandatory)

• device identification register (optional)

Bypass Register

Figure 20.15: Example design for bypass register

Figure 20.16: Use of bypass register


Basic Boundary Cells

Figure 20.17: Provision of boundary-scan cells


Figure 20.18: Basic boundary-scan cell for input pin

Figure 20.19: Basic boundary scan cell for output pin

Analog Signal Processing

Chapter 21

Analog VLSI systems

21.1 Analog Signal Processing

Typical signal processing applications require mixed analog/digital implementations. These
mainly consist of

• Preprocessing of the signals, e.g. filtering and A/D conversion

• Digital signal processing, e.g. digital filtering, calculation of FFT

• Postprocessing, e.g. D/A conversion

as shown in Fig. 21.1. The aim of development is to integrate all these functions on a single
chip.

Figure 21.1: Block diagram of a typical signal processing system


21.1.1 Signal Bandwidths in Analog VLSI

Figure 21.2: Bandwidths of signals used in signal processing applications

Figure 21.3: Signal bandwidths that can be processed by present day (1989)
technologies


21.1.2 A/D and D/A Conversion in Signal Processing Systems

Fig. 21.4 illustrates how analog-to-digital (A/D) and digital-to-analog (D/A) converters are
used in data systems. In general, an A/D conversion process converts a sampled and
held analog signal into a digital word that represents the analog signal. The D/A
conversion process is essentially the inverse of the A/D process. Digital words are applied to
the input of the D/A converter, which derives from a reference voltage an analog output
signal that represents the digital word.

Figure 21.4: Converters in signal processing systems: (a) A/D, (b) D/A

Digital-To-Analog Converters

21.2 Digital-To-Analog Converters

The inputs to a D/A converter are

(a) a digital word of N bits $(b_1, b_2, b_3, \ldots, b_N)$

(b) a reference voltage $V_{ref}$

The output voltage can be expressed as
$$V_{OUT} = K V_{ref} D \qquad (21.1)$$
where K is a scaling factor and D is given as
$$D = \frac{b_1}{2^1} + \frac{b_2}{2^2} + \frac{b_3}{2^3} + \ldots + \frac{b_N}{2^N} \qquad (21.2)$$
Thus, the output of a D/A converter can be expressed by
$$V_{OUT} = K V_{ref} \sum_{i=1}^{N} b_i 2^{-i} \qquad (21.3)$$
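
Eq. 21.3 translates directly into a few lines of code. The following sketch evaluates the ideal transfer characteristic; the function name `dac_output` and the example reference voltage of 8 V are assumptions for illustration.

```python
def dac_output(bits, v_ref, k=1.0):
    """Ideal D/A output V_OUT = K * V_ref * sum(b_i * 2^-i); bits = (b_1, ..., b_N)."""
    d = sum(b * 2.0 ** -i for i, b in enumerate(bits, start=1))
    return k * v_ref * d


# Ideal characteristic of a 3-bit converter with V_ref = 8 V (one LSB = 1 V)
for code in range(8):
    bits = [int(c) for c in format(code, "03b")]
    print(bits, dac_output(bits, v_ref=8.0))
```

Note that b_1 is the most significant bit and that the largest output is V_ref minus one LSB, since D never reaches 1.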

Figure 21.5: (a) Conceptual block diagram of a D/A converter, (b) Clocked
D/A converter

In most cases, the digital input of the D/A converter is synchronously clocked. It is therefore
necessary to provide a latch to hold the word for conversion and a sample-and-hold circuit at
the output, as shown in Fig. 21.5(b).
The basic architecture of the D/A converter without an output sample-and-hold circuit is
shown in Fig. 21.7. Fig. 21.8 shows the ideal input-output characteristics for such a D/A
converter.

21.2.1 Current Scaling D/A Converters

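The output voltage of a current-scaling D/A converter as shown in Fig. 21.9 can be expressed
as
$$V_{out} = -\frac{R}{2} I_0 = -\frac{R}{2}\left(\frac{b_1}{R} + \frac{b_2}{2R} + \frac{b_3}{4R} + \ldots + \frac{b_N}{2^{N-1}R}\right) V_{ref} \qquad (21.4)$$
$$= -V_{ref}\left(b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + \ldots + b_N 2^{-N}\right) \qquad (21.5)$$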


Figure 21.6: (a) Sample-and-hold circuit, (b) Waveforms illustrating the operation
of the sample-and-hold circuit

Figure 21.7: Block diagram of a D/A converter


Figure 21.8: Ideal input-output characteristics for a 3-bit D/A converter

The major disadvantage of this approach is the large ratio of component values. For example,
the ratio of the resistor for the MSB to the resistor for the LSB is given by
$$\frac{R_{MSB}}{R_{LSB}} = \frac{1}{2^{N-1}} \qquad (21.6)$$

For an 8-bit converter, this gives a ratio of 1/128.


An alternative to this approach is the use of an R-2R ladder as shown in Fig. 21.10. Using the
fact that the resistance to the right of any of the vertical 2R resistors is 2R, we see that the
currents $I_1, I_2, I_3, \ldots, I_N$ are binary-weighted and given as

$$I_1 = 2 I_2 = 4 I_3 = \ldots = 2^{N-1} I_N \qquad (21.7)$$

Thus, the output voltage of the R-2R D/A converter is given by Eq. 21.5.


Figure 21.9: (a) Conceptual illustration of a current-scaling D/A converter,
(b) Implementation of (a)

Figure 21.10: A current-scaling D/A converter using an R-2R ladder


21.2.2 Voltage Scaling D/A Converters

A voltage-scaling D/A converter is shown in Fig. 21.11. Its output voltage at any tap i can
be expressed as
$$V_i = \frac{V_{ref}}{8}(i - 0.5) \qquad (21.8)$$
The output voltage of the D/A converter is then determined by the values of the inputs $b_1$,
$b_2$ and $b_3$.

Figure 21.11: Illustration of a voltage-scaling D/A converter

The structure of this voltage-scaling D/A converter is very regular and thus well suited for
MOS technology. A problem with this type of D/A converter is the accuracy required
of the resistors used. This makes it difficult to build D/A converters of this type with more
than 8 bit resolution.

Analog-To-Digital Converters

21.3 Analog-To-Digital Converters

The objective of an A/D converter is the determination of the digital word corresponding to
the analog input signal. Usually a sample-and-hold circuit (see Fig. 21.6) is required at the
input of the A/D converter because it is not possible to convert a changing analog signal. A
block diagram of a general A/D converter is shown in Fig. 21.12. The ideal input-output
characteristics for an A/D converter are shown in Fig. 21.13.

Figure 21.12: Block diagram of a general analog-to-digital converter

Figure 21.13: Ideal input-output characteristics for a 3-bit A/D converter


21.3.1 Serial A/D Converters

Two possible implementations of serial A/D converters are single-slope and dual-slope A/D
converters. Neither is discussed in detail here. The main advantage of these converters
is their simplicity; their main disadvantage is the long conversion time required.

21.3.2 Successive Approximation A/D Converters

This type of A/D converter converts an analog input into an N-bit digital word in N clock
cycles. Consequently, the conversion time is less than for the serial converters without much
increase in the complexity of the circuit. Fig. 21.14 shows an example of a successive
approximation A/D converter architecture.

Figure 21.14: Example of a successive approximation A/D converter architecture

The successive approximation process is shown in Fig. 21.15.
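
The successive approximation process is essentially a binary search over the D/A output, one bit per clock cycle. The sketch below assumes an ideal unipolar converter and a comparator that keeps a bit when the input is at or above the trial voltage; the function name `sar_convert` is hypothetical.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Ideal successive approximation: decide one bit per clock cycle, MSB first."""
    bits = []
    estimate = 0.0  # output of the internal D/A converter
    for i in range(1, n_bits + 1):
        trial = estimate + v_ref * 2.0 ** -i  # tentatively set bit b_i
        if v_in >= trial:                     # comparator decision
            bits.append(1)
            estimate = trial
        else:
            bits.append(0)
    return bits


print(sar_convert(5.2, v_ref=8.0, n_bits=3))
```

For 5.2 V and a reference of 8 V, the trial voltages are 4 V (kept), 6 V (rejected) and 5 V (kept), which takes exactly N = 3 cycles.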


Figure 21.15: The successive approximation process

21.3.3 Parallel A/D Converters

In many applications, it is necessary to have a smaller conversion time than is possible with
the previously described A/D converter architectures. Parallel A/D converters, also known as
flash A/D converters, can complete a conversion in as little as one clock cycle. An architecture
of a 3-bit parallel A/D converter is shown in Fig. 21.16.
Parallel A/D converters typically reach conversion rates of up to 20 MHz in CMOS technology.
The sample-and-hold time, however, may be larger than 50 ns and could prevent this conversion
rate from being realised. Another problem is that the number of comparators required is
$2^N - 1$. For N greater than 8, too much area is required.
One method of achieving small system conversion times is to use slower A/D converters in
parallel, which is called time-interleaving and is shown in Fig. 21.17. Here M successive
approximation A/D converters are used in parallel to complete the N-bit conversion of one
analog signal per clock cycle. The sample-and-hold circuits consecutively sample and apply
the input analog signal to their respective A/D converters. N clock cycles later, the A/D
converter provides a digital word output. If M = N, then a digital word is given out every
clock cycle. If one examines the chip area for an N-bit A/D converter using the parallel A/D
converter architecture (M = 1) compared with the time-interleaved architecture for M = N,
the minimum area will occur for a value of M between 1 and N.
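
The comparator bank of the parallel converter can be sketched as follows. The threshold placement at integer multiples of V_ref / 2^N is an assumption (the characteristic in Fig. 21.13 may use a half-LSB offset), and the function name `flash_convert` is hypothetical.

```python
def flash_convert(v_in, v_ref, n_bits):
    """Ideal flash conversion: 2^N - 1 comparators decide in parallel."""
    levels = 2 ** n_bits
    thresholds = [v_ref * i / levels for i in range(1, levels)]  # resistor string taps
    thermometer = [1 if v_in >= t else 0 for t in thresholds]    # comparator outputs
    code = sum(thermometer)                                      # thermometer -> binary
    return thermometer, [int(c) for c in format(code, f"0{n_bits}b")]


thermometer, bits = flash_convert(5.2, v_ref=8.0, n_bits=3)
print(thermometer, bits)
print("comparators for N = 8:", 2 ** 8 - 1)
```

The comparator count doubles with every additional bit, which is why the area becomes prohibitive beyond about 8 bits.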


Figure 21.16: A 3-bit parallel A/D converter


Figure 21.17: A time-interleaved A/D converter array


21.3.4 Sigma-Delta A/D Converter

Introduction

The basic structure of a sigma-delta converter is shown in Fig. 21.18. The sigma-delta
converter can be referred to as an oversampling converter, although oversampling is just one of
the techniques contributing to the performance of a sigma-delta converter. The sigma-delta
converter shown in Fig. 21.18 quantizes an analog signal with very low resolution (1 bit) and
a very high sampling rate (2 MHz). With the use of oversampling techniques and digital
filtering, the sampling rate is reduced (8 kHz) and the resolution is increased (16 bits).

Figure 21.18: Basic structure of a sigma-delta converter

A more detailed block diagram of the sigma-delta modulator is shown in Fig. 21.19. It consists
of an integrator, a quantizer (comparator for 1 bit) and a feedback loop with a D/A converter
(switch for 1 bit). The output of the sigma-delta modulator is shown in Fig. 21.20 for a sine
wave input. The single-bit conversion will result in an output which is either '1' or '0'. When
the signal is near plus full scale, the output is positive during most of the clock cycles. The
opposite is true for near minus full scale signals. When the output is followed by a digital
filter as shown in Fig. 21.18 which can perform sophisticated averaging functions, the 1-bit
sequence is transformed into a much more meaningful signal.
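
The behaviour described above can be reproduced with a discrete-time sketch of a first-order modulator. It assumes an ideal integrator, a comparator threshold at zero, a 1-bit D/A that feeds back +1 or −1 (full scale), and inputs within that range; the function name `sigma_delta` is hypothetical.

```python
import math


def sigma_delta(samples):
    """First-order sigma-delta modulator: returns one output bit per input sample."""
    integrator = 0.0
    feedback = 0.0  # output of the 1-bit D/A converter (+1 or -1)
    bits = []
    for x in samples:
        integrator += x - feedback         # integrate the difference signal
        bit = 1 if integrator >= 0 else 0  # 1-bit quantizer (comparator)
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


# Constant input at half of plus full scale: three out of four output bits are '1'
dc_bits = sigma_delta([0.5] * 1000)
print(sum(2 * b - 1 for b in dc_bits) / len(dc_bits))

# Slow sine wave: the density of '1's follows the signal
sine = [0.9 * math.sin(2 * math.pi * n / 200) for n in range(200)]
print("".join(map(str, sigma_delta(sine)[:60])))
```

Averaging the bit stream recovers the input value, which is the job of the digital filter that follows the modulator.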

Figure 21.19: First-order sigma-delta modulator block diagram

Noise Shaping

One feature that makes the sigma-delta converter so powerful is its noise shaping capability.
To understand how this works, the analysis of the sigma-delta modulator in the frequency
domain is appropriate. Fig. 21.21 shows the frequency domain linearized model of a
sigma-delta modulator.


Figure 21.20: Output of first-order sigma-delta modulator

Figure 21.21: Frequency domain linearized model of a sigma-delta modulator


The integrator is represented as an analog filter. For an integrator, the transfer function has
an amplitude which is inversely proportional to the input frequency ($1/f$ relationship). The
quantizer is modelled as a gain stage followed by the addition of quantization noise.
Thus, the output y of the sigma-delta converter can be expressed by
$$y = (x - y)\frac{1}{f} + q \qquad (21.9)$$

where (x − y) is the difference signal from the summing node at the input and q is the
quantization noise. Applying some algebraic rearrangement yields
$$y = \frac{x}{f} - \frac{y}{f} + q$$
$$\left(1 + \frac{1}{f}\right) y = \frac{x}{f} + q$$
$$y = \frac{x/f}{1 + 1/f} + \frac{q}{1 + 1/f}$$
$$y = \frac{x}{f+1} + \frac{qf}{f+1} \qquad (21.10)$$

At a frequency f = 0, the output signal equals x with no noise element q. At higher frequencies,
the value of x is reduced and the influence of q increases. In essence, the sigma-delta modulator
has a low pass effect on the signal and a high pass effect on the noise. As a result of this,
the modulator can be thought of as a noise shaping filter where noise in the signal pass band
is reduced and noise energy is pushed into the higher frequency region. The effect of this
procedure on normally equally distributed (white) quantization noise is shown in Fig. 21.22.

Figure 21.22: Noise-shaping filter function


Digital Filtering

The sigma-delta modulator described so far produces a stream of single-bit digital values at
a very high rate. The modulator’s output bit stream is fed into the converter’s digital filter,
which performs several different functions. All of these functions, however, are integrated into
a single filter implementation. The functions of the filter are:

• sophisticated averaging (low pass filtering)

• removing high frequency noise (quantization noise)

• reducing sampling rate

The sampling rate reduction is done by averaging over a number of cycles of the input bit
stream and produces an output data stream that is reduced in sampling rate, but increased
in resolution (i.e. number of bits per sample).
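
The simplest possible form of this rate reduction is a block average. The sketch below is only meant to show the principle; a real converter uses a more sophisticated low-pass filter. The function name `decimate` and the example bit stream are assumptions. With the figures quoted earlier (2 MHz in, 8 kHz out) the ratio would be 250.

```python
def decimate(bits, ratio):
    """Average each block of `ratio` one-bit samples (0/1 mapped to -1/+1)."""
    usable = len(bits) - len(bits) % ratio  # drop an incomplete last block
    return [sum(2 * b - 1 for b in bits[i:i + ratio]) / ratio
            for i in range(0, usable, ratio)]


bit_stream = [1, 1, 1, 0] * 4  # bit stream with three '1's out of four
print(decimate(bit_stream, ratio=8))
```

Each output sample can take ratio + 1 different values, so the number of bits per sample grows as the sampling rate is reduced.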

Advantages of Sigma-Delta Converters

The advantages of the sigma-delta converter technology are

• Sigma-delta converters are a complete conversion and filtering system; additional digital
filtering functions may easily be implemented in the digital output filter of the converter

• Very low-cost and high-performance conversion is possible as the analog part of the
converter is very simple and need not be as accurate as in other A/D converters. The
main part of the converter is the digital filter, which can be integrated more easily in
MOS technology.

• excellent signal-to-noise performance, therefore high resolution converters are possible

• no sample-and-hold circuit preceding the converter is necessary as sampling rates are
very high

22. VLSI in Communications

Institute of
Microelectronic
Systems

State-of-the-art
RF Design,
Communications
and DSP Algorithms VLSI Design
Design

Isolated goals result in:

- higher implementation costs
- long transition time between system level design and final implementation

• Power-Area-Speed optimization
• Performance
• Flexibility
Improvement
• Risk minimization

Institute of
Microelectronic
22: VLSIinCOMMS Systems 2
Trends
RF Design,
Communications
and DSP Algorithms VLSI Design
Design

Trade-off during
the development
(Interdisciplinary Issue)

Performance vs. VLSI-relevant design aspects

• Algorithmic transformation techniques


• Architectural transformation techniques
• Low-power design of baseband processing
• Low-power RF design
• Analog-digital co-design methodologies
• High-speed low power AD/DA converters

Challenge
Towards complex on-chip wireless system design
• The increasing demand for communication and multimedia processing can be met thanks to
the high integration density of microelectronic circuits

1. Key ingredients
- Maximize digital components
- Minimize analog and passive elements (i.e. simplify the design requirements of the
analog components by moving to digital processing as early as possible)
- Low-power design techniques

2. Analog portions continue to dominate power consumption

Example: DS-CDMA RX (32 MHz chip) realized in 2 chips using 0.8 µm CMOS

Analog front-end (amp, sampling, demod, AD/DA)    107 mW
Digital baseband signal processing                 27 mW

Design Flow: Overview

[Figure: design flow for a mixed-signal communication system.
System design: system description in a modeling language (e.g. Matlab, C, C++, SDL, ...) or in a graphical environment; dataflow-oriented (e.g. signal processing; tools: Cossap, Simulink, SPW, ...) or controlflow-oriented (e.g. protocols, state diagrams; tools: Statemate); system simulation and analysis (frequency spectrum, eye pattern, bit error rate).
Digital baseband processing: software path (code generation, compiler optimization) and hardware path (hardware description in VHDL or Verilog, RTL or high-level synthesis, placement and routing).
Analog and RF design: analog description (VHDL-AMS, Spice, ...), simulation (SPECTRE, ...), layout generation.
Goal: one chip combining RF, analog (ADC, DAC), data path, control logic, RAM, ROM and cores (DSP, RISC).
Inset plot: bit error rate versus C/N [dB] for AWGN at 6, 9, 12, 18, 27, 36 and 54 Mbps.]

Overview: Generic Transceiver Architecture

[Figure: generic transceiver block diagram, split into an analog and a digital part.
Analog, receive path: duplexer, LNA, mixer (VCO), IF mixer (IF PLL), demodulator, ADCs for I and Q.
Analog, transmit path: DACs for I and Q, modulator (transmit PLL, VCO), power amplifier.
Digital: digital baseband processing]

Digital baseband processing:
- Diversity reception
- Equalization (RLS, Viterbi)
- Channel coding/decoding
- Voice coding/decoding
- Interleaving/deinterleaving
- Encryption/decryption
IC Technologies


IC Technologies for Communication Applications

Which technology is the most suitable for future communication systems?

Criteria:
• Support of high frequencies
• Analog/digital integration capabilities
• High integration density
• Low RF and IF noise
• Low power consumption
• High gain

Portfolio of technologies:
• Silicon: CMOS, SOI, BiCMOS and BJT
• Silicon-Germanium (SiGe): HEMT and HBT
• Gallium-Arsenide (GaAs): MESFET, HEMT and HBT

HBT: Heterojunction Bipolar Transistor; HEMT: High Electron Mobility Transistor;
SOI: Silicon-On-Insulator; BJT: Bipolar Junction Transistor
IC Technologies (cont’d)

SILICON       CMOS                          BiCMOS                   BJT
Features      fT up to 30 GHz               fT up to 30 GHz          fT up to 80 GHz
              fMAX up to 40 GHz (0.15 µm)   fMAX up to 40 GHz        fMAX up to 75 GHz
Application   • Digital baseband            • Intermediate           • IF and RF modules
              • Trends: RF, IF and            frequency (IF)
                analog baseband (1999)        modules

CMOS is currently the best IC technology for single chip solutions (analog +
digital) for communication applications

CMOS technologies
Advantages: Mature technology, high integration density, cost-effective
Drawbacks: Poor noise figure, poor linearity, substrate parasitics

IC Technologies (cont’d)

CMOS RF Design: Example

• RF front-end components (LNA, mixer and VCO) developed using standard CMOS processes
• Realization with separated dies

LNA = Low Noise Amplifier
VCO = Voltage-Controlled Oscillator
Source: Fraunhofer-Gesellschaft
IC Technologies (cont’d)

SILICON GERMANIUM (SiGe)   HBT                   HEMT
Features                   fT up to 130 GHz      fT up to 30 GHz
                           fMAX up to 160 GHz    fMAX up to 120 GHz
Application                - LNA, PA, mixers, VCO, PLL
                           - High speed DA and AD converters

Advantages:
- Easy integration into standard silicon processes (BJT, BiCMOS, CMOS)
- Improved frequency response
- Better cost/performance trade-off

Disadvantage:
- Technology process not mature

IC Technologies (cont’d)

HBT SiGe RF Design: Example

DECT LNA & PA (Source: Temic Semiconductors)
Noise figure: 1.6 dB
Gain: 26 dB @ 5.8 GHz
Amplification: 27 dBm

GSM PA (Source: Temic Semiconductors)
Amplification: 32 - 36.5 dBm
Vop = 2 - 5.5 V
IC Technologies (cont’d)

GALLIUM ARSENIDE (GaAs)   MESFET                HEMT                  HBT
Features                  fT up to 100 GHz      fT up to 180 GHz      fT up to 90 GHz
                          fMAX up to 115 GHz    fMAX up to 220 GHz    fMAX up to 110 GHz
Application               - Amplifiers (PA, LNA), mixers
                          - Ultrafast DA and AD converters (gigahertz sampling rates)

Advantages:
- Good analog capabilities, high linearity, high-speed operation

Disadvantages:
- Expensive process, technology process not mature (in comparison to other processes such as CMOS)
23. Digital Baseband Design

Institute of
Microelectronic
Systems

Algorithm-to-VLSI Circuit Refinement

[Figure: refinement flow from algorithm to layout.
Algorithm (floating point) → algorithm (fixed point): tradeoff (SNR loss, BER).
Behavioral level (VHDL, Verilog; e.g. "for I = 0 to 15: Sum = Sum + array[I]") → behavioral synthesis → architectural level (memory, control, state).
Architectural synthesis and RTL synthesis (VHDL, Verilog) → gate level.
Circuit synthesis and layout synthesis → circuit and layout level (Vdd, Clk, Gnd).]


Digital Baseband: Design & Verification Challenges

• Does my algorithm work?
• Does my algorithm work at a given data bit width? (Assess fixed-point aspects)
• How can the algorithm be refined to improve the hardware implementation?
• How to capture the algorithm in a design environment (software test bench and
IP environments, hardware verification engine, description language, ...)?
• Does the captured algorithm still work? How can I model the external world
to stimulate my design? How do I know that I have exercised my entire design?
How can I debug the software realization before developing the hardware?

Floating Point to Fixed Point Number Representation
• The translation from floating-point to fixed-point number representation rests upon a
tradeoff between performance loss and bit width
• Simulation of quantization effects

[Figure: gain versus input parameter for the floating-point model and for 8-bit and 4-bit fixed-point models. Example constraint: gain loss < 0.5 dB.]
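
The tradeoff between bit width and performance loss can be explored with a short simulation of quantization effects. This is a minimal sketch, assuming a sine test signal, rounding with saturation, and signal-to-quantization-noise ratio as the performance measure; the function names and the test signal are illustrative and not part of the lecture.

```python
import math


def quantize(x, bits):
    """Round x in [-1, 1) to a signed fixed-point value with `bits` bits (saturating)."""
    scale = 2 ** (bits - 1)
    q = round(x * scale)
    q = max(-scale, min(scale - 1, q))  # saturate to the representable range
    return q / scale


def quantization_snr_db(bits, n=1000):
    """SNR in dB of a quantized sine relative to its floating-point reference."""
    signal = [0.9 * math.sin(2 * math.pi * 7 * k / n) for k in range(n)]
    signal_power = sum(s * s for s in signal) / n
    noise_power = sum((s - quantize(s, bits)) ** 2 for s in signal) / n
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for width in (4, 8, 12):
        print(width, "bits:", round(quantization_snr_db(width), 1), "dB")
```

Running the loop for several word lengths shows how the loss relative to the floating-point reference shrinks as the bit width grows, which is the kind of curve a constraint such as "gain loss < 0.5 dB" would be checked against.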

Behavioral or High-Level Synthesis

• The fixed-point model can be captured as a behavioral model by using a hardware
description language like VHDL or Verilog

Behavioral synthesis = translation of a behavioral description into a structural or RTL
(Register Transfer Level) description

1. Resource allocation
2. Scheduling (! Introduction of timing information !)
3. Resource assignment
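
The three steps can be sketched for the summation of 16 values. This is a toy list scheduler under stated assumptions: the sum is restructured as a tree of two-input additions, the number of adders is chosen by hand (allocation), and every addition takes one control step. The function name and data layout are hypothetical; commercial behavioral synthesis tools work differently.

```python
def schedule_tree_sum(num_inputs, num_adders):
    """Schedule a tree of additions onto `num_adders` adders.

    Returns a list of control steps; each step is a list of tuples
    (adder_index, left_operand, right_operand, result_name)."""
    ready = [f"data[{i}]" for i in range(num_inputs)]  # operands available now
    steps = []
    tmp = 0
    while len(ready) > 1:
        step = []
        produced = []
        adder = 0
        # Scheduling: fill this control step until the adders run out.
        while adder < num_adders and len(ready) >= 2:
            left, right = ready.pop(0), ready.pop(0)
            result = f"t{tmp}"
            tmp += 1
            # Assignment: bind the operation to one concrete adder.
            step.append((adder, left, right, result))
            produced.append(result)
            adder += 1
        ready.extend(produced)  # results become usable in the next step
        steps.append(step)
    return steps


if __name__ == "__main__":
    for adders in (1, 2, 8):
        print(adders, "adder(s):", len(schedule_tree_sum(16, adders)), "control steps")
```

Allocating more adders shortens the schedule at the cost of area, which is the kind of tradeoff the allocation step decides.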

TOOLS:
SYNOPSYS Behavioral
Compiler, Virtual Artist

Algorithm-to-VLSI Circuit Refinement

[Figure: the same refinement flow (algorithm in floating point and fixed point, behavioral level, architectural level, gate level, circuit and layout level), now annotated with "Untimed" at the algorithm level and "Clocked" at the circuit level.]


Behavioral or High-Level Synthesis (cont’d)

[Figure: mapping onto a technology.
Algorithm (0% technology dependent): for i = 0 to 15: sum = sum + data[i].
Architecture (e.g. 10% technology dependent, untimed): two alternative structures that combine Data[0] ... Data[15] into Sum, one of them indexed by i.
Register level after behavioral synthesis (e.g. 20% technology dependent, clocked): memory (MEM) with address, clock and clear signals and a sum register.]

Low Power Design

• Power consumption in a digital CMOS circuit:

\[ P \approx \alpha \cdot C_{eff} \cdot V_{dd}^2 \cdot f_{clk} \]

• Power reduction at different levels

- Logic level: gated clocks, avoiding hazard generation, FSM encoding for low power
- Architectural level: parallelization, pipelining
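
The formula can be evaluated directly to see which parameter matters most. This is a small sketch; the numerical values (activity 0.2, 10 pF, 20 MHz) are made-up illustration values, not data from the lecture.

```python
def dynamic_power(alpha, c_eff, v_dd, f_clk):
    """Dynamic power P = alpha * C_eff * Vdd^2 * f_clk (SI units: F, V, Hz -> W)."""
    return alpha * c_eff * v_dd ** 2 * f_clk


if __name__ == "__main__":
    p_5v = dynamic_power(alpha=0.2, c_eff=10e-12, v_dd=5.0, f_clk=20e6)
    p_3v3 = dynamic_power(alpha=0.2, c_eff=10e-12, v_dd=3.3, f_clk=20e6)
    print("5.0 V:", p_5v, "W")
    print("3.3 V:", p_3v3, "W")
    print("ratio:", p_3v3 / p_5v)
```

Because the supply voltage enters quadratically, lowering it is more effective than an equal relative reduction of the activity, capacitance or clock frequency. Parallelization and pipelining help indirectly: they relax timing so that the supply voltage can be reduced.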


• Simple example:
– 4 bit counter

• "Conventional" counters are binary coded:
– 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 ...
– A maximum of 4 bits change at the same time → high power consumption

• Alternative: Gray-code the numbers:
– 0000 0001 0011 0010 0110 0111 0101 0100 1100 1101 1111 1110 1010
1011 1001 1000
– Only one bit changes per clock cycle → reduced power consumption
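
The difference in switching activity can be counted directly. The sketch below compares the number of output-bit toggles over one full count cycle of a 4 bit counter; the helper name is illustrative, and the Gray sequence is generated with the usual n XOR (n >> 1) rule.

```python
def toggles_per_cycle(codes):
    """Total number of bit flips over one full wrap-around count cycle."""
    total = 0
    for i, code in enumerate(codes):
        nxt = codes[(i + 1) % len(codes)]
        total += bin(code ^ nxt).count("1")  # bits that differ between states
    return total


binary_codes = list(range(16))
gray_codes = [n ^ (n >> 1) for n in binary_codes]

if __name__ == "__main__":
    print("binary:", toggles_per_cycle(binary_codes))
    print("gray:  ", toggles_per_cycle(gray_codes))
    print(" ".join(format(g, "04b") for g in gray_codes))
```

Output-bit toggles are only one contribution to the total power (the clock network and the next-state logic also switch), so the toggle ratio is not expected to equal a measured power ratio.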

Binary Counter – Schematic

[Figure: schematic of the binary counter]

Average power consumption: 102 µW at 20 MHz

Gray-Counter – Schematic

[Figure: schematic of the Gray counter]

Average power consumption: 76 µW at 20 MHz

Low Power – 4 Bit Counter: Results

• By using Gray coding instead of binary coding, the power consumption of the 4 bit
counter has been reduced to:

\[ \frac{76\,\mu\mathrm{W}}{102\,\mu\mathrm{W}} \approx 75\,\% \]

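Appendix

Derivation of current equations of short channel devices

[Figure: cross-section of an n-channel MOSFET on a p-type substrate with source (S), gate (G), drain (D) and body (B) terminals, n+ source and drain regions, and the channel region of length L along the x axis; terminal voltages vS, vG, vD, vB and currents iD, iB.]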
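There is a negative charge induced by the voltage $V_{GS} - V_T$ under the oxide. The charge
per unit area can therefore be expressed as:

\[ Q''(x) = -\left(V_{GS} - V_T - V(x)\right) \cdot C''_{ox} \tag{1} \]

where $C''_{ox}$ is the capacitance per unit area.
The drain current due to the electron velocity $v(x)$ can be expressed as:

\[ I_D = -Q''(x) \cdot v(x) \cdot W = W \cdot C''_{ox} \cdot \left(V_{GS} - V_T - V(x)\right) \cdot v(x) \tag{2} \]

For simplification, the electron velocity $v(x)$ including the velocity saturation effect is assumed to be:

\[ v(x) = \frac{-\mu_n E(x)}{1 + \left|\frac{E(x)}{E_c}\right|} \tag{3} \]

\[ \text{with: } E(x) = -\frac{dV}{dx} \tag{4} \]

Combining equations (2) and (3), with $E(x)$ taken from (4), yields:

\[ I_D = W \cdot C''_{ox} \cdot \frac{\mu_n \frac{dV}{dx}}{1 + \frac{1}{E_c}\frac{dV}{dx}} \cdot \left(V_{GS} - V_T - V(x)\right) \tag{5} \]

Rearranging:

\[ I_D + \frac{I_D}{E_c}\frac{dV}{dx} = \mu_n \cdot W \cdot C''_{ox} \cdot \left(V_{GS} - V_T - V(x)\right)\frac{dV}{dx} \tag{6} \]

\[ I_D\,dx + \frac{I_D}{E_c}\,dV = \mu_n \cdot W \cdot C''_{ox} \cdot \left(V_{GS} - V_T - V(x)\right)dV \tag{7} \]

\[ I_D\,dx = -\frac{I_D}{E_c}\,dV + \mu_n \cdot W \cdot C''_{ox} \cdot (V_{GS} - V_T)\,dV - \mu_n \cdot W \cdot C''_{ox} \cdot V(x)\,dV \tag{8} \]

Integrating both sides yields:

\[ \int_0^L I_D\,dx = \int_0^{V_{DS}} \left( -\frac{I_D}{E_c} + \mu_n \cdot W \cdot C''_{ox} \cdot (V_{GS} - V_T) - \mu_n \cdot W \cdot C''_{ox} \cdot V(x) \right) dV \]

\[ I_D L = -\frac{I_D}{E_c} V_{DS} + \mu_n \cdot W \cdot C''_{ox} \cdot (V_{GS} - V_T)V_{DS} - \frac{1}{2}\,\mu_n \cdot W \cdot C''_{ox} \cdot V_{DS}^2 \]

\[ I_D \left( L + \frac{V_{DS}}{E_c} \right) = \mu_n \cdot W \cdot C''_{ox} \cdot \left( (V_{GS} - V_T)V_{DS} - \frac{1}{2}V_{DS}^2 \right) \]

For an NMOS device, the drain current including velocity saturation is therefore:

\[ I_D = \frac{1}{1 + \frac{V_{DS}}{E_c L}} \cdot \mu_n \cdot C''_{ox} \cdot \frac{W}{L} \cdot \left( (V_{GS} - V_T)V_{DS} - \frac{1}{2}V_{DS}^2 \right) \tag{9} \]

For a PMOS device we get the corresponding formula:

\[ I_D = \frac{1}{1 + \frac{V_{SD}}{E_c L}} \cdot \mu_p \cdot C''_{ox} \cdot \frac{W}{L} \cdot \left( (V_{GS} - V_T)V_{DS} - \frac{1}{2}V_{DS}^2 \right) \tag{10} \]

Equations (9) and (10) can be summarized as:

\[ I_D = \kappa(V_{DS}) \cdot \mu \cdot C''_{ox} \cdot \frac{W}{L} \cdot \left( (V_{GS} - V_T)V_{DS} - \frac{1}{2}V_{DS}^2 \right) \]

\[ \text{with: } \kappa(V_{DS}) = \frac{1}{1 + \frac{|V_{DS}|}{E_c L}} \]

This equation holds for both the NMOS and the PMOS device.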
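
The combined equation translates directly into code. The following is a minimal sketch in SI units; the function names and the numerical values in the usage part are hypothetical illustration values and not device data from the lecture.

```python
def kappa(v_ds, e_c, length):
    """Velocity saturation factor kappa(V_DS) = 1 / (1 + |V_DS| / (E_c * L))."""
    return 1.0 / (1.0 + abs(v_ds) / (e_c * length))


def drain_current(v_gs, v_ds, v_t, mu, c_ox, width, length, e_c):
    """Ohmic-region drain current including velocity saturation.

    Units (assumed): V, m^2/Vs, F/m^2, m, V/m -> A."""
    classic = mu * c_ox * (width / length) * ((v_gs - v_t) * v_ds - 0.5 * v_ds ** 2)
    return kappa(v_ds, e_c, length) * classic


if __name__ == "__main__":
    # Hypothetical example values, chosen only to show the trend.
    params = dict(v_gs=1.0, v_ds=0.1, v_t=0.4, mu=0.05, c_ox=5e-3,
                  width=1e-6, e_c=1.5e6)
    for length in (0.25e-6, 2.5e-6, 25e-6):
        print(length, kappa(params["v_ds"], params["e_c"], length),
              drain_current(length=length, **params))
```

For a long channel, $E_c L$ is much larger than $|V_{DS}|$, so $\kappa$ approaches 1 and the expression reduces to the classical ohmic-region equation.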
Exercises
1. Exercise: Short channel MOSFETs

1. Problem: Short channel MOSFETs

• Complete the table below (calculate K')
• What is the value of κ for a long channel MOSFET?
• Estimate the drain current IDS for both MOSFETs in the ohmic region
using the classical expression and using the expression that includes the velocity
saturation effect. Compare both results by calculating the percentage error between them.
• Calculate the value of VDSAT and compare it with the classical
assumption that the device enters saturation when VDS ≥ VGS − VT0
• Find an expression for the on-resistance of short channel devices
and estimate the on-resistance for both devices.

Given the following parameters:

        VT [V]   K' [A/V²]   µ [cm²/Vs]
NMOS     0.4                 µn = 1.15 · 10^4
PMOS    -0.4                 µp = 3.00 · 10^3

COX = 10^-8 F/cm²
|VGS| = 0.6 V
|VDS| = 0.1 V
L = 0.25 µm
W = 0.75 µm
EC = 1.5 · 10^6 V/m

Formulas:

\[ I_{DS} = \kappa(V_{DS}) \cdot \mu \cdot C_{OX} \cdot \frac{W}{L} \left[ (V_{GS} - V_T)V_{DS} - \frac{V_{DS}^2}{2} \right] \]

\[ \kappa(V_{DS}) = \frac{1}{1 + V_{DS}/(E_C L)} \]

\[ R_{on} = \frac{1}{g_{DS}}, \qquad g_{DS} = \left. \frac{\partial I_{DS}}{\partial V_{DS}} \right|_{V_{DS} \to 0} \]
Exercises 2

NMOS and CMOS Inverters

1. NMOS Inverter

Assume three types of NMOS inverters:
a) with resistive load
b) with enhancement MOSFET load
c) with depletion MOSFET load

[Figure: the three inverter circuits, each connected to VDD, with driver transistor QS, input VIN, output VOUT and output current Iout: a) resistive load, b) enhancement load, c) depletion load. The load elements are labeled RL and QL.]

Draw the simplified pull-up characteristic of the three types of NMOS
inverters shown before.
Use the appended diagram “Pull-Up-Characteristics” for this purpose.
Assume
VDD = 5 V,
RL = 20 kΩ,
VT,enh = 1 V,
VT,dep = -1 V,
λ = 0.
The short-circuit current of both inverters with active load is
IQ = 0.2 mA.

Neglect short channel and body effects of the transistors.

The next appended diagram shows the output characteristics of the
driver transistor QS.

The low-state output voltage should not exceed 0.8 V. Determine
graphically, for input voltages of 2.5 V and 3 V, how much
current the NMOS inverter can sink if:
• a load resistor RL = 20 kΩ is used,
• a depletion transistor with IQ = 0.2 mA is used, neglecting body
and short channel effects

For the NMOS inverter with saturated enhancement load, the
voltage transfer characteristic should be estimated.

Use the appended diagram “Determination of VTC” to determine
the Voltage Transfer Characteristic (VTC) of the NMOS inverter
with saturated enhancement load graphically. Draw the VTC in
the empty diagram “VTC of NMOS-Inverter”.

This inverter is characterized by the following parameters:

$V_{DD} = 5\,\mathrm{V}$, $2\phi_F = 0.6\,\mathrm{V}$, $V_{T0} = 1.0\,\mathrm{V}$,
$K_R = \beta_R = \beta_1 / \beta_2 = 8$, $\gamma = 0.37\,\mathrm{V}^{1/2}$

• Calculate VOH
• Calculate VOL
• Calculate VIH

Hints:
• The body effect (influence of the bulk-source voltage) of the load
transistor must be taken into account when determining its
threshold voltage. Therefore the following equation for the
threshold voltage can be used:
\[ V_{TH} = V_{T0} + \gamma \left( \sqrt{2|\phi_F| + V_{SB}} - \sqrt{2|\phi_F|} \right) \]
• An equation of type x = f(x) can be solved numerically by starting
at any value for x and iteratively calculating f(x) until the result
reaches the desired precision.
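
The second hint can be sketched as a short fixed-point iteration. The example assumes that the high output level of the inverter with saturated enhancement load satisfies V_OH = V_DD − V_TH(V_SB = V_OH), which is the usual textbook relation but is stated here as an assumption; the function names are illustrative, and the parameter defaults are the values given for this inverter.

```python
import math


def threshold_voltage(v_sb, v_t0=1.0, gamma=0.37, two_phi_f=0.6):
    """Threshold voltage with body effect (equation from the hints)."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def solve_fixed_point(f, x0, tol=1e-9, max_iter=100):
    """Iterate x <- f(x) until two successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_new = f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


V_DD = 5.0
# Assumed relation: the load transistor stops conducting when V_GS = V_TH,
# with its source (the output node) at V_OH, so V_SB = V_OH.
v_oh = solve_fixed_point(lambda v: V_DD - threshold_voltage(v), x0=V_DD)

if __name__ == "__main__":
    print("V_OH =", round(v_oh, 3), "V")
```

The iteration converges quickly here because the right-hand side changes only slowly with the output voltage. Convergence of such an iteration is not guaranteed in general.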


2. VIL and VIH for a CMOS Inverter

A CMOS process is characterized by the following parameters:

$V_{T0p} = -0.8\,\mathrm{V}$, $\beta_p = 40\,\mu\mathrm{A/V^2}$
$V_{T0n} = +0.8\,\mathrm{V}$, $\beta_n = 40\,\mu\mathrm{A/V^2}$

• Calculate the values of VIL and VIH for supply voltages VDD = 5 V,
10 V and 15 V
• At which operating point does the current consumption of the
inverter reach its maximum?
• Calculate the current consumption of the inverter at these supply
voltages.
[Diagram “Pull-Up-Characteristics”: empty grid, Vout (V) from 0 to 5 versus Iout (mA) from 0 to 0.3.]

[Diagram: output characteristics of the driver transistor QS, ID (mA) from 0 to 0.25 versus VDS (V) from 0 to 2, for VGS = 1.5 V to 2.5 V in steps of 0.1 V and for VGS = 2.75 V, 3.0 V, 3.5 V, 4.0 V, 4.5 V and 5.0 V.]

[Diagram “Determination of Voltage Transfer Characteristic (VTC)”: the same family of curves, ID (mA) from 0 to 0.25 versus VDS (V) from 0 to 5, together with the pull-up characteristic of the enhancement load.]

[Diagram “VTC of NMOS-Inverter”: empty grid, Vout (V) versus Vin (V), both from 0 to 5.]
3. Exercise: NMOS and CMOS Inverter

Problem 1
The figure below shows the layout of a CMOS inverter, whose dimensions
are given in micrometers. The inverter is realized in an n-well CMOS process.
The oxide capacitance is Cox = 69.1 nF/cm² for both n- and p-channel
transistors. The drain-bulk and source-bulk depletion capacitances of the
transistors are given by the following parameters:

                 NMOS     PMOS
Cj0 [fF/µm²]     0.0975   0.0298
Cjsw0 [fF/µm]    0.107    0.362
φ0 [V]           0.879    0.939
φ0sw [V]         0.921    0.985
Although not explicitly shown in the figure, an overlap L0 = 0.3 µm is assumed
and must be included in the calculations. The supply voltage is VDD = 5 V.
a) Compute the maximum value of CGDn and CGDp.
b) Determine the zero-bias value of Cdbn and Cdbp.
Take the sidewall and the bottom regions into account separately.
\[ C_{bottom} = \frac{C_{j0} \cdot \text{area}}{\sqrt{1 + V_r/\phi_0}}; \qquad C_{sidewall} = \frac{C_{jsw0} \cdot \text{perimeter}}{\sqrt{1 + V_r/\phi_{0sw}}} \]
c) Compute $K(V_{OL}, V_{OH})$ for the inverter and use it to determine $C_{out}$,
ignoring the interconnecting wires and $C_G$:
\[ C_{db,average} = K(V_{OL}, V_{OH}) \cdot C_{db,max} \quad \text{(average for V between } V_{OL} \text{ and } V_{OH}\text{)} \]
d) Compute $t_{HL}$ and $t_{LH}$ for the inverter, using the value of $C_{out}$
determined above. Use the following parameters for the transistors:
NMOS: $V_{T0n} = +0.8\,\mathrm{V}$, $k_n = 40\,\mu\mathrm{A/V^2}$; PMOS: $V_{T0p} = -0.8\,\mathrm{V}$, $k_p = 16\,\mu\mathrm{A/V^2}$

Problem 2
The figure below shows the layout of two cascaded CMOS inverters, each
stage being identical to the one analysed in problem 1. Capacitances of
the connecting wires are now taken into account. Let Cp-f = 0.0576 fF/µm²
and Cm-f = 0.0345 fF/µm².
a) Compute the metal-field capacitance from the output of the first stage to
the metal-poly contact of the second stage. Consider only the metal-field
regions, ignoring the regions in which metal overlaps n+, p+ or poly.
b) Determine the input capacitance of the second stage, as seen from the
beginning of the poly line. Determine the sum Cline + Cg, using the value
of Cout calculated in problem 1. Is one of the two capacitances dominating?
Problem 3
Let’s consider a CMOS inverter with βn = βp = 35 µA/V² and VT0n = 0.9 V,
VT0p = -0.8 V. The output capacitance is Cout = 125 fF and the supply voltage
is VDD = 5 V.
a) Compute tHL and tLH for the inverter.
b) Determine the propagation delay time tp.
4. Exercise: CMOS and Pass Transistor
Logic

VLSI- Design of Integrated Circuits, WS 2003/04


1. Problem: Logic Function Analysis


Determine the logic function of the following NMOS circuits:
a)

b)

2. Problem: CMOS Logic
Synthesize the CMOS circuit for a parity generator with four inputs:
Z = A ⊕ B ⊕ C ⊕ D

3. Problem: Full Adder

Synthesize the CMOS circuit for a full adder, which has the following
truth table:

4. Problem: CMOS Logic

Implement the following function using static CMOS logic:
f = (AB) + C(D + E)

5. Problem: Transistor Count

The figure below shows an implementation with CMOS transmission
gates of the function: F = AS + BS

a) Build the equivalent multistage circuit with elementary gates (AND, OR, INV)
b) Implement the circuit as a complex gate
c) Compare the transistor count. Point out the advantages and
disadvantages of all three solutions
6. Problem: Pass Transistor Logic

Implement the following function:
F = ac′d′ + acd + a′cb′ + a′c′b
You may use 8 PMOS and 8 NMOS transistors respectively. The
literals are available in both inverted and non-inverted form.
Exercise 5: Dynamic Logic

VLSI- Design of Integrated Circuits, WS 2003/04


Problem 1: Dynamic Logic Full Adder

Draw the transistor-level circuit of a dynamic ripple-carry full adder
with the following logic functions:
\[ C_{n+1} = A_n \cdot B_n + C_n (A_n + B_n) \]
\[ S_n = \overline{C_{n+1}}\,(A_n + B_n + C_n) + A_n \cdot B_n \cdot C_n \]
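
The two functions can be checked against binary addition with a truth-table sweep. The sketch assumes that the carry C(n+1) enters the sum expression in complemented form (the overbar is easily lost in printed or extracted formulas); the function names are illustrative.

```python
from itertools import product


def carry_out(a, b, c):
    """C_{n+1} = A*B + C*(A + B)."""
    return (a & b) | (c & (a | b))


def sum_from_carry(a, b, c):
    """S_n = NOT(C_{n+1}) * (A + B + C) + A*B*C (assumed complemented carry)."""
    not_carry = 1 - carry_out(a, b, c)
    return (not_carry & (a | b | c)) | (a & b & c)


# Exhaustive check: carry and sum together must encode A + B + C in binary.
for a, b, c in product((0, 1), repeat=3):
    assert 2 * carry_out(a, b, c) + sum_from_carry(a, b, c) == a + b + c

if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", carry_out(a, b, c), sum_from_carry(a, b, c))
```

Reusing the carry in the sum expression is what makes this form attractive for a ripple-carry stage: the sum gate shares the already computed carry signal instead of building a separate three-input XOR.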

Problem 2: Charge Sharing

The function

Z = A (B + C + D + E + F)

must be implemented using domino logic. Could charge sharing
effects occur? If yes, how can they be avoided?

Problem 3: Charge Sharing

All input variables in the above circuit come from domino logic blocks, so
that immediately after the precharge we have: A = B = C = D = F = 0 V.
For which possible 0 → 1 transitions does the charge sharing effect have the
greatest influence? The capacitances are:
CX1 = CX2 = 10 fF, Cout,1 = 185 fF
Calculate the voltage Vout,1. Repeat the calculation for CX1 = CX2 = 40 fF.
Exercises 6

Line Propagation Delay, Buffer Stages


Problem 1: Line Propagation Delay

Assume a poly line with a length of l = 3 mm, a line resistance of
r = 12 Ω/µm and a line capacitance of c = 4 · 10^-4 pF/µm.

a) Calculate the delay of the line.
b) Insert a buffer with a delay J = 3 ns. At which position must the
buffer be inserted to achieve a minimum delay (line delay and
buffer delay)? Calculate this delay.
Problem 2: Inverter Chain

Consider an inverter chain with M stages like the one depicted below:

[Figure: chain of M inverters between the input In and the load capacitance CL. The input capacitances of the successive stages are Cg, S·Cg, ..., S^(M-1)·Cg, with S^M·Cg = CL.]

• Assume that the inverters in the chain are symmetrical, i.e. the rise and fall
times at the output of each inverter are equal. Furthermore, the gate capacitance
of the NMOS transistor of the first stage is C1 = 6 fF. The line capacitances are
negligible. The load capacitance is CL = 150 pF.

• Determine M and S so that the delay of the inverter chain is
minimal. The output must not be inverted.
Exercises 7
Gate-Matrix, Stick-Diagrams, Euler Graphs


Problem 1: Full adder - Stick Diagram

Let’s consider a full adder, whose input signals are A, B and Cin.
The outputs are S and Cout.
a) Draw the truth table for the full adder and determine the
equations for S and Cout.
b) Show the stick diagram of the full adder.
Problem 2: Barrel Shifter

Draw the stick diagram of a barrel shifter for a 4-bit word, n ∈ {0…3}.
Each input has its own shift-enable. Assume that these inputs are
properly driven by a decoder, i.e. only one input can be enabled at a
time.

Problem 3: Gate-Matrix Method


The figure below shows an implementation with CMOS transmission
gates of the function: F = AS + BS

a) Build the equivalent multi-stage circuit with elementary gates (AND, OR, INV)
b) Compare the transistor count. Show the advantages and
disadvantages of both solutions
c) Implement the circuit from a) using the gate-matrix technique.
Draw the corresponding stick diagram
Problem 4: Euler Graphs
Given the following function:
F = (i1 + i2)(i3 + i4) + i5 · i6 + i7 · i8
a) Show the transistor-level circuit implemented using static CMOS logic.
b) Build the optimal layout using the Euler graph method.
1) Show the complex-gate implementation
2) Modify the circuit so that applying the Euler graph method
yields the optimal result
3) Determine the Euler path for the graph reduction and the
subsequent graph expansion
c) Draw the layout as a stick diagram.
Exercises 8
PLA Structures


Problem 1: PLA - Stick diagram

Draw the stick diagram of an NMOS PLA that implements a full adder
stage. The input and the output registers are clocked using φ1 and
φ2 respectively.
Problem 2: FSM implementation with PLA

Design and implement with a PLA a traffic light controller for the
crossroad below. The farm road has sensors for detecting waiting
cars.
There is also a timer available, which is triggered by the rising edge
of a ‘Start’ signal and provides two output signals:
TShort - during the yellow phase
TLong - for timing the green phase

[Figure: crossroad of highway and farm road; timer block with input Start and outputs TShort and TLong.]

[Figure: state diagram of the controller with four states encoded by Y1 Y0, each showing the highway and farm road lights. Transition conditions legible in the diagram: "S = 0 or TLong = 0", "(S and TLong) = 1", "TShort = 0", "TShort = 1", "S = 0 or TLong = 1", "S = 1 and TLong = 0".]

S - Signal when a car is on the farm road
TL - Timer signal for green (active low)
TS - Timer signal for yellow (active low)
HG - Highway green state
HY - Highway yellow state
FG - Farm road green state
FY - Farm road yellow state

First, draw the schematic of the controller, showing the PLA, the
timer and the traffic lights.
Problem 1: Unconfigured PLA

[Figure: unconfigured PLA template for the full adder. Layer legend: active, polysilicon, metal, transistor. Column pairs C, A, B; clocks Phi1 and Phi2; supply Vdd; inputs Ci, Ai, Bi; outputs Si and Ci+1.]

Problem 2: Unconfigured PLA

[Figure: unconfigured PLA template for the traffic light controller. Layer legend: active, polysilicon, metal, transistor. Column pairs S, T, Y1, Y0; product term rows P2 to P9; clocks Phi1 and Phi2; supply Vdd; signals at the bottom edge: S, T, Y1, Y0, Y1', Y0', St, HL1, FL1.]
Appendix: IMI Unprogrammed Array

[Figure: IMI unprogrammed array]

Appendix: CDI Unprogrammed Array

[Figure: CDI unprogrammed array]