Lecture 4 (1)
Outline
Introduction
Combinational logic
Sequential logic
Custom single-purpose processor design
RT-level custom single-purpose processor
design
Introduction
Processor
Digital circuit that performs computation tasks
Controller and datapath Digital camera chip
CCD
General-purpose: variety of computation CCD preprocessor Pixel coprocessor D2A
tasks A2D
gate
IC package IC oxide
source channel drain
Silicon substrate
CMOS transistor implementations
Complementary Metal source source
nMOS pMOS
Typically 0 is 0V, 1 is 5V
Two basic CMOS types
nMOS conducts if gate=1 1 1 1
x y x
pMOS conducts if gate=0 x F = x'
F = (xy)' y
Hence “complementary” x F = (x+y)'
0 y x y
Basic gates
0 0
Inverter, NAND, NOR inverter NAND gate NOR gate
Basic logic gates
Gate      Function        F for xy = 00  01  10  11
Driver    F = x           (F = 0 when x = 0, F = 1 when x = 1)
AND       F = xy                     0   0   0   1
OR        F = x + y                  0   1   1   1
XOR       F = x ⊕ y                  0   1   1   0
Inverter  F = x'          (F = 1 when x = 0, F = 0 when x = 1)
NAND      F = (xy)'                  1   1   1   0
NOR       F = (x + y)'               1   0   0   0
XNOR      F = (x ⊕ y)'               1   0   0   1
Combinational logic design
A) Problem description B) Truth table C) Output equations
K-map for z (rows: a; columns: bc)
a \ bc   00  01  11  10
0         0   1   0   1
1         0   1   1   1
z = ab + b'c + bc'
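The output equation can be checked against the K-map by brute force. The sketch below assumes the K-map columns are bc in Gray order (00, 01, 11, 10), which the slide does not label explicitly; the names `KMAP` and `z_equation` are illustrative only.

```python
from itertools import product

# K-map for z: rows are a, columns are bc in Gray order (assumed labelling).
KMAP = {
    0: {"00": 0, "01": 1, "11": 0, "10": 1},
    1: {"00": 0, "01": 1, "11": 1, "10": 1},
}

def z_equation(a, b, c):
    """z = ab + b'c + bc' evaluated on 0/1 integers."""
    return (a & b) | ((1 - b) & c) | (b & (1 - c))

# Collect every input combination where the equation disagrees with the map.
mismatches = [
    (a, b, c)
    for a, b, c in product((0, 1), repeat=3)
    if z_equation(a, b, c) != KMAP[a][f"{b}{c}"]
]
print(mismatches)
```

An empty list means the equation covers exactly the 1-cells of the map.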
Combinational components
I(log n -1) I0 A A B
B A B
I(m-1) I1 I0 n n
… n n n
n …
log n x n n-bit n bit,
S0 n-bit, m x 1 n-bit
Decoder Adder m function S0
… Multiplexor Comparator
ALU …
… n
S(log m) S(log m)
n n
O(n-1) O1 O0 carry sum less equal greater
O O
With enable input e With carry-in input Ci May have status outputs
all O’s are 0 if e=0 carry, zero, etc.
sum = A + B + Ci
Sequential components
I
n
load shift
n-bit n-bit n-bit
Register Shift register Counter
clear I Q
n n
Q Q
Register: Q = 0 if clear=1; I if load=1 and clock=1; Q(previous) otherwise.
Shift register: Q = lsb; content shifted; I stored in msb.
Counter: Q = 0 if clear=1; Q(prev)+1 if count=1 and clock=1.
Sequential logic design
A) Problem Description
You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.
B) State Diagram
[Figure: four states 0, 1, 2, 3. Each state loops on a=0 and advances on a=1 (0 to 1 to 2 to 3 to 0). x=0 in states 0, 1 and 2; x=1 in state 3.]
C) State Table (Moore-type)
Inputs       Outputs
Q1 Q0 a      I1 I0   x
0  0  0      0  0    0
0  0  1      0  1    0
0  1  0      0  1    0
0  1  1      1  0    0
1  0  0      1  0    0
1  0  1      1  1    0
1  1  0      1  1    1
1  1  1      0  0    1
D) Implementation Model
[Figure: combinational logic with inputs a, Q1, Q0 and outputs x, I1, I0; a state register loads I1 I0 and drives Q1 Q0.]
Given this implementation model, sequential logic design quickly reduces to combinational logic design
Sequential logic design (cont.)
E) Minimized Output Equations
K-maps (rows: a; columns: Q1Q0)
I1       00  01  11  10
a = 0     0   0   1   1
a = 1     0   1   0   1
I1 = Q1'Q0a + Q1a' + Q1Q0'
I0       00  01  11  10
a = 0     0   1   1   0
a = 1     1   0   0   1
I0 = Q0a' + Q0'a
x        00  01  11  10
a = 0     0   0   1   0
a = 1     0   0   1   0
x = Q1Q0
F) Combinational Logic
[Figure: gate-level implementation of I1, I0 and x from inputs a, Q1 and Q0.]
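To see the clock divider behave as specified, the three minimized equations can be stepped one clock at a time. This is a minimal behavioural sketch, assuming the state register resets to Q1Q0 = 00; the function names are illustrative.

```python
def next_state_and_output(q1, q0, a):
    """Evaluate the minimized equations on 0/1 integers."""
    n1, n0, na = 1 - q1, 1 - q0, 1 - a
    i1 = (n1 & q0 & a) | (q1 & na) | (q1 & n0)   # I1 = Q1'Q0a + Q1a' + Q1Q0'
    i0 = (q0 & na) | (n0 & a)                    # I0 = Q0a' + Q0'a
    x = q1 & q0                                  # x  = Q1Q0 (Moore output)
    return i1, i0, x

def run(cycles, a=1):
    """Clock the state register `cycles` times and record x in each cycle."""
    q1 = q0 = 0          # assumed reset state
    trace = []
    for _ in range(cycles):
        i1, i0, x = next_state_and_output(q1, q0, a)
        trace.append(x)
        q1, q0 = i1, i0  # clock edge: state register loads I1 I0
    return trace

print(run(8))
```

With a held at 1 the output is 1 in one cycle out of every four; with a held at 0 the machine stays in state 0 and x never rises.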
Custom single-purpose processor
basic model
[Figure: controller and datapath. The controller takes external control inputs and datapath control inputs, contains next-state and control logic plus a state register, and produces external control outputs and datapath control outputs. The datapath takes external data inputs, contains registers and functional units, and produces external data outputs.]
8: x = x - y; 9: d_o = x
}
9: d_o = x; 1-J:
}
State diagram templates
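Assignment statement
  Code: a = b; next statement
  Template: a state performing a = b, followed by the state for the next statement
Loop statement
  Code: while (cond) { loop-body-statements } next statement
  Template: condition state C: goes to the loop-body-statements on cond and to the next statement on !cond; the loop body ends in join state J:, which returns to C:
Branch statement
  Code: if (c1) c1 stmts else if c2 c2 stmts else other stmts; next statement
  Template: condition state C: goes to c1 stmts on c1, to c2 stmts on !c1*c2 and to others on !c1*!c2; all branches meet in join state J:, followed by the next statement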
declared variable 1
!(!go_i)
2:
Create a functional unit for !go_i
x_i y_i
Datapath
each arithmetic operation 2-J:
x_sel
n-bit 2x1 n-bit 2x1
3: x = x_i
Connect the ports, registers y_sel
x_ld
0: x 0: y
and functional units 4: y = y_i
y_ld
9: d_o = x
1-J:
Creating the controller’s FSM
!1 go_i
1:
Controller !1
Same structure as FSMD
1
!(!go_i) 0000 1:
2:
1
!(!go_i) Replace complex
0001 2:
!go_i
2-J:
!go_i actions/conditions with
3: x = x_i
0010 2-J:
x_sel = 0
datapath configurations
0011 3: x_ld = 1
4: y = y_i
y_sel = 0 x_i y_i
0100 4: y_ld = 1
!(x!=y)
Datapath
5: !x_neq_y
0101 5: x_sel
x!=y n-bit 2x1 n-bit 2x1
x_neq_y y_sel
6: 0110 6:
x_ld
x<y !(x<y) x_lt_y !x_lt_y 0: x 0: y
y_ld
7: y = y -x 8: x = x - y 7: y_sel = 1 8: x_sel =1
y_ld = 1 x_ld = 1
1010 5-J:
1011 9: d_ld = 1
1100 1-J:
Controller state table for the GCD
example
Inputs Outputs
Q3 Q2 Q1 Q0 x_neq_y x_lt_y go_i I3 I2 I1 I0 x_sel y_sel x_ld y_ld d_ld
0 0 0 0 * * * 0 0 0 1 X X 0 0 0
0 0 0 1 * * 0 0 0 1 0 X X 0 0 0
0 0 0 1 * * 1 0 0 1 1 X X 0 0 0
0 0 1 0 * * * 0 0 0 1 X X 0 0 0
0 0 1 1 * * * 0 1 0 0 0 X 1 0 0
0 1 0 0 * * * 0 1 0 1 X 0 0 1 0
0 1 0 1 0 * * 1 0 1 1 X X 0 0 0
0 1 0 1 1 * * 0 1 1 0 X X 0 0 0
0 1 1 0 * 0 * 1 0 0 0 X X 0 0 0
0 1 1 0 * 1 * 0 1 1 1 X X 0 0 0
0 1 1 1 * * * 1 0 0 1 X 1 0 1 0
1 0 0 0 * * * 1 0 0 1 1 X 1 0 0
1 0 0 1 * * * 1 0 1 0 X X 0 0 0
1 0 1 0 * * * 0 1 0 1 X X 0 0 0
1 0 1 1 * * * 1 1 0 0 X X 0 0 1
1 1 0 0 * * * 0 0 0 0 X X 0 0 0
1 1 0 1 * * * 0 0 0 0 X X 0 0 0
1 1 1 0 * * * 0 0 0 0 X X 0 0 0
1 1 1 1 * * * 0 0 0 0 X X 0 0 0
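The state table can be exercised directly in software. The sketch below encodes the rows above and couples them to a behavioural datapath, assuming that mux select 0 passes the external input (x_i or y_i), that select 1 passes the subtractor output, and that go_i is held at 1. `TABLE`, `always` and `gcd_processor` are illustrative names, not part of the original design.

```python
def always(nxt):
    """Next-state rule that ignores the status inputs."""
    return lambda neq, lt, go: nxt

# state -> (next-state rule, x_sel, y_sel, x_ld, y_ld, d_ld); None = don't care
TABLE = {
    0b0000: (always(0b0001), None, None, 0, 0, 0),
    0b0001: (lambda neq, lt, go: 0b0011 if go else 0b0010, None, None, 0, 0, 0),
    0b0010: (always(0b0001), None, None, 0, 0, 0),
    0b0011: (always(0b0100), 0, None, 1, 0, 0),      # x = x_i
    0b0100: (always(0b0101), None, 0, 0, 1, 0),      # y = y_i
    0b0101: (lambda neq, lt, go: 0b0110 if neq else 0b1011, None, None, 0, 0, 0),
    0b0110: (lambda neq, lt, go: 0b0111 if lt else 0b1000, None, None, 0, 0, 0),
    0b0111: (always(0b1001), None, 1, 0, 1, 0),      # y = y - x
    0b1000: (always(0b1001), 1, None, 1, 0, 0),      # x = x - y
    0b1001: (always(0b1010), None, None, 0, 0, 0),
    0b1010: (always(0b0101), None, None, 0, 0, 0),
    0b1011: (always(0b1100), None, None, 0, 0, 1),   # d_o = x
    0b1100: (always(0b0000), None, None, 0, 0, 0),
}

def gcd_processor(x_i, y_i, max_cycles=100_000):
    """Run controller and datapath until d is loaded; inputs must be positive."""
    state, x, y = 0b0000, 0, 0
    for _ in range(max_cycles):
        neq, lt = int(x != y), int(x < y)            # datapath status outputs
        rule, x_sel, y_sel, x_ld, y_ld, d_ld = TABLE[state]
        if d_ld:
            return x                                 # value latched into d
        # Both registers load from the values held before the clock edge.
        new_x = (x - y if x_sel else x_i) if x_ld else x
        new_y = (y - x if y_sel else y_i) if y_ld else y
        x, y, state = new_x, new_y, rule(neq, lt, 1)  # go_i assumed 1
    raise RuntimeError("no result within max_cycles")

print(gcd_processor(42, 8))
```

Because the controller only sees x_neq_y and x_lt_y, the same table would drive a datapath of any bit width.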
Completing the GCD custom
single-purpose processor design
We finished the datapath … …
combinational logic
state functional
design register units
Problem Specification
[Figure: Sender connected to Bridge by rdy_in and data_in(4); Bridge connected to Receiver by rdy_out and data_out(8); clock input.]
Bridge: a single-purpose processor that converts two 4-bit inputs, arriving one at a time over data_in along with a rdy_in pulse, into one 8-bit output on data_out along with a rdy_out pulse.
… machine
Rather than algorithm
Cycle timing often too central to functionality
Example
Bus bridge that converts 4-bit …
Exercise: complete the design
Bridge
[Figure: state diagram with transitions on rdy_in=0 and rdy_in=1, ending in the two states below.]
Send8Start: data_out = data_hi & data_lo; rdy_out = 1
Send8End: rdy_out = 0
Inputs
rdy_in: bit; data_in: bit[4];
Outputs
rdy_out: bit; data_out: bit[8]
Variables
data_lo, data_hi: bit[4];
RT-level custom single-purpose processor
design (cont’)
Bridge
(a) Controller
rdy_in=0 rdy_in=1
rdy_in=1
WaitFirst4 RecFirst4Start RecFirst4End
data_lo_ld=1
rdy_in=0 rdy_in=0 rdy_in=1
rdy_in=1
WaitSecond4 RecSecond4Start RecSecond4End
data_hi_ld=1
Send8Start Send8End
data_out_ld=1 rdy_out=0
rdy_out=1
rdy_in rdy_out
clk
data_in(4) data_out
data_lo_ld
data_out_ld
data_hi_ld
registers
data_hi data_lo
to all
data_out
(b) Datapath
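The controller's eight states can be walked through in software. The following is a rough sketch, not the completed design: the transition conditions are read off the rdy_in labels in the diagram, `data_hi & data_lo` is assumed to mean concatenation with data_hi as the upper four bits, and the sender is assumed to hold data_in stable for at least two cycles while rdy_in is 1.

```python
def bridge(samples):
    """samples: list of (rdy_in, data_in) per clock cycle.
    Returns a list of (rdy_out, data_out) per clock cycle."""
    state = "WaitFirst4"
    data_lo = data_hi = data_out = 0
    outputs = []
    for rdy_in, data_in in samples:
        rdy_out = 0
        if state == "WaitFirst4":
            nxt = "RecFirst4Start" if rdy_in else "WaitFirst4"
        elif state == "RecFirst4Start":
            data_lo = data_in & 0xF                 # data_lo_ld = 1
            nxt = "RecFirst4End"
        elif state == "RecFirst4End":
            nxt = "RecFirst4End" if rdy_in else "WaitSecond4"
        elif state == "WaitSecond4":
            nxt = "RecSecond4Start" if rdy_in else "WaitSecond4"
        elif state == "RecSecond4Start":
            data_hi = data_in & 0xF                 # data_hi_ld = 1
            nxt = "RecSecond4End"
        elif state == "RecSecond4End":
            nxt = "RecSecond4End" if rdy_in else "Send8Start"
        elif state == "Send8Start":
            data_out = (data_hi << 4) | data_lo     # data_out_ld = 1
            rdy_out = 1
            nxt = "Send8End"
        else:                                       # Send8End
            nxt = "WaitFirst4"
        outputs.append((rdy_out, data_out))
        state = nxt
    return outputs

# Sender presents 0xA, then 0x5, each with a two-cycle rdy_in pulse.
samples = [(1, 0xA), (1, 0xA), (0, 0), (1, 0x5), (1, 0x5), (0, 0), (0, 0), (0, 0)]
result = bridge(samples)
print(result[6])
```

In this run the first nibble lands in data_lo and the second in data_hi, so the receiver sees 0x5A together with a one-cycle rdy_out pulse.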
Optimizing single-purpose
processors
Optimization is the task of making design
metric values the best possible
Optimization opportunities
original program
FSMD
datapath
FSM
Optimizing the original program
Analyze program attributes and look for areas
of possible improvement
number of computations
size of variable
time and space complexity
operations used
multiplication and division very expensive
Optimizing the original program
(cont’)
original program
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }

Replace the subtraction operation(s) with a modulo operation in order to speed up the program.

optimized program
0: int x, y, r;
1: while (1) {
2:   while (!go_i);
     // x must be the larger number
3:   if (x_i >= y_i) {
4:     x = x_i;
5:     y = y_i;
     }
6:   else {
7:     x = y_i;
8:     y = x_i;
     }
9:   while (y != 0) {
10:    r = x % y;
11:    x = y;
12:    y = r;
     }
13:  d_o = x;
   }

Original: GCD(42, 8) steps through 9 (x, y) value pairs to complete the loop: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).
Optimized: GCD(42, 8) steps through 3 (x, y) value pairs to complete the loop: (42, 8), (8, 2), (2, 0).
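The difference between the two programs is easy to reproduce. The sketch below ports both loops to Python and records every (x, y) pair the loop condition sees; the function names are illustrative and the go_i handshake is left out.

```python
def gcd_subtract(x, y):
    """Original program: repeated subtraction. Inputs must be positive."""
    pairs = [(x, y)]
    while x != y:
        if x < y:
            y = y - x
        else:
            x = x - y
        pairs.append((x, y))
    return x, pairs

def gcd_modulo(x_i, y_i):
    """Optimized program: modulo, with x forced to be the larger number."""
    x, y = (x_i, y_i) if x_i >= y_i else (y_i, x_i)
    pairs = [(x, y)]
    while y != 0:
        x, y = y, x % y
        pairs.append((x, y))
    return x, pairs

d_sub, pairs_sub = gcd_subtract(42, 8)
d_mod, pairs_mod = gcd_modulo(42, 8)
print(d_sub, len(pairs_sub), d_mod, len(pairs_mod))
```

Both versions return the same result; the modulo version simply collapses each run of repeated subtractions into one step, at the price of needing a modulo unit in the datapath.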
Optimizing the FSMD
Areas of possible improvements
merge states
states with constants on transitions can be
eliminated, transition taken is already known
states with independent operations can be
merged
separate states
states which require complex operations
(a*b*c*d) can be broken into smaller states to
reduce hardware size
scheduling
Optimizing the FSMD (cont.)
[Figure: original FSMD (states 1, 2, 2-J, 3 to 9, 5-J, 6-J, 1-J) beside the optimized FSMD (states 2, 3, 5, 7, 8, 9); both declare int x, y.]
eliminate state 1: transitions have constant values
merge state 2 and state 2J: no loop operation in between them
merge state 5 and state 6: transitions from state 6 can be done in state 5
eliminate state 5J and 6J: transitions from each state can be done from state 7 and state 8, respectively
eliminate state 1-J: transition from state 1-J can be done directly from state 9
Optimizing the datapath
Sharing of functional units
one-to-one mapping, as done previously, is not
necessary
if same operation occurs in different states, they
can share a single functional unit
Multi-functional units
ALUs support a variety of operations, so one ALU can be
shared among operations occurring in different
states
Optimizing the FSM
State encoding
task of assigning a unique bit pattern to each state
in an FSM
size of state register and combinational logic vary
can be treated as an ordering problem
State minimization
task of merging equivalent states into a single state
two states are equivalent if, for all possible input combinations,
they generate the same outputs and
transition to the same next state
Summary
Custom single-purpose processors
Straightforward design techniques
Can be built to execute algorithms
Typically start with FSMD
CAD tools can be of great assistance
Layout, Fabrication, and
Elementary Logic Design
Introduction
Integrated circuits: many transistors on one
chip.
Very Large Scale Integration (VLSI)
Complementary Metal Oxide Semiconductor
(CMOS)
Fast, cheap, “low-power” transistors circuits
VLSI:Very Large Scale Integration
Integration: Integrated Circuits
Multiple devices (transistors) on one substrate (chip)
How large is Very Large?
SSI (small scale integration): 7400 series, 10-100 transistors
MSI (medium scale): 74000 series 100-1000
LSI 1,000-10,000 transistors
VLSI > 10,000 transistors
ULSI/SLSI (some disagreement)
Complementary Metal Oxide Semiconductor (CMOS)
Fast, cheap, “low-power” transistors circuits
The Process of VLSI Design:
Consists of many different representations/Abstractions
of the system (chip) that is being designed.
System Level Design
Place/Route Artwork
Si Si Si
Si Si Si
Si Si Si
http://onlineheavytheory.net/silicon.html
Dopants
Silicon is a semiconductor at room temperature
Pure silicon has few free carriers and conducts poorly
Adding dopants increases the conductivity drastically
Dopant from Group V (e.g. As, P): extra electron (n-
type)
Dopant from Group III (e.g. B, Al): missing electron,
called hole (p-type)
Si Si Si Si Si Si
- +
+ -
Si As Si Si B Si
Si Si Si Si Si Si
p-n Junctions
First semiconductor (two terminal) devices
A junction between p-type and n-type
semiconductor forms a diode.
Current flows only in one direction
p-type n-type
anode cathode
MOS Integrated Circuits
1970’s processes usually had only nMOS transistors
Inexpensive, but consume power while idle
1980s-present: CMOS processes for low idle power
Intel 1101 256-bit SRAM Intel 4004 4-bit Proc Pentium 4 Processor
Transistor Types
Bipolar transistors
npn or pnp silicon structure
Small current into very thin base layer controls large
currents between emitter and collector
Base currents limit integration density
Metal Oxide Semiconductor Field Effect Transistors
nMOS and pMOS MOSFETS
Voltage applied to insulated gate controls current
between source and drain
Low power allows very high integration
First patent in the ’20s in USA and Germany
Not widely used until the ’60s or ’70s
MOS Transistors
Four terminal device: gate, source, drain, body
Gate – oxide – body stack looks like a capacitor
Gate and body are conductors (body is also called the substrate)
SiO2 (oxide) is a “good” insulator (separates the gate from the body)
Called metal–oxide–semiconductor (MOS) capacitor, even though
gate is mostly made of poly-crystalline silicon (polysilicon)
n+ n+ p+ p+
p bulk Si n bulk Si
NMOS PMOS
NMOS Operation
Body is commonly tied to ground (0 V)
Drain is at a higher voltage than Source
When the gate is at a low voltage:
P-type body is at low voltage
Source-body and drain-body “diodes” are OFF
No current flows, transistor is OFF
Source Gate Drain
Polysilicon
SiO2
0
n+ n+
S D
p bulk Si
NMOS Operation Cont.
When the gate is at a high voltage: Positive charge
on gate of MOS capacitor
Negative charge is attracted to body under the gate
Inverts a channel under gate to “n-type” (N-channel, hence
called the NMOS) if the gate voltage is above a threshold
voltage (VT)
Now current can flow through “n-type” silicon from source
through channel to drain, transistor is ON
Source Gate Drain
Polysilicon
SiO2
1
n+ n+
S D
p bulk Si
PMOS Transistor
Similar, but doping and voltages reversed
Body tied to high voltage (VDD)
Drain is at a lower voltage than the Source
Gate low: transistor ON
Gate high: transistor OFF
Bubble indicates inverted behavior
Source Gate Drain
Polysilicon
SiO 2
p+ p+
n bulk Si
Power Supply Voltage
GND = 0 V
In 1980’s, VDD = 5V
VDD has decreased in modern processes
High VDD would damage modern tiny transistors
Lower VDD saves power
VDD = 3.3, 2.5, 1.8, 1.5, 1.2, 1.0, … V
Effective power supply voltage can be lower due
to IR drop across the power grid.
Transistors as Switches
In digital circuits, MOS transistors are
electrically controlled switches
Voltage at gate controls path from source to drain
nMOS: g=0 OFF, g=1 ON
pMOS: g=0 ON, g=1 OFF
CMOS Inverter
A Y
0 1
1 0
[Figure: pMOS between VDD and Y, nMOS between Y and GND, both gates driven by A. With A=1 the nMOS is ON and the pMOS is OFF, so Y=0. With A=0 the pMOS is ON and the nMOS is OFF, so Y=1.]
CMOS NAND Gate
A B | pMOS on A  pMOS on B | nMOS on A  nMOS on B | Y
0 0 |    ON         ON     |    OFF        OFF    | 1
0 1 |    ON         OFF    |    OFF        ON     | 1
1 0 |    OFF        ON     |    ON         OFF    | 1
1 1 |    OFF        OFF    |    ON         ON     | 0
[Figure: two pMOS transistors in parallel between VDD and Y; two nMOS transistors in series between Y and GND.]
CMOS NOR Gate
A B Y
0 0 1
0 1 0
1 0 0
1 1 0
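The inverter, NAND and NOR tables all follow from treating each transistor as a switch. The sketch below is a switch-level model under that idealisation: an nMOS conducts when its gate is 1, a pMOS conducts when its gate is 0, and the output is set by which network conducts. The function names and the "Z"/"X" markers for floating and contending outputs are illustrative choices.

```python
def nmos(g):
    """nMOS switch: conducts when the gate is 1."""
    return g == 1

def pmos(g):
    """pMOS switch: conducts when the gate is 0."""
    return g == 0

def resolve(pull_up, pull_down):
    """Output value given which networks conduct."""
    if pull_up and not pull_down:
        return 1
    if pull_down and not pull_up:
        return 0
    return "X" if pull_up else "Z"   # both on: contention; both off: floating

def inverter(a):
    return resolve(pmos(a), nmos(a))

def nand2(a, b):
    # pMOS in parallel to VDD, nMOS in series to GND
    return resolve(pmos(a) or pmos(b), nmos(a) and nmos(b))

def nor2(a, b):
    # pMOS in series to VDD, nMOS in parallel to GND
    return resolve(pmos(a) and pmos(b), nmos(a) or nmos(b))

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([nand2(a, b) for a, b in inputs], [nor2(a, b) for a, b in inputs])
```

For these three gates exactly one network conducts for every input, so "Z" and "X" never appear.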
3-input NAND Gate
Y is pulled low if ALL inputs are 1
Y is pulled high if ANY input is 0
Y
A
B
C
CMOS Fabrication
CMOS transistors are fabricated on silicon
wafer
Wafer diameters: 200-300 mm
Lithography process similar to printing press
On each step, different materials are
deposited, or patterned or etched
Easiest to understand by viewing both top
and cross-section of wafer in a simplified
manufacturing process
Inverter Cross-section
Typically use p-type substrate for nMOS transistors
Requires an n-well for the body of pMOS
transistors
A
GND VDD
Y SiO 2
n+ diffusion
p+ diffusion
n+ n+ p+ p+
polysilicon
n well
p substrate
metal1
p+ n+ n+ p+ p+ n+
n well
p substrate
p+ n+ n+ p+ p+ n+
n well
p substrate
GND VDD
n-well
Polysilicon
Polysilicon
n+ diffusion
p+ diffusion n+ Diffusion
Contact p+ Diffusion
Metal Contact
Metal
Fabrication Steps
Start with blank wafer (typically p-type where NMOS is created)
Build inverter from the bottom up
First step will be to form the n-well (where PMOS would reside)
Cover wafer with protective layer of SiO2 (oxide)
Remove oxide layer where n-well should be built
Implant or diffuse n dopants into exposed wafer to form n-well
Strip off SiO2
p substrate
Oxidation
Grow SiO2 on top of Si wafer
900-1200 °C with H2O or O2 in oxidation
furnace
SiO 2
p substrate
Photoresist
Spin on photoresist
Photoresist is a light-sensitive organic polymer
Property changes where exposed to light
Photoresist
SiO 2
_
p substrate
Lithography
Expose photoresist to ultraviolet (UV) light
through the n-well mask
Strip off exposed photoresist with chemicals
Photoresist
SiO 2
p substrate
Etch
Etch oxide with hydrofluoric acid (HF)
Seeps through skin and eats bone; nasty
stuff!!!
Only attacks oxide where resist has been
exposed
N-well pattern is transferred from the mask to
the silicon dioxide surface; creates an opening
to the silicon surface
Photoresist
SiO 2
p substrate
Strip Photoresist
Strip off remaining photoresist
Use mixture of acids called piranha etch
Necessary so resist doesn’t melt in next step
SiO 2
p substrate
n-well
n-well is formed with diffusion or ion implantation
Diffusion
Place wafer in furnace with arsenic-rich gas
Heat until As atoms diffuse into exposed Si
Ion Implantation
Blast wafer with beam of As ions
n well
Strip Oxide
Strip off the remaining oxide using HF
Back to bare wafer with n-well
Subsequent steps involve similar series of
steps
n well
p substrate
Polysilicon
(self-aligned gate technology)
Polysilicon
Thin gate oxide
n well
p substrate
Polysilicon Patterning
Use same lithography process discussed
earlier to pattern polysilicon
Polysilicon
Polysilicon
Thin gate oxide
n well
p substrate
Self-Aligned Process
Use gate-oxide/polysilicon and masking to
expose where n+ dopants should be diffused
or implanted
N-diffusion forms nMOS source, drain, and n-
well contact
n well
p substrate
N-diffusion/implantation
Pattern oxide and form n+ regions
Self-aligned process where gate blocks n-dopants
Polysilicon is better than metal for self-aligned gates
because it doesn’t melt during later processing
n+ Diffusion
n well
p substrate
N-diffusion/implantation cont.
Historically dopants were diffused
Usually high energy ion-implantation used
today
But n+ regions are still called diffusion
n+ n+ n+
n well
p substrate
N-diffusion cont.
Strip off oxide to complete patterning step
n+ n+ n+
n well
p substrate
P-Diffusion/implantation
Similar set of steps form p+ “diffusion”
regions for PMOS source and drain and
substrate contact
p+ Diffusion
p+ n+ n+ p+ p+ n+
n well
p substrate
Contacts
Now we need to wire together the devices
Cover chip with thick field oxide (FO)
Etch oxide where contact cuts are needed
Contact
Metal
Metal
Thick field oxide
p+ n+ n+ p+ p+ n+
n well
p substrate
Physical Layout
Chips are specified with set of masks
Minimum dimensions of masks determine transistor
size (and hence speed, cost, and power)
Feature size f = distance between source and drain
Set by minimum width of polysilicon
Feature size improves 30% every 3 years or so
Normalize for feature size when describing design
rules
Express rules in terms of λ = f/2
E.g. λ = 0.3 μm in 0.6 μm process
Simplified Design Rules
Conservative rules to get you started
Inverter Layout
Transistor dimensions specified as Width / Length
Minimum size is 4-6 λ / 2 λ, sometimes called 1 unit
In f = 0.25 μm process, this is 0.5-0.75 μm wide (W),
0.25 μm long (L)
Since λ = f/2 = 0.125 μm
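The arithmetic behind these numbers is a direct scaling by λ. As a small sketch, assuming λ = f/2 and a minimum transistor of 4 to 6 λ wide by 2 λ long, the dimensions for any process can be computed as below; the function names and dictionary keys are illustrative.

```python
import math

def lambda_um(feature_um):
    """Lambda in micrometres for a process with feature size f (lambda = f/2)."""
    return feature_um / 2

def min_transistor_um(feature_um):
    """Width range and length of the minimum-size transistor, in micrometres."""
    lam = lambda_um(feature_um)
    return {"W_min": 4 * lam, "W_max": 6 * lam, "L": 2 * lam}

dims = min_transistor_um(0.25)
print(lambda_um(0.6), dims)
```

Expressing rules in λ means the same layout can be reused in a finer process by changing only the value of λ.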
Technology Roadmap
Summary
MOS Transistors are stack of gate, oxide,
silicon
and p-n junctions
Can be viewed as electrically controlled
switches
Build logic gates out of switches
Draw masks to specify layout of transistors
Now you know everything necessary to start
designing schematics and layout for a simple
chip!
Circuits & Layout
CMOS Gate Design
A 4-input CMOS NOR gate
A
B
C
D
Y
Complementary CMOS
Complementary CMOS logic gates
nMOS pull-down network
pMOS pull-up network pMOS
pull-up
network
a.k.a. static CMOS
inputs
output
nMOS
pull-down
network
Pull-down ON: output is 0 when pull-up is OFF, X (crowbar) when pull-up is ON
Series and Parallel
nMOS: 1 = ON
pMOS: 0 = ON
Series: both must be ON
Parallel: either can be ON
Conduction between a and b for g1 g2 = 00, 01, 10, 11:
(a) nMOS in series:    OFF  OFF  OFF  ON
(b) pMOS in series:    ON   OFF  OFF  OFF
(c) nMOS in parallel:  OFF  ON   ON   ON
(d) pMOS in parallel:  ON   ON   ON   OFF
Conduction Complement
Complementary CMOS gates always produce 0 or 1
C D
A B C D
A B
(c)
(d)
C D
A
A B
B
Y Y
C
A C
D
B D
(f)
(e)
Example: O3AI
Y = ((A + B + C) · D)'
A
B
C D
Y
D
A B C
Pass Transistors
Transistors can be used as switches
g
s d
s d
Pass Transistors
Transistors can be used as switches
g g=0 Input g = 1 Output
s d 0 strong 0
s d
g=1 g=1
s d 1 degraded 1
Input Output
g = 0, gb = 1 g = 1, gb = 0
g
a b 0 strong 0
a b g = 1, gb = 0 g = 1, gb = 0
a b 1 strong 1
gb
g g g
a b a b a b
gb gb gb
Tristates
Tristate buffer produces Z when not enabled
EN
EN A Y A Y
0 0 Z
0 1 Z EN
1 0 0
1 1 1 A Y
EN
Nonrestoring Tristate
Transmission gate acts as tristate buffer
Only two transistors
But nonrestoring
Noise on A is passed on to Y (after several stages, the
EN
A Y
EN
Tristate Inverter
Tristate inverter produces restored output
Note however that the Tristate buffer
ignores the conduction complement rule because we want a
Z output
A
EN
Y
EN
Tristate Inverter
Tristate inverter produces restored output
Note however that the Tristate buffer
ignores the conduction complement rule because we want a
Z output
A A
A
EN
Y Y Y
EN
EN = 0: Y = 'Z'
EN = 1: Y = A'
Multiplexers
2:1 multiplexer chooses between two inputs
S D1 D0 | Y
0 X  0  | 0
0 X  1  | 1
1 0  X  | 0
1 1  X  | 1
Gate-Level Mux Design
Y = S·D1 + S'·D0 (too many transistors)
How many transistors are needed? 20
[Figure: gate-level mux built from D1, D0 and S, with per-gate transistor counts of 4 and 2.]
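The gate-level equation and the count of 20 can both be checked quickly. The tally below is one plausible reading of the 4 and 2 annotations in the figure: each AND is a 4-transistor NAND followed by a 2-transistor inverter, the OR is a NOR followed by an inverter, and one more inverter generates S'. That breakdown is an assumption, not something the slide states.

```python
from itertools import product

def mux2(s, d1, d0):
    """Gate-level 2:1 mux on 0/1 integers: Y = S*D1 + S'*D0."""
    return (s & d1) | ((1 - s) & d0)

# Transistor tally under the assumed gate breakdown.
NAND_OR_NOR = 4                             # 2-input static CMOS NAND or NOR
INVERTER = 2
and_gates = 2 * (NAND_OR_NOR + INVERTER)    # two ANDs, each NAND + inverter
or_gate = NAND_OR_NOR + INVERTER            # one OR, as NOR + inverter
total = and_gates + or_gate + INVERTER      # plus the inverter for S'

# Compare the equation with the behavioural definition for all inputs.
agrees = all(
    mux2(s, d1, d0) == (d1 if s else d0)
    for s, d1, d0 in product((0, 1), repeat=3)
)
print(agrees, total)
```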
Transmission Gate Mux
Nonrestoring mux uses two transmission
gates
Transmission Gate Mux
Nonrestoring mux uses two transmission
gates
Only 4 transistors
S
D0
S Y
D1
S
Inverting Mux
Inverting multiplexer
Use compound AOI22
Or pair of tristate inverters
Essentially the same thing
D0 S D0 D1 S
S D1 S S
Y Y D0 0
S S S S Y
D1 1
4:1 Multiplexer
4:1 mux chooses one of 4 inputs using two
selects
4:1 Multiplexer
4:1 mux chooses one of 4 inputs using two
selects
Two levels of 2:1 muxes
Or four tristates S1S0 S1S0 S1S0 S1S0
D0
S0 S1
D0 0
D1
D1 1
0
Y Y
1
D2 0 D2
D3 1
D3
D Latch
When CLK = 1, latch is transparent
Q follows D (a buffer with a Delay)
When CLK = 0, the latch is opaque
Q holds its last value independent of D
CLK CLK
D
Latch
D Q
Q
D Latch Design
Multiplexer chooses D or old Q
CLK
CLK
D Q Q
1
Q D Q
0
CLK CLK
Old Q
CLK
D Latch Operation
Q Q
D Q D Q
CLK = 1 CLK = 0
CLK
Q
D Flip-flop
When CLK rises, D is copied to Q
At all other times, Q holds its value
a.k.a. positive edge-triggered flip-flop, master-
slave flip-flop
CLK
CLK
D
Flop
D Q
Q
D Flip-flop Design
Built from master and slave D latches
CLK CLK
CLK QM
D Q
CLK CLK CLK CLK
CLK
Latch
Latch
QM
D Q
CLK CLK
QM Q
D
CLK = 0
Q -> NOT(NOT(QM))
CLK = 1
CLK
Q
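The master-slave arrangement can be mimicked with two level-sensitive latches. This is a behavioural sketch only, ignoring delays, setup and hold times, and the inverting stages of the real circuit; the class names are illustrative. The master is assumed transparent while CLK = 0 and the slave while CLK = 1.

```python
class DLatch:
    """Level-sensitive latch: Q follows D while clk equals `transparent_when`."""
    def __init__(self, transparent_when=1):
        self.q = 0
        self.transparent_when = transparent_when

    def update(self, clk, d):
        if clk == self.transparent_when:
            self.q = d          # transparent: pass D through
        return self.q           # opaque: hold the last value

class DFlipFlop:
    """Positive edge-triggered flop built from a master and a slave latch."""
    def __init__(self):
        self.master = DLatch(transparent_when=0)
        self.slave = DLatch(transparent_when=1)

    def update(self, clk, d):
        qm = self.master.update(clk, d)      # QM tracks D while CLK = 0
        return self.slave.update(clk, qm)    # Q copies QM while CLK = 1

flop = DFlipFlop()
sequence = [(0, 1), (1, 0), (1, 0), (0, 0), (1, 1)]   # (clk, d) per step
trace = [flop.update(clk, d) for clk, d in sequence]
print(trace)
```

In the trace, Q changes only on steps where CLK goes from 0 to 1, and it takes the value D had just before the edge, even when D changes while CLK is high.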
Race Condition
Back-to-back flops can
malfunction from clock skew
Second flip-flop fires late
Sees first flip-flop change
and captures its result
Called hold-time failure or
race condition
Nonoverlapping Clocks
Nonoverlapping clocks can prevent races
As long as nonoverlap exceeds clock skew
Good for safe design
Industry manages skew more carefully instead
2 1
QM
D Q
2 2 1 1
2 1
1
2
Gate Layout
Layout can be very time consuming
Design gates to fit together nicely
Build a library of standard cells
Must follow a technology rule
Inverter, contd..
Example: NAND3
Horizontal N-diffusion and p-diffusion strips
Vertical polysilicon gates
Metal1 VDD rail at top
Metal1 GND rail at bottom
32 λ by 40 λ
NAND3 (using Electric), contd.
Stick Diagrams
Stick diagrams help plan layout quickly
Need not be to scale
Draw with color pencils or dry-erase markers
VDD
Vin
Vout
GND
Wiring Tracks
A wiring track is the space required for a wire
4 λ width, 4 λ spacing from neighbor = 8 λ pitch
Transistors also consume one wiring track
Well spacing
Wells must surround transistors by 6 λ
Implies 12 λ between opposite transistor flavors
Leaves room for one wire track
Area Estimation
Estimate area by counting wiring tracks
Multiply by 8 to express in λ
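Track counting turns into an area estimate with one multiplication per dimension. The sketch below assumes an 8 λ pitch for every track; the track counts in the usage line are hypothetical and are not taken from any layout in these slides.

```python
TRACK_PITCH_LAMBDA = 8   # 4 lambda wire width + 4 lambda spacing

def cell_area(vertical_tracks, horizontal_tracks):
    """Return (width, height, area) of a cell in lambda and lambda squared."""
    width = vertical_tracks * TRACK_PITCH_LAMBDA
    height = horizontal_tracks * TRACK_PITCH_LAMBDA
    return width, height, width * height

# Hypothetical cell with 5 vertical and 6 horizontal wiring tracks.
width, height, area = cell_area(5, 6)
print(width, height, area)
```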
Example: O3AI
Sketch a stick diagram for O3AI and estimate area
Y = ((A + B + C) · D)'
Introduction to Semiconductor
Manufacturing Technology
Objective
Photo courtesy:
AT&T Archive
First Transistor and Its Inventors
10M
80486
Pentium
1M
80386
100K 8086 80286
[Figure: the same chip or die made with 0.35 μm, 0.25 μm and 0.18 μm technology, and wafer sizes of 150 mm, 200 mm and 300 mm.]
Smallest Known Transistor Made
by NEC in 1997
Upper gate
Lower gate
Dielectric
Source Drain
n+ n+
MODULE
+
GATE
CIRCUIT
DEVICE
G
S D
n+ n+
IC Design:
CMOS Inverter Vin
Vdd
(a)
NMOS PMOS
Vss
Vout
(b)
P-well
Metal 1 Polycide gate and local N-well
interconnection Contact
Metal 1, AlCu
W
PMD
n+ n+ STI p+ p+
P-Well
P-Epi
N-Well
(c)
P-Wafer
IC Design: Layout and Masks of CMOS Inverter
Mask 3, shallow trench isolation Mask 4, 7, 9, N-Vt, LDD, S/D Mask 5, 8, 10, P-Vt, LDD, S/D
Quartz substrate
A Mask and a Reticle
Dielectric Test
Metallization CMP
deposition
Wafers
Design
Fab Cost
Y_W = (good wafers) / (total wafers)
Die Yield
Y_D = (good dies) / (total dies)
Packaging Yield
Y_C = (good chips) / (total chips)
Overall Yield
Y_T = Y_W · Y_D · Y_C
How Does a Fab Make (or Lose) Money
Cost:
• 100% yield: 150 + 1200 + 1000 = $2350/wafer
• 50% yield: 150 + 1200 + 500 = $1850/wafer
• 0% yield: 150 + 1200 = $1350/wafer
Sale:
~200 chips/wafer
~$50/chip (low-end microprocessor in 2000)
*Cost of wafer, chips per wafer, and price of chip vary; the numbers here are chosen arbitrarily based on general information.
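These figures are enough to estimate where a fab breaks even. In the sketch below the fixed cost is the 150 + 1200 from the list above, and the yield-dependent term is modelled as $5 per good chip, a value inferred from the 1000 and 500 entries at 200 chips per wafer rather than stated in the source. The function names are illustrative.

```python
import math

CHIPS_PER_WAFER = 200
PRICE_PER_CHIP = 50
FIXED_COST = 150 + 1200          # per wafer, independent of yield
COST_PER_GOOD_CHIP = 5           # inferred: 1000 / 200 at 100% yield

def cost_per_wafer(yield_fraction):
    return FIXED_COST + yield_fraction * CHIPS_PER_WAFER * COST_PER_GOOD_CHIP

def revenue_per_wafer(yield_fraction):
    return yield_fraction * CHIPS_PER_WAFER * PRICE_PER_CHIP

def profit_per_wafer(yield_fraction):
    return revenue_per_wafer(yield_fraction) - cost_per_wafer(yield_fraction)

# Break-even yield: revenue equals cost.
break_even = FIXED_COST / (CHIPS_PER_WAFER * (PRICE_PER_CHIP - COST_PER_GOOD_CHIP))
print(break_even, profit_per_wafer(1.0), profit_per_wafer(0.0))
```

Under these assumed numbers the fab loses the full fixed cost on a wafer with no yield and only turns a profit once yield passes the break-even fraction.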
Almost no yield
Throughput
Number of wafers able to process
Fab: wafers/month (typically 10,000)
Tool: wafers/hour (typically 60)
Y = 1 / (1 + DA)^n
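The yield relations can be evaluated numerically to see how strongly die size matters. The slide does not define its symbols, so the sketch below assumes D is the killer defect density, A is the die area and n is the number of processing steps; the numeric values in the usage lines are arbitrary.

```python
import math

def die_yield(defect_density, die_area, steps):
    """Y = 1 / (1 + D*A)^n with the assumed meanings of D, A and n."""
    return 1.0 / (1.0 + defect_density * die_area) ** steps

def overall_yield(wafer_yield, die_yield_value, packaging_yield):
    """Y_T = Y_W * Y_D * Y_C."""
    return wafer_yield * die_yield_value * packaging_yield

small_die = die_yield(defect_density=0.01, die_area=1.0, steps=10)
large_die = die_yield(defect_density=0.01, die_area=4.0, steps=10)
print(small_die, large_die, overall_yield(0.9, 0.8, 0.95))
```

For a fixed defect density and step count, the larger die always yields less, which is the point of relating yield to die size.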
Yield and Die Size
Killer Defects
Die
Test die
Illustration of a Production Wafer
Scribe Lines
Test
Structures
Dies
Clean Room
[Figure: number of particles per ft3 (0.1 to 10,000) versus particle size in microns (0.1 to 10), with one line each for Class M-1, 1, 10, 100, 1,000, 10,000 and 100,000.]
Definition of Airborne Particulate
Cleanliness Class per Fed. Std. 209E
Class    Particles/ft3 at particle size
         0.1 μm   0.2 μm   0.3 μm   0.5 μm   5 μm
1        35       7.5      3        1
10       350      75       30       10
1000                                1000     7
10000                               10000    70
Effect of Particles on Masks
Particles
on Mask
Stump Hole on
on +PR PR
Film Film
Substrate Substrate
Effect of Particle Contamination
Ion Beam
Dopant in PR
Particle
Photoresist
Screen Oxide
Partially Implanted Junctions
Cleanroom Structure
Makeup Air Makeup Air
Fans
< Class 1
Class 1000
Process Process
Tool Class 1000 Tool
Entrance Shelf of
Gloves
To
Cleanroom
Wash/Clean
Stations
Shelf of
Storage Gloves
Benches
IC Fabrication Process Module
Photolithography
PR Stripping PR Stripping
RTA or Diffusion
Illustration of Fab Floor
Equipment Areas Process Bays
Corridor
Service Area
Process and
metrology
tools
Service Area
Quartz
Tube
Gas flow
Distance
Vertical Furnace
Process Chamber
Heaters
Wafers
Tower
Schematic of a Track Stepper
Integrated System
Prep
Chamber Spin Coater Chill Plates
Wafer
Stepper
Wafer
Chill Plates Developer Hot Plates
Movement
Cluster Tool with Etch and Strip
Chambers
PR Strip PR Strip
Chamber Chamber
Etch Etch
Chamber Chamber
Transfer
Robot
Chamber
PECVD
Chamber
Robot Transfer
Chamber
Ti/TiN Ti/TiN
Chamber Chamber
Transfer
Robot
Chamber
Equipment Area
Process Area
Service Area
Wafer Loading Doors
Test Results
Failed die
Chip-Bond Structure
200
Wire Bonding
Bonding Pads
IC Chip Packaging
Chip
Bonding
Pad
Pins
Chip with Bumps
Bumps
Flip Chip Packaging
Bumps
Chip
Socket Pins
Bump Contact
Bumps
Chip
Socket Pins
Heating and Bumps Melt
Bumps
Chip
Socket Pins
Flip Chip Packaging
Chip
Socket Pins
Molding Cavity for Plastic Packaging
Top Chase Molding Cavity
Lead Frame
Chip Bond Metallization
Pins
Bottom Chase
Ceramic Seal
Ceramic Cap
Cap Seal
Metallization Layer 2 Layer 2
Pins