Advance Memory
[Figure: MOSFET cross-section with gate (+VG), source and drain (+VD); channel depletion width wC, source and drain depletion widths wS and wD]
[Figure: Shrinking of MOSFETs. Min. feature [µm] (log scale, 10 to 1E-3) versus year (1950 to 2030), with forecast. Shrink of ~ 13 % / year = factor √2 / 3 years = factor 2 in area shrink / 3 years]
4.1 Shrinking
Advantages
Disadvantages
History:
1960: 25 µm
2000: 180 nm
2005: 90 nm
2008: 65 nm
2010: 32 nm
2012: 22 nm
2014: 14 nm
Since the first MOSFET (1960) the feature size has shrunk by about 13 % per year, for 50 years now
Why?
Economics: a shrink of 15 % reduces the fabrication costs per device by approximately 50 % (see chap. 2 and exercise 1)
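The shrink rate quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "~ 13 % per year" means dividing the feature size by a constant factor S = 1.13 each year; the function names are illustrative and not part of the lecture.

```python
# Sketch: compound shrink with a constant yearly factor.
# Assumption: "~13 % per year" is read as dividing the size by S = 1.13 per year.
S_PER_YEAR = 1.13

def linear_shrink(years, s=S_PER_YEAR):
    """Factor by which a length has shrunk after `years`."""
    return s ** years

def feature_size_um(start_um, years, s=S_PER_YEAR):
    """Feature size in µm after `years`, starting from `start_um`."""
    return start_um / linear_shrink(years, s)

three_years = linear_shrink(3)
print(f"linear shrink in 3 years: {three_years:.2f}")       # close to sqrt(2)
print(f"area shrink in 3 years:   {three_years ** 2:.2f}")   # close to 2
print(f"25 um after 50 years:     {feature_size_um(25.0, 50) * 1000:.0f} nm")
```

Starting from the 25 µm of 1960, fifty years at this rate end in the tens-of-nanometres range, which is the order of magnitude of the 2010 entry in the history list.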
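[Figure: Discretes and integration. Gate, source and drain with channel length L; length L, width w and area A compared with their shrunk counterparts L´, w´, A´ for a scaling factor S]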
Shrinking dimensions induces: better dynamic behavior (higher speed, lower power)
+ better economics (more devices per wafer)
[Figures: output characteristics and sub-threshold behavior]
Shrinking dimensions induces: worse static behavior (higher leakage, lower reliability)
Empirical relationship (discovery study, 1980)
[Figure: MOSFET cross-section with gate (+VG), source and drain (+VD); channel depletion width wC, source and drain depletion widths wS and wD]
Short-channel effects appear if the length of the Source/Drain depletion zones is no longer small compared to the channel length
These depletion charges, which are not gate-controlled, change the gate-defined threshold voltage
Year   L [µm]   rj [µm]   dox [Å]   V [V]   wS [µm]   wD [µm]   rj·dox·(wS+wD)²   Lmin [µm]
1977   6        1         1000      5       0.26      0.6       740               3.7
1979   3.5      0.7       700       5       0.26      0.6       362               2.9
2000   0.15     0.04      40        1.5     0.26      0.3       0.6               0.34
1 Ångström = 0.1 nm
[Figure: min. feature versus year (1960, 1974, 1977, 1979, 2000) compared with the calculated values from the table; labelled values 8.6 µm and 6.4 µm]
Since 1985 the channel length has been scaled more aggressively; short-channel behavior is partly suppressed by technological tricks
There exists no (electrical) lower limit for a short-channel device if junction depth rj, oxide thickness dox and doping are properly scaled down
Long channel: Gradual Channel Approximation
* the surface potential φS, determined by the gate voltage, is constant all over the channel
* the S/D depletion zones are negligible compared to the gate length
Short channel: Charge sharing model
Assumption: the surface potential φS is still constant, but the depletion charge is shared by the Source/Drain depletion zones and the gate depletion
-> the threshold voltage is reduced due to the already depleted Source/Drain depletion zones
This model is a good description as long as the S/D depletion zones do not occupy a considerable part of the channel
Short channel: Drain-induced Barrier Lowering model (DIBL)
Assumption: the surface potential φS varies along the channel due to the S/D potentials
-> the threshold voltage is reduced due to a Source/Drain induced barrier lowering
This model is a good description if the S/D depletion zones occupy a considerable part of the channel
[Figures for each case: gate, Source and Drain at VDS; conduction band edge Ec and surface potential φS along the channel between xS and xD]
These analytical models result in equations, which may be adjusted for practical use
In contrast, with increasing influence of 2-dim or 3-dim effects, analytical models must be replaced by numerical simulations
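2) the shared (S/D symmetrical) triangles can be calculated by trigonometry, with ΔL = (L − L*)/2:
$$ (w_{dep} + r_j)^2 = (\Delta L + r_j)^2 + w_{dep}^2 $$
$$ \Delta L = \frac{L - L^*}{2} = r_j\left(\sqrt{1 + \frac{2 w_{dep}}{r_j}} - 1\right), \qquad w_{dep} = \sqrt{\frac{2\varepsilon_0\varepsilon_{Si}\, 2\varphi_{bulk}}{q N_A}} $$
3) the threshold voltage shift is then (with C_gate = C´´_gate · w_gate · L):
$$ \Delta V_{th} = V_{th}^{LC} - V_{th}^{SC} = \frac{q N_A w_{dep} w_{gate} L}{C_{gate}} - \frac{q N_A w_{dep} w_{gate} \frac{L + L^*}{2}}{C_{gate}} = \frac{q N_A w_{dep}}{C''_{gate}}\left(1 - \frac{L + L^*}{2L}\right) = \frac{q N_A w_{dep} r_j}{L\, C''_{gate}}\left(\sqrt{1 + \frac{2 w_{dep}}{r_j}} - 1\right) $$
-> the long channel threshold voltage decreases with shrinking channel length L, independent of VDS at low VDS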
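The 1/L dependence of the charge-sharing shift can be made concrete numerically. The following is a minimal sketch, assuming the symmetric Yau result ΔVth = q·NA·wdep·rj / (L·C´´gate) · (√(1 + 2·wdep/rj) − 1); the device values (doping, bulk potential, junction depth, oxide thickness) are hypothetical examples and not taken from the lecture.

```python
import math

Q = 1.602e-19      # elementary charge [C]
EPS0 = 8.854e-14   # vacuum permittivity [F/cm]
EPS_SI = 11.7      # relative permittivity of silicon
EPS_OX = 3.9       # relative permittivity of SiO2

def depletion_width_cm(n_a_cm3, phi_bulk_v):
    """w_dep at strong inversion (surface potential = 2 * phi_bulk)."""
    return math.sqrt(2 * EPS0 * EPS_SI * 2 * phi_bulk_v / (Q * n_a_cm3))

def delta_vth_charge_sharing(l_cm, r_j_cm, d_ox_cm, n_a_cm3, phi_bulk_v):
    """Threshold voltage reduction V_th(LC) - V_th(SC) in volts (symmetric case)."""
    w_dep = depletion_width_cm(n_a_cm3, phi_bulk_v)
    c_gate = EPS0 * EPS_OX / d_ox_cm                      # gate capacitance per area [F/cm^2]
    delta_l = r_j_cm * (math.sqrt(1 + 2 * w_dep / r_j_cm) - 1)
    return Q * n_a_cm3 * w_dep * delta_l / (l_cm * c_gate)

# Hypothetical example device: N_A = 1e17 cm^-3, phi_bulk = 0.41 V,
# r_j = 0.2 µm, d_ox = 10 nm; lengths are given in cm.
for l_um in (2.0, 1.0, 0.5):
    dv = delta_vth_charge_sharing(l_um * 1e-4, 0.2e-4, 10e-7, 1e17, 0.41)
    print(f"L = {l_um} um: delta V_th = {dv * 1000:.0f} mV")
```

Halving L doubles the shift in this model, because wdep, rj and C´´gate are held fixed.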
Empirical relationship: $$ L_{min}[\mu m] = 0.41 \left( r_j[\mu m] \cdot d_{ox}[\text{Å}] \cdot (w_S[\mu m] + w_D[\mu m])^2 \right)^{1/3} $$
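1) keeping φS = 2 φbulk approximately constant -> wdep ~ const, with
$$ w_{dep} = \sqrt{\frac{2\varepsilon_0\varepsilon_{Si}\, 2\varphi_{bulk}}{q N_A}} $$
2) the threshold voltage shift is then:
$$ \Delta V_{th} = V_{th}^{LC} - V_{th}^{SC} = \frac{q N_A w_{dep} r_j}{2 L C''_{gate}} \left[ \left(\sqrt{1 + \frac{2 x_S}{r_j}} - 1\right) + \left(\sqrt{1 + \frac{2 x_D}{r_j}} - 1\right) \right] $$
-> the long channel threshold voltage now decreases with shrinking channel length L and with VDS (via φS or xD)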
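Short channel, the contribution of S/D depletion dominates:
(very short channel and/or VDS high)
1) the gate-controlled charge is assumed to reduce to a triangle
[Figure: MOSFET cross-section with source and drain depletion zones wS and wD meeting under the gate; remaining gate-controlled depletion width wdep]
[Equation garbled in the source: a relation between wdep, rj and L, independent of VDS]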
Advanced MOSFETs and Novel Devices Prof.Dr.W.Hansch, Dr. J. Biba
Modul 1254 AdMOS, 4- 14
4 The Short-Channel MOSFET
4.2 Short Channel Effects
The Charge Sharing Model (Yau-Model)
Narrow channel
1) in reality the gate voltage also depletes the regions on both sides of the channel, which means an unintentional increase in channel width
The gate charge, which in the long channel case is enough to induce inversion, is no longer sufficient, because a part of this gate charge depletes the channel sides.
2) assuming that the additional depletion is approximately cylindrical, the total depletion charge is:
$$ Q^*_{dep} = q N_A w_{dep} w_{gate} L \left(1 + \frac{\pi w_{dep}}{2 w_{gate}}\right) $$
3) In consequence, the threshold voltage is increased to:
$$ V_{th} = V_{FB} + 2\varphi_{bulk} + \frac{\sqrt{2\varepsilon_0\varepsilon_{Si}\, q N_{dop}\, 2\varphi_{bulk}}}{C''_{ox}} \left(1 + \frac{\pi w_{dep}}{2 w_{gate}}\right) $$
[Figure: threshold voltage]
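How strongly the side depletion matters depends only on the ratio wdep / wgate. The sketch below evaluates the geometric factor 1 + π·wdep / (2·wgate), which follows from the assumption of two quarter-cylinders of radius wdep at the channel sides; the depletion width of 0.1 µm is a hypothetical example value.

```python
import math

def narrow_channel_factor(w_dep, w_gate):
    """Relative increase of the depletion charge: 1 + pi * w_dep / (2 * w_gate).

    Both widths must use the same unit.
    """
    return 1.0 + math.pi * w_dep / (2.0 * w_gate)

W_DEP_UM = 0.1  # hypothetical depletion width [µm]
for w_gate_um in (10.0, 1.0, 0.5):
    factor = narrow_channel_factor(W_DEP_UM, w_gate_um)
    print(f"w_gate = {w_gate_um} um: depletion charge x {factor:.3f}")
```

For wide gates the factor approaches 1 and the long-channel threshold voltage is recovered; the correction grows as the gate width approaches the depletion width.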
The opposite behavior of the two threshold voltage shifts cannot be used for compensation, since both effects are nonlinearly connected
In addition, the channel length is fixed and defined by technology, while the gate width needs to be flexible in circuit design
I-V characteristics:
Since short channel and narrow channel effects cannot be separated, the simple approach Vth* = VthLC − VthSC + VthNC cannot be used in the I-V characteristics
In practice, complicated analytical models must be assumed to calculate Vth* = f(N, L, w, rj, VDS), or 3-dim simulations are needed
Long channel:
Channel formation (strong inversion) starts when φS = 2 φbulk
Approach: $V_{th,SC} = V_{th,LC} - A_{pene} V_{DS} - B_{neigh}$
where Apene is a so-called penetration constant (how effectively VDS penetrates into the channel)
and Bneigh a so-called neighbouring effect, B = f(L, xj, dox, Ndop, ...)
For Apene several approaches exist:
$A_{pene} = \frac{6 d_{ox}}{w_{dep}} \exp\left(-\frac{\pi L}{4 w_{dep}}\right)$
$A_{pene} = 1/L$
$A_{pene} = L^{-3}$, as it is used in SPICE
-> the long channel threshold voltage decreases with shrinking channel length L and linearly with VDS
Characterization of the DIBL-model:
advantage: In principle the 2/3-dim Poisson equation must be solved. Because solving Poisson's equation is done in any numerical simulation, the DIBL-model is very powerful when strong inhomogeneities (e.g. channel doping) exist
disadvantage: the physical description is nearly lost; mainly empirical trends are used, which are fitted to (a special ?) reality
[Figure: Simulation, 700 nm and 1.7 µm]
This effect is called: VT roll-off
In addition to the small-geometry effects, all high-field effects have a serious impact on short channel MOSFETs:
1 gate injection
2 gate break-through
3 Punch-through
Parasitics
Source-Drain resistors
To avoid short-channel effects (2/3-dim geometry, hot electrons (high-field) and transport degradations), scaling laws are investigated
Due to several restrictions in the past (and future), e.g. keeping voltage levels constant or technological capabilities,
the "ideal" Constant-Field Scaling was modified by keeping Poisson's equation invariant
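With the Poisson equation the electrical potentials in a semiconductor device are calculated:
$$ \frac{d^2\varphi(x)}{dx^2} = -\frac{\rho(x)}{\varepsilon_0\varepsilon_{Si}} = \frac{q N_{dop}}{\varepsilon_0\varepsilon_{Si}} $$
The Poisson equation is invariant when transforming:
geometries: x* = x/a with a factor a
potentials: φ* = φ/b with a factor b
$$ \frac{d^2\varphi^*}{dx^{*2}} = \frac{d^2(\varphi/b)}{d(x/a)^2} = \frac{a^2}{b}\frac{d^2\varphi}{dx^2} = \frac{a^2}{b}\frac{q N_{dop}}{\varepsilon_0\varepsilon_{Si}} = \frac{q N^*_{dop}}{\varepsilon_0\varepsilon_{Si}} \quad\Rightarrow\quad N^*_{dop} = \frac{a^2}{b} N_{dop} $$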
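The invariance condition fixes how the doping must follow the geometry and voltage factors. The sketch below applies x* = x/a, φ* = φ/b and N* = (a²/b)·N to one length, one potential and one doping level; the function name and the example numbers are illustrative assumptions, not values from the lecture.

```python
def scale_device(length, potential, doping, a, b):
    """Apply x* = x/a, phi* = phi/b and N* = (a**2 / b) * N.

    Returns the scaled quantities plus the resulting field (potential / length).
    """
    scaled_length = length / a
    scaled_potential = potential / b
    return {
        "length": scaled_length,
        "potential": scaled_potential,
        "doping": doping * a ** 2 / b,
        "field": scaled_potential / scaled_length,
    }

# Constant-field scaling: voltages shrink with the geometry (a = b).
print(scale_device(1.0, 5.0, 1e16, a=2, b=2))
# Constant-voltage scaling: voltages are kept (b = 1), so the field rises by a.
print(scale_device(1.0, 5.0, 1e16, a=2, b=1))
```

With a = b the field is unchanged and the doping rises by a; with b = 1 the doping must rise by a² and the field by a, which is why constant-voltage scaling aggravates high-field effects.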
Subthreshold scaling:
It was found empirically (1980) that short-channel behavior can be avoided and the subthreshold characteristics are not degraded if the following relationship is fulfilled:
$$ L_{min}[\mu m] = 0.41 \left( r_j[\mu m] \cdot d_{ox}[\text{Å}] \cdot (w_S[\mu m] + w_D[\mu m])^2 \right)^{1/3} $$
[Figure: MOSFET cross-section with gate (+VG) on an oxide of thickness dox, source and drain (+VD) with junction depth rj; depletion widths wC, wS and wD]
Most flexible scaling, because only the product of parameters must be scaled
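The empirical relationship is easy to evaluate. The sketch below reads it as Lmin [µm] = 0.41 · (rj [µm] · dox [Å] · (wS [µm] + wD [µm])²)^(1/3) and uses the 1977 and 1979 parameter sets listed earlier in this chapter; the function name is illustrative.

```python
def l_min_um(r_j_um, d_ox_angstrom, w_s_um, w_d_um):
    """Minimum channel length [µm] from the empirical relationship.

    Units follow the formula: r_j, w_S, w_D in µm, d_ox in Ångström.
    """
    product = r_j_um * d_ox_angstrom * (w_s_um + w_d_um) ** 2
    return 0.41 * product ** (1.0 / 3.0)

# (r_j, d_ox, w_S, w_D) as listed in the table of this chapter
generations = {
    1977: (1.0, 1000.0, 0.26, 0.6),
    1979: (0.7, 700.0, 0.26, 0.6),
}
for year, params in generations.items():
    print(year, round(l_min_um(*params), 1))
```

Because only the product under the cube root enters, a thinner oxide can compensate a deeper junction and vice versa, which is what makes this the most flexible scaling rule.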
DRAM complexity: 1k 4k 16k 64k 256k 1M 4M 16M 64M 256M 1G 1G* 4G 16G 64G 256G | Scaling S
Year: 1970/71 1973 1976/77 1979/80 1982 1985 1988 1993 1995 1997 1999 2001 2003 2006 2009 2012 | every 3 years
Process: p-chan Si-gate, n-chan Si-gate, n-chan 2-poly, n-chan 2-poly; forecast: ITRS roadmap 1997
DRAM cell structure: 3T 1T 1T
Min. feature [µm]: 8-10 7-8 6-7 3.5 2.0 1.0 0.7 0.5 0.35 0.25 0.18, then in [nm]: 150 130 100 70 50 | S = 1/√2 = 0.7
(channel length): 3-4 2-2.5 1.5 0.8 0.5 0.3
junction depth [nm] 2500 2000 1000 700 450 250 150 150 100-50 72-36 60-30 52-26 40-20 30-15 20-10
gate oxide thick [nm] 120 120 100 70, 70 45 30-20 15-10 15-10 5-4 4-3 3-2 3-2 2-1.5 < 1.5 < 1.0
gate width µm 5-7 3-4 2 1.2 0.5 0.5
interconnect width µm 5 4 3 2 1 1
interconnect thick µm 1 1 0.9 0.75 0.6 0.6
cell area [µm²] 3700 900 450 170 30 10 3.9 1.5 0.56 0.22 0.14 0.09 0.036 0.014 0.006 S²/1.3
Chip size [mm²] 13 19 18 21 38 55 80 145 190 280 400 445 560 790 1120 1580 1 / 1.5
Udd [V] 18/12 12 12,5 5, 5 5 5 5-3.3 3.3 2.5- 1.8 1.8-1.5 1.5 1.2 1.2-0.9 0.9-0.6 0.6-0.5
Power dissipation [W] 0.2 70 90 110 130 160 170 175
comments: HMOS, HMOS, HMOS II
Note:
Usually several years pass between experimental prototype -> industrial prototype -> high volume fabrication (~ 1 mill. chips/year).
Depending on the source, the correlation between year and properties may be shifted by 1-3 years
Note:
A factor 4 increase in chip complexity is achieved by: - decreasing size by 13 % per year -> factor √2 in 3 years -> factor 2 in area
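The note lists only the lithographic contribution. A minimal sketch of how the factor 4 could be composed follows, assuming that the scaling column of the DRAM table is read as an extra factor 1.3 from cell design (cell area "S²/1.3") and a factor 1.5 from chip size growth per generation; both readings are assumptions about the table, not statements from the lecture.

```python
# Sketch: composition of the complexity gain per DRAM generation (3 years).
LITHO_AREA_GAIN = 2.0   # from the note: factor 2 in area every 3 years
CELL_DESIGN_GAIN = 1.3  # assumption: read from the "S²/1.3" cell-area entry
CHIP_AREA_GAIN = 1.5    # assumption: read from the chip-size scaling entry

def complexity_gain(litho=LITHO_AREA_GAIN, cell=CELL_DESIGN_GAIN, chip=CHIP_AREA_GAIN):
    """Product of the three contributions to the number of cells per chip."""
    return litho * cell * chip

print(f"complexity gain per generation: {complexity_gain():.1f}")
```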
[Figure: Shrinking of MOSFETs. Min. feature [µm] (channel length), junction depth (values x 10), oxide thickness (values x 100) and voltage (U = 12 V, then U = 5 V quasi-const.) versus year. Fit for the channel length: S = 1.13]
1970-1976:
* shrinking of channel length
* no shrink of voltage, oxide thickness
-> discovery of short channel effects
-> voltage reduction, increased oxide scaling
1978-1987:
* const. voltage 5 V (TTL level)
* more vertical shrink to come back to GCA
! Not const.-voltage scaling !
1995 - 2000:
* variable voltage
* ~ quasi-const. scaling
Due to the technological capabilities of various companies, short-term deviations from the long-term S = 1.13 are the result
Besides the general Const.-Field scaling, the scaling models may change from generation to generation
We have seen:
"Scaling" is the theoretical instruction how to avoid SCE (Short Channel Effects) in the I-V characteristics
By thermal oxidation of silicon (Si + O2 -> SiO2 at 1000 °C) the first working MOSFET became possible with SiO2 (1960)
Some properties:
high temperature stable
good adhesion to silicon and metals
can be etched selectively to silicon and metals
best interface to silicon: Dit < 10^8 (eV cm)^-1 possible
very good isolator: bulk resistivity 10^15 Ω cm
very high breakthrough field: > 10 MV/cm
dielectric constant: 3.9
From scaling theory a continuous shrinking of the oxide thickness is necessary to avoid short channel effects (SCE)
The gate metal is poly-silicon, highly doped together with the Source/Drain doping
[Figure: p-MOSFET with p- Source and p- Drain in an n- substrate, SiO2 gate oxide]
Boron (due to its high solubility in silicon) is used for the p-MOSFET, implanted as B+ or BF2+
But boron has a higher solubility in SiO2 -> boron from the gate poly-Si diffuses into the SiO2 and also into the silicon channel
These positive charges within the SiO2 and also within the channel shift the threshold voltage of the MOSFET and increase scattering, thus reducing mobility and on-current
This effect, called boron penetration, was observed when the gate oxide thickness reached about 4 nm and below (beginning 1990)
Based on the dualism of light (in some experiments light behaves like an electromagnetic wave, in other experiments like a particle),
de Broglie postulated that, conversely, particles may be described as waves with a wavelength depending on velocity (momentum):
Dualism of wave and particle
wave: $\lambda = c/\nu$
Photons: $E = h\nu = mc^2 \;\Rightarrow\; \lambda = \frac{h}{mc}$; particles: $\lambda = \frac{h}{mv} = \frac{h}{p}$
for electrons: $\lambda = \frac{1.23\,\text{nm}}{\sqrt{E_{kin}[\text{eV}]}}$
for thermal electrons with E = 0.026 eV the wavelength is ~ 7.6 nm
[Figure: energy W versus coordinate x for metal, SiO2 and the quantum well of the channel; energy level of the electrons in the channel and wave function of the electrons with non-zero transmission probability]
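The electron wavelengths quoted here follow from λ = h / p with p = √(2·m·E_kin). The sketch below is a minimal check, assuming the free-electron rest mass (in silicon an effective mass would change the numbers); the function name is illustrative.

```python
import math

H = 6.62607015e-34       # Planck constant [J s]
M_E = 9.1093837015e-31   # free-electron rest mass [kg] (assumption: no effective mass)
EV = 1.602176634e-19     # 1 eV in joule

def de_broglie_nm(e_kin_ev):
    """Non-relativistic de Broglie wavelength in nm for a kinetic energy in eV."""
    momentum = math.sqrt(2.0 * M_E * e_kin_ev * EV)
    return H / momentum * 1e9

print(f"1 eV electron:     {de_broglie_nm(1.0):.2f} nm")
print(f"0.026 eV electron: {de_broglie_nm(0.026):.1f} nm")
```

A thermal electron therefore has a wavelength several times larger than a gate oxide of a few nanometres, which is why tunneling through the oxide cannot be neglected.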
Because of the finite barrier height of SiO2 (3.1 eV), the quantum-mechanical wave function
of electrons has a non-zero transmission probability through the gate oxide.
This results in leakage current and long-term degradation of the gate oxide
From the material properties of SiO2 we calculate: very high breakthrough field: > 10 MV/cm = 1 V/nm
With a reduced field of 0.5 V/nm the tunneling current due to Fowler-Nordheim is ~ 10^-9 A/cm²
The increase of tunneling current is ~ 1 order of magnitude per 0.25 nm (1 monolayer)
Solution: We are searching for new gate dielectrics with k > 3.9 (SiO2) -> high-k dielectrics
Because $C = \frac{\varepsilon_0 \varepsilon_{ox} A}{t_{ox}}$, we achieve the same high capacitance C with higher ε_ox and thicker t_ox
Example: ε_SiO2 = 3.9; if we use ε_ox = 20, we can use t_ox = t_SiO2 · ε_ox / ε_SiO2 ≈ 5 · t_SiO2
-> if t_SiO2 < 1 nm is needed from the roadmap, we can use t_ox < 5 nm, which is too thick for tunneling
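The thickness gain of a high-k layer follows directly from the plate-capacitor formula. The sketch below converts a required SiO2 thickness into the physical thickness of a dielectric with higher permittivity and checks that the capacitance per area is unchanged; function names are illustrative, and the interfacial SiOx layer that forms in practice is ignored here.

```python
EPS0 = 8.854e-12   # vacuum permittivity [F/m]
EPS_SIO2 = 3.9     # relative permittivity of SiO2

def physical_thickness_nm(t_sio2_nm, eps_high_k):
    """Thickness of a high-k layer with the same capacitance as t_sio2_nm of SiO2."""
    return t_sio2_nm * eps_high_k / EPS_SIO2

def capacitance_per_area(eps_r, t_nm):
    """Plate-capacitor capacitance per area [F/m^2]."""
    return EPS0 * eps_r / (t_nm * 1e-9)

t_high_k = physical_thickness_nm(1.0, 20.0)
print(f"1 nm SiO2 corresponds to {t_high_k:.2f} nm of a k = 20 dielectric")
print(capacitance_per_area(EPS_SIO2, 1.0), capacitance_per_area(20.0, t_high_k))
```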
Possible candidates:
most promising
candidates
A new gate dielectric material must fulfill many requirements (as is done by SiO2):
3) band position compared to Si -> high barrier for electrons in the conduction band + high barrier for holes in the valence band
5) amorphous or single-crystalline lattice with no phase changes between room temperature and 1050 °C, 30 sec
(if the gate is fabricated first, then S/D. But fabrication of a dummy gate, then S/D (high temperature), then the gate may be possible)
9) dry etch possible (-> all elements should build volatile compounds)
[Figure: band positions of the candidate materials relative to the conduction band and valence band of Si, with k-values 3.9, 8-11, 7, ~12, ~25, 12-18, 14-25, 22-40]
Possible solutions:
* further search and development of materials
* replacement gate (also good for metal gate)
ZrO2, k> 15
after temperature treatment 900°C, 10 sec
interfacial layers of SiOx are created to Si and poly-Si
Leading candidate materials in 2005: HfO2 (keff ~ 15 - 30); HfSiOx (keff ~ 12 - 16). Materials, process and integration issues to solve
And then in 2007 INTEL presented HfOSiN
Due to shrinking, the resistance of the interconnects themselves increases, and due to narrowing, the parasitic capacitances increase.
Both increase the switching times, although the single transistor is getting faster and faster.
[Figure: delay [psec] (0 to 10) versus generation [nm] (1000 to 100)]
(d, t ~ W, see next page)
Example: 100 nm generation
$$ C \approx 6\,\varepsilon_0 \varepsilon_{SiO2} L = 6 \cdot 8.85 \cdot 10^{-14}\,\frac{F}{cm} \cdot 3.9 \cdot 100\,\mu m \approx 2 \cdot 10^{-14}\,F $$
All together: RC = 150 Ω · 2·10^-14 F = 3 psec
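The interconnect example can be reproduced numerically. This is a minimal sketch, assuming the estimate C ≈ 6·ε0·ε_SiO2·L for a line of length L = 100 µm and taking the line resistance of 150 Ω from the example as given; the function names are illustrative.

```python
EPS0_F_PER_CM = 8.85e-14   # vacuum permittivity [F/cm], value used in the example

def line_capacitance_f(length_um, eps_r=3.9, geometry_factor=6.0):
    """Estimated line capacitance [F]: geometry_factor * eps0 * eps_r * L."""
    length_cm = length_um * 1e-4
    return geometry_factor * EPS0_F_PER_CM * eps_r * length_cm

def rc_delay_ps(r_ohm, c_f):
    """RC time constant in picoseconds."""
    return r_ohm * c_f * 1e12

c_line = line_capacitance_f(100.0)
print(f"C  = {c_line:.2e} F")
print(f"RC = {rc_delay_ps(150.0, c_line):.1f} ps")
# A low-k dielectric with k = 2.5 (hypothetical value) shortens the delay proportionally:
print(f"RC = {rc_delay_ps(150.0, line_capacitance_f(100.0, eps_r=2.5)):.1f} ps")
```

Since the delay is linear in the dielectric constant, replacing SiO2 by a material with lower k reduces the RC delay by the ratio of the k values.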
[Figure: copper conductors (8 levels) and copper plugs]
Parameter / Year     2003       2005       2008       2011       2014
Technology (nm)      120        100        70         50         35
# of Transistors     95.2M      190M       539M       1523M      4308M
Clock Frequency      1724 MHz   2000 MHz   2500 MHz   3000 MHz   3600 MHz
Chip Area (mm2)      372        408        468        536        615
At sub-0.25um feature sizes, 90% of the total capacitance is dominated by line-to-line capacitance.
Increasing clock speeds and reduced sizes require new dielectrics with k values of less than 3.
14nm node
Fin pitch: 48 nm
Contacted gate pitch: 64 nm
up to 12 levels metallization
Low-k Materials:
1) Vacuum/air exhibit the smallest possible dielectric constant, k = 1
Free, non-supported interconnects are not mechanically stable,
CMP is not possible (maybe with etching out the isolation as the last step)
Material                         Dielectric constant
Free space                       1
Aerogels                         1.5
Teflon AF                        2.1
Aromatic thermosets (SiLK)       2.6 – 2.8
Polyimides (organic)             3.1 – 3.4
Silicon dioxide                  3.9 – 4.5
Glass epoxy (PCBs)               5
Silicon nitride                  7.5
Alumina (package)                9.5
Silicon                          11.7
Since 2005 the most promising candidates are polymers, which create nanopores when annealed
* basic concept is to replace the strong Si-O bond by weaker bond types as provided by C, F
* basic concept is to create polymers, because during formation a lot of free space (ε = 1) is created
Table of polarization strength with covalent C-bonds, from: Miller et al., Macromolecules 23 (1990) 3865 (polarizability);
Pine, Organic Chemistry, 5th ed., 1987 (bond strength)
Bond                  C≡N    C≡C    C=C    C=O    O-H    C-H    C-O    C-F    C-C
Polarizability [Å³]   2.24   2.04   1.64   1.02   0.71   0.65   0.58   0.56   0.53
additional porosity: air gaps by space creators (porogens)
About 10 manufacturers are developing low-k materials; the market value is about 300 mill. € (2005)
status 2007
Equipment manufacturers try to force the development of CVD materials
Disadvantage: due to the temperature limitation T < 400 °C in the metal layers there is not much freedom for material development
- Black Diamond from Applied Materials (based on trimethylsilane, fabricated by Dow Corning, distributed by Air Products)
- CORAL, TOMCATS (porous SiO:C) from Novellus (based on tetramethylsilane, tetramethylcyclotetrasiloxane, distributed by Air Products)
- ORION (porous SiO:C, k > 2.2) from Trikon Technologies
Most suppliers of CVD tools offer processes for deposition of SiO2-based layers with k between 2.7 and 2.4; layers with k ~ 2.0 were demonstrated
Manufacturers of chemicals offer spin-on materials
organics: (based on C, O, H, F) (polymeric)
- SiLK (porous polymer, k > 2.2) from Dow Chemical
- GX-3 from Honeywell Electronic Materials (Sunnyvale, Calif.)
inorganics: (mainly based on Si, O, H, F) silicon-based systems with carbon (or methyl) groups and/or hydrogen attached (hydrogen silsesquioxane)
* advantage: better temperature stability, lower density compared to organics
- XLK from Dow Corning (porous version of FOx)
- Nanoglass E (porous SiO2) from Honeywell Electronic Materials
- porous silica material called MesoELK from Air Products (world's largest producer of SiF4)
- LKDxx (porous MSQ, k > 1.9) from JSR (Tokyo)
Silsesquioxane forms precise 3-dim structures (like bucky balls), and with radicals X
properties like polymerisation, solubilization, polarizability, stability, ... can be tailored.
Some of the polar Si-O bonds are replaced by less polar Si-X bonds
Formation of cross-linked oligomers (here an octamer), creating free space by cage-like arrangement
[Figure: silsesquioxane cage structures (Si-O cages with substituents X, H and CH3) and their cross-linking]
The Si-O-Si bond is named siloxane
If the X is:
- a hydrogen atom, then the crosslink is done by a bridging oxygen (HSQ)
- a methyl group, then the crosslink is done by a bridging CH2 (MSQ)
Properties:
Typically an additional thermally unstable polymer (called porogen, ~ 25 vol %) is added to the low-k polymer (the matrix).
At the formation temperature of the low-k network the unstable polymer decomposes and evaporates, leaving nanopores
[Figure: matrix (silsesquioxane network) with embedded porogen]
Matrix, typical: silsesquioxane
Porogen, typical: co-polymers
- poly(methyl methacrylate-co-dimethylaminoethyl methacrylate) = P(MMA-co-DMAEMA)
Typical Process:
[Figure: SSQ on wafer]
To achieve low contact resistance and shallow junctions selective epitaxial growth could be used
4.1 Shrinking
Economical pressure requires a lateral shrink (channel length) of ~ 13% / year -> area shrink of factor 2 every 3 years
Shrunk devices have the advantage of better dynamic performance, but the disadvantage of worse static behavior
The classical short channel effect appears if the Source/Drain depletion zones occupy most of the channel length
This results in a lowering of the threshold voltage and in I-V characteristics that depend on the S/D voltage
The classical short channel effect can be analyzed by two similar models:
The Charge-sharing model explains the lowering of threshold voltage by shared depletion zones of Gate and S/D
The DIBL model explains extreme lowering of threshold voltage by channel barrier lowering due to connected S/D depletion zones
This results in many effects like modifications in carrier transport and break-through phenomena
4.3 Scaling
Scaling describes the attempt to keep the electric fields the same in a shrunk device
The starting point of scaling is the simple, classical long-channel MOSFET without any modifications
Due to external requirements (as given by technological capability and circuit requirements) several scaling models exist:
Electrostatic scaling, with special conditions like Constant-Field Scaling, (Quasi)-Constant-Voltage Scaling, General Scaling
Subthreshold scaling
Scaling rules are overruled by addition of doping structures in the MOSFET structure
Today's and future MOSFET design is a highly sophisticated task of balancing all parameters to a good compromise