Stacked Nanosheet Fork Architecture For SRAM Design and Device Co-Optimization Toward 3nm
Abstract—This paper discusses SRAM scaling beyond the 5nm technology node and highlights the fundamental scaling limits due to FinFET and Gate all-around (GAA) technology. To compensate for expected gate pitch scaling slowdown below 42nm, several scaling boosters are needed to reduce the cell height. However, only limited scaling benefits can be achieved in FinFET and GAA technology. Therefore, a novel vertically stacked lateral nanosheet architecture using a forked gate structure is proposed showing superior performance and area scaling compared to FinFET and GAA devices. Moreover, this is achieved with limited additional processing complexity. The Fork architecture allows 20% SRAM area scaling at iso-performance and 30% performance increase at iso-area compared to FinFET beyond the 5nm technology node.
I. INTRODUCTION
Since the first generation of FinFET technology, SRAM bit cells have progressively scaled proportional to minimum contacted gate pitch (CGP) and Fin Pitch (FP), defining the bit cell width and height respectively (Fig. 1) [1-9]. Projecting a 0.625 scaling per node, this trend can be maintained down to a 5nm node targeted at 42nm CGP and 24nm FP. However, beyond these pitches the trend is not likely to be maintained due to an anticipated CGP scaling slowdown below 42nm caused by, among others, device short channel control loss, high contact resistance and SADP cliff for patterning [10]. Likewise, FP scaling has patterning limitations of SAQP and process constraints due to silicon fin etch and Replacement Metal Gate (RMG) stack fill. Vertically stacked lateral GAA devices, i.e. nanowires and nanosheets, are expected to be introduced at these tight pitches [11-13]. They offer better electrostatic control favoring CGP scaling but fail to provide benefits in scaling the cell height. In this paper, we therefore propose an alternative and novel vertically stacked lateral sheet device with forked gate structure overcoming these issues.
II. FUNDAMENTAL SRAM SCALING ARC
A. Traditional SRAM scaling following FinFET scaling
Fig. 2 shows the layout of the 111 and 122 bit cells with the height decomposed into critical Front End Of Line (FEOL) design rules such as Gate Cut (GC), Gate Extension over active fin (GE) and Dummy Fin Gate Tuck (DFGT) among others. In an advanced 5nm technology up to 60% of SRAM cell height is determined by GE and GC rules. Therefore, the key challenges for continued SRAM scaling are enabling aggressive GCs and reducing the GE rule, while still providing channel control. Beyond the 5nm node, GCs below 20nm are needed, which prove difficult to etch and refill combined with the high aspect ratio of the ever-increasing fin height (FH). On the other hand, reducing GE is limited by the margin needed for oxide gate stack and metal gate Work Function (WF) tuning in the conventional RMG process [14], combined with mask overlay margins (Fig. 3A). Moreover, as cell width scaling is proportional to CGP, any relaxation from CGP scaling across nodes will further constrain GE and GC for a target area.
B. SRAM scaling boosters to enable beyond 5nm
The need for an area scaling reduction in a generic standard cell logic below CGP=42nm can be partly compensated by track height reduction or fin depopulation [10]. In SRAM, this compensation requires breaking the traditional NxFP rule. One approach to reduce cell height is to downscale the GE by allowing a self-alignment solution which circumvents overlay margin (Fig. 3B) [15]. However, the GE remains twice as wide as the gate stack. Another approach is to allow a Metal Gate Cut (MGC), circumventing trench filling issues and reducing GE to a single gate stack plus overlay margin (Fig. 3C). A clear path in which both a self-aligned GC to fin and MGC can be combined has not been identified as both seem to be mutually exclusive. Another approach is to break the cell height design arc by allowing the dummy P-type Fin end to Tuck under the Gate Cut (FTGC) instead of the gate (Fig. 4). This reduces the P to N separation, allowing the GC margin to be increased (Fig. 5).
C. Novel stacked sheet architecture for hyper-scaling
The GE is an intrinsic SRAM scaling limitation for FinFET technology as it is needed to control the channel on each vertical side of the fin. GAA devices do not solve this issue and hence are not very attractive from an SRAM scaling perspective. Therefore, an alternative and novel vertically stacked lateral sheet device with forked gate structure, i.e. the Fork sheet, is proposed to continue scaling beyond 5nm. This device architecture, shown in Fig. 6, allows removing the GE margin altogether on one side of the device. The essential benefit of the sheet device is that most of the channel control is provided by WF metals present in the y-direction (along the sheet) rather than the z-direction. This allows for GE optimization leading to the Fork architecture. Furthermore, without additional process complexity, the gate edge can self-align with the device, circumventing overlay margin (Fig. 7). Consequently, a single mask could be used to fabricate the same. Moreover, for extended channel control, an isotropic dielectric etch after the nanosheet release can be introduced. This allows the gate to partially wrap around the sheet side face (Fig. 7 inset c*).
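The three gate-extension options of Section II.B can be put into numbers with a minimal sketch, assuming the layer thicknesses and the 3nm overlay annotated in Figs. 3 and 5 (0.5nm IL, 1.5nm HK, 3nm HK cap, 2nm WFM). The function and option names are illustrative and not from the paper.

```python
# Layer thicknesses (nm) of the RMG gate stack, as annotated in Fig. 3.
RMG_STACK_NM = {"IL": 0.5, "HK": 1.5, "HK_cap": 3.0, "WFM": 2.0}
OVERLAY_NM = 3.0  # mask overlay margin annotated in the figures


def gate_stack_nm() -> float:
    """Total thickness of a single gate stack."""
    return sum(RMG_STACK_NM.values())


def gate_extension_nm(option: str) -> float:
    """Gate extension (GE) for one of the three process options (names are illustrative)."""
    stack = gate_stack_nm()
    if option == "poly_gate_cut":    # Fig. 3A: double stack plus overlay
        return 2 * stack + OVERLAY_NM
    if option == "self_aligned":     # Fig. 3B: double stack, no overlay
        return 2 * stack
    if option == "metal_gate_cut":   # Fig. 3C: single stack plus overlay
        return stack + OVERLAY_NM
    raise ValueError(f"unknown process option: {option}")


OPTIONS = ("poly_gate_cut", "self_aligned", "metal_gate_cut")
GE_BY_OPTION = {name: gate_extension_nm(name) for name in OPTIONS}

if __name__ == "__main__":
    for name, ge in GE_BY_OPTION.items():
        print(f"{name}: GE = {ge:.0f} nm")
```

Under these assumptions the options evaluate to 17nm, 14nm and 10nm, matching the (14nm+3nm), (14nm) and (7nm+3nm) labels in Fig. 5.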
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL. Downloaded on July 29,2023 at 11:16:41 UTC from IEEE Xplore. Restrictions apply.
978-1-5386-3559-9/17/$31.00 ©2017 IEEE 20.5.1 IEDM17-505
III. DEVICE OPTIONS AND FEOL SCALING
A. Lateral device options for 5nm technology and beyond
From an energy perspective, SRAM favors low leakage devices, making the GAA more attractive compared to the FinFET (FF). However, mismatch is inversely proportional to the device dimensions and ultimately determines both the performance and yield of an SRAM array. Moreover, the SRAM area scaling needs to be considered as well. Therefore, a device with the highest drive for the lowest footprint, while maintaining enough channel control, is preferred, leading to the Fork device (Fig. 6).
B. Effective device width and footprint scaling
Fig. 8A shows the effective device width (Weff) as a function of the device y-footprint considering the GE. For the Fin and GAA the GEs on both sides of the device are included while for the Fork only one side is included. The GAA device with a Vertical Sheet Pitch (VSP) of 17nm then outperforms the Fin for multi-fin devices with Fin Pitch (FP) 24nm. However, when scaling device footprint, this margin shrinks. Moreover, the GAA VSP needs to be smaller than the FP to make it competitive against a Fin based architecture (Fig. 8B). The Fork architecture, however, always outperforms the Fin at equal FP and VSP. Fig. 9 shows Weff/y-footprint scaling vs. device y-footprint for 22nm down to 5nm FinFET technology node. For each node, fin number (NF) variations 1-4 are shown. The continued FP reduction (needed for scaling) and FH increase (needed for drive) across technology nodes are not likely to be maintained beyond 5nm. GAA technology does not offer much improvement compared to 5nm FinFET at small device y-footprints. On the contrary, the Fork sheet is a natural node extension to 3nm offering superior Weff/y-footprint.
C. Device electrostatics and mismatch performance
TCAD simulations using Synopsys Sentaurus device [16] were performed to assess the electrical performance of all devices. Fig. 10 shows the drain current for variable sheet width for both the GAA and Fork device. It is observed that the subthreshold slope is slightly degraded due to the 'uncontrolled' side faces in the Fork structure. When offsetting VTH to match IOFF, this results in an ION reduction between a few % and 15% when moving from 30nm to 10nm equal sheet width, depending on the gate length (LG) (Fig. 11). This, however, is vastly compensated for by the increased sheet width of the Fork device at equal footprint. Accompanying mismatch simulations using the GTS Minimos NT [17] solver were performed to capture the effects of Random Discrete Dopants (RDD) and Metal Grain Granularity (MGG), most important to device random mismatch in SRAM (Fig. 12). Realistic doping profiles for the source and drain doping of 3e18 cm^-3 extending into the channel of 1e16 cm^-3 are evaluated. Due to low channel doping, RDD mainly impacts source/drain dopants at the edge of the channel, resulting in an effective LG modulation. Therefore, GAA and Fork devices with lower sheet width are more robust against RDD due to their better channel control. WF induced VTH variations due to MGG scale per ratio of grain size to gate area. Consequently, a similar AVT for GAA and Fork devices is observed as MGG mismatch dominates over RDD. Finally, the ION and VTH mismatch for a 50nm tall fin are compared to those of 2 and 3 stacked GAA and Fork sheet devices at equal device footprint considering GE and equal IOFF (Fig. 13). The GAA structure requires a Number of NanoSheets (NNS) of 3 for improved mismatch compared to FinFET in both the 1 and 2 fin scenarios. However, in the 1 fin scenario the ION is reduced. On the contrary, the Fork structure shows comparable mismatch and ION at NNS=2, and improved mismatch and ION at NNS=3.
IV. SRAM PERFORMANCE AREA AND YIELD IMPACT
A. Effective pull-down width and cell-height scaling
The layout of SRAM half-cells with GAA and Fork sheet devices is shown in Fig. 14A. Due to the specific nature of the SRAM layout, not one but two GE can be eliminated for the pull-down and pass gate device, and a DFGT for the pull-up device, when using the Fork sheet device. As a result, the HP cell has a 20% cell height reduction for the same pull-down Weff or a 40% Weff increase for the same cell height as shown in Fig. 14B.
B. SRAM electrical performance and area scaling
Monte Carlo circuit simulations were performed using a device compact model [12] combined with the mismatch and electrostatics obtained from TCAD. Fig. 15A shows the cell height scaling for HD and HP cells when comparing equal pull-down drive strength evaluated at the μ-6σ yield criterion. Here, 20% projected area scaling is achieved because the relative ION reduction for the Fork sheet is compensated by the improved mismatch at small sheet widths. Fig. 15B shows the pull-down drive at equal cell height. Here, 30% improvement is shown (slightly lower than the projected 40% Weff) because, for wider sheets, the mismatch improvements are reduced (see Fig. 12).
V. CONCLUSION
In this work, a novel vertically stacked lateral nanosheet architecture using a forked gate structure is proposed for continued scaling beyond 5nm. In this structure, the margin for GE is eliminated and the gate cut can be self-aligned to the device. This not only reduces device footprint but allows for a simplified process. Although minimal electrostatic control loss is observed compared to GAA, the increased Weff boosts ION and mismatch robustness at equal footprint. SRAM bit cell scaling hugely benefits from this architecture showing a 20% area shrink or a 30% performance gain at and beyond 5nm.
REFERENCES
[1] E. Karl et al., ISSCC 2012, p. 230; [2] E. Karl et al., ISSCC 2015, p. 1; [3] https://newsroom.intel.com/; [4] Y. H. Chen et al., ISSCC 2014, p. 238; [5] S-Y Wu et al., VLSI 2016, p. 92; [6] M. Clinton et al., ISSCC 2017, p. 210; [7] T. Song et al., ISSCC 2014, p. 232; [8] T. Song et al., ISSCC 2016, p. 306; [9] T. Song et al., ISSCC 2017, p. 208; [10] M. G. Bardon et al., IEDM 2016, p. 28.2.1; [11] H. Mertens et al., VLSI 2016, p. 158; [12] D. Jang et al., TED 2017, p. 2707; [13] N. Loubet et al., VLSI 2017, p. T17-5; [14] L.-A. Ragnarsson et al., VLSI 2016, p. T98; [15] M.C. Webb et al., 2015 patent number WO2015094305; [16] Synopsys inc., Mountain View, CA, USA; [17] Global TCAD Solutions, Wien, Austria; [18] Coventor, inc., Cary, NC, USA
[Fig. 1 plot: A) bit cell scaling trend vs. technology node (22nm, 16/14nm, 10nm, 7nm, 5nm, 3nm) for 111 [1-3], 112 [1-3], 111 [4-6], 122 [4-6] and 122 [7-9] cells, with the x0.625 trend and the points CGP=42/FP=24 and CGP=36/FP=21 (?) marked; B) the same data vs. 2xGate Pitch x Fin Pitch, with x8 and x10 slopes marked.]
Fig. 1. A) SRAM HD (111) and HP (112/122) bit cell scaling trend following 0.625 shrink per technology node. B) Same trend as function of 2xCGPxFP. As cell width scales at 2xCGP, the HD and HP cell height scales approximately x8 and x10 with FP.
Fig. 2. 111 and 122 FinFET bit cell showing the cell height decomposed into FEOL design rules of Gate Cut (GC), Gate Extension (GE), Dummy Fin Gate Tuck (DFGT), Fin Width (FW), Fin Pitch (FP) and PP separation (PP).
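The relation in Fig. 1B (cell width of 2xCGP, cell height of roughly 8xFP for HD and 10xFP for HP) can be checked against the 0.625 per-node target with a small sketch. The CGP=36nm, FP=21nm point is the one marked with a question mark in Fig. 1; treating it as the next-node candidate is an assumption made here for illustration, as are the function and constant names.

```python
TARGET_SHRINK_PER_NODE = 0.625  # area scaling per node projected in the text
HD_FP_MULTIPLE = 8              # HD (111) cell height in units of FP, per Fig. 1
HP_FP_MULTIPLE = 10             # HP (112/122) cell height in units of FP, per Fig. 1
NM2_PER_UM2 = 1_000_000


def bitcell_area_um2(cgp_nm: float, fp_nm: float, fp_multiple: int) -> float:
    """Bit cell area from cell width = 2 x CGP and cell height = fp_multiple x FP."""
    width_nm = 2 * cgp_nm
    height_nm = fp_multiple * fp_nm
    return width_nm * height_nm / NM2_PER_UM2


# 5nm node pitches from the text: CGP = 42nm, FP = 24nm (HD cell height 8 x 24 = 192nm).
area_5nm = bitcell_area_um2(42, 24, HD_FP_MULTIPLE)
# Assumed next-node pitches: the CGP = 36nm, FP = 21nm point marked "?" in Fig. 1.
area_next = bitcell_area_um2(36, 21, HD_FP_MULTIPLE)
achieved_shrink = area_next / area_5nm

if __name__ == "__main__":
    print(f"HD area at 5nm pitches:   {area_5nm:.4f} um^2")
    print(f"HD area at 36/21 pitches: {area_next:.4f} um^2")
    print(f"shrink {achieved_shrink:.2f} vs. target {TARGET_SHRINK_PER_NODE}")
```

With pitch scaling alone the area shrinks by about 0.75 rather than 0.625, which is the gap the cell-height boosters discussed in the text are meant to close.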
Fig. 3. A) Conventional RMG process with Poly cut resulting in a GE of 2x gate stack plus overlay. B) Self-aligned GE using spacer defined trench for Poly-gate fill. C) MGC allowing single gate stack
[Fig. 3 annotations: process steps are poly gate deposition, poly gate cut (cut mask with overlay, or self-aligned via spacer deposition + etch back), metal gate deposition and metal gate cut (cut mask with overlay); A) and B) result in a double stack, C) in a single stack. RMG stack = 0.5nm IL + 1.5nm HK + 3nm HK cap + 2nm WFM = 7nm. Overlay = 3nm. Legend: STI, Fin, Poly-Si, SiO2 IL, Sacrificial spacer, HK, HK cap, WFM, W.]
Fig. 4. SRAM scaling booster FTGC allows to reduce N-P spacing.
Fig. 5. GC vs. GE w/ and w/o FTGC for 5nm. The GE for 3 process options is indicated.
[Fig. 5 plot: Gate Cut (GC) [nm] vs. Gate Extension (GE) [nm] for a 111 SRAM target height of 192nm; GE options marked: metal gate cut (7nm+3nm), SA extension (14nm), poly gate cut (14nm+3nm).]
Fig. 6. A) Fin, B) GAA and C) Fork device.
[Fig. 6 annotations: A) Fin, Weff = NF(2FH+FW), with FW, FH and FP marked; B) GAA, Weff = NNS(2SW+2SH), with SW, SH and VSP marked; C) Fork, Weff = NNS(2SW+SH); the y-footprint is measured along y, with z vertical.]
[Fig. 8 plot: A) Device effective width [nm] vs. Device Y footprint [nm] for Fin, GAA and Fork (FP=24nm, VSP=17nm, NNS=3, GE=17nm; 1, 2 and 3 fin); B) x-axis Fin Pitch (FP) [nm], 1 fin, with "outperforms" and "comparable" region labels for GAA.]
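The Weff expressions annotated in Fig. 6 can be compared at equal y-footprint with a short sketch. FP=24nm, GE=17nm, FH=50nm and NNS=3 are taken from the text and figures; the fin width of 5nm, the sheet height of 5nm and the fin footprint formula are assumptions chosen for illustration (the first is consistent with the 39nm single-fin footprint labeled in Fig. 13).

```python
def weff_fin(nf: int, fh: float, fw: float) -> float:
    """Fin: Weff = NF(2FH + FW)."""
    return nf * (2 * fh + fw)


def weff_gaa(nns: int, sw: float, sh: float) -> float:
    """GAA sheet, gated on all four faces: Weff = NNS(2SW + 2SH)."""
    return nns * (2 * sw + 2 * sh)


def weff_fork(nns: int, sw: float, sh: float) -> float:
    """Fork sheet, one side face not gated: Weff = NNS(2SW + SH)."""
    return nns * (2 * sw + sh)


def fin_footprint(nf: int, fp: float, fw: float, ge: float) -> float:
    """Assumed y-footprint of a multi-fin device with GE on both sides."""
    return (nf - 1) * fp + fw + 2 * ge


def sheet_width(footprint: float, ge: float, ge_sides: int) -> float:
    """Sheet width left once the gate extensions are subtracted from the footprint."""
    return footprint - ge_sides * ge


FP, GE, FH, NNS = 24, 17, 50, 3   # nm, nm, nm, sheets (from the text and figures)
FW, SH = 5, 5                     # nm, assumed values for illustration


def compare(nf: int) -> dict:
    """Weff of Fin, GAA and Fork devices sharing the footprint of an nf-fin device."""
    footprint = fin_footprint(nf, FP, FW, GE)
    sw_gaa = sheet_width(footprint, GE, ge_sides=2)   # GE on both sides
    sw_fork = sheet_width(footprint, GE, ge_sides=1)  # GE on one side only
    return {
        "footprint": footprint,
        "fin": weff_fin(nf, FH, FW),
        "gaa": weff_gaa(NNS, sw_gaa, SH),
        "fork": weff_fork(NNS, sw_fork, SH),
    }


if __name__ == "__main__":
    for nf in (1, 2):
        print(nf, compare(nf))
```

Under these assumptions the single-fin footprint of 39nm leaves a 5nm GAA sheet but a 22nm Fork sheet, which is why the Fork device gains Weff at small footprints while the GAA device does not.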
[Figs. 9-11 plots, recoverable labels: Weff/y-footprint (FP=24nm, FH=53nm; NNS=3; 25nm, 40nm and 50nm curves); SS [mV/dec] vs. sheet width, solid = GAA, dashed = Fork, Lg=14nm; (ION,GAA-ION,Fork)/ION,GAA with VTH offset for equal IOFF.]
Fig. 12 VTH mismatch due to RDD and MGG as function of sheet width for GAA and Fork obtained from TCAD. Pelgrom mismatch parameter AVT is calculated by effective device channel area. B) Metal WF variation due to MGG. C) RDD from S/D perturb the electron concentration in the channel.
Fig. 13 VTH mismatch for Fin, GAA and Fork sheet devices and relative ION for the GAA and Fork devices normalized to Fin for the A) single and B) double fin scenario at equal footprint and IOFF of 4pA targeting high VTH SRAM transistors.
[Fig. 13 plot: ION/ION,FIN (0% to 80%) for A) Fin (1 fin), GAA (NNS=2, 3) and Fork (NNS=2, 3) and B) Fin (2 fin), GAA (NNS=2, 3) and Fork (NNS=2, 3); device cross-sections at FP=24nm, GE=17nm and 50nm height with y-dimensions 39nm, 5nm, 22nm and 63nm, 29nm, 46nm.]
Fig. 14 A) Layout of SRAM half-cells. GAA devices still require GE rule, while the Fork device has a zero extension rules enabling area scaling of equal Weff. B) SRAM Weff and area tradeoff for the GAA and Fork devices compared to the HP 122 FinFET cell. Using variable sheet widths with ratio of 1:2 for pMOS and nMOS allows maximum gains in area scaling and performance boost.
[Fig. 14 plot: A) half-cell layouts with GC, GE, DFGT, FW and FP marked, a 122 FinFET cell with FH=50nm, and sheet widths SWP=17, SWN=17 and SWN=34; B) x-axis Pull Down Weff scale (0% to 200%), y-axis 50% to 100%, dashed = P:N 1:1, solid = P:N 1:2, with +40% and -20% marked for the Fork device.]
Fig. 15 Iso-drive and iso-height comparison between Fin and Fork sheet HD and HP cell configurations.
[Fig. 15 plot: panels A) Iso-drive and B) Iso-height for HD and HP cells; axis μ-6σ Pull Down drive [μA]; marked values -22% and +26%.]
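The Pelgrom parameter AVT mentioned in the Fig. 12 caption links VTH mismatch to the effective channel area through the standard relation sigma(VTH) = AVT / sqrt(Weff x LG). The sketch below only illustrates that area dependence: the AVT value and the two Weff values are hypothetical placeholders, not data from the paper, and LG=14nm is the gate length labeled in the figures.

```python
import math


def sigma_vth_mv(avt_mv_um: float, weff_nm: float, lg_nm: float) -> float:
    """Pelgrom relation: sigma(VTH) in mV from AVT in mV*um and the channel area."""
    area_um2 = (weff_nm / 1000.0) * (lg_nm / 1000.0)
    return avt_mv_um / math.sqrt(area_um2)


AVT_MV_UM = 1.0        # hypothetical value, for illustration only
LG_NM = 14.0           # gate length labeled in the figures
WEFF_NARROW_NM = 60.0  # hypothetical Weff of a narrow-sheet device
WEFF_WIDE_NM = 147.0   # hypothetical Weff of a wider sheet at the same footprint

sigma_narrow = sigma_vth_mv(AVT_MV_UM, WEFF_NARROW_NM, LG_NM)
sigma_wide = sigma_vth_mv(AVT_MV_UM, WEFF_WIDE_NM, LG_NM)
# For equal AVT and LG the ratio depends only on the Weff ratio.
sigma_ratio = sigma_wide / sigma_narrow

if __name__ == "__main__":
    print(f"sigma ratio (wide / narrow): {sigma_ratio:.2f}")
```

Because the text reports a similar AVT for GAA and Fork devices, any Weff gained at equal footprint translates directly into a lower sigma(VTH) in this model.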