3D Integration in VLSI Circuits Implementation Technologies and Applications
Devices, Circuits, and Systems
Series Editor
Krzysztof Iniewski
3D Integration in VLSI
Circuits
Implementation Technologies
and Applications
Edited by
Katsuyuki Sakuma
Managing Editor
Krzysztof Iniewski
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the
CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Preface
I would like to sincerely thank all of the authors for their hard work and
commitment. Without their contributions, it would not have been possible to
provide an up-to-date review of these innovative technologies and the
challenges in 3D integration. It is my hope that this book will provide readers
with a timely and comprehensive view of current 3D integration technology.
Katsuyuki Sakuma
Yorktown Heights, New York
Contributors

Li Li
Cisco Systems, Inc.
San Jose, California

Liam Madden
Xilinx, Inc.
San Jose, California

Takayuki Ohba
Laboratory for Future Interdisciplinary Research of Science and Technology
Tokyo Institute of Technology
Tokyo, Japan

Suresh Ramalingam
Xilinx, Inc.
San Jose, California

Katsuyuki Sakuma
IBM T.J. Watson Research Center
Yorktown Heights
Albany, New York

Spyridon Skordas
IBM Research Division
Albany, New York

Kevin Winstel
IBM Research Division
Albany, New York

Ephrem Wu
Xilinx, Inc.
San Jose, California

Susan Wu
Xilinx, Inc.
San Jose, California

Xin Wu
Xilinx, Inc.
San Jose, California

Ting-Yang Yu
Department of Electronics Engineering
National Chiao Tung University
Hsinchu City, Taiwan
1
Three-Dimensional Integration:
Technology and Design
P. Franzon
CONTENTS
1.1 Introduction
1.2 Three-Dimensional Integrated Circuit Technology Set
1.3 Three-Dimensional Drivers
1.4 Miniaturization
1.5 Cost Reduction
1.6 Heterogeneous Integration
1.7 Performance Enhancement
1.8 Power Efficiency
1.9 Conclusion
References
1.1 Introduction
3D and 2.5D integration technologies permit substantial improvement in
form factor, power, performance, functionality, and sometimes even cost.
Though not providing a direct replacement for Moore’s law, 3D technologies
can permit a generation or more of exponential scaling in power per unit of
performance and other factors.
This chapter is structured as follows. First there is a review of 3D
technologies, followed by a general discussion of commercial 3D success stories
and technology drivers. As part of that discussion, we review 3D logic projects
conducted at North Carolina State University, United States, before closing
with heterogeneous integration.
FIGURE 1.1
Interposer (2.5D) technology. (Labels: microbump, oxide, wiring layers,
through-silicon vias (TSVs), bumps, silicon, interposer.)

FIGURE 1.2
3D stacking technologies.
they are limited to 40 μm pitch (with potential for 25 μm). Copper–copper
thermo-compression bonding can be used down to sub-5 μm pitches. Alternatively,
hybrid bonding can be used. In hybrid bonding, the top surface, with metal
plugs in it, is planarized and then bonded to another such surface. An example
is the Ziptronix direct bond interconnect (DBI) process [1]. These
interconnects can be built down to 1 μm pitch, but 6–8 μm is more typical.
Hybrid bonding is used to make many cell phone cameras today (see Section 1.6
on heterogeneous integration). With such high interconnect densities, many
interesting architectures can be explored, as will be explained later.
The bottom two chips in Figure 1.2 are joined using a face-to-back (F2B)
arrangement in which the bottom (back) of one die is joined to the top (face) of
another. TSVs are needed to provide the connection to the joining backside.
As the sidewalls of the TSV are not entirely vertical, the TSV pitch is limited
to approximately the thickness of the wafer, typically 10–25 μm. Thus an F2B
connection provides much lower density than a face-to-face (F2F) connection.
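To make the density comparison concrete, the following sketch estimates how many connections fit in a square millimeter at a given pitch. It assumes a uniform square grid of connections, which is a simplification not stated in the text, and the function name connections_per_mm2 is illustrative only. The pitches used are the ones quoted above.

```python
def connections_per_mm2(pitch_um: float) -> float:
    """Connections per mm^2 for a uniform square grid (an assumed layout)."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    per_side = 1000.0 / pitch_um  # connections along one 1 mm edge
    return per_side ** 2


if __name__ == "__main__":
    # Pitches quoted in the text; the grid model itself is an assumption.
    for label, pitch in [
        ("40 um pitch limit quoted above", 40.0),
        ("TSV / F2B, upper pitch", 25.0),
        ("TSV / F2B, lower pitch", 10.0),
        ("Cu-Cu thermo-compression", 5.0),
        ("hybrid bonding, best case", 1.0),
    ]:
        print(f"{label:32s} {connections_per_mm2(pitch):>12.0f} per mm^2")
```

Because density scales with the inverse square of pitch, moving from a 25 μm TSV pitch to a 5 μm bond pitch gives a 25× increase under this model.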
TSVs can be inserted before, during, or after complete wafer processing.
These are referred to, respectively, as via early, middle, and last. Figure 1.3
shows a via-middle process. Wafers are partially completed, say to metal 1.
A vertical side-wall via is etched part way through the wafer. This via is
created using the Bosch process [2]. In the Bosch process, deep reactive-ion
etching (DRIE) is alternated with a deposition step multiple times to create
a near-vertical wall via. Today the achievable aspect ratio (depth to opening
diameter) is typically 10:1. Thus the via depth has to be less than 10× the
diameter of the opening at the top. The via sidewalls are passivated, typically
with an oxide, and then filled with a metal, typically tungsten or copper. The
chip metal stack (BEOL) is then completed. The wafers are then flipped and
thinned, exposing the TSV metal. The exposed metal can then be used directly
for a joining process, or a bump structure is added before joining with another
wafer or chip to create a 3D stack.
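The 10:1 constraint described above can be written as a small helper. This is only a sketch of the rule of thumb in the text (via depth below 10× the top opening diameter); the function names and the example values in the usage lines are illustrative assumptions.

```python
def max_tsv_depth_um(opening_diameter_um: float, aspect_ratio: float = 10.0) -> float:
    """Upper bound on via depth for a given top opening diameter."""
    return aspect_ratio * opening_diameter_um


def min_opening_diameter_um(depth_um: float, aspect_ratio: float = 10.0) -> float:
    """Smallest top opening that can reach the requested depth."""
    return depth_um / aspect_ratio


if __name__ == "__main__":
    # Hypothetical example: a 5 um opening and a 50 um thinned wafer.
    print(max_tsv_depth_um(5.0))          # depth limit in um
    print(min_opening_diameter_um(50.0))  # required opening in um
```

The second helper shows why thinner wafers permit smaller, denser TSVs: the required opening shrinks in proportion to the depth that must be etched.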
FIGURE 1.3
Through-silicon vias (TSVs) fabrication steps. (Labels: transistor/early wiring
layers, bulk silicon, microbumping, stacking.)
Note in Figure 1.2 that the top chip has not been thinned. Usually a 3D
chip stack is assembled with the chip in wafer form. Two wafers are attached
with the second left unthinned. This wafer stack can be attached to another
(thick) wafer for further handling. In most cases, one wafer is left in a thick
format so that the wafer stack can be easily handled. Thin wafer handling is
possible but increases the cost.
Most 3DICs are assembled in wafer format. Again, the driver is cost.
Wafer-to-wafer attachment processing costs less per die and has higher
yields. Although chip-to-wafer attachment is possible, it is not widely used.
An intriguing complement to 3D chip stacking technology is monolithic
3D in which there is only one silicon substrate (and thus no TSVs). The most
commercially successful one of these technologies is 3D NAND Flash in
which the string of NAND transistors in a nonvolatile flash device is
fabricated vertically. This approach brings substantial density improvements and
cost savings over conventional NAND flash technology. However, to date it
has not been commercially applied to any other logic or memory structure.
Another monolithic (-like) 3D technology set involves techniques in which
silicon-on-insulator (SOI) wafers are the starting wafer. This is shown in
Figure 1.4. In this approach, fabricated wafers are joined face to face using an
oxide–oxide bond. As the transistors are built on the top of an oxide layer, a
silicon-selective back etch can be used to remove the silicon part of the SOI
substrate while not affecting the transistors and interconnect layers. Simple
FIGURE 1.4
3DIC chip stack using SOI wafers. (Labels: BEOL oxide, epi, metal, transistors,
buried oxide, bulk silicon; final step label: 4. Repeat.)
1.4 Miniaturization
An early application of TSVs was providing the I/O connections for cell phone
camera front-side imaging sensors [4,5]. The goal was not to leverage 3D chip
stacks (these were single die) but to reduce the overall sensor height, at
least when compared with conventional packaging approaches.
FIGURE 1.5
Miniaturized sensor as a two-chip stack.
3D chip stacking can be used to make such sensors with low integrated
volume. Though fabricated using wire bonding, Chen et al. demonstrated
an integrated power-harvesting data-collecting sensor with the photovoltaic
power-harvesting chip mounted on top of the logic and RF chips [6]. This
maximizes the photovoltaic power-harvesting area while minimizing the
volume. TSVs and bonding technologies would permit further volume reduction.
Lentiro [7] described a two-chip stack aimed at simulating a particle of
meat for the purposes of calibrating a new food processing system. One chip
is a radio-frequency identification (RFID) power harvester and communications
chip; the second is the temperature data logger. It is a two-chip stack
with F2F connections and TSV-enabled I/O. It is integrated with a small battery
for data collection purposes only, as the RFID cannot be employed in
the actual processing pipes. The two-chip stack permits smaller imitation
food particles than otherwise would be the case and is shown in Figure 1.5.
The RFID coil can be seen. The chip includes capacitors for temporary power
storage.
few micron pitch. Examples are reported in References 8 and 9. This is also
an example of heterogeneous integration, as the two chips in the stack go
through different manufacturing processes.
Another example of cost reduction is found in building high-end
field-programmable gate arrays (FPGAs). To a first approximation, the cost of
a large CMOS chip goes up with the square of the area. This is because
the probability of a defect occurring on the chip and thus killing the chip
goes up with the chip area, whereas the cost of making the chip in the first
place also goes up with the area. Thus it is worth considering partitioning a
large chip into a set of smaller ones, if the cost of integration and the
additional test is less than the savings accrued from the increased CMOS yield.
Xilinx (California, United States) investigated this concept for large FPGAs
and is now selling FPGA modules containing 2–4 CMOS FPGA chips, tightly
integrated on an interposer. Details are not available, but Xilinx claims an
overall cost savings [10].
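The partitioning argument above can be sketched numerically. The model below takes the text's first approximation literally (cost proportional to the square of the die area) and adds a lumped integration-and-test cost. The constant k, the function names, and the example numbers are all assumptions for illustration; they are not Xilinx data.

```python
def monolithic_cost(area_mm2: float, k: float = 1.0) -> float:
    """First-order cost of one large die: proportional to area squared."""
    return k * area_mm2 ** 2


def partitioned_cost(area_mm2: float, n_dies: int,
                     integration_cost: float = 0.0, k: float = 1.0) -> float:
    """Cost of the same total area split into n equal dies, plus a lumped
    integration and extra-test cost (a hypothetical parameter)."""
    if n_dies < 1:
        raise ValueError("n_dies must be at least 1")
    per_die = k * (area_mm2 / n_dies) ** 2
    return n_dies * per_die + integration_cost


def worth_partitioning(area_mm2: float, n_dies: int,
                       integration_cost: float, k: float = 1.0) -> bool:
    """True when splitting is cheaper than the monolithic die in this model."""
    return partitioned_cost(area_mm2, n_dies, integration_cost, k) < monolithic_cost(area_mm2, k)


if __name__ == "__main__":
    area = 600.0  # hypothetical total logic area in mm^2
    print(monolithic_cost(area))         # cost units of the single large die
    print(partitioned_cost(area, 4))     # four dies, integration cost ignored
    print(worth_partitioning(area, 4, integration_cost=100000.0))
```

Under this model the silicon cost falls as 1/n when the area is split into n dies, so partitioning pays off whenever the integration and test overhead is smaller than that reduction.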
A third example is that of mixing technology nodes. In general, Moore’s
law tells us that a digital logic gate costs less to make in a more advanced
technology due to the reduced area for that gate in that node. In contrast,
many analog and analog-like functions, such as ADCs and high-speed
serializer/deserializer (SerDes) I/Os, do not benefit in such a fashion. The
reason is that the analog behavior of a transistor has higher variation for
smaller transistors than for larger ones. Thus for many analog functions that
rely on well-matched behaviors of different transistors in the circuit, no
benefit is accrued from building smaller transistors. More simply put, analog
circuit blocks do not shrink in dimensions with the use of more advanced
technologies. Thus the cost of these functions in a more advanced process
node can actually be higher than in the old node, as the old node costs less to
make per unit of area. Again this is an example of heterogeneous integration,
the heterogeneity being that of mixing technology nodes.
Although Wu [10] also explored this concept generically, Erdmann et al.
[11] have explored this concretely for a mixed ADC/FPGA design. Their
design consisted of two 28 nm FPGA logic dies, integrated with two 65 nm
ADC array dies on an interposer. Thus two sets of cost benefits are accrued:
first, the yield-related savings from splitting the logic die into two and,
second, the fabrication cost savings of keeping the ADCs in an older technology.
TABLE 1.1
Comparison of 2D and 3D Memories
Technology | Capacity | BW (GB/s) | Power (W) | Efficiency (mW/GB/s) | I/O Efficiency (mW/Gb/s) | DQ Count

TABLE 1.2
Improvements in 3D Design over 2D Using Logic Cell Partitioning
Total Wire Length (% Change) | Fmax (% Change) | Total Power (% Change) | Power/MHz
die through a TSV array, the TSV arrays running through the chip centers.
Each chip is F2B mounted to the chip beneath it. The eight channels are
operated independently. Details for a first-generation HBM (operating at an
energy of 3.8 pJ/bit at 128 GB/s) can be found in Reference 15. The use of HBM
in graphics module products has been announced by Nvidia and AMD.
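As a quick check on what the quoted HBM figures imply, interface power is energy per bit multiplied by bit rate. The sketch below performs that conversion. It assumes 1 GB/s means 10^9 bytes per second and that the 3.8 pJ/bit figure applies to every transferred bit; neither assumption is stated in the text, and the function name is illustrative.

```python
def interface_power_w(energy_pj_per_bit: float, bandwidth_gb_per_s: float) -> float:
    """Power in watts from energy per bit (pJ) and bandwidth (GB/s).

    Assumes 1 GB = 1e9 bytes and 8 bits per byte.
    """
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return energy_pj_per_bit * 1e-12 * bits_per_second


if __name__ == "__main__":
    # First-generation HBM figures quoted in the text.
    print(f"{interface_power_w(3.8, 128.0):.2f} W")
```

With the quoted values this works out to roughly 3.9 W at full bandwidth under the stated assumptions.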
Wide I/O is also a JEDEC-supported standard, aimed largely at low-power
mobile processors. Although intended to be mounted on top of the logic
die in a true 3D stack, side-by-side integration on an interposer is also
possible. Wide I/O is a DRAM-only stack; there is no logic layer. Instead the
DRAM stack is exposed through a TSV-based interface, and the memory controller
is designed separately on the customer-designed CPU/logic die. To
date the thermal challenges of mounting a DRAM on an already hot mobile
processor logic die have been insurmountable, especially as it is desired to
operate the DRAM at a lower temperature than logic (85°C for DRAM vs.
105°C for logic) to control leakage and refresh time. This has been a barrier
to adoption.
The Tezzaron DiRAM4 is a proprietary memory still in development. It
has 4096 data I/O organized across 64 ports. It is intended only for 3D and
interposer integration. It has a unique organization in that the logic layer
is not only used for controller and I/O functions but also houses the global
sense amplifiers and address decoders that in other 3D memories are
on the DRAM layers. This permits faster operation for these circuits. The
DiRAM4 has potential for a very high bandwidth (up to 8 Tbps) and fast
random cycles (15 ns) [16].
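The DiRAM4 figures above can be broken down per port and per I/O. The sketch below assumes the peak 8 Tbps is spread evenly across all 4096 data I/O and all 64 ports, which the text does not state, so the results are indicative only. All names are illustrative.

```python
TOTAL_IO = 4096          # data I/O count quoted in the text
PORTS = 64               # port count quoted in the text
PEAK_BANDWIDTH_BPS = 8e12  # "up to 8 Tbps"


def io_per_port(total_io: int = TOTAL_IO, ports: int = PORTS) -> int:
    """Data I/O per port, assuming an even split."""
    return total_io // ports


def per_io_rate_gbps(peak_bps: float = PEAK_BANDWIDTH_BPS, total_io: int = TOTAL_IO) -> float:
    """Peak data rate per I/O in Gbps, assuming an even split."""
    return peak_bps / total_io / 1e9


def per_port_rate_gbps(peak_bps: float = PEAK_BANDWIDTH_BPS, ports: int = PORTS) -> float:
    """Peak data rate per port in Gbps, assuming an even split."""
    return peak_bps / ports / 1e9


if __name__ == "__main__":
    print(io_per_port(), "I/O per port")
    print(f"{per_io_rate_gbps():.3f} Gbps per I/O")
    print(f"{per_port_rate_gbps():.1f} Gbps per port")
```

The point of the exercise is that the very high aggregate bandwidth comes from width rather than from unusually fast individual pins.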
FIGURE 1.6
Two-chip stack. (Labels: logic only; logic, clocks, flip-flops; die photo.)
a CAD flow to do this that could reuse 2D CAD tools, especially place and
route tools. To make that feasible, all flip-flops were kept in one tier so
that 3D clock distribution was not required. The radar PE was implemented in the
Tezzaron bulk CMOS 3D process [17] (Figure 1.6). The results are summarized
in Table 1.2. On average, performance per unit of power was increased
by 22% due to the decreases in wire length achieved through this partitioning
approach. The radar processor had an improvement in performance
per unit of power of 21%. The other designs achieved 18% and 35%. The
achieved improvement was roughly equivalent to one generation of Moore’s
law scaling.
In a different project at North Carolina State University, we took a very
different approach to improving performance/power using 3D technologies. A stack
of two different CPUs is integrated vertically using a vertical thread transfer
bus that permits fast compute load migration from the high-performance CPU
to and from the low-power CPU when an energy advantage is found [18]. In this
design, the high-performance CPU can issue two instructions per cycle, whereas
the low-power CPU is a single-issue CPU. The transfer is managed using a
low-latency, self-testing multisynchronous bus [19]. The bus can transfer the
state of the CPU in one clock cycle by using a wide interface and by exploiting
a high-density copper–copper direct bond process. The caches are switched at
the same time, removing the need for a cold cache restart.
Simulation with Specmark workloads shows a 25% improvement in the
power/performance ratio compared with executing the sample workload
solely in the high-performance processor. In contrast, if the workload was
executed solely in the single-issue (low-power) CPU, there would be 28%
total energy savings compared with keeping the workload in the
high-performance CPU, but at the expense of a 39% reduction in performance. If
the workload was allowed to switch every 10,000 cycles, there would be 27%
total energy savings but at the expense of only a 7% reduction in performance;
that is, a 25% improvement in power per unit of performance is achieved. The
die photograph is shown in Figure 1.7.
FIGURE 1.7
3D heterogeneous processor die.
1.9 Conclusion
3D and 2.5D technologies considerably open the design space for semiconductor
technologies. Dimensions for exploration include miniaturization,
cost reduction, achieving new modalities via heterogeneous integration,
performance improvement, and improvement in performance/power.
References
1. P. Enquist, G. Fountain, C. Petteway, A. Hollingsworth, and H. Grady, Low cost
of ownership scalable copper direct bond interconnect 3D IC technology for
three dimensional integrated circuit applications, IEEE International Conference
on 3D System Integration, 3DIC 2009, San Francisco, CA, 2009, pp. 1–6.
2. U.S. Patents 5,501,893 and 6,531,068.
3. J.A. Burns, B.F. Aull, C.K. Chen, C. Chang-Lee, C.L. Keast, J.M. Knecht,
V. Suntharalingam, K. Warner, P.W. Wyatt, and D. Yost, A wafer-scale 3-D circuit
integration technology, IEEE Transactions on Electron Devices, 52(10):
2507–2516, 2006.
4. http://image-sensors-world.blogspot.com/2008/09/toshiba-tsv-reverse-engineered.html
5. http://www.semicontaiwan.org/en/sites/semicontaiwan.org/files/docs/4._mkt__jerome__yole.pdf
6. G. Chen, M. Fojtik, D. Kim, D. Fick, J. Park, M. Seok, M.-T. Chen, Z. Foo,
D. Sylvester, and D. Blaauw, Millimeter-scale nearly perpetual sensor system
with stacked battery and solar cells, 2010 IEEE International Solid-State Circuits
Conference (ISSCC), San Francisco, CA, 2010, pp. 288–289.
7. A. Lentiro, Low-density, ultralow-power and smart radio frequency telemetry
sensor, PhD Dissertation, NCSU, 2013.
Li Li
CONTENTS
2.1 Three-Dimensional SiP Introduction
2.2 Enabling Technologies for 3D SiP
2.2.1 Three-Dimensional Stackable Memory
2.2.2 High-Density Interposer
2.2.2.1 Silicon Interposer
2.2.2.2 Organic Interposer
2.2.3 Microbump Interconnect
2.3 3D SiP for Application-Specific Integrated Circuits and High Bandwidth Memory Integration
2.3.1 Organic Interposer Design
2.3.2 Simulation and Results
2.4 Three-Dimensional SiP Assembly
2.5 Test and Characterization
2.6 Reliability Challenge
2.7 Summary
References
FIGURE 2.1
A schematic of a 3D SiP with one ASIC die and four HBM die stacks. (Labels: HBM
(four stacks), ASIC, interposer, package substrate.)
• Memory and ASIC (logic) devices can each be built in their own specific
processes.
• The process/cost differences between the logic and memory devices can be
further exploited.
• Very high data rate (bandwidth) with low latency and low power
per unit of bandwidth.
• Wide interface enabled by very wide interdie interconnect.
• Low parasitics enabled by short, direct interconnect.
Recently, this SiP approach has been extended to include the HBM DRAM
devices based on 3D IC integration with TSV and micropillar
interconnects [5].
HBM is a new class of DRAM developed by major DRAM suppliers
leveraging wide I/O and TSV technologies [10]. It is targeted for graphics
FIGURE 2.2
A comparison of silicon interposer manufacturing and supply chain flows.
(Labels: foundry process; substrate process; subs supplier; TSV and FEOL; front
side μBump; wafer thinning; back side bump; debond; ship.)
For both the foundry and MEOL processes, TSVs are generated using the
deep reactive-ion etching (DRIE) process [12]. The front-side interconnects
or wiring layers are made with Cu damascene techniques. For the backside
interconnection, MEOL usually uses a redistribution layer (RDL) process [13].
For true 3D wafers with active circuits and TSVs, MEOL may be preferred,
but for passive interposers, an alternative and cost-effective way may exist.
This alternative substrate process flow is the focus of this study.
The 3D SiP in Figure 2.1 can be supported with a silicon interposer with
TSVs. The TSVs, which are typically 10–25 μm in diameter, are formed by the
DRIE process. The walls of the TSV are lined with the SiOx dielectric. Then,
a diffusion barrier and a copper seed layer are introduced. The via holes
are filled with copper through the electrochemical deposition process. The
chemical–mechanical planarization (CMP) process is used to remove the
copper overburden.
Recently, manufacturing of cost- and performance-effective, large-size
silicon interposers has been investigated [14]. The existing supply chain
and infrastructure of high-performance flip-chip packaging substrates are
leveraged. There are several advantages in this approach. One is minimal
disruption to the existing supply chain. The silicon interposer is considered
as a packaging material rather than another piece of silicon chip.
Secondly, large-size silicon interposers can be manufactured with a line
width and line spacing in the range of a few micrometers. This type of silicon
interposer (Si-IP) is often referred to as a coarse-pitch silicon interposer
to distinguish it from the ones made by wafer foundries. Table 2.1
shows a comparison between the coarse-pitch silicon interposer and the
fine-pitch silicon interposer.
For coarse-pitch silicon interposers, the semi-additive process (SAP) is
used to fabricate Cu wiring on either side of the interposer. The SAP method
has fewer process steps and uses conventional equipment that is also used
for fine-pitch printed wiring board fabrication [13]. On the other hand, the
fine-pitch silicon interposer relies on the damascene technique for Cu wiring
fabrication, which requires both chemical–mechanical polishing (CMP) and dry
etching processes. As it involves fewer process steps and uses conventional
equipment for fabrication, the coarse-pitch silicon interposer will be less
costly than the fine-pitch silicon interposer [13].

TABLE 2.1
Comparison of Coarse-Pitch and Fine-Pitch Silicon Interposers
Features | Coarse-Pitch Si-IP | Fine-Pitch Si-IP

FIGURE 2.3
A top view of a 35 × 35 mm silicon interposer attached to a package substrate.
Figure 2.3 shows the top view of a 35 × 35 mm silicon interposer fabricated
with the SAP method.
The front side of the interposer has two metal wiring layers, whereas
the backside has one wiring layer. The interposer shown is attached on a
50 × 50 mm HiTCE ceramic substrate. Major fabrication steps used are shown
schematically in Figure 2.4.
The size of the silicon interposers from the leading foundries is currently
limited to 26 × 32 mm, which is the reticle size used in the lithographic
wafer processing. This size limitation can be a disadvantage for ASIC and
memory integration as the die sizes for high-performance ASICs can be as
large as 25 × 25 mm and the ASIC chips require multiple external memory
devices as illustrated in the FCAMP case. Use of reticle stitching to increase
the silicon interposer size is under development; however, it has its own
limitations as well.
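The size constraint described above is easy to quantify. The sketch below compares the 26 × 32 mm reticle-limited interposer with a 25 × 25 mm ASIC, using only dimensions quoted in the text, and reports how much area is left for memory devices. The helper names are illustrative, and routing keep-outs and die spacing are ignored, so the real usable area would be smaller.

```python
RETICLE_MM = (26.0, 32.0)  # maximum foundry interposer size quoted in the text
ASIC_MM = (25.0, 25.0)     # large high-performance ASIC die quoted in the text


def area_mm2(size):
    """Area of a rectangle given as (width, height) in mm."""
    return size[0] * size[1]


def fits(die, field):
    """True if the die fits inside the field in either orientation."""
    return ((die[0] <= field[0] and die[1] <= field[1]) or
            (die[1] <= field[0] and die[0] <= field[1]))


def remaining_area_mm2(field, die):
    """Field area left after placing one die (no spacing rules applied)."""
    return area_mm2(field) - area_mm2(die)


if __name__ == "__main__":
    print("ASIC fits within reticle:", fits(ASIC_MM, RETICLE_MM))
    print("Interposer area (mm^2):", area_mm2(RETICLE_MM))
    print("Area left for memory (mm^2):", remaining_area_mm2(RETICLE_MM, ASIC_MM))
```

A single large ASIC consumes about three quarters of the reticle-limited interposer, which illustrates why the limit is a disadvantage when several memory devices must sit beside it.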
FIGURE 2.4
Major fabrication steps used to fabricate the coarse-pitch silicon interposer:
(1) Silicon wafer.
(2) TSV formation and thermal oxidation (silicon thickness: 200 μm; TSV
diameter: 60 μm; TSV pitch: 150 μm).
(3) TSV filling and planarization.
(4) Multilayer wiring, double-sided (RDL: semi-additive process; insulator:
photosensitive resin; top side: two layers; bottom side: one layer).
(5) Double-sided bumping (electroplated Cu/Ni/SnAg bumps; diameter: 30 μm).
TABLE 2.2
A Comparison of Organic Interposer and Silicon Interposer
Features | Organic Interposer | Silicon Interposer
• Plated-through hole (PTH) generation and filling for the core layer
• Circuitization of the core layer
• Building Cu-wiring layers on two sides of the core layer with the
micro-via and build-up processes
Pattern plating as part of the SAP is used to fabricate Cu wiring and
micro-vias for all the build-up layers. The pattern plating method has been used
extensively in manufacturing high-density, build-up organic packaging
substrates. On the other hand, the fine-pitch silicon interposer relies on the
damascene technique for Cu-wiring fabrication, which requires both
chemical–mechanical polishing (CMP) and dry etching processes. As their
fabrication involves fewer process steps and uses a panel format, organic
interposers will be less costly than silicon interposers and can offer much
larger interposers for high-performance ASIC and HBM integration.
Figure 2.5 shows the top view of a 38 × 30 mm organic interposer fabricated
with the build-up and pattern-plating processes.
As shown in Figure 2.5, four micropillar footprint patterns are included for
attaching the HBM DRAM die stacks. A close-up view of the pads for
micropillars is shown in Figure 2.6.
FIGURE 2.5
A top view of a 38 × 30 mm organic interposer manufactured. (Corner labels: TL,
TR, BL, BR.)

FIGURE 2.6
A close-up view of a 38 × 30 mm organic interposer on the pads for micro-pillars.
(Panel labels: TL (×100), TR (×100), BL (×100), BR (×100), (×500), (A), (B).)
the mechanical integrity of the microbumps after the plating process and
after aging at 150°C for 72 hours. The results for the bump shear test before
aging are also shown in Reference 3. The average shear force is 3.47 g/bump.
After aging, the average shear force decreased to 2.1 g/bump, and the growth
of the intermetallic compound (IMC) Cu6Sn5 is observed [3]. Another approach
to solder bump pitch reduction and density increase is the copper (Cu) pillar.
The Cu pillar technology was first introduced to
advanced logic devices in 2006 and has since been developed especially for
flip chip applications with a bump pitch below 100 μm. Copper pillars are
less prone to electromigration, which makes the technology a good choice for
applications with reduced bump pitches and sizes and increased current
densities. The Cu pillar interconnection used for 2.5D and 3D IC integration is
also referred to as micropillar interconnection. For example, the micropillar
array used to connect the current generation of HBM DRAM to interposers has a
pitch of 55 μm [15].
FIGURE 2.7
A schematic top view of the 3D SiP designed. (Labels: 50 mm, 38 mm, 30 mm,
50 mm, ASIC, HBM, HBM-M, organic interposer, build-up substrate.)

FIGURE 2.8
A schematic cross-sectional view of the 3D SiP designed.
A memory controller and PHY for the HBM were designed and implemented
for the host ASIC on the organic interposer. To shorten the development
cycle, the ASIC design including the IP for the HBM memory controller
and PHY was first implemented in a field-programmable gate array
(FPGA) device.
with a low-loss dielectric material (loss tangent of 0.005 at 10 GHz) and has a
coefficient of thermal expansion (CTE) closely matched to the CTE of the
dielectric material used in the packaging substrate. The substrate has an 800 μm
thick core to help minimize the warpage effects during reflow, whereas the
organic interposer has a 200 μm thick core. This new, low-loss dielectric
material used for the organic interposer allows ultrafine line spacing (line
width/spacing = 6 μm/6 μm), low transmission loss, and high insulation
reliability.
FIGURE 2.9
Insertion loss of the HBM channel simulated. (Horizontal axis: frequency,
0–20 GHz.)

FIGURE 2.10
Return loss of the HBM channel simulated. (Horizontal axis: frequency,
0–20 GHz.)

FIGURE 2.11
Loss due to cross-talk for the HBM channel simulated. (Horizontal axis:
frequency, 0–20 GHz.)

FIGURE 2.12
A schematic for setting up the time-domain analysis. (Read mode and write mode
panels, each with a PRBS31 stimulus, HBM IBIS model, package model, ASIC IBIS
model, and probe.)
For the time-domain analysis, multiple corners were simulated and only the
worst-case results are reported here. An HSPICE model was extracted from the
package design file and used for the simulations, along with I/O buffers for
the ASIC and HBM memories. The setup used in the HSPICE simulations is shown
schematically in Figure 2.12.
To analyze the effect of cross-talk, a PRBS31 (pseudo-random bit sequence)
stimulus is applied at the input, and the resulting sequence at the output is
compared with and without aggressors (one signal was treated as the victim,
with three aggressors on each side). Comparing the victim with and without the
aggressors shows the output data waveforms when the memory has low activity
versus when it has high activity. In READ MODE (see block diagram in
Figure 2.12), the worst-case skew due to cross-talk is ~46 ps, whereas in
WRITE MODE, the worst-case skew is ~42 ps.
Eye diagrams were also computed for the two modes. In READ MODE, with
cross-talk from aggressors, the eye has a height of ~900 mV and a width of
~800 ps, and the introduced jitter is about 200 ps. In WRITE MODE, with
cross-talk from aggressors, the eye has a height of ~1 V and a width of
~825 ps, whereas the introduced jitter is about 175 ps. All the results
mentioned here are worst-case results with cross-talk. As expected, the
worst-case results without cross-talk are slightly better. The time-domain
results agree well with the frequency-domain results.
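The PRBS31 stimulus mentioned above can be illustrated with a short sketch. The
chapter does not say how the sequence was generated inside HSPICE; the code
below assumes the commonly used generator polynomial x^31 + x^28 + 1,
implemented as a linear-feedback shift register. The function name `prbs31` and
the default seed are illustrative choices, not part of the original setup.

```python
def prbs31(n_bits, seed=0x7FFFFFFF):
    """Return n_bits of a PRBS31 sequence as a list of 0/1 integers.

    Assumption: generator polynomial x^31 + x^28 + 1 (the source does not
    state which generator was used for the stimulus).
    """
    mask = 0x7FFFFFFF              # keep the register at 31 bits
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of register stages 31 and 28 (bit indices 30 and 27).
        new_bit = ((state >> 30) ^ (state >> 27)) & 1
        state = ((state << 1) | new_bit) & mask
        bits.append(new_bit)
    return bits


if __name__ == "__main__":
    stimulus = prbs31(64)
    print("".join(str(b) for b in stimulus))
```

In a cross-talk study of the kind described here, the same bit pattern would
drive the victim in both runs, while the aggressors would be either held quiet
or driven with differently seeded patterns.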
FIGURE 2.13
Warpages measured for the HBM die stack from room temperature to 280°C. (Series: HBM_1 and
HBM_2; vertical axis: warpage in microns; horizontal axis: temperature, 25°C up to 280°C
and back to 25°C.)
FIGURE 2.14
Warpages measured for the organic interposer from room temperature to 260°C. (Series:
Interposer_1 and Interposer_2; vertical axis: warpage in microns; horizontal axis:
temperature, 25°C up to 260°C and back to 25°C.)
It can be seen that the warpage of the organic interposer was relatively
small (about 100 μm) and stayed in that range when the temperature was
increased up to 260°C.
Based on the thermal deformation analysis for each component of the
3D SiP, a series of design-of-experiments (DOE) studies was planned and
conducted to overcome the challenges in developing a suitable assembly
process for the ASIC and HBM 3D SiP. As the warpages of the silicon dice
and organic interposer were small and did not change with temperature,
the ASIC die and HBM DRAM die stacks were assembled onto the
interposer first to form the ASIC and HBM subassembly. Underfill
encapsulation was used to protect the joints made of micropillars and
regular bumps. A top view and a bottom view of the 3D SiP subassembly are
shown in Reference 5.
The structure of the 3D SiP subassembly was then analyzed using X-ray,
acoustic, and optical microscopy to confirm that good micropillar solder joints
were formed. Cross-sectional pictures of the micropillar joints attaching the
HBM die stacks can be found in Reference 5.
In the final assembly process step, the ASIC and HBM subassembly was
attached to the package substrate using the conventional C4 solder bump
interconnection. The C4 bumps were then encapsulated using the underfill
material. A top view of the finished 3D SiP (without the lid) is shown in
Figure 2.15 and a cross-sectional view of the finished 3D SiP (with the lid) is
shown in Figure 2.16.
FIGURE 2.15
A top view of the finished 3D SiP.
FIGURE 2.16
A cross-sectional view of the finished 3D SiP. (Labels: ASIC (FPGA), HBM, micro bumps,
interposer, C4 bumps, substrate, and BGA.)
FIGURE 2.17
A top view of the application board with a 3D SiP module in the test socket.
different built-in memory tests within the ASIC memory controller design.
The Perl script calls a Windows program and the necessary drivers that con-
trol a USB to I2C protocol conversion board that sits between the Application
Board’s I2C interface and the PC’s USB port. Through the ASIC’s built-in tests
and via the I2C/Perl script platform, the electrical interface between PHY
and memory is tested for general connectivity, cross-talk between lines, and
simultaneously switching data lines. Defective cells in the memory are also
detected through the tests.
• Thin die.
• TSV.
• Chip-to-chip (C2C) or chip-to-wafer (C2W) joining and interconnect.
• Micropillar or microbump for C2C or C2W connection.
• Die/interposer backside (BS) RDLs.
A summary of the new elements introduced with the 2.5D and 3D IC integra-
tion technologies and their implications on reliability of the final products is
included in Table 2.3.
TABLE 2.3
New Elements Introduced by 2.5D and 3D IC Integration
Element Failure Mechanism
The JEDEC publication JEP158, titled “3D Chip Stack with Through-Silicon
Vias (TSVS): Identifying, Evaluating and Understanding Reliability Interactions”
[16], was developed about a decade ago when the industry started to develop
2.5D and 3D IC integration for commercial applications. It was intended as
a guideline describing the extension of the standard tests to 3D IC
components. The main element of this extension is the addition of appropriate
test structures to evaluate the reliability of the TSVs and other new features
introduced in the fabrication of 3D IC products. JEP158, together with JEDEC
standard JESD47, “Stress-Test-Driven Qualification of Integrated Circuits” [17],
is used as a starting point for technology qualification to assess the
intrinsic reliability of the TSV and micropillar interconnections, which are
new elements introduced by 2.5D and 3D IC integration.
Besides component-level reliability evaluation leveraging JEP158 and
JESD47, system-level and board-level validations are also needed. The
following is an example of board-level reliability evaluation of a 3D IC
component for networking applications. A test board was designed to mimic the
actual system board. Four large body-size, high-performance Cisco ASIC
packages were included on the reliability test board along with the two
3D IC components. Figure 2.18 shows a top view of the fully assembled
test board.
FIGURE 2.18
A top view of the reliability test board assembly for the ASIC and 3D IC components.
(The two 3D IC components are labeled in the photograph.)
TABLE 2.4
Board-Level Reliability Evaluation of a 3D IC Package
Test Condition Sample Size Results
FIGURE 2.19
A top view of the reliability test board assembly for the 3D SiP with ASIC and HBM DRAM.
FIGURE 2.20
A front view showing the wiring and setup of test boards inside the environmental chamber.
(The boards are numbered 1 to 6 in the photograph.)
A 0°C–100°C accelerated temperature cycling test was used for the
board-level reliability test. The temperature profile inside the chamber was
adjusted to minimize temperature variation from board to board. A measured
temperature profile is given in Reference 18. The temperature cycling ran for
6000 cycles. Except for a few very early fails on some chains in the
sub-100-cycle range, all the other daisy chains and test features monitored
passed 6000 cycles with no failures.
2.7 Summary
To meet the requirements of next-generation Information and Communication
Technology (ICT) systems, several 2.5D and 3D IC integration or packag-
ing technology platforms have been developed. 3D-stackable memory,
large-size silicon/organic interposers, and microbump interconnection
are key-enabling technologies for realizing 2.5D and 3D IC integration.
Leveraging the existing supply chain and infrastructure of high-performance
flip-chip packaging substrates, a 3D SiP is designed and manufactured that
includes a large-size organic interposer with a total of 12 Cu-wiring layers,
an ASIC die, and 4 HBM DRAM stacks.
A lot of progress has been made in the last decade through technology devel-
opment and process improvement. Several 2.5D and 3D SiP-packaging plat-
forms have emerged and appear to be ready for production.
References
1. Priest, J., M. Ahmad, L. Li, J. Xue, and M. Brillhart, Design optimization of a high
performance FCAMP package for manufacturing and reliability, Proceedings
55th Electronic Components and Technology Conference, Orlando, FL, May 2005,
pp. 1497–1501.
2. Priest, J. and L. Li, Challenges in substrate design, assembly, and reliability of
SiP package for a high end networking application, 39th International Symposium
on Microelectronics, IMAPS, San Diego, CA, October 2006, pp. 8–12.
3. Li, L., S. Peng, J. Xue et al., Addressing bandwidth challenges in next genera-
tion high performance network systems with 3D IC integration, Proceedings of
the 62nd Electronic Components and Technology Conference (ECTC), San Diego, CA,
May 2012, pp. 1040–1046.
4. Lakka, S., Xilinx SSI technology, Hot Chips: A Symposium on High Performance
Chips, Palo Alto, CA, August 2012.
5. Li, L., P. Chia, P. Ton, et al., 3D SiP with organic interposer for ASIC and mem-
ory integration, Proceedings of the 66th Electronic Components and Technology
Conference (ECTC), Las Vegas, NV, June 2016, pp. 1445–1450.
6. JEDEC Standard, Addendum No. 3 to JESD79-3: 3D Stacked SDRAM,
JESD79-3-3, December 2013.
7. JEDEC Standard, Wide I/O single data rate (wide I/O SDR), JESD229, December
2011.
8. JEDEC Standard, Wide I/O 2, JESD229-2, August 2014.
9. Hybrid Memory Cube. Retrieved from http://hybridmemorycube.org/specification-v2-download-form/
CONTENTS
3.1 Introduction
3.2 Architecture, Design, and Product Enablement
3.2.1 Key Challenges
3.2.1.1 Limited Connectivity and Bandwidth
3.2.1.2 Excessive Latency
3.2.1.3 Power Penalty
3.2.2 Xilinx-Stacked Silicon Interconnect Technology
3.2.2.1 Creating Field-Programmable Gate Array Die Slices with Microbumps for Stacked Silicon Integration
3.2.2.2 Silicon Interposer with Through-Silicon Vias
3.2.2.3 Three-Dimensional Integration Chip Analysis Methodology Enablement with Simulation Program with Integrated Circuit Emphasis Simulators
3.2.2.4 Silicon Interposer Signal Integrity
3.2.3 Three-Dimensional Integration Chip Resource-Rich Field-Programmable Gate Arrays Product Offerings
3.3 Stacked Silicon Technology Development and Package Reliability
3.3.1 Key-Enabling Technologies
3.3.1.1 Silicon-Grinding Quality Optimization
3.3.1.2 Wafer Edge and Bevel-Cleaning Optimization
3.3.1.3 Silicon-Footing Improvement
3.3.2 Three-Dimensional Integration Chip Development Test Vehicles
3.3.2.1 28 nm Test Vehicle-Driven Process and Reliability Improvements
3.3.2.2 Improvements to 20 nm Test Vehicle
3.3.2.3 Board-Level Reliability Test Vehicle and Results
3.3.3 Three-Dimensional Integration Chip Reliability Look-Ahead Assessments
3.3.3.1 Continuous Process Yield Improvements
3.4 Potential Three-Dimensional Integration Chip Future Challenges
Acknowledgment
References
3.1 Introduction
As the role of the field-programmable gate array (FPGA) becomes more
significant in larger and more complex system designs, it demands higher logic
capacity and more on-chip resources and functionalities. Until the 40 nm node,
FPGAs depended predominantly on Moore’s law scaling to respond to
this need, delivering nearly twice the logic capacity with each new process
generation. However, keeping pace with today’s high-end market demands
requires more than Moore’s law can provide.
To respond to these requirements, Xilinx introduced a three-dimensional
integration chip (3D-IC) stacked silicon interconnect technology (SSIT)
for building FPGAs that offer bandwidth and capacity that cannot be
realized by traditional Moore’s law scaling (Figure 3.1). SSIT uses silicon
interposers with microbumps (μBumps) and through-silicon vias (TSVs) to
integrate multiple FPGA die slices in a single package.
Starting in 2006 (Figure 3.2), Xilinx worked with research consortia,
equipment vendors, foundry, and outsourced assembly and test (OSAT) partners
to develop TSV, μBump, bonding, and 3D-IC integration technologies.
Xilinx’s first 3D-IC SSIT FPGA, the Virtex®-7 2000T, was introduced in late
2011 mainly for emulation applications and has been well adopted by the industry.
3D-IC devices have since expanded into networking, data center cloud
computing, and high-performance computing (HPC) in volume production
and account for more than 30% of the 20 nm/16 nm products offered.
The largest SSIT FPGA as of today is the UltraScale XCVU440, with 4.4 million
logic cells, 600 k μBumps, and 19 billion transistors in a 55 mm package; it has
been shipping in volume production since early 2016.
This chapter describes Xilinx’s 3D-IC development experiences gained
by launching process learning vehicles, developing design simulation
methodology, pursuing early-stage reliability assessment, and optimizing
the process baseline. These development efforts laid the foundation
for a robust production line. Finally, the potential future 3D-IC
technology challenges are outlined.
FIGURE 3.1
FPGA capacity doubling at the 28 nm generation enabled by 3D-IC. (Horizontal axis: year,
1998–2014; data labels range from 2.7E+04 to 2.0E+06.)
FIGURE 3.2
FPGA 3D-IC history: from concept to world’s first programmable products on 28 nm.
(Timeline milestones include the start of stacked silicon interconnect development; the
90 nm, 65 nm, and 28 nm test vehicles completed; initial reliability look-ahead;
reliability assessment; design enablement and supply chain validation; design tools
available; design validation; and the world’s first 3D-stacked silicon interconnect device.)
FIGURE 3.3
Stacked silicon interconnect package top view. (Labels: Virtex®-7 2000T, FPGA dies,
Si interposer, and package.)
FIGURE 3.4
FPGA tiles built with ASMBL™ architecture.
FIGURE 3.5
FPGA die or SLR (super logic region) optimized for stacked silicon integration.
FIGURE 3.6
Passive silicon interposer with die-to-die interconnect wiring.
FIGURE 3.7
Transparent view of the assembled die stack and die-to-die interconnects using silicon
interposer. (Labels: ASMBL optimized FPGA slice; additional slices side-by-side; silicon
interposer glue; high-density stitching with >10K connections and ∼1 ns latency.)
FIGURE 3.8
Cross-section view of FPGA-silicon interposer integration on a package substrate.
(Labels: package substrate and BGA.)
FIGURE 3.9
Module-based netlist for 3D-IC integration chip with two dice from different technologies.
the major advantage of the new approach over the traditional one is its capa-
bility to maintain libraries and netlist files to help reduce memory usage and
simulation errors.
The example netlist in Figure 3.9 illustrates a simple 3D-IC integration chip
with two dice, instantiated as “xic1” and “xic2,” that use different
technologies and operate at different temperatures.
Each design team is expected to exercise the full design verification flow
for each IC module before the 3D-IC integration. As each integrated IC mod-
ule is regarded as a completed product, retaining the IC module netlist integ-
rity becomes a primary requirement. New HSPICE netlist features are then
introduced to target the multiple-die and multitechnology integration for
both heterogeneous and homogeneous designs [4].
New 3D-IC-specific configurations and analysis requirements are introduced
as well, such as multidie multitechnology integration, verification with
multiple operational IC temperature domains, and an exponentially increasing
number of corner simulations for multidie integration, because each IC module
can have its own simulation corner.
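The exponential growth in corner simulations can be made concrete with a small
sketch. With k corners per die and n dice, an exhaustive sweep needs k^n runs.
The die names follow the “xic1”/“xic2” instances of Figure 3.9; the corner
labels (`ss`, `tt`, `ff`) and the helper `corner_combinations` are hypothetical
and serve only as an illustration, not as the HSPICE flow itself.

```python
from itertools import product


def corner_combinations(dice, corners):
    """Enumerate every assignment of one corner to each die."""
    return [dict(zip(dice, combo)) for combo in product(corners, repeat=len(dice))]


if __name__ == "__main__":
    corners = ["ss", "tt", "ff"]      # hypothetical corner labels
    two_die = corner_combinations(["xic1", "xic2"], corners)
    print(len(two_die))               # 3 corners on 2 dice: 3**2 runs
    for assignment in two_die:
        print(assignment)
    # Adding a third die (hypothetical "xic3") multiplies the count by 3 again.
    print(len(corner_combinations(["xic1", "xic2", "xic3"], corners)))
```

Adding independent temperature domains per die multiplies the count further,
which is why an exhaustive sweep quickly becomes impractical as dice are added.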
For both prelayout and postlayout design analysis and verification, the
new SPICE simulators offer 3D-IC-specific methodology for functional veri-
fication, timing verification, and power analysis.
The flow chart in Figure 3.10 provides the steps for 3D-IC integration simu-
lations using Synopsys’ 3D-IC simulation features. The new 3D-IC simula-
tion features and the methodology offer an efficient simulation solution for
timing closure of multidie and multitechnology integration under the con-
ventional SPICE simulation environment.
FIGURE 3.10
3D-IC SPICE simulation flow. (The only step label recoverable from the flow chart is
“Run HSPICE/CustomSim simulations.”)
FIGURE 3.11
Signal Type I (vertical to package pin, as “a”) and Type II (interdie, as “b” and “c”).
FIGURE 3.12
Simulation versus measurement comparison for (a) effective capacitance and (b) effective
conductance. (Vertical axes: capacitance in fF and conductance in S; horizontal axes:
frequency, 0–30 GHz.)
FIGURE 3.13
Simulation versus measurement comparison for insertion loss. (Vertical axis: insertion
loss in dB, with ticks at 0 dB and −1.0 dB.)
(14 GHz) is −0.414 dB for the 20 Ω-cm silicon substrate and −0.822 dB for the
10 Ω-cm silicon substrate. The higher-resistivity silicon substrate
provides lower loss over the entire frequency range.
Based on these measurements and simulation results, we conclude that a
20 Ω-cm high-resistivity silicon substrate is preferred for very high-speed sig-
naling applications. Higher than 20 Ω-cm resistivity silicon substrates are also
available in the industry if there is a need to further reduce TSV insertion loss.
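To put the two insertion-loss figures in perspective, the sketch below converts
them from decibels to linear ratios. It assumes the quoted values are
20·log10 of the transmitted-to-incident voltage ratio, which is the usual
convention for insertion loss but is not stated explicitly in the text; the
function names are illustrative.

```python
def db_to_voltage_ratio(loss_db):
    """Convert a loss in dB to a voltage (amplitude) ratio."""
    return 10 ** (loss_db / 20)


def db_to_power_ratio(loss_db):
    """Convert a loss in dB to a power ratio."""
    return 10 ** (loss_db / 10)


if __name__ == "__main__":
    # Values quoted in the text at 14 GHz.
    for label, loss_db in [("20 ohm-cm", -0.414), ("10 ohm-cm", -0.822)]:
        v = db_to_voltage_ratio(loss_db)
        p = db_to_power_ratio(loss_db)
        print(f"{label}: voltage ratio {v:.3f}, power ratio {p:.3f}")
```

Under this assumption, the 20 Ω-cm substrate passes roughly 95% of the signal
amplitude at 14 GHz, compared with roughly 91% for the 10 Ω-cm substrate.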
FIGURE 3.14
Even-mode coupling: (a) without side shields and (b) with side shields. (Vertical axis:
voltage, V.)
FIGURE 3.15
Type II signal wire length distribution.
whereas 85% of the traces are less than 3.75 mm in length. Clocking between
the GTZ and its neighboring SLR is system synchronous. Careful balancing
of clock networks on these two pieces of silicon across process, voltage, and
temperature is necessary. Static timing analysis (STA) verifies interdie tim-
ing efficiently. However, during XC7VH580T development, STA tools did not
accept RLC interconnect models. As a result, interdie propagation delay and
transition time values were calibrated between SPICE simulations with 3D
RLC interconnect models and STA with RC interconnect models. The layout
ensures that Type II signaling in the XC7VH580T does not operate in
transmission-line mode, so STA with RC-based interconnects suffices.
FIGURE 3.16
MEOL soft reveal process flow scheme: (0) i-wafer with TSV, (1) microbump, (2) carrier
mount, (3) silicon thinning, (4) Si recess etch, (5) P-capping, (6) CMP contact open,
(7) C4 bumping, and (8) carrier demount. (Labels: 65 nm 4 L, TSV with 105 μm depth,
dicing tape, and frame.)
FIGURE 3.17
Defect improvement before and after postgrind surface optimization.
FIGURE 3.18
Defect types (a) foreign material, (b) Sulfur hexafluoride (SF6) by-product, and (c) pre-existing
particle masking defect.
FIGURE 3.19
Die yield improvement comparing (a) without bevel clean, (b) bevel wet clean, and (c) bevel
wet + dry clean.
FIGURE 3.20
(a) Interposer wafer edge failures caused by nonoptimized etching revealed, (b) center and
edge revealed TSVs after Si gradient etch implementation, and (c) no failures after etch
optimization. (Panels (a) and (c) are wafer maps; panel (b) labels: silicon, TSV, Cu,
BS PASV, and oxide.)
FIGURE 3.21
Milestone of PDK delivery and test vehicle development for new technology node and 3D-IC.
FIGURE 3.22
A schematic of Xilinx 28 nm 3D-IC TV (code named “Bali”). (Labels: Bali top dies,
interposer, substrate, and BGA; dimensions shown: ∼7.5 mm, ∼24.7 mm, ∼27.1 mm, and
∼32.6 mm.)
FIGURE 3.23
A typical 3D-IC physical FA working flow. (Steps shown: electrical test; X-ray inspection;
C-SAM (confocal scanning acoustic microscopy); TDR (time domain reflectometry); OBIRCH
(optical beam-induced resistance change); and delayer/X-section.)
FIGURE 3.24
One typical FA case using X-ray and C-SAM inspection approaches.
FIGURE 3.25
Warpage behavior during microbump joining for different integration schemes. (Schemes
shown: CoWoS, CoW-Last, and CoS; vertical axis: maximum die-level warpage (μm); labeled
values: ∼100 μm and ∼−150 μm; other labels: FPGA, UF, UF1, UF2, organic substrate, and
concave warpage.)
FIGURE 3.26
Demonstration of one typical microbump open and short FA case, respectively.
FIGURE 3.27
20 nm 3D-IC TV with 3 DMV top dies on X-interposer containing ~375 K μBumps. (Labels:
DMV TD (three dies), top die seal ring, interposer, and interposer die seal ring;
dimensions shown: 14.56 mm, 23.25 mm, 24.7 mm, and 45.3 mm.)
very thick, high-layer count boards (high stress on BGA), and long life time
expectation. In Section 3.3.2.3, we briefly discuss the vehicle and results seen
in board level 0°C–100°C thermal cycling and Shock and Bend.
FIGURE 3.28
20 nm 3D-IC board-level reliability (BLR) test vehicle.
The first failure without a heat sink occurred at 8619 cycles, and the
characteristic life (63.2% failure) was reached at 10,792 cycles. Dye-and-pry FA
on the first failed unit indicated that the failure was caused by a solder ball
crack at a package corner ball, as shown in Figure 3.29. All failed units had
their first failure at corner solder balls and not at a C4 bump or μBump. This
agreed with the simulation, which showed that the X-interposer package has BLR
characteristics similar to those of a large-die flip-chip package, where corner
balls fail early and should be considered critical. With a heat sink, the first
failure occurred sooner, at 5092 cycles, and the characteristic life was reached
at 7565 cycles. However, the failure mode remained the same: corner BGA balls
failed, and there was no impact on elements within the package such as the
μBumps, C4 bumps, FPGA die ELK, interposer, TSVs, and substrate.
FIGURE 3.29
(a) Passing sample, no corner ball crack (with six corner depopulated pads). (b) Failed sample
with corner ball crack (with six corner depopulated pads); the failing ball is marked.
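The “63.2% failure” attached to the characteristic life matches the scale
parameter of a Weibull distribution, for which the cumulative failure fraction
at t = η is 1 − 1/e ≈ 63.2% regardless of the shape parameter. The sketch below
assumes a two-parameter Weibull model, which the text does not state
explicitly; the shape values β used in the loop are hypothetical, because the
chapter does not report them.

```python
import math


def weibull_cdf(cycles, eta, beta):
    """Cumulative failure fraction of a two-parameter Weibull model."""
    return 1.0 - math.exp(-((cycles / eta) ** beta))


if __name__ == "__main__":
    eta_no_heat_sink = 10792    # characteristic life quoted in the text
    eta_heat_sink = 7565        # characteristic life quoted in the text
    for beta in (2.0, 5.0, 10.0):   # hypothetical shape parameters
        frac = weibull_cdf(eta_no_heat_sink, eta_no_heat_sink, beta)
        print(f"beta={beta}: failure fraction at eta = {frac:.3f}")
    # Ratio of characteristic life with and without the heat sink.
    print(f"life ratio = {eta_heat_sink / eta_no_heat_sink:.3f}")
```

On these numbers, adding the heat sink reduces the characteristic life to about
70% of the value measured without it.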
The package was also assembled on a Shock and Bend board and tested
without a heat sink. The board is 185 × 185 mm² in area and 3.2 mm thick. It
has 16 metal layers with high-speed, low-loss dielectric material. Board design,
assembly, and test are based on JESD22-B110A for Shock and JESD22-B113
for Bend. The failure criterion for Shock is a 10% increase in daisy-chain
net resistance, and for Bend it is a 20% increase in net resistance. The
Shock test passed both the 100 G level of Condition C and an extended
test at 125 G; for Bend, the global strain to failure ranged from 1775 to
2359 microstrains.
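The two resistance-based failure criteria can be expressed as a simple check.
This is a minimal sketch: the thresholds come from the text, while the function
names, the example resistance values, and the choice to treat an increase equal
to the threshold as a failure are assumptions made for illustration.

```python
# Relative daisy-chain resistance increase that counts as a failure (from the text).
THRESHOLDS = {"shock": 0.10, "bend": 0.20}


def resistance_increase(r_initial, r_measured):
    """Relative increase in daisy-chain net resistance."""
    return (r_measured - r_initial) / r_initial


def is_failure(test, r_initial, r_measured):
    """Return True if the net exceeds the failure criterion for the given test."""
    return resistance_increase(r_initial, r_measured) >= THRESHOLDS[test]


if __name__ == "__main__":
    # Hypothetical readings in ohms: a 12% rise fails Shock but not Bend.
    print(is_failure("shock", 1.00, 1.12))
    print(is_failure("bend", 1.00, 1.12))
```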
FIGURE 3.30
HTS 150°C 1000 hours failing μBump shown in (a) and an improved one shown in (b) (post HTS
150°C 4000 hours of stress).
FIGURE 3.31
Plan view of interposer die highlighted with power-ground lines. (Label: die sawing.)
FIGURE 3.32
(a) ~10%–15% C4 bridging was detected at the four corners of the XC7VH870T. (b) The C4
bridging was eliminated after dummy dies were included in the CoW integration.
1. The cost-down tendency and margin of the 3D-IC: It will become a key
factor to promote 3D-IC adoption by new markets other than emula-
tion, hyperscale data center, and networking.
2. The 3D-IC capacity and/or supply chain expansion to keep pace with the
adoption demand: Today’s limited capacity is expected to be con-
sumed by one or two major fabless design houses. More capacity
investment is pursued by foundry and OSATs but the pace needs to
accelerate.
3. The need for an innovative/effective 3D-IC development learning vehicle:
How can a test vehicle be defined that is as close as possible to the
future 3D-IC product during the development period of a new
technology node?
For vertical stacking integration, the challenges of identifying the
technology, design/architecture, and power/thermal space are more pronounced.
As advanced silicon technology scaling becomes more challenging from both a
technical and an economic standpoint, 3D vertical stacking is expected to
gain more interest and momentum in the industry.
Acknowledgment
The authors would like to express our appreciation for the great support
from Xilinx’s architecture, design, reliability, and FA teams and our partners
TSMC’s 3D-IC development team, TSMC BTSD team, and SPIL CRD Team.
Particular thanks go to our management team for their full commitment and
support throughout the program.
References
1. K. Saban, Xilinx stacked silicon interconnect technology delivers breakthrough
FPGA capacity, bandwidth, and power efficiency, Xilinx White Paper: Virtex-7
FPGAs, WP380 (Initial v1.0) October 2011 and (updated v1.2), December 11,
2012.
2. L. Madden, E. Wu, N. Kim, B. Banijamali, K. Abugharbieh, S. Ramalingam, and
X. Wu, Advancing high performance heterogeneous integration through die
stacking, in Proceedings of the IEEE 38th European Solid-State Circuits Conference
(ESSCIRC), Bordeaux, France, 2012, pp. 18–24.
3. C. Erdmann, D. Lowney, A. Lynam, A. Keady, J. McGrath, E. Cullen,
D. Breathnach et al., A heterogeneous 3D-IC consisting of two 28 nm FPGA die
and 32 reconfigurable high-performance data converters, IEEE Journal of Solid-
State Circuits, 50(1): 258–269, 2015.
4. S. Wu, J. Wei, and H. Lam, A new SPICE simulation approach for 3D IC integra-
tion, SNUG 2013.
5. N. Kim, D. Wu, D.W. Kim, A. Rahman, and P. Wu, Interposer design optimiza-
tion for high frequency signal transmission in passive and active interposer
using through silicon via (TSV), in Proceedings of the 2011 IEEE 61st Electronic
Components and Technology Conference (ECTC), Lake Buena Vista, FL, 2011,
pp. 1160–1167.
6. N. Kim, D. Wu, J. Carrel, J.H. Kim, and P. Wu, Channel design methodology for
28Gb/s SerDes FPGA applications with stacked silicon interconnect technol-
ogy, in Proceedings of the 2012 IEEE 62nd Electronic Components and Technology
Conference (ECTC), San Diego, CA, 2012, pp. 1786–1793.
CONTENTS
4.1 Introduction
4.2 Past Challenges in Three-Dimensional Integration
4.3 Challenges in Three-Dimensional System Integration
4.4 Challenges in Three-Dimensional Heterogeneous Integration
4.5 Challenges toward Future Three-Dimensional Integration
4.6 Summary
References
4.1 Introduction
The 3D integration technology using through-silicon vias (TSVs) has
progressed significantly in recent years, as represented by 3D-stacked
dynamic random-access memory (DRAM) such as the hybrid memory
cube (HMC) and high bandwidth memory (HBM) [1,2]. The 3D-stacked
structure is also employed in complementary metal–oxide–semiconductor
(CMOS) image sensors (CIS) [3]. In addition to 3D-stacked DRAMs
and 3D-stacked image sensors, heterogeneous 3D integration technology
has attracted increasing attention because it is indispensable for the future
Internet of Things (IoT). In heterogeneous 3D integration,
different kinds of chips such as microelectromechanical systems (MEMS),
sensor, photonic device, and spintronic device chips are stacked on CMOS
chips. Embedded devices in IoT require low power consumption, a small form
factor, and multifunctionality, all of which heterogeneous 3D integration
can provide. We have developed new heterogeneous
3D integration and system integration technologies using self-assembly
FIGURE 4.1
Cross-sectional structure of 3D LSI with TSVs (a) and cross-sectional view of TSVs (b).
(Panel (b) shows a poly-Si TSV, via first (1995); a W-TSV, via middle (2001); a Cu-TSV,
via middle (2006); and a Cu-TSV, via last (2007).)
FIGURE 4.2
Photomicrographs of fabricated 3D-stacked image sensor test chip with poly-Si TSVs.
(Labels: parallel processing; AMP and ADC; shift register; configuration of one
processing unit; and array of processing units to form one frame.)
FIGURE 4.3
Configuration of 3D-stacked image sensor system module for advanced driver assistance
systems (ADAS). (Labels: 3D-stacked image sensor with image sensor chip, analog chip,
ADC chip, and interface chip; Si interposer; and 3D-stacked processor.)
FIGURE 4.4
Configuration of one image signal processing element in 3D-stacked image sensor. (a) Structure
of stacked image sensor and (b) configuration of one circuit block.
ADC with digital noise cancellation circuit. The ADC was designed with a
standard 90-nm 1-Poly 9-Metal CMOS technology [20]. The three-dimensional
structure of the designed 3D-stacked image sensor is depicted in Figure 4.5a.
Figure 4.5b shows an X-ray CT scan image and an SEM cross-sectional view of the
fabricated 3D-stacked image sensor. The X-ray CT scan image clearly shows that
four layers with many TSVs are vertically stacked. The die size of each layer is
5 × 5 mm², and each layer has approximately two thousand TSVs. The
thickness of the Si substrate and the diameter of the TSVs are approximately
50 and 5 μm, respectively. We confirmed the successful operation of the
fabricated prototype 3D-stacked image sensor.
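The geometry quoted above implies a TSV aspect ratio and an average TSV
density, worked out in the short sketch below. It assumes that each TSV spans
the full 50 μm substrate thickness and that the TSVs are counted over the whole
5 × 5 mm² die; neither point is stated explicitly in the text, and the function
names are illustrative.

```python
def tsv_aspect_ratio(depth_um, diameter_um):
    """Depth-to-diameter ratio of a TSV."""
    return depth_um / diameter_um


def tsv_density_per_mm2(tsv_count, die_width_mm, die_height_mm):
    """Average number of TSVs per square millimetre of die area."""
    return tsv_count / (die_width_mm * die_height_mm)


if __name__ == "__main__":
    # Approximate values quoted in the text for one layer of the stack.
    print(tsv_aspect_ratio(50, 5))              # 50 um deep, 5 um diameter
    print(tsv_density_per_mm2(2000, 5, 5))      # ~2000 TSVs on a 5 x 5 mm die
```

Under these assumptions the TSVs have an aspect ratio of about 10:1, at an
average density of about 80 TSVs per mm².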
A prototype 3D-stacked dependable multicore processor was fabricated by
the 3D integration technology with the backside-via [22–23]. The conceptual
FIGURE 4.5
Three-dimensional display of designed 3D-stacked image sensor by computer graphics (a)
and X-ray CT scan image and SEM cross-sectional view of fabricated 3D-stacked image
sensor (b). (Labels: first-layer and second-layer TSVs, Cu TSV, M1, Si substrate, and
metal μ-bump; scale marks: 50 μm and 5 μm.)
FIGURE 4.6
Conceptual structure (a) and functional circuit block diagram (b) of 3D-stacked multicore
processor with self-test and self-repair function. (Labels include external memory,
tier 0, self-test/self-repair layer, test access port (3D TAP), system bus, and
memory bus.)
FIGURE 4.7
A circuit block diagram and a die photo of core processor (a) and X-ray CT scan image and SEM
cross-sectional view of fabricated 3D-stacked multicore processor (b). (Labels: tier 0,
tier 1, TSV, multilevel Cu wiring, TSVs for 3D-stacked memory, and μ-bump; scale marks:
8 μm and 50 μm.)
FIGURE 4.8
An example of heterogeneous 3D LSI. (The 3D super chip stacks, from top to bottom, a MEMS
chip, sensor chip, CMOS RF-IC, MMIC, power IC, control IC, logic LSI, flash memory,
DRAM, SRAM, and microprocessor.)
FIGURE 4.9
Production procedure for heterogeneous 3D LSIs in reconfigured wafer-to-wafer (R-W2W) 3D
integration technology.
FIGURE 4.10
Concept of self-assembly and electrostatic temporary bonding and a photo of 8-inch wafer
with hydrophilic areas and hydrophobic areas after simultaneously supplying liquid droplets
with four sizes on hydrophilic areas. (Labels: hydrophilic and hydrophobic areas, dielectric
layer, KGD, water droplet, SAE carrier, bipolar electrode, and wafer.)
known good dies (KGDs) onto a carrier wafer with high alignment accuracy
and high throughput. The surface tension of a liquid droplet is used in
self-assembly to align many dies simultaneously, as shown in Figure 4.10.
Hydrophilic and hydrophobic areas are formed on the surface of the
wafer or chip. We have succeeded in simultaneously aligning five hundred
chips with an average alignment accuracy of 0.5 μm within 0.1 second.
[Figure 4.11 labels: Cu sidewall interconnection, MEMS chip, LSI chip, substrate]
FIGURE 4.11
A photo of heterogeneous 3D LSI test chip where a pressure sensor MEMS chip is integrated
on CMOS chip, which is stacked on an active interposer wafer.
[Figure 4.12, process steps 1 to 9. Labels: NCF, μ-bump, TSV (φ 8 μm), KGD, top chip, hydrophobic area, water, SiO2, die stacking, wafer, 45 μm thick chip]
FIGURE 4.12
Fabrication process flow to fabricate 3D LSIs by a reconfigured wafer-to-wafer (R-W2W) inte-
gration technology. 1—Lamination, 2—dicing, 3—first-layer die release, 4—self-assembly
(alignment), 5—die-bonding, 6—Wafer thinning/TSV and μ-bump formation, 7—second-
layer die release, 8—self-assembly and die-bonding, and 9—completed 3D stacking.
[Figure 4.13, panels (a) to (d). Labels: Si substrate, metal, insulator, supporting substrate, copolymer, oxide liner (DSA guide), TEOS liner, copolymer A, copolymer B; scale values: 10 μm, 2.5 μm, 200 nm]
FIGURE 4.13
New concept of TSV formation based on advanced directed self-assembly (DSA) with nano-
composites consisting of diblock copolymers and nanosized metal particles. (a) Thinning of
Si substrate, (b) formation of Si deep hole (RIE), (c) deposition of diblock copolymer including
metal nanodots with low melting point, and (d) nanophase separation.
proposed a new hybrid bonding method for chip-to-chip and chip-to-wafer
bonding that uses a novel inorganic anisotropic conductive film
(i-ACF) comprising ultrahigh-density Cu nanopillars (CNPs) in an alumina
matrix. SEM images of the i-ACF, in top view and cross-sectional view,
are shown in Figure 4.14. The i-ACF is formed by anodic oxidation of an
aluminum film followed by Cu electroplating [36]. The diameter and
pitch of the Cu nanopillars are 60 and 100 nm, respectively. To evaluate
the electrical characteristics of Cu–Cu joints bonded through the
i-ACF, we fabricated a test chip with 4.3 million Cu electrodes.
The size and pitch of the Cu electrodes are 3 and 6 μm, respectively.
The cross-sectional image of the fabricated test chips bonded using the i-ACF
is also shown in Figure 4.14. The test chip and the interposer chip are
electrically connected by the ultrahigh-density CNPs. We confirmed that all
4.3 million Cu joints with an electrode size of 3 μm were connected through
the ultrahigh-density CNPs [37–38]. The joint resistance was approximately
30 mΩ per joint.
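As a rough illustration of the scale implied by these dimensions, the sketch below estimates how many Cu nanopillars sit under one electrode and how much area the electrode array covers. The pitches, electrode size, and electrode count are taken from the text; the square electrode shape, the square pillar lattice, and all function names are assumptions made only for this estimate.

```python
# Values quoted in the text.
CNP_PITCH_NM = 100          # Cu nanopillar pitch
ELECTRODE_SIZE_NM = 3_000   # Cu electrode size (3 um)
ELECTRODE_PITCH_UM = 6.0    # Cu electrode pitch
NUM_ELECTRODES = 4_300_000  # Cu electrodes on the test chip


def cnps_per_electrode(electrode_size_nm: int, cnp_pitch_nm: int) -> int:
    """Pillars under one electrode, ASSUMING a square electrode and a square pillar lattice."""
    per_side = electrode_size_nm // cnp_pitch_nm
    return per_side * per_side


def electrode_array_area_mm2(num_electrodes: int, pitch_um: float) -> float:
    """Area covered by the array, ASSUMING one electrode per pitch-by-pitch cell."""
    return num_electrodes * pitch_um ** 2 / 1e6  # 1 mm^2 = 1e6 um^2


pillars = cnps_per_electrode(ELECTRODE_SIZE_NM, CNP_PITCH_NM)
area_mm2 = electrode_array_area_mm2(NUM_ELECTRODES, ELECTRODE_PITCH_UM)
print(f"Approx. CNPs per electrode: {pillars}")
print(f"Approx. electrode array area: {area_mm2:.1f} mm^2")
```

Under these assumptions each joint is carried by several hundred pillars in parallel, which may help explain why bonding does not require aligning individual pillars to electrodes.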
[Figure 4.14 labels: CNP, alumina, i-ACF; scale values: 100 nm, 20 μm]
FIGURE 4.14
SEM images of i-ACF film with the top view and the cross-sectional view (a) and cross-sectional
image of fabricated test chip bonded using i-ACF film (b).
4.6 Summary
We have long been developing various 3D integration technologies using TSVs
and metal microbumps. In addition, we have developed various
3D system-on-chips and heterogeneous 3D LSIs using self-assembly and
electrostatic bonding. Furthermore, we are working to develop new
technologies, such as DSA TSV and CNP hybrid bonding, for future 3D LSIs with
high-density TSVs and microjoints.
References
1. J. Jeddeloh and B. Keeth, Hybrid memory cube new DRAM architecture
increases density and performance, Digest of Technical Papers, Symposium on VLSI
Technology, pp. 87–88 (2012).
2. D. U. Lee et al., A 1.2V 8Gb 8-channel 128GB/s high-bandwidth memory
(HBM) stacked DRAM with effective microbump I/O test methods using
29nm process and TSV, IEEE International Solid-State Circuits Conference (ISSCC),
pp. 432–433 (2014).
3. S. Sukegawa et al., A 1/4-inch 8Mpixel back-illuminated stacked CMOS image
sensor, IEEE International Solid-State Circuits Conference (ISSCC), pp. 484–485
(2013).
4. Y. Akasaka and T. Nishimura, Concept and basic technologies for 3-D IC struc-
ture, IEEE International Electron Devices Meeting (IEDM), pp. 488–491 (1986).
23. T. Fukushima et al., New chip-to-wafer 3D integration technology using hybrid
self-assembly and electrostatic temporary bonding, IEEE International Electron
Devices Meeting (IEDM), pp. 789–792 (2012).
24. H. Hashimoto et al., Highly efficient TSV repair technology for resilient 3-D
stacked multicore processor system, IEEE International 3D System Integration
Conference (3DIC) (2013).
25. T. Fukushima et al., New three-dimensional integration technology based on
reconfigured wafer-on-wafer bonding technique, IEEE International Electron
Devices Meeting (IEDM), pp. 985–988 (2007).
26. T. Fukushima et al., Three-dimensional integration technology based on recon-
figured wafer-to-wafer and multichip-to-wafer stacking using self-assembly
method, IEEE International Electron Devices Meeting (IEDM), pp. 349–352 (2009).
27. T. Fukushima et al., Self-assembly technology for reconfigured wafer-to-wafer
3D integration, IEEE Electronics Components and Technology Conference (ECTC),
pp. 1050–1053 (2010).
28. K.-W. Lee et al., A cavity chip interconnection technology for thick MEMS chip
integration in MEMS-LSI multichip module, Journal of Microelectromechanical
Systems, 19(6): 1284–1291 (2010).
29. K.-W. Lee et al., Three-dimensional hybrid integration technology of CMOS,
MEMS, and photonics circuits for optoelectronic heterogeneous integrated sys-
tems, IEEE Transactions on Electron Devices, 58(3): 748–757 (2011).
30. T. Tanaka et al., Ultrafast parallel reconfiguration of 3D-stacked reconfig-
urable spin logic chip with On-chip SPRAM (spin-transfer torque RAM),
Symposia on VLSI Technology and Circuits (VLSI2012), pp. 169–170 (2012).
31. Y. Ito et al., Development of highly-reliable microbump bonding technology
using self-assembly of NCF-covered KGDs and multi-layer 3D stacking chal-
lenges, IEEE Electronics Components and Technology Conference (ECTC), pp. 336–
341 (2015).
32. H. Kikuchi et al., Tungsten through-silicon via technology for three-dimensional
LSIs, Japanese Journal of Applied Physics, 47(4): 2801–2806 (2008).
33. K.-W. Lee et al., Effects of electro-less Ni layer as barrier/seed layers for high
reliable and low cost Cu TSV, International 3D System Integration Conference
(3DIC) (2014).
34. M. Murugesan et al., High density 3D LSI technology using W/Cu hybrid TSVs,
IEEE International Electron Devices Meeting (IEDM), pp. 139–142 (2011).
35. T. Fukushima et al., New concept of TSV formation methodology using directed
self-assembly (DSA), IEEE International 3D System Integration Conference (3DIC)
(2016).
36. K. Yamashita et al., Copper-filled anodized aluminum oxide - a potential material
for low temperature bonding for 3D packaging, ICEP-IAAC, pp. 571–574 (2015).
37. K.-W. Lee et al., Novel reconfigured wafer-to-wafer (W2W) hybrid bonding tech-
nology using ultra-high density nano-Cu filaments for exascale 2.5D/3D inte-
gration, IEEE International Electron Devices Meeting (IEDM), pp. 185–188 (2015).
38. K.-W. Lee et al., Novel W2W/C2W hybrid bonding technology with high
stacking yield using ultra-fine size, ultra-high density Cu nano-pillar (CNP)
for exascale 2.5D/3D integration, IEEE Electronics Components and Technology
Conference (ECTC), pp. 350–355 (2016).
5
Wafer-Level Three-Dimensional
Integration Using Bumpless
Interconnects and Ultrathinning
Takayuki Ohba
CONTENTS
5.1 Introduction
5.2 Co-Engineering by 3D and 2D
5.2.1 Delay of Three-Dimensional Integration Technology
5.2.2 Economic and Technical Issues for Lithography
5.2.3 Co-Engineering Using Three Dimensional for Next Generation of Manufacturing
5.3 Bumpless Interconnecting and Wafer-Level Three-Dimensional Integration
5.3.1 Overview of Bumpless Interconnects
5.3.2 Thickness of Wafer and Ultrathinning
5.4 Details of Wafer-Level Three-Dimensional Process
5.4.1 Thinning Module
5.4.2 Stacking Module
5.4.3 Through-Silicon Via Module
5.4.4 Packaging Module (Singulation/Packaging Module)
5.5 Device Characteristics after Ultrathinning
5.5.1 Retention Time Change of Dynamic Random-Access Memory after Thinning
5.5.2 Ultrathinning and Estimation of Critical Thickness
5.5.3 Cu-Diffusion Phenomenon at Ground Surface
5.6 Characteristics of Low-Aspect Ratio Through-Silicon Via Interconnects
5.6.1 Step Coverage and Cu Diffusion
5.6.2 Stresses in Cu Through-Silicon Vias
5.6.3 Electrical Characteristics
5.1 Introduction
Integrated circuits (ICs) based on planar technology started in the 1960s and
have led to today’s huge semiconductor industry and will be a key technology
in realizing a global infrastructure for the Internet-of-Things (IoT) and the
Internet-of-Everything (IoE) [1]. Twenty years after the advent of ICs, the con-
cept of three-dimensional integration (3DI) was proposed. The beginning of
3DI was the so-called transistor-based 3DI in the front-end-of-line (FEOL)
for stacking complementary metal–oxide–semiconductor (CMOS) devices
(a monolithic-stacked structure of n-MOSFETs [metal–oxide–semiconductor
field-effect transistor] and p-MOSFETs) to fabricate high-density ICs. In the 2000s,
packaging-based 3DI, such as chip-on-board (COB) and chip-on-chip (COC)
with wire bonding, was developed to fabricate high-performance electronics.
A typical product is a system-in-package (SiP) consisting of a stack of 4–5
chips for mobile applications. Due to these two different approaches, that is,
transistor- or packaging-based 3DI, some technical misunderstandings often
occurred when people heard the term 3DI technology. For instance, current
interconnecting ICs in back-end-of-line (BEOL) for microprocessing units
(MPUs) or graphics processing units (GPUs) were realized by 12–14 level Cu/
low-k multilevel interconnects. This is obviously a 3D structure and comes
from the requirements for high device performance and density. Packaging-
based 3DI consists of a chip-based stack after singulation of the wafer, and
many fabrication methods have been introduced, which has complicated and
confused attempts to classify 3DI technologies. In any case, attempts
to apply the early generation of 3DI were delayed. This is because
conventional miniaturization, involving scaling in planar and/or 2D processes, had
succeeded in reaching the submicron level and had made possible inexpensive ICs
with high density and high capacity. Wafer enlargement, such as from 8 to 12 inches
in diameter, also helped to reduce chip costs.
Interest in packaging-based 3DI technology using wafer-level processing
has been increasing again. This is driven by the physical and economic
limits of conventional scaling, which is no longer the mainstream answer
to the increasing demands for device performance, system form factor,
and total manufacturing cost. This chapter discusses a BEOL-compatible
Wafer-Level 3D Integration 87
required for EUV technology. Assuming that the lifetime sales for each
generation have been approximately 10 times the corresponding business investment,
the market size necessary to justify this investment is more than
20 billion USD. Given a total worldwide semiconductor market of 300 billion USD,
this expected market size for one product and one manufacturer is not
realistic. In short, this is the limit of 2D scaling in light of the economics of the
industry, and it is difficult at present to find a winning scenario.
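The arithmetic behind this estimate can be sketched as follows. The 10-times sales-to-investment ratio, the 20 billion USD market size, and the 300 billion USD total market are the figures quoted above; the investment value is only what that ratio implies, and the function names are illustrative.

```python
# Figures quoted in the text (billion USD).
SALES_TO_INVESTMENT_RATIO = 10.0  # lifetime sales are about 10 x the investment
REQUIRED_MARKET_BUSD = 20.0       # market size needed to justify the investment
TOTAL_MARKET_BUSD = 300.0         # total worldwide semiconductor market


def implied_investment(required_market: float, ratio: float) -> float:
    """Investment implied by a required market size and a sales/investment ratio."""
    return required_market / ratio


def market_share(required_market: float, total_market: float) -> float:
    """Fraction of the total market one product would have to capture."""
    return required_market / total_market


investment = implied_investment(REQUIRED_MARKET_BUSD, SALES_TO_INVESTMENT_RATIO)
share = market_share(REQUIRED_MARKET_BUSD, TOTAL_MARKET_BUSD)
print(f"Implied investment: {investment:.1f} billion USD")
print(f"Required share of total market: {share:.1%}")
```

A single product from a single manufacturer would therefore need to capture roughly one fifteenth of the entire semiconductor market, which is the sense in which the text calls the expectation unrealistic.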
FIGURE 5.1
A comparison of wiring length and layout for 2D and 3D chip sets. Miniaturizing the layout
using 3DI provides low-power consumption, higher bandwidth, and higher integration.
[Figure 5.2, panels (a) and (b). Labels: (a) controller logic, PHY, interposer, PKG substrate, stack height ∼500 μm; (b) bumpless and ultrathin WOW/COW technology, TSV, CPU/GPU/SoC, 8 × DRAM, controller logic, PHY, interposer, PKG substrate, stack height <200 μm; user requirements: form factor, energy efficiency, and cost]
FIGURE 5.2
A comparison of (a) bump and (b) bumpless interconnects using TSVs for 3D memory stack
structures, assuming memory stacks containing four memories + controller (five stacks) and
eight memories + controller (nine stacks). Bumpless interconnects used for the ultrathin WOW
process can be formed with higher density (narrower pitch) compared with TSVs and bumps
due to the limitations of bump size and pitch.
[Figure 5.3 labels: thinning module (wafer-edge trimming, temporary adhesive coating, bonding onto support glass), support glass or silicon, Al or Cu pad, device wafer 1, exposure, development, 5∼10 μm diameter]
FIGURE 5.3
Bumpless interconnects using ultrathinning, TSVs, and wafer-on-wafer (WOW) process flow.
Vertical interconnects for TSVs are formed after wafer bonding from the front side. Additional
wafers can be stacked on top without any limitation on the number of wafers. These modules
can also be applied to chip-on-wafer (COW) integration. On-chip and off-chip TSV, respec-
tively, represent bumpless interconnects formed in the device area and the area around
devices, including gap fill materials in COW.
[Figure 5.4, panels (a) and (b): FEM stress maps of Cu TSVs through stacked Si and BEOL layers (Si1 to Si4). Labels: aspect ratio = 1 with 20 μm thick Si, 225 MPa; aspect ratio = 5 with 100 μm thick Si, 525 MPa; color scale from −600 MPa (compressive) to 900 MPa (tensile)]
FIGURE 5.4
Stress simulation using the finite element method (FEM) for Si thicknesses of 100 (b) and 20 μm
(a) after stacking three wafers and forming a TSV with a diameter of 30 μm. A 10 μm Cu/low-k
BEOL layer is formed on every wafer surface, and thus the depths of the TSVs are 110 and
30 μm, respectively.
metal filling, decreases to about 1/5 at most, and the step coverage improves
significantly. A reduced TSV length is also considerably advantageous
for data transfer and power distribution with high energy efficiency.
In addition, with the use of small TSVs, the stress induced by the CTE
mismatch between Cu and Si decreases with decreasing aspect ratio of the
TSV, as shown in Figure 5.4 [9]. The stress at the center of a Cu plug is
525 MPa in a 100 μm thick wafer and decreases to 225 MPa in a 20 μm thick
wafer. The small aspect ratio provided by an ultrathin wafer also has the
advantage of reducing stresses generated in the silicon itself, at the bottom
and top of the Cu TSV, and in interface regions having different CTEs.
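To put the two simulated stress values in proportion, the short sketch below computes the relative reduction between them. Only the 525 and 225 MPa values come from the text; the function name is illustrative, and the calculation says nothing about how stress varies between these two thicknesses.

```python
# Simulated stress at the center of the Cu plug, as quoted in the text.
STRESS_100UM_MPA = 525.0  # 100 um thick wafer
STRESS_20UM_MPA = 225.0   # 20 um thick wafer


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


reduction = relative_reduction(STRESS_100UM_MPA, STRESS_20UM_MPA)
print(f"Stress reduction from thinning: {reduction:.1%}")
```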
[Figure 5.5 plot: viscosity (MPa.sec, log scale from 1E+02 to 1E+11) versus temperature (0 to 300°C); annotated misalignment values: <1 μm, >10 μm, 80 μm]
FIGURE 5.5
Relationship of misalignment of wafer bonding and viscosity of permanent adhesive material
as a function of temperature.
escaping from the adhesive after the bonding process would form a cavity
(void) in the adhesive layer, measures should be taken to prevent this, such
as preheating after applying the adhesive or performing the bonding process
under a reduced pressure.
[Figure 5.6, panels (a) to (f). Labels: Si #1, Si #2, adhesive, M1 RDL, M2 RDL trench, SiN, PR, TSV hole, PECVD-SiN, Ti liner, ECD-copper, M2 RDL, TSV]
FIGURE 5.6
TSV formation and damascene Cu plug processes. After bonding of the thinned wafer to another
wafer surface, RDL (redistribution line) patterning, TSV etching, Cu plug formation, and pla-
narization by CMP are carried out. (a) Wafer level bonding, (b) TSV PR & Si DRIE, (c) PECVD,
(d) SiN self-align etching, (e) PVD metal barrier and Copper ECD, and (f) planarization.
[Figure 5.7 plot: etching time (min, 0 to 200) versus aspect ratio (AR, 1 to 20) and depth (μm, 5 to 100); regions of long etching time are marked "NO!" and the target wafer thickness is indicated. Plot annotation: Process time = $(\mathrm{AR})^{2} d^{3} / (D^{2} R)$, assuming an etching rate R of 10 μm/min at a diameter D of 50 μm and a mass-transport-limited etching rate, where AR is the aspect ratio, d is the depth, and D is the diameter.]
FIGURE 5.7
Silicon etching time versus aspect ratio of TSV and depth, assuming that the total etching vol-
ume follows the mass transport limit reaction. Etching time increases with increasing depth
and aspect ratio.
[Figure 5.8 labels: RDL, TSV, chain structure, Si = 10 μm; scale value: 50 μm]
SEM images of Cu TSVs after ECD-Cu deposition and CMP. For ECD-Cu, to reduce polishing
time in CMP, the overburden of Cu was controlled to be thin. Chain structure shows the cross-
section of Cu pad and damascene TSV for two stack wafers.
[Figure 5.9, panels (a) and (b). (a) Plot: BCB void area (%) versus dishing depth (nm, 0 to 500), with labels glass, BCB (2 μm), and void; (b) IR image and microscope image of 2WOW, with labels Si1, Si2, and BCB]
FIGURE 5.9
Occupied void area (BCB) formed on the Cu pad as a function of the dishing depth
of the Cu pad (a). Top-view images of the second wafer surface after Cu CMP,
taken with IR and optical microscopes (b).
[Figure 5.10, panels (a) and (b). (a) TEM images (scale bars 100 nm) with labels Si, SiON, Cu, Ti/TiN, SiOC, and crack, and plots of leakage current (A, 10^-13 to 10^-3) versus V (V, 0 to 20) without anneal and after annealing at 200°C, 300°C, and 400°C; (b) SEM image (scale bar 10.0 μm) with labels SiOC and Si substrate and measured widths of 5.85, 5.78, 5.45, and 5.39 μm]
FIGURE 5.10
A comparison of leakage currents in two types of TSV samples made by Bosch etching and
direct etching (a). Cracks are observed in the side walls of the TSV in the Bosch-etched sample,
which had a rough interface called scalloping. The leakage current was measured as a function
of applied voltage after annealing at temperatures up to 400°C. The leakage current increased
with increasing temperature and was two orders of magnitude higher for Bosch etching.
SEM images of a TSV etched through the Cu/Low-k BEOL layer and Si after optimizing the
scalloping shape are shown in (b).
[Figure 5.11 labels: thinned Si wafer; scale value: 20 μm]
FIGURE 5.11
SEM cross-sectional image of a seven-wafer stack after dicing. No cracks or delamination of
the bonding material were found.
FIGURE 5.12
Thermal stress testing at temperatures of –65°C to 150°C for 100 cycles, observed by scanning
acoustic tomography (SAT).
[Figure 5.13 plots: cumulative bit failure (a.u.) versus switching charge (a.u., 0.6 to 1.4) and versus refresh time (a.u., 1 to 10000), and electrical properties (Ion/Ioff, junction leakage), each before and after thinning; image labels: BCB, 5 μm]
FIGURE 5.13
Device characteristics before and after wafer thinning for ferroelectric random access mem-
ory (FeRAM), MPU (SRAM), and DRAM. In the FeRAM, the quantity of residual polarization
charge (Qsw) was evaluated; in the SRAM, the on- and off-current, and leakage current; and in
the DRAM, the retention time. There were no significant changes in these characteristics after
wafer thinning down to 9 and 4 μm.
[Figure 5.14, panels (a) and (b). (a) Wafer grinding schematic with labels grinder wheel, Si, adhesive, support glass, chuck table, inclination angle, initial angle, first step, and TTV measurement using noncontact gauge (NCG); (b) plot: TTV (μm, 0.0 to 2.5) versus inclination angle (a.u.)]
FIGURE 5.14
Improvement in total thickness variation (TTV) using so-called auto-TTV process employing NCG
(noncontact gauge) methods (a) and TTVs for 300 mm wafer with various inclination angles (b).
[Figure 5.15, panels (a) and (b). Plots of S parameter (about 0.530 to 0.555) versus (a) positron energy (keV, 0 to 30) and (b) depth (nm, 0 to 200); annotations: 200 nm for No. 5, 5 nm for No. 10]
FIGURE 5.15
Doppler broadening spectrum of positron annihilation spectroscopy analyses for wafer
backside after thinning. (a) S parameter as a function of incident positron energy E for fine
grinding (Samples No. 5 and 9) and CMP (Samples No. 10 and 12). The number in terms
such as “Coarse75” represents 75 μm coarse grinding. (b) Depth profiles of S parameter for
fine grinding and CMP. The S parameter is normalized by that of Sample No. 12, which
is assumed as a reference. Thicknesses of the defect layer were ~200 and 3–4 nm for fine
grinding and CMP, respectively, observed by TEM analysis, which agrees with the depth
profiles.
[Figure 5.16 plot: normalized retention time (0 to 1.2) versus Si thickness (μm, log scale from 1 to 100); curve labeled "w/ Cu contamination"]
FIGURE 5.16
Retention time of DRAM as a function of Si thickness. Cu was formed at a density on
the order of 10^13 atoms/cm2 at the back side after thinning. Then wafers were annealed
at 250°C for 60 minutes. Normalized retention time was the time at 80% yield. Wafer
map represents Si thickness, where the average thickness and TTV were 2.66 and 1.6 μm,
respectively.
[Figure 5.17 labels: HAADF, BF, EDX (Cu K), device layer, Si, Cu; scale values: 1.0 μm, 2 μm, 200 nm]
FIGURE 5.17
Cross-sectional TEM image of DRAM wafer after grinding and Cu contamination. Cu aggregation
at the back side of the Si surface was observed by HAADF-STEM and EDX analyses.
[Figure 5.18 plot: Cu concentration (atoms/cm3, log scale from 1.00E+14 to 1.00E+21) versus depth (μm, 0.0 to 1.0); measured profiles without anneal (w/o ANL) and after annealing at 200°C, 300°C, 800°C, and 1000°C, with simulated Gaussian profiles for each annealing temperature]
FIGURE 5.18
Backside SIMS depth profiles of Cu in Si after back-side grinding for various annealing tem-
peratures, and Gaussian distribution of Cu into Si (dotted line) calculated using the diffusion
kinetic model.
[Figure 5.19, panels (a) and (b). Labels: Ti/TiN, SiOC, crack; scale bars: 100 nm]
FIGURE 5.19
TEM cross-sectional image of TSV after SiON/Ti/TiN/Cu formation. Scallop shape at the side
wall and crack propagation at the SiON layer were found for Bosch etching (a). No crack was
observed for direct etching, even after annealing (b).
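[Figure 5.20 plot: thickness of Si(O)N (t, nm; log scale from 10 to 10000) for Cu concentrations of 10^10 and 10^15 atoms/cm3; inset: Si(O)N/Si TSV with Cu, dielectric thickness t, and diffused Cu in Si; data labels: LT SiON (high dense), HT SiN]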
FIGURE 5.20
Critical thickness for Cu barrier of dielectric layer as a function of film density. Density repre-
sents relative value with respect to that of bulk Si3N4 (3.44 g/cm3).
into the Si substrate. This suggests that the effective Cu area decreases with
an increase in the thickness of the dielectric, which limits the scalability of the
Cu TSV diameter; for example, a Cu TSV diameter of only 1 μm remains with a
3 μm thick dielectric.
[Figure 5.21, panels (a) and (b). (a) Bump: labels bump, PI, UBM, Cu/Low-k BEOL, delamination, wafer 1; (b) bumpless: labels Cu TSV, wafer 2, wafer 1, Cu/Low-k BEOL; scale bars: 20 μm]
FIGURE 5.21
Cross-sectional SEM images of (a) bump and (b) bumpless TSV formed on Cu/Low-k BEOL
interconnects.
[Figure 5.22 plot: ratio of adhesive thickness to Si thickness (TAdhesive/TSi, 0.0 to 0.5) versus aspect ratio of TSV (2 to 8), with induced-stress contours of 400 MPa (× 1), 600 MPa (× 1.5), 800 MPa (× 2), 1 GPa, and 1.2 GPa (× 3); data points: bumpless TSV and reported structures from Intel, Leti, Samsung, Elpida, Fraunhofer, ASET, RTI, and IMEC; inset: stress map (−100 to 1300 MPa, tensile) of a Cu TSV with SiN, Si, adhesive, and Cu/Low-k BEOL on Si sub.]
FIGURE 5.22
Stress at Cu/Low-k interconnects layer versus the ratio of thickness of adhesive layer to thick-
ness of Si as a function of aspect ratio of TSV.
[Figure 5.23 plot: cumulative probability (%, 0.01 to 99.99) versus resistance (kΩ, 0 to 120) for 251k dense chains and 31k sparse chains, with and without TSVs; test structures: (a) w/o TSV, BEOL 0.4 μm dense chain; (b) w/ bumpless TSV, 10 μm multi-TSV; annotation: 0.21 mΩ/blocks]
FIGURE 5.23
Via chain resistance cumulative failure distribution of Cu BEOL interconnects with and with-
out Cu-TSVs. Schematic of test structure for electrical measurement: (a) 65-nm Cu intercon-
nects and (b) with multiple TSVs in two-wafer stack.
[Figure 5.24, panels (a) to (c). Labels: layers 0 to 8, BEOL, vertical interconnects (1st to 7th layer), TSV, Si, BEOL/device layer, adhesive/underfill, microbump]
FIGURE 5.24
Schematic diagrams of thermal resistance estimation for Si substrate having a BEOL layer and
TSV vertical interconnections with and without bumps: (a) nine-device stacked structure with
bumps, (b) details of TSV and bump with underfill, and (c) TSV (no bump) and ultrathin wafer
with organic bonding adhesive.
evaluated using bump and bumpless interconnects [24]. The thermal resis-
tance of a Si substrate having a BEOL layer and TSV vertical intercon-
nects with and without bumps was estimated, as illustrated in Figure 5.24.
There are polymer and composite materials, such as underfill in the case
of the bump interconnects and bonding adhesives in the case of bump-
less interconnects. The thermal resistance Rth of the vertical interconnects
and the total thermal resistance were calculated using the FEM and a ther-
mal network method, respectively. Assuming a stack of eight DRAMs and
one controller wafer, the thermal resistance was estimated by the follow-
ing sequence: estimating the effective thermal conductivity of each layer
and calculating the temperature rise using the thermal network method.
The thermal conductivities used were 148, 160.5, and 1.44 W/mK for Si, Si
with TSVs, and BEOL, respectively [25]. Figure 5.25 shows thermal resis-
tance as a function of normalized via occupancy. The thermal resistance
decreased as both the thicknesses of the underfill material and bonding
[Figure 5.25 plot: interconnection thermal resistance (Kcm2/W, log scale from 0.0001 to 100) versus normalized occupancy of via area (log scale from 0.001 to 1); curves: bump with gap = 50 μm and 25 μm, with and without underfill; bumpless TSV with adhesive = 5 μm and 2 μm]
FIGURE 5.25
Thermal resistance of interconnects as a function of normalized occupancy of via area, where
microbump and TSV (no bump) are compared.
Structures compared: microbump (51.4 μm pitch, 25 μm bump) and bumpless (512*16 TSV, 5 μm gap).
Each entry lists equivalent thermal conductivity (W/mK), thickness (μm), and thermal resistance (Kcm2/W).

Top chip (Rth1)
  Si: 148 W/mK; microbump: 150 μm, 0.049 Kcm2/W; bumpless: 5 μm, 0.00034 Kcm2/W
  BEOL: 1.44 W/mK, 15 μm, 0.104 Kcm2/W; 3.99 W/mK, 15 μm, 0.038 Kcm2/W
  Interconnection, microbump: 2.54 W/mK, 20 μm, 0.079 Kcm2/W
  Interconnection, bumpless: 2.56 W/mK, 5 μm, 0.020 Kcm2/W

DRAM chips 2–8, 7 layers (Rth2–8)
  Si with TSV: 160.5 W/mK; microbump: 50 μm, 0.003 Kcm2/W; bumpless: 5 μm, 0.00031 Kcm2/W
  BEOL: 1.44 W/mK, 15 μm, 0.104 Kcm2/W; 3.99 W/mK, 15 μm, 0.038 Kcm2/W
  Interconnection, microbump: 2.54 W/mK, 20 μm, 0.079 Kcm2/W
  Interconnection, bumpless: 2.56 W/mK, 5 μm, 0.020 Kcm2/W
FIGURE 5.26
Summary of total thermal resistance of TSV + microbump and TSV (no bump) structure.
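Several of the layer entries summarized in Figure 5.26 appear consistent with one-dimensional conduction through a layer, in which the resistance per unit area is thickness divided by thermal conductivity. The sketch below checks that relation for a few rows. It is a minimal illustration under that assumption, not the FEM or thermal network calculation used in the chapter; the function name and the choice of rows are illustrative.

```python
def areal_thermal_resistance(thickness_um: float, conductivity_w_per_mk: float) -> float:
    """One-dimensional conduction resistance t/k, returned in K*cm^2/W."""
    thickness_m = thickness_um * 1e-6
    resistance_k_m2_per_w = thickness_m / conductivity_w_per_mk
    return resistance_k_m2_per_w * 1e4  # 1 m^2 = 1e4 cm^2


# (label, thickness in um, conductivity in W/mK, value quoted in Figure 5.26)
LAYERS = [
    ("BEOL, k = 1.44", 15.0, 1.44, 0.104),
    ("BEOL, k = 3.99", 15.0, 3.99, 0.038),
    ("Microbump interconnection", 20.0, 2.54, 0.079),
    ("Bumpless interconnection", 5.0, 2.56, 0.020),
    ("Si, 5 um", 5.0, 148.0, 0.00034),
    ("Si with TSV, 5 um", 5.0, 160.5, 0.00031),
]

results = {label: areal_thermal_resistance(t, k) for label, t, k, _ in LAYERS}
for label, _, _, quoted in LAYERS:
    print(f"{label}: computed {results[label]:.5f}, quoted {quoted} K*cm^2/W")
```

Because the resistance scales linearly with thickness, thinning the Si and replacing a 20 μm bump gap with a 5 μm adhesive gap both reduce the per-layer contribution directly.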
[Figure 5.27 plot: temperature rise (°C, 0 to 25) versus device layer (base, 8 to 0) for microbump and bumpless structures]
FIGURE 5.27
Temperature rise for each layer in stacked devices in bump and no-bump (bumpless) structures.
[Figure 5.28 plot: DRAM density (Gb/cm2, log scale up to 10000); conventional scaling through the 7, 5, 3, and 1 nm nodes (high cost, lithography limit) compared with 3D integration by multistacking (4-stack/64 Gb) toward the terabit generation]
FIGURE 5.28
Trend of DRAM density using 2D conventional scaling and 3D multistacking using existing
DRAM. DRAM capacity in the 3D case corresponds to the number of stacked dies, assuming
that redundancy is eliminated by cell blocks at each layer.
[Figure 5.29, panels (a) and (b). (a) Four stacked wafers (n = 1 to 4) connected by multi-TSVs; (b) surface plot of Y3D (0.0 to 1.0) versus YS (0.4 to 0.9) and good-die combination m/4 (1/4 to 4/4)]
FIGURE 5.29
Schematic of stack wafers of a combination in four-level stacked wafer (a), and yield for
four-level 3D wafer stack and a comparison of good-die combination at die size = 1.148 cm2
and defect density D0 = 0.2/cm2 (b).
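For orientation, the sketch below evaluates a per-layer yield and a four-layer stack yield from the die size and defect density given in the caption. The Poisson yield model and the assumption that layers are stacked without selecting good dies are common textbook simplifications chosen here for illustration; the source does not state which yield model underlies Figure 5.29, and the function names are hypothetical.

```python
import math

DIE_AREA_CM2 = 1.148          # die size from the caption
DEFECT_DENSITY_PER_CM2 = 0.2  # D0 from the caption
NUM_LAYERS = 4                # four-level wafer stack


def poisson_die_yield(area_cm2: float, defect_density: float) -> float:
    """Single-layer die yield under an ASSUMED Poisson defect model."""
    return math.exp(-area_cm2 * defect_density)


def blind_stack_yield(layer_yield: float, num_layers: int) -> float:
    """Stack yield ASSUMING independent layers and no good-die matching."""
    return layer_yield ** num_layers


layer_yield = poisson_die_yield(DIE_AREA_CM2, DEFECT_DENSITY_PER_CM2)
stack_yield = blind_stack_yield(layer_yield, NUM_LAYERS)
print(f"Per-layer yield: {layer_yield:.3f}")
print(f"Four-layer stack yield without die matching: {stack_yield:.3f}")
```

The gap between the two numbers illustrates why the combination of good dies across layers matters for stacked-wafer yield.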
[Figure 5.30 labels: DRAM0 to DRAM3 and controller layers, each with channels up to CH15; TSV0 to TSV3 connecting CH/DRAM0 to CH/DRAM3 to the controller; positions (x, y, 0) to (x, y, 3) along the x, y, and z axes]
FIGURE 5.30
Schematic diagram of DRAM stack structure. Here, one memory die has 16 channels (CH0
to CH15) in total. Wafers following DRAM2, DRAM1, DRAM0, and controller are stacked to
DRAM3 (base wafer) using bumpless interconnects and a WOW process. Bumpless intercon-
nects are connected independently to the controller die from each channel of the DRAM layers.
of capital investment, and the chip cost per unit area is currently saturated,
even after taking account of shrinking chip sizes. In combination with
three-dimensional stacking to overcome such problems, a roadmap toward
high-density integration backed up by production costs can be made. This is
because the capital investment for 3D wafer processes is not high compared
to that of lithography. Moreover, keeping the wafer shape as-is for stacking
ensures compatibility with manufacturing facilities in front-end process-
ing and helps utilize the mature process technologies developed for wafer
processing. If the processes up to three-dimensional stacking can be handled
as units in the manufacturing line, the throughput will be 1/100th of that in
stacking starting with chips. Therefore, future semiconductor manufactur-
ing is expected to advance with a roadmap in which the number of stacked
wafers, the wafer thickness, and the number of TSV interconnects serve as
indices, as shown in Figure 5.31.
[Figure 5.31 plot: feature size (nm) and total Si surface (cm2) on a log scale; labels: feature size, multistack vertical interconnects, and 3D multistack of 300 mm and 450 mm wafers at × 4, × 8, × 16, and × 32]
FIGURE 5.31
Trends of two-dimensional (2D) scaling and wafer size including total Si surface of wafer stack.
Conventional scaling will face difficulties such as physical limits and inability to minimize
costs, whereas 3D integration will become superior to scaling. By combining conventional two-
dimensional integration (2DI) with three-dimensional stacking to overcome such problems
associated with device scaling and increasing wafer size, it is possible to make a roadmap
toward high-density integration backed up by production costs. In volume production, 3D
wafer stacking (WOW) enables a lower cost than chip-on-chip (COC) and high-density inte-
gration, reaching Terabit level. Bumpless interconnects using TSVs and ultrathinning provide
high-density I/Os connecting top and bottom device layers and achieve a small form factor
1/10th that of bump structures.
References
1. Cisco, 2013. http://www.cisco.com/c/dam/en_us/about/business-insights/docs/ioe-value-at-stake-public-sector-analysis-faq.pdf.
By comparison, the “Internet of Things” (IoT) refers simply to the networked
connection of physical objects (doesn’t include the “people” and “process”
components of IoE). IoT is a single technology transition, while IoE comprises
many technology transitions (including IoT).
2. T. Ohba, N. Maeda, H. Kitada, K. Fujimoto, K. Suzuki, T. Nakamura, A. Kawai,
and K. Arai, Microelectron. Eng., 87: 485–490, 2010.
3. T. Ohba, Electrochem. Soc. Trans., 34 (1): 1011–1016, 2011.
4. T. Ohba, Y. S. Kim, Y. Mizushima, N. Maeda, K. Fujimoto, and S. Kodama, IEICE
Electron. Expr., 12 (7): 1–14, 2015.
5. G. Moore, Electron. Mag., 38 (8), April 19, 1965.
6. J. U. Knickerbocker, P. S. Andry, B. Dang, P. R. Horton, M. J. Interrante,
C. S. Patel et al., IBM J. Res. Dev., 50 (4/5): 553–567, 2006.
7. M. Koyanagi, T. Nakamura, Y. Yamada, H. Kikuchi, T. Fukushima, T. Tanaka,
and H. Kurino, IEEE Trans. Electron Devices, 53 (11): 2799–2808, 2006.
8. E. Beyne, P. D. Moor, W. Ruythooren, R. Labie, A. Jourdain, H. Tilmans,
D. S. Tezcan, P. Soussan, B. Swinnen, and R. Cartuyvels, IEEE IEDM Technical
Digest, pp. 495–498, 2008.
9. H. Kitada, N. Maeda, K. Fujimoto, K. Suzuki, A. Kawai, K. Arai, T. Suzuki,
T. Nakamura, and T. Ohba, IEEE IITC, pp. 107–109, 2009.
CONTENTS
6.1 Introduction
6.1.1 Rationale for Three-Dimensional Integration
6.1.2 Rationale for Wafer-Bonding Technology
6.1.3 Description of Work
6.2 Chip-Level Three-Dimensional Integration with 22 nm Complementary Metal–Oxide–Semiconductor Technology
6.3 Wafer-Bonding Technology for Three-Dimensional Integration Stacking
6.3.1 Metal–Metal Bonding
6.3.2 Hybrid Bonding
6.3.3 Oxide Bonding
6.4 Oxide-Bonding Technology for Embedded Dynamic Random Access Memory Stacking
6.4.1 Oxide-Bonding Layer Preparation
6.4.2 Wafer-Level Three-Dimensional Integration with Oxide Bonding
6.5 Key Electrical Metrics Performance Results for Embedded Dynamic Random Access Memory Stacking
6.6 Summary
Acknowledgments
References
6.1 Introduction
It is well understood that device and interconnect scaling in semiconductor
technology is becoming increasingly difficult and more expensive with each
new technology node. The traditional model of continuous electric field
scaling, which for decades was the key to the success and growth of the
microelectronics industry [1] and which enabled Moore’s law, is no longer
applicable, at least not in a straightforward way. However, the semiconductor
industry is still expected to follow Moore’s law in the economic sense and to
continue delivering significant improvements in device and chip performance
with each new generation of technology, enabling yet more applications while
still supporting concomitant decreases in cost per function. The driving force
over the last two to three decades has been the incessant implementation of
innovations that enable the continuation, or even acceleration, of the
business model based on Moore’s law. Many new materials have been introduced
along with new advanced processes and increasingly complex integration schemes
and device architectures. This has undoubtedly been necessary as traditional
silicon-based materials and traditional complementary
metal–oxide–semiconductor (CMOS) architectures reached their physical limits
while scaling trends demanded structures increasingly in the nanoscale.
Furthermore, interconnect scaling has made resistance–capacitance (RC) delays
a major factor limiting overall system performance, which in the past was
gated primarily by transistor performance.
Collectively, these innovative approaches have driven tooling and fab costs
much higher and have increased the risk, cost, complexity, and time involved
in delivering technologies through the pipeline, from early research and
development (R&D) to manufacturing ramp-up and on to fully mature product. To
make matters worse, the performance gains with each technology node are not as
sizeable or straightforward as in the era of traditional scaling, which makes
the cost-benefit considerations for future investments even more challenging.
It is becoming increasingly apparent that revolutionary rather than
evolutionary solutions are needed to sustain, or even accelerate again, the
desired rate of combined improvement in the performance and cost of the
technology. This continued demonstration of technological prowess is still the
hallmark of the semiconductor industry. Combined with the cost efficiencies,
it is expected by both enterprise and individual customers and has been an
essential factor in the proliferation of semiconductor technology applications
across all areas of human life and enterprise.
the memory business and the backside image sensor business, the industry
is increasingly considering wafer-level stacked technology. It is noteworthy
that the scaling of TSVs depends on the aspect ratio of TSV diameter to the
thickness of thinned silicon wafer. It follows that the thinned silicon wafer
thickness needs to scale with the TSV diameter so as to achieve a fixed aspect
ratio as much as possible, and therefore all the related enabling processes
(bonding, thinning, etching, TSV liner deposition, and metal fill) must be
able to perform well to these specifications.
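The link between TSV diameter and silicon thickness can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the aspect ratio is taken as thinned-silicon thickness divided by TSV diameter, and that the TSV depth equals the silicon thickness (the BEOL and bonding layers that the via also crosses are ignored). The function names are hypothetical; the dimensions are the ones quoted in this chapter for the 5 μm TSVs in about 13 μm of silicon and the 1 μm interwafer TSVs in about 6 μm of silicon.

```python
def tsv_aspect_ratio(si_thickness_um, tsv_diameter_um):
    """Aspect ratio, taken here as silicon thickness (TSV depth) over TSV diameter."""
    if si_thickness_um <= 0 or tsv_diameter_um <= 0:
        raise ValueError("thickness and diameter must be positive")
    return si_thickness_um / tsv_diameter_um


def required_thickness_um(tsv_diameter_um, aspect_ratio):
    """Silicon thickness that keeps a fixed aspect ratio at a given TSV diameter."""
    if tsv_diameter_um <= 0 or aspect_ratio <= 0:
        raise ValueError("diameter and aspect ratio must be positive")
    return tsv_diameter_um * aspect_ratio


# Dimensions quoted in this chapter (approximate values).
ar_5um = tsv_aspect_ratio(13.0, 5.0)   # 5 um TSV, ~13 um silicon
ar_1um = tsv_aspect_ratio(6.0, 1.0)    # 1 um TSV, ~6 um silicon

# Hypothetical question: how thin must the silicon be at 1 um diameter
# to keep the same aspect ratio as the 5 um design?
thickness_at_1um = required_thickness_um(1.0, ar_5um)

print(f"aspect ratio, 5 um TSV: {ar_5um:.1f}")
print(f"aspect ratio, 1 um TSV: {ar_1um:.1f}")
print(f"thickness for fixed ratio at 1 um: {thickness_at_1um:.1f} um")
```

Under these assumptions, holding the aspect ratio fixed while shrinking the diameter demands much thinner silicon, which is why thinning, etching, liner deposition, and fill must all scale together.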
a laminate with seven layers of build-up circuitry on each core side. The
design features integrated TSVs, microbumps, back-end-of-line (BEOL) wir-
ing structures, and assembled controlled collapse chip connections (C4)
joints, with Cu TSVs integrated in the BEOL. Void-free TSVs were formed,
and the additional BEOL levels were fabricated after TSV processing and
subsequently planarized by chemical–mechanical planarization (CMP). The
types of bumps used to achieve interconnection were low solder volume
with Cu pillars for the top die to the bottom die, and high solder volume for
the bottom die to the laminate with the package exhibiting reliable assembly
process, high-quality connections, and very good thermal performance.
For the latter, wafer-bonding process and integration technology was
developed to achieve the stacking of high-performance POWER7™ cache
cores [9] that were built based on 45 nm silicon-on-insulator (SOI) technology
with embedded dynamic random access memory (EDRAM) and a total of 13
BEOL metal levels. For this work, copper TSVs of 5 μm in diameter were used
at a 13 μm pitch for the signal communication and the power delivery to the
stacked cache cores. The individual wafers, each featuring nine BEOL metal
levels, were bonded permanently to each other by using a low-temperature
oxide bonding. The topside wafers were thinned to about 13 μm using a com-
bination of mechanical grinding, chemical–mechanical polish, and dry etch-
ing. Subsequently the TSVs were formed by using conventional lithographic
alignment and definition techniques. Four additional metal levels were also
built post-bonding and TSV definition to complete the interconnection of the
chips in the two wafers and to enable testing. Electrical testing shows very
good performance and device stability.
[Figure 6.1 image labels: TSV, C4, underfill (UF), laminate, 22 nm CMOS bottom die.]
FIGURE 6.1
3D module with IBM’s 22 nm CMOS die (>600 mm2) joined face to face with microbumps, TSVs,
and C4s: (a) cross-section image and (b) optical image.
[Figure 6.2 panels: (a) cumulative percent versus resistance (Ω) for eight 61 μm pitch and four 131 μm pitch structures; (b) change in thermal resistance from time 0 (% of center perf.) at 0, 166, 412, 1000, and 1500 cycles.]
FIGURE 6.2
Reliability test results: (a) a cumulative distribution graph of electrical resistance including
microbumps, TSV, BEOL wiring, and the laminate wiring, and (b) thermal reliability of TIM1
(Rint) and die-to-die (Rd-d) interfaces through 1500 cycles of DTC (−55/+125°C) accelerated
stressing.
stacked die to the package lid through a thermal interface material level-1
(TIM1), also known as Rint, is monitored by implementing 25 top-chip sensors
that are referenced to the external thermocouple attached to the lid. The sen-
sors are classified based on the location on the die (center, middle, and edge)
to track potential thermal performance degradation during deep thermal
cycling (DTC) (−55°C to +125°C) by a TIM tearing, shearing, or loss of adhe-
sion mechanism.
TABLE 6.1
Reliability Stress Conditions for Chip-Level 3D Modules and Test Results
Cell  Stress  Condition           JEDEC Spec  Requirements        Quantity (Lots)  Result
A     TC-K    0/125°C             A104        1000 cycles/0 fail  31 (4)           Pass
B     TC-G    −40/125°C           A104        850 cycles/0 fail   13 (1)           Pass
C     THB     85°C/85% RH/3.6 V   A101        1000 hours/0 fail   30 (2)           Pass
D     HTS     150°C               A103        1000 hours/0 fail   13 (2)           Pass
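\[ R_{\mathrm{int}}\ (^{\circ}\mathrm{C/W}) = \frac{T_{i}(\text{top die}) - T_{\text{thermocouple}}}{\text{power}} \tag{6.1} \]
\[ R_{\mathrm{d\text{-}d}}\ (^{\circ}\mathrm{C/W}) = \frac{T_{i}(\text{bottom die}) - T_{i}(\text{top die})}{\text{power}} \tag{6.2} \]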
Thermal reliability is defined by the change in Rint or Rd-d over the duration
of an accelerated stress test, such as DTC. In this case, a cumulative 1500
DTC cycles were used, with thermal readouts at the 166, 412, 1000, and 1500
cycle points. As shown in Figure 6.2b, both Rint and Rd-d exhibited extremely
stable behavior throughout stressing. A maximum thermal degradation of <4°C
per 1000 W of power dissipation was observed at the end-of-stress 1500 cycle
readout.
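As a minimal sketch of how Equations 6.1 and 6.2 turn sensor readings into the tracked metrics, the following computes Rint and Rd-d and their percentage change relative to the time-0 readout. The temperatures and power below are hypothetical values chosen only for illustration, not measurements from this work, and the helper names are assumptions.

```python
def thermal_resistance(t_hot_c, t_cold_c, power_w):
    """Thermal resistance in degC/W: temperature difference divided by power."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return (t_hot_c - t_cold_c) / power_w


def r_int(t_top_die_c, t_thermocouple_c, power_w):
    """Equation 6.1: top-die sensor referenced to the lid thermocouple."""
    return thermal_resistance(t_top_die_c, t_thermocouple_c, power_w)


def r_dd(t_bottom_die_c, t_top_die_c, power_w):
    """Equation 6.2: bottom-die sensor referenced to the top-die sensor."""
    return thermal_resistance(t_bottom_die_c, t_top_die_c, power_w)


def percent_change(value, baseline):
    """Change relative to the time-0 baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# Hypothetical readouts (degC) at a hypothetical 100 W, before and after stress.
POWER_W = 100.0
r_int_t0 = r_int(70.0, 50.0, POWER_W)
r_dd_t0 = r_dd(78.0, 70.0, POWER_W)
r_int_end = r_int(70.4, 50.0, POWER_W)

drift_pct = percent_change(r_int_end, r_int_t0)
print(f"Rint at time 0: {r_int_t0:.3f} degC/W")
print(f"Rd-d at time 0: {r_dd_t0:.3f} degC/W")
print(f"Rint drift after stress: {drift_pct:+.1f}%")
```

Tracking the change relative to time 0, rather than the absolute value, isolates degradation of the interface from sensor-to-sensor offsets.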
process ensures the melting and diffusion of the lower melting point metal
atoms into the higher melting point metal atoms, thus forming stable inter-
metallic states. However, this technique also is of rather limited scalability
with respect to multiwafer stacking due to the inadequate thermal stabil-
ity of the bonds compared to what is actually needed to handle the rigors
of downstream wafer-level processing. The direct metal-to-metal thermal
compression bonding is, on the other hand, more promising for wafer-scale
bonding compared to other metal-to-metal wafer-bonding techniques. In
this technique, the metallic structures, such as Cu microstuds and pillars,
are built such that they protrude out of the wafer surface, typically after a
recess process for the surrounding dielectric is done [19]. To prevent
corrosion of these structures, it is essential to achieve a clean metal
surface before thermo-compression bonding. For this purpose, cleaning
processes are typically used to remove surface oxides and any other
impurities that may be detrimental to the bonding quality over time. With
this method, the advantage of direct electrical connection is retained, and
high bond energy is achieved for the metal–metal bond interfaces, with
stability sufficient to withstand downstream wafer-level processing. That
said, underfill is typically needed to guard against mechanical stability and
reliability risks, which adds complexity and cost. To support tight pitches
and small critical dimensions (CD), the bonding alignment overlay on a wafer
scale must be very accurate, all the more so for smaller CD interconnect
features. Another challenge stems from the thermal and compressive stress
involved in the bonding process itself, which can affect alignment
performance and process yield. The process is also typically of relatively
low throughput, which makes implementation in high-volume manufacturing less
likely.
FIGURE 6.3
(a) 3D profilometry imaging results for a (2 × 5) mm field, in 10 nm steps, before and after the
bonding film preparation over a challenging area of incoming wafer topography showing that
topography is improved to ~5 nm over the scanned range. (b) Atomic force microscopy image
showing the atomically smooth surface of the prepared bonding film stack.
atomically smooth surface with root mean square (RMS) roughness values
between 0.2 and 0.4 nm, as demonstrated in Figure 6.3b. Thus, a smooth
bonding surface is achieved as required for the bonding process. Prior work
has shown the value of using the two-layer bonding stack system.
Although the first TEOS-based layer is thick enough to accommodate
incoming topography, the bond strength that it can deliver upon bonding
is not optimal. However, the use of silane-based low-temperature oxide as
the topmost bonding layer provides better performance. The presence of
defective regions in the silane-based oxide, where multiple Si-OH groups
are concentrated, can accommodate H2O that evolves later during the inter-
facial condensation reactions that take place during the bonding process and
this results in higher bonding energy [23]. This interpretation is supported
by the fact that prebaking of these films can increase their bonding energy,
as any already absorbed H2O is removed, leaving room for H2O molecules
generated later during the post-bonding anneal. The fact that the bonding
energy is typically dependent on the N2 plasma power for these films sug-
gests that the surface density of Si-OH is key, and therefore silane-based
films are more suitable than TEOS-based films.
FIGURE 6.4
(a) Scanning acoustic microscopy image showing the absence of voids in the bonding between
two wafers with an edge area of the bonded wafer system magnified to show virtually no
defectivity. (b) Wafer-bonding overlay map based on measurements from cross-in-box
structures from 13 chips showing bonding overlay performance <2 μm. The inset is an image of
the overlay structures used.
FIGURE 6.5
Schematic representation of wafer-level 3D integration process, showing the use of a silicon
carrier wafer to stack a thinned device wafer on a bottom full-thickness device wafer. (a) FEOL,
(b) BEOL, (c) carrier attach, (d) thinning, (e) bond, carrier removal, and (f) TSV and BEOL.
[Figure 6.6 image labels: TSVs; 45 nm CMOS stratum-1 showing BEOL wire levels; 45 nm CMOS stratum-2 showing BEOL wire levels.]
FIGURE 6.6
SEM cross section of donor and acceptor device wafers after wafer-level bonding and removal
of handle wafer and subsequent TSV formation.
to fill the TSVs without voids. As a result, void-free TSVs were formed
successfully with a 5 μm CD at a 13 μm pitch, as shown in Figure 6.6 [4],
where a scanning electron microscopy (SEM) cross section of the joined and
TSV-interconnected donor and acceptor device wafers is shown. This was
followed by the fabrication of three additional copper-wiring levels on the
top thinned device wafer. It is essential for high-performance, low-voltage
applications to use low-resistivity Cu TSVs to limit the IR voltage drop.
Throughout the whole process flow that enabled the wafer-level integra-
tion, all processing and tools utilized were compatible with high-volume
manufacturing.
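The importance of low TSV resistance for the IR drop can be illustrated with simple arithmetic. The sketch below assumes identical TSVs sharing the supply current equally in parallel; the supply voltage, current, and TSV count are hypothetical. The 0.12 Ω figure is the typical per-link chain resistance reported in Figure 6.9, which includes link wiring and so overstates the bare TSV resistance.

```python
def ir_drop_v(current_a, r_tsv_ohm, n_parallel):
    """Voltage drop across n identical TSVs that share the current in parallel."""
    if n_parallel < 1:
        raise ValueError("need at least one TSV")
    return current_a * r_tsv_ohm / n_parallel


SUPPLY_V = 1.0        # hypothetical low-voltage supply
CURRENT_A = 2.0       # hypothetical current drawn by the stacked stratum
N_POWER_TSVS = 100    # hypothetical number of TSVs dedicated to power delivery

# Typical per-link value from Figure 6.9 (includes link wiring).
drop_low_r = ir_drop_v(CURRENT_A, 0.12, N_POWER_TSVS)
# Hypothetical fill with ten times the resistance, for comparison only.
drop_high_r = ir_drop_v(CURRENT_A, 1.2, N_POWER_TSVS)

for label, drop in (("0.12 ohm", drop_low_r), ("1.2 ohm", drop_high_r)):
    share = 100.0 * drop / SUPPLY_V
    print(f"{label} per TSV: {1e3 * drop:.1f} mV drop ({share:.2f}% of supply)")
```

Because the drop scales linearly with per-TSV resistance, any increase in fill resistivity must be paid for with more power TSVs or a smaller voltage margin.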
We should also note that this face-to-back wafer-level bonding integration
process can be repeated to create interconnected wafer multistacks by adding
more thinned wafers in the same fashion as the second wafer. With respect to
the extendibility of this technology to smaller diameter TSVs and thinner Si
strata, we have successfully demonstrated four-wafer strata structures that
feature 1 μm diameter interwafer TSV chain structures at a Si strata
thickness of ~6 μm, where each wafer also features intrawafer TSVs of 0.25 μm
diameter that connect the front and back sides of the same wafer [8]. The SEM
cross-section image of this prototype structure in Figure 6.7a shows a handle
silicon wafer, four strata of wafers with both interwafer and intrawafer
TSVs, and the associated TSV chain structures. For each stratum, the
intrawafer TSVs that eventually connect its front to its back are fabricated
from the front side while the wafer is at full thickness. Once these are
connected to the respective thin wire levels on the front side of the wafer,
the bonding layer is prepared and the wafer is then bonded to the handle
wafer (for the first stratum)
FIGURE 6.7
(a) SEM cross section of donor and acceptor device wafers after wafer-level bonding and
removal of handle wafer and subsequent TSV formation. (b) SEM cross sections showing the
formed TSV chain structures from wafer to wafer for evaluation for TSV performance.
or to stratum n−1 (for the nth stratum). In each case, the newly bonded wafer
is thinned to <7 μm to reveal the thin intrawafer TSVs from its back side.
For strata with n > 1, interwafer TSVs of 1 μm in diameter are fabricated
from the back side, using a process that etches through the remaining silicon
of the stratum, its complex front-end-of-line (FEOL)/BEOL dielectric stack on
the front side, the bonding layer, and the ILD stack on the back side of
stratum n−1. Subsequently, backside wiring is defined to complete the test
chains for the intra- and interstrata connections, respectively, as shown in
the SEM cross-section image in Figure 6.7b. The low resistivity of the Cu
achieved through the special TSV-filling processes, especially for the
intrawafer TSVs, is essential to meeting the performance requirements of the
key potential application, multistacked DRAM. This TSV design system can
allow for an unprecedented interconnection density, with very tight allowable
pitches. This opens a very promising integration route toward high-volume
manufacturing of low-power, high-density 3D DRAM stacks.
FIGURE 6.8
To accurately capture the effect of 3D stacking and TSV processing, (a) TSV chains, (b) TSV
banks, and (c) FETs in proximity to TSVs were used.
[Figure 6.9 panels: (a) measured resistance (Ω) versus number of links in the chain; (b) probability versus resistance (Ω).]
FIGURE 6.9
(a) Plot of measured median chain resistance versus number of links in TSV chain. The typi-
cal resistance per link was 120 mΩ, including the link wiring resistance. (b) Frequency plot of
number of dies measured versus measured resistance for a TSV chain consisting of 82 links.
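A per-link resistance of this kind is typically extracted as the slope of chain resistance versus number of links. The following is a minimal sketch of that extraction using an ordinary least-squares line fit; the data points are hypothetical, constructed to follow a 0.12 Ω per link trend with a small fixed offset, and are not the measurements behind Figure 6.9.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more matching x and y values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical median chain resistances (ohms) for chains of different length.
links = [10, 20, 40, 82, 100]
chain_resistance_ohm = [1.25, 2.45, 4.85, 9.89, 12.05]

# Slope: resistance added per link. Intercept: fixed probe and pad contribution.
per_link_ohm, offset_ohm = fit_line(links, chain_resistance_ohm)
print(f"resistance per link: {1e3 * per_link_ohm:.0f} mOhm")
print(f"fixed offset: {1e3 * offset_ohm:.0f} mOhm")
```

Fitting the slope across several chain lengths separates the per-link contribution from any fixed probing or pad resistance, which a single-chain measurement cannot do.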
[Figure 6.10 plot: horizontal axis is capacitance (F).]
FIGURE 6.10
Plot of measured TSV bank capacitance versus number of TSVs in the bank. The capacitance
measured per TSV was extracted as 80 fF per link.
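To give a feel for what these values imply for signaling, the sketch below combines the 80 fF per TSV reported here with the roughly 120 mΩ per link from Figure 6.9 in a lumped RC estimate. This is a simplification, not the authors' analysis: the TSV is treated as a single lumped resistor and capacitor, and the 1 kΩ driver resistance is a hypothetical value added only to show which term dominates.

```python
R_TSV_OHM = 0.120      # typical per-link resistance (Figure 6.9), includes wiring
C_TSV_F = 80e-15       # extracted capacitance per TSV (Figure 6.10)
R_DRIVER_OHM = 1000.0  # hypothetical driver output resistance


def rc_time_constant_s(resistance_ohm, capacitance_f):
    """Lumped RC time constant in seconds."""
    return resistance_ohm * capacitance_f


def bank_capacitance_f(n_tsvs, c_per_tsv_f):
    """Total capacitance of n identical TSVs connected in parallel."""
    return n_tsvs * c_per_tsv_f


tau_tsv_s = rc_time_constant_s(R_TSV_OHM, C_TSV_F)
tau_driven_s = rc_time_constant_s(R_DRIVER_OHM + R_TSV_OHM, C_TSV_F)
bank_120_f = bank_capacitance_f(120, C_TSV_F)

print(f"TSV alone:        {tau_tsv_s * 1e15:.1f} fs")
print(f"with driver:      {tau_driven_s * 1e12:.1f} ps")
print(f"120-TSV bank cap: {bank_120_f * 1e12:.1f} pF")
```

Under these assumptions the TSV's own resistance contributes negligibly to delay; the TSV acts mainly as a capacitive load on whatever drives it.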
[Figure 6.11 panels: (a) cumulative distribution (%) versus leakage current; (b) probability versus leakage current (A).]
FIGURE 6.11
(a) Plot of measured leakage current versus number of TSVs in bank. The worst leakage per TSV
was well below the pA level, suitable for most demanding CMOS applications. (b) Frequency
plot of the distribution of the leakage current from a bank with 120 TSVs.
[Figure 6.12 panels: (a) Ioff (A) and (b) Idlin (A), prebond and postbond, for control devices and devices with TSV.]
FIGURE 6.12
(a) Plot of Ioff for FET devices, both control devices and devices with a TSV in proximity. The data
before bonding and after bonding showed no effect from either the introduction of TSVs or bonding.
(b) Same as (a), but for Idlin, a sensitive metric of TSV process effects.
[Figure 6.13 panels: (a) chip layout; (b) shmoo plot with Vdd (V) on the horizontal axis, 0.8 to 1.3 V.]
FIGURE 6.13
(a) The chip layout of one stratum. This stratum is joined to another identical stratum via 3D
stacking. (b) Shmoo plot of the lowest functional frequency versus Vdd for eDRAM read
and write operations, demonstrating the functionality of the stacked eDRAM.
for NMOS transistors was not measured for the purpose of this work but it
is expected to be smaller than P-type metal–oxide–semiconductor (PMOS)
devices as demonstrated previously [27,28].
Through this wafer-level integration process, a high-performance 45 nm
SOI stacked chip (layout is shown in Figure 6.13a) with stacked EDRAM cache
prototype was built with more than 11,000 integrated TSVs. The chips in the
two different strata are designed to emulate a stacked processor and cache
chip assembly, as described in a previous work [14]. A built-in-self-test (BIST)
engine was designed for the purpose of testing the strata-to-strata commu-
nication performance and to test the memory functionality of these struc-
tures for each stratum. The BIST engine successfully accessed the EDRAM
on both strata of this prototype. The resulting shmoo plots of clock
frequency versus supply voltage (Figure 6.13b) demonstrate 16 Mb EDRAM
functionality, which is fixable, and point to possible strata-to-strata
communication frequencies of up to 2.1 GHz at 1.3 V. The memory patterns for
these tests were written in this stacked EDRAM prototype using four different
configurations, as described in the following and also shown in Figure 6.13 [22]:
1. Single 2D-thick wafer mode where the memory on the thick wafer
was activated.
2. Bonded 2D-thin wafer mode where the memory on thinned wafer
S1 was activated and the test patterns were loaded using the TSVs.
3. 3D mode where the BIST engine on the thick wafer controls the
memory on both wafer strata.
4. 3D mode where the BIST on the thin wafer controls the memory on
both wafer strata.
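For readers unfamiliar with shmoo plots, the sketch below shows how such a pass/fail grid over supply voltage and clock frequency is built and how grids from the four modes can be compared. The pass criterion is a hypothetical linear model chosen only so that 1.3 V corresponds to 2.1 GHz; it is not a model of the measured EDRAM, and all names are assumptions.

```python
V_MIN, V_REF, F_REF_GHZ = 0.6, 1.3, 2.1  # hypothetical model parameters


def max_frequency_ghz(vdd_v):
    """Hypothetical linear model of the highest passing frequency at a supply voltage."""
    return max(0.0, F_REF_GHZ * (vdd_v - V_MIN) / (V_REF - V_MIN))


def passes(vdd_v, freq_ghz):
    """Stand-in for running the BIST at one (Vdd, frequency) point."""
    return freq_ghz <= max_frequency_ghz(vdd_v)


def shmoo(vdd_points, freq_points, test=passes):
    """Rows are frequencies (highest first); columns are supply voltages."""
    return [[test(v, f) for v in vdd_points]
            for f in sorted(freq_points, reverse=True)]


def render(grid):
    """Text rendering: 'P' for pass, '.' for fail."""
    return "\n".join("".join("P" if ok else "." for ok in row) for row in grid)


def all_modes_match(grids_by_mode):
    """True when every mode produced the same pass/fail grid."""
    grids = list(grids_by_mode.values())
    return all(g == grids[0] for g in grids[1:])


vdd_points = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
freq_points = [0.5, 1.0, 1.4, 2.0]
modes = ("2D thick", "2D thin", "3D thick BIST", "3D thin BIST")
grids_by_mode = {mode: shmoo(vdd_points, freq_points) for mode in modes}

print(render(grids_by_mode["2D thick"]))
print("identical across modes:", all_modes_match(grids_by_mode))
```

Comparing whole grids, rather than a single operating point, is what allows a conclusion that stacking did not shift the pass/fail boundary anywhere in the tested range.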
By evaluating modes 3 and 4, the ability to write and read data from alter-
nating strata memory in a single cycle was demonstrated, which confirms
the quality of the clocks, power, control, and data signals across the chip
boundary. This is necessary to be able to transfer the data at speed for the
entire memory assembly without any errors. Furthermore, the shmoo plots in
Figure 6.13 show that the failure mechanism is shared by all the modes
evaluated: all plots are practically identical, which indicates that the
EDRAM performance was not affected by the wafer-level bonding and the
integration of the TSVs to achieve interconnection. The shmoo pattern used
was the march 9 pattern, which by design forces cycle-to-cycle simultaneous
switching patterns across the strata boundary for this evaluation. In
addition, an equivalent column march pattern was also evaluated, with very
similar results. The maximum allowed frequency at wafer test was limited by
the voltage drop inherent in the cantilever-probing method used, compared to
socket-based module test. The retention signature obtained for the EDRAM
indicates retention times of over 200 μs.
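The exact sequence of the march 9 pattern is not given here, so as a stand-in the sketch below implements the widely documented March C- test to show how a march pattern works in general: each element walks the address space in a fixed order, reads an expected value, and writes its complement. The memory classes are hypothetical simulation helpers, and this is not the pattern or BIST implementation used by the authors.

```python
def march_c_minus(memory):
    """Run March C- on a list-like bit memory; return sorted addresses that misread."""
    n = len(memory)
    failures = set()

    def element(order, expected, write):
        # One march element: optional read-and-check, then optional write.
        for addr in order:
            if expected is not None and memory[addr] != expected:
                failures.add(addr)
            if write is not None:
                memory[addr] = write

    element(range(n), None, 0)             # any order:  w0
    element(range(n), 0, 1)                # ascending:  r0, w1
    element(range(n), 1, 0)                # ascending:  r1, w0
    element(range(n - 1, -1, -1), 0, 1)    # descending: r0, w1
    element(range(n - 1, -1, -1), 1, 0)    # descending: r1, w0
    element(range(n), 0, None)             # any order:  r0
    return sorted(failures)


class StuckAtZeroMemory(list):
    """Hypothetical faulty memory: one cell always stores 0 regardless of writes."""

    def __init__(self, size, stuck_addr):
        super().__init__([0] * size)
        self.stuck_addr = stuck_addr

    def __setitem__(self, addr, value):
        super().__setitem__(addr, 0 if addr == self.stuck_addr else value)


healthy_result = march_c_minus([1, 0, 1, 1, 0, 0, 1, 0])
faulty_result = march_c_minus(StuckAtZeroMemory(8, stuck_addr=3))
print("healthy memory failures:", healthy_result)
print("faulty memory failures: ", faulty_result)
```

Because consecutive elements write complementary data, a march pattern toggles every cell between reads, which is what makes it suitable for stressing simultaneous switching across a boundary.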
6.6 Summary
Two different 3D technologies for chip stacking were demonstrated, both with
very promising results. The first is a die-level stacking integration scheme
featuring 22 nm CMOS technology with ULK BEOL, which enabled die assembly on
a laminate with seven layers of build-up circuitry on each core side. The
modules featured BEOL-integrated TSVs, microbumps, BEOL wiring, and C4s. The
joints used for bonding and interconnection were low-volume SnAg solder with
Cu pillars for the top die to the bottom die and C4s for the bottom die to
the laminate. Interconnect characterization showed high yield and high
integrity of microbumps, TSVs, and C4 joints. Very good thermal performance
was exhibited, and the test structures comfortably passed the TC-K, TC-G,
THB, and HTS stress tests, which makes this a very good candidate for
die-level stacking and packaging.
As the main focus of this work and aiming at high-volume manufactur-
ing of high-performance memory, a wafer-level integration technology was
developed through oxide bonding stacking of high-performance POWER7™
45 nm SOI technology cache cores with EDRAM. The measured TSV capaci-
tance and resistance are compatible with high-bandwidth chip–chip com-
munication and the TSV leakage performance of 1 pA/TSV is very good. The
FET Ion/Ioff shows no significant change post stacking and TSV processing,
and functionality of 3D-stacked EDRAM cache cores has been confirmed by
successful writing of memory patterns at up to 2.1 GHz at 1.3 V. This technol-
ogy is highly compatible with existing high-performance logic technology
Acknowledgments
The authors would like to extend their gratitude to all current and former
colleagues at IBM who were involved in the work of 3DI technology devel-
opment. Special thanks for their work and support to Jonathan Faltermeier,
Wei Lin, Troy Graves-Abe, John Golz, Pooja Batra, Douglas LaTulipe, Alex
Hubbard, Richard Johnson, Allan Upham, Toshiaki Kirihata, Jeffrey Zitz,
Eric Perfecto, William Guthrie, Marcus Interrante, Richard Langlois, Koushik
Ramachandran, Matthew Angyal, Vamsi Paruchuri, Thomas Gow, Mukesh
Khare, Daniel Berger, John Knickerbocker, Subramanian Iyer, and T.C. Chen.
References
1. R.H. Dennard, F.H. Gaensslen, V.L. Rideout, E. Bassous, and A.R. LeBlanc,
Design of ion-implanted MOSFET’s with very small physical dimensions, IEEE
Journal of Solid-State Circuits, 9: 256–268, 1974.
2. S.S. Iyer, T. Kirihata, M.R. Wordeman, J. Barth, R.H. Hannon, and R. Malik,
Process-design considerations for three dimensional memory integration.
In Proceedings of the Symposium on VLSI Technology, Honolulu, HI, June 16–18,
2009, pp. 60–63.
3. W. Arden, M. Brillouët, P. Cogez, M. Graef, B. Huizing, and R. Mahnkopf, More
than Moore white paper. In International Roadmap Committee for the International
Technology Roadmap for Semiconductors, 2010. www.itrs2.net.
4. M. Koyanagi, H. Kurino, K.W. Lee, K. Sakuma, N. Miyakawa, and H. Itani,
Future system-on-silicon LSI chips, IEEE Micro, 18 (4): 17–22, 1998.
5. K. Sakuma, P.S. Andry, C.K. Tsang et al., 3D chip-stacking technology with
through-silicon vias and low volume lead-free interconnections, IBM Journal of
Research & Development, 52 (6): 611–622, 2008.
6. K. Sakuma, P.S. Andry, C.K. Tsang et al., Characterization of stacked die using
die-to-wafer integration for high yield and throughput. In Proceedings of the
IEEE Electronic Components and Technology Conference (ECTC), Lake Buena Vista,
FL, 2008, pp. 18–23.
7. S. Skordas, D.C. La Tulipe, K. Winstel et al., Wafer-scale oxide fusion bonding
and wafer thinning development for 3D systems integration. In Proceedings of
the 3rd IEEE International Workshop on Low Temperature Bonding for 3D integration
(LTB-3D), Tokyo, Japan, May 22–23, 2012, pp. 203–208.
23. W. Lin, L. Shi, Y. Yao, A. Madan, T. Pinto, N. Zavolas, R. Murphy, S. Skordas, and
S.S. Iyer, Low-temperature oxide wafer bonding for 3-D integration: Chemistry
of bulk oxide matters, IEEE Transactions on Semiconductor Manufacturing, 27 (3):
426–430, 2014.
24. W.P. Maszara, G. Goetz, A. Caviglia, and J.B. McKitterick, Bonding of silicon
wafers for silicon-on-insulator, Journal of Applied Physics, 64 (10): 4943, 1988.
25. M.G. Farooq, T.L. Graves-Abe, W.F. Landers et al., 3D copper TSV integration,
testing and reliability. In Proceedings of the IEEE International Electron Devices
Meeting (IEDM), Washington, DC, December 5–7, 2011.
26. J. Golz, J. Safran, B. He et al., 3D stackable 32 nm High-K/Metal Gate SOI
embedded DRAM prototype. In Proceedings of the Symposium on VLSI Circuits,
Honolulu, HI, June 15–17, 2011, pp. 228–229.
27. A. Mercha, G. Van der Plas, V. Moroz et al., Comprehensive analysis of
the impact of single and arrays of through silicon vias induced stress on
high-K/metal gate CMOS performance. In Proceedings of the IEEE International
Electron Devices Meeting (IEDM), San Francisco, CA, December 6–8, 2010,
pp. 2.2.1–2.2.4.
28. L. Yu, W.-Y. Chang, K. Zuo, J. Wang, D. Yu, and D. Boning, Methodology for
analysis of TSV stress induced transistor variation and circuit performance.
In Proceedings of the 13th International Symposium on Quality Electronic Design
(ISQED), Santa Clara, CA, March 19–21, 2012, pp. 216–222.
7
Toward Three-Dimensional High Density
CONTENTS
7.1 Introduction
7.2 Cu/SiO2 Hybrid Bonding
7.2.1 Cu/SiO2 Hybrid Bonding Principle
7.2.2 Technical Challenges Linked to Hybrid Bonding
7.2.2.1 Surface Preparation
7.2.2.2 Bonding Interface Characterization
7.2.2.3 Bonding Energy Evolution with Temperature
7.2.2.4 Copper Modification with Temperature
7.2.2.5 Wafer Alignment Consideration
7.2.2.6 Investigation on Copper Diffusion
7.2.2.7 Pad Dimension Reduction
7.2.3 Electrical Performances Evaluation of Hybrid Bonding
7.2.3.1 Electrical Structures Presentation and Performances
7.2.3.2 Environmental Reliability Study
7.2.3.3 Electromigration
7.2.4 Hybrid Bonding Maturity Increase: Moving toward Production
7.2.5 Specificity of the Die-to-Wafer Process Variation
7.2.6 Die-to-Wafer Process Throughput Increase with Self-Assembly
7.3 3D Sequential: 3D Very-Large-Scale-Integration CoolCube™
7.3.1 3D Sequential: Principle
7.3.2 3D Sequential: State of the Art
7.3.3 3D Sequential: Integration Process Flow and Low-Temperature Top FETs
7.3.4 3D Sequential: Intermediate Back-End-of-Line
7.3.5 3D Sequential: 300 mm Electrical Demonstration
7.4 3D Technologies Comparative Thermal Analysis
7.4.1 3D Technology Parameters for Thermal Comparison
7.4.2 Comparative Study Thermal Results
7.4.2.1 For Hot-Spot Dissipation Scenarios
7.4.2.2 For Uniformly Distributed Power Dissipation Scenarios
7.4.3 Thermal Comparison Conclusion
7.5 General Conclusion
References
7.1 Introduction
Three-dimensional (3D) high-density integration has attracted growing
interest as a way to maintain high performance and/or low power
consumption as Moore's law slows down.
Although high-end advanced packaging solutions are being developed, mainly
for heterogeneous packaging, these alternatives are not suited to the
requirements of power-efficient applications such as CMOS (complementary
metal–oxide–semiconductor) image sensors (CIS), high-performance
computing (HPC), gaming, and data centers.
As an example, players in back-side illuminated (BSI) imagers have released
many 3D prototypes since 2012, and some are already in production, mainly
driven by mobile phone applications. The 3D technology used for 3D BSI is
based on direct hybrid bonding, which reaches a pitch below 10 μm that is
not achievable with conventional advanced packaging techniques.
The idea is to dedicate the top and bottom wafers to the image sensor and
logic functions, respectively, rather than integrating both functions on the
same floor plan.
We can predict that this specific application will pave the way for other
products; the 3D approach that will be used will depend on the granularity
scale of the product partitioning, as shown in Figure 7.1.
Below 10 μm pitch of chip-to-chip interconnection, two complementary 3D
solutions are taking over
FIGURE 7.1
3D integration pitch roadmap, depending on design granularity. [Diagram labels: wafer stacking (3D parallel), alignment 50−100 nm; 3D sequential (CoolCube™), alignment <3 nm; granularity scale from entire core down to logic gates and transistors.]
FIGURE 7.2
Principle of the damascene CMP. (a) Wafer post plating, (b) wafer post-CMP idealistic case, and
(c) wafer post-CMP realistic case. [Diagram labels: SiO2, Cu, TiN; dishing, oxide (dielectric) erosion, recess.]
FIGURE 7.3
Post-CMP 3D AFM images showing a flat oxide/copper interface and a copper surface
roughness below 5 Å.
FIGURE 7.4
(A) Acoustic microscopic images of 300 mm Cu/SiO2 bonding with defects: (a) large circular
defect due to a particle and (b) smaller defects due to excessive copper recess. (B) 300 mm Cu/
SiO2 bonding without interfacial defects before annealing.
FIGURE 7.5
Ansys simulation of the vertical displacement of copper pads at 200°C, as a function of pad
geometry. H is the height of the copper line in microns. [Plot axes: height increase at the center of the pads (µm) versus copper pad diameter (μm).]
FIGURE 7.6
SAM images showing the evolution of half Kelvin cross patterns with temperature (room
temperature, after a 2 h anneal at 200°C, and after a 2 h anneal at 400°C). From top to
bottom, the branch widths are 25, 25, 20, and 15 μm.
FIGURE 7.7
Comparison of bonding toughness for various wafer configurations. [Plot axes: bonding toughness (J/m2) versus annealing temperature (°C); labeled series: SiO2 direct bonding.] (From Di Cioccio, L. et al.,
An overview of patterned metal/dielectric surface bonding: Mechanism, alignment and
characterization, JECS, 81–86, 2011.)
FIGURE 7.8
FIB/SEM cross-section of bonded Cu pads. [Image labels: bonding interface; 3.6 μm.]
FIGURE 7.9
High-resolution TEM cross section of direct copper bonding without post-bonding anneal.
A 4 nm thick crystalline layer is present at the bonding interface. [Scale bars: 5 nm.]
FIGURE 7.10
(a) STEM cross sections of direct bonding with successive post-bond annealing (after bonding,
after 200°C anneal, and after 400°C anneal; 500 nm scale bars). (b) TEM observation of a
typical interfacial void.
FIGURE 7.11
Vector map of overlay data quantified from a bonded 300 mm wafer pair using the EVG®40 NT
measurement system. Max overlay here is 250 nm.
FIGURE 7.12
STEM cross-section image and EDX–STEM mapping of the misalignment area of hybrid bonding
interface after thermal storage at 300°C during 336 h. The graph presents the Cu, Si, and O
concentration profiles (proportion in % versus position in nm) across the bonding
interface; no copper is detected in the SiO2 layer.
The cross sections after CMP, bonding, and annealing of these structures
are presented in Figure 7.13. The most recently published ones (case 3) consist
of copper pads and vias fabricated with a dual damascene process.
Before electrical tests, all the bonded pairs were characterized with
the previously described protocol: no void was detected, as shown
on the perfectly bonded 300 mm wafer presented in Figure 7.13. This
confirms the robustness of the CMP-based surface preparation process.
To enable the electrical tests, a backside electrical contact has to
be created on the top wafer. Therefore, the bonded pairs were all successfully
thinned to various thicknesses: 50 μm with through-silicon vias (TSVs)
and under bump metallization (UBM) in case 1; 40 μm in case 2, with an
additional Ti/TiN/AlSi redistribution layer (RDL) and electrical pads; and, more
recently, down to only 3 μm on a 300 mm wafer, followed by an aluminum
pad creation process (case 3).
Table 7.1 presents a comparison of the specific contact pad resistances mea-
sured on the three integration schemes presented in Figure 7.13.
FIGURE 7.13
Contact pad cross-sections of the three different hybrid bonding stacks evaluated by CEA-Leti
and STMicroelectronics. Cases 1 and 2 use 200 mm wafers; case 3 uses 300 mm wafers. (A) Case 1:
1+1 copper level (From Taïbi, R. et al., ECTC proceedings, 219–225, 2010; Taïbi, R. et al., IEEE IEDM,
2011), (B) Case 2: 2+2 copper level (From Beilliard, Y. et al., IEEE 3DIC, 2014) and (C) Case 3: 3+3
copper level. (Lhostis, S. et al., IEEE 66th ECTC, Las Vegas, NV, 2016; Moreau, S. et al., IEEE 66th
ECTC, Las Vegas, NV, 2016; Jourdon, J. et al., IEEE IRPS, 2017.)
TABLE 7.1
Specific Contact Resistance Comparison of Electrical Structures with Different
Numbers of Copper Layers
Case 1 Case 2 Case 3
$$\rho_c = R_c \times A_c$$
where:
$R_c$ is the contact resistance
$A_c$ is the contact area
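The definition above can be illustrated with a minimal Python sketch. The function names and the numerical values (a 3 μm × 3 μm square pad with a 10 mΩ contact resistance) are hypothetical and chosen only for illustration; they are not measurements from Table 7.1.

```python
def specific_contact_resistance(r_c_ohm: float, area_um2: float) -> float:
    """Return rho_c = R_c * A_c in ohm·µm² (R_c in ohm, A_c in µm²)."""
    if r_c_ohm < 0 or area_um2 <= 0:
        raise ValueError("R_c must be >= 0 and A_c must be > 0")
    return r_c_ohm * area_um2


def ohm_um2_to_ohm_cm2(rho_ohm_um2: float) -> float:
    """Convert ohm·µm² to ohm·cm² (1 µm² = 1e-8 cm²)."""
    return rho_ohm_um2 * 1e-8


if __name__ == "__main__":
    # Hypothetical example: 3 µm x 3 µm pad, 10 mΩ contact resistance.
    rho = specific_contact_resistance(0.010, 3.0 * 3.0)
    print(f"rho_c = {rho:.3f} ohm·µm² = {ohm_um2_to_ohm_cm2(rho):.2e} ohm·cm²")
```

Normalizing by the contact area is what allows structures with different pad sizes to be compared on the same footing.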
7.2.3.3 Electromigration
EM tests were performed on three structure types to identify possible failures
related to hybrid bonding. This section focuses on results for the more
complex structure including copper vias and pads (case 3 in Figure 7.13). In this
case, EM has been studied using NIST (National Institute of Standards and
Technology)-like structures or 100-connection daisy chains (DC100).
For the NIST-like test structure, failure analysis reveals voids that are always
in the single damascene line of the top or bottom wafer, depending on the
electron flow direction (upstream or downstream).
For DC100, no EM-induced void is found along the daisy chain. Voids are
only localized at the cathode side in the feed line, that is, the metal line in the
top die (Figure 7.14).
These results indicate that an electron flow perpendicular to the hybrid
bonding interface is favorable compared to a parallel one because, in the
parallel case, voids and extrusions can be observed at the bonding interface
[20]. In addition, intrinsic bonding voids originating from interfacial copper
oxide do not move with the electrical current, which shows that the hybrid
bonding process is mature.
Complementary work also investigated the impact of seed layer type on
EM, confirming the higher Cu/TaN/Ta adhesion energy and higher
resistance to EM [16].
In conclusion, the hybrid bonding module has no impact on the EM resis-
tance and presents excellent environmental reliability. The weakest spot is
always the BEOL level (top or bottom depending on the electron flow).
FIGURE 7.14
Characterization of daisy-chain (100 connections) after electromigration tests. (a) Lock-in
thermography results (amplitude, 15×, 0–0.15 V, 10 Hz, 60 seconds; the white arrow locates a
possible failure) and (b) FIB-SEM cross-section in the area indicated by the white arrow in (a).
FIGURE 7.15
FIB SEM characterization of a 3D image sensor stack containing 12 metal levels.
Since then, Sony announced in March 2016 the production of the IMX260
3D imager, made with WtW hybrid bonding [23]. It is a 12 MP camera
consisting of a five-metal (Cu) CIS die and a seven-metal (6 Cu + 1 Al) image
signal processor (ISP) die. The copper-to-copper pads are 3 μm wide, with a
14 μm pitch in the peripheral regions and a 6 μm pitch in the pixel
array. This imager is already used as the rear-facing camera in the Samsung
Galaxy S7 mobile phone and confirms that WtW is mature for mass production.
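To give a sense of scale for the pitches quoted above, the following Python sketch converts a pad pitch into an areal interconnect density. It assumes pads laid out on a regular square grid, which is an illustrative simplification and not a statement about the actual IMX260 layout; the function name is hypothetical.

```python
def pads_per_mm2(pitch_um: float) -> float:
    """Areal pad density for a square grid of the given pitch (pads per mm²)."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    pads_per_mm = 1000.0 / pitch_um  # pads along one 1 mm edge
    return pads_per_mm ** 2


if __name__ == "__main__":
    for pitch in (14.0, 10.0, 6.0):  # pitches quoted in the text, in µm
        print(f"{pitch:4.1f} µm pitch -> {pads_per_mm2(pitch):,.0f} pads/mm²")
```

Under this assumption, moving from a 14 μm to a 6 μm pitch raises the density by a factor of (14/6)², that is, a little more than five.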
The two examples mentioned are imager applications in which the electrical
signals are brought out by wire bonding after thinning of the top imager
wafer. However, to reduce the module size and to speed up data exchange with
other chips, it has recently been shown that hybrid bonding can also be
combined with a TSV-last process [24].
FIGURE 7.16
(A) Self-assembly process description: (a) bottom structure before stacking, (b) water drop
deposition and top die prepositioning, (c) self-assembly thanks to capillary restoring forces,
(d) die alignment, residual interfacial water film, and (e) water evaporation and dies direct
bonding. (B) 200 mm electrical wafer stacked by the self-assembly process with 20 top dies.
FIGURE 7.17
Principle and illustration of canthotaxis phenomena. [Diagram labels: liquid, solid, angles α, θ1, and θ2; condition shown: θ1 ≤ ϕ ≤ α + θ2.]
TABLE 7.2
Comparison between the Contact Resistances Obtained with Three Different
Integration Processes
NIST Structure Contact Resistance
Hybridization Process DtW (Self-Assembly) DtW (Pick and Place) WtW
that the best theoretical confinement will be reached by creating both topol-
ogy (high α value) and hydrophobic treatments (high θ2 value).
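The confinement condition shown in Figure 7.17, θ1 ≤ ϕ ≤ α + θ2, can be sketched in Python as follows. The reading of the symbols is an assumption made here for illustration: θ1 is taken as the contact angle on the top surface, θ2 as the contact angle on the sidewall, α as the edge angle, and ϕ as the apparent angle of the liquid at the edge. The numerical values are hypothetical.

```python
def is_pinned(phi_deg: float, theta1_deg: float,
              theta2_deg: float, alpha_deg: float) -> bool:
    """True if the liquid edge stays confined: theta1 <= phi <= alpha + theta2."""
    return theta1_deg <= phi_deg <= alpha_deg + theta2_deg


def pinning_window_deg(theta1_deg: float, theta2_deg: float,
                       alpha_deg: float) -> float:
    """Width of the angular range over which the contact line remains pinned."""
    return (alpha_deg + theta2_deg) - theta1_deg


if __name__ == "__main__":
    # Hypothetical angles in degrees.
    theta1, theta2, alpha = 10.0, 60.0, 90.0
    print("window:", pinning_window_deg(theta1, theta2, alpha), "deg")
    print("phi = 100 deg pinned?", is_pinned(100.0, theta1, theta2, alpha))
    print("phi = 160 deg pinned?", is_pinned(160.0, theta1, theta2, alpha))
```

The window widens when either α or θ2 increases, which is consistent with the remark that topology and hydrophobic treatment both improve confinement.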
Although interesting demonstrations have already been made with μbump
assembly [26], CEA-Leti has adapted the technology to the constraints of
hybrid bonding.
The process was optimized using simple 1 × 1 cm nonpatterned
dies with both topology and hydrophobic contrast on their edges. These
structures made it possible to demonstrate a high alignment accuracy yield
(>90%) and high bonding quality (below 1 μm) [27,28].
Finally, S. Mermoz et al. [29] demonstrated that the presence of interfacial
water did not modify the electrical performance of the connected daisy
chains (as shown in Table 7.2) for a Kelvin NIST structure. This work was
complemented by theoretical simulations, using numerical software, to
explain how external perturbations could impact the
capillary restoring force [30].
To conclude this section, Cu/SiO2 hybrid bonding has attracted considerable
interest from industry players over the past few years, although the first
publications were released in the early 2000s. In between, impressive studies
advanced the understanding of the bonding principle, and much progress was
made on CMP and its materials. Finally, production in a WtW scheme is now a
reality for CIS, and there is no doubt that other application fields will follow
the trend. DtW is the natural (but challenging) continuation of the roadmap.
FIGURE 7.18
CoolCube™ integration: Partitioning schemes for 3DVLSI, namely (a) 3DVLSI at the gate level
and (b) 3DVLSI at the transistor level. [Diagram labels: LT MOSFET, CMOS, III-V nMOS, Ge pMOS.]
FIGURE 7.19
3DVLSI structure with two levels of intertiers interconnections. [Diagram labels: top tier, bottom tier.]
FIGURE 7.20
Summary of thermal budgets tested on FDSOI technology with SiGe channel for pFET
and SiGe:B/SiC:P RSD with preserved NFET and PFET ION-IOFF performance. [Plot axes: temperature (°C) versus anneal duration (s), from 1 ns to 1 Ms.] (From Fenouillet-
Beranger, C. et al., New insights on bottom layer thermal stability and laser annealing
promises for high performance 3D VLSI, IEEE IEDM, 2014.)
(Seed window [36], Poly-Si and laser-crystallized Epi-like Si [39] or oxide direct
bonding [32]). Direct bonding clearly stands out from the other techniques:
thanks to the high crystalline quality of the top silicon layer, the devices
outperform those obtained with the other approaches. Regarding CMOS scaling,
and especially SRAM integration, there is strong demand for higher density in
all areas of SRAM applications, such as network and cache standalone
memory and embedded memory in logic devices. 3DVLSI integration is very
promising for this type of application, as evidenced by the number of
publications in the literature [3,33,36–39]. Here again, as mentioned previously,
the best transistor performance is always achieved when the top crystalline
silicon layer is obtained by oxide wafer bonding [3,31,33]. The ultimate example
of high-performance CMOS at low process cost is the stacking of III–V nFETs above
SiGe pFETs [40,41]. These high-mobility transistors are well suited for 3DVLSI
because their process temperatures are intrinsically low. 3D sequential
integration, with its high contact density, can also be seen as a powerful
solution for heterogeneous cointegrations requiring high 3D via densities, such as
computation immersed in memory [42], heterogeneous IoT (Internet of Things) chips
[43,44], nano-electromechanical systems (NEMS) with CMOS for gas-sensing
applications [45,46], or highly miniaturized imagers [47].
FIGURE 7.21
Process flow principle of monolithic 3D integration. By resorting to a unique alignment flow
throughout the whole process, layers are stacked on top of each other within a lithographic
alignment precision. (a) Bottom-layer processing with plugs down to the CMOS, (b) fabrication
of the inter level lines to ensure short distance connection with bottom layer, (c) high-quality
top film transfer by direct oxide–oxide bonding, and (d) top-layer fabrication and connection.
(From Vinet, M. et al., Monolithic 3D Integration: A powerful alternative to classical 2D scaling,
IEEE S3S, 2014. © 2014 IEEE. With permission.)
FIGURE 7.22
Bottom-layer thermal stability has been increased, thanks to silicide process optimization,
(From Vinet, M. et al., IEEE ESSDERC, 2016.) and top-layer thermal budget has been decreased
by optimization of hot-temperature modules. [Diagram labels: Ni0.9Pt0.1 silicide; four top-layer modules for which a temperature decrease is required, among them dopant activation (low-temperature solid-phase epitaxy regrowth/laser), gate oxide stabilization, and epitaxy.]
TABLE 7.3
VPD-DC-ICPMS Monitoring of Ni through the Process
on the Bevel. Low Limit of Detection = 6.5E7 at/cm²
Ni (at/cm 2)
use of tungsten (W) as it has already been integrated in the FEOL of several
products. CEA-Leti studies [59] highlighted that Cu and W interconnections
combined with ULK are stable up to 500°C for 2 hours and 550°C for 5 hours,
respectively, in the case of line 1 integration with 28 nm design rules. However,
W resistance is still about six times higher than that of copper.
Moreover, as presented in the literature, ULK stability is still questioned
beyond 500°C [60]. Indeed, the modification of the ULK structure and
permittivity during a thermal anneal at temperatures above 500°C may increase
the leakage and delays of the iBEOL and thus degrade circuit performance.
The thermal stability of dielectrics has been studied to select the most
appropriate ones. Depending on the thermal budget set by the top MOS layer,
various pairs of materials are possible; they are summarized in Table 7.4 [61].
Regarding the barrier layer, SiCO seems to be the most promising material
due to its robust composition, its low permittivity (4.5, lower than the 5.6 of
the state-of-the-art barrier layer SiCNH), and its high thermal stability. However,
for a top thermal budget limited to 500°C, the state-of-the-art SiCNH is still
suitable. Regarding the oxide-based material, several seem suitable
depending on the feasible top thermal budget. For a thermal budget limited
to 500°C for 2 hours, the state-of-the-art ULK (2.5) material is still possible. On
the other hand, at 600°C for 2 hours, only SiO2 is suitable. Finally, the
permittivity of these materials is crucial to avoid circuit performance degradation.
The reliability of ULK dielectrics after a relatively high thermal budget
had not been demonstrated until now. Using the standard extrapolation
parameters, the extrapolated lifetime is extracted for W/ULK interconnections
[55] (Table 7.5). Although time to failure (TTF) decreases with increasing
thermal budget, a lifetime (line to line) of 109 years is calculated for the
highest thermal budget, which is larger than the Cu/ULK intermetal dielectric (IMD) lifetime
TABLE 7.4
Possible Dielectrics for the iBEOL as a Function of the Top FET Thermal Budget and
Associated Permittivity
                State-of-the-Art   500°C, 2 hours          550°C, 2 hours   600°C, 2 hours
Barrier layer   SiCNH (5.6)        SiCO (4.5)/SiCN (5.6)   SiCO (4.5)       SiCO (4.5)
Oxide-based     ULK (2.5)          ULK (2.5)               Dense LK (3)     SiO2 (4.1)
Source: Xu, H. et al., Design, Automation & Test in Europe (DATE), 1–6, 2011.
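The selection logic of Table 7.4 can be expressed as a small lookup, sketched below in Python. The material names and permittivities are transcribed from the table; the dictionary layout, the function name, and the rule of picking the smallest tabulated budget that covers the requested temperature are assumptions made for illustration (all entries refer to a 2-hour anneal).

```python
# Entries transcribed from Table 7.4: (material, permittivity) per 2-hour budget.
IBEOL_DIELECTRICS = {
    500: {"barrier": [("SiCO", 4.5), ("SiCN", 5.6)], "oxide": ("ULK", 2.5)},
    550: {"barrier": [("SiCO", 4.5)], "oxide": ("Dense LK", 3.0)},
    600: {"barrier": [("SiCO", 4.5)], "oxide": ("SiO2", 4.1)},
}


def dielectrics_for_budget(top_fet_temp_c: float) -> dict:
    """Return the table row for the smallest tabulated budget >= the request."""
    for budget_c in sorted(IBEOL_DIELECTRICS):
        if top_fet_temp_c <= budget_c:
            return IBEOL_DIELECTRICS[budget_c]
    raise ValueError("no dielectric pair tabulated above 600°C")


if __name__ == "__main__":
    for temp in (500, 525, 600):
        row = dielectrics_for_budget(temp)
        print(temp, "°C ->", row["barrier"], "+", row["oxide"])
```

The lookup makes the trade-off visible: each step up in top FET thermal budget forces an oxide-based dielectric with a higher permittivity.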
TABLE 7.5
Extrapolated Lifetime (TTF) of Intermetal Dielectric ULK with Ti/TiN/W before and
after Anneals. Years Needed to Reach the Failure Rate (<1%) with an Operating
Voltage of 1.115 V
No Anneal 500°C 550°C
without thermal budget. Therefore, W/ULK reliability is not an issue for
3D sequential integration because of the high initial lifetime, since the top
transistor thermal budget is limited to 500°C for a couple of hours.
Despite the contamination issue, it could be very interesting to integrate Cu/
ULK lines in the iBEOL of 3D sequential integration in order to use a standard
bottom-tier process; however, the reliability versus thermal budget should be studied.
Integration of dense W lines introduces a new degree of complexity not only
in terms of contamination but also in terms of CMP. Indeed, in spite of the
high metal line density, the planarity of the structure should be preserved to
ensure good bonding quality. A layer transfer by wafer bonding has been
realized above a 28 nm industrial metal 1 short loop with line densities up to
70% [3]. The schematic of the experiment is illustrated in Figure 7.23. W filling
is used instead of standard copper, coupled with a Ti/TiN barrier. The
high-quality layer transfer, as observed by acoustic microscopy, is
presented in Figure 7.24. Figure 7.24a shows a SiO2/SiO2 reference bonding
(without any defect, which would appear as a white dot), whereas Figure 7.24b
shows the bonding above W lines with very few bonding defects at the
wafer edges. These residual defects are explained by a W line CMP process
that was not fully optimized and that can be adjusted on a real lot. The bonded
structure was then thinned down to the top buried oxide (BOX) with both
grinding and tetramethylammonium hydroxide (TMAH) etching. A SEM
cross-section of the structure after thinning, with the Si layer highlighted,
is presented in Figure 7.24c.
FIGURE 7.23
Description of the Si layer transfer above W metal 1 level. Ti/TiN diffusion barrier is used. [Diagram labels: direct bonding interface; W lines with Ti/TiN barrier; 9 nm Si layer; BOX; SiO2; SiN; bulk Si; dimensions shown: 200 nm, 90 nm, 40 nm, 350 nm.]
FIGURE 7.24
(a) Acoustic microscopy on a reference SiO2/SiO2 bonding: perfect bonding is observed (no
defect). (b) Acoustic microscopy on the studied structure: high-quality bonding is reached,
with some defects on the edge due to W over-polishing. (c) SEM cross-section of the bonded
structure, with bottom W lines, after thinning. The 9 nm Si layer is highlighted in dark gray.
and raised source and drain. On the bottom level, standard high-temperature
N-type metal–oxide–semiconductor (NMOS) and PMOS transistors
with a HfSiON/TiN gate stack are fabricated on 300 mm SOI wafers
(tBOX = 145 nm/tSi = 7 nm) with Si raised source and drain junctions activated
at 1050°C. For the top level, a HfO2/TiN gate stack was formed, followed by
a spacer zero deposition at 630°C and a selective SiGe27% raised source–drain
epitaxy at 650°C for both NMOS and PMOS transistors. The junctions were
activated by the SPER technique for 1 minute at 600°C. Figure 7.25 shows the
TEM cross section of the 3D sequential contacted structure with a focus on
two stacked transistors. The Id-VG characteristics for both NMOS and PMOS
FIGURE 7.25
TEM cross-section of the 3D sequential structure. [Scale bars: 200 nm and 50 nm.]
FIGURE 7.26
ID-VG characteristics of both top-level NMOS and PMOS transistors. W = 10 μm, Lg = 60 nm. [Plot axis: drain current ID (µA/µm).]
(From Brunet, L. et al., First demonstration of CMOS over CMOS 3D VLSI CoolCubeTM
integration on 300 mm wafers, IEEE VLSI, 2016. © 2016 IEEE. With permission.)
transistors on the top level are presented in Figure 7.26. Finally, for the first time,
the voltage transfer characteristics of 3D sequential inverters with either NMOS
or PMOS on the top level, on 300 mm wafers, are presented in Figure 7.27.
In conclusion, over the past 10 years the CoolCube™ 3DVLSI sequential
integration has successfully proved the feasibility and robustness of top-layer
manufacturing. This innovative integration, supported by many high-level
publications worldwide, could be an answer to the end of
Moore’s law that some specialists predict in the near future. In addition, as
for advanced packaging or 3D high density, heterogeneous
integration may also be a key advantage offered by CoolCube™, in a
range of pitch that is more aggressive than that of parallel stacking.
FIGURE 7.27
(a) Voltage transfer characteristics of a 3D sequential inverter with NMOSFET on top and
PMOSFET on the bottom level. (b) Voltage transfer characteristics of a 3D sequential inverter
with PMOSFET on top and NMOSFET on the bottom level. [Plot axes: VOUT (V) versus VIN; curves labeled for supply voltages of magnitude 0.8, 1, and 1.2 V.]
FIGURE 7.28
Example of stacking diagram indicating the main circuit and 3D integration technology
parameters considered for thermal simulations. [Diagram annotations: heat transfer coefficients of 25 W/K·m2 (low-power), 2000 W/K·m2 (HPC), and 300 W/K·m2; stack size of 2, 4, or 8 tiers (F2B); top-die thickness variable to keep the total die stack height equal to 722 μm; package of 12 mm × 12 mm × 1.2 mm with a four-layer substrate (288 μm).]
TABLE 7.6
3D Technology Parameters: Material and Thickness
Sequential 3D
Parallel 3D Technologies Technology
µbumps Cu/SiO2 Direct Bonding CoolCube™
Layer Material Thickness Material Thickness Material Thickness
given in Table 7.6. The 3D stack contains 1–8 tiers, which are assembled in
a face-to-back (F2B) manner [69]. In the case of TSV-based integration, the
thickness of the die substrate ranges from 25 to 80 μm for assemblies using
μbumps and from 10 to 50 μm for Cu/SiO2 hybrid bonding. The layout of
each tier is based on a real scalable 65 nm 3D circuit in which identical tiers
are stacked on top of each other. An exception is made for CoolCube™,
which uses tungsten for the metallization layers (BEOL) of the intermediate
tiers due to thermal limitations and contamination risks during the
fabrication process, as explained in the previous section. The height and
diameter of the μbumps in the interdie layer are 25 and 20 μm, respectively,
which is in the typical range for this technology. The thickness of the interdie
layer is 1.7 μm for the Cu/SiO2 hybrid bonding and only 60 nm in the case of
CoolCube™. As already mentioned, the impact of the TSVs on the thermal
conductivity of the silicon substrate has been demonstrated to be very
limited [65] and is therefore not considered in this study.
The 3D die is stacked on a flip-chip organic package with the bottom tier
connected to a four-layer ball grid array (BGA) package (288 μm height)
through C4 bumps (40 μm height). To be compatible with typical low-power
applications, such as mobile, the total package height was fixed at 1.2 mm,
resulting in a maximum height of 722 μm for the 3D die stack. Given this
constraint, the topmost tier is made as thick as possible because it acts as a heat
spreader for hot-spot mitigation. Specific heat transfer coefficients (HTCs) were
applied to the boundary surfaces according to the considered power application
scenario to emulate the behavior of typical printed circuit board (PCB) and
package components such as thermal interface material (TIM), heat spreader, and
heat sink.
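The height constraint described above can be sketched in Python for the two parallel technologies. This is a deliberately simplified model and not the one used by the authors: it assumes that every tier below the top one contributes its substrate thickness plus one interdie layer, and it ignores the BEOL and any other layer. The 722 μm limit and the interdie thicknesses (25 μm for μbumps, 1.7 μm for Cu/SiO2) come from the text; the function and dictionary names are hypothetical.

```python
STACK_HEIGHT_UM = 722.0  # maximum 3D die stack height quoted in the text
INTERDIE_UM = {"ubumps": 25.0, "cu_sio2": 1.7}  # interdie layer thickness


def top_die_thickness_um(n_tiers: int, lower_die_um: float, technology: str) -> float:
    """Thickness left for the topmost die under the simplified stack model."""
    if n_tiers < 2:
        raise ValueError("a stack needs at least two tiers")
    lower_height = (n_tiers - 1) * (lower_die_um + INTERDIE_UM[technology])
    top = STACK_HEIGHT_UM - lower_height
    if top <= 0:
        raise ValueError("lower tiers alone exceed the allowed stack height")
    return top


if __name__ == "__main__":
    print(top_die_thickness_um(4, 50.0, "ubumps"))   # four tiers, 50 µm dies
    print(top_die_thickness_um(4, 50.0, "cu_sio2"))  # same dies, hybrid bonding
```

Under these assumptions, thinner interdie layers and thinner lower dies leave more silicon for the topmost tier, which is the tier acting as a heat spreader.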
Figure 7.29 illustrates the power profiles used to emulate multiple
application scenarios and to provoke distinct thermal behaviors depending on
the application use case. Heat dissipated at hot spots primarily diffuses
through the silicon substrate and usually spreads hemispherically,
rapidly decreasing the heat density and lowering the peak temperature.
For 3D ICs, however, the thinned silicon substrate reduces the lateral heat
spreading capability and the interdie layers (mainly polymer) act as vertical thermal
barriers, which results in exacerbated hot spots. When power is evenly
distributed, the temperature gradient over the tier is much smaller and the heat
flows mostly perpendicular to the die substrates in the 3D stack.
Most of the heat generated in applications with intensive power
dissipation, such as HPC, flows toward efficient heat sinks usually mounted on top
of the package. In contrast, for applications with limited power
dissipation, the circuit is mainly cooled from the bottom, and heat flows toward
the PCB through the package substrate.
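The effect of the boundary heat transfer coefficients can be illustrated with the standard convective resistance R = 1/(h·A). The Python sketch below uses the HTC values annotated in Figure 7.28 (25, 2000, and 300 W/K·m2) and assumes, for illustration only, that each coefficient acts on the full 12 mm × 12 mm package footprint; the function name and the choice of area are assumptions, and the result is not a substitute for the finite-element simulations of the study.

```python
def convective_resistance_k_per_w(h_w_per_m2k: float, area_m2: float) -> float:
    """Thermal resistance of a convective boundary, R = 1 / (h * A), in K/W."""
    if h_w_per_m2k <= 0 or area_m2 <= 0:
        raise ValueError("h and A must be positive")
    return 1.0 / (h_w_per_m2k * area_m2)


PACKAGE_AREA_M2 = 0.012 * 0.012  # assumed: 12 mm x 12 mm package footprint

if __name__ == "__main__":
    for label, h in (("low-power", 25.0), ("HPC", 2000.0), ("300 W/K·m2", 300.0)):
        r = convective_resistance_k_per_w(h, PACKAGE_AREA_M2)
        print(f"{label:>12}: R = {r:7.2f} K/W")
```

With these assumptions, the HPC boundary is roughly 80 times less resistive than the low-power one, which is consistent with heat flowing mainly toward the top heat sink in the HPC scenario.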
This study also considers applications with similar power budget allo-
cated either in a single or in multiple tiers. Taking into account voltage and
FIGURE 7.29
Examples of power application scenarios. (a) Strong hot spots in a single tier, (b) limited
uniform power in a single tier, (c) limited uniform power distributed between tiers, and
(d) intensive uniform power dissipation.
TABLE 7.7
Power Dissipation Profiles, according to Application Scenarios Described in
Figure 7.29
Quantity   Hot Spot               Single-Tier            Multitiers             Multitiers HPC
of Tiers   Power (W)  Density a   Power (W)  Density a   Power (W)  Density a   Power (W)  Density a
2          2.64       365         2.657      3.65        2.657      1.825       5.3        3.65
4          2.64       365         2.657      3.65        2.657      0.913       10.6       3.65
8          2.64       365         2.657      3.65        2.657      0.456       21.2       3.65
Power: total power (W). Density: power density (W/cm2).
a In active regions.
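The columns of Table 7.7 are related by simple arithmetic, which the following Python sketch checks. All numbers are transcribed from the table. The interpretation is an assumption made here: in the multitier scenario the single-tier power is taken to be split evenly across tiers, and in the HPC scenario every tier is taken to dissipate the single-tier power. Function and constant names are hypothetical.

```python
# Values transcribed from Table 7.7 (W and W/cm2).
SINGLE_TIER_POWER_W = 2.657
SINGLE_TIER_DENSITY = 3.65
MULTITIER_DENSITY = {2: 1.825, 4: 0.913, 8: 0.456}
HPC_TOTAL_POWER_W = {2: 5.3, 4: 10.6, 8: 21.2}


def multitier_density(n_tiers: int) -> float:
    """Per-tier density when the single-tier power is split evenly over n tiers."""
    return SINGLE_TIER_DENSITY / n_tiers


def hpc_total_power(n_tiers: int) -> float:
    """Total power when every tier dissipates the single-tier power."""
    return n_tiers * SINGLE_TIER_POWER_W


def active_area_cm2(power_w: float, density_w_per_cm2: float) -> float:
    """Active area implied by a (power, power density) pair."""
    return power_w / density_w_per_cm2


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, f"{multitier_density(n):.3f} W/cm2", f"{hpc_total_power(n):.1f} W")
    print(f"uniform active area: {active_area_cm2(2.657, 3.65):.3f} cm2")
    print(f"hot-spot area:       {active_area_cm2(2.64, 365.0) * 100:.3f} mm2")
```

The computed values match the tabulated ones to within rounding, and the implied areas show that the hot-spot case concentrates a similar power on an area about one hundred times smaller.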
FIGURE 7.30
(a) Peak temperature in the case of hot-spot dissipation considering very few die-to-die
connections (ratio <1%) or (b) max connection density between tiers (ratio of 25% for TSV-based
and 10% for CoolCube™). Hot spots are active only in midmost tier to account for similar heat
flow paths in both vertical directions. [Plot axes: peak temperature versus die thickness (μm); curves for μbumps, Cu/SiO2, and CoolCube™ stacks of 2, 4, and 8 dies.]
FIGURE 7.31
Thermal maps of the middle and topmost tiers in the case of hot-spot dissipation (Figure 7.29a)
in an eight-die stack: (a) parallel stacking with μbumps, (b) parallel stacking with Cu/SiO2
direct bonding, and (c) sequential stacking CoolCube™. [Color scale: 70°C to 100°C.]
benefits from both thermal coupling and lateral heat dissipation in the inter-
mediate tiers, therefore presenting the best thermal performance in most 3D
stacks with more tiers. This difference in the thermal profiles is depicted in
the temperature maps in Figure 7.31, where the hot spot outlines reveal the
lateral heat dissipation through the die substrates.
FIGURE 7.32
Peak temperature resulting from a power application scenario with intensive power
dissipation uniformly distributed over the tiers (Figure 7.29d) and in the case of (a) very few
connections between tiers and (b) max density connections between tiers. [Plot axes: peak temperature (°C) versus die thickness (μm); curves for μbumps, Cu/SiO2, and CoolCube™ stacks of 2, 4, and 8 dies.]
References
1. L. Di Cioccio et al., An overview of patterned metal/dielectric surface bonding:
Mechanism, alignment and characterization, JECS, 2011, pp. 81–86.
2. M. Goto et al., Three-dimensional integrated CMOS image sensors with
pixel-parallel A/D converters fabricated by direct bonding of SOI layers, IEEE,
Electron Device Meeting IEDM, 2014.
3. L. Brunet et al., First demonstration of CMOS over CMOS 3D VLSI CoolCubeTM
integration on 300 mm wafers, IEEE VLSI, 2016.
4. V. Balan et al., CMP process optimization for bonding applications, ICPT 2012,
October 15–17, 2012, Grenoble, France, pp. 177–183.
5. L. Benaissa et al., Next generation image sensor via direct hybrid bonding,
2015 IEEE 17th Electronics Packaging Technology Conference, December 2–4, 2015,
Singapore.
6. B. Rebhan et al., <200 nm wafer-to-wafer overlay accuracy in wafer level Cu/
SiO2 hybrid bonding for BSI CIS, Electronics Packaging Technology Conference,
December 2–4, 2015, Singapore.
7. M. Okada et al., High-precision wafer-level Cu-Cu bonding for 3DICs, IEEE,
Electron Device Meeting IEDM, 2014.
8. S. Lhostis et al., Reliable 300 mm wafer level hybrid bonding for 3D stacked
CMOS image sensors, 2016 IEEE 66th Electronic Components and Technology
Conference, May 31–June 5, 2016, Las Vegas.
9. L. Wang et al., Direct bond interconnect (DBI) for fine-pitch bonding in 3D and
2.5D integrated circuits, Pan Pacific Microelectronics Symposium, 2017, IEEE.
10. Z.Y. Liu et al., Detection and formation mechanism of micro-defects in ultrafine
pitch Cu-Cu direct bonding, Chinese Physics B, 25(1), 018103, 2016.
11. Q.Y. Tong and U. Gösele, Semiconductor Wafer Bonding Science and Technology,
1999, Wiley, New York, 320 p.
12. W.P. Maszara et al., Bonding of silicon wafers for silicon–on-insulator, Journal of
Applied Physics, 64(10), 4943–4950, 1988.
13. P. Gueguen et al., Copper direct-bonding characterization and its interests for
3D integration, Journal of the Electrochemical Society, 156(10), H772–H776, 2009.
14. P. Gueguen et al., Direct bonding: An innovative 3D interconnect, ECTC
Proceeding, 2010, pp. 878–883.
15. L. Di Cioccio, F. Baudin, P. Gergaud et al., Modelling and integration phenomena
of metal-metal direct bonding technology, ECS Transactions, 64(5), 339–355, 2014.
16. Y. Beilliard et al., Advances toward reliable high density Cu–Cu interconnects
by Cu-SiO2 direct hybrid bonding, IEEE International 3D System Integration
Conference (3DIC), 2014.
17. Imec and EVG demonstrate for the first time 1.8 μm pitch overlay accuracy for
wafer bonding. https://www.evgroup.com/en/about/news/2017_01_imec.
18. R. Taïbi et al., Full characterization of Cu/Cu direct bonding for 3D integration,
ECTC Proceedings, 2010, pp. 219–225.
19. R. Taïbi et al., Investigation of stress induced voiding and electromigration
phenomena on direct copper bonding interconnects for 3D integration, IEEE
Electron Device Meeting IEDM, 2011.
20. S. Moreau et al., Mass transport-induced failure of hybrid bonding-based
integration for advanced image sensor applications, 2016 IEEE 66th Electronic
Components and Technology Conference, May 31–June 5, 2016, Las Vegas, NV.
21. J. Jourdon et al., Effect of passivation annealing on the electromigration properties of
hybrid bonding stack, IEEE International Reliability Physics Symposium (IRPS), 2017.
22. A. Garnier et al., Electrical performance of high density 10 μm diameter 20 μm
pitch Cu-Pillar with chip to wafer assembly, IEEE 67th Electronic Components and
Technology Conference, May 30–June 2, 2017, Orlando.
23. Sony IMX260 in Samsung Galaxy S7: Stacked or not?
http://image-sensors-world.blogspot.fr/2016/03/sony-imx260-in-samsung-galaxy-s7.html.
24. C. Cavaco et al., Copper oxide direct bonding of 200 mm CMOS wafers with
five metal levels and TSVs: Morphological and electrical characterization, JECS,
2016, pp. 43–46.
25. Y. Beilliard et al., Chip to wafer copper direct bonding electrical characterization
and thermal cycling, IEEE International Conference on 3D System Integration (3D IC),
2013.
26. T. Fukushima et al., Transfer and non-transfer stacking technologies based
on chip-to-wafer self-assembly for high throughput and high precision
alignment and microbumps bonding, IEEE International 3D Systems Integration
Conference, 2015.
27. L. Sanchez et al., Chip to wafer direct bonding technologies for high density 3D
integration, IEEE Electronic Components and Technology Conference, 2012.
Toward Three-Dimensional High Density 181
CONTENTS
8.1 Introduction
8.2 Stacked Terahertz Optical Component
8.2.1 THz Wave Applications
8.2.2 Issues of Common THz Polarizers
8.2.3 Fundamentals and Fabrication Methods
8.2.3.1 Structure Design of THz Polarizer
8.2.3.2 Fabrication Methods and Low-Temperature Eutectic Liquid Bonding
8.2.4 Performance of Stacked THz Polarizer
8.2.4.1 Bonding and Etching Qualities
8.2.4.2 High Transmittance THz Polarizer
8.2.4.3 Broadband THz Polarizer
8.2.5 Comparison between Common and Stacked THz Polarizers
8.3 Pressure-Sensing System
8.3.1 Pressure-Sensing Platform
8.3.2 Pressure Sensor
8.3.3 Micropin-Fin Heat Sink Interposer
8.3.4 Integration of Micropin-Fin Heat Sink Interposer and Chips with Double-Self-Assembly Approach
8.3.5 Achievements and Outlook
8.4 Neurosensing Systems
8.4.1 Fabrication, Scheme, and Reliability of Neural Sensing Biosensor
8.4.1.1 Three-Dimensional System-in-Packaging Neural Sensing Biosensor
8.4.1.2 Chip-Level Heterogeneous Integration Scheme
8.4.1.3 Electroplating Solution Improvement
8.4.1.4 Electrical Reliability Tests
8.4.2 2.5D-Silicon Interposer Neural Sensing Biosensor
8.4.2.1 Fabrication of Silicon Interposer and Through-Silicon Via-Embedded μ-Probe
8.4.2.2 Electrical Reliability Test
8.4.3 2.5D-Flexible Interposer Neural Sensing Biosensor
8.4.3.1 Fabrication of Flexible Interposer
8.4.3.2 Novel Flexible Bonding Approach
8.4.3.3 Electrical Reliability Test of Novel Thin Film-Bonding Approaches
8.4.4 Demonstration
References
8.1 Introduction
With the growing demands of high-performance computing and Internet of Things
(IoT) applications, high I/O counts with fast signal transmission and the
ability to integrate a variety of components have become equally significant in
current semiconductor development. Platforms based on advanced 3D packaging and
integration can meet the requirements for high speed, low power, small form
factor, and heterogeneous integration. Successfully manufactured advanced
products such as graphics processing units (GPUs) and mobile processors adopt
key 3D integration and advanced packaging technologies such as the fan-out
scheme, whereas advanced complementary metal–oxide–semiconductor (CMOS) image
sensors use through-silicon vias (TSVs) and fine-pitch Cu direct bonding. Beyond
these applications, 3D integration and heterogeneous technologies create many
opportunities for advanced systems that were difficult to realize in the past.
This chapter describes examples of novel platforms and application
demonstrations, including optical components, pressure sensors, and bioneural
applications.
[Figure 8.1 diagram labels: wire-grid polarizer with Si substrate (mass production, large area, and robust structure by using thick substrate); hide wire-grid (seal wire-grid by wafer bonding); antireflection layers on both sides of the structure. Silicon wafers as substrate: (1) suitable material, (2) compatible for semiconductor manufacture, and (3) high reflectance. Advantages: low cost, robust structure, high transmittance, no precision optical alignment required.]
FIGURE 8.1
The designed diagram and advantages of new type THz polarizer.
$$t_{AR} = \frac{\lambda_0}{4 n_{AR}} \tag{8.1}$$

$$n_{AR} = \sqrt{n_{Si}} = \sqrt{3.4} = 1.84 \tag{8.2}$$
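Equations 8.1 and 8.2 can be checked numerically with a short sketch. The center frequency of 1.0 THz used below is a hypothetical value chosen only for illustration, and the function names are not from the source; only n_Si = 3.4 and the two formulas come from the text.

```python
import math

SPEED_OF_LIGHT_UM_PER_S = 299_792_458 * 1e6  # c expressed in micrometers per second


def ar_index(n_substrate: float) -> float:
    """AR-layer index as the square root of the substrate index (Equation 8.2)."""
    return math.sqrt(n_substrate)


def quarter_wave_thickness_um(center_freq_thz: float, n_ar: float) -> float:
    """Quarter-wave AR thickness in micrometers (Equation 8.1)."""
    wavelength_um = SPEED_OF_LIGHT_UM_PER_S / (center_freq_thz * 1e12)
    return wavelength_um / (4.0 * n_ar)


n_ar = ar_index(3.4)                             # n_Si = 3.4 from the text
t_ar_um = quarter_wave_thickness_um(1.0, n_ar)   # 1.0 THz is an assumed center frequency
print(f"n_AR = {n_ar:.2f}, t_AR = {t_ar_um:.1f} um")
```

Under these assumptions the AR layer comes out roughly 40 μm thick, which illustrates why a deposited thick film is impractical and a layer etched into the substrate is attractive.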
[Figure 8.2 schematic labels: Si substrate, AR layer with index n_AR, incident wavelength λ0, reflected wave R, transmitted wave T, and a 15 μm dimension label.]
FIGURE 8.2
AR coating with quarter-wavelength thickness leads to destructive interference, and the index
can be controlled by the hole-array pitch under the zero-order effective medium approach.
Novel Platform and Applications Using 3D Integration Technologies 189
coating and substrate, and also get peeled. Therefore, the Si substrate is
etched by deep reactive-ion etching (DRIE) to form cylindrical holes, and both
the index and the thickness of the AR layer are tuned to their target values
according to the effective medium theory. The effective index of the two mixed
materials is given by Equation 8.3, where f is the filling factor of Si and air
and is set to 0.35 to achieve the effective index. As the pitch of the etched
holes is less than λ0/10, the AR layer behaves like a homogeneous layer with the
effective index, as shown in Figure 8.2.
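The relation between filling factor and effective index can be sketched as follows. The linear volume-weighted mixing rule used here is an assumption, chosen because it reproduces n_AR = 1.84 from Equation 8.2 when f = 0.35 is taken as the Si fraction and n_Si = 3.4; it is not quoted from the source. The 15 μm pitch and 300 μm wavelength in the check are likewise hypothetical inputs.

```python
def effective_index(f_si: float, n_si: float = 3.4, n_air: float = 1.0) -> float:
    """Assumed zero-order mixing rule: volume-weighted average of the two indices."""
    if not 0.0 <= f_si <= 1.0:
        raise ValueError("filling factor must lie between 0 and 1")
    return f_si * n_si + (1.0 - f_si) * n_air


def is_effective_medium(pitch_um: float, wavelength_um: float) -> bool:
    """True when the hole pitch is below one tenth of the wavelength, as the text requires."""
    return pitch_um < wavelength_um / 10.0


n_eff = effective_index(0.35)                    # f = 0.35 from the text
homogeneous = is_effective_medium(15.0, 300.0)   # hypothetical pitch and wavelength
print(f"n_eff = {n_eff:.2f}, behaves as homogeneous layer: {homogeneous}")
```

If a different mixing rule is intended, only the body of `effective_index` needs to change; the pitch check is independent of it.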
[Figure 8.3 labels: (a) process steps: PR patterning and AR etching; PR removing; PR patterning on top wafer and on bottom wafer; thin buffer layer/Cu/Ti grating deposition and In/Sn/Ni/Ti sealing deposition; thin buffer layer/Cu/Ti grating lift-off and In/Sn/Ni/Ti sealing lift-off; wafer bonding. (b) layer thicknesses: Ti 200 Å, Cu 3000 Å, thin buffer layer 100 Å, In 3000 Å, Sn 3000 Å, Ni 200 Å, Ti 200 Å.]
FIGURE 8.3
(a) The process flow of stacked THz polarizer fabrication (From Chi, N.-C. et al., IEEE ECTC,
1793–1798, 2017.) and (b) the bonding structure for Cu wire-grid sealing. (From Chi, N.-C. et al.,
SPIE Optics + Optoelectronics, 102420Z–102420Z-6, 2017.)
the designed filling factor and AR thickness. The Cu wire-grid gratings and
low-temperature In/Sn solder sealing rings are deposited with their respective
thicknesses and patterned on the other side of the respective Si wafers after
the AR layer is fabricated, as shown in Figure 8.3b. The two wafers are bonded
face to face at 150°C for 30 minutes. The bonding mechanism relies on the low
eutectic point (118°C) of the In/Sn alloy, so the polarizer can be bonded at
low temperature [4]. An ultrathin buffer layer can temporarily suppress
Cu-In/Sn interdiffusion and ensure that the remaining In/Sn melts even at
submicron thickness [5]. This demonstrates a fabrication method for a robust
THz polarizer with AR layers using 3D-IC technologies. The Cu wire-grid is
sealed inside the bonded wafers to prevent corrosion.
[Figure 8.4 image labels: size 2 cm × 2 cm; 12 polarizers for a bonded wafer.]
FIGURE 8.4
(a) SAT image showing the excellent bonding quality and (b) the finished component of THz
polarizers. (From Yu, T.-Y. et al., SPIE Optical Engineering + Applications, 95850L–95850L-7, 2015.)
FIGURE 8.5
SEM images and parameters set up of AR layers. (From Yu, T.-Y. et al., SPIE Optical Engineering +
Applications, 95850L–95850L-7, 2015.)
As discussed earlier, the depth and filling factor of the etched holes
control the central frequency and the refractive index of the AR layer.
Figure 8.5 shows the designed parameters and SEM measurement
results of the three different AR layers. Comparing the etching results
with the designed parameters, all samples are close to the original design
except the hole diameter of AR3, which is slightly larger. The larger holes
make n_AR smaller than the designed value because the filling factor is
smaller, according to Equation 8.3. A modified etching process would improve
the etching profile control.
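The sensitivity of n_AR to hole diameter can be illustrated with a small sketch. It assumes cylindrical holes on a square lattice and a linear volume-weighted mixing rule; both are assumptions rather than details given in the source. The 15 μm pitch and the two diameters are hypothetical values, not the AR3 measurements.

```python
import math


def si_filling_factor(hole_diameter_um: float, pitch_um: float) -> float:
    """Si area fraction for circular holes on a square lattice (assumed geometry)."""
    if hole_diameter_um > pitch_um:
        raise ValueError("holes would overlap: diameter exceeds pitch")
    hole_fraction = math.pi * hole_diameter_um ** 2 / (4.0 * pitch_um ** 2)
    return 1.0 - hole_fraction


def effective_index(f_si: float, n_si: float = 3.4, n_air: float = 1.0) -> float:
    """Assumed linear mixing rule for the AR-layer index."""
    return f_si * n_si + (1.0 - f_si) * n_air


pitch_um = 15.0                                  # hypothetical pitch
n_design = effective_index(si_filling_factor(13.6, pitch_um))     # hypothetical design diameter
n_oversized = effective_index(si_filling_factor(14.0, pitch_um))  # slightly over-etched hole
print(f"design n_AR = {n_design:.3f}, over-etched n_AR = {n_oversized:.3f}")
```

In this sketch a 0.4 μm increase in diameter lowers the index by several percent, consistent with the direction of the shift described for AR3.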
[Figure 8.6 setup labels: AR layers, commercial wire grid polarizer, TE, TM, E, k, 90°, THz TDS system under low-humidity environment.]
FIGURE 8.6
Stacked THz polarizer measurement by THz-TDS through commercial polarizer. (From Yu,
T.-Y. et al., SPIE Optical Engineering + Applications, 95850L–95850L-7, 2015.)
[Figure 8.7 plots: transmittance versus frequency (0.2 to 2.4 THz) for TM and TE modes with AR1, AR2, and AR3; (a) linear scale from 0.0 to 1.0, (b) log scale from 1E−4 to 1.]
FIGURE 8.7
(a and b) The transmittance spectrum of TE and TM mode of stacked THz polarizers in linear
and log scale. (From Yu, T.-Y. et al., SPIE Optical Engineering + Applications, 95850L–95850L-7, 2015.)
[Figure 8.8 plots: transmittance versus frequency (0.2 to 2.0 THz) in four panels (a) to (d); curve labels AR1×AR1, AR2×AR2, AR1×AR2, AR3×AR3, and AR3×AR1; annotations "Peak width: 1.1 THz" and ">250 GHz".]
FIGURE 8.8
Transmittance spectra of (a) TE and (b) TM mode of stacked THz with AR1×AR2 layers, and
(c) TE and (d) TM mode of THz with the AR1×AR3 layers. (From Chi, N.-C. et al., SPIE Optics +
Optoelectronics, 102420Z–102420Z-6, 2017; Chi, N.-C. et al., IEEE ECTC, 1793–1798, 2017.)
TABLE 8.1
Overview of Some Designed and Commercial THz Polarizers

| Group | Transmittance | Extinction Ratio (dB) | Measured Bandwidth (THz) | Structure |
|---|---|---|---|---|
| Hsieh et al. | 0.60 ~ 0.85 | 20 ~ 40 | 0.2 ~ 1.0 | Feussner polarizer based on a liquid crystal |
| Ren et al. | 0.75 ~ 0.95 | 18 ~ 50 | 0.2 ~ 2.2 | Carbon nanotube (CNT) layers |
| Kyoung et al. | 0.45 ~ 0.60 | 27 ~ 38 | 0.1 ~ 2.0 | Reel-wound CNT multilayers |
| Wojdyla and Gallot | 0.90 ~ 0.99 | 37.8 (max) | N/A | Brewster's polarizer using a stack of silicon wafers |
| Yamada et al. | 0.95 | 35 ~ 50 | 0.5 ~ 3.0 | Al on thick Si substrate angled at Brewster angle |
| MacPherson et al. | 0.91 ~ 0.98 | 20 ~ 37 | 0.2 ~ 2.0 | Thin-film aluminum on SiO2 |
| Microtech | 0.9 ~ 1.0 | 20 ~ 40 | 0 ~ 3 | Free standing wire grid |
| Microtech | 0.6 ~ 1.0 | 20 ~ 40 | 0 ~ 3 | Wire-grid pattern on thin substrate |
| Tydex | 0.9 ~ 0.95 | ~20 | 0 ~ 3 | Wire-grid pattern on thin substrate |
| Stacked THz polarizers | 0.5 ~ 1.0 | 20 ~ 40 | 0.2 ~ 2 | Wire-grid sealed into Si wafer and AR processing on surface |

Source: Yan, F. et al., J. Infrared Millim. Terahertz Waves, 34, 489–499, 2013.
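The extinction ratios in Table 8.1 are quoted in decibels. A minimal sketch of the usual conversion from a pair of transmittances is shown below; it assumes the ratio is taken between the passed and the blocked polarization, and the input values are hypothetical rather than measured data from the chapter.

```python
import math


def extinction_ratio_db(t_pass: float, t_block: float) -> float:
    """Extinction ratio in dB from pass-axis and block-axis power transmittance."""
    if t_pass <= 0.0 or t_block <= 0.0:
        raise ValueError("transmittances must be positive")
    return 10.0 * math.log10(t_pass / t_block)


# Hypothetical example: 90% transmittance for the passed mode, 0.1% for the blocked mode.
er_db = extinction_ratio_db(0.9, 1e-3)
print(f"extinction ratio = {er_db:.1f} dB")
```

On this scale, the 20 to 40 dB range listed for the stacked polarizer corresponds to a blocked-mode transmittance between one hundredth and one ten-thousandth of the passed mode.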
[Figure 8.9 block labels: pressure sensor, readout circuit, wireless.]
FIGURE 8.9
Concept of pressure-sensing system for monitoring external pressure variation. (From Hu,
Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
[Figure 8.10 block labels: sensor, gain stage, wireless, phone app.]
FIGURE 8.10
Pressure-sensing system consists of sensor chip, readout circuit chip, wireless module, and
receiver. (From Hu, Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
[Figure 8.11 block labels: VCC, Ref, G, ADC.]
FIGURE 8.11
Block diagram of the proposed pressure-sensing system. The signal begins from the pressure
sensor to the receiver. (From Hu, Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
[Figure 8.12 labels: membrane, readout circuit, TSV, cavity, bonded joint, RDL, μ-pin fin, ENIG.]
FIGURE 8.12
Schematic of the integrated readout circuit chip, pressure sensor chip, and micropin-fin
interposer. Cross-sectional and top view of the pressure sensor. (From Hu, Y.-C. et al., IEEE IEDM,
9.2.1–9.2.4, 2016.)
[Figure 8.13 labels: bonded joint, RDL, TSV, micropin-fin, ENIG, TSV-embedded micropin-fin.]
FIGURE 8.13
Schematic of the micropin-fin heat sink interposer and SEM image of the micropin-fin. (From
Hu, Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
interposer, two RDLs are fabricated for connecting the sensor chip and the
circuit chip. Cu/In-bonded pads are formed on the RDL. On the backside of the
interposer, electroless nickel immersion gold (ENIG) is chosen for the bonded
pads. A DRIE process is adopted to etch the deep micropin-fin shape. The
relationship between micropin depth and temperature reduction is simulated. TSV
daisy chain electrical characteristics, thermal cycling test (TCT) from −55°C
to 125°C for 750 loops, and highly accelerated stress test (HAST) at 130°C for
96 hours have been investigated [14].
[Figure 8.14 labels: readout circuit, MEMS, interposer, TSV, hydrophobic SiO2, liquid, carrier wafer.]
FIGURE 8.14
Process flow of double-self-assembly approach: (a) Interposer is temporarily attached on the
carrier wafer for the first self-assembly, (b) readout circuit and sensor chips are permanently
attached on the interposer for the second self-assembly, and (c) the integrated assembly chips
are detached from the carrier wafer. (From Hu, Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
[Figure 8.15 labels: Cu/In IMC, chip, μ-pin fin interposer, 500 nm scale bar.]
FIGURE 8.15
Cross-sectional SEM image of Cu/In-bonded joint with no void or seam. (From Hu, Y.-C. et al.,
IEEE IEDM, 9.2.1–9.2.4, 2016.)
[Figure 8.16 labels: MEMS chip, interposer, circuit chip, carrier wafer, FDTS film.]
FIGURE 8.16
Photograph of the proposed integrated pressure-sensing system with double-self-assembly
approach. (From Hu, Y.-C. et al., IEEE IEDM, 9.2.1–9.2.4, 2016.)
sensor chip, and micropin-fin interposer. Chips with Cu-In bonded pads are
successfully self-assembled on the interposer. Moreover, the interposer is
temporarily attached to the carrier wafer and cleanly debonded with the first
self-assembly approach. Various sizes of interposer and chip are investigated
in relation to the optimized liquid volume. The resulting electrical
characteristics show high-accuracy alignment and stable reliability, which
implies the feasibility of double self-assembly. Heterogeneous integration
can be further extended to other chip types, since chips that are difficult to
handle can be processed with this technology.
[Figure 8.17 labels: ENIG film, 11.5 μm height, pillar bump.]
FIGURE 8.17
SEM cross-section image of chip-level heterogeneous integration scheme. (From Hu, Y.-C.
et al., IEEE Trans. Electron Devices, 62, 4148–4153, 2015.)
[Figure 8.18 labels, panel (c): upper chip, lower chip, ILD, top metal, PD, ENIG, IMC, Sn, Cu.]
FIGURE 8.18
The schematic of material-phase transformation during bonding process. (a) ENIG film and
concave-shaped bump formation before bonding, (b) During bonding, Sn squeezes outside
and inside the concave-shaped bump, (c) Pure Sn is surrounded by the middle of IMC. (From
Hu, Y.-C. et al. IEEE Trans. Electron Devices, 62, 4148–4153, 2015.)
planarization (CMP) process, which is very expensive and dirty. In this study,
the bonding quality is improved efficiently by adjusting the plating current
and changing the plating solution. The adopted methodology is comparatively low
in cost, highly efficient, and compatible with semiconductor processes. The
plating results under different plating currents and solutions show
that the height difference between the bump center and edge is less than 1 μm
[19]. During the plating process, the chloride in the improved solution links
with the Cu+ ion to form the suppression layer on the cathode. Thus, the Cu2+
on the surface of the cathode will not be inhibited.
[Figure 8.19 labels: Parylene-C covered, connector, 90 nm tech node neural chip (two chips), TSV, interposer, μ-probe.]
FIGURE 8.19
Schematic of 2.5D-silicon interposer neural sensing biosensor. (From Hu, Y.-C. et al., IEEE
Trans. Electron Devices, 64, 1666–1673, 2017.)
FIGURE 8.20
Process flow of silicon interposer. (a) TSV etching, (b) oxide liner and seed layer deposition,
(c) TSV plating, (d) Cu CMP, (e) RDL and ENIG pad fabrication, (f) temporary bonding, (g) wafer
thinning, (h) 2 RDLs fabrication, (i) microbump plating, and (j) handle wafer debond. (From
Hu, Y.-C. et al., IEEE Trans. Electron Devices, 64, 1666–1673, 2017.)
[Figure 8.21 labels: PR in panels (a) to (c); SF6 in panel (b).]
FIGURE 8.21
Process flow of TSV-embedded μ-probe. (a) PR is patterned as the hard mask for the following
etching, (b) isotropic etching for concave shape around probe opening, and (c) Bosch process
is adopted for probe formation. (From Hu, Y.-C. et al., IEEE Trans. Electron Devices, 64, 1666–1673, 2017.)
resistance and bonded-joint resistance before and after 1000 thermal
cycling loops, and the resistance of the Cu TSVs in series and of the bonded
joints before and after the unbiased HAST at 130°C, 85% RH for 96 hours. All
the reliability test results show good reliability of the 2.5D neural sensing
biosensor scheme [18].
FIGURE 8.22
Schematic of 2.5D-flexible interposer neural sensing biosensor. (From Huang, Y.-C. et al.,
Symposium on VLSI Technology, pp. 218–219, 2016.)
TCT tests and the unbiased HAST at 130°C, 85% RH for 96 hours, respectively,
show good bonding results, and the specific contact resistance
is better than that of the conventional bonding approaches [20]. Thus, this
bonding approach shows the potential to improve the electrical properties of
future flexible packaging.
8.4.4 Demonstration
The overall schemes, fabrication procedures, and electrical and reliability
measurements of three types of neural sensing biosensor schemes have been
discussed. Demonstrations of a 2.5D-silicon interposer neural sensing
biosensor and a 2.5D-flexible interposer neural sensing biosensor are shown
in Figures 8.23 and 8.24, respectively. Key technologies of 3D IC are used in
these biosensors, such as heterogeneous integration, through-silicon/flexible
vias, wafer-level thinning, and chip-level bonding. Circuit chips, interposer,
and TSV-embedded μ-probe are integrated within a small form factor. The
reliability of the bonded joints and TSVs is also investigated. These results
show the potential of 3D IC technologies for the biosensor scheme and
other products in the near future.
[Figure 8.23 labels: RDL, interposer, chips, 40 μm scale bar, TSV, μ-probe, Sn-solder, μ-probe embedded; panel (f) horizontal axis: displacement (mm), 0.0 to 2.0.]
FIGURE 8.23
Photos of 2.5D-silicon interposer neural sensing biosensor. (a) Side view, (b) Cross-sectional
view of circuit chip-interposer bonded joint, (c) Cross-sectional view of shunt-connected
interposer-probe bonded joint, (d) μ-needle SEM image, (e) Photo of neural sensing microsystem,
(f) Shear test diagram of the 2.5-D heterogeneous integrated neural sensing microsystem.
(From Hu, Y.-C. et al., IEEE Trans. Electron Devices, 64, 1666–1673, 2017.)
[Figure 8.24 labels: interposer, die0, die1, die2, die3, connector, TSV-embedded μ-needles array; dimension labels 15 mm, 5 mm, and 3 mm.]
FIGURE 8.24
Photos of 2.5D-flexible interposer neural sensing biosensor. (a) Top view, (b) bottom view,
(c) 256-ch microsystem, and (d) bending. (From Huang, Y.-C. et al., Symposium on VLSI
Technology, pp. 218–219, 2016.)
References
1. R. A. Lewis, A review of terahertz sources, Journal of Physics D: Applied Physics,
47, 374001, 2014.
2. X. Yin, B. W.-H. Ng, and D. Abbott, Terahertz sources and detectors, in
Terahertz Imaging for Biomedical Applications: Pattern Recognition and Tomographic
Reconstruction, New York: Springer, 2012, pp. 9–26.
3. T.-Y. Yu, H.-C. Tsai, S.-Y. Wang, C.-W. Luo, and K.-N. Chen, High transmittance
silicon terahertz polarizer using wafer bonding technology, in SPIE Optical
Engineering + Applications, 2015, pp. 95850L–95850L-7.
4. H.-W. Liang, T.-Y. Yu, Y.-J. Chang, and K.-N. Chen, Asymmetric low temper-
ature bonding structure using ultra-thin buffer layer technique for 3D inte-
gration, in 2016 IEEE 23rd International Symposium on the Physical and Failure
Analysis of Integrated Circuits (IPFA), 2016, pp. 312–315.
5. Y.-J. Chang, Y.-S. Hsieh, and K.-N. Chen, Submicron Cu/Sn bonding technol-
ogy with transient Ni diffusion buffer layer for 3DIC application, IEEE Electron
Device Letters, 35, 1118–1120, 2014.
6. N.-C. Chi, T.-Y. Yu, H.-C. Tsai, S.-Y. Wang, C.-W. Luo, and K.-N. Chen, High trans-
mittance and broaden bandwidth through the morphology of anti-relfective
layers on THz polarizer with Si substrate, in SPIE Optics + Optoelectronics, 2017,
pp. 102420Z–102420Z-6.
7. N.-C. Chi, T.-Y. Yu, H.-C. Tsai, S.-Y. Wang, C.-W. Luo, Y.-T. Yang, K.-N. Chen,
High transmittance broadband THz polarizer using 3D-IC technologies, in
IEEE Electronic Components and Technology Conference (ECTC), 2017, 1793–1798.
21. Y.-C. Hu, C.-P. Lin, Y.-J. Chang, N.-S. Chang, M.-H. Sheu, C.-S. Chen, and
K.-N. Chen, A novel flexible 3-D heterogeneous integration scheme using elec-
troless plating on chips with advanced technology node, IEEE Transactions on
Electron Devices, 62, 4148–4153, 2015.
22. Y.-C. Huang, Y.-C. Hu, P.-T. Huang, S.-L. Wu, Y.-H. You, J.-M. Chen et al.,
Integration of neural sensing microsystem with TSV-embedded dissolv-
able μ-needles array, biocompatible flexible interposer, and neural recording
circuits, in Symposium on VLSI Technology, 2016, pp. 218–219.