A Scheme of High-Dimensional Key-Variable Search Algorithms For Yield Improvement
all the inputs: production information ($X_R$, $X_P$, and $y$), defects ($D$), and final inspections ($Y$). The characteristics of these inputs are described below.

$X_R$ needs to be discretized into 1 or 0, indicating whether or not the workpiece passes through a given stage. $X_P$ contains tool process data (such as voltage, pressure, temperature, etc.) which need to be centralized. $y$ stands for inline inspection data (such as critical dimension, thickness, etc.) which need to be centralized. As for $D$, different companies have different definitions of defects, so discussion with domain experts is required before executing data preprocessing and quality check. Finally, $Y$ stands for the yield test results, which should be centralized.

The data quality evaluation algorithm of $X_R$, $DQI_{X_R}$, evaluates the following four facts: 1) a stage may contain several devices of the same type, while a stage utilizes only one device; if a process should get through three devices, it then has three stages; 2) if a device is used in different processes, the same device in a different process is considered a different stage; 3) there are only two possibilities for a workpiece passing through the device: get through (1) or not (0); 4) a workpiece cannot get through any device that does not belong to that stage.

Similarly, the data quality evaluation algorithms of $X_P$ and $y$ are denoted as $DQI_{X_P}$ and $DQI_{y}$, respectively. Both $DQI_{X_P}$ and $DQI_{y}$ adopt an algorithm similar to the process data quality evaluation scheme utilized in AVM [10], [11].

Finally, the data quality evaluation algorithm of $Y$ is denoted as $DQI_{Y}$. $DQI_{Y}$ applies an algorithm similar to the metrology data quality evaluation scheme used in AVM [10], [11].

B. KSA Module

To double check the reliability of the search results, the KSA module contains two algorithms: TPOGA and ALASSO. They are described below.

1) Triple Phase Orthogonal Greedy Algorithm, TPOGA: The greedy algorithm is a stepwise regression method that considers the correlation between all the causing parameters ($X$) and the results ($Y$). In this study, $X$ includes all the related variables of production: $X_R$, $X_P$, and $y$; while $Y$ represents the final inspection.

Pure greedy algorithm (PGA) and OGA are commonly used in the literature for solving the high-dimensional regression problem. In general, OGA performs better than PGA in high-dimensional linear regression [5]. The steps of OGA are briefly described below.

Step 1: Define $R_0 = Y = (y_1, y_2, \ldots, y_n)^T$. Choose the variable, defined as $x_{\hat{s}_1}$, that is most correlated with $R_0$ in $X = \{x_1, x_2, \ldots, x_p\}$. Then, the corresponding residual will be

$$\hat{R}_1 = R_0 - \hat{\beta}^{1}_{\hat{s}_1} x_{\hat{s}_1} \tag{1}$$

where
$p$ is the number of parameters;
$n$ is the sample size;
$x_{\hat{s}_1}$ is the variable in $X$ with the highest correlation with $R_0$;
$\hat{\beta}^{1}_{\hat{s}_1}$ is the regression coefficient of $R_0$ for $x_{\hat{s}_1}$.

Step 2: Choose another variable, $x_{\hat{s}_2}$, which is most correlated with $\hat{R}_1$ in $X$. Then, the corresponding residual will be

$$\hat{R}_2 = R_0 - \hat{\beta}^{1}_{\hat{s}_1} x_{\hat{s}_1} - \hat{\beta}^{2}_{\hat{s}_2} x_{\hat{s}_2} \tag{2}$$

where $\hat{\beta}^{2}_{\hat{s}_2}$ is the regression coefficient of $R_0$ for $x_{\hat{s}_2}$.

Step 3: Return to Step 2 and repeat $m$ times so as to choose $x_{\hat{s}_1}, x_{\hat{s}_2}, \ldots, x_{\hat{s}_m}$ stepwise and calculate the corresponding regression coefficients ($\hat{\beta}^{1}_{\hat{s}_1}, \hat{\beta}^{2}_{\hat{s}_2}, \ldots, \hat{\beta}^{m}_{\hat{s}_m}$). Then, the corresponding residual will be

$$\hat{R}_m = R_0 - \hat{\beta}^{1}_{\hat{s}_1} x_{\hat{s}_1} - \hat{\beta}^{2}_{\hat{s}_2} x_{\hat{s}_2} - \cdots - \hat{\beta}^{m}_{\hat{s}_m} x_{\hat{s}_m}. \tag{3}$$

However, it is hard to select exactly $m$ features that recover the entire original signal. Therefore, Ing and Lai proposed a termination condition, HDIC [5], which chooses the model along the OGA path that has the smallest value of a suitably chosen criterion. Let

$$\mathrm{HDIC}(J) = n \log \hat{\vartheta}^{2}_{J} + \#(J)\, w \log p \tag{4}$$

with

$$\hat{\vartheta}^{2}_{J} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_{i;J} \right)^2$$

where
$J$ is the set of variables selected in the model ($x_{\hat{s}_1}, x_{\hat{s}_2}, \ldots, x_{\hat{s}_m}$);
$\hat{\vartheta}^{2}_{J}$ is the mean square error of the corresponding model;
$w$ is a general constant penalty ($> 0$);
$y_i$ is the $i$th sample of the actual value of the final inspection;
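As an illustration, the OGA steps and the HDIC criterion in (4) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (oga_path, hdic, select_by_hdic) are hypothetical, the inputs are centralized inside the function, the coefficients of all selected variables are refitted jointly by least squares at every step (one common way to realize the orthogonal step), and the penalty w is left to the caller because the text only requires it to be positive.

```python
import numpy as np


def oga_path(X, y, m):
    """Greedy forward selection of m columns of X (hypothetical helper).

    Returns the selected column indices in selection order and the
    mean squared residual after each step.
    """
    Xc = X - X.mean(axis=0)            # centralize the candidate variables
    r0 = y - y.mean()                  # R0: centralized final inspection
    resid = r0.copy()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = np.inf         # constant columns get a zero score
    selected, mse_path = [], []
    for _ in range(m):
        # Score is proportional to |correlation| with the current residual.
        score = np.abs(Xc.T @ resid) / norms
        if selected:
            score[selected] = -np.inf  # never pick the same column twice
        j = int(np.argmax(score))
        selected.append(j)
        # Assumption: refit R0 on all selected columns by least squares.
        beta, *_ = np.linalg.lstsq(Xc[:, selected], r0, rcond=None)
        resid = r0 - Xc[:, selected] @ beta
        mse_path.append(float(np.mean(resid ** 2)))
    return selected, mse_path


def hdic(mse, k, n, p, w):
    """Criterion (4): n*log(mse) + k*w*log(p) for a model with k variables."""
    return n * np.log(mse) + k * w * np.log(p)


def select_by_hdic(X, y, m, w):
    """Keep the prefix of the OGA path with the smallest HDIC value."""
    n, p = X.shape
    selected, mse_path = oga_path(X, y, m)
    scores = [hdic(mse, k + 1, n, p, w) for k, mse in enumerate(mse_path)]
    k_best = int(np.argmin(scores)) + 1
    return selected[:k_best], scores
```

Calling select_by_hdic(X, y, m, w) with an n-by-p array X and a length-n vector y returns the retained column indices together with the HDIC value at each step of the path. A larger w penalizes model size more heavily and therefore tends to retain fewer variables.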
CHENG et al.: SCHEME OF HIGH-DIMENSIONAL KSAs ALGORITHMS FOR YIELD IMPROVEMENT 183
TABLE I
TOTAL NUMBER OF DEVICES IN THE TFT PROCESS
Fig. 13. Root cause analysis of control voltage on chamber A of equipment A.

that another 3 out of 8 Type 2 Loss samples were processed by the Top 2 device. To find out the root causes, Step 3 of Fig. 8 should be performed by inputting Y and XP into the KSA scheme.

The Top 2 device is selected for illustration. The process data (XP) of the Top 2 device has 27 variables. After conducting the KSA analysis on all the devices of the same stage to which the Top 2 device belongs, the RIK value of the XP search is 0.864 (>0.7), as shown in Fig. 12. Therefore, the search result is reliable, with the Top 1 variable being Control Voltage.

To confirm that “Control Voltage” is the root cause, the “Control Voltage” values of all five chambers (A-E) of Equipment A are drawn in a box plot chart and a hypothesis test is executed, as shown in Fig. 13. The p-value (0.002) is less than 0.05, which indicates that Chamber A’s “Control Voltage” value (circled in red) is indeed lower than those of the other chambers. As such, the root cause of this Type 2 Loss is the abnormality found in Chamber A’s “Control Voltage”.

The KSA scheme was implemented in a computer with i5-[email protected] GHz CPU, 6.0 GB RAM, and 1 TB hard disk. With the illustrative example mentioned above, the execution times of Step 2 (for searching the key stages) and Step 3 (for selecting the key variables in the most suspicious stage) shown in Fig. 8 are 1.22 and 1.36 sec, respectively.

VI. SUMMARY AND CONCLUSION

A high-dimensional Key-variable Search Algorithm (KSA) for yield improvement is proposed in this paper. A two-phase process is suggested to apply the KSA scheme for searching the root causes of yield losses:

1) Phase I: Feed yield test results (Y) as well as total-inspection inline data (y) and/or production routes (XR)

ACKNOWLEDGMENT

The authors would like to thank AUO in Taiwan for providing the raw data used in the illustrative example. A U.S. provisional patent application for this work was also filed on November 30, 2015, under application no. 62/260,656.

REFERENCES

[1] R. Vernon, “International investment and international trade in the product cycle,” Quart. J. Econ., vol. 80, pp. 190–207, 1966.
[2] A. Chen and A. Hong, “Sample-efficient regression trees (SERT) for semiconductor yield loss analysis,” IEEE Trans. Semicond. Manuf., vol. 23, no. 3, pp. 358–369, Aug. 2010.
[3] C. F. Chien, W. C. Wang, and J. C. Cheng, “Data mining for yield enhancement in semiconductor manufacturing and an empirical study,” Exp. Syst. Appl., vol. 33, pp. 192–198, 2007.
[4] C. Y. Hsu, C. F. Chien, K. Y. Lin, and C. Y. Chien, “Data mining for yield enhancement in TFT-LCD manufacturing: An empirical study,” J. Chinese Inst. Ind. Eng., vol. 27, no. 2, pp. 140–156, 2010.
[5] C.-K. Ing and T. L. Lai, “A stepwise regression method and consistent model selection for high-dimensional sparse linear models,” Statistica Sinica, vol. 21, pp. 1473–1513, 2011.
[6] R. Tibshirani, “Regression shrinkage and selection via the LASSO,” J. Roy. Statist. Soc. B, vol. 58, no. 1, pp. 267–288, 1996.
[7] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY, USA: Springer, 2009.
[8] F.-T. Cheng, C.-A. Kao, C.-F. Chen, and W.-H. Tsai, “Tutorial on applying the VM technology for TFT-LCD manufacturing,” IEEE Trans. Semicond. Manuf., vol. 28, no. 1, pp. 55–69, Feb. 2015.
[9] L. Fowler and T. Davis, “Engineering data analysis using discovery,” in Proc. IEEE/SEMI Adv. Semicond. Manuf. Conf., Nov. 1996, pp. 416–422.
[10] Y.-T. Huang and F.-T. Cheng, “Automatic data quality evaluation for the AVM system,” IEEE Trans. Semicond. Manuf., vol. 24, no. 3, pp. 445–454, Aug. 2011.
[11] F.-T. Cheng, H.-C. Huang, and C.-A. Kao, “Developing an automatic virtual metrology system,” IEEE Trans. Autom. Sci. Eng., vol. 9, no. 1, pp. 181–188, Jan. 2012.
[12] F.-T. Cheng, Y.-T. Chen, Y.-C. Su, and D.-L. Zeng, “Evaluating reliance level of a virtual metrology system,” IEEE Trans. Semicond. Manuf., vol. 21, no. 1, pp. 92–103, Feb. 2008.
[13] D. Port, R. Kazman, H. Nakao, and M. Katahira, “Practicing what is preached: 80/20 rules for strategic IV & V assessment,” in Proc. IEEE Int. Conf. Exploring Quantifiable IT Yields, 2007, pp. 45–54.
[14] M. Singson and P. Hangsing, “Implication of 80/20 rule in electronic journal usage of UGC-infonet consortia,” J. Academic Librarianship, vol. 41, no. 2, pp. 207–219, 2015.