QUALITY ENGINEERING USING ROBUST DESIGN

MADHAV S. PHADKE
AT&T Bell Laboratories

P T R Prentice Hall, Englewood Cliffs, New Jersey 07632

Library of Congress Cataloging-in-Publication Data
Phadke, Madhav Shridhar
Quality engineering using robust design / Madhav S. Phadke.
p. cm. Includes index.
ISBN 0-13-745167-9
1. Engineering design. 2. Computer-aided design. 3. UNIX (Computer operating system) 4. Integrated circuits—Very large scale integration. I. Title.
TA174.P49 1989  620'.0042'0285—dc20  89-3927 CIP

© 1989 by AT&T Bell Laboratories
Published by P T R Prentice-Hall, Inc., A Simon & Schuster Company, Englewood Cliffs, New Jersey 07632

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America
ISBN 0-13-745167-9

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To my parents, and Maneesha, Kedar, and Lata.

CONTENTS

Foreword
Preface
Acknowledgments

CHAPTER 1  INTRODUCTION
  1.1 A Historical Perspective
  1.2 What Is Quality?
  1.3 Elements of Cost
  1.4 Fundamental Principle
  1.5 Tools Used in Robust Design
  1.6 Applications and Benefits of Robust Design
  1.7 Organization of the Book
  1.8 Summary

CHAPTER 2  PRINCIPLES OF QUALITY ENGINEERING
  2.1 Quality Loss Function—The Fraction Defective Fallacy
  2.2 Quadratic Loss Function
  2.3 Noise Factors—Causes of Variation
  2.4 Average Quality Loss
  2.5 Exploiting Nonlinearity
  2.6 Classification of Parameters: P Diagram
  2.7 Optimization of Product and Process Design
  2.8 Role of Various Quality Control Activities
  2.9 Summary

CHAPTER 3  MATRIX EXPERIMENTS USING ORTHOGONAL ARRAYS
  3.1 Matrix Experiment for a CVD Process
  3.2 Estimation of Factor Effects
  3.3 Additive Model for Factor Effects
  3.4 Analysis of Variance
  3.5 Prediction and Diagnosis
  3.6 Summary

CHAPTER 4  STEPS IN ROBUST DESIGN
  4.1 The Polysilicon Deposition Process and Its Main Function
  4.2 Noise Factors and Testing Conditions
  4.3 Quality Characteristics and Objective Functions
  4.4 Control Factors and Their Levels
  4.5 Matrix Experiment and Data Analysis Plan
  4.6 Conducting the Matrix Experiment
  4.7 Data Analysis
  4.8 Verification Experiment and Future Plan
  4.9 Summary

CHAPTER 5  SIGNAL-TO-NOISE RATIOS
  5.1 Optimization for Polysilicon Layer Thickness Uniformity
  5.2 Evaluation of Sensitivity to Noise
  5.3 S/N Ratios for Static Problems
  5.4 S/N Ratios for Dynamic Problems
  5.5 Analysis of Ordered Categorical Data
  5.6 Summary

CHAPTER 6  ACHIEVING ADDITIVITY
  6.1 Guidelines for Selecting Quality Characteristics
  6.2 Examples of Quality Characteristics
  6.3 Examples of S/N Ratios
  6.4 Selection of Control Factors
  6.5 Role of Orthogonal Arrays
  6.6 Summary

CHAPTER 7  CONSTRUCTING ORTHOGONAL ARRAYS
  7.1 Counting Degrees of Freedom
  7.2 Selecting a Standard Orthogonal Array
  7.3 Dummy Level Technique
  7.4 Compound Factor Method
  7.5 Linear Graphs and Interaction Assignment
  7.6 Modification of Linear Graphs
  7.7 Column Merging Method
  7.8 Branching Design
  7.9 Strategy for Constructing an Orthogonal Array
  7.10 Comparison with the Classical Statistical Experiment Design
  7.11 Summary

CHAPTER 8  COMPUTER AIDED ROBUST DESIGN
  8.1 Differential Op-Amp Circuit
  8.2 Description of Noise Factors
  8.3 Methods of Simulating the Variation in Noise Factors
  8.4 Orthogonal Array Based Simulation of Variation in Noise Factors
  8.5 Quality Characteristic and S/N Ratio
  8.6 Optimization of the Design
  8.7 Tolerance Design
  8.8 Reducing the Simulation Effort
  8.9 Analysis of Nonlinearity
  8.10 Selecting an Appropriate S/N Ratio
  8.11 Summary

CHAPTER 9  DESIGN OF DYNAMIC SYSTEMS
  9.1 Temperature Control Circuit and Its Function
  9.2 Signal, Control, and Noise Factors
  9.3 Quality Characteristics and S/N Ratios
  9.4 Optimization of the Design
  9.5 Iterative Optimization
  9.6 Summary

CHAPTER 10  TUNING COMPUTER SYSTEMS FOR HIGH PERFORMANCE
  10.1 Problem Formulation
  10.2 Noise Factors and Testing Conditions
  10.3 Quality Characteristic and S/N Ratio
  10.4 Control Factors and Their Alternate Levels
  10.5 Design of the Matrix Experiment and the Experimental Procedure
  10.6 Data Analysis and Verification Experiments
  10.7 Standardized S/N Ratio
  10.8 Related Applications
  10.9 Summary

CHAPTER 11  RELIABILITY IMPROVEMENT
  11.1 Role of S/N Ratios in Reliability Improvement
  11.2 The Routing Process
  11.3 Noise Factors and Quality Characteristics
  11.4 Control Factors and Their Levels
  11.5 Design of the Matrix Experiment
  11.6 Experimental Procedure
  11.7 Data Analysis
  11.8 Survival Probability Curves
  11.9 Summary

APPENDIX A  ORTHOGONALITY OF A MATRIX EXPERIMENT
APPENDIX B  UNCONSTRAINED OPTIMIZATION
APPENDIX C  STANDARD ORTHOGONAL ARRAYS AND LINEAR GRAPHS

REFERENCES
INDEX

FOREWORD

The main task of a design engineer is to build in the function specified by the product planning people at a competitive cost. An engineer knows that all kinds of functions are energy transformations. Therefore, the product designer must identify what is input, what is output, and what is ideal function while developing a new product. It is important to make the product's function as close to the ideal function as possible. Therefore, it is very important to measure correctly the distance of the product's performance from the ideal function. This is the main role of quality engineering.

In order to measure the distance, we have to consider the following problems:

1. Identify signal and noise space
2. Select several points from the space
3. Select an adequate design parameter to observe the performance
4. Consider possible calibration or adjustment method
5. Select an appropriate measurement related with the mean distance

As most of those problems require engineering knowledge, a book on quality engineering must be written by a person who has enough knowledge of engineering. Dr. Madhav Phadke, a mechanical engineer, has worked at AT&T Bell Laboratories for many years and has extensive experience in applying the Robust Design method to problems from diverse engineering fields.
He has made many eminent and pioneering contributions in quality engineering, and he is one of the best qualified persons to author a book on quality engineering.

The greatest strength of this book is the case studies. Dr. Phadke presents four real instances where the Robust Design method was used to improve the quality and cost of products. Robust Design is universally applicable to all engineering fields. You will be able to use these case studies to improve the quality and cost of your products.

This is the first book on quality engineering written in English by an engineer. The method described here has been applied successfully in many companies in Japan, the USA, and other countries. I recommend this book for all engineers who want to apply experimental design for actual product design.

G. Taguchi

PREFACE

Designing high-quality products and processes at low cost is an economic and technological challenge to the engineer. A systematic and efficient way to meet this challenge is a new method of design optimization for performance, quality, and cost. The method, called Robust Design, consists of:

1. Making product performance insensitive to raw material variation, thus allowing the use of low grade material and components in most cases,
2. Making designs robust against manufacturing variation, thus reducing labor and material cost for rework and scrap,
3. Making the designs least sensitive to the variation in operating environment, thus improving reliability and reducing operating cost, and
4. Using a new structured development process so that engineering time is used most productively.

All engineering designs involve setting values of a large number of decision variables. Technical experience, together with experiments through prototype hardware models or computer simulations, is needed to come up with the most advantageous decisions about these variables. Studying these variables one at a time or by trial and error is the common approach to the decision process. This leads to either a very long and expensive time span for completing the design or premature termination of the design process so that the product design is nonoptimal. This can mean missing the market window and/or delivering an inferior quality product at an inflated cost.

The Robust Design method uses a mathematical tool called orthogonal arrays to study a large number of decision variables with a small number of experiments. It also uses a new measure of quality, called signal-to-noise (S/N) ratio, to predict the quality from the customer's perspective. Thus, the most economical product and process design from both manufacturing and customers' viewpoints can be accomplished at the smallest, affordable development cost. Many companies, big and small, high-tech and low-tech, have found the Robust Design method valuable in making high-quality products available to customers at a low competitive price while still maintaining an acceptable profit margin.

This book will be useful to practicing engineers and engineering managers from all disciplines. It can also be used as a text in a quality engineering course for seniors and first-year graduate students. The method is explained through a series of real case studies, thus making it easy for readers to follow the method without the burden of learning detailed theory. At AT&T, several colleagues and I have developed a two-and-a-half-day course on this topic.
My experience in teaching the course ten times has convinced me that the case studies approach is the best one to communicate how to use the method in practice. The particular case studies used in this book relate to the fabrication of integrated circuits, circuit design, computer tuning, and mechanical routing.

Although the book is written primarily for engineers, it can also be used by statisticians to study the wide range of applications of experimental design in quality engineering. This book differs from the available books on statistical experimental design in that it focuses on the engineering problems rather than on the statistical theory. Only those statistical ideas that are relevant for solving the broad class of product and process design problems are discussed in the book.

Chapters 1 through 7 describe the necessary theoretical and practical aspects of the Robust Design method. The remaining chapters show a variety of applications from different engineering disciplines. The best way for readers to use this book is, after reading each section, to determine how the concepts apply to their projects. My experience in teaching the method has revealed that many engineers like to see an application of the method in their own field. Chapters 8 through 11 describe case studies from different engineering fields. It is hoped that these case studies will help readers see the breadth of the applicability of the Robust Design method and assist them in their own applications.

Madhav S. Phadke
AT&T Bell Laboratories
Holmdel, N.J.

ACKNOWLEDGMENTS

I had the greatest fortune to learn the Robust Design methodology directly from its founder, Professor Genichi Taguchi. It is with the deepest gratitude that I acknowledge his inspiring work. My involvement in the Robust Design method began when Dr. Roshan Chaddha asked me to host Professor Taguchi's visit to AT&T Bell Laboratories in 1980. I thank Dr. Chaddha (Bellcore, formerly with AT&T Bell Labs) for the invaluable encouragement he gave me during the early applications of the method in AT&T and also while writing this book. I also received valuable support and encouragement from Dr. E. W. Hinds, Dr. A. B. Godfrey, Dr. R. E. Kerwin, and Mr. E. Fuchs in applying the Robust Design method to many different engineering fields, which led to deeper understanding and enhancement of the method.

Writing a book of this type needs a large amount of time. I am indebted to Ms. Cathy Savolaine for funding the project. I also thank Mr. J. V. Bodycomb and Mr. Larry Bernstein for supporting the project. The case studies used in this book were conducted through collaboration with many colleagues: Mr. Gary Blaine, Mr. Dave Chrisman, Mr. Joe Leanza, Dr. T. W. Pao, Mr. C. S. Sherrerd, Dr. Peter Hey, and Mr. Paul Sherry. I am grateful to them for allowing me to use the case studies in the book. I also thank my colleagues Mr. Don Speeney, Dr. Raghu Kackar, and Dr. Mike Grieco, who worked with me on the first Robust Design case study at AT&T. Through this case study, which resulted in huge improvements in the window photolithography process used in integrated circuit fabrication, I gained much insight into the Robust Design method.
I thank Mr. Rajiv Keny for numerous discussions on the organization of the book. A number of my colleagues read the draft of the book and provided me with valuable comments. Some of the people who provided the comments are: Dr. Don Clausing (M.I.T.), Dr. A. M. Joglekar (Honeywell), Dr. C. W. Hoover, Jr. (Polytechnic University), Dr. Jim Pennell (IDA), Dr. Steve Eick, Mr. Don Speeney, Dr. M. Daneshmand, Dr. V. N. Nair, Dr. Mike Luvalle, Dr. Ajit S. Manocha, Dr. V. V. S. Rana, Ms. Cathy Hudson, Dr. Miguel Perez, Mr. Chris Sherrerd, Dr. M. H. Sherif, Dr. Helen Hwang, Dr. Vasant Prabhu, Ms. Valerie Partridge, Dr. Sachio Nakamura, Dr. K. Dehnad, and Dr. Gary Ulrich. I thank them all for their generous help in improving the content and readability of the book. I also thank Mr. Akira Tomishima (Yamatake-Honeywell), Dr. Mohammed Hamami, and Mr. Bruce Linick for helpful discussions on specific topics in the book. Thanks are also due to Mr. Yuin Wu (ASI) for valuable general discussions.

I very much appreciate the editorial help I received from Mr. Robert Wright and Ms. April Cormaci through the various stages of manuscript preparation. Also, I thank Ms. Eve Engel for coordinating text processing and the artwork during manuscript preparation. The text of this volume was prepared using the UNIX operating system, 5.2.6a, and a LINOTRONIC 300 was used to typeset the manuscript. Mr. Wright was responsible for designing the book format and coordinating production. Mr. Don Hankinson, Ms. Mari-Lynn Hankinson, and Ms. Marilyn Tomaino produced the final illustrations and were responsible for the layout. Ms. Kathleen Attwooll, Ms. Sharon Morgan, and several members of the Holmdel Text Processing Center provided electronic text processing.

Chapter 1
INTRODUCTION

The objective of engineering design, a major part of research and development (R&D), is to produce drawings, specifications, and other relevant information needed to manufacture products that meet customer requirements. Knowledge of scientific phenomena and past engineering experience with similar product designs and manufacturing processes form the basis of the engineering design activity (see Figure 1.1). However, a number of new decisions related to the particular product must be made regarding product architecture, parameters of the product design, the process architecture, and parameters of the manufacturing process. A large amount of engineering effort is consumed in conducting experiments (either with hardware or by simulation) to generate the information needed to guide these decisions. Efficiency in generating such information is the key to meeting market windows, keeping development and manufacturing costs low, and having high-quality products. Robust Design is an engineering methodology for improving productivity during research and development so that high-quality products can be produced quickly and at low cost.

This chapter gives an overview of the basic concepts underlying the Robust Design methodology:

* Section 1.1 gives a brief historical background of the method.
* Section 1.2 defines the term quality as it is used in this book.
* Section 1.3 enumerates the basic elements of the cost of a product.
* Section 1.4 describes the fundamental principle of the Robust Design methodology with the help of a manufacturing example.
* Section 1.5 briefly describes the major tools used in Robust Design.
* Section 1.6 presents some representative problems and the benefits of using the Robust Design method in addressing them.
* Section 1.7 gives a chapter-by-chapter outline of the rest of the book.
* Section 1.8 summarizes the important points of this chapter.

In the subsequent chapters, we describe Robust Design concepts in detail and, through case studies, we show how to apply them.
[Figure 1.1 Block diagram of R&D activity: customer requirements (desired function, usage environment, low failure cost), scientific knowledge (understanding of natural phenomena), and engineering knowledge (experience with previous designs and manufacturing processes) feed R&D for product design and manufacturing at low cost and high quality.]

1.1 A HISTORICAL PERSPECTIVE

When Japan began its reconstruction efforts after World War II, it faced an acute shortage of good-quality raw material, high-quality manufacturing equipment, and skilled engineers. The challenge was to produce high-quality products and continue to improve the quality under those circumstances. The task of developing a methodology to meet the challenge was assigned to Dr. Genichi Taguchi, who at that time was a manager in charge of developing certain telecommunications products at the Electrical Communications Laboratories (ECL) of Nippon Telephone and Telegraph Company (NTT). Through his research in the 1950s and the early 1960s, Dr. Taguchi developed the foundations of Robust Design and validated its basic philosophies by applying them in the development of many products. In recognition of this contribution, Dr. Taguchi received the individual Deming Award in 1962, which is one of the highest recognitions in the quality field.

The Robust Design method can be applied to a wide variety of problems. The application of the method in electronics, automotive products, photography, and many other industries has been an important factor in the rapid industrial growth and the subsequent domination of international markets in these industries by Japan.

Robust Design draws on many ideas from statistical experimental design to plan experiments for obtaining dependable information about variables involved in making engineering decisions. The science of statistical experimental design originated with the work of Sir Ronald Fisher in England in the 1920s. Fisher founded the basic principles of experimental design and the associated data-analysis technique called analysis of variance (ANOVA) during his efforts to improve the yield of agricultural crops. The theory and applications of experimental design and the related technique of response surface methodology have been advanced by many statistical researchers. Today, many excellent textbooks on this subject exist, for example, Box, Hunter and Hunter [B3], Box and Draper [B2], Hicks [H2], John [J2], Raghavarao [R1], and Kempthorne [K4]. Various types of matrices are used for planning experiments to study several decision variables simultaneously. Among them, Robust Design makes heavy use of the orthogonal arrays, whose use for planning experiments was first proposed by Rao [R2].

Robust Design adds a new dimension to statistical experimental design—it explicitly addresses the following concerns faced by all product and process designers:

* How to reduce economically the variation of a product's function in the customer's environment. (Note that achieving a product's function consistently on target maximizes customer satisfaction.)
* How to ensure that decisions found to be optimum during laboratory experiments will prove to be so in manufacturing and in customer environments.

In addressing these concerns, Robust Design uses the mathematical formalism of statistical experimental design, but the thought process behind the mathematics is different in many ways.
The answers provided by Robust Design to the two concerns listed above make it a valuable tool for improving the productivity of the R&D activity. The Robust Design method is still evolving. With the active research being carried out in the United States, Japan, and other countries, it is expected that the applications of the method and the method itself will grow rapidly in the coming decade.

1.2 WHAT IS QUALITY?

Because the word quality means different things to different people (see, for example, Juran [J3], Deming [D2], Crosby [C5], Garvin [G1], and Feigenbaum [F1]), we need to define its use in this book. First, let us define what we mean by the ideal quality, which can serve as a reference point for measuring the quality level of a product. The ideal quality a customer can expect is that every product delivers the target performance each time the product is used, under all intended operating conditions, and throughout its intended life, with no harmful side effects. Note that the traditional concepts of reliability and dependability are part of this definition of quality. In specific situations, it may be impossible to produce a product with ideal quality. Nonetheless, ideal quality serves as a useful reference point for measuring the quality level.

The following example helps clarify the definition of ideal quality. People buy automobiles for different purposes. Some people buy them to impress their friends while others buy them to show off their social status. To satisfy these diverse purposes, there are different types (species) of cars—sports cars, luxury cars, etc.—on the market. For any type of car, the buyer always wants the automobile to provide reliable transportation. Thus, for each type of car, an ideal quality automobile is one that works perfectly each time it is used (on hot summer days and cold winter days), throughout its intended life (not just the warranty life), and does not pollute the atmosphere.

When a product's performance deviates from the target performance, its quality is considered inferior. The performance may differ from one unit to another or from one environmental condition to another, or it might deteriorate before the expiration of the intended life of the product. Such deviation in performance causes loss to the user of the product, the manufacturer of the product, and, in varying degrees, to the rest of society as well. Following Taguchi, we measure the quality of a product in terms of the total loss to society due to functional variation and harmful side effects. Under the ideal quality, the loss would be zero; the greater the loss, the lower the quality.

In the automobile example, if a car breaks down on the road, the driver would, at the least, be delayed in reaching his or her destination. The disabled car might be the cause of traffic jams or accidents. The driver might have to spend money to have the car towed. If the car were under warranty, the manufacturer would have to pay for repairs. The concept of quality loss includes all these costs, not just the warranty cost. Quantifying the quality loss is difficult and is discussed in Chapter 2.

Note that the definition of quality of a product can be easily extended to processes as well as services. As a matter of fact, the entire discussion of the Robust Design method in this book is equally applicable to processes and services, though for simplicity, we do not state so each time.

1.3 ELEMENTS OF COST

Quality at what cost?
Delivering a high-quality product at low cost is an interdisciplinary problem involving engineering, economics, statistics, and management. The three main categories of cost one must consider in delivering a product are:

1. Operating Cost. Operating cost consists of the cost of energy needed to operate the product, environmental control, maintenance, inventory of spare parts and units, etc. Products made by different manufacturers can have different energy costs. If a product is sensitive to temperature and humidity, then elaborate and costly air conditioning and heating units are needed. A high failure rate of a product causes large maintenance costs and a costly inventory of spare units. A manufacturer can greatly reduce the operating cost by designing the product robust—that is, by minimizing the product's sensitivity to environmental and usage conditions, manufacturing variation, and deterioration of parts.

2. Manufacturing Cost. Important elements of manufacturing cost are equipment, machinery, raw materials, labor, scrap, rework, etc. In a competitive environment, it is important to keep the unit manufacturing cost (umc) low by using low-grade material, employing less-skilled workers, and using less-expensive equipment, and at the same time maintain an appropriate level of quality. This is possible by designing the product robust, and designing the manufacturing process robust—that is, minimizing the process's sensitivity to manufacturing disturbances.

3. R&D Cost. The time taken to develop a new product plus the amount of engineering and laboratory resources needed are the major elements of R&D cost. The goal of R&D activity is to keep the umc and operating cost low. Robust Design plays an important role in achieving this goal because it improves the efficiency of generating information needed to design products and processes, thus reducing development time and resources needed for development.

Note that the manufacturing cost and R&D cost are incurred by the producer and then passed on to the customer through the purchase price of the product. The operating cost, which is also called usage cost, is borne directly by the customer and is directly related to the product's quality. From the customer's point of view, the purchase price plus the operating cost determine the economics of satisfying the need for which the product is bought. Higher quality means lower operating cost and vice versa. Robust Design is a systematic method for keeping the producer's cost low while delivering a high-quality product, that is, while keeping the operating cost low.

1.4 FUNDAMENTAL PRINCIPLE

The key idea behind Robust Design is illustrated by the experience of Ina Tile Company, described in detail in Taguchi and Wu [T7]. During the late 1950s, Ina Tile Company in Japan faced the problem of high variability in the dimensions of the tiles it produced [see Figure 1.2(a)]. Because screening (rejecting those tiles outside specified dimensions) was an expensive solution, the company assigned a team of expert engineers to investigate the cause of the problem. The team's analysis showed that the tiles at the center of the pile inside the kiln [see Figure 1.2(b)] experienced a lower temperature than those on the periphery. This nonuniformity of temperature distribution proved to be the cause of the nonuniform tile dimensions. The team reported
that it would cost approximately half a million dollars to redesign and build a kiln in which all the tiles would receive a uniform temperature distribution. Although this alternative was less expensive than screening, it was still too costly. The team then brainstormed and defined a number of process parameters that could be changed easily and inexpensively. After performing a small set of well-planned experiments according to Robust Design methodology, the team concluded that increasing the lime content of the clay from 1 percent to 5 percent would greatly reduce the variation of the tile dimensions. Because lime was the least expensive ingredient, the cost implication of this change was also favorable.

Thus, the problem of nonuniform tile dimensions was solved by minimizing the effect of the cause of the variation (nonuniform temperature distribution) without controlling the cause itself (the kiln design). As illustrated by this example, the fundamental principle of Robust Design is to improve the quality of a product by minimizing the effect of the causes of variation without eliminating the causes. This is achieved by optimizing the product and process designs to make the performance minimally sensitive to the various causes of variation. This is called parameter design. However, parameter design alone does not always lead to sufficiently high quality. Further improvement can be obtained by controlling the causes of variation where economically justifiable, typically by using more expensive equipment, higher grade components, better environmental controls, etc., all of which lead to higher product cost, or operating cost, or both. The benefits of improved quality must justify the added product cost.

1.5 TOOLS USED IN ROBUST DESIGN

A great deal of engineering time is spent generating information about how different design parameters affect performance under different usage conditions. Robust Design methodology serves as an "amplifier"—that is, it enables an engineer to generate information needed for decision-making with half (or even less) the experimental effort. There are two important tasks to be performed in Robust Design:

1. Measurement of Quality During Design/Development. We want a leading indicator of quality by which we can evaluate the effect of changing a particular design parameter on the product's performance.

2. Efficient Experimentation to Find Dependable Information about the Design Parameters. It is critical to obtain dependable information about the design parameters so that design changes during manufacturing and customer use can be avoided. Also, the information should be obtained with minimum time and resources.

The estimated effects of design parameters must be valid even when other parameters are changed during the subsequent design effort or when designs of related subsystems change. This can be achieved by employing the signal-to-noise (S/N) ratio to measure quality and orthogonal arrays to study many design parameters simultaneously. These tools are described later in this book.

[Figure 1.2 Tile manufacturing example: (a) probability distribution of tile dimensions, (b) schematic diagram of the kiln.]
1.6 APPLICATIONS AND BENEFITS OF ROBUST DESIGN

The Robust Design method is in use in many areas of engineering throughout the United States. For example, AT&T's use of Robust Design methodology has led to the improvement of several processes in very large scale integrated (VLSI) circuit fabrication used in the manufacture of 1-megabit and 256-kilobit memory chips, 32-bit processor chips, and other products. Some of the VLSI applications are:

* The window photolithography application (documented in Phadke, Kackar, Speeney, and Grieco [P5]) was the first application in the United States that demonstrated the power of Taguchi's approach to quality and cost improvement through robust process design. In particular, the benefits of the application were:
  - 4-fold reduction in process variance
  - 3-fold reduction in fatal defects
  - 2-fold reduction in processing time (because the process became stable, allowing time-consuming inspection to be dropped)
  - Easy transition of design from research to manufacturing
  - Easy adaptation of the process to finer-line technology (adaptation from 3.5-micron to 2.5-micron technology), which is typically a very difficult problem.

* The aluminum etching application originated from a belief that poor photoresist print quality leads to line width loss and to undercutting during the etching process. By making the etching process insensitive to photoresist profile variation and other sources of variation, the visual defects were reduced from 80 percent to 15 percent. Moreover, the etching step could then tolerate the variation in the photoresist profile.

* The reactive ion etching of tantalum silicide (described in Katz and Phadke [K3]) used to give highly nonuniform etch quality, so only 12 out of 18 possible wafer positions could be used for production. After optimization, 17 wafer positions became usable—a hefty 40 percent increase in machine utilization. Also, the efficiency of the orthogonal array experimentation allowed this project to be completed by the 20-day deadline. In this case, $1.2 million was saved in equipment replacement costs, not including the expense of disruption on the factory floor.

* The polysilicon deposition process had between 10 and 5000 surface defects per unit area. As such, it represented a serious roadblock in advancing to line widths smaller than 1.75 micron. Six process parameters were investigated with 18 experiments, leading to consistently fewer than 10 surface defects per unit area. As a result, the scrap rate was reduced significantly and it became possible to process smaller line widths. This case study is described in detail in Chapter 4.

Other AT&T applications include:

* The router bit life-improvement project (described in Chapter 11 and Phadke [P3]) led to a 2-fold to 4-fold increase in the life of router bits used in cutting printed wiring boards. The project illustrates how reliability or life improvement projects can be organized to find the best settings of the routing process parameters with a very small number of samples. The number of samples needed in this approach is very small, yet it can give valuable information about how each parameter changes the survival probability curve (change in survival probability as a function of time).
* In the differential operational amplifier circuit optimization application (described in Chapter 8 and Phadke [P3]), a 40-percent reduction in the root mean square (rms) offset voltage was realized by simply finding new nominal values for the circuit parameters. This was done by reducing sensitivity to all tolerances and temperature, rather than by reducing tolerances, which could have increased manufacturing cost.

* The Robust Design method was also used to find optimum proportions of ingredients for making water-soluble flux. By simultaneous study of the parameters for the wave soldering process and the flux composition, the defect rate was reduced by 30 to 40 percent (see Lin and Kackar [L3]).

* Orthogonal array experiments can be used to tune hardware/software systems. By simultaneous study of three hardware and six software parameters, the response time of the UNIX operating system was reduced 60 percent for a particular set of load conditions experienced by the machine (see Chapter 10 and Pao, Phadke, and Sherrerd [P1]).

Under the leadership of the American Supplier Institute and Ford Motor Company, a number of automotive suppliers have achieved quality and cost improvement through Robust Design. These applications include improvements in metal casting, injection molding of plastic parts, wave soldering of electronic components, speedometer cable design, integrated circuit chip bonding, and picture tube lens coating. Many of these applications are documented in the Proceedings of Supplier Symposia on Taguchi Methods [P9].

All these examples show that the Robust Design methodology offers simultaneous improvement of product quality, performance and cost, and engineering productivity. Its widespread use in industry will have a far-reaching economic impact because this methodology can be applied profitably in all engineering activities, including product design and manufacturing process design. The philosophy behind Robust Design is not limited to engineering applications. Yokoyama and Taguchi [Y1] have also shown its applications in profit planning in business, cash-flow optimization in banking, government policymaking, and other areas. The method can also be used for tasks such as determining the optimum work force mix for jobs where the demand is random, and improving runway utilization at an airport.

1.7 ORGANIZATION OF THE BOOK

This book is divided into three parts. The first part (Chapters 1 through 4) describes the basics of the Robust Design methodology. Chapter 2 describes the quality loss function, which gives a quantitative way of evaluating the quality level of a product rather than just the "good-bad" characterization. After categorizing the sources of variation, the chapter further describes the steps in engineering design and the classification of parameters affecting the product's function. Quality control activities during different stages of the product realization process are also described there. Chapter 3 is devoted to orthogonal array experiments and basic analysis of the data obtained through such experiments. Chapter 4 illustrates the entire strategy of Robust Design through an integrated circuit (IC) process design example. The strategy begins with problem formulation and ends with verification experiment and implementation. This case study could be used as a model in planning and carrying out manufacturing process optimization for quality, cost, and manufacturability.
The example also has the basic framework for optimizing a product design.

The second part of the book (Chapters 5 through 7) describes, in detail, the techniques used in Robust Design. Chapter 5 describes the concept of signal-to-noise ratio and gives appropriate signal-to-noise ratios for a number of common engineering problems. Chapter 6 is devoted to a critical decision in Robust Design: choosing an appropriate response variable, called the quality characteristic, for measuring the quality of a product or a process. The guidelines for choosing quality characteristics are illustrated with examples from many different engineering fields. A step-by-step procedure for designing orthogonal array experiments for a large variety of industrial problems is given in Chapter 7.

The third part of the book (Chapters 8 through 11) describes four more case studies to illustrate the use of Robust Design in a wide variety of engineering disciplines. Chapter 8 shows how the Robust Design method can be used to optimize product design when computer simulation models are available. The differential operational amplifier case study is used to illustrate the optimization procedure. This chapter also shows the use of orthogonal arrays to simulate the variation in component values and environmental conditions, and thus estimate the yield of a product. Chapter 9 shows the procedure for designing an ON-OFF control system for a temperature controller. The use of Robust Design for improving the performance of a hardware-software system is described in Chapter 10 with the help of the UNIX operating system tuning case study. Chapter 11 describes the router bit life study and explains how Robust Design can be used to improve reliability.

1.8 SUMMARY

* Robust Design is an engineering methodology for improving productivity during research and development so that high-quality products can be produced quickly and at low cost. Its use can greatly improve an organization's ability to meet market windows, keep development and manufacturing costs low, and deliver high-quality products.

* Through his research in the 1950s and early 1960s, Dr. Genichi Taguchi developed the foundations of Robust Design and validated the basic, underlying philosophies by applying them in the development of many products.

* Robust Design uses many ideas from statistical experimental design and adds a new dimension to it by explicitly addressing two major concerns faced by all product and process designers:
  a. How to reduce economically the variation of a product's function in the customer's environment.
  b. How to ensure that decisions found optimum during laboratory experiments will prove to be so in manufacturing and in customer environments.

* The ideal quality a customer can receive is that every product delivers the target performance each time the product is used, under all intended operating conditions, and throughout the product's intended life, with no harmful side effects.

* The deviation of a product's performance from the target causes loss to the user of the product, the manufacturer, and, in varying degrees, to the rest of society as well. The quality level of a product is measured in terms of the total loss to society due to functional variation and harmful side effects.

* The three main categories of cost one must consider in delivering a product are:
  (1) operating cost: the cost of energy, environmental control, maintenance, inventory of spare parts, etc.
  (2) manufacturing cost: the cost of equipment, machinery, raw materials, labor, scrap, rework, etc.
  (3) R&D cost: the time taken to develop a new product plus the engineering and laboratory resources needed.

* The fundamental principle of Robust Design is to improve the quality of a product by minimizing the effect of the causes of variation without eliminating the causes. This is achieved by optimizing the product and process designs to make the performance minimally sensitive to the various causes of variation, a process called parameter design.

* The two major tools used in Robust Design are: (1) the signal-to-noise ratio, which measures quality, and (2) orthogonal arrays, which are used to study many design parameters simultaneously.

* The Robust Design method has been found valuable in virtually all engineering fields and business applications.

Chapter 2
PRINCIPLES OF QUALITY ENGINEERING

A product's life cycle can be divided into two main parts: before sale to the customer and after sale to the customer. All costs incurred prior to the sale of the product are added to the unit manufacturing cost (umc), while all costs incurred after the sale are lumped together as quality loss. Quality engineering is concerned with reducing both of these costs and, thus, is an interdisciplinary science involving engineering design, manufacturing operations, and economics. It is often said that higher quality (lower quality loss) implies higher unit manufacturing cost. Where does this misconception come from? It arises because engineers and managers, unaware of the Robust Design method, tend to achieve higher quality by using more costly parts, components, and manufacturing processes. In this chapter we delineate the basic principles of quality engineering and put in perspective the role of Robust Design in reducing the quality loss as well as the umc.

This chapter contains nine sections:

* Sections 2.1 and 2.2 are concerned with the quantification of quality loss. Section 2.1 describes the shortcomings of using fraction defective as a measure of quality loss. (This is the most commonly used measure of quality loss.) Section 2.2 describes the quadratic loss function, which is a superior way of quantifying quality loss in most situations.

* Section 2.3 describes the various causes, called noise factors, that lead to the deviation of a product's function from its target.
2.1 QUALITY LOSS FUNCTION—THE FRACTION DEFECTIVE FALLACY ‘We have defined the quality level of a product to be the total loss incurred by society due to the failure of the product to deliver the target performance and due to harmful side effects of the product, including its operating cost. Quantifying this loss is difficult because the same product may be used by different customers, for different applications, under different environmental conditions, etc. However, it is important to quantify the loss so that the impact of alternative product designs and manufacturing processes on customers can be evaluated and appropriate engineering decisions made. Moreover, it is critical that the quantification of loss not become a major task that con- sumes substantial resources at various stages of product and process design. It is common to measure quality in terms of the fraction of the total number of units that are defective. This is referred to as fraction defective. Although commonly used, this measure of quality is often incomplete and misleading. It implies that ail products that meet the specifications (allowable deviations from the target response) are equally good, while those outside the specifications are bad. The fallacy here is that the product that barely meets the specifications is, from the customer's point of view, as good or as bad as the product that is barely outside the specifications. In reality, the product whose response is exactly on target gives the best performance. As the product's response deviates from the target, the quality becomes progressively worse. Seo. 2.1 Quality Loss Function—The Fraction Defective Fallacy 15 Example—Television Set Color Density: The deficiency of fraction defective as a quality measure is well-illustrated by the Sony television customer preference study published by the Japanese newspaper, The Asahi (T8]. In the late 1970s, American consumers showed a preference for the televi- sion sets made by Sony-Japan over those made by Sony-USA. The reason cited in the study was quality. Both factories, however, made televisions using identical designs and tolerances. What could then account for the perceived difference in quality? In its investigative report, the newspaper showed the distribution of color density for the sets made by the two factories (see Figure 2.1). In the figure, m is the target color density and m+5 are the tolerance limits (allowable manufacturing deviations). ‘The distribution for the Sony-Japan factory was approximately normal with mean on target and a standard deviation of 5/3. The distribution for Sony-USA was approxi- mately uniform in the range of m5. Among the sets shipped by Sony-Japan, about 0.3 percent were outside the tolerance limits, while Sony-USA shipped virtually no sets outside the tolerance limits. Thus, the difference in customer preference could not be explained in terms of the fraction defective sets. Sony—USA Sony—Japan Figure 2.1 Distribution of color density in television sets. (Source: The Asahi, April 17, 1979), The perceived difference in quality becomes clear when we look closely at the sets that met the tolerance limits. Sets with color density very near m perform best and can be classified grade A. As the color density deviates from m, the performance becomes progressively worse, as indicated in Figure 2,1 by grades B and C, It is clear 16 Principles of Quality Engineering Chap. 2 that Sony-Japan produced many more grade A sets and many fewer grade C sets when compared to Sony-USA. 
Thus, the average grade of sets produced by Sony-Japan was better, hence the customer's preference for the sets made by Sony-Japan, In short, the difference in the customer's perception of quality was a result of Sony-USA paying attention only to meeting the tolerances, whereas in Sony-Japan the attention was focused on meeting the target. Example—Telephone Cable Resistance: Using a wrong measurement system can, and often does, drive the behavior of people in wrong directions. The telephone cable example described here illustrates how using fraction defective as a measure of quality loss can permit suboptimization by the manufacturer leading to an increase in the total cost, which is the sum of quality loss and ume. A certain gauge of copper wires used in telephone cables had a nominal resis- tance value of m ohms/mile and the maximum allowed resistance was (m+ Ap) ohms/mile. This upper limit was determined by taking into consideration the manufac- turing capability, represented by the distribution (a) in Figure 2.2, at the time the specifications were written. Consequently, the upper limit (m-+Ap) was an adequate way to ensure that the drawing process used to form the copper wire was kept in con- trol with the mean on target. (b) Ny ON m-A, m m+A, Resistance (Ohms/Mile) —> Figure 2.2 Distribution of telephone cable resistance. (a) Initial distribution. (6) After pro- cess improvement and shifting the mean Sec. 2.1 Quality Loss Function—The Fraction Detective Fallacy 7 By improving the wire drawing process through the application of new technol- ogy, the manufacturer was able to reduce substantially the process variance. This per- mitted the manufacturer to move the mean close to the upper limit and still meet the fraction defective criterion for quality [see distribution (b) in Figure 2.2]. At the same time, the manufacturer saved on the cost of copper since larger resistance implies a smaller cross section of the wire. However, from the network point of view, the larger average resistance resulted in high electrical loss, causing complaints from the tele- phone users. Solving the problem in the field meant spending a lot more money for installing additional repeaters and for other corrective actions than the money saved in manufacturing—that is, the increase in the quality loss far exceeded the saving in the umc. Thus, there was a net loss to the society consisting of both the manufacturer and the telephone company who offered the service. Therefore, a quality loss metric that permits such local optimization leading to higher total cost should be avoided. Section 2.2 discusses a better way to measure the quality loss. Interpretation of Engineering Tolerances The examples above bring out an important point regarding quantification of quality loss. Products that do not meet tolerances inflict a quality loss on the manufacturer, a loss visible in the form of scrap or rework in the factory, which the manufacturer adds to the cost of the product. However, products that meet tolerance also inflict a quality loss, a loss that is visible to the customer and that can adversely affect the sales of the product and the reputation of the manufacturer. Therefore, the quality loss function must also be capable of measuring the loss due to products that meet the tolerances. Engineering specifications are invariably written as m + Ag. 
These specifications should not be interpreted to mean that any value in the range (m — Ap) to (m + Ag) is equally good for the customer and that as soon as the range is exceeded the product is bad. In other words, the step function shown below and in Figure 2.3(a) is an inade- quate way to quantify the quality loss: 0 if |y-m| sd LO)= lay otherwise a) Here, Ag is the cost of replacement or repair. Use of such a loss function is apt to lead to the problems that Sony-USA and the cable manufacturer faced and, hence, should be avoided. 18 Principles of Quality Engineering Chap. 2 (a) Step Function (b) Quadratic Loss Function Figure 2.3 Quality loss function, 2.2 QUADRATIC LOSS FUNCTION The quadratic loss function can meaningfully approximate the quality loss in most situations. Let y be the quality characteristic of a product and m be the target value for y. (Note that the quality characteristic is a product's response that is observed for quantifying quality level and for optimization in a Robust Design project.) According to the quadratic loss function, the quality loss is given by LQ) =kQy-my* (2.2) where k is a constant called quality loss coefficient. Equation (2.2) is plotted in Figure 2.3(b). Notice that at y = m the loss is zero and so is the slope of the loss function. This is quite appropriate because m is the best value for y. The loss L(y) increases slowly when we are near m; but as we go farther from m the loss increases more rapidly. Qualitatively, this is exactly the kind of behavior we would like the quality loss function to have. The quadratic loss function given by Equation (2.2) is the sim- plest mathematical function that has the desired qualitative behavior. Sec. 2.2 ‘Quadratic Loss Function 19 Note that Equation (2.2) does not imply that every customer who receives a prod- uct with y as the value of the quality characteristic will incur a precise quality loss equal to L(y). Rather, it implies that the average quality loss incurred by those customers is L(). The quality loss incurred by a particular customer will obviously depend on that customer’s operating environment. It is important to determine the constant & so that Equation (2.2) can best approxi- mate the actual loss within the region of interest. This is a rather difficult, though impor- tant, task. A convenient way to determine k is to determine first the functional limits for the value of y. Functional limit is the value of y at which the product would fail in half of the applications. Let m + Ag be the functional limits. Suppose, the loss at m + Ag is Ag. Then by substitution in Equation (2.2), we obtain > " (2.3) Ble Note that Ag is the cost of repair or replacement of the product. It includes the loss due to the unavailability of the product during the repair period, the cost of transporting the product by the customer to and from the repair center, etc. If a product fails in an unsafe mode, such as an automobile breaking down in the middle of a road, then the losses from the resulting consequences should also be included in Ag. Regardless of who pays for them—the customer, the manufacturer, or a third party—all these losses should be included in Ag. Substituting Equation (2,3) in Equation (2.2) we obtain A LO) = SO-myP . QA) Ao ‘We will now consider two numerical examples. Example—Television Set Color Density: Suppose the functional limits for the color density are m+7. 
This means about half the customers, taking into account the diversity of their environment and taste, would find the television set to be defective if the color density is m+7. Let the repair of a television set in the field cost on average Ay = $98. By substituting in Equation (2.4), the quadratic Joss function can be written as 98 ae (yom) = 2y- my. LQ) = 20 Principles of Quality Engineering Chap. 2 Thus, the average quality loss incurred by the customers receiving sets with color den- sity m + 4 is L(m +4) = $32, while customers receiving sets with color density m +2 incur an average quality loss of only L(m + 2) = $8. Example—Power Supply Circuit: Consider a power supply circuit used in a stereo system for which the target output voltage is 110 volts. If the output voltage falls outside 110 +20 volts, then the stereo fails in half the situations and must be repaired. Suppose it costs $100 to repair the stereo. Then the average loss associated with a particular value y of output voltage is given by 100 2 2 Lo) = 22S @- 110? = 0.25~- 110" . 0) = Se O- Ho) (y - 110); Variations of the Quadratic Loss Function The quadratic loss function given by Equation (2.2) is applicable whenever the quality characteristic y has a finite target value, usually nonzero, and the quality loss is sym- metric on either side of the target. Such quality characteristics are called nominal-the- best type quality characteristics and Equation (2.2) is called the nominal-the-best type quality loss function. The color density of a television set and the output voltage of a power supply circuit are examples of the nominal-the-best type quality characteristic. Some variations of the quadratic loss function in Equation (2.2) are needed to cover adequately certain commonly occurring situations. Three such variations are given below. © Smaller-the-better type characteristic. Some characteristics, such as radiation leakage from a microwave oven, can never take negative values. Also, their ideal value is equal to zero, and as their value increases, the performance becomes progressively worse. Such characteristics are called smaller-the-better type quality characteristics. The response time of a computer, leakage current in electronic circuits, and pollution from an automobile are additional examples of this type of quality characteristic. The quality loss in such situations can be approximated by the following function, which is obtained from Equation (2.2) by substituting m = 0: LQ) = ky? 5) Note this is a one-sided loss function because y cannot take negative values. As described earlier, the quality loss coefficient k can be determined from the func- tional limit, Ap, and the quality loss, Ag, can be determined at the functional limit by using Equation (2.3). Sec. 2.2 Quadratic Loss Function a * Larger-the-better type characteristic. Some characteristics, such as the bond strength of adhesives, also do not take negative values. But, zero is their worst value, and as their value becomes larger, the performance becomes progressively better—that is, the quality loss becomes progressively smaller. Their ideal value is infinity and at that point the loss is zero. Such characteristics are called larger-the-better type quality characteristics. It is clear that the reciprocal of such a characteristic has the same qualitative behavior as a smaller-the-better type characteristic. 
Thus, we approximate the loss function for a larger-the-better type characteristic by substituting 1/y for y in Equation (2.5): oye4[4] : (2.6) y ‘The rationale for using Equation (2.6) as the quality loss function for larger-the- better type characteristics is discussed further in Chapter 5. To determine the constant k for this case, we find the functional limit, Ap, below which more than half of the products fail, and the corresponding loss Ag. Substituting Ay and Ag in Equation (2.6), and solving for k, we obtain k=AgA3 . 7) * Asymmetric loss function, In certain situations, deviation of the quality charac- teristic in one direction is much more harmful than in the other direction, In such cases, one can use a different coefficient k for the two directions. ‘Thus, the quality loss would be approximated by the following asymmetric loss function: kio-m), y>m L0)= |pg-m)?, y or D3, since the average 7| for both D2 and D3 is —40 dB. Based on the matrix experiment, we can conclude that the settings A,B,C D> and A,B,C2D¥ would give the highest 1| or the lowest surface defect count. ‘The predicted best settings need not correspond to one of the rows in the matrix experiment. In fact, often they do not correspond as is the case in the present example. Also, typically, the value of 7 realized for the predicted best settings is better than the best among the rows of the matrix experiment. 48 Matrix Experiments Using Orthogonal Arrays Chap. 3 3.3 ADDITIVE MODEL FOR FACTOR EFFECTS In the preceding section, we used simple averaging to estimate factor effects. The same nine observations (1;, N2. + To) are grouped differently to estimate the factor effects. Also, the optimum combination of settings was determined by examining the effect of each factor separately. Justification for this simple procedure comes from * Use of the additive model as an approximation * Use of an orthogonal array to plan the matrix experiment We now examine the additive model. The relationship between 7) and the pro- cess parameters A, B, C, and D can be quite complicated. Empirical determination of this relationship can, therefore, tum out to be quite expensive. However, in most situa- tions, when 7) is chosen judiciously, the relationship can be approximated adequately by the following additive model: 11 (Ai, By Cp, Di) =P +a; + bj +o +d; +e. 5) In the above equation, 1 is the overall mean—that is, the mean value of 1 for the experimental region; the deviation from 1 caused by setting factor A at level A; is a;; the terms b;, cy and d; represent similar deviations from 1 caused by the settings B;, C, and D, of factors B, C, and D, respectively; and e stands for the error. Note that by error we imply the error of the additive approximation plus the error in the repeatabil- ity of measuring 1 for a given experiment. An additive model is also referred to as a superposition model or a variables separable model in engineering literature. Note that superposition model implies that the total effect of several factors (also called variables) is equal to the sum of the indi- vidual factor effects. It is possible for the individual factor effects to be linear, qua- dratic, or of higher order. However, in an additive model cross product terms involv- ing two or more factors are not allowed. By definition a1, a2, and a3 are the deviations from 41 caused by the three levels of factor A. Thus, a, +a, +43=0. (3.6) Similarly, b, +b2+b3=0 ce, te2 te, = d,; +d) +43 =0 G7 Sec. 
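To make the averaging procedure of the previous section concrete, the short sketch below (Python; not part of the original text) estimates the overall mean and the level deviations a_i, b_j, c_k, d_l for an L9-type matrix experiment. The array is the standard L9 orthogonal array, and the nine η values are illustrative, chosen only to agree with the summary statistics quoted in this chapter (overall mean -41.67 dB, m_A1 = -20 dB, and so on).

```python
import numpy as np

# Standard L9(3^4) orthogonal array; the four columns carry factors A, B, C, D.
L9 = np.array([[1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
               [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
               [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1]])

# Observed S/N ratios (dB), one value per row of the array (illustrative).
eta = np.array([-20.0, -10.0, -30.0, -25.0, -45.0,
                -65.0, -45.0, -65.0, -70.0])

m = eta.mean()                              # estimate of the overall mean mu
print(f"overall mean m = {m:.2f} dB")

for col, factor in enumerate("ABCD"):
    # For an orthogonal array, the plain average of eta over the rows having
    # a given level equals the least-squares estimate of mu + (level effect).
    level_means = np.array([eta[L9[:, col] == lv].mean() for lv in (1, 2, 3)])
    effects = level_means - m               # a_i, b_j, c_k, or d_l
    assert abs(effects.sum()) < 1e-9        # zero-sum constraints (3.6)-(3.7)
    print(f"factor {factor}: level means {level_means}, "
          f"effects {np.round(effects, 2)}")
```

For each factor the three estimated deviations sum to zero, which is exactly the constraint expressed by Equations (3.6) and (3.7).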
3.3 Additive Model for Factor Effects aa It can be shown that the averaging procedure of Section 3.2 for estimating the factor effects is equivalent to fitting the additive model, defined by Equations (3.5), (3.6), and (3.7), by the least squares method. This is a consequence of using an orthogonal array to plan the matrix experiment. Now, consider Equation (3.2) for the estimation of the effect of setting tempera- ture at level 3: 1 may = (Mr + Ts + My) 5 wtasto tes 4d, +00) +Utas tb, te, tds +es) +(Uta; +b, +e, +d) + ey) = Fou +day+ tb, +b: +644 (€1 +e, +3) thea rayne t (e7 + es + ey) ueayet (e7 + es + ey) - (3.8) Note that the terms corresponding to the effects of factors B, C and D drop out because of Equation (3.7). Thus, ma, is an estimate of (+a). Furthermore, the error term in Equation (3.8) is an average of three error terms. Suppose 0 is the average variance for the error terms ¢1, 2, °** , é9. Then the error variance for the estimate m,, is approximately (1/3)62. (Note that in computing the error variances of the estimate m,, and other estimates in this chapter, we treat the individual error terms as independent random variables with zero mean and variance 6. In reality, this is only an approximation because the error terms include the error of the additive approximation so that the error terms are not strictly independent ran- dom variables with zero mean. ‘This approximation is adequate because the error vari- ance is used for only qualitative purposes.) This represents a 3-fold reduction in error variance compared to conducting a single experiment at the setting A; of factor A. 50 Matrix Experiments Using Orthogonal Arrays Chap. 3 Substituting Equation (3.5) in Equation (3.3) verifies that ma, estimates 1 + a2 with error variance (1/3)62. Similarly, substituting Equation (3.5) in Equation (3.4) shows that mg, estimates |. + b2 with error variance (1/3)o. It can be verified that similar relationships hold for the estimation of the remaining factor effects. The term replication number is used to refer to the number of times a particular factor level is repeated in an orthogonal array. The error variance of the average effect for a particular factor level is smaller than the error variance of a single experiment by a factor equal to its replication number. To obtain the same accuracy of the factor level averages, we would need a much larger number of experiments if we were to use the traditional approach of studying one factor at a time. For example, we would have to conduct 3 x 3 = 9 experiments to estimate the average 7, for three levels of tempera- ture alone (three repetitions each for the three levels), while keeping other factors fixed at certain levels, say, By, C1, D1. We may then fix temperature at its best setting and experiment with levels B, and B of pressure. This would need 3 x 2 = 6 additional experiments. Continuing in this manner, we can study the effects of factors C and D by performing 2 x 6 = 12 additional experiments. Thus, we would need a total of 9 + 3 x 6 = 27 experiments to study the four factors, one at a time. Compare this to only nine experiments needed for the orthogonal array based matrix experiment to obtain the same accuracy of the factor level averages. Another common approach to finding the optimum combination of factor levels is to conduct a full factorial experiment—that is, conduct experiments under all combi- nations of factor levels. 
In the present example, it would mean conducting experiments under 34 = 81 distinct combinations of factor levels, which is much larger than the nine experiments needed for the matrix experiment. When the additive model [Equa- tion (3.5)] holds, it is obviously unnecessary to experiment with all combinations of factor levels. Fortunately, in most practical situations the additive model provides an excellent approximation. The additivity issue is discussed in much detail in Chapters 5 and 6. Conducting matrix experiments using orthogonal arrays has another statistical advantage. If the errors, ¢;, are independent with zero mean and equal variance, then the estimated factor effects are mutually uncorrelated. Consequently, the best level of each factor can be determined separately. In order to preserve the benefits of using an orthogonal array, it is important that all experiments in the matrix be performed. If experiments corresponding to one or more rows are not conducted, or if their data are missing or erroneous, the balancing property and, hence, the orthogonality is lost. In some situations, incomplete matrix experiments can give useful results, but the analysis of such experiments is compli- cated, (Statistical techniques used for analyzing such data are regression analysis and linear models; see Draper and Smith [D4].) Thus, we recommend that any missing experiments be performed to complete the matrix. Sec. 3.4 Analysis of Variance 51 3.4 ANALYSIS OF VARIANCE Different factors affect the surface defect formation to a different degree. The relative magnitude of the factor effects could be judged from Table 3.3, which gives the aver- age 1) for each factor level. A better feel for the relative effect of the different factors can be obtained by the decomposition of variance, which is commonly called analysis of variance (ANOVA). ANOVA is also needed for estimating the error variance for the factor effects and variance of the prediction error. Analogy with Fourier Analysis An important reason for performing Fourier analysis of an electrical signal is to deter- mine the power in each harmonic to judge the relative importance of the various har- monics. The larger the amplitude of a harmonic, the larger the power is in it and the more important it is in describing the signal. Similarly, an important purpose of ANOVA is to determine the relative importance of the various factors. In fact, there is a strong analogy between ANOVA and the decomposition of the power of an electrical signal into different harmonics: + The nine observed values of 1) are analogous to the observed signal. * The sum of squared values of 7 is analogous to the power of the signal. + The overall mean 7) is analogous to the de part of the signal. * The four factors are like four harmonics. * The columns in the matrix experiment are orthogonal, which is analogous to the orthogonality of the different harmonics. The analogy between the Fourier analysis of the power of an electrical signal and ANOVA is displayed in Figure 3.2. The experiments are arranged along the horizontal axis like time. The overall mean is plotted as a straight line like a dc component. The effect of each factor is displayed as an harmonic. The level of factor A for experi- ments 1, 2, and 3 is Ay. So, the height of the wave for A is plotted as my, for these experiments. Similarly, the height of the wave for experiments 4, 5, and 6 is m,,, and the height for experiments 7, 8, and 9 is m4,. The waves for the other factors are also plotted similarly. 
By virtue of the additive model [Equation (3.5)], the observed η for any experiment is equal to the sum of the height of the overall mean and the deviation from the mean caused by the levels of the four factors. By referring to the waves of the different factors shown in Figure 3.2, it is clear that factors A, B, C, and D are in the decreasing order of importance. Further aspects of the analogy are discussed in the rest of this section.

Figure 3.2 Orthogonal decomposition of the observed S/N ratio (observed S/N ratio, overall mean, and effects of factors A, B, C, and D plotted against experiment number).

Computation of Sum of Squares

The sum of the squared values of η is called the grand total sum of squares. Thus, we have

    Grand total sum of squares = Σ (i = 1 to 9) η_i²
                               = (-20)² + (-10)² + ··· + (-70)²
                               = 19,425 (dB)²

The grand total sum of squares is analogous to the total signal power in Fourier analysis. It can be decomposed into two parts, the sum of squares due to mean and the total sum of squares, which are defined as follows:

    Sum of squares due to mean = (number of experiments) × m²
                               = 9 (-41.67)² = 15,625 (dB)²

    Total sum of squares = Σ (i = 1 to 9) (η_i - m)²
                         = (-20 + 41.67)² + (-10 + 41.67)² + ··· + (-70 + 41.67)²
                         = 3,800 (dB)²

The sum of squares due to mean is analogous to the dc power of the signal and the total sum of squares is analogous to the ac power of the signal in Fourier analysis. Because m is the average of the nine η_i values, we have the following algebraic identity:

    Σ (i = 1 to 9) (η_i - m)² = Σ (i = 1 to 9) η_i² - 9 m²

which can also be written as

    Total sum of squares = (grand total sum of squares) - (sum of squares due to mean)

The above equation is analogous to the fact from Fourier analysis that the ac power is equal to the difference between the total power and the dc power of the signal.

The sum of squares due to factor A is equal to the total squared deviation of the wave for factor A from the line representing the overall mean. There are three experiments each at levels A1, A2, and A3. Consequently,

    Sum of squares due to factor A = 3 (m_A1 - m)² + 3 (m_A2 - m)² + 3 (m_A3 - m)²
                                   = 3 (-20 + 41.67)² + 3 (-45 + 41.67)² + 3 (-60 + 41.67)²
                                   = 2,450 (dB)²

Proceeding along the same lines, we can show that the sums of squares due to factors B, C, and D are, respectively, 950, 350, and 50 (dB)². These sums of squares values are tabulated in Table 3.4. The sums of squares due to the various factors are analogous to the power in the various harmonics, and are a measure of the relative importance of the factors in changing the values of η.

Thus, factor A explains a major portion of the total variation of η. In fact, it is responsible for (2450/3800) × 100 = 64.5 percent of the variation of η. Factor B is responsible for the next largest portion, namely 25 percent; and factors C and D together are responsible for only a small portion, a total of 10.5 percent, of the variation in η.

Knowing the factor effects (that is, knowing the values of a_i, b_j, c_k, and d_l), we can use the additive model given by Equation (3.5) to calculate the error term e_i for each experiment i. The sum of squares due to error is the sum of the squares of the error terms. Thus we have

    Sum of squares due to error = Σ (i = 1 to 9) e_i²

In the present case study, the total number of model parameters (μ, a1, a2, a3, b1, b2, etc.) is 13; the number of constraints, defined by Equations (3.6) and (3.7), is 4. The number of model parameters minus the number of constraints is equal to the number of experiments. Hence, the error term is identically zero for each experiment. Hence, the sum of squares due to error is also zero. Note that this need not be the situation with all matrix experiments.

TABLE 3.4 ANOVA TABLE FOR η

                         Degrees of   Sum of    Mean
Factor/Source            Freedom      Squares   Square      F
A. Temperature              2           2450      1225     12.25
B. Pressure                 2            950       475      4.75
C. Settling time            2            350*      175
D. Cleaning method          2             50*       25
Error                       0              0        -
Total                       8           3800
(Error)                    (4)          (400)     (100)

* Indicates sum of squares added together to estimate the pooled error sum of squares indicated by parentheses. F ratio is calculated by using the pooled error mean square.

Relationship Among the Various Sums of Squares

The orthogonality of the matrix experiment implies the following relationship among the various sums of squares:

    (Total sum of squares) = (sum of the sums of squares due to various factors)
                             + (sum of squares due to error)           (3.9)

Equation (3.9) is analogous to Parseval's equation for the decomposition of the power of a signal into power in different harmonics. Equation (3.9) is often used for calculating the sum of squares due to error after computing the total sum of squares and the sum of squares due to various factors. Derivation of Equation (3.9), as well as a detailed mathematical description of ANOVA, can be found in many books on statistics, such as Scheffé [S1], Rao [R3], and Searle [S2].

For the matrix experiment described in this chapter, Equation (3.9) implies:

    (Total sum of squares) = (sum of the sums of squares due to factors A, B, C, and D)
                             + (sum of squares due to error)

Note that the various sums of squares tabulated in Table 3.4 do satisfy the above equation.

Degrees of Freedom

The number of independent parameters associated with an entity like a matrix experiment, or a factor, or a sum of squares is called its degrees of freedom. A matrix experiment with nine rows has nine degrees of freedom and so does the grand total sum of squares. The overall mean has one degree of freedom and so does the sum of squares due to mean. Thus, the degrees of freedom associated with the total sum of squares is 9 - 1 = 8. (Note that the total sum of squares is equal to the grand total sum of squares minus the sum of squares due to mean.)

Factor A has three levels, so its effect can be characterized by three parameters: a1, a2, and a3. But these parameters must satisfy the constraint given by Equation (3.6). Thus, effectively, factor A has only two independent parameters and, hence, two degrees of freedom. Similarly, factors B, C, and D have two degrees of freedom each. In general, the degrees of freedom associated with a factor is one less than the number of levels.

The orthogonality of the matrix experiment implies the following relationship among the various degrees of freedom:

    (Degrees of freedom for the total sum of squares) = (sum of the degrees of freedom for the various factors)
                                                        + (degrees of freedom for the error)      (3.10)

Note the similarity between Equations (3.9) and (3.10). Equation (3.10) is useful for computing the degrees of freedom for error. In the present case study, the degrees of freedom for error comes out to be zero. This is consistent with the earlier observation that the error term is identically zero for each experiment in this case study.
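The decomposition summarized in Table 3.4 can be reproduced in a few lines. The sketch below (Python; not part of the original text) uses the same standard L9 array and illustrative η values introduced earlier, and carries out the pooling of the two smallest factor sums of squares described in the footnote to Table 3.4 and discussed further below.

```python
import numpy as np

# Standard L9(3^4) array (columns A, B, C, D) and illustrative S/N values
# consistent with the summary statistics quoted in this chapter.
L9 = np.array([[1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
               [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
               [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1]])
eta = np.array([-20.0, -10.0, -30.0, -25.0, -45.0,
                -65.0, -45.0, -65.0, -70.0])
n, m = len(eta), eta.mean()

grand_total_ss = np.sum(eta ** 2)        # about 19,425 (dB)^2
ss_mean = n * m ** 2                     # about 15,625 (dB)^2
total_ss = grand_total_ss - ss_mean      # about  3,800 (dB)^2, 8 d.o.f.

# Sum of squares and degrees of freedom for each factor.
factor_ss, factor_df = {}, {}
for col, factor in enumerate("ABCD"):
    level_means = [eta[L9[:, col] == lv].mean() for lv in (1, 2, 3)]
    factor_ss[factor] = 3 * sum((ml - m) ** 2 for ml in level_means)
    factor_df[factor] = 3 - 1            # number of levels minus one

error_ss = total_ss - sum(factor_ss.values())      # 0 for this experiment
error_df = (n - 1) - sum(factor_df.values())       # 0 for this experiment

# Pool the bottom half of the factors (smallest sums of squares) to
# approximate the error variance, as in the footnote of Table 3.4.
pooled = sorted(factor_ss, key=factor_ss.get)[:2]  # factors C and D here
pooled_ss = error_ss + sum(factor_ss[f] for f in pooled)
pooled_df = error_df + sum(factor_df[f] for f in pooled)
error_variance = pooled_ss / pooled_df             # about 100 (dB)^2

for factor in "ABCD":
    mean_square = factor_ss[factor] / factor_df[factor]
    note = "(pooled)" if factor in pooled else f"F = {mean_square / error_variance:.2f}"
    print(f"{factor}: SS = {factor_ss[factor]:7.1f}  MS = {mean_square:7.1f}  {note}")
print(f"total SS = {total_ss:.1f}, pooled error variance = {error_variance:.1f}")
```

The printed values match the entries of Table 3.4, and the F ratios reproduce the ordering of the factors by importance: A, then B, with C and D pooled as error.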
It is customary to write the analysis of variance in a tabular form shown in Table 3.4, The mean square for a factor is computed by dividing the sum of squares by the degrees of freedom. Sec. 3.4 Analysis of Variance 57 Estimation of Error Variance The error variance, which is equal to the error mean square, can then be estimated as follows: sum of squares due to error Error Variaiice = <7 ees of feedomfor error” GAD The error variance is denoted by 02. In the interest of gaining the most information from a matrix experiment, all or most of the columns should be used to study process or product parameters. As a Tesult, no degrees of freedom may be left to estimate error variance. Indeed, this is the situation with the present example. In such situations, we cannot directly estimate the error variance. However, an approximate estimate of the error variance can be obtained by pool- ing the sum of squares corresponding to the factors having the lowest mean square. As a rule of thumb, we suggest that the sum of squares corresponding to the bottom half of the factors (as defined by lower mean square) corresponding to about half of the degrees of freedom be used to estimate the error mean square or error variance. This tule is similar to considering the bottom half harmonics in a Fourier expansion as error and using the rest to explain the function being investigated. In the present example, we use factors C and D to estimate the error mean square. Together, they account for four degrees of freedom and the sum of their sum of squares is 400. Hence, the error variance is 100. Error variance computed in this manner is indicated by parentheses, and the computation method is called pooling. (By the traditional statistical assump- tions, pooling gives a biased estimate of error variance. To obtain a better estimate of error variance, a significantly larger number of experiments would be needed, the cost of which is usually not justifiable compared to the added benefit.) In Fourier analysis of a signal, it is common to compute the power in all har- monics and then use only those harmonics with large power to explain the signal and treat the rest as error. Pooling of sum of squares due to bottom half factors is exactly analogous to that practice. After evaluating the sum of squares due to all factors, we retain only the top half factors to explain the variation in the process response n and the rest to estimate approximately the error variance. The estimation of the error variance by pooling will be further illustrated through the applications discussed in the subsequent chapters. As it will be apparent from these applications, deciding which factors’ sum of squares should be included in the error variance is usually obvious by inspecting the mean square column. The decision process can sometimes be improved by using a graphical data analysis technique called half-normal plots (see Daniel [D1] and Box, Hunter, and Hunter [B3)). 88 Matrix Experiments Using Orthogonal Arrays Chap. 3 Confidence intervals for Factor Effects Confidence intervals for factor effects are useful in judging the size of the change caused by changing a factor level compared to the error standard deviation. As shown in Section 3.3, the variance of the effect of each factor level for this example is (1/3) 62 = (1/3)(100) = 33.3 (dB)”. Thus, the width of the two-standard-deviation confidence interval, which is approximately 95 percent confidence interval, for each estimated effect is +2 ¥33.3 =+11.5 dB. 
In Figure 3.1 these confidence intervals are plotted for only the starting level to avoid crowding. Variance Ratio The variance ratio, denoted by F in Table 3.4, is the ratio of the mean square due to a factor and the error mean square. A large value of F means the effect of that factor is large compared to the error variance. Also, the larger the value of F, the more impor- tant that factor is in influencing the process response 1). So, the values of F can be used to rank order the factors. In statistical literature, the F value is often compared with the quantiles of a probability distribution called the F-distribution to determine the degree of confidence that a particular factor effect is real and not just a random occurrence (see, for example, Hogg and Craig (H3]), However, in Robust Design we are not concemed with such probability statements; we use the F ratio for only qualitative understanding of the rela- tive factor effects, A value of F less than one means the factor effect is smaller than the error of the additive model. A value of F larger than two means the factor is not quite small, whereas larger than four means the factor effect is quite large. Interpretation of ANOVA Tables Thus far in this section, we have described the computation involved in the ANOVA table, as well as the inferences that can be made from the table. A variety of computer programs can be used to perform the calculations, but the experimenter must make appropriate inferences. Here we put together the major inferences from the ANOVA table. Referring to the sum of squares column in Table 3.4, notice that factor A makes the largest contribution to the total sum of squares, namely, (2450/3800) x 100 = 64.5 percent. Factor B makes the next largest contribution, (950/3800) x 100 = 25.0 per- Cent, to the total sum of squares. Factors C and D together make only a 10.5 percent contribution to the total sum of squares. The larger the contribution of a particular fac- tor to the total sum of squares, the larger the ability is of that factor to influence 7). In this matrix experiment, we have used all the degrees of freedom for estimating the factor effects (four factors with two degrees of freedom each make up all the eight degrees of freedom for the total sum of squares). Thus, there are no degrees of Sec. 3.5 Prediction and Diagnosis 59 freedom left for estimating the error variance. Following the rule of thumb spelled out earlier in this section, we use the bottom half factors that have the smallest mean square to estimate the error variance. Thus, we obtain the error sum of squares, indi- cated by parentheses in the ANOVA table, by pooling the sum of squares due to fac- tors C and D. This gives 100 as an estimate of the error variance. The largeness of a factor effect relative to the error variance can be judged from the F column. The larger the F value, the larger the factor effect is compared to the error variance. This section points out that our purpose in conducting ANOVA is to determine the relative magnitude of the effect of each factor on the objective function 1) and to estimate the error variance. We do not attempt to make any probability statements about the significance of a factor as is commonly done in statistics. In Robust Design, ANOVA is also used to choose from among many altematives the most appropriate quality characteristic and S/N ratio for a specific problem. Such an application of ANOVA is described in Chapter 8. 
Also, ANOVA is useful in computing the S/N ratio for dynamic problems as described in Chapter 9. 3.5 PREDICTION AND DIAGNOSIS Prediction of n under Optimum Conditions ‘As discussed earlier, a primary goal of conducting Robust Design experiments is to determine the optimum level for each factor. For the CVD project, one of the two identified optimum conditions is A; B, C2, D2. The additive model, Equation (3.5), can be used to predict the value of 1) under the optimum conditions, denoted by Nop, as follows: Mop =m + (m4,—m) + (mg,—m) = ~41.67 + (-20+41.67) + (-30+41.67) =-8.33 dB. @.12) Note that since the sum of squares due to factors C and D are small and that these terms are included as error, we do not include the corresponding improvements in the prediction of 7 under optimum conditions. Why are the contributions by factors having a small sum of squares ignored? Because if we include the contribution from all factors, it can be shown that the predicted improvement in 7) exceeds the actual realized improvement—that is, our prediction would be biased on the higher side. By ignoring the contribution from factors with small sums of squares, we can reduce this 60 Matrix Experiments Using Orthogonal Arrays Chap. 3 bias. Again, this is a rule of thumb. For more precise prediction, we need to use appropriate shrinkage coefficients described by Taguchi [T1}. Thus, by Equation (3.12) we predict that the defect count under the optimum conditions would be —8.33 dB. This is equivalent to a mean square count of = y=lo © = 10°83 = 6.8 (defects /unit area)” The corresponding root-mean-square defect count is V6.8 = 2.6 defects/unit area, The purpose of taking log in constructing the S/N ratio can be explained in terms of the additive model. If the actual defect count were used as the characteristic for constructing the additive model, it is quite possible that the defect count predicted under the optimum conditions would have been negative. This is highly undesirable since negative counts are meaningless. However, in the log scale, such negative counts cannot occur. Hence, it is preferable to take the log. The additive model is also useful in predicting the difference in defect counts between two process conditions. The anticipated improvement in changing the process conditions from the initial settings (ABC ,D1) to the optimum settings (A;B;C2D2) is AM = Nop ~ Nhinitiat = (Ma, — ™a,) + (mg, — mig,) (—20+45) + (-30+40) =35dB. (3.13) Once again we do not include the terms corresponding to factors C and D for the rea- sons explained earlier. Verification (Confirmation) Experiment After determining the optimum conditions and predicting the response under these con- ditions, we conduct an experiment with optimum parameter settings and compare the observed value of 1) with the prediction. If the predicted and observed 1 are close to each other, then we may conclude that the additive model is adequate for describing the dependence of nj on the various parameters. On the contrary, if the observation is drastically different from the prediction, then we say the additive model is inadequate. This is evidence of a strong interaction among the parameters, which is described later in this section. Sec. 3.5 Prediction and Diagnosis 61 Variance of Prediction Error We need to determine the variance of the prediction error so that we can judge the closeness of the observed Mop: to the predicted Nog. 
The prediction error, which is the difference between the observed Noi» and the predicted Tho, has two independent com- ponents. The first component is the error in the prediction of Tho caused by the errors in the estimates of m, m,,, and mg,. The second component is the repetition error of an experiment. Because these two components are independent, the variance of the prediction error is the sum of their respective variances, Consider the first component. Its variance can be shown equal to (1/n9)6? where o2 is the error variance whose estimation was discussed earlier, no is the equivalent sample size for the estimation of Nox. The equivalent sample size ny can be computed as follows: (3.14) where n is the number of rows in the matrix experiment and m4, is the number of times level A, was repeated in the matrix experiment—that is, m4, is the replication number for factor level A, and ng, is the replication number for factor level B . Observe the correspondence between Equations (3.14) and (3.12). The term (1/n) in Equation (3.14) corresponds to the term m in the prediction Equation (3.12); and the terms (1/n,,—1/n) and (1/ng, —1/n) correspond, respectively, to the terms (m,,—m) and (mp,—m). This correspondence can be used to generalize Equation (3.14) to other prediction formulae. Now, consider the second component. Suppose we repeat the verification experi- ment 7, times under the optimum conditions and call the average 7) for these exp ments as the observed Nop. The repetition error is given by (1/n,)02. Thus, the vari- ance of the prediction error, OZped, iS 2 fig fh cia (Jet (2 In the example, n = 9 and ng, = ng, = 3. Thus, (1/9) = (1/9) + (1/3-1/9) + (1/3-1/9) = (5/9). Suppose n, = 4. Then 3 G15) Rea = [$e + [+e = 80.64? 62 Matrix Experiments Using Orthogonal Arrays Chap. 3 The corresponding two-standard-deviation confidence limits for the prediction error are £17.96 dB. If the prediction error is outside these limits, we should suspect the possi- bility that the additive model is not adequate. Otherwise, we consider the additive model to be adequate. Uniformity of Prediction Error Variance It is obvious from Equation (3.15) that the variance of the prediction error, SFreds is the same for all combinations of the factor levels in the experimental region. It does not matter whether the particular combination does or does not correspond to one of the rows in the matrix experiment. Before conducting the matrix experiment we do not know what would be the optimum combination. Hence, it is important to have the property of uniform prediction error. Interactions among Control Factors The concept of interactions can be understood from Figure 3.3. Figure 3.3(a) shows the case of no interaction between two factors A and B. Here, the lines of the effect of factor A for the settings B,, B>, and B3 of factor B are parallel to each other. Parallel lines imply that if we change the level of factor A from A, to Az or A3, the corresponding change in 1) is the same regardless of the level of factor B. Similarly, a change in level of B produces the same change in 1) regardless of the level of factor A. The additive model is perfect for this situation. Figures 3.3(b) and 3.3(c) show two ‘examples of presence of interaction. In Figure 3.3(b), the lines are not parallel, but the direction of improvement does not change. In this case, the optimum levels identified by the additive model are still valid. 
Whereas in Figure 3.3(c), not only are the lines not parallel, but the direction of improvement is also not consistent, In such a case, the optimum levels identified by the additive model can be misleading. The type of interaction in Figure 3.3(b) is sometimes called synergistic interaction while the one in Figure 3.3(c) is called antisynergistic interaction. The concept of interaction between two factors described above can be generalized to apply to interaction among three or more factors. When interactions between two or more factors are present, we need cross prod- uct terms to describe the variation of 7 in terms of the control factors, A model for such a situation needs more parameters than an additive model and, hence, it needs more experiments to estimate all the parameters. Further, as discussed in Chapter 6, using a model with interactions can have problems in the field. ‘Thus, we consider the presence of interactions to be highly undesirable and try to eliminate them. When the quality characteristic is correctly chosen, the S/N ratio is properly con- structed, and the control factors are judiciously chosen (see Chapter 6 for guidelines), the additive model provides excellent approximation for the relationship between 1) and the control factors. The primary purpose of the verification experiment is to warn us Sec.38 Summary 63 By 8, A A, Ay A A, Ay A A, Ay (@) No Interaction (b) Synergistic (0) Antisynergistic Interaction Interaction Figure 3.3. Examples of interaction. when the additive model is not adequate and, thus, prevent faulty process and product designs from going downstream. Some applications call for a broader assurance of the additive model. In such cases, the verification experiment consists of two or more con- ditions rather than just the optimum conditions. For the additive model to be con- sidered adequate, the predictions must match the observation under all conditions that are tested. Also, in certain situations, we can judge from engineering knowledge that particular interactions are likely to be important. Then, orthogonal arrays can be suit- ably constructed to estimate those interactions along with the main effects, as described in Chapter 7. 3.6 SUMMARY * A matrix experiment consists of a set of experiments where the settings of several product or process parameters to be studied are changed from one experi- ment to another. Matrix experiments are also called designed experiments, parameters are also called factors, and parameter settings are also called levels. Conducting matrix experiments using orthogonal arrays is an important technique in Robust Design. It gives more reliable estimates of factor effects with fewer experiments when compared to the traditional methods, such as one factor at a time experiments. Consequently, more factors can be studied in given R&D Tesources, leading to more robust and less expensive products. Matrix Experiments Using Orthogonal Arrays Chap. 3 * The columns of an orthogonal array are pairwise orthogonal—that is, for every pair of columns, all combinations of factor levels occur an equal number of times. The columns of the orthogonal array represent factors to be studied and the rows represent individual experiments. Conducting a matrix experiment with an orthogonal array is analogous to finding the frequency response function of a dynamic system by using a multifrequency input. 
The analysis of data obtained from matrix experiments is analogous to Fourier analysis, * Some important terms used in matrix experiments are: The region formed by the factors being studied and their alternate levels is called the experimental region, The starting levels of the factors are the levels used before conducting the matrix experiment. The main effects of the factors are their separate effects. If the effect of a factor depends on the level of another factor, then the two factors are said to have an interaction. Otherwise, they are considered to have no interac- tion. The replication number of a factor level is the number of experiments in the matrix experiment that are conducted at that factor level. The effect of a fac- tor level is the deviation it causes from the overall mean response. The optimum level of a factor is the level that gives the highest S/N ratio. An additive model (also called superposition model or variables separable model) is used to approximate the relationship between the response variable and the factor levels, Interactions are considered errors in the additive model. * Orthogonal array based matrix experiments are used for a variety of purposes in Robust Design. ‘They are used to: — Study the effects of control factors — Study the effects of noise factors — Evaluate the S/N ratio — Determine the best quality characteristic or S/N ratio for particular applica- tions, + Key steps in analyzing data obtained from a matrix experiment are: 1. Compute the appropriate summary statistics, such as the S/N ratio for each experiment. 2. Compute the main effects of the factors. 3. Perform ANOVA to evaluate the relative importance of the factors and the error variance. 4, Determine the optimum level for each factor and predict the S/N ratio for the optimum combination. Sec.36 Summary 85 5. Compare the results of the verification experiment with the prediction, If the results match the prediction, then the optimum conditions are con- sidered confirmed; otherwise, additional analysis and experimentation are needed. * If one or more experiments in a matrix experiment are missing or erroneous, then those experiments should be repeated to complete the matrix. This avoids the need for complicated analysis. * Matrix experiment, followed by a verification experiment, is a powerful tool for detecting the presence of interactions among the control factors. If the predicted response under the optimum conditions does not match the observed response, then it implies that the interactions are important. If the predicted response matches the observed response, then it implies that the interactions are probably not important and that the additive model is a good approximation. Chapter 4 STEPS IN ROBUST DESIGN As explained in Chapter 2, optimizing a product or process design means determining the best architecture, levels of control factors, and tolerances. Robust Design is a methodclogy for finding the optimum settings of the control factors to make the prod- uct or process insensitive to noise factors. It involves eight steps that can be grouped into the three major categories of planning experiments, conducting them, and analyz- ing and verifying the results. * Planning the experiment 1) Identify the main function, side effects, and failure modes. 2) Identify noise factors and the testing conditions for evaluating the quality loss. 3) Identify the quality characteristic to be observed and the objective function to be optimized. 
4) Identify the control factors and their alternate levels. 5) Design the matrix experiment and define the data analysis procedure, + Performing the experiment 6) Conduct the matrix experiment. 67 68 Steps in Robust Design Chap. 4 + Analyzing and verifying the experiment results 7) Analyze the data, determine optimum levels for the control factors, and predict performance under these levels. 8) Conduct the verification (also called confirmation) experiment and plan future actions. These eight steps make up a Robust Design cycle. We will illustrate them in this chapter by using a case study of improving a polysilicon deposition process. The case study was conducted by Peter Hey in 1984 as a class project for the first offering of the 3-day Robust Design course developed by the author, Madhav Phadke, and Chris Sherrerd, Paul Sherry, and Rajiv Keny of AT&T Bell Laboratories. Hey and Shenry jointly planned the experiment and analyzed the data. The experiment yielded a 4-fold reduction in the standard deviation of the thickness of the polysilicon layer and nearly two orders of magnitude reduction in surface defects, a major yield-limiting problem which was virtually eliminated. These results were achieved by studying the effects of six control factors by conducting experiments under 18 distinct combinations of the levels of these factors—a rather small investment for huge benefits in quality and yield. This chapter consists of nine sections: * Sections 4.1 through 4.8 describes in detail the polysilicon deposition process case study in terms of the eight steps that form a Robust Design cycle. * Section 4.9 summarizes the important points of this chapter. 4.1 THE POLYSILICON DEPOSITION PROCESS AND ITS MAIN FUNCTION Manufacturing very large scale intergrated (VLSI) circuits involves about 150 major steps. Deposition of polysilicon comes after about half of the steps are complete, and, as a result, the silicon wafers (thin disks of silicon) used in the process have a significant amount of value-added by the time they reach this step. The polysilicon layer is very important for defining the gate electrodes for the transistors, There are over 250,000 transistors in a square centimeter chip areaefor the 1.75 micron (microme- ter = micron) design rules used in the case study. A hot-wall, reduced-pressure reactor (see Figure 4.1) is used to deposit polysili- con on a wafer. The reactor consists of a quartz tube which is heated by a 3-zone fur- nace. Silane and nitrogen gases are introduced at one end and pumped out the other. The silane gas pyrolizes, and a polysilicon layer is deposited on top of the oxide layer on the wafers. The wafers are mounted on quartz carriers. Two carriers, each carrying 25 wafers, can be placed inside the reactor at a time so that polysilicon is deposited simultaneously on 50 wafers. Sec. 4.1 The Polysilicon Deposition Process and its Main Function 69 Pressure Figure 4.1 Schematle diagram of a reduced pressure reactor. The function of the polysilicon deposition process is to deposit a uniform layer of a specified thickness, In the case study, the experimenters were interested in achiev- ing 3600 angstrom(A ) thickness (1A = 10° meter). Figure 4.2 shows a cross sec- tion of the wafer after the deposition of the polysilicon layer. Interlevel Dielectric SiO, 2300A P-doped Fotyaesn 3600 ‘Si Substrate, AR SiO, Gate Layer SiO, Gate Layer 360A 360A P-doped Polysilicon 2700 Figure 4.2 Cross section of a wafer showing polysilicon layer. 
At the start of the study, two main problems occurred during the deposition pro- cess: (1) too many surface defects (see Figure 4.3) were encountered, and (2) too large 70 Steps in Robust Design Chap. 4 a thickness variation existed within wafers and among wafers. In a subsequent VLSI manufacturing step, the polysilicon layer is patterned by an etching process to form lines of appropriate width and length. Presence of surface defects causes these lines to have variable width, which degrades the performance of the integrated circuits. The nonuniform thickness is detrimental to the etching process because it can lead to resid- ual polysilicon in some areas and an etching away of the underlying oxide layer in other areas. Figure 4.3 Photographs of polysilicon surface showing surface defects. Prior to the case study, Hey noted that the surface-defect problem was crucial because a significant percentage of wafers were scrapped due to excessive defects. Also, he observed that controlling defect formation was particularly difficult due to its intermittent occurrence; for example, some batches of wafers (50 wafers make one batch) had approximately ten defects per unit area, while other batches had as many as 5,000 defects per unit area. Furthermore, no theoretical models existed to predict defect formation as a function of the various process parameters; therefore, experi- mentation was the only way to control the surface-defect problem. However, the Sec. 4.2 Noise Factors and Testing Conditions n intermittency of the problem had rendered the traditional method of experimentation, where only one process parameter is changed at a time, virtually useless. 4.2 NOISE FACTORS AND TESTING CONDITIONS To minimize sensitivity to noise factors, we must first be able to estimate the sensi- tivity in a consistent manner for any combination of the control factor levels. This is achieved through proper selection of testing conditions. In a Robust Design project, we identify all noise factors (factors whose levels cannot be controlled during manufacturing, which are difficult to control, or expensive to control), and then select a few testing conditions that capture the effect of the more important noise factors. Simulating the effects of all noise factors is impractical because the experimenter may not know all the noise sources and because total simula- tion would require too many testing conditions and be costly. Although it is not neces- sary to include the effect of all noise factors, the experimenter should list as many of them as possible and, then, use engineering judgment to decide which are more impor- tant and what testing conditions are appropriate to capture their effects. Various noise factors exist in the deposition process. The nonuniform thickness and the surface defects of the polysilicon layer are caused by the variations in the parameters involved in the chemical reactions associated with the deposition process. First, the gases are introduced at one end of the reactor (see Figure 4.1). As they travel to the other end, the silane gas decomposes into polysilicon, which is deposited on the wafers, and into hydrogen. This activity causes a concentration gradient along the length of the reactor. Further, the flow pattern (direction and speed) of the gases need not be the same as they travel from one end of the tube to the other. The flow pattem could also vary from one part of a wafer to other parts of the same wafer. 
Another important noise factor is the temperature variation along the length and cross section of the tube, There are, of course, other sources of variation or noise factors, such as topography of the wafer surface before polysilicon deposition, variation in pumping speed, and variation in gas supply. For the case study of the polysilicon deposition process, Hey and Sherry decided to process one batch of 50 wafers to evaluate the quality associated with each combina- tion of control factor settings suggested by the orthogonal array experiment. Of these 50 wafers, only 3 were test wafers, while the remaining 47 were dummy wafers, which provided the needed "full load" effect while saving the cost of expensive test wafers, To capture the variation in reactant concentration, flow pattem variation, and tempera- ture variation along the length of the tube, the test wafers were placed in positions 3, 23, and 48 along the tube. Furthermore, to capture the effect of noise variation across a wafer, the thickness and surface defects were measured at three points on each test wafer: top, middle, and bottom. Other noise factors were judged to be less important. To include their effect, the experimenters would have had to process multiple batches, thus making the experiments very expensive. Consequently, the other noise factors were ignored. 72 Stops in Robust Design Chap. 4 The testing conditions for this case study are rather simple: observe thickness and surface defects at three positions of three wafers, which are placed in specific posi- tions along the length of the reactor. Sometimes orthogonal arrays (called noise orthogonal arrays) are used to determine the testing conditions that capture the effect of many noise factors. In some other situations, the technique of compound noise fac- tor is used. These two techniques of constructing testing conditions are described in Chapter 8. 4.3 QUALITY CHARACTERISTICS AND OBJECTIVE FUNCTIONS It is often tempting to observe the percentage of units that meet the specification and use that percentage directly as an objective function to be optimized, But, such temp- tation should be meticulously avoided. Besides being a poor measure of quality loss, using percentage of good (or bad) wafers as an objective function leads to orders of magnitude reduction in efficiency of experimentation. First, to observe accurately the percentage of "good" wafers, we need a large number (much larger than three) of test wafers for each combination of control factor settings. Secondly, when the percentage of good wafers is used as an objective function, the interactions among control factors often become dominant; consequently, additive models cannot be used as adequate approximations. The appropriate quality characteristics to be measured for the polysili- con deposition process in the case study were the polysilicon thickness and the surface defect count. The specifications were that the thickness should be within + 8 percent of the target thickness and that the surface defect count should not exceed 10 per square centimeter. As stated in Section 4.2, nine measurements (3 wafers x 3 measurements pet wafer) of thickness and surface defects were taken for each combination of control fac- tor settings in the matrix experiment. The ideal value for surface defects is zero—the smaller the number of surface defects per cm”, the better the wafer. 
So, by adopting the quadratic loss function, we see that the objective function to be maximized is = 10 logio (mean square surface defects) 133, = 10 login) 9X Lyi (4.1) “1 ja where yj; is the observed surface defect count at position j on test wafer i, Note that J=1, 2, and 3 stand for top, center, and bottom positions, respectively, on a test wafer. And i=1, 2, and 3 refer to position numbers 3, 23, and 48, respectively, along the length of the tube. Maximizing 1 leads to minimization of the quality loss due to sur- face defects. Sec. 4.3 Quality Characteristics and Objective Functions 73 The target value in the study for the thickness of the polysilicon layer was %} = 3600 A . Let tj be the observed thickness at position j on test wafer i. The mean and variance of the thickness are given by 133 Beg ZLwy (4.2) 9A 133 5 7 LL Gy-w. (4.3) 1 i The goal in optimization for thickness is to minimize variance while keeping the mean on target. This is a constrained optimization problem, which can be very difficult, especially when many control factors exist. However, as Chapter 5 shows, when a scaling factor (a factor that increases the thickness proportionally at all points on the wafers) exists, the problem can be simplified greatly. In the case study, the deposition time was a clear scaling factor—that is, for every surface area where polysilicon was deposited, (thickness) = (deposition rate) x (deposition time). The deposition rate may vary from one wafer to the next, or from one position on a wafer to another position, due to the various noise factors cited in the previous section. However, the thickness at any point is proportional to the deposition time. Thus, the constrained optimization problem in the case study can be solved in two steps as follows: 1. Maximize the Signal-to-noise (S/N) ratio, 1’, 2 10 logio 4. (44) o 2. Adjust the deposition time so that mean thickness is on target. In summary, the two quality characteristics to be measured were the surface defects and the thickness. The corresponding objective functions to be maximized were | and 1)’ defined by Equations (4.1) and (4.4), respectively. (Note that S/N ratio is a general term used for measuring sensitivity to noise factors. It takes a different form depending on the type of quality characteristic, as discussed in detail in Chapter 5. Both 1) and 1/’ are different types of S/N ratios.) The economics of a manufacturing process is determined by the throughput as well as by the quality of the products produced. Therefore, along with the quality characteristics, a throughput characteristic also must be studied. Thus, in the case study, the experimenters also observed the deposition rate, r, measured in angstroms of thickness growth per minute. 74 Steps in Robust Design Chap. 4 4.4 CONTROL FACTORS AND THEIR LEVELS Processes, such as polysilicon deposition, typically have a large number of control fac- tors (factors that can be freely specified by the process designer). The more complex a process, the more control factors it has and vice versa. Typically, we choose six to eight control factors at a time to optimize a process. For each factor we generally select two or three levels (or settings) and take the levels sufficiently far apart so that a wide region can be covered by the three levels. Commonly, one of these levels is taken to be the initial operating condition. Note that we are interested in the nonlinear- ity, so taking the levels of control factors too close together is not very fruitful. 
If we take only two levels, curvature effects would be missed, whereas such effects can be identified by selecting three levels for a factor (see Figure 4.4). Furthermore, by select- ing three levels, we can simultaneously explore the region on either side of the initial operating condition. Hence, we prefer three levels. 7 1 Ay Aa As Ay Az As (@) With two points we (b) With three points we can only fit a straight ‘can identify curvature line. effects and, hence, peaks. Figure 4.4 Linear and curvature effects of a factor. In the case study, six control factors were selected for optimization. These fac- tors and their alternate levels are listed in Table 4.1. The deposition temperature (A) is the steady state temperature at which the deposition takes place. When the wafers are placed in the reactor, they first have to be heated from room temperature to the deposi- tion temperature and then held at that temperature. The deposition pressure (B) is the constant pressure maintained inside the reactor through appropriate pump speed and butterfly adjustment. The nitrogen flow (C) and the silane flow (D) are adjusted using the corresponding flow meters on gas tanks. Settling time (B) is the time between placing the wafer carriers in the reactors and the time at which gases flow. The set- tling time is important for establishing thermal and pressure equilibrium inside the Sec. 4.4 Control Factors and Their Levels 75 reactor before the reaction is allowed to start. Cleaning method (F) refers to cleaning the wafers prior to the deposition step. Before undertaking the case study experiment, the practice was to perform no cleaning. The alternate two cleaning methods the experimenters wanted to study were CM 2, performed inside the reactor, and CM 3, per- formed outside the reactor. TABLE 4.1 CONTROL FACTORS AND THEIR LEVELS Levels* Factor 1 2 3 A. Deposition temperature ((C) | To-25 | To T5425 B. Deposition pressure (mtorr) | Po-200 | Po Py +200 C. Nitrogen flow (sccm) No No-150 | No-75 D, Silane flow (sccm) So-100 | So-50 | So E, Settling time (min) fo tot8 fot 16 F. Cleaning method None | CM cM, * Starting levels are identified by underscore. While deciding on the levels of control factors, a frequent tendency is to choose the levels relatively close to the starting levels. This is due to the experimenter’s con- cem that a large number of bad products may be produced during the matrix experi- ment. But, producing bad products during the experiment stage may, in fact, be beneficial because it tells us which region of control factor levels should be avoided. Also, by choosing levels that are wide apart, we increase the chance of capturing the nonlinearity of the relationship between the control factors and the noise factors, and, thus, finding the levels of control factors that minimize sensitivity to noise. Further, when the levels are wide apart, the factor effects are large when compared to the exper- imental errors. As a result, the factor effects can be identified without too many repeti- tions. Thus, it is important to resist the tendency to choose control factor levels that are rather close. Of course, during subsequent refinement experiments, levels closer to each other could be chosen. 
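The advantage of three levels over two, illustrated in Figure 4.4, can also be seen numerically. The sketch below (Python; the coded settings and η values are made up purely for illustration) fits a straight line through the two extreme levels and a quadratic through all three. Only the quadratic exposes the curvature and locates a peak inside the experimental region.

```python
import numpy as np

# Coded levels 1, 2, 3 of one control factor (for example T0-25, T0, T0+25)
# and made-up S/N observations with a peak near the middle level.
levels = np.array([1.0, 2.0, 3.0])
eta = np.array([-3.0, 1.5, -2.0])

# Two levels only: a straight line through levels 1 and 3 misses the curvature.
slope, intercept = np.polyfit(levels[[0, 2]], eta[[0, 2]], 1)
print(f"line through extreme levels: slope = {slope:+.2f} dB per level")

# Three levels: the quadratic fit reveals the curvature and the interior peak.
a, b, c = np.polyfit(levels, eta, 2)
peak = -b / (2.0 * a)
print(f"quadratic fit: curvature = {a:+.2f}, peak near coded level {peak:.2f}")
```

With only the two extreme readings, the factor would appear to have a weak, monotonic effect; the third level shows that an intermediate setting is actually the best.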
In the polysilicon deposition case study, the ratio of the largest to the smallest levels of factors B, C, D, and, E was between three and five which represents a wide variation, Temperature variation from (T)-25) °C to (To +25) °C also represents a wide range in terms of the known impact on the deposi- tion rate. 76 Steps in Robust Design Chap. 4 The initial settings of the six control factors are indicated by an underscore in Table 4.1. The objective of this project was to determine the optimum level for each factor so that 1 and 1’ are improved, while ensuring simultaneously that the deposition fate, r, remained as high as possible. Note that the six control factors and their selected settings define the experimental region over which process optimization was done. 4.5 MATRIX EXPERIMENT AND DATA ANALYSIS PLAN An efficient way to study the effect of several control factors simultaneously is to plan matrix experiments using orthogonal arrays. As pointed out in Chapter 3, orthogonal arrays offer many benefits. First, the conclusions arrived at from such experiments are valid over the entire experimental region spanned by the control factors and their set- tings. Second, there is a large saving in the experimental effort. Third, the data analysis is very easy. Finally, it can detect departure from the additive model. An orthogonal array for a particular Robust Design project can be constructed from the knowledge of the number of control factors, their levels, and the desire to study specific interactions. While constructing the orthogonal array, we also take into account the difficulties in changing the levels of control factors, other physical limita- tions in conducting experiments, and the availability of resources. In the polysilicon deposition case study, there were six factors, each at three levels. The experimenters found no particular reason to study specific interactions and no unusual difficulty in changing the levels of any factor. The available resources for conducting the experi- ments were such that about 20 batches could be processed and appropriate measure- ments made. Using the standard methods of constructing orthogonal arrays, which are described in Chapter 7, the standard array Lg was selected for this matrix experiment. The Lig orthogonal array is given in Table 4.2. It has cight columns and cigh- teen rows. The first column is a 2-level column—that is, it has only two distinct entries, namely 1 or 2. All the chosen six control factors have three levels. So, column 1 was kept empty or unassigned. From the remaining seven 3-level columns, column 7 was arbitrarily designated as an empty column, and factors A through F were assigned, respectively, to columns 2 through 6 and 8. (Note that keeping one or more columns empty does not alter the orthogonality property of the array. Thus, the matrix formed by columns 2 through 6 and 8 is still an orthogonal array. But, if one or more rows are dropped, the orthogonality is destroyed.) The reader can verify the ortho- gonality by checking that for every pair of columns all combinations of ievels occur, and they occur an equal number of times. The 18 rows of the Lyg array represent the 18 experiments to be conducted. Thus, experiment 1 is to be conducted at level 1 for each of the six control factors. These levels can be read from Table 4.1. However, to make it convenient for the experimenter and to prevent translation errors, the entire matrix of Table 4.2 should be Seo. 
4.5 Matrix Experiment and Data Analysis Plan 7 translated using the level definitions in Table 4.1 to create the experimenter’s log shect shown in Table 4.3. TABLE 4,2 L,, ORTHOGONAL ARRAY AND FACTOR ASSIGNMENT Column Numbers and Factor Assignment* 3 B 4 ic 5 D 6 E 1 1 1 1 eos 10 u 12 B 14 15 16 17 18 * Empty columns are identified by e. Steps in Robust Design TABLE 4.3 EXPERIMENTER'S LOG Expt. Settling | Cleaning] No. | Temperature | Pressure | Nitrogen | Silane | Time | Method 1 | T)-25 Po-200 | No Sp-100 | ty None 2 | T)-25 Py No-150 | So-50 | t+8 | CM | 3 | T)-25 Po +200 | No-75 | So to416 | CM 4 |T% Po—200 | No So-50 | +8 | CMs 5 | To Po No~150 | So to+16 | None 6 | 7% Po+200 | No~75 | So-100 | to CM, 7 | To+28 P,—200 | No~150 | So-100 | to +16 | Ci 8 | T.+25 Po No~15 | So-50 | to None 9 | T)+25 Po+200 | No So tot8 | CM, 10 | T)-25 Po-200 | No-75 | So to+8 | None | 7-25 Po Ny S-100 | to+16 | CM, | 12 | T)-25 Po +200 | No~150 | So-50 | to cM, 13 | 7 Po-200 | No~150 | So fo CM, 14 | 1%) Po No~75 | So-100 | to+8 | CM; is To Po+200 | No So-50 tot 16 None 16 | To+25 Po-200 | No~75 | So-50 | to+16 | CM2 17 | T.+25 Po No So oy CM, 18 | T)+25 Po+200 | No-150 | So-100 | to+8 | None Chap. 4 Now we combine the experimenters log sheet with the testing conditions described in Section 4.2 to create the following experimental procedure: 1. Conduct 18 experiments as specified by the 18 rows of Table 4.3. 2. For each experiment, process one batch, consisting of 47 dummy wafers and three test wafers. The test wafers should be placed in positions 3, 23, and 48. Sec. 4.6 Conducting the Matrix Experiment 79 3. For each experiment, compute to your best ability the deposition time needed to achieve the target thickness of 3600A. Note that in the experiment the actual thickness may tum out to be much different from 3600A . However, such data are perfectly useful for analysis. Thus, a particular, experiment need not be redone by adjusting the deposition time to obtain 360A thickness. 4. For each experiment, measure the surface defects and thickness at three specific points (top, center, and bottom) on each test wafer. Follow standard laboratory practice to prepare data sheets with space for every observation to be recorded. 4.6 CONDUCTING THE MATRIX EXPERIMENT From Table 4.3 it is apparent that, from one experiment to the next, levels of several control factors must be changed. This poses a considerable amount of difficulty to the experimenter, Meticulousness in correctly setting the levels of the various control fac- tors is critical to the success of a Robust Design project. Let us clarify what we mean by meticulousness. Going from experiment 3 to experiment 4 we must change tem- perature from (Ty -25) °C to To °C, pressure from (Py +200) mtorr to (P)—200) mtorr, and so on. By meticulousness we mean ensuring that the temperature, pressure, and other dials are set to their proper levels. Failure to set the level of a factor correctly could destroy the valuable property of orthogonality. Consequently, conclu- sions from the experiment could be erroneous. However, if an inherent error in the equipment leads to an actual temperature of (To- 1) °C or (Ty +2) *C when the dial is set at To °C, we should not bother to correct for such variations. Why? Because unless we plan to change the equipment, such variations constitute noise and will con- tinue to be present during manufacturing. 
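The balance property of the array in Table 4.2 (for every pair of columns, every combination of levels occurs equally often) is also easy to verify by machine. The Python sketch below defines such a check; to keep the listing short it is applied to the smaller standard L9(3^4) array rather than retyping the L18, but the same function applies to the matrix formed by the assigned columns of Table 4.2. The function name is mine.

```python
from collections import Counter
from itertools import combinations

def is_orthogonal(array):
    """True if, for every pair of columns, every combination of levels occurs
    the same number of times (the balance property of an orthogonal array)."""
    n_cols = len(array[0])
    for c1, c2 in combinations(range(n_cols), 2):
        counts = Counter((row[c1], row[c2]) for row in array)
        levels1 = {row[c1] for row in array}
        levels2 = {row[c2] for row in array}
        all_pairs_present = len(counts) == len(levels1) * len(levels2)
        equally_often = len(set(counts.values())) == 1
        if not (all_pairs_present and equally_often):
            return False
    return True

# Standard L9(3^4) orthogonal array: rows are experiments, entries are factor levels.
L9 = [
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
]

print(is_orthogonal(L9))                        # True
print(is_orthogonal([row[:3] for row in L9]))   # dropping a column preserves balance
print(is_orthogonal(L9[:-1]))                   # dropping a row destroys it -> False
```

A mechanical check of this kind guards against clerical errors in preparing the design matrix and the log sheet; the inherent equipment variations discussed above are a different matter and are deliberately left alone.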
If our conclusions from the matrix experi- ment are to be valid in actual manufacturing, our results must not be sensitive to such inherent variations. By keeping these variations out of our experiments, we lose the ability to test for robustness against such variations. The matrix experiment, coupled with the verification experiment, has a built-in check for sensitivity to such inherent variations, A difficulty in conducting matrix experiments is their radical difference from the current practice of conducting product or process design experiments, One common practice is to guess, using engineering judgment, the improved settings of the control factors and then conduct a paired comparison with the starting conditions. The guess- and-test cycle is repeated until some minimum improvement is obtained, the deadline is reached, or the budget is exhausted. This practice relies heavily on luck, and it is inefficient and time-consuming. Another common practice is to optimize systematically one control factor at a time. Suppose we wish to determine the effect of the three temperature settings while keeping the settings of the other control factors fixed at their starting levels. To reduce the effect of experimental error, we must process several batches at each temperature 80 Steps in Robust Design Chap. 4 setting. Suppose six batches are processed at each temperature setting. (Note that in the Lg array the replication number is six; that is, there are six experiments for each factor level.) Then, we would need 18 batches to evaluate the effect of three tempera- ture settings. For the other factors, we need to experiment with the two alternate levels, so that we need to process 12 batches each. Thus, for the six factors, we would need to process 18 + 5 x 12 = 78 batches. This is a large number compared to the 18 batches needed for the matrix experiment. Further, if there are strong interactions among the control factors, this method of experimentation cannot detect them. ‘The matrix experiment, though somewhat tedious to conduct, is highly efficient—that is, when compared to the practices above, we can generate more depend- able information about more control factors with the same experimental effort. Also, this method of experimentation allows for the detection of the interactions among the control factors, when they are present, through the verification experiment. In practice, many design improvement experiments, where only one factor is studied at a time, get terminated after studying only a few control factors because both the R&D budget and the experimenter’s patience run out. As a result, the quality improvement tums out to be only partial, and the product cost remains somewhat high, This danger is reduced greatly when we conduct matrix experiments using orthogonal arrays. In the polysilicon deposition case study, the 18 experiments were conducted according to the experimenter’s log given in Table 4.3, It took only nine days (2 experiments per day) to conduct them. The observed data on surface defects are listed in Table 4.4(a), and the thickness and deposition rate data are shown in Table 4.4(b). The surface defects were measured by placing the specimen under an optical micro- scope and counting the defects in a field of 0.2 cm?. 
When the count was high, the field area was divided into smaller areas, defects in one area were counted, and the count was then multiplied by an appropriate number to determine the defect count per unit area (0.2 cm?), The thickness was measured by an optical interferometer. The deposition rate was computed by dividing the average thickness by the deposition time. 4.7 DATA ANALYSIS The first step in data analysis is to summarize the data for each experiment. For the case study, these calculations are illustrated next. For experiment number 1, the S/N ratio for the surface defects, given by Equa- tion (4.1), was computed as follows: , Me Me 1 =-10 logig - ist j Sec. 4.7 Data Analysis 81 2 2) 4.092 2442 =-10 e60 |4 +0 +17) +2 eed +17+0*) ==10 logi (:| =0.51. From the thickness data, the mean, variance, and S/N ratio were calculated as follows by using Equations (4.2), (4.3) and (4.4): 9 Me Me See Eq. (4.2) p= ty (2029+ 1975 + 1961) + (1975 + 1934 + 1907) + (1952+ 1941 + 1949) 7 ~|- = 1958.1 A See Eq, (4.3) 0? = ; z 3 ist j= y-wy? t {c029- 1958.1)? ++ + -+(1949— 1958.1)? = 1151.36 (A). 2 W = 10 losio EF 1958.17 1151.36 = 10 logio = 35.22 dB. Steps in Robust Design TABLE 4.4(a) SURFACE DEFECT DATA (DEFECTS/UNIT AREA) Test Wafer 1 Test Wafer 2 Test Wafer 3 Expt. No. | Top | Center | Bottom | Top | Center | Bottom | Top | Center | Bottom 1 1 0 1 2 0 0 1 1 0 2 1] 2 8 | 190) 5 o| 26] 3 1 3 | 3] 35] 106 | 360] 38} 135 | 315; 50] 180 4| 6] 15 6| 17} 2] 1] ts} 40] 18 5 | 1720} 1980 | 2000 | 487| 810 | 400 | 2020] 360) 13, 6 135 360 1620 | 2430| 207 2 | 2500 270 35 7 360} 810 1215 | 1620 7 30 | 1800 720 315 8 | 270| 2730 | sooo | 360) 1 2 | 9999] 225 1 9 | 5000} 1000 | 1000 }3000| 1000 | 1000 | 3000] 2800 | 2000 10 3 0 0 a 0 o 1 0 1 Bie 1 0 1 5 oO 0 1 0 1 2 | 3] 1620] 90} 216] 5 4} 2] 8 3 B 1 25 270 810 16 1 25 3 0 | 3] 21 | 162) 9] 6 1) | 15] 39 AS 450 | 1200 1800 | 2530 | 2080 2080 | 1890 180 25 16 5 6 40 54 oO 8 14 1 1 17 | 1200} 3500 | 3500 | 1000] 3 1 |9999} 600 8 18 | 8000 | 2500 | 3500 | 5000] 1000 | 1000 | s000| 2000 | 2000 Chap. 4 Sec, 4.7 Data Analysis TABLE 4.4(6) THICKNESS AND DEPOSITION RATE DATA ‘Thickness (A) Test Wafer 1 Test Wafer 2 Test Wafer 3 + Expt. No. | Top | Center| Bottom | Top | Center| Bottom |} Top | Center 1 | 2029} 1975 1961 | 1975 | 1934 1907 | 1952} 1941 1949 14.5 2 | 5375} 5191 5242 | 5201 | 5254 5309 | 5323 | 5307 5091 36.6 3 | s99| 5804 | 5874 |152] so1o | sese | 6077| seas | so62 | 414 4 | 2118} 2109 | 2099 | 2140] 2125 2108 | 2149} 2130 | 2111 36.1 | 4102] 4152 | 4174 | 4556| 4504 | 4560 | 5031] soao | 5032 | 73.0 6 | 3022] 2932 | 2913 | 2833 | 2837 2828 | 2934 | 2875 ‘2841 49,5 7 | 3030] 3042 | 3028 | 3486] 3333 | 3389 |3709| 3671 | 3687 | 76.6 8 | 4707| 4472 | 4336 | 4407 4156 | 4094 | 5073] 4898 | 4599 | 105.4 9 | 3859 | 3822 | 3850 | 3871 | 3922 3904 | 4110 | 4067 4110 115.0 10 | 3227) 3205 3242 | 3468 | 3450 3420 | 3599} 3591 3535 24.8 11 | 2521| 2499 | 2499 | 2576 | 2537 | 2512 | 2551 | 2552 | 2570 | 200 12 | 5921] 5766 | 5844 | 5780| soos | sei4 | seai| 5777 | 5743 | 390 13 | 2792| 2752 | 2716 | 2684} 2635 | 2606 | 2765] 2786 | 2773 | 53.1 14 | 2863 | 2835 2859 | 2829 | 2864 | 2839 | 2891 | 2844 2841 45.7 15 | 3218) 3149 3124 | 3261 | 3205 3223 | 3241 | 3189 3197 54.8 16 | 3020} 3008 3016 | 3072 | 3151 3139 | 3235 | 3162 3140 768 17 | 4277] 4150 | 3992 | 3888] 3681 | 3572 | 4593] 4298 | 4219 | 1053 1s |3125| 3119 | 3127 | 3567) 3563 | 3520 | 4120] 4088 | 4138 | 914 a4 Steps in Robust Design Chap. 
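The hand calculations shown above are easy to mechanize. The following Python sketch (an illustration only, not part of the original study) reproduces the experiment 1 summaries from the first rows of Tables 4.4(a) and 4.4(b): the surface-defect S/N ratio and the mean, variance, and S/N ratio of thickness.

```python
import math

# Experiment 1 data from Tables 4.4(a) and 4.4(b): nine observations each
# (three positions on each of three test wafers).
defects   = [1, 0, 1, 2, 0, 0, 1, 1, 0]                               # defects per unit area
thickness = [2029, 1975, 1961, 1975, 1934, 1907, 1952, 1941, 1949]    # Angstroms

# Surface defects: smaller-the-better S/N ratio, Equation (4.1).
mean_sq = sum(y * y for y in defects) / len(defects)
eta_defects = -10.0 * math.log10(mean_sq)

# Thickness: mean, sample variance, and nominal-the-best S/N ratio, Eqs. (4.2)-(4.4).
n = len(thickness)
mu = sum(thickness) / n
var = sum((y - mu) ** 2 for y in thickness) / (n - 1)
eta_thickness = 10.0 * math.log10(mu ** 2 / var)

print(f"surface defects : eta  = {eta_defects:6.2f} dB")      # 0.51 dB
print(f"thickness       : mean = {mu:7.1f} A")                # 1958.1 A
print(f"thickness       : var  = {var:8.2f} A^2")             # 1151.36 A^2
print(f"thickness       : eta' = {eta_thickness:6.2f} dB")    # 35.22 dB
```

The deposition-rate summary, computed next, follows the same pattern.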
4 ‘The deposition rate in the decibel scale for experiment I is given by ‘NY = 10 logyg r? = 20 logig r = 20 logio(14.5) = 23.23 dBam where dBam stands for decibel A /min. The data summary for all 18 experiments was computed in a similar fashion and the results are tabulated in Table 4.5. Observe that the mean thickness for the 18 experiments ranges from 1958 A to 5965 A. But we are least concerned about this variation in the thickness because the average thickness can be adjusted easily by changing the deposition time. During a Robust Design project, what we are most interested in is the S/N ratio, which in this case is a measure of variation in thickness as a proportion of the mean thickness. Hence, no further analysis on the mean thickness was done in the case study, but the mean thickness, of course, was used in computing the deposition rate, which was of interest. After the data for each experiment are summarized, the next step in data analysis is to estimate the effect of each control factor on each of the three characteristics of interest and to perform analysis of variance (ANOVA) as described in Chapter 3. The factor effects for surface defects (n), thickness (1’), and deposition rate (1), and the respective ANOVA are given in Tables 4.6, 4.7, and 4.8, respectively. A sum- mary of the factor effects is tabulated in Table 4.9, and the factor effects are displayed graphically in Figure 4.5, which makes it easy to visualize the relative effects of the various factors on all three characteristics. To assist the interpretation of the factor effects plotted in Figure 4.5, we note the following relationship between the decibel scale and the natural scale for the three characteristics: * An increase in 1) by 6 dB is equivalent to a reduction in the root mean square surface defects by a factor of 2. An increase in | by 20 dB is equivalent to a reduction in the root mean square surface defects by a factor of 10. * The above statements are valid if we substitute 1 or n” for 7, and standard deviation of thickness or deposition rate for root mean square surface defects. The task of determining the best setting for each control factor can become com- plicated when there are multiple characteristics to be optimized. This is because dif- ferent levels of the same factor could be optimum for different characteristics. The quality loss function could be used to make the necessary trade-offs when different characteristics suggest different optimum levels. For the polysilicon deposition case Sec. 4.7 Data Analysis TABLE 4.5 DATA SUMMARY BY EXPERIMENT Surface Deposition Experiment Condition | Defects |_ Thickness Bate Exp) Mast afe fw) ow No. eABCDEeF (dB) |(A) (dB) (dBam) 1 Piriiriid 0.51 | 1958 | 35.22 23.23 2 112.2 2222 | -37.30 | 5255 | 35.76 31.27 3 11333333 | -45.17 | 5965) 36.02 32,34 4 12112233 | -2576 [211 | 422s | aias s }12223311 | 6254} 472 | 2143 | 3727 6 | 12331122 | -6223 [2801 | 3291 | 3.89 7 [13121323 | -s088 | a375 | 2139 | a8 8 |1323.2131|-7169 | 4527) 2284 | 40.46 9 13.3.1 3.21 2 | -68.15 | 3946 | 30.60 4121 10 21133221 -3.47 | 3415 | 26.85 2789 i 21211332 —5.08 | 2535 | 38.80 26.02 2] 21322113] -s485 | 5781 | 3806 | 31.82 13, 22123132 | -49.38 | 2723 | 32.07 34.50 i4 2223 1213 | -3654 | 2852) 43.34 33.20 15, 22312321 | -64.18 | 3201 37.44 34.76 we }23.132312]-2731 | 305) 3186 | 3771 17 23213123) -71L51 | 4074] 22.01 40.45 18 23°32:1231 | -72.00 | 3596 18.42 (39,22 * Empty column is denoted by ¢. 86 Stops in Robust Design Chap. 
4 1) =-10 log,, (mean square surface defects) 25 a —-\--_\--fa N\A AA A-- 50: 75 A, Ar As B, By By C2030, Dy Dz Dy EE, & Fi Fa Fs a8 40: "=10 log, (45 ) for thickness ‘7 Ap Aap Ay Ay Ay By BBs CyC,C; D, Dy Dy E Ep Ey Fy Fe Fy 20: Bam 40: x“ —- AG a 1" = 10 f0g\q (deposition rate)? Ay Ay Ay By Bz By Co Cy Cy Dy Dz Dy Ej Ee Ey Fi Fo Fe Wn Temp. Pressure Nitrogen Silane Settling Cleaning (°C) (mtorr) (sccm) ~— (sccm) Time Method (min) Figure 4.5 Plots of factor effects. Underline indicates starting level. ‘Two-standard- deviation confidence limits are also shown for the starting level. Estimated confidence limits for 1)” are too small to show. Sec. 4.7 Data Analysis a7 study, we can make the following observations about the optimum setting from Fig- ure 4.5 and Table 4.9: Deposition temperature (factor A) has the largest effect on all three characteris- tics, By reducing the temperature from the starting setting of To °C to To -25 °C, n can be improved by {(- 24.23) — (-50.10)} = 26 dB, This is equivalent to a 20-fold reduction in root mean square surface defect count. The effect of this temperature change on thickness uniformity is only (35.12-34.91) = 0.21 dB, which is negligible. But the same temperature change would lead to a reduction in deposition rate by (34.13 28.76) = 5.4 dB, which is approximately a 2-fold reduction in the deposition rate. Thus, temperature can dramatically reduce the surface defect problem, but it also would double the deposition time. Accordingly, there is a trade-off to be made between reducing the quality cost (including the scrap due to high surface defect count) and the number of wafers processed per day by the reactor. Deposition pressure (factor B) has the next largest effect on surface defect and deposition rate. Reducing the pressure from the starting level of Po mtorr to (P~200) mtorr can improve 1) by about 20 dB (a 10-fold reduction in the root mean square surface defect count) at the expense of reducing the deposition rate by 2.75 dBam (37 percent reduction in deposition rate). The effect of pressure on thickness uniformity is very small. Nitrogen flow rate (factor C) has a moderate effect on all three characteristics. The starting setting of No sccm gives the highest S/N ratios for surface defects and thickness uniformity. There is also a possibility of further improving these two S/N ratios by increasing the flow rate of this dilutant gas. This is an impor- tant fact to be remembered for future experiments, The effect of nitrogen flow rate on deposition rate is small compared to the effects of temperature and pres- sure, Silane flow rate (factor D) also has a moderate effect on all three characteristics. Thickness uniformity is the best when silane flow rate is set at (Sg—50) sccm. This can also lead to a small reduction in surface defects and the deposition rate. Settling time (factor E) can be used to achieve about 10 dB improvement in sur- face defects by increasing the time from fo minutes to (f9 +8) minutes. The data indicates that a further increase in the settling time to (fg +16) minutes could negate some of the reduction in surface defect count. However, this change is small compared to the standard deviation of the error; and it is not physically justifiable, Settling time has no effect on the deposition rate and the thickness uniformity. Cleaning method (factor F) has no effect on deposition rate and surface defects. 
But, by instituting some cleaning prior to deposition, the thickness uniformity can be improved by over 6.0 dB (a factor of 2 reduction in standard deviation of 88 Steps in Robust Design Chap. 4 thickness). Cleaning with CM2 or CM; could give the same improvement in thickness uniformity. However, CM cleaning can be performed inside the reac- tor, whereas CM; cleaning must be done outside the reactor. Thus, CM clean- ing is more convenient. From these observations, the optimum settings of factors E and F are obvious, namely E and F. However, for factors A through D, the direction in which the qual- ity characteristics (surface defects and thickness uniformity) improve tend to reduce the deposition rate. Thus, a trade-off between quality loss and productivity must be made in choosing their optimum levels. In the case study, since surface defects were the key quality problem that caused significant scrap, the experimenters decided to take care of it by changing temperature from Az to A. As discussed earlier, this also meant a sub- stantial reduction in deposition rate. Also, they decided to hold the other three factors at their starting levels, namely B>, Cy, and D3. The potential these factors held would TABLE 4.6 ANALYSIS OF SURFACE DEFECTS DATA* Average 1 by Factor Level (dB) Degree of | Sum of | Mean Factor 1 2 3 | Freedom | Squares | Square | F A. Temperature = 24.23 | -50,10 | - 61.76 2 4427 | 2214 | 27 B. Pressure - 2755 ) -47.44 | - 61.10 2 3416 | 1708 | 21 C. Nitrogen 39.03 | - 55.99 | - 41.07 2 130 | sis | 64 D, Silane ~ 39.20 | - 4685 | -50.04 2 372 | 186 | 23 E, Settling time 51.52 | - 4054 | - 4403 2 378 | 189 | 23 F, Cleaning method — 4158 | - 48.95 2 lott | 82 Error 5 40st | 81 Total 7 10192 (Error) ” 69) | @t) * Overall mean 7 = —45.36 dB. Underscore indicates starting level. + Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses. Sec. 4.7 Data Analysis 89 have been used if the confirmation experiment indicated a need to improve the surface defect and thickness uniformity further. Thus, the optimum conditions chosen were: A\BoC 1D 3EoF >. ‘The next step in data analysis is to predict the anticipated improvements under the chosen optimum conditions. To do so, we first predict the S/N ratios for surface defects, thickness uniformity, and deposition rate using the additive model. These ‘computations for the case study are displayed in Table 4.10. According to the table, an improvement in surface defects equal to [-19.84—(~56.69)] = 36.85 dB should be anticipated, which is equivalent to a reduction in the root mean square surface defect count by a factor of 69.6. The projected improvement in thickness uniformity is 36.79-29.95 = 6.84 dB, which implies a reduction in standard deviation by a factor of 2.2. The corresponding change in deposition rate is 29.60— 34.97 5.37 dB, which amounts to a reduction in the deposition rate by a factor of 1.9. TABLE 4.7 ANALYSIS OF THICKNESS DATA* ] Average n’ by Level (B) Degree of | Sum of | Mean Factor i 2 3 | Freedom | Squares | Square | F A. Temperature 35.12 | 34.91 | 2452 2 440 | 220° | 16 B Pressure 31.61 | 30.70 | 3224 2 Ww) 35 C. Nitrogen 3439 | 27.86 | 32.30 2 134 | 67 30 D. Silane 31.68 | 34.70 | 28.17 2 8 | 64 48 B. Settling time 3052 | 3287 | 31.16 2 we | 9 F, Cleaning method | 27.04 | 33.67 | 33.85 2 181 | 905 | 68 Error 5 ser | 192 Total 7 1004 | 59.1 (Error) oe (21) | 3.4) * Overall mean n’ = 31.52 4B. Underscore indicates starting level. 
+ Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses. 90 Steps in Robust Design Chap. 4 TABLE 4.8 ANALYSIS OF DEPOSITION RATE DATA* ‘Average 1” by Factor Level (@Bam) Degree of | Sum of | Mean Factor 1 2 3 Freedom | Squares | Square | F ‘A. Temperature 28.76 | 3413 | 39.46 2 343.1 | i7is | 553 B, Pressure 3203 | 3478 | 35.54 2 aio | 25 | 66 C. Nitrogen 35.29 | 34.25 2 18.7 94 | 30 D, Silane 34.53} 35.61 2 363 | 181 | 38 E, Settling time 3399 | 3430 2 ost | 02 F. Cleaning method | 33.81 | 34.10 | 34.44 2 12 | 06 Error 5 13% | 026 Total "7 a9 | 259 (Error) @ @8) | @30 * Overall mean n” = 34.12 dBam. Underscore indicates starting level. Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses. 4.8 VERIFICATION EXPERIMENT AND FUTURE PLAN Conducting a verification experiment is a crucial final step of a Robust Design project. Its purpose is to verify that the optimum conditions suggested by the matrix experiment do indeed give the projected improvement. If the observed S/N ratios under the optimum conditions are close to their respective predictions, then we conclude that the additive model.on which the matrix experiment was based is a good approximation of the reality. Then, we adopt the recommended optimum conditions for our process or product, as the case may be. However, if the observed S/N ratios under the optimum conditions differ drastically from their respective predictions, there is an evidence of failure of the additive model. There can be many reasons for the failure and, thus, there are many ways of dealing with it. The failure of the additive model generally indicates that choice of the objective function or the S/N ratio is inappropriate, the observed quality characteristic was chosen incorrectly, or the levels of the control fac- tors were chosen inappropriately. The question of how to avoid serious additivity problems by properly choosing the quality characteristic, the S/N ratio, and the control factors and their levels is discussed in Chapter 6. Of course, another way to handle the Sec. 4.8 Verification Experiment and Future Plan 1 TABLE 4.9 SUMMARY OF FACTOR EFFECTS. Surface Defects [ Thickness | Deposition Rate 1 v Li Factor (ae) F | @p) | F | @Bam | F ‘A. Temperature () 14.3 35.12 28.76 5010 | 27) 3491 | 16 | 3413. | 553 61.76 24.52 39.46 B. Pressure (mtorr) 27.55 31.61 32.03 —47.44 21 30.70 ms 34.78 66 -61.10 32.24 35.54 C. Nitrogen (seem) 39.03 3439 3281 -5599 | 64 | 2786] 5.0] 3529 | 30 41.07 32.30 34.25 D. Silane (seem) 39.20 31.68 3221 46.85 23 34.70 48 3453 58 ~50.04 28.17 35.61 E. Settling time (min) wEESE: 30.52 34.06 —40.54 23 32.87 ~ 33,99 - 44.03 31.16 34,30 F. Cleaning method 45.56 27.04 33.81 41.58 | - | 3367) 68} 3410 | - 48.95 33.85 34.44 ‘Overall mean 45.36 31.52 34.12 additivity problem is to study a few key interactions among the control factors in future experiments. Construction of orthogonal arrays that permit the estimation of a few specific interactions, along with all main effects, is discussed in Chapter 7. The verification experiment has two aspects: the first is that the predictions must agree under the laboratory conditions; the second aspect is that the predictions should be valid under actual manufacturing conditions for the process design and under actual field conditions for the product design. 
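The first aspect rests on the additive-model prediction summarized in Table 4.10. As an illustration (not part of the original analysis), the short Python sketch below reproduces that prediction for the surface-defect S/N ratio, using the factor-level averages of Tables 4.6 and 4.9; factor F is left out because its effect on surface defects was pooled with the error. The helper name is mine.

```python
# Factor-level averages of the surface-defect S/N ratio (dB), from Tables 4.6 and 4.9.
overall_mean = -45.36
level_avg = {
    "A": [-24.23, -50.10, -61.76],   # deposition temperature
    "B": [-27.55, -47.44, -61.10],   # deposition pressure
    "C": [-39.03, -55.99, -41.07],   # nitrogen flow
    "D": [-39.20, -46.85, -50.04],   # silane flow
    "E": [-51.52, -40.54, -44.03],   # settling time
    # Factor F (cleaning method) is omitted: its effect was pooled with the error.
}

def predict(levels):
    """Additive model: overall mean plus the deviation contributed by each factor level."""
    eta = overall_mean
    for factor, level in levels.items():
        eta += level_avg[factor][level - 1] - overall_mean
    return eta

starting = {"A": 2, "B": 2, "C": 1, "D": 3, "E": 1}   # A2 B2 C1 D3 E1 (F1)
optimum  = {"A": 1, "B": 2, "C": 1, "D": 3, "E": 2}   # A1 B2 C1 D3 E2 (F2)

print(f"predicted eta, starting condition: {predict(starting):6.2f} dB")                  # -56.69 dB
print(f"predicted eta, optimum condition : {predict(optimum):6.2f} dB")                   # -19.84 dB
print(f"predicted improvement            : {predict(optimum) - predict(starting):6.2f} dB")  # 36.85 dB
```

The same calculation with the thickness and deposition-rate averages yields the other two columns of Table 4.10. Whether these predictions also hold outside the laboratory depends on how the experiment was set up.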
A judicious choice of both the noise factors to be included in the experiment and the testing conditions is essential for the predictions made through the laboratory experiment to be valid under both manufacturing and field conditions. For the polysilicon deposition case study, four batches of 50 wafers containing 3 test wafers were processed under both the optimum condition and under the starting conditions. The results are tabulated in Table 4.11. It is clear that the data agree very well with the predictions about the improvement in the S/N ratios and the deposition rate. So, we could adopt the optimum settings as the new process settings and proceed to implement these settings. 92 Steps in Robust Design Chap. 4 TABLE 4.10 PREDICTION USING THE ADDITIVE MODEL Starting Condition Optimum Condition Contribution? (4B) Contributiont (4B) Surface Deposition Surface Deposition Factor | Setting | Defects | Thickness | Rate | Setting ) Defects | Thickness | Rate at A, | 474} 339 oo A, 21.13 3.60 5.36 B B, | -208 | 0.00 0.66 B, | -208 | 0.00 0.66 c Bi 633 287 =131 CG 633 | 287 -131 D D, 468 3.35 149 Dy 4.68 —3.35 149 Et E, 6.16 0.00 0.00 E, 482 0.00 0.00 mp F, 0.00 ~4.48 0.00 Fy 0.00 215 0.00 Overall 45.36 31.52 34.12 45.36 31.52 34.12 Mean Total -56.69 | 29.95 34.97 1984 | 36.79 29.60 * Indicates the factors whose levels are changed from the starting to the optimum conditions. + By contribution we mean the deviation from the overall mean caused by the particular factor level, TABLE 4.11 RESULTS OF VERIFICATION EXPERIMENT Starting | Optimum Condition | Condition | Improvement Surface rms 600/cm? Tem? Defects ny -55.64B | -16.94B | 38.74B std.dev. | 0.028 0.013, Thickness v 311aB | 37.748 66 dB Deposition rate 60 A/min | 35 A /min Rate nv 35.6 dBam | 30.9 dBam | —4.7 dBam * Standard deviation of thickness is expressed as a fraction of the mean thickness. Sec. 4.9 Summary 93 Follow-up Experiments Optimization of a process or a product need not be completed in a single matrix exper- iment. Several matrix experiments may have to be conducted in sequence before com- pleting a product or process design. The information leamed in one matrix experiment is used to plan the subsequent matrix experiments for achieving even more improve- ment in the process or the product. The factors studied in such subsequent experi- ments, or the levels of the factors, are typically different from those studied in the ear- lier experiments. From the case-study data on the polysilicon deposition process, temperature stood out as the most important factor—both for quality and productivity. The experi- mental data showed that high temperature leads to excessive formation of surface defects and nonuniform thickness. This led to identifying the type of temperature con- troller as a potentially important control factor. The controller used first was an under- damped controller, and, consequently, during the initial period of deposition, the reac- tor temperature rose significantly above the steady-state set-point temperature. It was then decided to try a critically damped controller. Thus, an auxiliary experiment was conducted with two control factors: (1) the type of controller, and (2) the temperature setting. This experiment identified the critically damped controller as being significantly better than the underdamped one. The new controller allowed the temperature setting to be increased to Ty- 10 °C while keeping the surface defect count below | defect/unit area. 
The, higher tempera- ture also led to a deposition rate of 55A /min rather than the 35A /min that was observed in the initial verification experiment. Simultaneously, a standard deviation of thickness equal to 0.007 times the mean thickness was achieved. Range of Applicability In any development activity, it is highly desirable that the conclusions continue to be valid when we advance to a new generation of technology. In the case study of the polysilicon deposition process, this means that having developed the process with 4- inch wafers, we would want it to be valid when we advance to 5-inch wafers. ‘The process developed for one application should be valid for other applications. Processes and products developed by the Robust Design method generally possess this charac- teristic of design transferability. In the case study, going from 4-inch wafers to 5-inch wafers was achieved by making minor changes dictated by the thermal capacity calcu- lations. Thus, a significant amount of development effort was saved in transferring the process to the reactor that handled 5-inch wafers. 4.9 SUMMARY Optimizing the product or process design means determining the best architecture, lev- els of control factors, and tolerances. Robust Design is a methodology for finding the 94 Steps in Robust Design Chap. 4 optimum settings of control factors to make the product or process insensitive to noise factors. It involves eight major steps which can be grouped as planning a matrix experiment to determine the effects of the control factors (Step 1 through 5), conduct- ing the matrix experiment (Step 6), and analyzing and verifying the results (Steps 7 and 8). * Step 1. Identify the main function, side effects and failure modes. This step requires engineering knowledge of the product or process and the customer's environment, Step 2. Identify noise factors and testing conditions for evaluating the quality Joss. The testing conditions are selected to capture the effect of the more impor- tant noise factors. It is important that the testing conditions permit a consistent estimation of the sensitivity to noise factors for any combination of control factor levels. In the polysilicon deposition case study, the effect of noise factors was captured by measuring the quality characteristics at three specific locations on each of three wafers, appropriately placed along the length of the tube. Noise orthogonal array and compound noise factor are two common techniques for con- structing testing conditions. These techniques are discussed in Chapter 8. Step 3. Identify the quality characteristic to be observed and the objective func- tion to be optimized. Guidelines for selecting the quality characteristic and the objective function, which is generically called S/N ratio, are given in Chapters 5 and 6. The common temptation of using the percentage of products that meet the specification as the objective function to be optimized should be avoided. It leads to orders of magnitude reduction in efficiency of experimentation. While optimizing manufacturing processes, an appropriate throughput characteristic should also be studied along with the quality characteristics because the econom- ics of the process is determined by both of them. Step 4. Identify the control factors and their alternate levels. The more complex a product or a process, the more control factors it has and vice versa. Typically, six to eight control factors are chosen at a time for optimization. 
For each con- trol factor two or three levels are selected, out of which one level is usually the starting level, The levels should be chosen sufficiently far apart to cover a wide experimental region because sensitivity to noise factors does not usually change with small changes in control factor settings. Also, by choosing a wide experi- mental region, we can identify good regions, as well as bad regions, for control factors. Chapter 6 gives additional guidelines for choosing control factors and their levels. In the polysilicon deposition case study, we investigated three levels each of six control factors. One of these factors (cleaning method) had discrete levels. For four of the factors the ratio of the largest to the smallest levels was between three and five. * Step §. Design the matrix experiment and define the data analysis procedure. Using orthogonal arrays is an efficient way to study the effect of several control factors simultaneously. The factor effects thus obtained are valid over the Sec.4.9 Summary 95, experimental region and it provides a way to test for the additivity of the factor effects. The experimental effort needed is much smaller when compared to other methods of experimentation, such as guess and test (trial and error), one factor at a time, and full factorial experiments, Also, the data analysis is easy when orthogonal arrays are used. The choice of an orthogonal array for a particular project depends on the number of factors and their levels, the convenience of changing the levels of a particular factor, and other practical considerations. Methods for constructing a suitable orthogonal array are given in Chapter 7. The orthogonal array Lg, consisting of 18 experiments, was used for the polysilicon deposition study. The array Lg happens to be the most commonly used array because it can be used to study up to seven 3-level and one 2-level factors. Step 6. Conduct the matrix experiment. Levels of several control factors must be changed when going from one experiment to the next in a matrix experiment. Meticulousness in correctly setting the levels of the various control factors is essential—that is, when a particular factor has to be at level 1, say, it should not be set at level 2 or 3. However, one should not worry about small perturbations that are inherent in the experimental equipment. Any erroneous experiments or missing experiments must be repeated to complete the matrix. Errors can be avoided by preparing the experimenter’s log and data sheets prior to conducting the experiments. This also speeds up the conduct of the experiments significantly. The 18 experiments for the polysilicon deposition case study were completed in 9 days, Step 7. Analyze the data, determine optimum levels for the control factors, and predict performance under these levels. The various steps involved in analyzing the data resulting from matrix experiments are described in Chapter 3. S/N ratios and other summary statistics are first computed for each experiment, (In Robust Design, the primary focus is on maximizing the S/N ratio.) Then, the factor effects are computed and ANOVA performed. The factor effects, along with their confidence intervals, are plotted to assist in the selection of their optimum levels. When a product or a process has multiple quality characteris- tics, it may become necessary to make some trade-offs while choosing the optimum factor levels. The observed factor effects together with the quality loss function can be used to make rational trade-offs. 
In the polysilicon case study, the data analysis indicated that levels of three factors—deposition temperature (A), settling time (E), and cleaning method (F)—be changed, while the levels of the other five factors be kept at their starting levels. Step 8. Conduct the verification (confirmation) experiment and plan future actions. The purpose of this final and crucial step is to verify that the optimum conditions suggested by the matrix experiments do indeed give the projected improvement. If the observed and the projected improvements match, we adopt the suggested optimum conditions. If not, then we conclude that the additive model underlying the matrix experiment has failed, and we find ways to correct that problem. The corrective actions include finding better quality characteristics, or signal-to-noise ratios, or different control factors and levels, or studying a few Steps in Robust Design Chap. 4 specific interactions among the control factors. Evaluating the improvement in quality loss, defining a plan for implementing the results, and deciding whether another cycle of experiments is needed are also a part of this final step of Robust Design. It is quite common for a product or process design to require more than one cycle of Steps 1 through 8 for achieving needed quality and cost improve- ment. In the polysilicon deposition case study, the verification experiment confirmed the optimum conditions suggested by the data analysis. In a follow up Robust Design cycle, two control factors were studied—deposition temperature and type of temperature controller. The final optimum process gave nearly two orders of magnitude reduction in surface defects and a 4-fold reduction in the standard deviation of the thickness of the polysilicon layer. Chapter 5 SIGNAL-TO-NOISE RATIOS The concept of quadratic loss function introduced in Chapter 2 is ideally suited for evaluating the quality level of a product as it is shipped by a supplier to a customer, “As shipped” quality means that the customer would use the product without any adjustment to it or to the way it is used. Of course, the customer and the supplier could be two departments within the same company. A few common variations of the quadratic loss function were given in Chapter 2. Can we use the quadratic loss function directly for finding the best levels of the control factors? What happens if we do so? What objective function should we use to minim- ize the sensitivity to noise? We examine these and other related questions in this chapter. In particular, we describe the concepts behind the signal-to-noise (S/N) ratio and the rationale for using it as the objective function for optimizing a product or pro- cess design. We identify a number of common types of engineering design problems and describe the appropriate S/N ratios for these problems. We also describe a pro- cedure that could be used to derive S/N ratios for other types of problems. This chapter has six sections: * Section 5.1 discusses the analysis of the polysilicon thickness uniformity. ‘Through this discussion, we illustrate the disadvantages of direct minimization of the quadratic loss function and the benefits of using S/N ratio as the objective function for optimization. * Section 5.2 presents a general procedure for deriving the S/N ratio, 98 Signal-to-Noise Ratios Chap. 5 * Section 5.3 describes common static problems (where the target value for the quality characteristic is fixed) and the corresponding S/N ratios. 
* Section 5.4 discusses common dynamic problems (where the quality characteristic is expected to follow the signal factor) and the corresponding S/N ratios.

* Section 5.5 describes the accumulation analysis method for analyzing ordered categorical data.

* Section 5.6 summarizes the important points of this chapter.

5.1 OPTIMIZATION FOR POLYSILICON LAYER THICKNESS UNIFORMITY

One of the two quality characteristics optimized in the case study of the polysilicon deposition process in Chapter 4 was the thickness of the polysilicon layer. Recall that one of the goals was to achieve a uniform thickness of 3600 Å. More precisely, the experimenters were interested in minimizing the variance of thickness while keeping the mean on target. The objective of many Robust Design projects is to achieve a particular target value for the quality characteristic under all noise conditions. These types of projects were previously referred to as nominal-the-best type problems. The detailed analysis presented in this section will be helpful in formulating such projects. This section discusses the following issues:

* Comparison of the quality of two process conditions
* Relationship between S/N ratio and quality loss after adjustment (Q_a)
* Optimization for different target thickness
* Interaction induced by the wrong choice of objective function
* Identification of a scaling factor
* Minimization of standard deviation and mean separately

Comparing the Quality of Two Process Conditions

Suppose we are interested in determining which is the preferred temperature setting, T₀ °C or (T₀ + 25) °C, for achieving uniform thickness of the polysilicon layer around the target thickness of 3600 Å. We may attempt to answer this question by running a number of batches under the two temperature settings while keeping the other control factors fixed at certain levels. Suppose the observed mean thickness and standard deviation of thickness for these two process conditions are as given in Table 5.1. Although no experiments were actually conducted under these conditions, the data in Table 5.1 are realistic based on experience with the process. This is also true for all other data used in this section. Note that under temperature T₀ °C, the mean thickness is 1800 Å, which is far away from the target, but the standard deviation is small. Whereas under temperature (T₀ + 25) °C, the mean thickness is 3400 Å, which is close to the target, but the standard deviation is large. As we observe here, it is very typical for both the mean and standard deviation to change when we change the level of a factor.

TABLE 5.1  EFFECT OF TEMPERATURE ON THICKNESS UNIFORMITY

  Expt.   Temperature   Mean Thickness   Standard Deviation        Q†
  No.        (°C)           μ (Å)*            σ (Å)
   1        T₀               1800               32             3.241 × 10⁶
   2        T₀ + 25          3400              200             8.000 × 10⁴

  * Target mean thickness = μ₀ = 3600 Å.
  † Q = (μ − μ₀)² + σ².

From the data presented in Table 5.1, which temperature setting can we recommend? Since both the mean and standard deviation change when we change the temperature, we may decide to use the quadratic loss function to select the better temperature setting. For a given mean, μ, and standard deviation, σ, the quality loss without adjustment, denoted by Q, is given by

    Q = quality loss without adjustment = k [(μ − 3600)² + σ²]    (5.1)

where k is the quality loss coefficient.
Note that throughout this chapter we ignore the constant k (that is, set it equal to 1) because it has no effect on the choice of optimum levels for the control factors. The quality loss under T₀ °C is 3.24 × 10⁶, while under (T₀ + 25) °C it is 8.0 × 10⁴. Thus, we may conclude that (T₀ + 25) °C is the better temperature setting. But is that really a correct conclusion?

Recall that the deposition time is a scaling factor for the deposition process; that is, for any fixed settings of all other control factors, the polysilicon thickness at the various points within the reactor is proportional to the deposition time. Of course, the proportionality constant, which is the same as the deposition rate, could be different at different locations within the reactor. This is what leads to the variance, σ², of the polysilicon thickness. We can use this knowledge of the scaling factor to estimate the quality loss after adjusting the mean on target.

For T₀ °C temperature, we can attain the mean thickness of 3600 Å by increasing the deposition time by a factor of 3600/1800 = 2.0. Correspondingly, the standard deviation would also increase by the factor of 3600/1800 to 64 Å. Thus, the estimated quality loss after adjusting the mean is 4.1 × 10³. Similarly, for (T₀ + 25) °C we can obtain 3600 Å thickness by increasing the deposition time by a factor of 3600/3400, which would result in a standard deviation of 212 Å. Thus, the estimated quality loss after adjusting the mean is 4.49 × 10⁴. From these calculations it is clear that when the mean is adjusted to be on target, the quality loss for T₀ °C is an order of magnitude smaller than the quality loss for (T₀ + 25) °C; that is, the sensitivity to noise is much less when the deposition temperature is T₀ °C as opposed to (T₀ + 25) °C. Hence, T₀ °C is the preferred temperature setting.

A decision based on the quality loss without adjustment (Q) is influenced not only by the sensitivity to noise (σ), but also by the deviation from the target mean (μ − μ₀). Often, such a decision is heavily influenced, if not dominated, by the deviation from the target mean. As a result, we risk the possibility of not choosing the factor level that minimizes sensitivity to noise. This, of course, is clearly undesirable. But when we compute the quality loss after adjustment, denoted by Q_a, for all practical purposes we eliminate the effect of the change in mean. In fact, it is a way of isolating the sensitivity to noise factors. Thus, a decision based on Q_a minimizes the sensitivity to noise, which is what we are most interested in during Robust Design.

Relationship between S/N Ratio and Q_a

The general formula for computing the quality loss after adjustment for the polysilicon thickness problem, which is a nominal-the-best type problem, can be derived as follows. If the observed mean thickness is μ, we have to increase the deposition time by a factor of μ₀/μ to get the mean thickness on target. The predicted standard deviation after adjusting the mean on target is (μ₀/μ)σ, where σ is the observed standard deviation. So, we have

    Q_a = quality loss after adjustment = k [(μ₀/μ) σ]² .    (5.2)

We can rewrite Equation (5.2) as follows:

    Q_a = k μ₀² (σ²/μ²) .    (5.3)

Since in a given project k and μ₀ are constants, we need to focus our attention only on (μ²/σ²). We call (μ²/σ²) the S/N ratio because σ² is the effect of the noise factors and μ is the desirable part of the thickness data.
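To make this concrete, the short Python sketch below recomputes Q, Q_a, and the decibel form of the S/N ratio defined in Equation (5.4) below for the two temperature settings of Table 5.1, taking k = 1 and μ₀ = 3600 Å. The numbers are those of the case study; the code itself is only an illustration.

```python
import math

# Data of Table 5.1: mean and standard deviation of polysilicon thickness (Angstroms).
target = 3600.0
settings = {
    "T0":      {"mean": 1800.0, "std": 32.0},
    "T0 + 25": {"mean": 3400.0, "std": 200.0},
}

for name, s in settings.items():
    mu, sigma = s["mean"], s["std"]
    Q = (mu - target) ** 2 + sigma ** 2          # loss without adjustment, Eq. (5.1), k = 1
    Qa = (target / mu * sigma) ** 2              # loss after adjustment, Eqs. (5.2)-(5.3)
    eta = 10.0 * math.log10(mu ** 2 / sigma ** 2)  # S/N ratio in decibels, Eq. (5.4)
    print(f"{name:8s}  Q = {Q:10.3g}   Qa = {Qa:9.3g}   eta = {eta:5.1f} dB")
```

Q favors (T₀ + 25) °C, while Q_a and the S/N ratio both favor T₀ °C, in agreement with the discussion above.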
Maximizing (μ²/σ²) is equivalent to minimizing the quality loss after adjustment, given by Equation (5.3), and also equivalent to minimizing sensitivity to noise factors.

For improved additivity of the control factor effects, it is common practice to take the log transform of (μ²/σ²) and express the S/N ratio in decibels:

    η = 10 log₁₀ (μ²/σ²) .    (5.4)

Although it is customary to refer to both (μ²/σ²) and η as the S/N ratio, it is clear from the context which one we mean. The range of values of (μ²/σ²) is (0, ∞), while the range of values of η is (−∞, ∞). Thus, in the log domain, we have better additivity of the effects of two or more control factors. Since log is a monotone function, maximizing (μ²/σ²) is equivalent to maximizing η.

Optimization for Different Target Thicknesses

Using the S/N ratio rather than the mean square deviation from target as an objective function has one additional advantage. Suppose for a different application of the polysilicon deposition process, such as manufacturing a new code of microchips, we want to have a 3000 Å target thickness. Then, the optimum conditions obtained by maximizing the S/N ratio would still be valid, except for adjustment of the mean. However, the same cannot be said if we used the mean square deviation from target as the objective function; we would have to perform the optimization again.

The problem of minimizing the variance of thickness while keeping the mean on target is a problem of constrained optimization. As discussed in Appendix B, by using the S/N ratio, the problem can be converted into an unconstrained optimization problem that is much easier to solve. The property of unconstrained optimization is the basis for our ability to separate the actions of minimizing sensitivity to noise factors by maximizing the S/N ratio and the adjustment of mean thickness on target.

When we advance from one technology of integrated circuit manufacturing to a newer technology, we must produce thinner layers, print and etch smaller width lines, etc. With this in mind, it is crucial that we focus our efforts on reducing sensitivity to noise by optimizing the S/N ratio. The mean can then be adjusted to meet the desired target. This flexible approach to process optimization is needed not only for integrated circuit manufacturing, but also for virtually all manufacturing processes and optimization of all product designs.

During product development, the design of subsystems and components must proceed in parallel. Even though the target values for various characteristics of the subsystems and components are specified at the beginning of the development activity, it often becomes necessary to change the target values as more is learned about the product. Optimizing the S/N ratio gives us the flexibility to change the target later in the development effort. Also, the reusability of the subsystem design for other applications is greatly enhanced. Thus, by using the S/N ratio we improve the overall productivity of the development activity.

Interactions Induced by Wrong Choice of Objective Function

Using the quality loss without adjustment as the objective function to be optimized can also lead to unnecessary interactions among the control factors. To understand this point, let us consider again the data in Table 5.1. Suppose the deposition time for the two experiments in Table 5.1 was 36 minutes.
Now suppose we conducted two more experiments with 80 minutes of deposition time and temperatures of T₀ °C and (T₀ + 25) °C. Let the data for these two experiments be as given in Table 5.2. For ease of comparison, the data from Table 5.1 are also listed in Table 5.2.

TABLE 5.2  INTERACTIONS CAUSED BY THE MEAN

  Expt.   Temperature   Deposition    Mean Thickness   Standard Deviation        Q†              Q_a‡
  No.        (°C)       Time (min)        μ (Å)*            σ (Å)
   1        T₀              36             1800               32            3.241 × 10⁶     4.096 × 10³
   2        T₀ + 25         36             3400              200            8.000 × 10⁴     4.484 × 10⁴
   3        T₀              80             4000               70            1.649 × 10⁵     3.969 × 10³
   4        T₀ + 25         80             7550              440           15.796 × 10⁶     4.402 × 10⁴

  * Target mean thickness = μ₀ = 3600 Å.
  † Q = (μ − μ₀)² + σ².
  ‡ Q_a = μ₀² σ²/μ².

The quality loss without adjustment is plotted as a function of temperature for the two values of deposition time in Figure 5.1(a). We see that for 36 minutes of deposition time, (T₀ + 25) °C is the preferred temperature, whereas for 80 minutes of deposition time the preferred temperature is T₀ °C. Such opposite conclusions about the optimum levels of control factors (called interactions) are a major source of confusion and inefficiency in experimentation for product or process design improvement. Not only is the estimation of interaction expensive, but the estimation might not yield the true optimum settings for the control factors; that is, if there are strong antisynergistic interactions among the control factors, we risk the possibility of choosing a wrong combination of factor levels for the optimum conditions. In this example, based on Q, we would pick the combination of (T₀ + 25) °C and 36 minutes as the best combination. But, if we use the S/N ratio or Q_a as the objective function, we would unambiguously conclude that T₀ °C is the preferred temperature [see Figure 5.1(b)].

[Figure 5.1  Interactions caused by the mean. Panel (a): when Q is the objective function, the control factors temperature and time have a strong antisynergistic interaction. Panel (b): when Q_a is the objective function, there is no interaction between temperature and time; since time is a scaling factor, the curves for 36 min and 80 min deposition time are almost overlapping. Panel (c): much of the interaction in (a) is caused by the deviation of the mean from the target. The vertical axes show 10 log₁₀ of each quantity; the horizontal axis is temperature (T₀ versus T₀ + 25).]

The squared deviation of the mean from the target thickness is a component of the objective function Q [see Equation (5.1)]. This component is plotted in Figure 5.1(c). From the figure it is obvious that the interaction revealed in Figure 5.1(a) is primarily caused by this component. The objective function Q_a does not have the squared deviation of the mean from the target as a component. Consequently, the corresponding interaction, which unnecessarily complicates the decision process, is eliminated.

In general, if we observe that for a particular objective function the interactions among the control factors are strong, we should look for the possibility that the objective function may have been selected incorrectly. The possibility exists that the objective function did not properly isolate the effect of noise factors and that it still has the deviation of the product's mean function from the target as a component.
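The flip-flop visible in Figure 5.1 can also be checked directly from Table 5.2. The Python sketch below (an illustration only; the helper names are mine) asks, for each deposition time, which temperature each criterion prefers, taking k = 1 and μ₀ = 3600 Å.

```python
target = 3600.0
# Rows of Table 5.2: temperature label, deposition time (min), mean (A), std (A).
table_5_2 = [
    ("T0",      36, 1800.0,  32.0),
    ("T0 + 25", 36, 3400.0, 200.0),
    ("T0",      80, 4000.0,  70.0),
    ("T0 + 25", 80, 7550.0, 440.0),
]

def Q(mu, sigma):        # quality loss without adjustment, Eq. (5.1), k = 1
    return (mu - target) ** 2 + sigma ** 2

def Qa(mu, sigma):       # quality loss after adjustment, Eq. (5.3), k = 1
    return target ** 2 * sigma ** 2 / mu ** 2

for time in (36, 80):
    rows = [r for r in table_5_2 if r[1] == time]
    best_by_Q  = min(rows, key=lambda r: Q(r[2], r[3]))[0]
    best_by_Qa = min(rows, key=lambda r: Qa(r[2], r[3]))[0]
    print(f"{time} min: Q prefers {best_by_Q:8s}  Qa prefers {best_by_Qa}")
```

Judged by Q, the preferred temperature changes from (T₀ + 25) °C at 36 minutes to T₀ °C at 80 minutes; judged by Q_a, T₀ °C is preferred at both deposition times, so the apparent interaction disappears.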
5 Identification of a Scaling Factor In the polysilicon deposition case study, the deposition time is an easily identified scal- ing factor. However, in many situations where we want to obtain mean on target, the scaling factor cannot be identified readily. How should we determine the best settings of the control factors in such situations? It might, then, be tempting to use the mean squared deviation from the target as the objective function to be minimized. However, as explained earlier, minimizing the mean squared deviation from the target can lead to wrong conclusions about the optimum levels for the control factors; so, the temptation should be avoided. Instead, we should begin with an assumption that a scaling factor exists and identify such a fac- tor through experiments, The objective function to be maximized, namely n, can be computed from the observed j1 and 6 without knowing which factor is a scaling factor. Also, the scaling operation does not change the value of 1. Thus, the process of discovering a scaling factor and the optimum levels for the various control factors is a simple one. It con- sists of determining the effect of every control factor on | and y, and then classifying these factors as follows: 1. Factors that have a significant effect on 1, For these factors, we should pick the levels that maximize n. 2. Factors that have a significant effect on but practically no effect on. Any one of these factors can serve as a scaling factor. We use one such factor to adjust the mean on target. We are generally successful in finding at least one scaling factor. However, sometimes we must settle for a factor that has a small effect on 1) as a scaling factor. 3. Factors that have no effect on 1 and no effect on t. These are neutral factors and we can choose their best levels from other considerations such as ease of operation or cost. Minimizing Standard Deviation and Mean Separately Another way to approach the problem of minimizing variance with the constraint that the mean should be on target is, first, t0 minimize standard deviation while ignoring the mean, and, then, bring the mean on target without affecting the standard deviation by changing a suitable factor. The difficulty with this approach is that often we cannot find a factor that can change the mean over a wide range without affecting the stan- dard deviation. This can be understood as follows: In these problems, when the mean is zero, the standard deviation is also zero. However, for all other mean values, the standard deviation cannot be identically zero. Thus, whenever a factor changes the it also affects the standard deviation. Also, an attempt to minimize standard deviation without paying attention to the mean drives both the standard deviation and Sec. 5.2 Evaluation of Sensitivity to Noise 105 the mean to zero, which is not a worthwhile solution. Therefore, we should not try to minimize 6 without paying attention to the mean. However, we can almost always find a scaling factor. Thus, an approach where we maximize the S/N ratio leads to useful solutions. Note that the above discussion pertains to the class of problems called nominal- the-best type problems, of which polysilicon thickness uniformity is an example. A class of problems called signed-target type problems where it is appropriate to first minimize variance and then bring the mean on target is described in Section 5.3. 
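Returning to the nominal-the-best case, the role of the scaling factor can be seen in a two-line calculation. The sketch below uses the T₀ row of Table 5.1 and treats deposition time as the scaling factor (an assumption carried over from the case study): scaling the process changes μ and σ together, so minimizing σ alone would drive both toward zero, whereas the S/N ratio is unchanged by the scaling and therefore isolates the sensitivity to noise.

```python
import math

# T0 row of Table 5.1.  A scaling factor such as deposition time changes the mean
# and the standard deviation in proportion, so sigma alone is a poor objective,
# while the S/N ratio mu^2/sigma^2 is unaffected by the scaling.
mu, sigma = 1800.0, 32.0
for scale in (0.5, 1.0, 2.0):          # e.g., halving or doubling the deposition time
    mu_s, sigma_s = scale * mu, scale * sigma
    eta = 10.0 * math.log10(mu_s ** 2 / sigma_s ** 2)
    print(f"scale {scale:3.1f}: mean = {mu_s:6.0f} A, sigma = {sigma_s:5.1f} A, eta = {eta:5.2f} dB")
```

This invariance is what allows the two-step approach described above: maximize η first, then use the scaling factor to put the mean on target.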
5.2 EVALUATION OF SENSITIVITY TO NOISE

Let us now examine the general problem of evaluating sensitivity to noise for a dynamic system. Recall that in a dynamic system the quality characteristic is expected to follow the signal factor. The ideal function for many products can be written as

    y = M    (5.5)

where y is the quality characteristic (or the observed response) and M is the signal (or the command input). In this section we discuss the evaluation of sensitivity to noise for such dynamic systems.

For specificity, suppose we are optimizing a servomotor (a device such as an electric motor whose movement is controlled by a signal from a command device) and that y is the displacement of the object that is being moved by the servomotor and M specifies the desired displacement. To determine the sensitivity of the servomotor, suppose we use the signal values M₁, M₂, ..., M_m; and for each signal value, we use the noise conditions x₁, x₂, ..., x_n. Let y_ij denote the observed displacement for a particular value of the control factor settings, z = (z₁, z₂, ..., z_q), when the signal is M_i and the noise is x_j. Representative values of y_ij and the ideal function are shown in Figure 5.2. The average quality loss, Q(z), associated with the control factor settings, z, is given by

    Q(z) = (k / mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − M_i)² .    (5.6)

As shown by Figure 5.2, Q(z) includes not only the effect of noise factors but also the deviation of the mean function from the ideal function. In practice, Q(z) could be dominated by the deviation of the mean function from the ideal function. Thus, the direct minimization of Q(z) could fail to achieve truly minimum sensitivity to noise. It could lead simply to bringing the mean function on target, which is not a difficult problem in most situations anyway. Therefore, whenever adjustment is possible, we should minimize the quality loss after adjustment.

[Figure 5.2  Evaluation of sensitivity to noise. Observed values y_ij are plotted against the signal values M₁, M₂, M₃, together with the ideal function y = M and the observed mean function.]

For the servomotor, it is possible to adjust a gear ratio so that, referring to Figure 5.2, the slope of the observed mean function can be made equal to the slope of the ideal function. Let the slope of the observed mean function be β. By changing the gear ratio we can change every displacement y_ij to v_ij = (1/β) y_ij. This brings the mean function on target.

For the servomotor, the change of gear ratio leads to a simple linear transformation of the displacement y_ij. In some products, however, the adjustment could lead to a more complicated function between the adjusted value v_ij and the unadjusted value y_ij. For a general case, let the effect of the adjustment be to change each y_ij to a value v_ij = h_R(y_ij), where the function h_R defines the adjustment that is indexed by a parameter R. After adjustment, we must have the mean function on target; that is, the errors (v_ij − M_i) must be orthogonal to the signal M_i. Mathematically, the requirement of orthogonality can be written as

    Σ_{i=1}^{m} Σ_{j=1}^{n} (v_ij − M_i) M_i = 0 .    (5.7)

Equation (5.7) can be solved to determine the best value of R for achieving the mean function on target. Then the quality loss after adjustment, Q_a(z), can be evaluated as follows:

    Q_a(z) = (k / mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (v_ij − M_i)² .    (5.8)

The quantity Q_a(z) is a measure of sensitivity to noise. It does not contain any part that can be reduced by the chosen adjustment process. However, any systematic part of the relationship between y and M that cannot be adjusted is included in Q_a(z).
[For the servomotor, the nonlinearity (2nd, 3rd, and higher order terms) of the relationship between y and M are contained in Q,(z).] Minimization of Q,(2) makes the design robust against the noise factors and reduces the nonadjustable part of the relationship between y and M. Any control factor that has an effect on yj; but has no effect on Q,(2) can be used to adjust the mean function on target without altering the sensitivity to noise, which has already been minimized. Such a control factor is called an adjust- ment factor. It is easy to verify that minimization of Q,(2), followed by adjusting the mean function on target using an adjustment factor, is equivalent to minimizing Q(z) subject to the constraint that the mean function is on target. This optimization procedure is called a two-step procedure for obvious reasons. For further discussion of the 2-step procedure and the S/N ratios, see Taguchi and Phadke [T6], Phadke and Dehnad [P4], Leon, Shoemaker, and Kackar [L2], Nair and Pregibon [N2], and Box [B1]. It is important to be able to predict the combined effect of several control factors from the knowledge of the effects of the individual control factors. The natural scale of Q,(2) is not suitable for this purpose because it could easily give us a negative pred- iction for Q,(2) which is absurd. By using the familiar decibel scale, we not only avoid the negative prediction, but also improve the additivity of the factor effects. Thus, to minimize the sensitivity to noise factors, we maximize 1) which is given by 1 = -10 logio Qq(z). 6.9) Note that the constant k in Q,(2) and sometimes some other constants are generally ignored because they have no effect on the optimization, 108 Signal-to-Noise Ratios Chap. 5 Following Taguchi, we refer to 1) as the S/N ratio. In the polysilicon deposition example discussed in Section 5.1, we saw that Q, a (u2/07), where 0? is the effect of the noise factors, and p is the desirable part of the thickness data. Thus Q, is the ratio of the power of the signal (the desirable part) to the power of the noise (the undesirable part), As will be seen through the cases discussed in the subsequent sections of this chapter, whenever a scaling type of an adjustment factor exists, Qg takes the form of a ratio of the power of the signal (the desirable part of the response) to the power of the noise (the undesirable part of the response). Therefore, Q, and 7) are both referred to as the S/N ratio. As a matter of convention, we call Q, and 7 the S/N ratio, even in other cases where “ratio” form is not that apparent. The general optimization strategy can be summarized as follows: 1. Evaluate the effects of the control factors under consideration on 7) and on the mean function. 2. For the factors that have a significant effect on 1, select levels that maximize 1. 3. Select any factor that has no effect on 7) but a significant effect on the mean function as an adjustment factor. In practice, we must sometimes settle for a fac- tor that has a small effect on 1 but a significant effect on the mean function as an adjustment factor. Use the adjustment factor to bring the mean function on target. Adjusting the mean function on target is the main quality control activity in manufacturing. It is needed because of changing raw material, varying pro- cessing conditions, etc. Thus, finding an adjustment factor that can be changed conveniently during manufacturing is important. 
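Before taking up the individual static problems, here is a small numerical illustration, with hypothetical loss values, of the earlier remark that additive predictions on the natural scale of Q_a can turn negative, whereas predictions on the decibel scale eta = -10 log10 Q_a remain meaningful and correspond to multiplying the individual improvement ratios.

```python
# Hypothetical numbers: additive prediction of Q_a in the raw scale versus the
# decibel scale eta = -10*log10(Q_a).
import math

Q_start, Q_A, Q_B = 0.040, 0.010, 0.008   # loss at start, with A improved, with B improved

# Additive prediction in the raw scale can be negative, which is absurd:
Q_raw_pred = Q_start + (Q_A - Q_start) + (Q_B - Q_start)          # = -0.022

def eta(Q):
    return -10 * math.log10(Q)

# The same prediction in decibels stays meaningful; it is multiplicative in Q:
eta_pred = eta(Q_start) + (eta(Q_A) - eta(Q_start)) + (eta(Q_B) - eta(Q_start))
Q_db_pred = 10 ** (-eta_pred / 10)                                # = 0.002
print(Q_raw_pred, round(eta_pred, 2), round(Q_db_pred, 4))
```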
However, finding the level of the adjustment factor that brings the mean precisely on target during product or process design is not important. 4. For factors that have no effect on 7 and the mean function, we can choose any level that is most convenient from the point of view of other considerations, such as other quality characteristics and cost. What adjustment is meaningful in a particular engineering problem and what fac- tor can be used to achieve the adjustment depend on the nature of the particular prob- lem. Subsequent sections discuss several common engineering problems and derive the appropriate S/N ratios using the results of this section. 5.3 S/N RATIOS FOR STATIC PROBLEMS Finding a correct objective function to maximize in an engineering design problem is very important. Failure to do so, as we saw earlier, can lead to great inefficiencies in experimentation and even wrong conclusions about the optimum levels, The task of finding what adjustments are meaningful in a particular problem and determining the correct S/N ratio is not always easy. Here, we describe some common types of static problems and the corresponding S/N ratios, Sec. 5.3 ‘S/N Ratios for Static Problems 109 Minimizing the surface defect count and achieving target thickness in polysilicon deposition are both examples of static problems. In each case, we are interested in a fixed target, so that the signal factor is trivial, and for all practical purposes, we can say it is absent. In contrast, the design of an electrical amplifier is a dynamic problem in which the input signal is the signal factor and our requirement is to make the output signal proportional to the input signal. The tracking of the input signal by the output signal makes it a dynamic problem. We discuss dynamic problems in Section 5.4. Static problems can be further characterized by the nature of the quality charac- teristic. Recall that the response we observe for improving quality is called quality characteristic. The classification of static problems is based on whether the quality characteristic is: * Continuous or discrete * Scalar, vector, or curve (such as frequency response function) * Positive or covers the entire real line + Such that the target value is extreme or finite Commonly encountered types of static problems and the corresponding S/N ratios are described below (see also Taguchi and Phadke [T6] and Phadke and Dehnad [P4)). In these problems, the signal factor takes only one value. Thus, we denote by Yis ¥ar‘t*+ Jn the n observations of the quality characteristic under different noise conditions. (a) Smaller-the-better Type Problem Here, the quality characteristic is continuous and nonnegative—that is, it can take any value from 0 to eo. Its most desired value is zero, Such problems are characterized by the absence of a scaling factor or any other adjustment factor. The surface defect count is an example of this type of problem. Note that for all practical purposes we can treat this count as a continuous variable. Another example of a smaller-the-better type problem is the pollution from a power plant. One might say that we can reduce the total pollutants emitted by reduc- ing the power output of the plant. So why not consider the power output as an adjust- ment factor? However, reducing pollution by reducing power consumption does not signify any quality improvement for the power plant. Hence, it is inappropriate to think of the power output as an adjustment factor. 
In fact, we should consider the pol- lution per megawatt-hour of power output as the quality characteristic to be improved instead of the pollution itself. Additional examples of smaller-the-better type problems are electromagnetic radi- ation from telecommunications equipment, leakage current in integrated circuits, and corrosion of metals and other materials. 110 Signal-to-Noise Ratios Chap. 5 Because there is no adjustment factor in these problems, we should simply minimize the quality loss without adjustment—that is, we should minimize @Q = k(mean square quality characteristic) aki F ye ~4{2 PP (5.10) isl ing Q is equivalent to maximizing 1) defined by the following equation, 1 = — 10 logy (mean square quality characteristic) 142 =~ 10 logio |= & y?]- (5.1) a Note that we have ignored the constant & and expressed the quality loss in the decibel scale, In this case the signal is constant, namely to make the quality characteristic equal to zero, Therefore, the S/N ratio, 7, measures merely the effect of noise. (b) Nominal-the-best Type Problem Here, as in smaller-the-better type problem, the quality characteristic is continuous and nonnegative—that is, it can take any value from 0 to ee. Its target value is nonzero and finite, For these problems when the mean becomes zero, the variance also becomes zero. Also, for these problems we can find a scaling factor that can serve as an adjust- ment factor to move the mean on target. This type of problem occurs frequently in engineering design. We have already discussed the problem in great detail with particular reference to achieving target thick- ness in polysilicon deposition. The objective function to be maximized for such prob- lems is wt n= 10 logo (5.12) Sec. 5.3 SIN Ratios for Static Problems 1 where z oO - By. (5.13) In some situations, the scaling factor can be identified readily through engineer- ing expertise. In other situations, we can identify a suitable scaling factor through experimentation. The optimization of the nominal-the-best problems can be accomplished in two steps: 1. Maximize 1) or minimize sensitivity to noise. During this step we select the lev- els of control factors to maximize 1) while we ignore the mean. 2. Adjust the mean on target. During this step we use the adjustment factor to bring the mean on target without changing 1. Note that as we explained in Section 5.1, we should not attempt to minimize o and then bring the mean on target. (c) Larger-the-better Type Problem Here, the quality characteristic is continuous and nonnegative, and we would like it to be as large as possible. Also, we do not have any adjustment factor. Examples of such problems are the mechanical strength of a wire per unit cross-section area, the miles driven per gallon of fuel for an automobile carrying a certain amount of load, etc. This problem can be transformed into a smaller-the-better type problem by consid- ering the reciprocal of the quality characteristic. The objective function to be maxim- ized in this case is given by 1) =— 10 logyo (mean square reciprocal quality characteristic) 1 | (8.14) 1 =-10 logo |L & “ere [ y The following questions are often asked about the larger-the-better type prob- lems: Why do we take the reciprocal of a larger-the-better type characteristic and then treat it as a smaller-the-better type characteristic? Why do we not maximize the mean square quality characteristic? 
This can be understood from the following result from mathematical statistics: Mean square reciprocal quality characteristic = |1 + 3 w 112 Signal-to-Noise Ratios Chap. 5 where } and 6 are the mean and variance of the quality characteristic. [Note that if y denotes the quality characteristic, then the mean square reciprocal quality characteristic is the same as the expected value of (1/y)’.] Minimizing the mean square reciprocal quality characteristic implies maximizing j1 and minimizing 0”, which is the desired thing to do. However, if we were to try to maximize the mean square quality charac- teristic, which is equal to (4? + 6”), we would end up maximizing both 1 and o?, which is not a desirable thing to do. (d) Signed-target Type Problem In this class of problems, the quality characteristic can take positive as well as negative values, Often, the target value for the quality characteristic is zero. If not, the target value can be made zero by appropriately selecting the reference value for the quality characteristic. Here, we can find an adjustment factor that can move the mean without changing the standard deviation. Note that signed-target problems are inherently dif ferent from smaller-the-better type problems, even though in both cases the best value is zero. In the signed-target problems, the quality characteristic’ can take positive as well as negative values whereas in the smaller-the-better type problems the quality characteristic cannot take negative values. The range of possible values for the quality characteristic also distinguishes signed-target problems from nominal-the-best type problems. There is one more distinguishing feature. In signed-target type problems when the mean is zero, the standard deviation is not zero, but in nominal-the-best type problems when the mean is zero, the standard deviation is also zero. ‘An example of signed-target problems is the de offset voltage of a differential operational amplifier. The offset voltage could be positive or negative. If the offset voltage is consistently off zero, then we can easily compensate for it in the circuit that receives the output of the differential operational amplifier without affecting the stan- dard deviation. The design of a differential operational amplifier is discussed in Chapter 8. In such problems, the objective function to be maximized is given by 1) = - 10 logy 0” =- 10 logig + z ow (6.15) Note that this type of problem occurs relatively less frequently compared to the nominal-the-best type problems. Sec. 6.3 SIN Ratios for Static Problems 113 (e) Fraction Defective This is the case when the quality characteristic, denoted by p, is a fraction taking values between 0 and 1. Obviously, the best value for p is zero. Also, there is no adjustment factor for these problems. When the fraction defective is p, on an average, we have to manufacture 1/(1-p) pieces to produce one good piece. Thus, for every good piece pro- duced, there is a waste and, hence, a loss that is equivalent to the cost of processing {1/(1-p) -1] =p/(1-p) pieces. Thus, the quality loss is given by Q, Q=k oo (5.16) where k is the cost of processing one piece. Ignoring k, we obtain the objective function to be maximized in the decibel scale as 10 logio I (5.17) lp Note that the range of possible values of Q is 0 to oo, but the range of possible values of 7 is cc to cc, Therefore, the additivity of factor effects is better for than for Q. 
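For reference, the static S/N ratios of Equations (5.10) through (5.17) can be collected into small helper functions, as sketched below. The data values are hypothetical, and the use of the sample variance (ddof=1) rather than the population variance is an implementation assumption.

```python
# Minimal sketch: the static S/N ratios of Eqs. (5.10)-(5.17) as helper
# functions. y is a set of observations taken under different noise
# conditions; the sample variance (ddof=1) is an implementation assumption.
import numpy as np

def smaller_the_better(y):                    # Eq. (5.11)
    return -10 * np.log10(np.mean(np.asarray(y) ** 2))

def nominal_the_best(y):                      # Eq. (5.12)
    y = np.asarray(y)
    return 10 * np.log10(y.mean() ** 2 / y.var(ddof=1))

def larger_the_better(y):                     # Eq. (5.14)
    return -10 * np.log10(np.mean(1.0 / np.asarray(y) ** 2))

def signed_target(y):                         # Eq. (5.15)
    return -10 * np.log10(np.var(y, ddof=1))

def fraction_defective(p):                    # Eq. (5.17)
    return -10 * np.log10(p / (1.0 - p))

print(smaller_the_better([3, 1, 2, 0.5]))     # e.g., surface defect counts
print(nominal_the_best([3.5, 3.6, 3.4]))      # e.g., thickness near a target
print(larger_the_better([45.0, 52.0, 48.0]))  # e.g., strength per unit area
print(signed_target([-0.2, 0.1, 0.05]))       # e.g., dc offset voltage
print(fraction_defective(0.10))               # 10 percent defective -> about 9.5 dB
```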
The S/N ratio for the fraction-defective problems is the same as the familiar logit transform, which is commonly used in biostatistics for studying drug response. (f) Ordered Categorical Here, the quality characteristic takes ordered categorical values. For example, after a drug treatment we may observe a patient’s condition as belonging to one of the following categories: worse, no change, good, or excellent. In this situation, the extreme category, excellent, is the most desired category. However, in some other cases, an intermediate category is the most desired category. For analyzing data from ordered categorical prob- lems, we form cumulative categories and treat each category (or its compliment, as the case may be) as a fraction-defective type problem. We give an example of analysis of ordered categorical data in Section 5.5. (g) Curve or Vector Response ‘As the name suggests, in this type of problem the quality characteristic is a curve or a vector rather than a single point. The treatment of this type of problem is described in Chapter 6 in conjunction with the design of an electrical filter and paper transport in copying machines. The basic strategy in these problems is to break them into several scalar problems where each problem is of one of the previously discussed types. 114 Signal-to-Noise Ratios Chap. 5 5.4 S/N RATIOS FOR DYNAMIC PROBLEMS Dynamic problems have even more variety than static problems because of the many types of potential adjustments. Nonetheless, we use the general procedure described in Sections 5.1 and 5.3 to derive the appropriate objective functions or the S/N ratio. Dynamic problems can be classified according to the nature of the quality characteristic and the signal factor, and, also, the ideal relationship between the signal factor and the quality characteristic. Some common types of dynamic problems and the correspond- ing S/N ratios are given below (see also Taguchi [T1], Taguchi and Phadke [T6], and Phadke and Dehnad [P4]). (a) Continuous-continuous (C-C) Here, both the signal factor and the quality characteristic take positive or negative con- tinuous values. When the signal is zero, that is, M = 0, the quality characteristic is also zero, that is, y = 0. The ideal function for these problems is y = M, and a scaling fac- tor exists that can be used to adjust the slope of the relationship between y and M. This is one of the most common types of dynamic problems. The servomotor example described in Section 5.2 is an example of this type. Some other examples are analog telecommunication, design of test sets (such as voltmeter and flow meter), and design of sensors (such as the crankshaft position sensor in an automobile). We now derive the S/N ratio for the C-C type problems. As described in Sec- tion 5.2, let yj; be the observed quality characteristic for the signal value M, and noise condition x;. The quality loss without adjustment is given by The quality loss has two components. One is due to the deviation from linearity and the other is due to the slope being other than one. Of the two components, the latter can be eliminated by adjusting the slope. In order to find the correct adjustment for given control factor settings, we must first estimate the slope of the best linear relation- ship between y;; and M;. Consider the regression of y;; on M; given by Ji = BM; + 4 (5.18) where B is the slope and ¢,; is the error. The slope { can be estimated by the least squares criterion as follows: Sec. 
Choose beta to minimize

    Sum_{i=1}^{m} Sum_{j=1}^{n} (y_ij - beta M_i)^2,    (5.19)

that is, set

    Sum_{i=1}^{m} Sum_{j=1}^{n} (y_ij - beta M_i) M_i = 0,    (5.20)

that is,

    beta = [Sum_{i=1}^{m} Sum_{j=1}^{n} M_i y_ij] / [n Sum_{i=1}^{m} M_i^2].    (5.21)

Note that Equation (5.20) is nothing but a special case of the general Equation (5.7) for determining the best adjustment. Here, h_R(y_ij) = (1/beta) y_ij = v_ij, and beta is the same as the index R. Also note that the least squares criterion is analogous to the criterion of making the error [(1/beta) y_ij - M_i] orthogonal to the signal M_i. The quality loss after adjustment is given by

    Q_a = (k / mn) (1/beta^2) Sum_{i=1}^{m} Sum_{j=1}^{n} (y_ij - beta M_i)^2.

Minimizing Q_a is equivalent to maximizing eta given by

    eta = 10 log10 (beta^2 / sigma^2)    (5.22)

where sigma^2 = (1/mn) Sum_{i=1}^{m} Sum_{j=1}^{n} (y_ij - beta M_i)^2 is the mean square error around the fitted line. Note that beta is the change in y produced by a unit change in M. Thus, beta^2 quantifies the effect of the signal. The denominator sigma^2 is the effect of noise. Hence, eta is called the S/N ratio. Note that sigma^2 includes sensitivity to noise factors as well as the nonlinearity of the relationship between y and M. Thus, maximization of eta leads to a reduction in nonlinearity along with the reduction in sensitivity to noise factors.

In summary, the C-C type problems are optimized by maximizing eta given by Equation (5.22). After maximization of eta, the slope is adjusted by a suitable scaling factor. Note that any control factor that has no effect on eta but an appreciable effect on beta can serve as a scaling factor. Although we have shown the optimization for the target function y = M, it is still valid for all target functions that can be obtained by adjusting the slope; that is, the optimization is valid for any target function of the form y = beta_0 M, where beta_0 is the desired slope.

Another variation of the C-C type target function is

    y = alpha_0 + beta_0 M.    (5.23)

In this case, we must consider two adjustments: one for the intercept and the other for the slope. One might think of this as a vector adjustment factor. The S/N ratio to be maximized for this problem can be shown to be eta, given by Equation (5.22). The two adjustment factors should be able to change the intercept and the slope, and should have no effect on eta.

(b) Continuous-digital (C-D)

A temperature controller, where the input temperature setting is continuous while the output (the ON or OFF state of the heating unit) is discrete, is an example of the C-D type problem. Such problems can be divided into two separate problems: one for the ON function and the other for the OFF function. Each of these problems can be viewed as a separate continuous-continuous or nominal-the-best type problem. The design of a temperature control circuit is discussed in detail in Chapter 9.

(c) Digital-continuous (D-C)

The familiar digital-to-analog converter is an example of the D-C type problem. Here again, we separate the problems of converting the 0 and 1 bits into the respective continuous values. The conversion of 0, as well as the conversion of 1, can be viewed as a nominal-the-best type static problem.

(d) Digital-digital (D-D)

Digital communication systems, computer operations, etc., where both the signal factor and the quality characteristic are digital, are examples of the D-D type problem. Here, the ideal function is that whenever 0 is transmitted, it should be received as 0, and whenever 1 is transmitted, it should be received as 1. Let us now derive an appropriate objective function for minimizing sensitivity to noise.

Here, the signal values for testing are M_0 = 0 and M_1 = 1. Suppose under certain settings of control factors and noise conditions, the probability of receiving 1, when 0 is transmitted, is p (see Table 5.3).
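Before continuing with the digital-digital case, the continuous-continuous computation of Equations (5.19) through (5.22) can be sketched as follows. The servomotor-style data are hypothetical, and dividing by mn (rather than mn - 1) to estimate sigma^2 is an assumption of the sketch.

```python
# Minimal sketch, hypothetical servomotor-style data: slope estimate and S/N
# ratio for a continuous-continuous dynamic problem, Eqs. (5.19)-(5.22).
import numpy as np

M = np.array([10.0, 20.0, 30.0])               # commanded displacements M_i
# y[i, j]: observed displacement at signal M_i under noise condition x_j.
y = np.array([[10.8, 11.4, 10.1],
              [21.5, 22.6, 20.4],
              [32.0, 33.9, 30.6]])
m, n = y.shape

beta = np.sum(y * M[:, None]) / (n * np.sum(M ** 2))          # Eq. (5.21)
sigma2 = np.sum((y - beta * M[:, None]) ** 2) / (m * n)       # error variance
eta = 10 * np.log10(beta ** 2 / sigma2)                       # Eq. (5.22)
print(f"beta = {beta:.3f}, eta = {eta:.2f} dB")

# Consistency with Section 5.2: after the scaling adjustment v = y/beta, the
# loss Q_a/k = mean((v - M)^2) satisfies -10*log10(Q_a/k) = eta.
v = y / beta
print(-10 * np.log10(np.mean((v - M[:, None]) ** 2)))
```

We now return to the digital-digital problem.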
Thus, the average value of the received signal, which is the same as the quality characteristic, is p and the corresponding vari- ance is p(1-p). Similarly, suppose the probability of receiving 0, when 1 is transmit- ted, is q. Then, the average value of the corresponding received signal is (I-g) and the corresponding variance is q(1-q). The ideal transmit-receive relationship and the observed transmit-receive relationship are shown graphically in Figure 5.3, Although the signal factor and the quality characteristic take only 0-1 values, for convenience we represent the transmit-receive relationship as a straight line. Let us now examine the possible adjustments. TABLE 5.3 TRANSMIT-RECEIVE RELATIONSHIP FOR DIGITAL COMMUNICATION Probabilities Associated Properties of the with the Received Signal Received Signal o 1 Mean | Variance o l-p Pe P p(l-p) ‘Transmitted Signal 1 q 1-q 1-q q(l-q) It is well-known that 2 communication system is inefficient if the errors of transmitting 0 and 1 are unequal. More efficient transmission is achieved by making p =. This can be accomplished by a leveling operation, an operation such as chang- ing the threshold. The leveling operation can be conceptualized as follows: Under- neath the transmission of a digital signal, there is a continuous signal such as voltage, frequency, or phase. If it is at all possible and convenient to observe the underlying 118 Signal-to-Noise Ratios Chap. § continuous variable, we should prefer it. In that case, the problem can be classified as a C-D type and dealt with by the procedure described earlier. Here, we consider the situation when it is not possible to measure the continuous variable. (c) After Leveling (b) Observed Function Received Signal Transmitted Signal Figure 5.3 Digital communication, Figure 5.4 shows possible distributions of the continuous variable received at the output terminal when 0 or 1 are transmitted. If the threshold value is R ,, the errors of © would be far more likely than the errors of 1. However, if the threshold is moved to Sec. 5.4 S/N Ratios for Dynamic Problems 119 R>, we would get approximately equal errors of 0 and 1 . The effect of this adjustment is also to reduce the total error probability (p + q). f(vloy (a) When the threshold is at R;, the error probabilities p and q are not equal. Avo) {b) By adjusting the threshold to, , we can make the two error probabilities equal, i.e. p'=q'. Figure 5.4 Effect of leveling on error probabi How does one determine p’ (which is equal to q’) corresponding to the observed error rates p and q? The relationship between p’, p, and q will obviously depend on the continuous distribution. However, we are considering a situation where we do not 120 Signal-to-Noise Ratios Chap. 5 have the ability to observe the distributions. Taguchi has suggested the use of the fol- lowing relationship for estimating p’ after equalization or leveling: — 10 logio pb P q 2 x 10 lo; = - 10 lo : (5.24) B10 E | B10 ( T (5.24) The two terms on the right hand side of Equation (5.24) are fraction-defective type S/N ratios for the separate problems of the errors of 0 and errors of 1. Equation (5.24) asserts that the effect of equalization is to make the two S/N ratios equal to the average of the S/N ratios before equalization. We can rewrite Equation (5.24) as follows: “1 top. Asa. . (5.25) Po4 Equation (5.25) provides an explicit expression computing p’ from p and g. The transmit-receive relationship after leveling is depicted by line c in Fig- ure 5.3. 
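Equations (5.24) and (5.25) are easy to apply in code. The sketch below uses illustrative error rates p and q, computes p' from Equation (5.25), and then checks the averaging property asserted by Equation (5.24).

```python
# Sketch with illustrative error rates: error probability after leveling,
# Eqs. (5.24) and (5.25).
import math

def snr_fraction(p):                     # fraction-defective type S/N ratio, in dB
    return -10 * math.log10(p / (1 - p))

def p_after_leveling(p, q):              # Eq. (5.25)
    return 1.0 / (1.0 + math.sqrt((1 / p - 1) * (1 / q - 1)))

p, q = 0.02, 0.10                        # observed error rates for 0 and for 1
pp = p_after_leveling(p, q)
print(round(pp, 4))                      # about 0.0455

# Check of Eq. (5.24): the S/N ratio after leveling equals the average of the
# two S/N ratios before leveling.
print(round(snr_fraction(pp), 2), round((snr_fraction(p) + snr_fraction(q)) / 2, 2))
```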
When 0 is transmitted, the mean value of the received signal yo, is p’.. When 1 is transmitted, the mean value of the received signal, yi, is (I-g’) = (I-p’). In both cases the variance is p’(I-p’). Both the lines a and c pass through the point (0.5, 0.5). But their slopes are not equal. The slope of line c is = OP" § ay" B 05 (1-2p"). Thus, the quality loss after adjusting the slope is given by oF p'(-p’) Ou kak ape Gan Thus, ignoring the constant &, the S/N ratio to be optimized is given by 1 = 10 logy oer 2 (5.27) Sec. 5.5 Analysis of Ordered Categorical Data 121 Observe that (1-2p’) is the difference of the averages of the received signal when 0 and 1 are transmitted. The quantity p’(1 — p’) is the variance of the received signal. So 1 measures the ability of the communication system to discriminate between 0 and 1 at the receiving terminal. The strategy to optimize a D-D system is to maximize 1, and then use a control factor which has no effect on 1, but can alter the ratio p:q to equalize the two error probabilities. Chemical separation processes can also be viewed as D-D type dynamic prob- lems. For example, when making iron in a blast furnace, the goal is to separate the iron molecules from the impurities. Molten iron stays at the bottom and the impurities float as slag on top of the molten metal. Suppose 100 p percent of iron molecules go into the slag and 100 q percent of a particular impurity go in the molten metal. Minimization of p and q can be accomplished by maximizing the S/N ratio given by Equation (5.27). In this problem, however, the losses due to iron molecules going into the slag and impurities going into the iron are not equal. We may wish to make the p:q ratio different from 1:1. The desired p:q ratio can be accomplished during adjust- ment without altering 1. 5.5 ANALYSIS OF ORDERED CATEGORICAL DATA Data observed in many matrix experiments is ordered categorical because of the inherent nature of the quality characteristic or because of the convenience of the meas- urement technique. Let us consider the surface defect data from the polysilicon deposi- tion case study of Chapter 4, which we will use to illustrate the analysis of ordered categorical data. Suppose, because of the inconvenience of counting the number of surface defects, the experimenters had decided to record the data in the following sub- jective categories, listed in progressively undersirable order: practically no surface defects, very few defects, some defects, many defects, and too many defects. For our illustration, we will take the observations listed in Table 4.4 (a) and associate with them categories I through V as follows: I: 0—3 defects Tl: 4— 30 defects TIL : 31 — 300 defects IV: 301 — 1000 defects V_: 1001 and more defects Thus, among the nine observations of experiment 2, five belong to category I, two to category Il, three to category III, and none to categories IV and V. The categorical data for the 18 experiments are listed in Table 5. 122 Signal-to-Noise Ratios Chap. 5 We will now describe Taguchi’s accumulation analysis method [T7, T1], which is an effective method for determining optimum control factor settings in the case of ordered categorical data. (See Nair [N1} for an alternate method of analyzing ordered categorical data.) 
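The assignment of observations to categories, and the running totals that become the cumulative categories used in the accumulation analysis described next, can be sketched as follows. The defect counts in the example are hypothetical, while the category boundaries are those listed above.

```python
# Minimal sketch, hypothetical counts: binning raw defect counts into
# categories I-V and forming their running totals, which become the cumulative
# categories used in the accumulation analysis described next.
import numpy as np

boundaries = [3, 30, 300, 1000]          # upper limits of categories I-IV

def categorize(counts):
    """Return the number of observations falling in categories I..V."""
    cats = np.zeros(5, dtype=int)
    for c in counts:
        cats[np.searchsorted(boundaries, c)] += 1
    return cats

obs = [0, 2, 1, 3, 7, 15, 40, 2, 1]      # nine observations of one experiment
cats = categorize(obs)                   # counts in I, II, III, IV, V
cumulative = np.cumsum(cats)             # counts in (I), (II), ..., (V)
print(cats, cumulative)
```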
The first step is to define cumulative categories as follows: @= : O—3 defects (dD = HI : O— 30 defects 1) = ++I : 0 — 300 defects (IV) =I+I+II+IV «0 — 1000 defects (V) = I+I+I04IV+V : 0 — oo defects The number of observations in the cumulative categories for the eighteen experiments are listed in Table 5.4. For example, the number of observations in the five cumulative categories for experiment 2 are 5, 7, 9, 9, and 9, respectively. The second step is to determine the effects of the factor levels on the probability distribution by the defect categories. This is accomplished analogous to the determina- tion of the factor effects described in Chapter 3. To determine the effect of tempera- ture of level Ai, we identify the six experiments conducted at that level and sum the observations in each cumulative category as follows: Cumulative Categories @ @ am dy (vy Experiment 1 9 9 9 9 9 Experiment 2 5 7 9 9 9 Experiment 3 1 1 7 9 9 Experiment 10 9 9 9 9 9 Experiment 11 8 9 9 9 9 Experiment 12 2 3 8 8 9 Total 34 40 51 53 54 The number of observations in the five cumulative categories for every factor level are listed in Table 5.5. Note that the entry for the cumulative category (V) is equal to the total number of observations for the particular factor level and that entry is uniformly 54 in this case study. If we had used the 2-level column, namely column 1, or if we had used the dummy level technique (described in Chapter 7), the entry in category (V) would not be 54. The probabilities for the cumulative categories shown in Table 5.5 are obtained by dividing the number of observations in each cumulative category by the entry in the last cumulative category for that factor level, which is 54 for the present case. Sec. 5.5 Analysis of Ordered Categorical Data TABLE 5.4 CATEGORIZED DATA FOR SURFACE DEFECTS Number of Observations Number of Observations by Categories by Cumulative Categories Expt. N. | 1 0 Mm Wovi@ a am ay Ww 1 9 oO o 0 0 9 9 9 9 9 2/5 2 2 0 of s 7 9 9 9 3 1 oO 6 2 0 1 1 7 9 9 4 0 8 1 oO 0 0 8 9 9 9 5 oO 1 o 4 4 0 1 1 5 2 6/1 0 4 1 3f1 1 5 6 9 7/0 1 1 4 3/0 1 2 6 9 8 |3 0 2 1 3/3 3 5s 6 9 9 0 o 0 4 5 0 0 0 4 9 w}9 0 0 0 of 9 9 9 9 9 nu }|s 1 0 0 of 8 9 9 9 9 wfi]2 3 3 0 1f2 5s 8 8 9 13 4 2 Zz 1 0 4 6 8 9 9 i 2 3 4 0 0 2 5 9 9 9 1s 0 1 1 1 6 0 1 2 3 9 16 3 4 2 0 oO 3 7 9 9 9 iy 2 1 o 2 4 2 3 a 5 = at 0 0 0 Z 7 0 0 0 2 9 Total | 49 27 28 22 364/49 76 104-126 162 123 The third step in data analysis is to plot the cumulative probabilities. Two useful plotting methods are the line plots shown in Figure 5.5 and the bar plots shown in Fig- ure 5,6. From both figures, it is apparent that temperature (factor A) and pressure (fac- tor B) have the largest impact on the cumulative distribution function for the surface defects. The effects of the remaining four factors are small compared to temperature and pressure. Among the factors C, D, E, and F, factor F has a somewhat larger effect. In the line plots of Figure 5.5, for each control factor we look for a level for which the curve is uniformly higher than the curves for the other levels of that factor. 124 ‘Signal-to-Noise Ratios Chap. 5 TABLE 5.5 FACTOR EFFECTS FOR THE CATEGORIZED SURFACE DEFECT DATA Number of Observations by Probabilities for the ‘Cumutative Categories ‘Cumulative Categories Factor Level © a ap ay Mi] o a am ay AxTo 2 34 41 $4 |] 013 O41 063 0.76 1.00 ‘A. Temperature (©) AyiTo-25. | 34 40 515354 |] 063 0.74 094 098 1.00 7 Ay Tot25 | 8 14 19 32 54 | 01S 026 035 0.59 1.00 B. 
Pressure (mio) By: Py-200 | 25 40 46 St SA | 0.46 0.74 0.85 094 1.00 Br: Po 20 28 36 43 «54 | 037 0.52 067 080 1.00 By:Pot200 | 4 8 22 32 $4 |}007 0.15 041 059 100 C, Nitrogen (seem) 19 30 32 39 «54 || 035 056 059 0.72 1,00 1 2 28 «63 ~— $4 | 020 037 052 072 100 19 2 44 «48 «54 [035 048 O81 089 1.00 D. Silane (sccm) 2 25 34 41 54 |} 037 046 063 0.76 1.00 13 31 42 4454 |] 024 0.57 078 O81 1.00 16 20 28 41 54 |/030 037 052 075 1.00 E, Settling time (min) Ey: fo 227384354 |] 039 0.50 0.70 080 100 Exrtot8 | 16 29 36 42-54 11030 054 0.67 0.78 1.00 Ey toH16 | 12 20 30 4154 {1022 037 056 0.76 1.00 F, Cleaning method Fy: None | 21 23 26 «3454 || 039 043 048 063 1.00 FCM, | 21 30 40 4 54 |}039 056 074 085 1,00 Fy CMs 7 2B 38 46 54 013 043 0.70 085 1.00 A uniformly higher curve implies that the particular factor level produces more obser- vations with lower defect counts; hence, it is the best level. In Figure 5.6, we look for a larger height of category I and smaller height of category V. From the two figures, it clear that A,, By, and F» are the best levels for the respective factors. The choice of the best level is ‘not as clear for the remaining three factors. However, the curves for the factor levels C2, D3, and E; lie uniformly lower among the curves for all lev- els of the respective factors, and these levels must be avoided. Thus, the optimum set- tings suggested by the analysis are A1B,(C 1/C3) (D1/D2) (E/E) F2, By comparing Figures 5.5, 5.6, and 4.5 it is apparent that the conclusions based on the ordered categorical data are consistent with the conclusions based on the actual counts, except for factors C, D, and E whose effects are rather small. The next step in the analysis is to predict the distribution of defect counts under the starting and optimum conditions. This can be achieved analogous to the procedure described in Chapter 3, except that we must use the omega transform, also known as 125 Sec. 5.5 Analysis of Ordered Categorical Data Silane Temperature 0.5 05 Probability a) ai) av) (wy) oO a a ay) Pressure tot Setting Time 0.5 Probability a ay ain avy (vy way ay ay) (Y) Cleaning Method Ni wvogen 1.0 05 05 Probability (a aly av) (vy (ay (lv) (Y) Cumulative Categories Cumulative Categories Figure 5.5 Line plots of the factor effects for the categorized surface defect data. Probability Probability Signal-to-Noise Ratios Chap. 5 Nov >1000 Ar Ag As D, Dz, Ds K Temperature Silane J iv. 301-1000 M 31-300 UN 4-30 1 0-3 Nitrogen Cleaning Method Figure 5.6 Bar plots of the factor effects for the categorized surface defect data. Sec. 5.5 Analysis of Ordered Categorical Data 127 the logit transform, of the probabilities for the cumulative categories (see Taguchi [T1]). The omega transform for probability p is given by the following equa- tion: @(p) = 10 logie GE Note that the omega transform is the same as the S/N ratio for the fraction defective type of static problems. Let us take the optimum settings recommended in Chapter 4, namely A1B2C\D3E2F2, to illustrate the computation of the predicted distribution. Since, according to Figure 5.5, factors C, D, and E have a small effect, we will only use the effects of factors A, B, and F in the prediction formula. The average probability for category (I) taken over 18 experiments is ly = 49/162 = 0.30 (see Table 5.4). 
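The prediction worked out next can also be sketched programmatically. The level probabilities 0.63, 0.37, and 0.39 for A1, B2, and F2 are the category (I) entries of Table 5.5, and the overall average 49/162 is rounded to 0.30, as in the text; the result, about 5.4 dB and a probability of roughly 0.78, matches the hand computation that follows.

```python
# Sketch of the prediction worked out below: omega transform, its inverse, and
# the additive prediction for cumulative category (I) at the settings A1 B2 F2.
import math

def omega(p):                            # omega (logit) transform, in dB
    return 10 * math.log10(p / (1 - p))

def inverse_omega(db):
    return 1 / (1 + 10 ** (-db / 10))

p_avg = 0.30                             # 49/162, rounded as in the text
p_A1, p_B2, p_F2 = 0.63, 0.37, 0.39      # category (I) entries of Table 5.5

pred_db = omega(p_avg) + sum(omega(p) - omega(p_avg) for p in (p_A1, p_B2, p_F2))
print(round(pred_db, 2), round(inverse_omega(pred_db), 2))   # about 5.42 dB, 0.78
```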
Referring to Table 5.5, the predicted omega value for category (1), denoted by @4,2,c,05£,F,1/)» an be computed as follows: O42 D3EF A = Ou + {oan - ay] ee = (0.30) + [0(0.63) ~ 0(0.30)] + [0(0.37) — 0X(0.30)] + [0(0.39) — @(0.30)] = 3.68 + [2.31 + 3.68] + [-2.31 + 3.68] + [-1.94 + 3.68] = 5.42 dB. Then, by the inverse omega transform, the predicted probability for category (1) is 0.78. Predicted probabilities for the cumulative categories (Il), (II) and (IV) can be obtained analogously. Prediction is obviously 1.0 for category (V). The predicted probabilities for the cumulative categories for the starting and the optimum settings are listed in 128 Signal-to-Noise Ratios Chap. 5 Table 5.6. These probabilities are also plotted in Figure 5.7. It is clear that the recom- mended optimum conditions give much higher probabilities for the low defect count categories when compared to the starting conditions. The probability of 0-3 defects, (category D), is predicted to increase from 0.23 to 0.78 by changing from starting to the ‘optimum conditions. Likewise, the probability for the 1001 and more category reduces from 0.37 to 0.01. The predicted probabilities for the cumulative categories should be compared with those observed under the starting and optimum conditions to verify that the addi- tive model is appropriate for the case study. This is the same as the Step 8 of the Robust Design cycle described in Chapter 4. Selection of appropriate orthogonal arrays for a case study, as well as the confounding of 2-factor interactions with the main effects, is discussed in Chapter 7, However, we note here that when accumula- tion analysis is used, 3-level orthogonal arrays should be preferred over 2-level orthog- onal arrays for minimizing the possibility of misleading conclusions about the factor effects, Particularly, the orthogonal arrays Lig and Lg are most suitable. In any case, the verification experiment is important for ensuring that the conclusions about the fac- tor effects are valid, 5.6 SUMMARY The quadratic loss function is ideally suited for evaluating the quality level of a product as it is shipped by a supplier to a customer. It typically has two com- ponents; one related to the deviation of the product’s function from the target, and the other related to the sensitivity to noise factors, * S/N ratio developed by Genichi Taguchi, is a predictor of quality loss after making certain simple adjustments to the product’s function, It isolates the sensitivity of the product’s function to noise factors. In Robust Design we use the S/N ratio as the objective function to be maximized, * Benefits of using the S/N ratio for optimizing a product or process design are: — Optimization does not depend on the target mean function, Thus, the design can be reused in other applications where the target is different. — Design of subsystems and components can proceed in parallel. Specifications for the mean function of the subsystems and components can be changed later during design integration without adversely affecting the sensitivity to noise factors. — Additivity of the factor effects is good when an appropriate S/N ratio is used. Otherwise, large interactions among the control factors may occur, resulting in high cost of experimentation and potentially unreliable results. Sec. 
5.6 Summary 129 TABLE 5.6 PREDICTED PROBABILITIES FOR THE CUMULATIVE CATEGORIES © Values for the Probabilities for the Control Cumulative Categories Cumulative Categories Factor Settings wm ab (ny avy) | (¥) a ay | amy } av) | (vy) Optimum: A\B2C\D3E,F; | 542 | 698 | 1453 | 19.45 | » | 0.78 | 0.83 } 0.97 | 0.99 | 1.00 Starting A,B,C \D3E\F, | -3.68 | -141 | 0.04 | 2.34 | « | 0.23 | 0.42 | 0.50 | 0.63 | 1.00 1 L _L Probability () ay (WY) (V) Cumulative Categories Figure 5.7 Predicted probabilities for the cumulative categories. 130 Signal-to-Noise Ratios Chap. 5 — Overall productivity of the development activity is improved. Robust Design problems can be divided into two broad classes: static problems, where the target value for the quality characteristic is fixed, and dynamic problems, where the quality characteristic is expected to follow the signal factor. Common types of static problems and the corresponding S/N ratios are summarized in Table 5.7. Common types of dynamic problems and the corresponding S/N ratios are summar- ized in Table 5.8. For the problems where an adjustment factor does not exist, the optimization is done by simply maximizing the S/N ratio. For the problems where an adjustment factor exists, the problem can be generically stated as minimize sensitivity to noise factors while keeping the mean function on target. By using S/N ratio, these problems can be converted into unconstrained optimization problems and solved by the following two-step procedure: 1. Maximize S/N ratio, without worrying about the mean function. 2. Adjust the mean on target by using the adjustment factor. The optimization strategy consists of the following steps: 1, Evaluate the effects of control factors under consideration on 7) and on the mean function. 2. For factors that have a significant effect on 1, select levels that maximize 1). 3. Select any factor that has no effect on 1 but a significant effect on the mean function as an adjustment factor. Use it to bring the mean function on target. (In practice, we must sometimes settle for a factor that has a small effect on 1) but a significant effect on the mean function as an adjustment factor.) 4. For factors that have no effect on 1) and the mean function, we can choose any level that is convenient from other considerations such as cost or other quality characteristics. Sec. 5.6 ‘Summary 131 TABLE 5.7 S/N RATIOS FOR STATIC PROBLEMS Problem | Range for the | Ideal Adjust- SIN Ratio Type Observations Value ment and Comments Small --better a paeethebetes | Oy 3.25 micrometers. 138 Achieving Additivity Chap. 6 The proper quality characteristic for photolithography is the actual line-width measurement, for example, 2.8 or 3.1 micrometers. When the line-width distribution is known, it is an easy task to compute the yield, Note that the line width is directly related to the amount of energy transferred during exposure and developing. The more the energy transferred, the larger is the line width. Also, our experience with photo- lithography has shown that the actual line width measurement is a monotonic charac- teristic with respect to the control factors. Further, when the target dimensions are changed, we can use the same experimental data to determine the new optimum set- tings of the control factor. In this case, the measurement technique is not much of an issue, although taking categorical data (small, desired, or large) is generally a little easier than recording actual measurements. 
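The advantage of the continuous line-width measurement over yield can be seen in a small simulation. In the sketch below the process settings and standard deviation are hypothetical, and the target of 3.0 micrometers and the lower limit of 2.75 micrometers are assumptions made for illustration; only the 3.25 micrometer upper limit appears in the text above.

```python
# Sketch, hypothetical numbers: why the continuous line-width measurement is
# more informative than yield. The 3.0 um target and 2.75 um lower limit are
# assumptions for illustration; only the 3.25 um upper limit is stated above.
import numpy as np

rng = np.random.default_rng(0)
target, lsl, usl = 3.0, 2.75, 3.25

def summarize(mean, std, n=1000):
    w = rng.normal(mean, std, n)                     # simulated line widths
    yield_frac = np.mean((w > lsl) & (w < usl))      # fraction within limits
    eta = 10 * np.log10(w.mean() ** 2 / w.var(ddof=1))
    return round(float(yield_frac), 2), round(float(w.mean()), 2), round(float(eta), 1)

# Two settings that miss the target in opposite directions give nearly the
# same (low) yield, so yield alone cannot indicate which way to adjust the
# process; the mean width and the S/N ratio can.
print(summarize(2.70, 0.05))
print(summarize(3.30, 0.05))
```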
A case study where the actual line width was used to optimize a photolithography process used in VLSI circuit fabrication is given by Phadke, Kackar, Speeney and Grieco [P5]. In summary, for the photolithography example, yield is the worst quality charac- teristic, ordered categorical data are better, and actual line width measurement (continu- ous variable) is the best. Spray Painting Process This example vividly illustrates the importance of energy transfer in selecting a quality characteristic. Sagging is a common defect in spray painting. It is caused by forma- tion of large paint drops that flow downward due to gravity. Is the distance through which the paint drops sag a good quality characteristic? No, because this distance is primarily controlled by gravity, and it is not related to the basic energy transfer in spray painting. However, the size of the drops created by spray painting is directly related to energy transfer and, thus, is a better quality characteristic. By taking the size of the drops as the quality characteristic, we can block out the effect of gravity, an extraneous phenomenon for the spray painting process. Yield of a Chemical Process There are many chemical processes that begin with a chemical A, which after reaction, becomes chemical B and, if the reaction is allowed to continue, tums into chemical C. If B is the desired product of the chemical process, then considering the yield of B as a quality characteristic is a poor choice. As in the case of photolithography, the yield is not a monotonic characteristic. A better quality characteristic for this experiment is the concentration of each of the three chemicals. The concentration of A and the concen- tration of A plus B possess the needed monotonicity property. 6.3 EXAMPLES OF S/N RATIOS Basic types of Robust Design problems and the associated S/N ratios were described in Chapter 5. A majority of Robust Design projects fall into one of these basic types of Sec.6.3 Examples of S/N Ratios 139 problems. This section gives three examples to illustrate the process of classification of Robust Design problems. Two of these examples also show how a complex prob- lem can be broken down into a composite of several basic types of problems. Heat Exchanger Design Heat exchangers are used to heat or cool fluids. For example, in a refrigerator a heat exchanger coil is used inside the refrigerator compartment to transfer the heat from the air in the compartment to the refrigerant fluid. This leads to lowering of the tempera- ture inside the compartment. Outside the refrigerator, the heat from the refrigerant is transferred to the room air through another heat exchanger. In optimizing the designs of heat exchangers and other heat-transfer equipment, defining the reference temperature is critical so that the optimization problem can be correctly classified. Consider the heat exchanger shown in Figure 6.1, which is used to cool the fluid inside the inner tube. The inlet temperature of the fluid to be cooled is T,. As the fluid moves through the tube, it loses heat progressively to the fluid outside the tube; its outlet temperature is T2. The inlet and outlet temperature for the coolant fluid are T3 and T4, respectively. Let the target outlet temperature for the fluid being cooled be To. Also, suppose the customer's requirement is that |T,-T) | <10 °C. What is the correct quality characteristic and S/N ratio for this Robust Design problem? 
Coolant +7; Coolant Ue EJ Fluid to i Fluid to be cooled be cooled yt Te Figure 6.1 Schematic diagram of a heat exchanger. One choice is to take target temperature Ty as the reference temperature and y= |T,-To| as the quality characteristic. Minimizing y is then the goal of this experiment; thus, at first it would appear that y should be treated as a smaller-the-better characteristic where the mean square value of y is minimized. The difficulty with this 140 Achieving Additivity Chap. 6 formulation of the problem is that by taking the square of y the positive and negative deviations in temperature are treated similarly. Consequently, interactions become important. This can be understood as follows: If y is too large because T> is too large compared to To, then y can be reduced by increasing the coil length. Note that a longer coil length leads to more cooling of the fluid and, hence, smaller T7. On the contrary, if y is too large because T2 is too small, then y can be reduced by decreasing the coil length. Thus, there are two opposite actions that can reduce y, but they cannot be distinguished by observing y. Therefore, y is not a good quality characteristic, and this problem should not be treated as smaller-the-better type. Here, the proper reference temperature is T, because it represents the lowest tem- perature that could be achieved by the fluid inside the tube. Thus, the correct quality characteristic is y’ =T—T. Note that y’ is always positive. Also, when the mean of y” is zero, its variance must also be zero. Hence, the problem should be classified as a nominal-the-best type with the target value of y’ equal to Ty—7'3. This formula- tion does not have the complication of interaction we described with y as the quality characteristic. Furthermore, if the target temperature Ty were changed, the information obtained using y’ as the quality characteristic would still be useful. All that is neces- sary is to adjust the mean temperature on the new target. However, if y were used as the quality characteristic, the design would have to be reoptimized when Tg is changed, which is undesirable. This loss of reusability is one of the reasons for lower R&D pro- ductivity. Paper Handling in Copying Machines In a copying machine, a critical customer-observable response is the number of pages copied before a paper-handling failure. In designing the paper-handling system we might take A, the number of pages copied before failure, as the quality characteristic. However, in this case, the number of pages that would have to be copied during the copier development would be excessively large. Also, decoupling the designs of the various modules is not possible when A is taken as the quality characteristic. A close look at the paper-handling equipment reveals that there are two basic functions in paper handling: paper feeding and paper transport. Paper feeding means picking a sheet, either the original or a blank sheet. Paper transport means moving the sheet from one station to another. A schematic diagram of a paper feeder is shown in Figure 6.2(a). The two main defects that arise in paper feeding are: no sheet fed or multiple sheets fed. A funda- mental characteristic that controls paper feeding is the normal force needed to pick up a sheet. Thus, we can measure the threshold force, F , to pick up just one sheet and the threshold force, F2, to pick up two sheets. 
Note that the normal force is a control fac- tor and that F, and F meet the guidelines listed in Section 6.1 and are better quality characteristics compared to A. By making F; as small as possible and F, as large as possible, we can improve the operating window F, — F> [see Figure 6.2(b)], reduce both types of paper feeding defects, and thus increase 2. The idea of enlarging the operating window as a means of improving product reliability is due to Clausing [C2]. Sec. 6.3 Examples of S/N Ratios 141 Force (CLITA Guide Ms (a) Schematic Diagram of a Paper Feeder Threshold Force for Feeding ‘Two Sheets Threshold Force for Feeding Single Sheet Feeding Single Sheet Feeding Two Sheets (b) Threshold Force and Operating Window Figure 6.2 Design of a paper feeder. Here, the appropriate S/N ratios for F, and F are, respectively, the smaller-the- better type and the larger-the-better type. Note that the two threshold forces comprise a vector quality characteristic. We must measure and optimize both of them. This is what we mean by completeness of a quality characteristic. 142 Achieving Additivity Chap. 6 In a copying machine, paper is moved through a 3-dimensional path using several paper transport modules. Figure 6.3 shows a planar diagram for paper move- ment through a single module. The fundamental characteristics in transporting paper are the (x,y) movements of the center of the leading edge of the paper, the rotation angle @ of the leading edge, and the time of arrival of the paper at the next module. The lateral movement (y movement) of the paper can be taken care of by registration against a guide. The remaining characteristics can then be addressed by placing two sensors to measure the arrival times of the left and right ends of the leading edge of the paper. Both of these arrival times have a common nonzero target mean and can be classified as nominal-the-best type problems. Beginning and End - Positions of the Leading Edge Paper Position Paper Position at the Beginning at the End of a Transport of a Transport Module Module Figure 6.3 Planar diagram for paper transport. Here, also, the two arrival times can be viewed as a vector quality characteristic. Both times must be measured and optimized. If we omit one or the other, we cannot guard against failure due to the paper getting twisted during transport. Also, by opti- mizing for both the arrival times (that is, minimizing the variability of both the arrival times and making their averages equal to each other), the design of each paper tran- sport module can be decoupled from other modules. Optimizing each of the paper- feeding and paper-transport characteristics, described above, automatically optimizes 2. Thus, the problem of life improvement is broken down into several problems of nominal-the-best, smaller-the-better, and larger-the-better types. It is quite obvious that optimizing these separate problems automatically improves A, the number of pages copied before failure. Sec. 6.3 Examples of S/N Ratios 143 Electrical Filter Design Electrical filters are used widely in many electronic products, including telecommunica- tions and audio/video equipment. These circuits amplify (or attenuate) the components of the input voltage signal at different frequencies according to the specified frequency response function [see Figure 6.4(a)]. An equalizer, used in high-fidelity audio equip- ment, is an example of a filter familiar to many people. It is used to amplify or attenu- ate the sounds of different musical instruments in a symphony orchestra. 
Electrical Filter (@) Block Diagram of an Electrical Filter Specified Upper Desired Frequency and Lower Limits Response Function ‘on the Gain (b) Frequency Response Function Figure 6.4 Design of an electrical filter. Figure 6.4(b) shows an example of a desired frequency response function and the customer-specified upper and lower limits for the gain. If the customer-specified gain limits are violated at any frequency, the filter is considered defective. From the preceding discussion in this chapter, it should be apparent that counting the percentage of defective filters, though easiest to measure, is not a good quality characteristic. 144 Achieving Additivity Chap. 6 ' This problem can be solved more efficiently by dividing the frequencies into several bands, say five bands as shown in Figure 6.4(b). For each of the middle three bands, we must achieve gain equal to the gain specified by the frequency response function. Therefore, we treat these as three separate nominal-the-best type problems. For each band, we must identify a separate adjustment factor that can be used to set the mean gain at the right level. Note that a resistor, capacitor, or some other component in the circuit can serve as an adjustment factor. For any one of these bands, the adjust- ment factors for the other two bands should be included as noise factors, along with other noise factors, such as component tolerances and temperature. Then, adjusting the gain in one band would have a minimal effect on the mean gain in the other bands. For each of the two end bands, we must make the gain as small as possible. Accordingly, these two bands belong to the class of smaller-the-better type problems. Thus, we have divided a problem where we had to achieve a desired curvilinear response into several familiar problems. 6.4 SELECTION OF CONTROL FACTORS Additivity of the effects of the control factors is also influenced by the selection of the control factors and their levels. By definition, the control factors are factors whose levels can be selected by the designer. Next, it is important that each control factor influence a distinct aspect of the basic phenomenon affecting the quality characteristic. If two or more control factors affect the same aspect of the basic phenomenon, then the ibility of interaction among these factors becomes high. When such a situation is recognized, we can reduce or even eliminate the interaction through proper transforma- tion of the control factor levels. We refer to this transformation as sliding levels. The following examples illustrate some of the important considerations in the selection of control factors. A qualitative understanding of how control factors affect a product is very important in their selection. Treatment of Asthmatic Patients This rather simplified example brings out an important consideration in the selection of control factors. Consider three drugs (A, B, and C) proposed by three scientists for treating wheezing in asthmatic patients. Suppose the drug test results indicate that if no drug is given, the patient's condition is bad. If only drug A is given, the patients get somewhat better; if only drug B is given, the patients feel well; and if only drug C is given, the patients feel moderately good. Can the three drugs be considered as three separate control factors? If so, then a natural expectation is that by giving all three drugs simultaneously, we can make the patients very well. 
Suppose we take a close look at these drugs to find out that all three drugs con- tain theophillin as an active ingredient, which helps dilate the bronchial tubes. Drug A has 70 percent of full dose, drug B has 100 percent of full dose, and drug C has 150 Sec. 6.4 Selection of Control Factors 145 percent of full dose. Administering all three drugs simultaneously implies giving 320 percent of full dose of theophillin. This could significantly worsen the patient’s condi- tion. Therefore, the three drugs interact. The proper way to approach this problem is to think of the theophillin concentration as a single control factor with four levels: 0 percent (no drug), 70 percent (drug A), 100 percent (drug B), and 150 percent (drug ©). Here the other ingredients of the three drugs should be examined as additional potential control factors. Photolithography Process Aperture and exposure are among the two important control factors in the photolithog- raphy process used in VLSI fabrication (see Phadke, Kackar, Speeney, and Grieco [P5]). The width of the lines printed by photolithography depends on the depth of field and the total light energy falling on. the photoresist. The aperture alone determines the depth of field. However, both aperture and exposure time influence the total light energy. In fact, the total light energy for fixed light intensity is proportional to the product of the aperture and exposure time. ‘Thus, if we chose aperture and exposure time as control factors, we would expect to see strong interaction between these two factors. The appropriate control factors for this situation are aperture and total light energy. Suppose 1.2N, N, and 0.8N are used as three levels for light energy, where N stands for the nominal level or the middle level. We can achieve these levels of light energy for various apertures through the sliding levels of exposure as indicated in Table 6.3. The level N of total light energy can be achieved by setting exposure at 120 when aperture is 1, exposure at 90 when aperture is 2, and exposure at 50 when aperture is 3. TABLE 6.3 SLIDING LEVELS FOR EXPOSURE iN PHOTOLITHOGRAPHY PROCESS Exposure (PEP-Setting) 12N N 0.8N 1| 96 10 | 144 Aperture 2) 2 90 | 108 3| 40 50 60 The thickness of the photoresist layer is another fundamental characteristic that influences the line width. The thickness can be changed by controlling the photoresist 146 Achieving Additivity Chap. 6 viscosity and the spin speed. Here too, sliding levels of spin speed should be con- sidered to minimize interactions (see Phadke, Kackar, Speeney, and Grieco [P5]). 6.5 ROLE OF ORTHOGONAL ARRAYS Matrix experiments using orthogonal arrays play a crucial role in achieving additivity—they provide a test to see whether interactions are large compared to the main effects. Consider a matrix experiment where we assign only main effects to the columns ‘of an orthogonal array so that the interactions (2-factor, 3-factor, etc.) are confounded with the main effects (see Chapter 7). There are two possibilities for the relative mag- nitudes of the interactions: 1. If one or more of these interactions are large compared to the main effects, then the main effects with which these interactions are confounded will be estimated with large bias or error. Consequently, the observed response under the predicted optimum conditions will not match the prediction based on the additive model. Thus, in this case the verification experiment will point out that large interactions are present. 2. 
On the contrary, if the interactions are small compared to the main effects, then the observed response under the predicted optimum conditions will match the prediction based on the additive model. Thus, in this case the verification exper- iment will confirm that the main effects dominate the interactions. Optimization studies where only one factor is studied at a time are not capable of determining if interactions are or are not large compared to the main effects. Thus, it is important to conduct multifactor experiments using orthogonal arrays. Dr. Taguchi considers the ability to detect the presence of interactions to be the primary reason for using orthogonal arrays to conduct matrix experiments. Sections 6.2, 6.3, and 6.4 described the engineering considerations in selecting the quality characteristics, S/N ratios, and control factors and their levels. Matrix experiments using orthogonal arrays provide a test to see whether the above selections can successfully achieve additivity. If additivity is indeed achieved, the matrix exper ment provides simultaneously the optimum values for the control factors. If additivity is not achieved, the matrix experiment points it out so that one can re-examine the selection of the quality characteristics, S/N ratios, and control factors and their levels. 6.6 SUMMARY * Ability to predict the robustness (sensitivity to noise factors) of a product for any combination of control factor settings is needed so that the best control factor levels can be selected. The prediction must be valid, not only under the laboratory condi- tions, but also under manufacturing and customer usage conditions. Sec. 6.6 ‘Summary 147 It is important to have additivity of the effects of the control factors on the sensi- tivity to noise factors (robustness) for the following reasons: — Only main effects need to be estimated which takes only a small number of experiments. However, if the interactions among the control factors are strong, experiments must be conducted under all combinations of control fac- tor settings, which is clearly expensive, if not impractical. — Conditions under which experiments are conducted can also be considered as a control factor. The conditions consist of three types: laboratory, manufac- turing, and customer usage. Presence of strong interactions among the con- trol factors studied in a laboratory is an indication that the experimental con- ditions are likely to interact with the control factors that have been studied. This interaction, if present, can make the laboratory results invalid, which leads to product failure during manufacturing and customer usage. ‘The additivity is influenced greatly by the choice of the quality characteristic, the S/N ratio, and control factors and their levels. ‘The following guidelines should be used in selecting quality characteristics: 1, The quality characteristic should be directly related to the energy transfer associated with the basic mechanism or the ideal function of the product. 2. As far as possible, choose continuous variables as quality characteristics. 3. The quality characteristics should be monotonic. Also, the related S/N ratio should possess additivity. 4. Quality characteristics should be easy to measure. 5. Quality characteristics should be complete—that is, they should cover all dimensions of the ideal function. 6. For products, having feedback mechanisms, the open loop, sensor and com- pensation modules should be optimized separately, and the modules should then be integrated. 
Similarly, complex products should be divided into suitable modules for optimization purposes.

* Although the final success of a product or a process may depend on the reliability or the yield, such responses often do not make good quality characteristics. They tend to cause strong interactions among the control factors, as illustrated by the photolithography example.

* Different types of variables can be used as quality characteristics: the output or the response variable, and threshold values of suitable control factors or noise factors for achieving a certain value of the output. When the output is discrete, such as ON-OFF states, it becomes necessary to use the threshold values.

* Additivity of the effects of the control factors is also influenced by the selection of control factors and their levels. If two or more control factors affect the same aspect of the basic phenomenon, then the possibility of interaction among these factors becomes high. When such a situation is recognized, the interaction can be reduced or even eliminated through proper transformation of the control factor levels (sliding levels). A qualitative understanding of how control factors affect a product is important in their selection.

* Selecting a good quality characteristic, S/N ratio, and control factors and their levels is essential in improving the efficiency of development activities. The selection process is not always easy. However, when experiments are conducted using orthogonal arrays, a verification experiment can be used to judge whether the interactions are severe. When interactions are found to be severe, it is possible to look for an improved quality characteristic, S/N ratio, and control factor levels, and, thus, mitigate potential manufacturing problems and field failures.

* Matrix experiment based on an orthogonal array followed by a verification experiment is a powerful tool for detecting lack of additivity. Optimizing a product design one factor at a time does not provide the needed test for additivity.

Chapter 7

CONSTRUCTING ORTHOGONAL ARRAYS

The benefits of using an orthogonal array to conduct matrix experiments as well as the analysis of data from such experiments are discussed in Chapter 3. The role of orthogonal arrays in a Robust Design experiment cycle is delineated in Chapter 4 with the help of the case study of improving the polysilicon deposition process. This chapter describes techniques for constructing orthogonal arrays that suit a particular case study at hand.

Construction of orthogonal arrays has been investigated by many researchers including Kempthorne [K4], Plackett and Burman [P8], Addelman [A1], Raghavarao [R1], Seiden [S3], and Taguchi [T1]. The process of fitting an orthogonal array to a specific project has been made particularly easy by a graphical tool, called linear graphs, developed by Taguchi to represent interactions between pairs of columns in an orthogonal array. This chapter shows the use of linear graphs and a set of standard orthogonal arrays for constructing orthogonal arrays to fit a specific project.

Before constructing an orthogonal array, the following requirements must be defined:

1. Number of factors to be studied
2. Number of levels for each factor
3. Specific 2-factor interactions to be estimated
4. Special difficulties that would be encountered in running the experiments
This chapter describes how to construct an orthogonal array to meet these requirements and consists of the following eleven sections:

Section 7.1 describes how to determine the minimum number of rows for the matrix experiment by counting the degrees of freedom.

Section 7.2 lists a number of standard orthogonal arrays and a procedure for selecting one in a specific case study. A novice to Robust Design may wish to use a standard array that is closest to the needs of the case study, and if necessary, slightly modify the case study to fit a standard array. The remaining sections in this chapter describe various techniques of modifying the standard orthogonal arrays to construct an array to fit the case study.

Section 7.3 describes the dummy level method, which is useful for assigning a factor whose number of levels is less than the number of levels in a column of the chosen orthogonal array.

Section 7.4 discusses the compound factor method, which can be used to assign two factors to a single column in the array.

Section 7.5 describes Taguchi's linear graphs and how to use them to assign interactions to columns of the orthogonal array.

Section 7.6 presents a set of rules for modifying a linear graph to fit the needs of a case study.

Section 7.7 describes the column merging method, which is useful for merging columns in a standard orthogonal array to create columns with a larger number of levels.

Section 7.8 describes process branching and shows how to use the linear graphs to construct an appropriate orthogonal array for case studies involving process branching.

Section 7.9 presents three step-by-step strategies (beginner, intermediate, and advanced) for constructing an orthogonal array.

Section 7.10 describes the differences between Robust Design and classical statistical experiment design.

Section 7.11 summarizes the important points of this chapter.

7.1 COUNTING DEGREES OF FREEDOM

The first step in constructing an orthogonal array to fit a specific case study is to count the total degrees of freedom, which tells the minimum number of experiments that must be performed to study all the chosen control factors. To begin with, one degree of freedom is associated with the overall mean regardless of the number of control factors to be studied. A 3-level control factor counts for two degrees of freedom because for a 3-level factor, A, we are interested in two comparisons: taking any one level, A1, as the base level, we want to know how the response changes when we change the level to A2 or A3. In general, the number of degrees of freedom associated with a factor is equal to one less than the number of levels for that factor.

The degrees of freedom associated with the interaction between two factors, called A and B, are given by the product of the degrees of freedom for each of the two factors. This can be seen as follows. Let n_A and n_B be the numbers of levels for factors A and B. Then, there are n_A n_B total combinations of the levels of these two factors. From that we subtract one degree of freedom for the overall mean, (n_A − 1) for the degrees of freedom of A, and (n_B − 1) for the degrees of freedom of B. Thus,

    Degrees of freedom for interaction A × B = n_A n_B − 1 − (n_A − 1) − (n_B − 1)
                                             = (n_A − 1)(n_B − 1)
                                             = (degrees of freedom for A) × (degrees of freedom for B).
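The counting rule is easy to mechanize. The short sketch below is an illustrative aid, not part of the text; the function name and the Python rendering are my own. It tallies the degrees of freedom for a set of factors and any 2-factor interactions to be estimated, and the numbers match Example 1, which follows.

```python
def degrees_of_freedom(factor_levels, interactions=()):
    """Count degrees of freedom for a matrix experiment.

    factor_levels: dict mapping factor name -> number of levels.
    interactions: iterable of (factor, factor) pairs whose 2-factor
        interactions are to be estimated.
    """
    dof = 1  # one degree of freedom for the overall mean
    for levels in factor_levels.values():
        dof += levels - 1  # (number of levels - 1) per factor
    for a, b in interactions:
        dof += (factor_levels[a] - 1) * (factor_levels[b] - 1)
    return dof

# One 2-level factor A, five 3-level factors B..F, and the A x B interaction.
levels = {"A": 2, "B": 3, "C": 3, "D": 3, "E": 3, "F": 3}
print(degrees_of_freedom(levels, interactions=[("A", "B")]))  # -> 14
```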
Example 1: Let us illustrate the computation of the degrees of freedom. Suppose a case study has one 2-level factor (A) and five 3-level factors (B, C, D, E, F), and we are interested in estimating the interaction A × B. The degrees of freedom for this experiment are then computed as follows:

  Factor/Interaction       Degrees of Freedom
  Overall mean             1
  A                        2 − 1 = 1
  B, C, D, E, F            5 × (3 − 1) = 10
  A × B                    (2 − 1) × (3 − 1) = 2
  Total                    14

So, we must conduct at least 14 experiments to be able to estimate the effect of each factor and the desired interaction.

7.2 SELECTING A STANDARD ORTHOGONAL ARRAY

Taguchi [T1] has tabulated 18 basic orthogonal arrays that we call standard orthogonal arrays (see Appendix C). Most of these arrays can also be found in somewhat different forms in one or more of the following references: Addelman [A1], Box, Hunter, and Hunter [B3], Cochran and Cox [C3], John [J2], Kempthorne [K4], Plackett and Burman [P8], Raghavarao [R1], Seiden [S3], and Diamond [D3]. In many case studies, one of the arrays from Appendix C can be used directly to plan a matrix experiment.

An array's name indicates the number of rows and columns it has, and also the number of levels in each of the columns. Thus, the array L4(2³) has four rows and three 2-level columns. The array L18(2¹3⁷) has 18 rows: one 2-level column and seven 3-level columns. Thus, there are eight columns in the array L18(2¹3⁷). For brevity, we generally refer to an array only by the number of rows. When there are two arrays with the same number of rows, we refer to the second array by a prime. Thus, the two arrays with 36 rows are referred to as L36 and L36'. The 18 standard orthogonal arrays along with the number of columns at different levels for these arrays are listed in Table 7.1.

TABLE 7.1 STANDARD ORTHOGONAL ARRAYS

                                              Number of Columns at These Levels
  Orthogonal    Number      Maximum Number
  Array*        of Rows     of Factors          2      3      4      5
  L4               4            3               3      -      -      -
  L8               8            7               7      -      -      -
  L9               9            4               -      4      -      -
  L12             12           11              11      -      -      -
  L16             16           15              15      -      -      -
  L16'            16            5               -      -      5      -
  L18             18            8               1      7      -      -
  L25             25            6               -      -      -      6
  L27             27           13               -     13      -      -
  L32             32           31              31      -      -      -
  L32'            32           10               1      -      9      -
  L36             36           23              11     12      -      -
  L36'            36           16               3     13      -      -
  L50             50           12               1      -      -     11
  L54             54           26               1     25      -      -
  L64             64           63              63      -      -      -
  L64'            64           21               -      -     21      -
  L81             81           40               -     40      -      -

  * 2-level arrays: L4, L8, L12, L16, L32, L64. 3-level arrays: L9, L27, L81.
    Mixed 2- and 3-level arrays: L18, L36, L36', L54.

The number of rows of an orthogonal array represents the number of experiments. In order for an array to be a viable choice, the number of rows must be at least equal to the degrees of freedom required for the case study. The number of columns of an array represents the maximum number of factors that can be studied using that array. Further, in order to use a standard orthogonal array directly, we must be able to match the number of levels of the factors with the numbers of levels of the columns in the array.

Usually, it is expensive to conduct experiments. Therefore, we use the smallest possible orthogonal array that meets the requirements of the case study. However, in some situations we allow a somewhat larger array so that the additivity of the factor effects can be tested adequately, as discussed in Chapter 8 in conjunction with the differential operational amplifier case study.
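As an illustrative aid (not part of the text), the sketch below encodes a few rows of Table 7.1 and picks the smallest listed array that has enough rows and enough columns at each required level. It covers only the direct use of a standard array, before any of the modification techniques described later in the chapter; the function name and data layout are my own, and the numbers correspond to Examples 2 and 3 below.

```python
# (name, rows, {level: number_of_columns}) for a few standard arrays
STANDARD_ARRAYS = [
    ("L4",  4,  {2: 3}),
    ("L8",  8,  {2: 7}),
    ("L9",  9,  {3: 4}),
    ("L12", 12, {2: 11}),
    ("L16", 16, {2: 15}),
    ("L18", 18, {2: 1, 3: 7}),
    ("L27", 27, {3: 13}),
    ("L36", 36, {2: 11, 3: 12}),
]

def smallest_standard_array(required_dof, needed_columns):
    """Return the smallest listed array with enough rows and columns.

    needed_columns: dict mapping level -> number of factors at that level,
    for example {2: 1, 3: 6} for one 2-level and six 3-level factors.
    """
    for name, rows, columns in STANDARD_ARRAYS:
        if rows < required_dof:
            continue
        if all(columns.get(level, 0) >= count
               for level, count in needed_columns.items()):
            return name
    return None  # no direct fit; modify an array or the case study

print(smallest_standard_array(8, {2: 7}))         # -> "L8"  (Example 2)
print(smallest_standard_array(14, {2: 1, 3: 6}))  # -> "L18" (Example 3)
```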
Let us consider some examples to illustrate the choice of standard orthogonal arrays.

Example 2: A case study has seven 2-level factors, and we are only interested in main effects. Here, there are a total of eight degrees of freedom: one for the overall mean and seven for the seven 2-level factors. Thus, the smallest array that can be used must have eight or more rows. The array L8 has seven 2-level columns and, hence, fits this case study perfectly; each column of the array will have one factor assigned to it.

Example 3: A case study has one 2-level factor and six 3-level factors. This case study has 14 degrees of freedom: one for the overall mean, one for the 2-level factor, and twelve for the six 3-level factors. Looking at Table 7.1, we see that the smallest array with at least 14 rows is L16. But this array has fifteen 2-level columns. We cannot directly assign these columns to the 3-level factors. The next larger array is L18, which has one 2-level and seven 3-level columns. Here, we can assign the 2-level factor to the 2-level column and the six 3-level factors to six of the seven 3-level columns, keeping one 3-level column empty. Orthogonality of a matrix experiment is not lost by keeping one or more columns of an array empty. So, L18 is a good choice for this experiment. In a situation like this, we should take another look at the control factors to see if there is an additional control factor to be studied, which we may have ignored as less important. If one exists, it should be assigned to the empty column. Doing this allows us a chance to gain information about this additional factor without spending any more resources.

Example 4: Suppose a case study has two 2-level and three 3-level factors. The degrees of freedom for this case study are nine. However, L9 cannot be used directly because it has no 2-level columns. Similarly, the next larger array L12 cannot be used directly because it has no 3-level columns. This line of thinking can be extended all the way through the array L27. The smallest array that has at least two 2-level columns and three 3-level columns is L36. However, if we selected L36, we would be effectively wasting 36 − 9 = 27 degrees of freedom, which would be very inefficient experimentation. This raises the question of whether these standard orthogonal arrays are flexible enough to be modified to accommodate various situations. The answer is yes, and the subsequent sections of this chapter describe the different techniques of modifying orthogonal arrays.

Difficulty in Changing the Levels of a Factor

The columns of the standard orthogonal arrays given in Appendix C are arranged in increasing order of the number of changes; that is, the number of times the level of a factor has to be changed in running the experiments in the numerical order is less for the columns on the left when compared to the columns on the right. Consequently, we should assign a factor whose levels are difficult to change to columns on the left and vice versa.

7.3 DUMMY LEVEL TECHNIQUE

The dummy level technique allows us to assign a factor with m levels to a column that has n levels, where n is greater than m. Suppose a factor A has two levels, A1 and A2. We can assign it to a 3-level column by creating a dummy level A3, which could be taken the same as A1 or A2.

Example 5: Let us consider a case study that has one 2-level factor (A) and three 3-level factors (B, C, and D) to illustrate the dummy level technique. Here we have eight degrees of freedom. Table 7.2(a) shows the L9 array and Table 7.2(b) shows the experiment layout generated by assigning the factors A, B, C, and D to columns 1, 2, 3, and 4, respectively, and by using the dummy level technique. Here we have taken A3 = A1 and called it A1' to emphasize that this is a dummy level.
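The following sketch (illustrative only; the function and variable names are mine) applies the dummy level technique of Example 5 to column 1 of the L9 array and shows the resulting pattern of runs.

```python
from collections import Counter

L9 = [
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
]

def dummy_level(column_levels, mapping):
    """Replace selected levels of a column by existing levels (dummy levels)."""
    return [mapping.get(x, x) for x in column_levels]

# Factor A has only two levels, so level 3 of column 1 is replaced by a
# repeat of level 1 (written A1' in Table 7.2(b)).
col_A = dummy_level([row[0] for row in L9], {3: 1})
print(col_A)                           # [1, 1, 1, 2, 2, 2, 1, 1, 1]
print(col_A.count(1), col_A.count(2))  # 6 and 3: A1 gets twice as many runs

# Pairing with column 2 shows each (1, b) combination twice and each (2, b)
# combination once, i.e., the modified array stays proportionally balanced.
print(Counter(zip(col_A, [row[1] for row in L9])))
```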
Note that after we apply the dummy level technique, the resulting array is still proportionally balanced and, hence, orthogonal (see Appendix A and Chapter 3). Also, note that in Example 5, we could just as well have taken A3 = A2. But to ensure orthogonality, we must consistently take A3 = A1 or A3 = A2 within the matrix experiment. The choice between taking A3 = A1 or A3 = A2 depends on many issues. Some of the key issues are as follows:

1. If we take A3 = A2, then the effect of A2 will be estimated with two times more precision than the effect of A1. Thus, the dummy level should be taken to be the one about which we want more precise information. Thus, if A1 is the starting condition about which we have a fair amount of experience and A2 is the new alternative, then we should choose A3 = A2.

2. Availability of experimental resources and ease of experimentation also plays a role here. Thus, if A1 and A2 are two different raw materials and A1 is very scarce, then we may choose A3 = A2 so that the matrix experiment can be finished in a reasonable time.

TABLE 7.2 DUMMY LEVEL AND COMPOUND FACTOR TECHNIQUES

(a) L9 array
  Expt.        Column Number
  No.       1     2     3     4
  1         1     1     1     1
  2         1     2     2     2
  3         1     3     3     3
  4         2     1     2     3
  5         2     2     3     1
  6         2     3     1     2
  7         3     1     3     2
  8         3     2     1     3
  9         3     3     2     1

(b) Experiment layout for dummy level technique (Example 5)
  Expt.        Column Number
  No.       1     2     3     4
  1         A1    B1    C1    D1
  2         A1    B2    C2    D2
  3         A1    B3    C3    D3
  4         A2    B1    C2    D3
  5         A2    B2    C3    D1
  6         A2    B3    C1    D2
  7         A1'   B1    C3    D2
  8         A1'   B2    C1    D3
  9         A1'   B3    C2    D1
            A     B     C     D      Factor assignment

(c) Experiment layout for compound factor technique (Example 6)
  Expt.        Column Number
  No.       1      2     3     4
  1         A1E1   B1    C1    D1
  2         A1E1   B2    C2    D2
  3         A1E1   B3    C3    D3
  4         A1E2   B1    C2    D3
  5         A1E2   B2    C3    D1
  6         A1E2   B3    C1    D2
  7         A2E1   B1    C3    D2
  8         A2E1   B2    C1    D3
  9         A2E1   B3    C2    D1
            AE     B     C     D      Factor assignment

One can apply the dummy level technique to more than one factor in a given case study. Suppose in Example 5 there were two 2-level factors (A and B) and two 3-level factors (C and D). We can assign the four factors to the columns of the orthogonal array L9 by taking dummy levels A3 = A1' (or A3 = A2') and B3 = B1' (or B3 = B2'). Note that the orthogonality is preserved even when the dummy level technique is applied to two or more factors.

The dummy level technique can be further generalized, without losing orthogonality, to assign an m-level factor to an n-level column where m is less than n. For example, for studying the effect of clearance defects and other manufacturing parameters on the reliability of printed wiring boards (described by Phadke, Swann, and Hill [P6], and Mitchell [M1]), a 6-level factor (A) was assigned to a 9-level column by taking A7 = A1', A8 = A2', and A9 = A3'.

7.4 COMPOUND FACTOR METHOD

The compound factor method allows us to study more factors with an orthogonal array than the number of columns in the array. It can be used to assign two 2-level factors to a 3-level column as follows. Let A and B be two 2-level factors. There are four total combinations of the levels of these factors: A1B1, A1B2, A2B1, and A2B2. We pick three of the more important combinations and call them the three levels of the compound factor AB. Suppose we choose the three levels as follows: (AB)1 = A1B1, (AB)2 = A1B2, and (AB)3 = A2B1. Factor AB can be assigned to a 3-level column and the effects of A and B can be studied along with the effects of the other factors in the experiment.
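A minimal sketch of the compound factor assignment (illustrative only; the names are mine): the three chosen combinations of A and B are mapped onto the three levels of one L9 column.

```python
L9_col1 = [1, 1, 1, 2, 2, 2, 3, 3, 3]   # levels of column 1 of the L9 array

# Compound factor AB: three of the four combinations of the 2-level
# factors A and B are assigned to the three levels of the column.
compound = {1: ("A1", "B1"), 2: ("A1", "B2"), 3: ("A2", "B1")}

for run, level in enumerate(L9_col1, start=1):
    a, b = compound[level]
    print(run, a, b)
# A and B are each balanced against the factors on the remaining columns,
# but not against each other (A2 occurs only together with B1).
```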
For computing the effects of the factors A and B, we can proceed as follows: the difference between the level means for (AB); and (AB), tells us the effect of changing from B, to Bz. Similarly, the difference between the level means for (AB); and (AB); tells us the effect of changing from A, to Az. In the compound factor method, however, there is a partial loss of orthogonality. The two compounded factors are not orthogonal to each other. But each of them is orthogonal to every other factor in the experiment. This complicates the computation of the sum of squares for the compounded factors in constructing the ANOVA table. The following examples help illustrate the use of the compound factor method. Example 6: Let us go back to Example 4 in Section 7.2 where the case study has two 2-level fac- tors (A and E) and three 3-level factors (B, C, and D). We can form a compound fac- tor AE with three levels (AE), = AE), (AE)2 = A,Ep and (AE); = AE\. This leads us to four 3-level factors that can be assigned to the Lig orthogonal array. See Table 7.2(c) for the experiment layout obtained by assigning factors AE, B, C, and D to columns 1, 2, 3, and 4, respectively. Example 7: The window photolithography case study described by Phadke, Kackar, Speeney and Grieco [P5] had three 2-level factors (A, B, and D) and six 3-level factors (C, E, F, G, H, and 1). The total degrees of freedom for the case study are sixteen. The next larger standard orthogonal array that has several 3-level factors is Lj (2' x 37). The experi- menters formed a compound factor BD with three levels (BD), =B,D1, (BD): =B 2D, and (BD);=B D>. This gave them one 2-level and seven 3-level factors that match perfectly with the columns of the Lig array. Reference [PS] also describes the compu- tation of ANOVA for the compound factor method. As a matter of fact, the experimenters had started the case study with two 2-level factors (A and B) and seven 3-level factors (C through I). However, observing that by Sec. 7.5 Linear Graphs and Interaction Assignment 187 dropping one level of one of the 3-level factors, the Lj, orthogonal array would be suitable, they dropped the least important level of the least important factor, namely factor D. Had they not made this modification to the requirements of the case study, they would have needed to use the L7 orthogonal array, which would have amounted to 50 percent more experiments! As illustrated by this example, the experimenter should always consider the possibility of making small modifications in the require- ments for saving the experimental effort. 7.5 LINEAR GRAPHS AND INTERACTION ASSIGNMENT Sections 7.2 through 7.4 considered the situations where we are not interested in estimating any interaction effects. Although in most Robust Design experiments we choose not to estimate any interactions among the control factors, there are situations where we wish to estimate a few selected interactions. The linear graph technique, invented by Taguchi, makes it easy to plan orthogonal array experiments involving interactions. Confounding of Interactions with Factor Effects Let us consider the orthogonal array Lg [Table 7.3 (a)] and suppose we assigned fac- tors A, B, C, D, E, F, and G to the columns 1 through 7, respectively. Suppose we believe that factors A and B are likely to have strong interaction, What effect would the interaction have on the estimates of the effects of the seven factors obtained from this matrix experiment? The interaction effect is depicted in Figure 7.1. 
We can measure the magnitude of interaction by the extent of nonparallelism of the effects shown in Figure 7.1. Thus,

    A × B interaction = (y_A2B2 − y_A2B1) − (y_A1B2 − y_A1B1)
                      = (y_A2B2 + y_A1B1) − (y_A2B1 + y_A1B2),

where y_AiBj denotes the response under the combination AiBj.

From Table 7.3(a) we see that experiments under level C1 of factor C (experiments 1, 2, 7, and 8) have combinations A1B1 and A2B2 of factors A and B; and experiments under level C2 of factor C (experiments 3, 4, 5, and 6) have combinations A1B2 and A2B1 of factors A and B. Thus, we will not be able to distinguish the effect of factor C from the A × B interaction. Inability to distinguish effects of factors and interactions is called confounding. Here we say that factor C is confounded with interaction A × B. We can avoid the confounding by not assigning any factor to column 3 of the array L8.

[Figure 7.1 2-factor interaction. Interaction between factors A and B shows as nonparallelism of the effects of factor A under levels B1 and B2 of factor B.]

Interaction Table

The interaction table, shown in Table 7.3(b), shows, for every pair of columns of the L8 array, the column with which their interaction is confounded (or in which it is contained). Thus, it can be used to determine which column of the L8 array should be kept empty (that is, not assigned to a factor) in order to estimate a particular interaction. From the table, we see that the interaction of columns 1 and 2 is confounded with column 3, the interaction of columns 3 and 5 is confounded with column 6, and so on. Note that the interaction between columns a and b is the same as that between columns b and a. That is, the interaction table is a symmetric matrix. Hence, only the upper triangle is given in the table, and the lower triangle is kept blank. Also, the diagonal terms are indicated in parentheses as there is no real meaning to interaction between columns a and a.

The interaction table contains all the relevant information needed for assigning factors to columns of the orthogonal array so that all main effects and desired interactions can be estimated without confounding. The interaction tables for all standard orthogonal arrays prepared by Taguchi [T1] are given in Appendix C, except for the arrays where the interaction tables do not exist, and for the arrays L64, L64', and L81, because they are used rather infrequently. The interaction tables are generated directly from the linear algebraic relations that were used in creating the orthogonal arrays themselves.

TABLE 7.3 L8 ORTHOGONAL ARRAY AND ITS INTERACTION TABLE

(a) L8(2⁷) orthogonal array
  Expt.              Column
  No.       1    2    3    4    5    6    7
  1         1    1    1    1    1    1    1
  2         1    1    1    2    2    2    2
  3         1    2    2    1    1    2    2
  4         1    2    2    2    2    1    1
  5         2    1    2    1    2    1    2
  6         2    1    2    2    1    2    1
  7         2    2    1    1    2    2    1
  8         2    2    1    2    1    1    2
            A    B    C    D    E    F    G      Factor assignment

(b) Interaction table for L8
  Column    1     2     3     4     5     6     7
  1        (1)    3     2     5     4     7     6
  2              (2)    1     6     7     4     5
  3                    (3)    7     6     5     4
  4                          (4)    1     2     3
  5                                (5)    3     2
  6                                      (6)    1
  7                                            (7)

  Note: Entries in this table show the column with which the interaction between
  every pair of columns is confounded.

Linear Graphs

Using the interaction tables, however, is not very convenient. Linear graphs represent the interaction information graphically and make it easy to assign factors and interactions to the various columns of an orthogonal array. In a linear graph, the columns of an orthogonal array are represented by dots and lines. When two dots are connected by a line, it means that the interaction of the two columns represented by the dots is contained in (or confounded with) the column represented by the line.
In a linear graph, each dot and each line has a distinct column number (or numbers) associated with it. Further, every column of the array is represented in its linear graph once and only once. One standard linear graph for the array L8 is given in Figure 7.2(a). It has four dots (or nodes) corresponding to columns 1, 2, 4, and 7. Also, it has three lines (or edges) representing columns 3, 6, and 5. These lines correspond to the interactions between columns 1 and 2, between columns 2 and 4, and between columns 1 and 4, respectively. From the interaction table, Table 7.3(b), we can verify that columns 3, 6, and 5 indeed correspond to the interactions mentioned above.

In general, a linear graph does not show the interaction between every pair of columns of the orthogonal array. It is not intended to do so; that information is contained in the interaction table. Thus, the interactions between columns 1 and 3, between columns 2 and 7, etc., are not shown in the linear graph of L8 in Figure 7.2(a).

[Figure 7.2 Two standard linear graphs of L8.]

The other standard linear graph for L8 is given in Figure 7.2(b). It, too, has four dots corresponding to columns 1, 2, 4, and 7. Also, it has three lines representing columns 3, 5, and 6. Here, these lines correspond to the interactions between columns 1 and 2, between columns 1 and 4, and between columns 1 and 7, respectively. Let us see some examples of how these linear graphs can be used.

In general, an orthogonal array can have many linear graphs. Each linear graph, however, must be consistent with the interaction table of the orthogonal array. The different linear graphs are useful for planning case studies having different requirements. Taguchi [T1] has prepared many linear graphs, called standard linear graphs, for each orthogonal array. Some of the important standard linear graphs are given in Appendix C. Note that the linear graphs for the orthogonal arrays L64 and L81 are not given in Appendix C because they are needed rather infrequently. However, they can be found in Taguchi [T1]. Section 7.6 describes the rules for modifying linear graphs to fit them to the needs of a given case study.

Example 8: Suppose in a case study there are four 2-level factors A, B, C, and D. We want to estimate their main effects and also the interactions A × B, B × C, and B × D. Here, the total degrees of freedom are eight, so L8 is a candidate array. The linear graph in Figure 7.2(b) can be used directly here. The obvious column assignment is: factor B should be assigned to column 1. Factors A, C, and D can be assigned in an arbitrary order to columns 2, 4, and 7. Suppose we assign factors A, C, and D to columns 2, 4, and 7, respectively. Then the interactions A × B, B × C, and B × D can be obtained from columns 3, 5, and 6, respectively. These columns must be kept empty. Table 7.4 shows the corresponding experiment layout.

TABLE 7.4 ASSIGNMENT OF FACTORS AND INTERACTIONS: EXPERIMENT LAYOUT USING ARRAY L8

  Expt.                   Column*
  No.       1     2     3     4     5     6     7
  1         B1    A1          C1                D1
  2         B1    A1          C2                D2
  3         B1    A2          C1                D2
  4         B1    A2          C2                D1
  5         B2    A1          C1                D2
  6         B2    A1          C2                D1
  7         B2    A2          C1                D1
  8         B2    A2          C2                D2
            B     A    A×B    C    B×C   B×D    D      Factor assignment

  * Note that columns 3, 5, and 6 are left empty (no factors are assigned) so that
    the interactions A × B, B × C, and B × D can be estimated.
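To see the confounding of Section 7.5 numerically, the sketch below (illustrative, not from the text) groups the (column 1, column 2) combinations of the L8 array by the level of column 3. Column 3 is completely determined by columns 1 and 2, which is why it must be kept empty when the interaction of the factors assigned to columns 1 and 2 is to be estimated.

```python
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]

# Group the (column 1, column 2) combinations by the level of column 3.
groups = {1: set(), 2: set()}
for row in L8:
    groups[row[2]].add((row[0], row[1]))

print(groups[1])  # combinations {(1, 1), (2, 2)} -> experiments 1, 2, 7, 8
print(groups[2])  # combinations {(1, 2), (2, 1)} -> experiments 3, 4, 5, 6
# Column 3 is fixed once columns 1 and 2 are fixed, so a factor assigned to
# column 3 cannot be separated from the interaction of columns 1 and 2.
```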
Estimating an interaction means determining the nonparallelism of the factor effects. To estimate an interaction, we prepare a 2-way table from the observed data. For example, to estimate the A × B interaction in Example 8 we prepare the following table whose rows correspond to the levels of factor A, columns correspond to the levels of factor B, and entries correspond to the average response for the particular combination of the levels of factors A and B:

                              Level of factor B
                              B1                B2
  Level of       A1      (y1 + y2)/2       (y5 + y6)/2
  factor A       A2      (y3 + y4)/2       (y7 + y8)/2

In the above table, y_i stands for the response for experiment i. Experiments 1 and 2 are conducted at levels A1 and B1 of factors A and B (see Table 7.4). Accordingly, the entry in the A1B1 position is (y1 + y2)/2. The entries in the other positions of the table are determined similarly. The data of the above 2-way table can be plotted to display the A × B interaction. The interactions B × C and B × D can be estimated in the same manner. In fact, this estimation procedure can be used regardless of the number of levels of a factor.

Example 9: Suppose there are five 2-level factors A, B, C, D, and E. We want to estimate their main effects and also the interactions A × B and B × C. Here, also, the number of degrees of freedom needed is eight, making L8 a candidate array. However, neither of the two standard linear graphs of L8 can be used directly. Section 7.6 shows how the linear graphs can be modified so that a wide variety of experiment designs can be constructed conveniently.

Linear Graphs for 3-level Factors

So far in this section we have discussed the interaction between two 2-level factors. The concept can be extended to situations involving factors with a higher number of levels. Figure 7.3(a) shows an example of no interaction between two 3-level factors, whereas Figures 7.3(b) and (c) show examples where interaction exists between two 3-level factors.

[Figure 7.3 Examples of interaction: (a) no interaction; (b) synergistic interaction; (c) antisynergistic interaction.]

Linear graphs and interaction tables for the arrays L9, L27, etc., which have 3-level columns, are slightly more complicated than those for arrays with 2-level columns. Each column of a 3-level factor has two degrees of freedom associated with it. The interaction between two 3-level columns has four degrees of freedom. Hence, to estimate the interaction between two 3-level factors, we must keep two 3-level columns empty, in contrast to only one column needed to be kept empty for 2-level orthogonal arrays. This fact is reflected in the interaction tables and linear graphs shown in Appendix C.

As discussed repeatedly in earlier chapters, we generally do not study interactions in Robust Design. Then why study linear graphs? The answer is because linear graphs are useful in modifying orthogonal arrays to fit specific case studies. The following three sections describe the rules for modifying linear graphs and their use in modifying orthogonal arrays.
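The 2-way table computation can be written out directly. In the sketch below the responses y1, ..., y8 are hypothetical numbers chosen only for illustration; the factor levels are those of Table 7.4.

```python
# Hypothetical responses for the eight runs of Table 7.4.
y = [52, 48, 60, 58, 47, 45, 70, 66]

# Factor levels taken from Table 7.4: B is on column 1, A is on column 2.
B_levels = [1, 1, 1, 1, 2, 2, 2, 2]
A_levels = [1, 1, 2, 2, 1, 1, 2, 2]

def cell_mean(a_level, b_level):
    vals = [yi for yi, a, b in zip(y, A_levels, B_levels)
            if a == a_level and b == b_level]
    return sum(vals) / len(vals)

two_way = {(a, b): cell_mean(a, b) for a in (1, 2) for b in (1, 2)}
print(two_way)  # {(1, 1): 50.0, (1, 2): 46.0, (2, 1): 59.0, (2, 2): 68.0}

# Nonparallelism (A x B interaction), as in the expression of Section 7.5:
interaction = ((two_way[(2, 2)] - two_way[(2, 1)])
               - (two_way[(1, 2)] - two_way[(1, 1)]))
print(interaction)  # -> 13.0 with these illustrative numbers
```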
7.6 MODIFICATION OF LINEAR GRAPHS

The previous section showed how linear graphs can be used to assign main effects and interactions to the columns of standard orthogonal arrays. However, the principal utility of linear graphs is for creating a variety of different orthogonal arrays from the standard ones to fit real problems. The linear graphs are useful for creating 4-level columns in 2-level orthogonal arrays, 9-level columns in 3-level orthogonal arrays, and 6-level columns in mixed 2- and 3-level orthogonal arrays. They are also useful for constructing orthogonal arrays for process branching. Sections 7.7 and 7.8 describe these techniques. Common to all these applications of linear graphs is the need to modify a standard linear graph of an orthogonal array so that it matches the linear graph required by a particular problem.

A linear graph for an orthogonal array must be consistent with the interaction table associated with that array; that is, every line in a linear graph must represent the interaction between the two columns represented by the dots it connects. In the following discussion we assume that for 2-level orthogonal arrays, the interaction between columns a and b is contained in column c. Also, the interaction between columns f and g is contained in column c. If it is a 3-level orthogonal array, we assume that the interaction between columns a and b is contained in columns c and d. Also, the interaction between columns f and g is contained in columns c and d.

The following three rules can be used for modifying a linear graph to suit the needs of a specific case study.

1. Breaking a line. In the case of a 2-level orthogonal array, a line connecting two dots, a and b, can be removed and replaced by a dot. The column associated with this dot is the same as the column associated with the line it was created from. In the case of linear graphs for 3-level orthogonal arrays, a line has two columns associated with it and it maps into two dots. Figures 7.4(a) and (b) show this rule diagrammatically.

2. Forming a line. A line can be added in the linear graph of a 2-level orthogonal array to connect two dots, a and b, provided we remove the dot c associated with the interaction between a and b. In the case of the linear graphs for a 3-level orthogonal array, two dots c and d, which contain the interaction of a and b, must be removed. The particular dot or dots to be removed can be determined from the interaction table for the orthogonal array. Figures 7.4(c) and (d) show this rule diagrammatically.

3. Moving a line. This rule is really a combination of the preceding two rules. A line connecting two dots a and b can be removed and replaced by a line joining another set of two dots, say f and g, provided the interactions a × b and f × g are contained in the same column or columns. This rule is diagrammatically shown in Figures 7.4(e) and (f).

[Figure 7.4 Modification of linear graphs: breaking a line, forming a line, and moving a line, shown for 2-level and for 3-level orthogonal arrays. Modification rule: a, b, and c are 2-level columns, and the interaction of columns a and b is in column c; for 3-level arrays, the interaction of columns a and b is in columns c and d.]

The following examples illustrate the modification of linear graphs.

Example 10: Consider Example 9 in Section 7.5. The standard linear graph of L8 in Figure 7.5(a) can be changed into the linear graph shown in Figure 7.5(b) by breaking line 6. This modified linear graph matches the problem perfectly. The factors A, B, C, D, and E should be assigned, respectively, to columns 2, 1, 4, 6, and 7. The A × B and B × C interactions can be estimated by keeping columns 3 and 5 empty.

[Figure 7.5 Standard and modified linear graphs of L8: (a) a standard linear graph of L8; (b) the modified linear graph obtained by breaking line 6.]
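For the standard 2-level arrays such as L8, the interaction table can be reproduced by writing the column numbers in binary and taking their bitwise exclusive-or (compare with Table 7.3(b)). The sketch below (illustrative only; the names are mine) uses that shortcut to check the column assignment of Example 10.

```python
def l8_interaction(col_a, col_b):
    """Column containing the interaction of two L8 columns (bitwise XOR)."""
    return col_a ^ col_b

# Example 10: A, B, C, D, E on columns 2, 1, 4, 6, 7; estimate A x B and B x C.
assignment = {"A": 2, "B": 1, "C": 4, "D": 6, "E": 7}
wanted = [("A", "B"), ("B", "C")]

occupied = set(assignment.values())
for f, g in wanted:
    col = l8_interaction(assignment[f], assignment[g])
    status = "kept empty" if col not in occupied else "CONFLICT"
    print(f"{f} x {g} -> column {col} ({status})")
# A x B -> column 3, B x C -> column 5; both columns are empty, as required.
```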
Example 11: The purpose of this example is to illustrate rule 3, namely moving a line. Figure 7.6(a) shows one of the standard linear graphs of the orthogonal array L16. It can be changed into Figure 7.6(b) by breaking the line connecting columns 6 and 11, and adding an isolated dot for column 13. This can be further turned into Figure 7.6(c) by adding a line to connect columns 7 and 10, and simultaneously removing the isolated dot 13.

[Figure 7.6 An example of linear graph modification: (a) a standard linear graph of L16; (b) the modified linear graph obtained by breaking line 13 in (a); (c) the modified linear graph obtained by forming a line between dots 7 and 10 in (b). The interaction of columns 7 and 10 is column 13.]

7.7 COLUMN MERGING METHOD

The column merging method can be used to create a 4-level column in a standard orthogonal array with all 2-level columns, a 9-level column in a standard orthogonal array with all 3-level columns, and a 6-level column in a standard orthogonal array with some 2-level and some 3-level columns.

To create a 4-level column in a standard orthogonal array with 2-level columns, we merge any two columns and their interaction column. For example, in the L8 array the interaction of columns 1 and 2 lies in column 3. Thus, these three columns can be merged to form a 4-level column. Note that the three columns that are merged have one degree of freedom each; thus, together they have the three degrees of freedom needed for the 4-level column.

Suppose columns a, b, and c (the column containing the interaction of a and b) are designated to form a 4-level column. The steps in forming the 4-level column are:

1. Create a new column called abc as follows:
   For the combination (1,1) in columns a and b, write 1 in column abc.
   For the combination (1,2) in columns a and b, write 2 in column abc.
   For the combination (2,1) in columns a and b, write 3 in column abc.
   For the combination (2,2) in columns a and b, write 4 in column abc.

2. Remove columns a, b, and c from the array. These columns cannot be used to study any other factors or interactions.

The creation of a 4-level column using columns 1, 2, and 3 of L8 is shown in Table 7.5. It can be checked that the resulting array is still balanced and, hence, orthogonal. It can be used to study one 4-level factor and up to four 2-level factors.

TABLE 7.5 COLUMN MERGING METHOD: CREATION OF A 4-LEVEL COLUMN IN L8

(a) L8 array: we designate a = column 1, b = column 2; the a × b interaction is in column 3.
  Expt.              Column
  No.       1    2    3    4    5    6    7
  1         1    1    1    1    1    1    1
  2         1    1    1    2    2    2    2
  3         1    2    2    1    1    2    2
  4         1    2    2    2    2    1    1
  5         2    1    2    1    2    1    2
  6         2    1    2    2    1    2    1
  7         2    2    1    1    2    2    1
  8         2    2    1    2    1    1    2
            a    b    c = a×b

(b) Modified L8 array: columns 1, 2, and 3 are merged to form a 4-level column.
  Expt.            Column
  No.       (123)    4    5    6    7
  1           1      1    1    1    1
  2           1      2    2    2    2
  3           2      1    1    2    2
  4           2      2    2    1    1
  5           3      1    2    1    2
  6           3      2    1    2    1
  7           4      1    2    2    1
  8           4      2    1    1    2

In the linear graph of a 2-level orthogonal array we represent a 4-level factor by two dots and a line connecting them. The column merging procedure above generalizes to orthogonal arrays with columns other than only 2-level columns.
Thus, to form a 9-level column in a stan- dard orthogonal array with 3-level columns, we follow the same procedure as above except we must merge four columns: two columns from which we form the 9-level column and the two columns containing their interactions. 7.8 BRANCHING DESIGN A process for applying covercoat on printed wiring boards consists of (1) spreading the covercoat material (a viscous liquid) on a board, and (2) baking the board to form a hard covercoat layer. Suppose, to optimize this process, we wish to study two types of material (factor A), two methods of spreading (factor B) and two methods of baking (factor C). The two methods of baking are a conventional oven (C;) and an infrared oven (C2). For the conventional oven there are two additional control factors, bake temperature (factor D, two levels) and bake time (factor E, two levels), Whereas for the infrared oven, there are two different control factors: infrared light intensity (factor F, two levels) and conveyor belt speed (factor G, two levels). The factors for the covercoat process are diagrammed in Figure 7.7. Factor C is called a branching factor because, depending on its level, we have different control factors for further processing steps. Branching design is a method of constructing ‘orthogonal arrays to suit such case studies. Linear graphs are extremely useful in constructing orthogonal arrays when there is process branching. The linear graph required for the covercoat process is given in Figure 7.8 (a). We need a dot for the branching factor C, and two dots connected with lines to that dot. These two dots correspond to the factors D and E for the conven- tional oven branch, and F and G for the infrared oven branch. The columns associated with the two interaction lines connected to the branching dot must be kept empty. In the linear graph we also show two isolated dots corresponding to factors A and B. The standard linear graph for Lg in Figure 7.8(b) can be modified casily to match the linear graph in Figure 7.8(a). We break the bottom line to form two isolated dots corresponding to columns 6 and 7. Thus, by matching the modified linear graph with the required linear graph, we obtain the column assignment for the control factors as follows: Factor Column A 6 B 7 Cc 1 DF 2 (3) E,G 4(5) Sec. 7.8 Branching Design 169 i, ‘A. Covercoat Material to B. Method of Spreading ee C. Method of Baking Conventional n Infrared Oven D. Bake Temperature F. Light Intensity Bake Time G. Conveyor Belt Speed Figure 7.7 Process branching in covercoat process. Columns 3 and 5, shown in parenthesis, must be kept empty. The factors D and F are assigned to the same column, namely column 2. Whether a particular experiment is conducted by using factor D or F depends on the level of factor C, which is deter- mined by column 1. Thus, the levels of factors D and F are determined jointly by the columns | and 2 as follows: For the combination (1,1) in columns For the combination (1,2) in columns For the combination (2,1) in columns For the combination (2,2) in columns and 2 write Dy in column 2 and 2 write D2 in column 2 and 2 write Fy in column 2 and 2 write F> in column 2 Factors D and F can have quite different effects; that is, mp,—mp, need not be equal to mp,—mp,. This difference shows up as interaction between columns 1 and 2, which is contained in column 3. Hence, column 3 must be kept empty. The factors E and G are assigned to the column 4 in a similar way, and column 5 is kept empty. 170 Constructing Orthogonal Arrays Chap. 
7 D, F ° > 2G (a) Required Linear Graph 2 a 5 4 : § 7 (¢} Modified Standard Linear Graph and Assignment of Factors to Columns. Figure 7.8 Covercoat process column assignment through linear graph. The experiment layout for the covercoat process is given in Table 7.6. Note that experiments 1 through 4 are conducted using the conventional oven, while experiments 5 through 8 are conducted using the infrared oven. It is possible that after branching, the process can reunite in subsequent steps. Thus, in the printed wiring board application, after the covercoat is applied, we may go through common printing and etching steps that all have a common set of control fac- tors. Branching can also occur in product design; for example, we may select different mechanisms to achieve a part of the function. Here, associated with each mechanism, there would be different control factors. Sec.79 Strategy for Constructing an Orthogonal Array 71 TABLE 7.6 EXPERIMENT LAYOUT FOR THE COVERCOAT PROCESS Column Expt No [1 2 3 4.5 6 7 1 ]c, dD E Ay By 2/a dD E, A, Bp 3 |e Dy E, Ar By 4 /C, Dy E; A, B, 5 |G, Fy G, A, By 6 | C, Fy G, Ar By 7 Cr, Fi Gy Ar By 8 1G, Fy G, Ay Bp C DF Empty EG Empy A B Factor Assignment 7.9 STRATEGY FOR CONSTRUCTING AN ORTHOGONAL ARRAY Up to this point, this chapter discussed many techniques for constructing orthogonal arrays needed by the matrix experiments. This section focuses on showing how to orchestrate the techniques for constructing an orthogonal array to suit a particular case study. The skill needed to apply these techniques varies widely. Accordingly, we describe three strategies—beginner, intermediate, and advanced—requiring progres- sively higher levels of skill with the techniques described earlier in this chapter. A vast majority of case studies can be taken care of by the beginner and intermediate strategies, whereas a small fraction of the case studies requires the advanced strategy. The router bit life improvement case study in Chapter 11 is one such case study. Beginner Strategy A beginner should stick to the direct use of one of the standard orthogonal arrays. Table 7.7 is helpful in selecting a standard orthogonal array to fit a given case study. Because it gets difficult to keep track of data from a larger number of experiments, the beginner is advised to not exceed 18 experiments, which makes the possible choices of orthogonal arrays as L4, Lg, Lo, Lz, Lig, Lig, and Lg. 172 Constructing Orthogonal Arrays Chap. 7 TABLE 7.7 BEGINNER STRATEGY FOR SELECTING AN ORTHOGONAL ARRAY (a) All 2-level Factors (b) All 3-tevel Factors No. of No. of 2level_ | Recommended 3-level | Recommended Factors | Orthogonal Array! Factors | Orthogonal Array 2-3 Ly 2-4 Ly 4-7 Ly 5-7 Lis* 8-1 Ln *When Lig is used, one 2-level 12-15 Lis factor can be used in addition to seven 3-evel factors. A beginner should consider either all 2-level factors or all 3-Level factors (prefer- ably 3-level factors) and not attempt to estimate any interactions. This may require him or her to modify slightly the case-study requirements, The rules given in Table 77 can then be used to select the orthogonal array. The assignment of factors to the columns is straightforward in the cases dis- cussed above. Any column can be assigned to any factor, except for factors that are difficult to change, which should be assigned to the columns toward the left. 
Among all the arrays discussed above, the array Lg is the most commonly used array because it can be used to study up to seven 3-level factors and one 2-level factor, which is the situation with many case studies. Intermediate Strategy Experimenters with modest experience in using matrix experiments should use the dummy level, compound factor, and column merging techniques in conjunction with the standard orthogonal arrays to broaden the possible combinations of the factor lev- els. The factors should have preferably two or three levels and the estimation of interactions should be avoided. Also, as far as possible, arrays larger than Lg should be avoided. Table 7.8 can be used to select an appropriate standard orthogonal array depending on the number of 2- and 3-level factors in the case study. The following rules can then be used to modify the chosen standard orthogonal array to fit the case study: 1. To create a 3-level column in the attay Lg or Lys, merge three columns in the array (two columns and the column containing their interaction) to form a 4-level column. Then use the dummy level technique to convert the 4-level column into a 3-level column. Sec. 7.9 Strategy for Constructing an Orthogonal Array 173 2. To create two 3-level columns in the array L jg, merge two distinct sets of three columns in the array (two columns and the column containing their interaction) to form two 4-level columns. Then use the dummy level technique to convert the 4-level columns into 3-level columns. 3. When the array Lg is suggested by the Table 7.8 and the total number of factors ig less than or equal to four, use the dummy level technique to assign a 2-level factor to a 3-level column. 4. When the array Lg is suggested by the Table 7.8 and the total number of factors exceeds four, use the compound factor technique to create a 3-level factor from two 2-level factors until the total number of factors becomes 4. 5. When the array Lyg is suggested by the Table 7.8 and the number of 2-level columns exceeds one, use the dummy level and compound factor techniques in the manner similar to rules 3 and 4 above. TABLE 7.8 INTERMEDIATE STRATEGY FOR SELECTING AN ORTHOGONAL ARRAY* Number of 3-level factors Number of 2-level factors 0 1 2 3 4 5 6 7 0 Ly by Lp Lis La Lew 1 ty ky Lie Lig Lie Lae 2 Ly Ly Ly ky Lig Lig Lig 3 te Lay Ly we ig ig Lew 4 Ly Ly kg hg hg Lew s Ly be big hig Lig Les 6 Ly Lie hig ig Law 7 a 8 Ln bw Lis Law 9 in Le bi Li 10 Ln Lie avy be Lig 2 Ly bie 13 Lie 14 Ly 1s Lis * Combination of 2- and 3-level factors not covered by the intermediate strategy are indicated by a blank, 174 Constructing Orthogonal Arrays Chap. 7 Advanced Strategy ‘An experimenter, who has a fair amount of experience in conducting matrix experi- ments and wishes to have wider freedom in terms of the number of factors and their levels or wants to estimate interactions, must use linear graphs and rules for their modification. The advanced strategy consists of the following steps: 1. Use the beginner or intermediate strategy to obtain a simple solution. If that is not possible, proceed with the following steps. 2. Count the degrees of freedom to determine the minimum size of the orthogonal array. 3. Select an appropriate standard orthogonal array from among those listed in Table 7.1. If most of the factors are 2- or 4-level factors, then a 2-level array should be selected. If most of the factors are 3-level factors, then a 3-level array should be selected. 4, Construct the linear graph required for the case study. 
The linear graph should contain the interactions to be estimated and also the appropriate patterns for column merging and process branching. 5. Select a standard linear graph for the chosen array that is closest to the required linear graph. 6. Modify the standard linear graph to match the required linear graph by using the rules in Section 7.6. The column assignment is obvious when the two linear graphs match. If we do not succeed in matching the linear graphs we must repeat the procedure above with either a different linear graph for the chosen standard orthogonal array, or choose a larger standard orthogonal array, or modify the requirements for the case study. The advanced strategy needs some skill in using the linear graph modification rules. The router bit life improvement case study of Chapter 11 illustrates the use of the advanced strategy. Artificial intelligence programs can be used to carry out the modifications efficiently as described by Lee, Phadke, and Keny [L1]. 7.10 COMPARISON WITH THE CLASSICAL STATISTICAL EXPERIMENT DESIGN As mentioned in Chapter 1, both classical statistical experiment design and Robust Design use the basic principles of planning experiments and data analysis developed by R. A. Fisher in the 1920s. Thus, there are many common ideas in the two methods. The differences in the methods come about because the two methods were developed by people who were concemed with different problems. This section describes the differences primarily for the benefit of readers familiar with classical statistical experi- ment design, which is described in many books, such as Box, Hunter, and Hunter [B3], John [J2], Cochran and Cox [C3], Daniel [D1], and Hicks [H2]. It is hoped that this Sec. 7.10 Comparison with the Classical Statistical Experiment Design 175 section will help such readers understand and apply the Robust Design Method. This section may be skipped without affecting the readability of the rest of the book. Any method which was developed over several decades is likely to have varia- tions in the way it is applied. Here, the term classical statistical experiment design tefers to the way the method is practiced by the majority of its users. Exceptions to the majority practice are not discussed here. The term Robust Design, of course, means the way it is described in this book. The comparison is made in three areas: problem formulation, experiment layout, and data analysis. The differences in the areas of experiment layout and data analysis are primarily a result of the fact that the two methods address different problems. Differences in Problem Formulation Emphasis on Variance Reduction The primary problem addressed in classical statistical experiment design is to model the response of a product or process as a function of many factors called model factors. Factors, called nuisance factors, which are not included in the model, can also influence the response. Various techniques are employed to minimize the effects of the nuisance factors on the estimates of model parameters. These techniques include hold- ing the nuisance factors at constant values during the experiments when possible, as well as techniques called blocking and randomization, The effects of the nuisance fac- tors not held constant show as variance of the response. Classical statistical experi- ment design theory is aimed at deriving a mathematical equation relating the mean Tesponse to the levels of the model factors. 
As a general rule, it assumes that the vari- ance of the response remains constant for all levels of the model factors. Thus, it ignores the problem of reducing variability which is critical for quality improvement. The primary problem addressed in Robust Design is how to reduce the variance of a product’s function in the customer's environment. Recall that the variance is caused by the noise factors and the fundamental idea of Robust Design is to find levels of the control factors which minimize sensitivity of the product’s function to the noise factors. Consequently, Robust Design is focused on determining the effects of the con- trol factors on the robustness of the product's function. Instead of assuming that the variance of the response remains constant, it capitalizes on the change in variance and looks for opportunities to reduce the variance by changing the levels of the control fac- tors. In Robust Design, accurate modeling of the mean response is not as important as finding the control factor levels that optimize robustness. This is so because, after the variance has been reduced, the mean response can be easily adjusted with the help of only one control factor. Finding a suitable control factor, known as adjustment factor, which can be used for adjusting the mean response is one of the concerns of the Robust Design method. 176 Constructing Orthogonal Arrays Chap. 7 Selection of Response/Quality Characteristic Classical statistical experiment design considers the selection of the response to be out- side the scope of its activities, but Robust Design requires a thorough analysis of the engineering scenario in selecting the quality characteristic and the S/N ratio. Guide- lines for the selection of the quality characteristic and S/N ratio are given in Chapters 5 and 6. Frequently, the final goal of a project is to maximize the yield or the percent of products meeting specifications. Accordingly, in classical statistical experiment design yield is often used as a response to be modeled in terms of the model factors. As dis- cussed in Chapters 5 and 6, use of such response variables could lead to unnecessary interactions and it may not lead to a robust product design. Systematic Sampling of Noise The two methods also differ in the treatment of noise during problem formulation. Since classical statistical experiment design method is not concerned with minimizing sensitivity to noise factors, the evaluation of the sensitivity is not considered in the method. Instead, noise factors are considered nuisance factors. They are either kept at constant values during the experiments, or techniques called blocking and randomiza- tion are used to block them from having an effect on the estimation of the mathemati- cal model describing the relationship between the response and the model factors. On the contrary, minimizing sensitivity to noise factors (factors whose levels cannot be controlled during manufacturing or product usage, which are difficult to con- trol, or expensive to control) is a key idea in Robust Design. Therefore, noise factors are systematically sampled for a consistent evaluation of the variance of the quality characteristic and the S/N ratio, Thus, in the polysilicon deposition case study of Chapter 4, the test wafers were placed in specific positions along the length of the reac- tor and the quality characteristics were measured at specific points on these wafers. This ensures that the effect of ngise factors is equitable in all experiments. 
When there exist many noise factors whose levels can be set in the laboratory, an orthogonal array is used to select a systematic sample, as discussed in Chapter 8, in conjunction with the design of a differential operational amplifier. Use of an orthogonal array for sam- pling noise is a novel idea introduced by Robust Design and it is absent in classical statistical experiment design. Transferability of Product Design Another important consideration in Robust Design is that a design found optimum dur- ing laboratory experiments should also be optimum under manufacturing and customer environments. Further, since products are frequently divided into subsystems for design purposes, it is critically important that the robustness of a subsystem be minimally affected by changes in the other subsystems. Therefore, in Robust Design Sec. 7.10 Comparison with the Classical Statistical Experiment Design 17 interactions among control factors, especially the antisynergistic interactions, are con- sidered highly undesirable. Every effort is made during problem formulation to select the quality characteristic, S/N ratio, and control factor levels to minimize the interac- tions. If antisynergistic interactions are discovered during data analysis or verification experiments, the experimenter is advised to go back to Step 1 of the Robust Design cycle and re-examine the choice of quality characteristics, S/N ratios, control factors, noise factors, etc. On the other hand, classical statistical experiment design has not been concemed with transferability of product design. Therefore, presence of interactions among the model factors is not viewed as highly undesirable, and information gained from even antisynergistic interactions is utilized in finding the factor levels that predict the best average response. Differences in Experiment Layout Testing for Additivity Additivity means absence of all interactions—2-factor, 3-factor, etc. Achieving addi- tivity, though considered desirable, is usually not emphasized in classical statistical experiment design; and orthogonal arrays are not used to test for additivity. Interac- tions are allowed to be present; they are appropriately included in the model; and experiments are planned so that they can be estimated. Achieving additivity is very critical in Robust Design, because presence of large interactions is viewed as an indication that the optimum conditions obtained through a matrix experiment may prove to be non-optimum when levels of other control factors (other than those included in the matrix experiment at hand) are changed in subsequent Robust Design experiments, Additivity is considered to be a property that a given quality characteristic and S/N ratio possess or do not possess. Matrix experiment based ‘on an orthogonal array, followed by a verification experiment is used as a tool to test whether the chosen quality characteristic and S/N ratio possess the additivity property. Efficiency Resulting from ignoring Interactions among Control Factors Let us define some terms commonly used in classical statistical experiment design. Resolution V designs are matrix experiments where all 2-factor interactions can be estimated along with the main effects. 
Resolution IV designs are matrix experiments where no 2-factor interaction is confounded with the main effects, and no two main effects are confounded with each other, Resolution III designs (also called saturated designs) are matrix experiments where no two main effects are confounded with each other, In a Resolution III design, 2-factor interactions are confounded with main effects. In an orthogonal array if we allow assigning a factor to each column, then it becomes a Resolution III design. It is possible to construct a Resolution IV design 178 Constructing Orthogonal Arrays Chap. 7 from an orthogonal array by allowing only specific columns to be used for assigning factors. It is obvious from the above definitions that for a given number of factors, Reso- lution III design would need the smallest number of experiments, Resolution IV would need somewhat more experiments and Resolution V would need the largest number of experiments, Although heavy emphasis is placed in classical statistical experiment design on ability to estimate 2-factor interactions, Resolution V designs are used only very selectively because of the associated large experimental cost. Resolution IV designs are very popular in classical statistical experiment design, Robust Design almost exclusively uses Resolution III designs, except in some situations where estima- tion of a few specific 2-factor interactions is allowed. The relative economics of Resolution II and Resolution IV designs can be understood as follows. By using the interaction tables in Appendix C one can see that Resolution IV designs can be realized in 2-level standard orthogonal arrays by assign- ing factors to selected columns as shown in Table 7.9. TABLE 7.9 RESOLUTION III AND IV DESIGNS Resolution II Design Resolution IV Design Maximum Maximum Orthogonal | Number of | Columns to be | Number of | Columns to be Array Factors Used Factors Used La 3 1-3 2 12 La 7 1-7 4 1.247 Le 15 1-15 8 1,2,4,7,8,11,13,14 Ly 31 1-31 16 1.2.4.7,8,11,13. 16,19,21,22,25,26, 2831 From the above table it is apparent that for a given orthogonal array roughly twice as many factors can be studied with Resolution III design compared to Resolution IV design Screening Experiments Classical statistical experiment design frequently uses the following strategy for build- ing a mathematical model for the response: 1. Screening. Use Resolution III designs to conduct experiments with a large number of model factors for determining whether each of these factors should be included in the mathematical model. Sec. 7.10 Comparison with the Classical Statistical Experiment Design 179 2. Modeling. Use Resolution IV (and occasionally, Resolution V) designs to con- duct experiments with the factors found important during screening to build the mathematical model. Robust Design considers screening to be an unnecessary step. Therefore it does not have separate screening and modeling experiments. At the end of every matrix experiment the factor effects are estimated and their optimum levels identified. Robust Design advocates the use of Resolution III designs for all matrix experiments with the exception that sometimes estimation of a few specific 2-factor interactions is allowed. Flexibility in Experimental Conditions Because of the heavy emphasis on the ability to estimate interactions and the complex- ity of the interactions between 3-level factors, classical statistical experiment design is frequently restricted to the use of 2-level fractional factorial designs. 
Consequently, the number of possible types of experiment conditions is limited. For example, it is not possible to compare three or four different types of materials with a single 2-level frac- tional factorial experiment. Also, the curvature effect of a factor (see Figure 4.4) can- not be determined with only two levels. However, as discussed earlier in this chapter, the standard orthogonal arrays and the linear graphs used in Robust Design provide excellent flexibility and simplicity in planning multifactor experiments. Central Composite Designs Central composite designs are commonly used in classical experiment design, especially in conjunction with the response surface methodology (see Myers [M2]) for estimating the curvature effects of the factors. Although some research is needed to compare the central composite designs with 3-level orthogonal arrays used in Robust Design, the following main differences between them are obvious: the central composite design is useful for only continuous factors, whereas the orthogonal arrays can be used with continuous as well as discrete factors, As discussed in Chapter 3, the predicted response under any combination of the control factor levels has the same variance when an orthogonal array is used. However, this is not true with central composite designs. Randomization Running the experiments in a random order is emphasized in classical statistical experi- ment design to minimize the effect of the nuisance factors on the estimated model fac- tor effects. Running the experiments in an order that minimizes changes in the levels of factors that are difficult to change is considered more important in Robust Design. Randomization is advised to the extent it can be convenient for the experimenter. In Robust Design, we typically assign control factors to all or most of the columns of an orthogonal array. Consequently, running the experiments in a random 180 Constructing Orthogonal Arrays Chap. 7 order does not scramble the order for all factors effectively. That is, even after arrang- ing the experiments in a random order, it looks as though the experiments are in a nearly systematic order for one or more of the factors. Nuisance factors are analogous to noise factors in Robust Design terminology. Since robustness against the noise factors is the primary goal of Robust Design, we introduce the noise factors in a systematic sampling manner to permit equitable evalua- tion of sensitivity to them. Before we describe the differences in data analysis, we note that many of the lay- out techniques described in this book can be used beneficially for modeling the mean response also. Differences in Data Analysis Two Step Optimization As mentioned earlier, the differences in data analysis arise from the fact that the two methods were developed to address different problems. One of the common problems in Robust Design is to find control factor settings that minimize variance while attain- ing the mean on target. In solving this problem, provision must be made to ensure that the solution can be adapted easily in case the target is changed. This is a difficult, multidimensional, constrained optimization problem. The Robust Design method solves it in two steps. First, we maximize the S/N ratio and, then, use a control factor that has no effect on the S/N ratio to adjust the mean function on target. This is an unconstrained optimization problem, much simpler than the original constrained optimi- zation problem. 
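The two-step logic just described can be sketched in a few lines of code. The factor names, the effect tables, and the target below are hypothetical, and the sketch simplifies the prediction of the mean by using only the adjustment factor's level averages.

```python
# Hypothetical factor-effect tables: average S/N ratio (dB) and average mean
# response by factor level, as they would come out of a matrix experiment.
eta_effects = {
    "A": {1: 18.2, 2: 19.6, 3: 20.1},
    "B": {1: 19.5, 2: 19.4, 3: 19.3},   # B barely moves the S/N ratio ...
    "C": {1: 17.8, 2: 19.9, 3: 20.6},
}
mean_effects = {
    "A": {1: 98.0, 2: 101.0, 3: 104.0},
    "B": {1: 90.0, 2: 100.0, 3: 112.0}, # ... but strongly shifts the mean,
    "C": {1: 99.0, 2: 100.0, 3: 101.0}, # so B serves as the adjustment factor
}
target = 105.0
adjustment_factor = "B"

# Step 1: maximize the S/N ratio with every factor except the adjustment factor.
settings = {f: max(levels, key=levels.get)
            for f, levels in eta_effects.items() if f != adjustment_factor}

# Step 2: use the adjustment factor to bring the mean as close to target as possible.
settings[adjustment_factor] = min(
    mean_effects[adjustment_factor],
    key=lambda lvl: abs(mean_effects[adjustment_factor][lvl] - target))

print("selected levels:", settings)
```

Reserving one factor that moves the mean but leaves the S/N ratio essentially unchanged is what makes the second step a simple one-dimensional adjustment.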
Robust Design addresses many engineering design optimization prob- lems as described in Chapter 5 Classical statistical experiment design has been traditionally concerned only with modeling the mean response. Some of the recent attempts to solve the engineering design optimization problems in the classical statistical experiment design literature are discussed in Box [BI], Leon, Shoemaker, and Kackar [L2], and Nair and Pregibon [N2]. Significance Tests In classical statistical experiment design, significance tests, such as the F test, play an important role. They are used to determine if a particular factor should be included in the model. In Robust Design, F ratios are calculated to determine the relative impor- tance of the various control factors in relation to the error variance. Statistical significance tests are not used because a level must be chosen for every control factor regardless of whether that factor is significant or not. Thus, for each factor the best level is chosen depending upon the associated cost and benefit. Sec. 7.11 Summary 181 7.11 SUMMARY * The process of fitting an orthogonal array to a specific project has been made particularly easy by the standard orthogonal arrays and the graphical tool, called linear graphs, developed by Taguchi to represent interactions between pairs of columns in an orthogonal array. Before constructing an orthogonal array, one must define the requirements which consist of: 1, Number of factors to be studied 2. Number of levels for each factor 3. Specific 2-factor interactions to be estimated 4. Special difficulties that would be faced in running the experiments ‘The first step in constructing an orthogonal array to fit a specific case study is to count the total degrees of freedom that tells the minimum number of experiments that must be performed to study the main effects of all control factors and the chosen interactions. Genichi Taguchi has tabulated 18 standard orthogonal arrays. In many problems, one of these arrays can be directly used to plan a matrix experiment. The arrays are presented in Appendix C, Orthogonality of a matrix experiment is not lost by keeping empty one or more columns of the array. * The columns of the standard orthogonal arrays are arranged in the increasing order of number of changes; that is, the number of times the level of a factor must be changed in running the experiments in the numerical order is smaller for the columns on the left than those on the right. Consequently, factors whose lev- els are difficult to change should be assigned to columns on the left. Although in most Robust Design experiments one chooses not to estimate any interactions among the control factors, there are situations where it is desirable to estimate a few sclected interactions. The linear graph technique makes it easy to plan orthogonal array experiments that involve interactions. Linear graphs represent interaction information graphically and make it casy to assign factors and interactions to the various columns of an orthogonal array. In a linear graph, the columns of an orthogonal array are represented by dots and lines. When two dots are connected by a line, it means that the interaction of the two columns represented by the dots is contained in (or confounded with) the column(s) represented by the line. In a linear graph, each dot and each line has a distinct column number(s) associated with it. Furthermore, every column of the array is represented in its linear graph once and only once. 
182 Constructing Orthogonal Arrays Chap. 7 + The principal utility of linear graphs is for creating a variety of different orthogo- nal arrays from the standard orthogonal arrays to fit real problems. * Techniques described in this chapter for modifying orthogonal arrays are sum- marized in Table 7.10. * Depending on the needs of the case study and experience with matrix experi- ments, the experimenter should use the beginner, intermediate, or advanced stra- tegy to plan experiments. The beginner strategy (see Table 7.7) involves the use of a standard orthogonal array. The intermediate strategy (see Table 7.8) involves minor but simple modifications of the standard orthogonal arrays using the dummy level, compound factor, and column merging techniques. A vast majority of case studies can be handled by the beginner or the intermediate stra- tegies. The advanced strategy requires the use of the linear graph modification rules and is needed relatively infrequently. In complicated case studies, the advanced strategy can greatly simplify the task of constructing orthogonal arrays. * Although Robust Design draws on many ideas from classical statistical experi- ment design, the two methods differ because they address different problems. Classical statistical experiment design is used for modeling the mean response, whereas Robust Design is used to minimize the variability of a product's func- tion, TABLE 7.10 TECHNIQUES FOR MODIFYING ORTHOGONAL ARRAYS: Needed Linear ‘Technique Application Graph Pattern Dummy level Assign an m-level factor to a n-level column NA (m or R3, can be used to adjust the value of Ry at which the bridge balances. We decide to use R3, and, thus, it is our signal factor for deciding the temperature setting. The resistance Ry by itself is the signal factor for the ON-OFF operations. The purpose of the Zener diode (nominal voltage = E,) in the circuit is to regu- late the voltage across the terminals a and b (see Figure 9.2). That is, when the Zener diode is used, the voltage across the terminals a and b remains constant even if the power supply voltage, Eo, drifts or fluctuates. Thus it reduces the dependence of the threshold values Ry-oy and Ry-ogr on the power supply voltage Eo. As a general rule, the nominal values of the various circuit parameters are poten- tial control factors, except for those completely defined by the tracking rules. In the temperature control circuit, the control factors are R,, R2, R4 and £,. Note that we do 218 Design of Dynamic Systems Chap. 9 not take Eo as a control factor. As a rule, the design engineer has to make the deci- sion about which parameters should be considered control factors and which should not. The main function of Eo is to provide power for the operation of the relay, and its nominal value is not as important for the ON-OFF operation. Hence, we do not include Eo as a control factor. The tolerances on Ry, R>, R4, E;, and Eq are the noise factors. For proper operation of the circuit, we must have E, < Ep. Also, R4 must be much bigger than Ry or R>. These are the tracking relationships among the circuit parameters, 9.3 QUALITY CHARACTERISTICS AND S/N RATIOS Selection of Quality Characteristics The resistances Rr_gy and Ry_gpp are continuous variables that are obviously directly related to the ON-OFF operations; together, they completely characterize the circuit function. 
Through standard techniques of circuit analysis, one can express the values of Ry-ow and Ry_opp as following simple mathematical functions of the other circuit parameters: RyRYER stEoR 1) R =o A 7-08 ~ BER ER, ERD op R3R2Ry Ryo = BARD” (9.2) Thus, by the criteria defined in Chapter 6, Rr_oy and Rr_ogr are appropriate choices for the quality characteristics. Suppose Equations (9.1) and (9.2) for the evaluation of Ry_gy and Ry_opp were not available and that hardware experiments were needed to determine their values. Measuring Rp_oy and Rr_orr would still be easy. It could be accomplished by incre- menting the values of Ry through small steps until the heater turns ON and decrement- ing the values of Ry until the heater turns OFF. Selection of S/N Ratio The ideal relationship of Ry_oy and Rr-gpr with R3 (the signal factor) is linear, pass- ing through the origin, as shown in Figure 9.5. So for both quality characteristics, the appropriate S/N ratio is the C-C type S/N ratio, described in Chapter 5. Suppose for some particular levels of the control factors and particular tolerances associated with Sec. 9.3 Quality Characteristics and S/N Ratios 219 Ry(kQ) 0 0s 1.0 15 Aa (kQ) Figure 9.5 Plot of Ry_ow and Ry_orr VS. R3 for the starting design. the noise factors we express the dependence of Rr-gy on R3 by the following equation obtained by the least squares fit: Rr-on = BR3 +e (9.3) where B is the slope and ¢ is the error. Note that any nonlinear terms in R (such as R3 or R3) are included in the error e. The S/N ratio for R7_w is given by B? 10 logy =. (9.4) 6; Similarly, let the dependence of R ror on R 3 be given by Rr-orr = B’R3 + e” (9.5) 220 Design of Dynamic Systems Chap. 9 where B’ is the slope and e” is the error. Then, the corresponding S/N ratio for Ry_ore is given by 10 logo J . (0.6) Evaluation of the S/N Ratio Let us first see the computation of the S/N ratio for Ry_oy. The nominal values of the circuit parameters under the starting conditions, their tolerances (three-standard- deviation limits), and the three levels for testing are shown in Table 9.1. These levels were computed by the procedure described in Chapter 8; that is, for each noise factor, the levels 1 and_3 are displaced from level 2, which is equal to its mean value, on either side by ¥3/2 6, where o is one-third the tolerance. TABLE 9.1 NOISE AND SIGNAL FACTORS FOR TEMPERATURE CONTROL CIRCUIT 4 Levels (Multiply by Mean for Noise Factors) Tolerance Factor Mean* (%) 1 2 3 AR, 40 kQ 5 1.0204 | 1.0 | 0.9796 B. Ry 8.0 kQ 5 1.0204 | 1.0 | 0.9796 c. Ry 40.0 kQ 5 1.0204 | 1.0 | 0.9796 D. Eo 10.0 V 5 1.0204 | 1.0 | 0.9796 FE, 60V 5 10208 | 1.0 | 0.9796 M. Ry (signal) | - - 05kQ | 1.02 | 1542 * Mean values listed here correspond to the nominal values for the starting design. The ideal relationship between R3 and Ry_oy is a straight line through the origin with the desired slope. Second- and higher-order terms in the relationship between R and Ry_oy should be, therefore, minimized. Thus, we take three levels for the signal factor (R3): 0.5 kQ, 1.0 kQ, and 1.5 kQ. Here Rr_ow must be zero when R33 is zero. So, with three levels of R3, we can estimate the first-, second-, and third-order terms in the dependence of Ry_oy on R3. The first order, or the linear effect, constitutes the Sec. 9.3 Quality Characteristics and S/N Ratios 221 desired signal factor effect. We include the higher-order effects in the noise variance so they are reduced with the maximization of n. 
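The computation implied by Equations (9.3) and (9.4) is a least squares fit through the origin followed by a ratio of the squared slope to the residual variance. The sketch below shows this calculation; the (R3, R_T-ON) pairs used here are illustrative placeholders rather than the case-study measurements.

```python
import numpy as np

# Signal levels R3 (kilo-ohms) and the corresponding R_T-ON values; the
# numbers are illustrative placeholders, not the case-study measurements.
R3 = np.array([0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 1.5])
y  = np.array([1.26, 1.35, 1.44, 2.52, 2.69, 2.89, 3.78, 4.04, 4.33])

beta = np.sum(R3 * y) / np.sum(R3 * R3)        # least squares slope through the origin
sigma2_e = np.sum((y - beta * R3) ** 2) / (len(y) - 1)   # error variance
eta = 10.0 * np.log10(beta ** 2 / sigma2_e)    # C-C type S/N ratio, Equation (9.4)

print(f"beta = {beta:.4f}   sigma_e^2 = {sigma2_e:.6f}   eta = {eta:.2f} dB")
```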
[It is obvious from Equations (9.1) and (9.2) that the second- and higher-order terms in R3 do not appear in this circuit. Thus, taking only one level of R3 would have been sufficient. However, we take three levels to illustrate the general procedure for computing the S/N ratio.] As discussed in Chapter 8, an orthogonal array (called noise orthogonal array) can be used to simulate the variation in the noise factors. In addition to assigning noise factors to the columns of an orthogonal: array, we can also assign one of the columns to the signal factor. From the values of Rr_gy corresponding to each row of the noise orthogonal array, we can perform least squares regression (see Section 5.4, or Hogg and Craig [H3], or Draper and Smith [D4]) to estimate B and o? and then the SIN ratio, 1. Chapter 8 pointed out that the computational effort can be reduced greatly by forming a compound noise factor. For that purpose, we must first find directionality of the changes in Ry_y caused by the various noise factors. By studying the derivatives of Rr-oy with respect to the various circuit parameters, we observed the following relationships: Ry_oy increases whenever R; decreases, R> increases, Ry decreases, Eo increases, or E, decreases. (If the formula for Ry_gy were complicated, we could have used the noise orthogonal array to determine the directionalities of the effects.) Thus, we form the three levels of the compound noise factor as follows: (CN),: (R13, (Rais (Ras, Eo)» (E2)3 (CN)2: (Ry)2s R2)2, Rada» (Eod2» (Ez)2 (CN): Ridis R2)s, Ra), ods: Edn - For every level of the signal factor, we calculate Rr_oy with the noise factor lev- els set at the levels (CN);, (CN), and (CN);. Thus, we have nine testing conditions for the computation of the S/N ratio. The nine values of Ry_gy corresponding to the starting values of the control factors (R, = 4.0 kQ, R2 = 8.0 kQ, Ry = 40.0 kQ, and E, = 6.0 KQ) are tabulated in Table 9.2. Let y; denote the value of Ry_oy for the i” testing condition; and let Ru) be the corresponding value of R3. Then, from standard least squares regression analysis (see Section 5.4), we obtain Rai + Rayo t°'" + Rao Rhy + Rha) ++" + Ro) B 0.7) The error variance is given by a= Me £E Or BRI? 08) 222 Design of Dynamic Systems Chap. 9 Substituting the appropriate values from Table 9.2 in Equations (9.7) and (9.8) we obtain, B = 2.6991 and 7 = 0.030107. Thus, the S/N ratio for Rr-oy corresponding to the starting levels of the control factors is B2 a = 23.84 dB. o 1 = 10 logio The S/N ratio, 1/, for Rr_ore can be computed in exactly the same manner as we computed the S/N ratio for Rr_oy- Note that for dynamic systems, one must identify the signal factor and define the SIN ratio before making a proper choice of testing conditions. This is the case with the temperature control circuit. TABLE 9.2 TESTING CONDITIONS FOR EVALUATING 1) Ry CN (Signal Factor) | (Compound Noise | y = Ry-ov* ‘Test No. (ka) Factor) 2) 1 05 (CN), 1.2586 2 05 (CN, 1.3462 3 05 (CN), 1.4440 4 10 (CN), 25171 5s 10 (CN); 2.6923 6 10 (CN); 2.8880 7 15 (CN), 3.7757 8 15 (CN); 4.0385 9 Ls (CN)3, 43319 * These Rr-ow values correspond to the starting levels for the control factors. 9.4 OPTIMIZATION OF THE DESIGN Control Factors and Their Levels The four control factors, their starting levels, and the altemate levels are listed in Table 9.3. For the three resistances (Rj, Ro, and R,), level 2 is the starting level, level 3 is 1.5 times the starting level, and level i is 1/1.5 = 0.667 times the starting level. Thus, Sec. 
9.4 Optimization of the Design 223 we include a fairly wide range of values with the three levels for each control factor. Since the available range for E, is restricted, we take its levels somewhat closer. Level 3 of E, is 1.2 times level 2, while level | is 0.8 times level 2. TABLE 9.3 CONTROL FACTORS FOR THE TEMPERATURE CONTROL CIRCUIT* Levels* Factor 1 2] 3 A. Ry (kQ) | 2.668 | 40 | 60 B. R,(kQ) | 5336 | 80 | 120 C. Ry (kQ) | 26.68 60.0 D. E) | 48 | 60 | 72 * Starting levels are indicated by an underscore. Control Orthogonal Array The orthogonal array Ly, which has four 3-level columns, is just right for studying the effects of the four control factors. However, by taking a larger array, we can also get a better indication of the additivity of the control factor effects. Further, computation is very inexpensive for this circuit. So, we use the Lyg array to construct the control orthogonal array. The Lyg array and the assignment of the control factors to the columns are given in Table 9.4. The control orthogonal array for this study is the sub- matrix of Lg formed by the columns assigned to the four control factors. Data Analysis and Optimum Conditions For each row of the control orthogonal array, we computed B and n for Rr-oy. The values of 1) and B* are shown in Table 9.4 along with the control orthogonal array. The possible range for the values of B is 0 to oo and we are able to get a better additive model for B in the log transform. Therefore, we study the values of B in the decibel scale, namely 20 logyoB. The results of performing the analysis of variance on 1 and 20 logioB are tabulated in Tables 9.5 and 9.6. The control factor effects on 1) and 20 logioB are plotted in Figure 9.6(a) and (b). Also shown in the figure are the control factor effects on ty’ and 20 logigB” corresponding to the quality characteristic Rr_orr. The following observations can be made from Figure 9.6(a): * For the ranges of control factor values listed in Table 9.3, the overall S/N ratio for the OFF function is higher than that for the ON function. This implies that Design of Dynamic Systems Chap. 9 the spread of Rr_gy values caused by the noise factors is wider than the spread of Rr_ore values. * The effects of the control factors on 1)’ are much smaller than the effects on 1. * Ry has negligible effect on 1 or 1. * 1) can be increased by decreasing R2; however, it leads to a small reduction in y. * 1 can be increased by increasing R4; however, it too leads to a small reduction inty. * n can be increased by increasing E,, with no adverse effect on 11’. TABLE 9.4 CONTROL ORTHOGONAL ARRAY AND DATA FOR Ay ow Column Numbers and Factor Assignment* Expt. 12345678 a No. eeABCe De | (dB) BP 1 Pit dott dd | 2241 9.59 2 112222 2 2 | 23.84 7.29 3 113 3 3 3 3 3 | 2479 6.12 4 121 1 22:3 3 | 2585 5.33 5 1222331 1 | 2419 712 6 1233 112 2 | 1947 15.66 7 131 2 1 3 2 3 | 22.25 19.27 8 1323213 1 | 2361 15.04 9 13 3°13 2 1 2 | 2493 1,42 10 2411332 2:1 | 2423 31,22 i 212 1 1 3 3 2 | 2450 3.07 12 213221 1 3 ] 2213 5.03 B 22123 :1 3 2 | 2602 31 4 22231213] 1619 | 6102 15 223 123 2:1 | 2460 149 16 2313 231 2 | 2026 58.36 0 23213123) 2594 | 249 18 23°32 12 3:1 | 2305 3.95 * Empty column is indicated by e. Sec, 9.4 Optimization of the Design BLE 9.5 ANALYSIS OF S/N RATIO DATA FOR Aron” Average n by Level* Degrees of | Sum of | Mean Factor | 1 2 3 | Freedom | Squares | Square | F A. R, | 23.50 | 23.04 | 23.16 2 0.7 04 - Ry | 24.70 | 23.58 | 21.42 2 33.33 16.67 22 Ry | 2131 | 2338 | 25.02 2 41.40 | 20.70 | 27 D. 
&, | 21.68 | 23.39 | 24.64 2 2637 | 13.19 | 17 Error 9 6.87 0.76 Total 7 108.67 * Overall mean 1) = 23.23 dB. + Starting levels are indicated by an underscore. TABLE 9.6 ANALYSIS OF 20 log,. 8 FOR Rr-on* Average 20 log,oB by | Levelt Degrees of | Sum of | Mean Factor | 1 | 2 | 3 | Freedom | Squares | Square | F R, | 1218 | 927 | 601 2 420) 571 | 94 B. Rz |} 486 | 892 | 1368 2 2335 | 1168 | 191 C. R, | 1055 | 901 | 789 2 24a | 107 | 18 D. £, | 10.40 | 9.01 | 8.05 2 16.8 84 | 14 Enor 9 55 061 Total 17 3914 * Overall mean value of 20 logioB = 9.15 dB. + Starting levels are indicated by an underscore. 226 Design of Dynamic Systems Chap. 9 4 G0 --=—- ape -- of -- -epe-- OFF g F g a ON 3 & OFF AA, A, 8, 8,8, ©, 0,0, D, D, D, 7A. 3 8.0 12.0(26.7 40.0 60.0) 48 6.0 7.2 27 40 6.0) (53 80 129 AKO) — Re(kQ) — Aa(kQ) £2 (V) Figure 9.6 Plots of factor effects. Underscore indicates starting level. Two-standard- deviation confidence limits are also shown for the starting level. Estimated confidence limits for 20 logy are too small to show. Thus, the optimum settings of the control factors suggested by Figure 9.6(a) are Ry = 4.0 kQ, Ry = 5.336 KQ, Ry = 60.0 KQ, and E, = 7.2 V. The verification experi- ment under the optimum conditions gave 1) = 26.43 dB, compared to 23.84 dB under Sec. 9.5 Hterative Optimization 227 the starting design. Similarly, under the optimum conditions we obtained 1’ = 29.10 dB, compared to 29.94 dB under the starting design. Note that the increase in 1 is much larger than the reduction in 1’. From Figure 9,6(b), it is clear that both R, and Rz have a large effect on 6 and B’. The control factors R4 and E, have a somewhat less effect on B and B’. Since Ry has no effect on the S/N ratios, it is an ideal choice for adjusting the slopes B and f’. This ability to adjust B and B gives the needed freedom to: (1) match the values of Rr-on and Ry_org with the chosen thermistor for serving the desired temperature range, and (2) obtain the desired hysteresis. As discussed earlier, the needed separation (hysteresis) between Rp_oy and Ry_opr is determined by the thermal analysis of the heating system, which is not discussed here. 9.5 ITERATIVE OPTIMIZATION The preceding section showed one cycle of Robust Design, It is clear from Figure 9.6 that the potential exists for further improvement. By taking the optimum point as a starting design, one can repeat the optimization procedure to achieve this potential— that was indeed done for the temperature control circuit. For each iteration, we took the middle level for each control factor as the optimum level from the previous itera- tion, and then took levels | and 3 to have the same relationship with the level 2 for that factor as in the first iteration. However, during these iterations, we did not let the value of E, exceed 7.2 V so that adequate separation between E, and Eo obtained through three iterations are shown in Table 9.7. Of course, some additional improve- ment is possible, but by the third iteration, the rate of improvement has clearly slowed down, so one need not proceed further. The foregoing discussion points to the potential of using orthogonal arrays to optimize nonlinear functions iteratively. 
Although we would expect this to be a topic of active research, the following advantages of using orthogonal arrays over many com- monly used nonlinear programming methods are apparent: * No derivatives have to be computed * Hessian does not have to be computed * Algorithm is less sensitive to starting conditions * Large number of variables can be handled easily * Combinations of continuous and discrete variables can be handled easily While most standard nonlinear programming methods are based on the first- and second-order derivatives of the objective function at a point, the orthogonal array method looks at a wide region in each iteration. That is, while the nonlinear program- ming methods constitute point approaches, the orthogonal array method is a region approach. Experience in using the orthogonal array method with a variety of problems 228 Design of Dynamic Systems Chap. 9 indicates that, because of the region approach, it works particularly well in the early stages, that is, when the starting point is far from the optimum, Once we get near the optimum point, some of the standard nonlinear programming methods, such as the Newton-Ralphson method, work very well. Thus one may wish to use the orthogonal array method in the beginning and then switch to a standard nonlinear programming method. TABLE 9.7 ITERATIVE OPTIMIZATION n ney Iteration Number | (dB) (dB) (4B) Starting condition | 23.84 | 29.94 | 53.78 Iteration 1 26.43 | 29.10 | 55.53 Iteration 2 27.30 | 28.70 | 56.00 Iteration 3 27.77 | 28.51 | 56.28 9.6 SUMMARY Dynamic systems are those in which we want the system’s response to follow the levels of the signal factor in a prescribed manner. The changing nature of the levels of the signal factor and the response make the design of a dynamic system more complicated than designing a static system, Nevertheless, the eight steps of Robust Design (described in Chapter 4) still apply. A temperature controller is a feedback system and can be divided into three main modules: (1) temperature sensor, (2) temperature control circuit, and (3) a heat- ing (or cooling) element. For designing a robust temperature controller, the three modules must be made robust separately and then integrated together. The temperature control circuit is a doubly dynamic system. First, for a particu- lar target temperature of the bath, the circuit must tum a heater ON or OFF at specific threshold temperature values. Second, the target temperature may be changed by the user. Four circuit parameters (R;, R2, R4, and E,) were selected as control factors. The resistance R3 was chosen as the signal factor. The tolerances in the control factors were the noise factors. Sec. 9.6 Summary 229 © The threshold resistance, Roy, at which the heater turns ON and the threshold resistance, Rr_ogf, at which the heater tums OFF were selected as the quality characteristics. The variation of Rr-oy and Rp-orp as a function of R3 formed two C-C type dynamic problems. To evaluate the S/N ratio (n) for the ON function, a compound noise factor, CN was formed. Three levels were chosen for the signal factor and the compound noise factor. Rr_oy was computed at the resulting nine combinations of the sig- nal and noise factor levels and the S/N ratio for the ON function was then com- puted. 
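A minimal sketch of such an iterative, region-based search is given below. The L9 orthogonal array lays out nine trial points around the current center; the objective function is an arbitrary stand-in whose maximum is placed, purely for illustration, near the optimum levels reported above, and the starting point and spacing loosely follow Table 9.3.

```python
import numpy as np

# Standard L9 orthogonal array (levels coded 0, 1, 2) for four 3-level factors.
L9 = np.array([[0, 0, 0, 0], [0, 1, 1, 1], [0, 2, 2, 2],
               [1, 0, 1, 2], [1, 1, 2, 0], [1, 2, 0, 1],
               [2, 0, 2, 1], [2, 1, 0, 2], [2, 2, 1, 0]])

def objective(x):
    # Stand-in for an S/N ratio to be maximized; its peak is placed, purely
    # for illustration, near the optimum levels reported in the text.
    target = np.array([4.0, 5.3, 60.0, 7.2])
    scale = np.array([1.0, 1.0, 100.0, 1.0])
    return -np.sum((x - target) ** 2 / scale)

center = np.array([4.0, 8.0, 40.0, 6.0])   # starting design (R1, R2, R4, Ez)
step = np.array([1.5, 3.0, 15.0, 1.2])     # half-width of the search region

for it in range(5):
    levels = np.stack([center - step, center, center + step])  # 3 levels per factor
    y = np.array([objective(levels[row, np.arange(4)]) for row in L9])
    best = np.empty(4)
    for j in range(4):
        # additive model: average objective at each level of factor j
        avg = [y[L9[:, j] == lvl].mean() for lvl in range(3)]
        best[j] = levels[int(np.argmax(avg)), j]
    center, step = best, 0.6 * step         # re-center and narrow the region
    print(f"iteration {it + 1}: center = {np.round(center, 2)}")
```

Each iteration fits an additive model to the nine objective values, moves the center to the best level of each factor, and narrows the search region, which is the region-to-point behavior described above.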
(An orthogonal array can be used for computing the S/N ratio when engineering judgment dictates that multiple noise factors be used), The S/N ratio for the OFF function (1’) was evaluated in the same manner, ‘The Lys array was used as the control orthogonal array. Through one cycle of Robust Design, the sum 7 + 7)’ was improved by 1.75 dB. Iterating the Robust Design cycle three times led to 2.50 dB improvement in ) + 1)’ . Orthogonal arrays can be used to optimize iteratively a nonlinear function. ‘They provide a region approach and perform especially well when the starting point is far from the optimum and when some of the parameters are discrete. Chapter 10 TUNING COMPUTER SYSTEMS FOR HIGH PERFORMANCE ‘A computer manufacturer invariably specifies the “best" operating conditions for a computer. Why then is tuning a computer necessary? The answer lies in the fact that manufacturers specify conditions based on assumptions about applications and loads. However, actual applications and load conditions might be different or might change ‘over a period of time, which can lead to inferior computer performance. Assuming a system has tunable parameters, two options are available to improve performance: (1) buying more hardware or (2) finding optimum settings of tunable parameters. One should always consider the option of improving the performance of a computer system through better administration before investing in additional hardware. In fact, this option should be considered for any system that has tunable parameters that can be set by the system administrator. Determining the best settings of tunable parameters by the prevailing trial-and-error method may, however, prove to be excessively time con- suming and expensive. Robust Design provides a systematic and cost efficient method ‘of experimenting with a large number of tunable parameters, thus yielding greater improvements in performance quicker. This chapter presents a case study to illustrate the use of the Robust Design method in tuning computer performance. A few details have been modified for pedagogic purposes. The case study was performed by T, W. Pao, C. S. Shetterd, and M. S. Phadke [P1] who are considered to be the first to conduct such a study to opti- mize a hardware-sofiware system using the Robust Design method. 231 232 Tuning Computer Systems for High Performance Chap. 10 There are nine sections in this chapter: Section 10.1 describes the problem formulation of the case study (Step 1 of the Robust Design steps described in Chapter 4). Section 10.2 discusses the noise factors and testing conditions (Step 2). Section 10.3 describes the quality characteristic and the signal-to-noise (S/N) ratio (Step 3). Section 10.4 discusses control factors and their alternate levels (Step 4). Section 10.5 describes the design of the matrix experiment and the experimental procedure used by the research team (Steps 5 and 6). Section 10.6 gives the data analysis and verification experiments (Steps 7 and 8). Section 10.7 describes the standardized S/N ratio that is useful in compensating for variation in load during the experiment. Section 10.8 discusses some related applications. Section 10.9 summarizes the important points of this chapter. 10.1 PROBLEM FORMULATION The case study concerns the performance of a VAX* 11-780 machine running the UNIX operating system, Release 5.0. The machine had 48 user terminal ports, two remote job entry links, four megabytes of memory, and five disk drives. 
The average number of users logged on at a time was between 20 and 30. Before the start of the project, the users’ perceptions were that system perfor- mance was very poor, especially in the afternoon. For an objective measurement of the response time, the experimenters used two specific, representative commands called standard and trivial. The standard command consisted of creating, editing, and remov- ing a file; the trivial command was the UNIX system date command, which does not involve input/output (1/0). Response times were measured by submitting these com- mands via the UNIX system crontab facility and clocking the time taken for the com- puter to respond using the UNIX system timex command, both of which are automatic system processes. For the particular users of this machine, the response time for the standard and trivial commands could be considered representative of the response time for other * VAX is a registered trademark of Digital Equipment Corporation. Sec. 10.1 Problem Formulation 233 various commands for that computer. In some other computer installations, response time for the compilation of a C program or the time taken for the troff command (a text processing command) may be more representative. Figure 10.1(a)—(b) shows the variation of the response times as functions of time of day for the standard and trivial commands. Note that at the start of the study, the average response time increased as the afternoon progressed (see the curves marked "Initial" in the figure), The increase in response time correlated well with the increase in the work load during the afternoon. The objective in the experiment was to make the response time uniformly small throughout the day, even when the load increased as, usual. There are two broad approaches for optimizing a complex system such as a com- puter: (1) micro-modeling and (2) macro-modeling. They are explained next. £1/2 Std. Dev. Initial fires 10 1.0 g g o @ é E = = 2s Zoos a a0 8 s « Optimized © Optimized ° 0 9 10 11 12 13 14 15 16 9 10 11 12°13 14 15 16 Time of Day Time of Day (a) For the Standard Command (0) For the Trivial Command Figure 10.1 Comparison of response times. Micro-Modeling As the name suggests, micro-modeling is based on an in-depth understanding of the system. It begins by developing a mathematical model of the system, which, in this case, would be of the complex internal queuing of the UNIX operating system. When systems are complex, as in the case study, we must make assumptions that simplify the 234 Tuning Computer Systems for High Performance Chap. 10 operation, as well as put forth considerable effort to develop the model. Furthermore, the more simplifying we do, the less realistic the model will be, and, hence, the less adequate it will be for precise optimization. But once an adequate model is con- structed, a number of well-known optimization methods, including Robust Design, can be used to find the best system configuration. Macro-Modeling In macto-modeling, we bypass the step of building a mathematical model of the sys- tem, Our concem is primarily with obtaining the optimum system configuration, not with obtaining a detailed understanding of the system itself. As such, macro-modeling gives faster and more efficient results. It gives the specific information needed for optimization with a minimum expenditure of experimental resources. The UNIX system is viewed as a "black box," as illustrated in Figure 10.2. 
The parameters that influence the response time are identified and divided into two classes: noise factors and control factors. The best settings of the control factors are deter- mined through experiments. Thus, the Robust Design method lends itself well for optimization through the macro-modeling approach. Noise Factors (system load) Response Time UNIX System yt (xiz) Control Factors (system configuration) Figure 10.2 Block diagram for the UNIX system. 10.2 NOISE FACTORS AND TESTING CONDITIONS Load variation during use of the machine, from day-to-day and as a function of the time of day, constitutes the main noise factor for the computer system under study. Sec. 10.3 ‘Quality Characteristic and S/N Ratio 235 The number of users logged on, central processor unit (CPU) demand, I/O demand, and memory demand are some of the more important load measures. Temperature and humidity variations in the computer room, as well as fluctuations in the power supply voltage, are also noise factors but are normally of minor consequence. The case study was conducted live on the computer. As a result, the normal variation of load during the day provided the various testing conditions for evaluating the S/N ratio. At the beginning of the study, the researchers examined the operating logs for the previous few weeks to evaluate the day-to-day fluctuation in response time and load. ‘The examination revealed that the response time and the load were roughly simi- lar for all five weekdays. This meant that Mondays could be treated the same as Tues- days, ete. If the five days of the week had tured out to be markedly different from each other, then those differences would have had to be taken into account in planning the experiment. 10.3 QUALITY CHARACTERISTIC AND S/N RATIO Let us first consider the standard command used in the study. Suppose it takes to seconds to execute that command under the best circumstances—that is, when the load is zero, fy is the minimum possible time for the command. Then, it becomes obvious that the actual response time for the standard command minus fg is a quality charac- teristic that is always nonnegative and has a target value of zero—that is, the actual response time minus ty belongs to the smaller-the-better type problems. In the case study, the various measurements of response time showed that ty was much smaller than the observed response time. Hence, tg was ignored and the measured response time was treated as a smaller-the-better type characteristic. The corresponding S/N ratio to be maximized is mean square response time N= 10 logio | for the standard command 10.) Referring to Figure 10.1, it is clear that at the beginning the standard deviation of the response time was large, so much so that it is shown by bars of length 1/2 standard deviation, as opposed to the standard practice of showing +2 standard devia tions. From the quadratic loss function considerations, reducing both the mean and variance is important. It is clear that the S/N ratio in Equation (10.1) accomplishes this goal because mean square response time is equal to sum of the square of the mean and the variance. 236 Tuning Computer Systems for High Performance Chap. 10 For the response time for the trivial command, the same formulation was used. 
That is, the S/N ratio was defined as follows: mean square response time W = —10 logio | for the trivial command (10.2) 10.4 CONTROL FACTORS AND THEIR ALTERNATE LEVELS The UNIX operating system provides a number of tunable parameters, some of which relate to the hardware and others to the software. Through discussions with a group of system administrators and computer scientists, the experiment team decided to include the eight control factors listed in Table 10.1 for the tuning study. Among them, factors A, C, and F are hardware related, and the others are software related. A description of these parameters and their alternate levels is given next. The discussion about the selection of levels is particularly noteworthy because it reveals some of the practical difficulties faced in planning and carrying out Robust Design experiments. TABLE 10.1 CONTROL FACTORS AND LEVELS Levels* Factor 1 2 3 A. Disk drives (RMOS & RPO6) | 4&1 | 4&2 | 4&2 B. File distribution alo © C. Memory size (MB) 4 | 3 | 35 D. System buffers us | 14 | 3 E. Sticky bits 3 8 F. KMCs used G. INODE table entries 8 |r |e 3 . lo |g H_ Other system tables a b * The starting levels are indicated by an underscore. The number and type of disk drives (factor A) is an important parameter that determines the I/O access time, At the start, there were four RMOS disks and one RPO6 disk. The experimenters wanted to see the effect of adding one more RPO6 Sec. 10.4 Control Factors and Their Alternate Levels 237 disk (level A>), as well as the effect of adding one RPO7 disk and a faster memory controller (level A3). However, the RPO7 disk did not arrive in time for the experi- ments. So, level A3 was defined to be the same as level A for factor A. The next section discusses the care taken in planning the matrix experiment, which allowed the experimenters to change the plan in the middle of the experiments. The file system distributions (factor B) a, b and ¢ refer to three specific algo- rithms used for distributing the user and system files among the disk drives. Obvi- ously, the actual distribution depends on the number of disk drives used in a particular system configuration, Since the internal entropy (a measure of the lack of order in storing the files) could have a significant impact on response time, the team took care to preserve the intemal entropy while changing from one file system to another during the experiments. One system administrator suggested increasing the memory size (factor C) to improve the response time. However, another expert opinion was given that stated additional memory would not improve the response for the particular computer system being studied. Therefore, the team decided not to purchase more memory until they were reasonably sure its cost would be justified. They took level C; as the existing memory size, namely 4 MB, and disabled some of the existing memory to form levels C and C3. They decided to purchase more memory only if the experimental data showed that disabling a part of the memory leads to a significant reduction in perfor- mance. Total memory is divided into two parts: system buffers (factor D) and user memory. The system buffers are used by the operating system to store recently used data in the hope that the data might be needed again soon. Increasing the size of the system buffers improves the probability (technically called hit ratio) of finding the needed data in the memory. 
This can contribute to improved performance, but it also reduces the memory available to the users, which can lead to progressively worse sys- tem performance. Thus, the optimum size of system buffers can depend on the particu- lar load pattern, We refer to the size of the system buffers as a fraction of the total memory size. Thus, the levels of the system buffers are sliding with respect to the memory size. Sticky bit (factor E) is a way of telling an operating system to treat a command in a special way. When the sticky bit for a command such as rm or ed is set, the exe- cutable module for that command is copied contiguously in the swap area of a disk during system initialization, Every time that command is needed but not found in the memory, it is brought back from the swap area expeditiously in a single operation. However, if the sticky bit is not set and the command is not found in the memory, it must be brought back block by block from the file system. This adds to the execution time. In this case study, factor E specifies how many and which commands had their sticky bits set. For level Ey, no command had its sticky bit set. For level E>, the three commands that had their sticky bits set were sh, ksh and rm. These were the three most frequently used commands during the month before the case study 238 Tuning Computer Systems for High Performance Chap. 10 according to a 5-day accounting command summary report, For the level £3, the eight commands that had their sticky bits set were the three commands mentioned above, plus the next five most commonly used commands, namely, the commands Is, cp, expr, chmod, and sade (a local library command). KMCs (factor F) are special devices used to assist the main CPU in handling the terminal and remote job entry traffic. They attempt to reduce the number of interrupts faced by the main CPU. In this case study, only the KMCs used for terminal traffic were changed, Those used for the remote job entry links were left alone. For level F, two KMCs were used to handle the terminal traffic, whereas for level F2 the two KMCs were disabled. The number of entries in the NODE table (factor G) determines the number of user files that can be handled simultaneously by the system. The three levels for the factor G are 400, 500, and 600. The three levels of the eighth factor, namely, the other system tables (factor H), are coded as a, b, and c. Note that the software factors (B, D, E, G, and H) can affect only response time. However, the three hardware factors (A, C, and F) can affect both the response time and the computer cost. Therefore, this optimization problem is not a pure parameter design problem, but, rather, a hybrid of parameter design and tolerance design. 10.5 DESIGN OF THE MATRIX EXPERIMENT AND THE EXPERIMENTAL PROCEDURE This case study has seven 3-level factors and one 2-level factor. There are 7x 3-1) + 1 x (2-1) + 1=16 degrees of freedom associated with these factors. The orthogonal array Lg is just right for this project because it has seven 3-level columns and one 2-level column to match the needs of the matrix experiment. The Lg array and the assignment of columns to factors are shown in Table 10.2. Aside from assign- ing the 2-level factor to the 2-level column, there is really no other reason for assigning a particular factor to a particular column. The factors were assigned to the columns in the order in which they were listed at the time the experiment was planned. 
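The degree-of-freedom bookkeeping used above to justify the choice of the L18 array can be sketched as follows. The factor list mirrors Table 10.1, and the small catalogue of standard arrays is only a subset included for illustration; the check also ignores column-modification techniques such as dummy levels, which can stretch an array beyond what this simple test allows.

```python
# Factor list mirroring Table 10.1: seven 3-level factors and one 2-level factor.
factors = {"A": 3, "B": 3, "C": 3, "D": 3, "E": 3, "F": 2, "G": 3, "H": 3}

# One degree of freedom for the overall mean plus (levels - 1) per factor.
dof = 1 + sum(levels - 1 for levels in factors.values())
print("degrees of freedom:", dof)          # 1 + 7*(3-1) + 1*(2-1) = 16

# (rows, number of 2-level columns, number of 3-level columns) for a few
# standard orthogonal arrays; a subset of Appendix C, for illustration only.
catalogue = {"L9": (9, 0, 4), "L16": (16, 15, 0), "L18": (18, 1, 7), "L27": (27, 0, 13)}

need2 = sum(1 for v in factors.values() if v == 2)
need3 = sum(1 for v in factors.values() if v == 3)
for name, (rows, c2, c3) in sorted(catalogue.items(), key=lambda kv: kv[1][0]):
    if rows >= dof and c2 >= need2 and c3 >= need3:
        print("smallest suitable standard array:", name)   # prints L18
        break
```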
Some aspects of the assignment of factors to columns are discussed next, The experiment team found that changing the level of disk drives (factor A) was the most difficult among all the factors because it required an outside technician and took three to four hours. Consequently, in conducting these experiments, the team first conducted all experiments with level A, of the disk drives, then those with level A> of the disk drives, and finally those with level A3 of the disk drives. The experiments with level A3 of the disk drives were kept for iast to allow time for the RPOT disk to arrive. However, because the RPO7 disk did not arrive in time, the experimenters redefined level A of the disk drives to be the same as level A, and continued with the rest of the plan. According to the dummy level technique discussed in Chapter 7, this redefinition of level does not destroy the orthogonality of the matrix experiment, This arrangement, however, gives 12 experiments with level A» of the disk drives; hence, Sec. 10.5 Design of the Matrix Experiment and the Experimental Procedure 239 more accurate information about that level is obtained when compared to level Ay. This is exactly what we should look for because level A2 is the new level about which we have less prior information. TABLE 10.2 Lj, ORTHOGONAL ARRAY AND FACTOR ASSIGNMENT Column Number and Factor Assignment Expt. |1 0203 4 5 6 7 8 No |F BC DE A GH pprorororrari a 2/1 12 2 2 2 2 2 3/1 1 3 3 3 3 3 3 4 [12 1 12 2 3 3 $s f]1 222 3 3 11 6 }1 2 3 3 1 1 2 2 7 }1 3 1 2 1 3 2 3 8 ]1 3 2 3 2 13 1 13°93 13 2 1 2 to }2 1 1 3 3 2 2 1 m]2 6 2 1 1 3 3 2 m2 1 32 2 11 3 wil22 12 3 1 3 2 w}2 22 3 12 1 3 is ]2 2 3 1 2 3 2 1 te |2 3 1 3 2 3 1 2 7 {2 3 2 1 3 1 2 3 wo}2 3 3 2 12 3 1 File distribution (factor B) was the second most difficult factor to change. Therefore, among the six experiments with level A, of the disk drives, the experiment team first conducted the two experiments with level B,, then those with level B, and finally those with level B3. The same pattern was repeated for level A> of the disk drives and then level A of the disk drives. The examination of the Lj array given in Table 10.2 indicates that some of the bookkeeping of the experimental conditions could have been simplified if the team had assigned factor A to Column 2 and factor B to 240 Tuning Computer Systems for High Performance Chap. 10 Column 3. However, the inconvenience of the particular column assignment was unimportant. For each experiment corresponding to each row of the Ls array, the team ran the system for two days and collected data on standard response time and trivial response time once every 10 minutes from 9:00 a.m. to 5:00 p.m. While measuring the response times, they made sure that the UNIX system cron facility did not schedule some routine data collection operations at the same exact moments because this would affect the measurement of response times. Running experiments on a live system can invite a number of practical problems. Therefore, the first thing the experimenters did was to seek the cooperation of users. One problem of great concem to the users was that for a particular combination of con- trol factor settings, the system performance could become bad enough to cause a major disruption of their activities. To avoid this, the system administrator was instructed to make note of such an event and go back to the pre-experiment settings of the various factors. This would minimize the inconvenience to the users. 
Fortunately, such an event did not occur, but had it happened, the experiment team would still have been able to analyze the data and determine optimum conditions using the accumulation analysis method described in Chapter 5 (see also Taguchi [T1], and Taguchi and Wu (17). Under the best circumstances, the team could finish two experiments per week. For 18 experiments, it would then take nine weeks. However, during the experiments, a snowstorm arrived and the Easter holiday was observed, both events causing the sys tem load to drop to an exceptionally low level. Those days were eliminated from the data, and the team repeated the corresponding combinations of control factor settings to have data for every row of the matrix experiment, 10.6 DATA ANALYSIS AND VERIFICATION EXPERIMENTS From the 96 measurements of standard response time for each experiment, the team computed the mean response time and the S/N ratio, The results are shown in Table 10.3. Similar computations were made for the trivial response time, but they are not shown here. The effects of the various factors on the S/N ratio for the standard response time are shown, along with the corresponding ANOVA in Table 10.4. The factor effects are plotted in Figure 10.3. Note that the levels C,, C2, and C3 of memory size are 4.0 MB, 3.0 MB, and 3.5 MB, respectively, which are not in a mono- tonic order. While plotting the data in Figure 10.3, the experimenters considered the correct order. It is apparent from Table 10.4 that the factor effects are rather small, especially when compared to the error variance. The problem of getting large error variance is more likely with live experiments, such as this computer system optimization experi- ment, because the different rows of the matrix experiment are apt to see quite different noise conditions, that is, quite different load conditions. Also, while running live Sec. 10.6 Data Analysis and Verification Experiments 241 n(d8) A, A, 3 8, B, Gy c, ¢, D, D, Wi 4/2 40 3.5 3.0, 1/5 1/4 is a BD 6 Ah 98 Oe Disk File Memory System Drives Distr. Size Buffer n (0B) £8 & FF, GG GH, H, Hy 038 490 600 600 28 82 9, 400600600, 2 Pe | Sticky KMCs Inode Other Bits Table System Entries Tables Figure 10.3 Factor effects for S/N ratio for standard response. Underscore indicates start- ing level. One-standard-deviation limits are also shown. experiments, the tendency is to choose the levels of control factors that are not far apart. However, we can still make valuable conclusions about optimum settings of control factors and then see if the improvements observed during the verification exper- iment are significant or not. 242 L Tuning Computer Systems for High Performance TABLE 10.3 DATA SUMMARY FOR STANDARD RESPONSE TIME Expt. | Mean | 9) No. | (see) | (4B) x 4.65 14.66 2 5.28 ~16.37 3 3.06 10.49 4 4,53 14.85 5 3.26 10.94 6 455 14.96 3.37 11.77 5.62 16,72 487 -14.67 10 413 -13.52 Ww 4.08 TS RR 445 -14.19 13 | 381 | -1289 14 | 587 | -16.75 1s 3.42 11.65 16 3.66 12.23 7 3.92 1281 18 442 13.71 Chap. 10 The following observations can be made from the plots in Figure 10.3 and Table 10.4 (note that these conclusions are valid only for the particular load characteristics of the computer being tuned): Going from not setting any sticky bits to setting sticky bits on the three most used commands does not improve the response time. This is probably because these three commands tend to stay in the memory as a result of their very fre- quent use, regardless of setting sticky bits. 
However, when sticky bits are set on the five next most used commands, the response time improves by 1.69 dB. This suggests that we should set sticky bits on the eight commands, and in future experiments, we should consider even more commands for setting sticky bits. KMCs do not help in improving response time for this type of computer environ- ment. Therefore, they may be dropped as far as terminal handling is concerned, thus reducing the cost of the hardware. Sec. 10.6 Data Analysis and Verification Experiments 243 TABLE 10.4 ANALYSIS OF SIN RATIOS FOR STANDARD RESPONSE* Average 1 by Factor Level Degrees of | Sumof | Mean Factor 1 2 3 | Freedom | Squares | Square | F A. Disk drives -14.37 | -13.40 t 3.76 3.76 1 B. File distribution 13.84 | -13.67 | -13.65 2 0.12t 0.06 C. Memory size 713.32 | -14.56 | -13.28 2 640 | 3.20 D, System buffers 13.74 | -13.31 | -14.01 2 1,92 0.97 E, Sticky bits =14.27 | -14.34 2 1227 | 614 | 2 F. KMCs used =13.94 | -13.50 1 ost | 0.42 G. INODE table -13.91 | -13.51 | -13.74 2 oa7t | 0.24 entries Hi. Other system tables | -13.53 | -14.15 | -13.48 2 168 | 1.34 i - 3 32.60 | 1087 Total a — a 60.06 3.53 (Emror) o (34.03) | G78) * Overall mean 1) = -13.72 AB; underscore indicates starting conditions. + Indicates the sum of squares that were added to form the pooled error sum of squares shown in parentheses. 3. Adding one more disk drive leads to better response time. Perhaps even more disks should be considered for improving the response time. Of course, this would mean more cost, so proper trade-offs would have to be made. 4, The S/N ratio is virtually the same for 4 MB and 3.5 MB memory. It is significantly lower for 3 MB memory. Thus, 4 MB seems to be an optimum value—that is, buying more memory would probably not help much in improv- ing response time. 5. There is some potential advantage (0.8 dB) in changing the fraction of system buffers from 1/3 to 1/4, 6. The effects of the remaining three control factors are very small and there is no advantage in changing their levels. 244 Tuning Computer Systems for High Performance Chap. 10 Optimum Conditions and Verification Experiment The optimum system configuration inferred from the data analysis above is shown in Table 10.5 along with the starting configuration. Changes were recommended in the settings of sticky bits, disk drives, and system buffers because they lead to faster response. KMCs were dropped because they did not help improve response, and drop- ping them meant saving hardware. The prediction of the S/N ratio for the standard Tesponse time under the starting and optimum conditions is also shown in Table 10.5 Note that the contributions of the factors, whose sum of squares were among the small- est and were pooled, are ignored in predicting the S/N ratio. Thus, the S/N ratio predicted by the data analysis under the starting condition is 14.67 dB, and under the optimum conditions it is -11.22 dB. The corresponding, predicted rms response times under the starting and optimum conditions are 5.41 seconds and 3.63 seconds, respec lively. TABLE 10.5 OPTIMUM SETTINGS AND PREDICTION OF STANDARD RESPONSE TIME Starting Condition ‘Optimum Condition Contributiont Contributiont Factor Setting (aB) Setting (aB) A. Disk drives* Ay -0.65 Ay 0.32 B. File distribution B - By - C. Memory size CQ 0.40 c 0.40 D. System buffers* Dy 0.39 Dz 041 E, Sticky bits* E, -055 Ey Liq F. KMCs used* Fy - Ei - G. INODE table entries | G3 - G; - H. 
Other system tables | 024 Hy 024 Overall mean 13.72 Total 1467 “122 * Indicates the factors whose levels are changed from the starting to the optimum conditions. + By contribution we mean the deviation from the overall mean caused by the particular factor level. Sec. 10.6 Data Analysis and Verification Experiments 245 As noted earlier, here the error variance is large. We would also expect the vari- ance of the prediction error to be large. The variance of the prediction error can be computed by the procedure given in Chapter 3 {see Equation (3.14)]. The equivalent sample size for the starting condition, n,, is given by Ak, ft Ae (pL A] [LR = mn \m, a} (nc, 2)” mp, toi 1 7 +|—--|+ |—-— me, 7) (my, 2 ~t ft ia 14 toa -4+(t +}+[2 18] * [6 ~ 18 a (ee Alle [be ae 6 18J [6 18 = 0.61. Thus, 7, = 1.64. Correspondingly, the two-standard-deviation confidence limits for the predicted S/N ratio under the starting conditions are - 14.67 +2 V3.78/1.64, which simplify to —14.67 +3.04 dB. Similarly, the two-standard-deviation confidence limits for _the predicted S/N ratio under the optimum conditions are 11.22 £2V3.78/1.89, which simplify to — 11.22 +2.83 dB. Note that there is a slight difference between the widths of the two confidence intervals. This is because there are 12 experiments with level A> and only six with Ay. Subsequently, the optimum configuration was implemented. The average response times for the standard and trivial commands for over two weeks of operation under the optimum conditions are plotted in Figure 10.1(a)—(b). Comparing the response time under the starting (initial) configuration with that under the optimum configuration, we see that under the optimum conditions the response time is small, even in the afternoon when the load is high. In fact, the response time is uniformly low throughout the day. The data from the confirmation experiment are summarized in Table 10.6. For standard response time we see that the mean response time was reduced from 6.15 sec. to 2.37 sec., which amounts to a 61-percent improvement. Similar improvement was seen in the rms response time. On the S/N ratio scale, the improvement was 8.39 dB. Similarly, for the trivial response time, the improvement was seen to be between 70 percent and 80 percent of the mean value, or 13.64 dB, as indicated in Table 10.6. 246 Tuning Computer Systems for High Performance Chap. 10 TABLE 10.6 RESULTS OF VERIFICATION EXPERIMENT Standard Response Time Trivial Response Time Starting | Optimum Starting | Optimum Measure | Levels | Levels | Improvement | Levels | Levels | Improvement Mean (sec) 615 237 61% 0.521 0.148 N% ms (sec) 7.59 2.98 61% 0.962 0.200 79% 1 (6B) -1760 | -921 839dB | +034 | +1398 13.64 dB Standardized 15.88 8.64 7.24 dB +3.26 +15.71 12.45 dB 1 (dB) Note that the observed S/N ratios for standard response time under the starting and optimum conditions are within their respective two-standard-deviation confidence limits. However, they are rather close to the limits. Also, the observed improvement (8.39 dB) in the S/N ratio is quite large compared to the improvement predicted by the data (3.47 dB). However, the difference is well within the confidence limits. Also, observe that the S/N ratio under the optimum conditions is better than the best among the 18 experiments. Thus, here we achieved a 60- to 70-percent improvement in response time by improved system administration. 
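The equivalent sample size and confidence limits quoted above can be reproduced with a short calculation. The sketch below assumes, as the prediction in Table 10.5 does, that five factor effects are retained, each estimated from six of the 18 runs under the starting condition, while the optimum condition replaces the disk-drive term with level A2 and its twelve runs; the error variance 3.78 is the pooled error mean square from Table 10.4. The function name is ours, not the book's.

```python
import math

def equivalent_sample_size(n_total, runs_per_retained_effect):
    """1/n_e = 1/n + sum over retained effects of (1/n_i - 1/n),
    following the procedure of Chapter 3, Equation (3.14)."""
    inv = 1.0 / n_total + sum(1.0 / n_i - 1.0 / n_total
                              for n_i in runs_per_retained_effect)
    return 1.0 / inv

error_variance = 3.78   # pooled error mean square (Table 10.4)

n_e_start = equivalent_sample_size(18, [6, 6, 6, 6, 6])    # ~1.64
n_e_opt   = equivalent_sample_size(18, [12, 6, 6, 6, 6])   # ~1.89

for label, n_e, eta_hat in (("starting", n_e_start, -14.67),
                            ("optimum",  n_e_opt,   -11.22)):
    half_width = 2.0 * math.sqrt(error_variance / n_e)
    print(f"{label}: {eta_hat} dB, two-sigma limits +/- {half_width:.2f} dB")
```

Running the sketch reproduces the half-widths of 3.04 dB and 2.83 dB quoted above, with the narrower interval for the optimum condition reflecting the twelve runs at level A2.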
Following this experiment, two similar experiments were performed by Klingler and Nazaret [K5], who took extra care to ensure that pub- lished UNIX system tuning guides were used to establish the starting conditions. Their experiments still led to a 20- to 40-percent improvement in response time. One extra factor they considered was the use of PDQs, which are special auxiliary processors for handling text processing jobs (troff). For their systems, it turned out that the use of PDQs could hurt the response time rather than help. 10.7 STANDARDIZED S/N RATIO When running Robust Design experiments with live systems, we face the methodologi- cal problem that the noise conditions, which are the load conditions for our computer system optimization, are not the same for every row of the control orthogonal array. This can lead to inaccuracies in the conclusions. One way to minimize the impact of changing noise conditions is to construct a standardized S/N ratio, which we describe next. As noted earlier, some of the more important load measures for the computer system optimization experiment are: number of users, CPU demand, 1/O demand, and Sec. 10.7 ‘Standardized S/N Ratio 247 memory demand. After studying the load pattern over the days when the case study was conducted, we can define low and high levels for each of these load measures, as shown in Figure 10.4. These levels should be defined so that for every experiment we have a reasonable number of observations at each level. The 16 different possible combinations of the levels of these four load measures are listed in Table 10.7 and are nothing more than 16 different noise conditions. No. of Users ‘CPU Demand VO Demand Memory Demand Average Value Range Figure 10.4 Selection of levels for load measures. In a live experiment, the number of observations for each noise condition can change from experiment to experiment. For instance, one day the load might be heavy while another day it might be light. Although we cannot dictate the load condition for the different experiments, we can observe the offered load. The impact of load varia- tion can be minimized as follows: We first compute the average response time for each experiment in each of the 16 load conditions. We then treat these 16 averages as raw data to compute the S/N ratio for each experiment. The S/N ratio computed in this manner is called standardized S/N ratio because it effectively standardizes the load conditions for each experiment. The system can then be optimized using this standard- ized S/N ratio. Note that for the standardized S/N ratio to work, we must have good definitions of noise factors and ways of measuring them. Also, in each experiment every noise condition must occur at least once to be able to compute the average. In practice, 208 Tuning Computer Systems for High Performance Chap. 10 TABLE 10.7 COMPUTATION OF STANDARDIZED S/N RATIO” Expt. | No.of | CPU VO | Memory | Average Response No. | Users | Demand | Demand | Demand | Time for Experiment 1 1 1 1 1 va 2 1 1 1 2 Yan 3 1 1 2 1 Ys 4 1 1 2 2 Jia 5 1 2 1 1 Ys 6 1 2 1 2 Ye 7 1 2 2 1 Yin 8 1 2 2 2 Yis 9 | 2 1 1 1 Ye w | 2 1 1 2 Yaw u 2 1 2 1 on n | 2 1 2 2 yin 3 | 2 2 1 1 is 4 | 2 2 1 2 Jina | 2 2 2 1 Yass w | 2 2 2 2 Yire * Standardized S/N ratio for experiment i, 1 w= 10 loBi0 Lag vi however, if one or two conditions are missing, we may compute the S/N ratio with the available noise conditions without much harm. 
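The footnote of Table 10.7 defines the standardized S/N ratio as the smaller-the-better S/N ratio applied to the 16 load-condition averages. The minimal sketch below assumes the raw observations for one row of the control orthogonal array are available as (load condition, response time) pairs; the function and variable names are ours, not the book's.

```python
import math
from collections import defaultdict

def standardized_sn_ratio(observations):
    """observations: (load_condition, response_time) pairs for one row of
    the control orthogonal array. Averages within each observed load
    condition, then applies eta = -10 log10( mean of squared averages )."""
    by_condition = defaultdict(list)
    for condition, y in observations:
        by_condition[condition].append(y)

    means = [sum(values) / len(values) for values in by_condition.values()]
    return -10.0 * math.log10(sum(m * m for m in means) / len(means))
```

A load condition with no observations in a particular experiment simply drops out of the average, which matches the remark above that one or two missing conditions can be tolerated without much harm.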
In the computer system optimization case study discussed in this chapter, the experimenters used the concept of standardized S/N ratio to obtain better comparison of the starting and optimum conditions. Some researchers expressed a concem that part of the improvement observed in this case study might have been due to changed load conditions over the period of three months that it took the team to conduct the matrix experiment. Accordingly, the experimenters computed the standardized S/N ratio for the two conditions and the results are shown in Table 10.6 along with the other results of the verification experiment. Since the improvement in the standardized S/N ratio is quite close to that in the original S/N ratio, the experiment team concluded that the load change had a minimal impact on the improvement. Sec. 10.9 Summary 249 10.8 RELATED APPLICATIONS Computer System Parameter Space Mapping In practice, running live experiments to optimize every computer installation is obvi- ously not practicable. Instead, more benefits can be achieved by performing off-line matrix experiments to optimize for several different load conditions. Such load condi- tions can be simulated using facilities like the UNIX system benchmarking facility. The information gained from these off-line experiments can be used to map the operat- ing system parameter space. This catalog of experimental results can then be used to improve the performance of different machines. Optimizing Operations Support Systems Large systems, such as the traffic in a telecommunications network, are often managed by software systems that are commonly known as operations support systems. The load offered to a telephone network varies widely from weekday to weekend, from morning to evening, from working days to holidays, etc. Managing the network includes defining strategies for providing the needed transmission and switching facili- ties, routing a call end to end on the network, and many other tasks. Steve Eick and Madhav Phadke applied the Robust Design method in AT&T to improve the strategy for adapting the routing algorithm to handle unusually high demands. Studying eight parameters of the strategy using the Lj orthogonal array, they were able to increase the excess load carried during unusually high demands by as much as 70 percent. This not only improved the revenue, but also improved network performance by reducing the number of blocked calls, These experiments were done on a simulator because of the high risks of the live experiments. International telephone traffic is carried over several different types of transmis- sion facilities. Seshadri [S4] used the Robust Design method to determine the pre- ferred facility for transmitting facsimile data by conducting live experiments on the network. ‘The success in improving the telephone network traffic management suggests that the Robust Design approach could be used successfully to improve other networks, for example, air traffic management. In fact, it is reported that the method has been used in Japan to optimize runway and air-space usage at major airports. 10.9 SUMMARY + The performance of complex systems (such as a computer), which have tunable parameters, can often be improved through better system administration. Tuning is necessary because the parameter settings specified by the system manufacturer Tuning Computer Systems for High Performance Chap. 
10 pertain to specific assumptions regarding the applications and load, However, the actual applications and load could be different from those envisioned by the manufacturer. There are two systematic approaches for optimizing a complex system: (1) micro-modeling and (2) macro-modeling. The macro-modeling approach can utilize the Robust Design method to achieve more rapid and efficient results. Load variation during use of the computer, from day to day and as a function of the time of day, constitutes the main noise factor for the computer system. The number of users logged on, CPU demand, I/O demand, and memory demand are some of the more important load measures. The response time for the standard (or the trivial) command minus the minimum possible time for that command was the quality characteristic for the case study. It is a smaller-the-better type characteristic. ‘The minimum possible response time was ignored in the analysis as it was very small compared to the average response time. Eight control factors were chosen for the case study: disk drives (A), file distri- bution (B), memory size (C), system buffers (D), sticky bits (E), KMCs used (F), INODE table entries (G), and other system tables (H), Factors A, C, and F are hardware related, whereas the others are software related. Factor F had two lev- els while the others had three levels. The Lig orthogonal array was used for the matrix experiment. Disk drives was the most difficult factor to change, and file distribution was the next most difficult factor to change. Therefore, the 18 experiments were conducted in an order that minimized changes in these two factors. Also, the experiments were ordered in a way that allowed changing level 3 of disk drives, as anticipated dur- ing the planning stage. The experiments were conducted on a live system. Each experiment lasted for two days, with eight hours per day. Response time for the standard and the trivial commands was observed once every 10 minutes by using an automatic measurement facility. Running Robust Design experiments on live systems involves a methodological problem that the noise conditions (which are the load conditions in the computer Tesponse time case study) are not the same for every row of the matrix experi- ment, Consequently, the estimated error variance can be large compared to the estimated factor effects. Also, the predicted response under the optimum condi- tions can have large variance. Indeed, this was observed in the present case study. Standardized S/N ratios can be used to reduce the adverse effect of changing noise conditions, which is encountered in running Robust Design experiments on live systems, Sec. 10.9 ‘Summary 251 Data analysis indicated that levels of four control factors (A, D, E, and F) should be changed and that the levels of the other four factors should be kept the same. It also indicated that the next round of experiments should consider setting sticky bits on more than eight commands. The verification experiment showed a 61-percent improvement in both the mean and rms response time for the standard command. This corresponds to an 83- percent reduction in variance of the response time. It also showed a 71-percent improvement in mean response time and a 79-percent improvement in rms response time for the trivial command. Computer response time optimization experiments can be conducted off-line using automatic load generation facilities such as the UNIX system benchmark- ing facility. 
Off-line experiments are less disruptive for the user community. Orthogonal arrays can be used to determine the load conditions to be simulated. Chapter 11 RELIABILITY IMPROVEMENT Increasing the longevity of a product, the time between maintenance of a manufactur- ing process, or the time between two tool changes are always important considerations in engineering design. These considerations are often lumped under the term reliability improvement. There are three fundamental ways of improving the reliability of a pro- duct during the design stage: (1) reduce the sensitivity of the product’s function to the variation in product parameters, (2) reduce the rate of change of the product parame- ters, and (3) include redundancy. The first approach is nothing more than the parame- ter design described earlier in this book. The second approach is analogous to toler- ance design and it typically involves more expensive components or manufacturing processes. Thus, this approach should be considered only after sensitivity has been minimized. The third approach is used when the cost of failure of the product is high compared to the cost of providing redundant components or even the whole product. The most cost-effective approach for reliability improvement is to find appropri- ate continuous quality characteristics and reduce their sensitivity to all noise factors. Guidelines for selecting quality characteristics related with product reliability were dis- cussed in Chapter 6 in connection with the design of the paper handling system in copying machines. As noted in Chapter 6, determining such quality characteristics is not always easy with existing engineering know-how. Life tests, therefore, must be performed to identify settings of control factors that lead to longer product life. This chapter shows the use of the Robust Design methodology in reliability improvement through a case study of router bit life improvement conducted by Dave Chrisman and Madhav Phadke, documented in Reference [P3). 253 254 Reliability Improvement Chap. 11 This chapter has nine sections: Section 11.1 describes the role of signal-to-noise (S/N) ratios in reliability improvement. Section 11.2 describes the routing process and the goal for the case study (Step 1 of the eight Robust Design steps described in Chapter 4), Section 11.3 describes the noise factors and quality characteristic (Steps 2 and 3). Section 11.4 gives the control factors and their alternate levels (Step 4). Section 11.5 describes the construction of the control orthogonal array (Step 5). The requirements for the project were such that the construction of the control orthogonal array was a complicated combinatoric problem, Section 11.6 describes the experimental procedure (Step 6). Section 11.7 gives the data analysis, selection of optimum conditions, and the verification experiment (Steps 7 and 8). Section 11.8 discusses estimation of the factor effects on the survival probability curves, Section 11.9 summarizes the important points of this chapter. 11.1 ROLE OF S/N RATIOS IN RELIABILITY IMPROVEMENT First, let us note the difference between reliability characterization and reliability improvement. Reliability characterization refers to building a statistical model for the failure times of the product. Log-normal and Weibull distributions are commonly used for modeling the failure times. Such models are most useful for predicting warranty cost. 
Reliability improvement means changing the product design, including the set- tings of the control factors, so that time to failure increases, Invariably, it is expensive to conduct life tests so that an adequate failure-time model can be estimated. Consequently, building adequate failure-time models under various settings of control parameters, as in an orthogonal array experiment, becomes impractical and, hence, is hardly ever done. In fact, it is recommended that conducting life tests should be reserved as far as possible only for a final check on a product. Accelerated life tests are well-suited for this purpose. For improving a product’s reliability, we should find appropriate quality charac- teristics for the product and minimize its sensitivity to all noise factors. This automati- cally increases the product's life. The following example clarifies the relationship between the life of a product and sensitivity to noise factors. Consider an electrical circuit whose output voltage, y, is a critical characteristic. If it deviates too far from the target, the circuit’s function fails. Suppose the variation in a resistor, R, plays a key role in the variation of y. Also, suppose, the resistance R is sensitive to environmental temperature and that the resistance increases at a certain Sec. 11.1 Role of S/N Ratios in Reliability Improvement 255 rate with aging. During the use of the circuit, the ambient temperature may go too high or too low, or sufficient time may pass leading to a large deviation in R. Conse- quently, the characteristic y would go outside the limits and the product would fail Now, if we change the nominal values of appropriate control factors, so that y is much less sensitive to variation in R, then for the same ambient temperatures faced by the circuit and the same rate of change of R due to aging, we would get longer life out of that circuit. Sensitivity of the voltage y to the noise factors is measured by the S/N ratio. Note that in experiments for improving the S/N ratio, we may use only temperature as the noise factor. Reducing sensitivity to temperature means reducing sensitivity to variation in R and, hence, reducing sensitivity to the aging of R also. Thus, by appropriate choice of testing conditions (noise factor settings) during Robust Design experiments, we can improve the product life as well. Estimation of Life Using a Benchmark Product Estimating the life of a newly designed product under customer-usage conditions is always a concern for the product designer. It is particularly important to estimate the life without actually conducting field studies because of the high expense and the long delay in getting results. Benchmark products and the S/N ratio can prove to be very useful in this regard. A benchmark product could be the earlier version of that product with which we have a fair amount of field experience. Suppose we test the current product and the benchmark product under the same set of testing conditions. That is, we measure the quality characteristic under different levels of the noise factors. For example, in the circuit example above, we may meas- ure the resistance R at the nominal temperature and one or more elevated temperatures. The test may also include holding the circuit at an elevated temperature for a short period of time and then measuring R. By analyzing the test data, suppose we find the S/N ratio for the current product as 4, and for the benchmark product as qj. 
Then, the sensitivity of the quality characteristic for the current product is 7, — Tz dB lower than that for the benchmark product—that is, the variance for the current product is smaller by a factor of r, where r is given by [aoe | r=10' © It is often the case that the rate of drift of a product’s quality characteristic is proportional to the sensitivity of the quality characteristic to noise factors. Also, the drift in the quality characteristic as a function of time can be approximated reasonably well by the Wiener process. Then, through standard theory of level crossing, we can infer that the average life of the current product would be r times longer than the life 256 Reliability Improvement Chap. 11 of the benchmark product whose average life is known through past experience. Thus, the S/N ratio permits us to estimate the life of a new product in a simple way without conducting expensive and time-consuming life tests. This section described the role of S/N ratios in reliability improvement. This is a more cost-effective and, hence, preferred way to improve the reliability of a product or a process. However, for a variety of reasons (including lack of adequate engineering know-how about the product) we are forced, in some situations, to conduct life tests to find a way of improving reliability. In the remaining sections of this chapter, we describe a case study of improving the life of router bits by conducting life studies. 11.2 THE ROUTING PROCESS Typically, printed wiring boards are made in panels of 18 x 24 in. size. Appropriate size boards, say 8 X 4 in,, are cut from the panels by stamping or by the routing pro- cess. A benefit of the routing process is that it gives good dimensional control and smooth edges, thus reducing friction and abrasion during the circuit pack insertion pro- cess. When the router bit gets dull, it produces excessive dust which then cakes on the edges of the boards and makes them rough. In such cases, a costly cleaning operation is necessary to smooth the edges. However, changing the router bits frequently is also expensive. In the case study, the objective was to increase the life of the router bits, primarily with regard to the beginning of excessive dust formation. The routing machine used had four spindles, all of which were synchronized in their rotational speed, horizontal feed (X-Y feed), and vertical feed (in-feed). Each spindle did the routing operation on a separate stack of panels. Typically, two to four panels are stacked together to be cut by each spindle. The cutting process consists of lowering the spindle to an edge of a board, cutting the board all around using the X-Y feed of the spindle, and then lifting the spindle, This is repeated for each board on a panel. 11.3 NOISE FACTORS AND QUALITY CHARACTERISTICS Some of the important noise factors for the routing process are the out-of-center rota- tion of the spindle, the variation from one router bit to another, the variation in the material properties within a panel and from panel to panel, and the variation in the speed of the drive motor. Ideally, we should look for a quality characteristic that is a continuous variable related to the energy transfer in the routing process, Such a variable could be the wear of the cutting edge or the change in the cutting edge geometry. However, these vari- ables are difficult to measure, and the researchers wanted to keep the experiment sim- ple. 
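Before turning to the control factors, it may help to restate the benchmark-life estimate of Section 11.1 numerically. The sketch below simply evaluates r = 10^{(\eta_1 - \eta_2)/10}; the two S/N ratios used here are invented purely for illustration.

```python
def life_ratio(eta_new, eta_benchmark):
    """Estimated ratio of average lives: the new product is expected to last
    r times as long as the benchmark product."""
    return 10.0 ** ((eta_new - eta_benchmark) / 10.0)

# Invented values: the new design's S/N ratio is 3 dB above the benchmark's,
# so its average life is estimated to be about twice as long.
print(round(life_ratio(-11.0, -14.0), 2))   # ~2.0
```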
Therefore, the amount of cut before a bit starts to produce an appreciable amount of dust was used as the quality characteristic. This is the useful life of the bit. Sec. 11.4 Control Factors and Their Levels 257 11.4 CONTROL FACTORS AND THEIR LEVELS The control factors selected for this project are listed in Table 11.1. Also listed in the table are the control factors’ starting and alternate levels. The rationale behind the selection of some of these factors and their levels is given next. TABLE 11.1 CONTROL FACTORS AND THEIR LEVELS Sf Levee Factors 1} 2|]3 ]4 A. Suction (in of Hg) | 1 | 2 B. X-Y feed (inmin) | 60 | 80 C. In-feed (in/min) 10 | 50 D. Type of bit rf2),3]4 E. Spindle positiont 1} 2]3]4 F, Suction foot sR | BB G. Stacking height (in) | 3/16 | 174 H. Depth of slot (thou) | 60 | 100 1. Speed (rpm) 30K | 40K, * Starting levels are indicated by an underscore. + Spindle position is not a control factor. It is a noise factor. Suction (factor A) is used around the router bit to remove the dust as it is gen- erated, Obviously, higher suction could reduce the amount of dust retained on the boards. The starting suction was two inches of mercury (Hg). However, the pump used in the experiment could not produce more suction, So, one inch of Hg was chosen as the alternate level, with the plan that if the experiments showed a significant difference in the dust, a more powerful pump would be obtained. Related to the suction are suction foot and the depth of the backup slot. The suc- tion foot determines how the suction is localized near the cutting point. Two types of suction foot (factor F) were chosen: solid ring (SR) and bristle brush (BB). A backup panel is located undemeath the panels being routed. Slots are precut in this backup panel to provide air passage and a place for dust to accumulate temporarily. The depth of these slots was a control factor (factor H) in the case study. 258 Reliability Improvement Chap. 11 Stack height (factor G) and X-Y feed (factor B) are control factors related to the productivity of the process—that is, they determine how many boards are cut per hour. The 3/16-in. stack height meant three panels were stacked together while 1/4-in. stack height meant four panels were stacked together. The in-feed (factor C) determines the impact force during the lowering of the spindle for starting to cut a new board. Thus, it could influence the life of the bit regarding breakage or damage to its point. Four different nypes of router bits (factor D) made by different manufacturers were investi- gated in this study. The router bits varied in cutting geometry in terms of the hel angle, the number of flutes, and the type of point. Spindle position (factor E) is not a control factor. The variation in the state of adjustment of the four spindles is indeed a noise factor for the routing process. All spindle positions must be used in actual production; otherwise, the productivity would suffer. The reason it was included in the study is that in such situations one must choose the settings of control factors that work well with all four spindles, The rationale for including the spindle position along with the control factors is given in Section 11.5. 11.5 DESIGN OF THE MATRIX EXPERIMENT For this case study, the goal was to not only estimate the main effects of the nine fac- tors listed in the previous section, but also to estimate the four key 2-factor interac- tions. 
Note that there are 36 distinct ways of choosing two factors from among nine factors, ‘Thus, the number of two-factor interactions associated with nine factors is 36. An attempt to estimate them all would take excessive experimentation, which is also unnecessary anyway. ‘The four interactions chosen for the case study were the ones judged to be more important based on the knowledge of the cutting process: 4. (X-Y feed) x (speed), that is, B x 1 2. (in-feed) x (speed), that is, C x I 3. (stack height) x (speed), that is, G x I 4, (X-Y feed) x (stack height), that is, Bx G The primary purpose of studying interactions in a matrix experiment is to see if any of those interactions are strongly antisynergistic. Lack of strong antisynergistic interactions is the ideal outcome. However, if a strong antisynergistic interaction is found, we should look for a better quality characteristic. If we cannot find such a characteristic, we should look for finding levels of some other control factors that cause the antisynergistic interaction to disappear. Such an approach leads to a stable design compared to an approach where one picks the best combination when a strong antisyn- ergistic interaction is found. In addition to the requirements listed thus far, the experimenters had to consider the following aspects from a practical viewpoint: Sec. 11.5 Design of the Matrix Experiment 259 * Suction (factor A) was difficult to change due to difficult access to the pump. * All four spindles on a machine move in identical ways—that is, they have the same X-Y feed, in-feed, and speed. So, the columns assigned to these factors should be such that groups of four rows can be made, where each group has a common X-Y feed, in-feed, and speed. This allows all four spindles to be used effectively in the matrix experiment. Construction of the Orthogonal Array ‘These requirements for constructing the control orthogonal array are fairly complicated. Let us now see how we can apply the advanced strategy described in Chapter 7 to con- struct an appropriate orthogonal array for this project. First, the degrees of freedom for this project can be calculated as follows: Source Degrees of Freedom Two 4-level factors 2x(4- 1I)=6 Seven 2-level factors 7x(2-1)=7 Four 2-factor interactions 4x(@2-D)x2-D=4 between 2-level columns Overall mean 1 Total 18 Since there are 2-level and 4-level factors in this project, it is preferable to use an array from the 2-level series. Because there are 18 degrees of freedom, the array must have 18 or more rows. The next smallest 2-level array is Lp. The linear graph needed for this case study, called the required linear graph, is shown in Figure 11.1(a). Note that each 2-level factor is represented by a dot, and interaction is represented by a line connecting the corresponding dots. Each 4-level factor is represented by two dots, connected by a line according to the column merging method in Chapter 7. The next step in the advanced strategy for constructing orthogonal arrays is to select a suitable linear graph of the orthogonal array L3) and modify it to fit the required linear graph. Here we take a slightly different approach. We first simplify the required linear graph by taking advantage of the special circumstances and then proceed to fit a standard linear graph. 260 Reliability Improvement Chap. 
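The degree-of-freedom count at the start of this section is easy to script for any candidate study. Below is a small, generic helper (ours, not the book's) that reproduces the total of 18 for this project.

```python
def degrees_of_freedom(factor_levels, interaction_pairs):
    """factor_levels: number of levels of each factor to be studied.
    interaction_pairs: (levels_a, levels_b) for each 2-factor interaction."""
    dof = 1                                    # overall mean
    dof += sum(k - 1 for k in factor_levels)   # main effects
    dof += sum((a - 1) * (b - 1) for a, b in interaction_pairs)
    return dof

# Two 4-level factors, seven 2-level factors, and four interactions
# between 2-level factors:
print(degrees_of_freedom([4, 4] + [2] * 7, [(2, 2)] * 4))   # 18
```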
11 xB IxG | l e e D E H F (a) Required Linear Graph vo | ] e e Cc D E F H ve—en (b) Modified Required Linear Graph 1 4 5 7 6 3 | ] aI ‘| 2 8 10 9 " (¢) Selected Standard Linear Graph of Ly, G 15 7 4 a 7 “] J] 10 2 9 8 e 8 D E (d) Modified Standard Linear Graph Figure 11.1 Linear graph for the routing project. We notice that we must estimate the interactions of the factor I with three other factors. One way to simplify the required linear graph is to teat I as an outer ‘factor—that is, to first construct an orthogonal array by ignoring 1 and its interactions. Sec. 11.5 Design of the Matrix Experiment 261 Then, conduct each row of the orthogonal array with the two levels of I. By so doing, we can estimate the main effect of I and also the interactions of I with all other factors. The modified required linear graph, after dropping factor I and its interactions is shown in Figure 11.1(b). This is a much simpler linear graph and, hence, easier to fit to a standard linear graph. Dropping the 2-level factor I and its interactions with three, 2evel factors is equivalent to reducing the degrees of freedom by four. Thus, there are 14 degrees of freedom associated with the linear graph of Figure 11.1(b). There- fore, the orthogonal array Ly, can be used to fit the linear graph of Figure 11.1(b). This represents a substantial simplification compared with having to use the array L3 for the original required linear graph. The linear graph of Figure 11.1(b) has three lines, connecting two dots each, and four isolated dots. Thus, a standard linear graph that has a number of lines that connect pairs of dots seems most appropriate. Such a linear graph was selected from the standard linear graphs of L jg given in Appendix C and it is shown in Figure 11.1(c). It has five lines, each connecting two distinct dots. The step-by-step modification of this linear graph to make it fit the one in Figure 11.1(b) is discussed next. The requirement that there should be as few changes as possible in factor A implies that factor A should be assigned to column 1. Therefore, we break the line connecting dots 1 and 2, giving us three dots 1, 2, and 3. Now, we want to make groups of four rows which have common X-Y feed (factor B) and in-feed (factor C). Therefore, these two factors should be assigned to columns that have fewer changes. Hence, we assign columns 2 and 3 to factors B and C, respectively. Referring to the columns 2 and 3 of the array Lg in Table 11.2, it is clear that each of the four groups of four rows (1-4, 5-8, 9-12, and 13-16) has a common X-Y feed and in-feed. Now we should construct a 4-level column for spindle position (factor E) so that all four spindle positions will be present in each group of the four rows mentioned above. We observe that this can be achieved by merging columns 4 and 8 according to Section 7.7. Of course, column 12, which represents the interaction between columns 4 and 8, must be kept empty. Note that we could have used any of the other three lines in Figure 11.1(6) for this purpose. Next, from among the remaining three lines in the standard linear graph, we arbi- trarily chose columns 7 and 9 to form a 4-level column for factor D. Of course, the interaction column 14 must be kept empty. The two remaining lines are then broken to form six isolated dots corresponding to columns 5, 10, 15, 6, 11, and 13. The next priority is to pick a column for factor G so that the interaction B x G would be contained in one of the remaining five columns. 
For this purpose, we refer to the interaction table for the Ly array given in Appendix C. We picked column 15 for factor G. Column 13 contains interaction between columns 2 and 15, so it can be used to estimate the interaction B x G. We indicate this in the linear graph by a line joining the dots for the columns 2 and 15. From the remaining four columns, we arbitrarily assign columns 10 and 5 to fac- tors F and H. Columns 6 and 11 are kept empty. Chap. 11 Reliability Improvement TABLE 11.2 Lis ORTHOGONAL ARRAY Column Number 7 8 9 10 ML 12 13 14 15 6 1 Expt. No. 10 u 2 3 14 15 16 The final, modified standard linear graph along with the assignment of factors to columns is shown in Figure 11.1(d). The assignment of factors to the columns of the Lig array is as follows: Column Factor 10 15 13 BxG Column Factor 7,9, 14 D E 4, 8, 12 Sec. 11.5 Design of the Matrix Experiment 263 The 4-level columns for factors D and E were formed in the array Lys by the column merging method of Section 7.7. The resulting 16 row orthogonal array is the same as the first 16 rows of Table 11.3, except for the column for factor I. Because I is an outer factor, we obtain the entire matrix experiment as follows: make rows 17-32 the same as rows 1-16. Add a column for factor I that has 1 in the rows 1-16 and 2 in the rows 17-32. Note that the final matrix experiment shown in Table 11.3 is indeed an orthogonal array——that is, in every pair of columns, all combinations occur and they occur an equal number of times. We ask the reader to verify this claim for a few pairs of columns, Note that the matrix experiment of Table 11.3 satisfies all the require- ments set forth earlier in this section. The 32 experiments in the control orthogonal array of Table 11.3 are arranged in eight groups of four experiments such that: a, For each group there is a common speed, X-Y feed, and in-feed b. The four experiments in each group correspond to four different spindles Thus, each group constitutes a machine run using all four spindles, and the 32 experi- ments in the control orthogonal array can be conducted in eight runs of the routing machine. Observe the ease with which we were able to construct an orthogonal array for a very complicated combinatoric problem using the standard orthogonal arrays and linear graphs prepared by Taguchi. Inclusion of a Noise Factor in a Matrix Experiment As a tule, noise factors should not be mixed in with the control factors in a matrix experiment (orthogonal array experiment). Instead, noise factors should be used to form different testing conditions so that the S/N ratio can accurately measure sensi- tivity to noise factors. According to this rule, we should have dropped the spindled position column in Table 11.3 and considered the four spindle positions as four testing conditions for each row of the orthogonal array, which would amount to a four times larger experimental effort. To save the experimental effort, in some situations we assign control factors as well as noise factors to the columns of the matrix experiment. In the router bit life improvement case study, we included the spindle position in the matrix experiment for the same reason, Note that in such matrix experiments noise is introduced systemati- cally and in a balanced manner. Also, results from such matrix experiments are more dependable than the results from a matrix experiment where noise is fixed at one level to save experimental effort. 
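As a closing note on the construction itself, the 4-level columns for factors D and E come from the column merging method of Section 7.7, in which two 2-level columns are combined and their interaction column is left empty. A minimal sketch follows; the particular (1,1) -> 1, ..., (2,2) -> 4 mapping and the example columns are illustrative, not the actual L16 columns used in the study.

```python
def merge_two_level_columns(col_a, col_b):
    """Combine two 2-level columns of an orthogonal array into a single
    4-level column. The interaction column of col_a and col_b is consumed
    by the merged factor and must be left empty."""
    mapping = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}
    return [mapping[pair] for pair in zip(col_a, col_b)]

# Illustrative 2-level columns (not the actual columns 4 and 8 of L16):
col_a = [1, 1, 2, 2, 1, 1, 2, 2]
col_b = [1, 2, 1, 2, 1, 2, 1, 2]
print(merge_two_level_columns(col_a, col_b))   # [1, 2, 3, 4, 1, 2, 3, 4]
```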
Control factor levels found optimum through such experi- ments are preferred levels, on average, for all noise factor levels in the experiment. In the router bit example, results imply that the control factor levels found optimum are, on average, preferred levels for all four spindle positions. Reliability Improvement Chap. 11 TABLE 11.3 MATRIX EXPERIMENT AND OBSERVED LIFE 264 B 2S3e2/saes/2asa| sass Sess kaka | sage 3 Ej----|----|----|----[aane é I 3 | Br - re [ere anne loa en ne tbo oa a lwewe | saw et eee BRe |- 00 an ? Belinea lanana|----janaale-n-e = % Bg|-eeti[ecce[seusimzne (sees | RARS/NARA|R RAS gz * Life was measured in hundreds of inches of movement in X-Y plane, Tests were terminated at 1,700 inches, Sec.11.7 Data Analysis 265 11.6 EXPERIMENTAL PROCEDURE In order to economize on the size of the experiment, the experimenters took only one observation of router bit life per row of the control orthogonal array. Of course, they realized that taking two or three noise conditions per row of the control orthogonal array would give them more accurate conclusions, However, doing this would mean exceeding the allowed time and budget. Thus, a total of only 32 bits were used in this project to determine the optimum settings of the control factors. During each machine run, the machine was stopped after every 100 in. of cut (that is, 100 in. of router bit movement in the X-Y plane) to inspect the amount of dust. If the dust was beyond a certain minimum predetermined level, the bit was recorded as failed. Also, if a bit broke, it was obviously considered to have failed. Otherwise, it was considered to have survived. Before the experiment was started, the average bit life was around 850 in. Thus, each experiment was stopped at 1,700 in. of cut, which is twice the original average life, and the survival or failure of the bit was recorded. Usually, measuring the exact failure time of a product is very difficult. There- fore, in practice, it is preferable to make periodic checks for survival as was done every 100 in. of cut in this case study. Also, running a life test beyond a certain point costs a lot, and it does not add substantially to the information about the preferred level of a control factor. Therefore, truncating the life test at an appropriate point is recom- mended. In the reliability analysis or reliability engineering literature, determining an interval in which a product fails is called interval censoring, whereas terminating a life test at a certain point is called right censoring. 11.7 DATA ANALYSIS Table 11.3 gives the experimental data in hundreds of inches. A reading of 0.5 means that the bit failed prior to the first inspection at 100 in. A reading of 3.5 means that the bit failed between 300 and 400 in. Other readings have similar interpretation, except the reading of 17.5 which means survival beyond 1,700 in., the point where the experiment was terminated. Notice that for 14 experiments, the life is 0.5 (50 in.), meaning that those conditions are extremely unfavorable. Also, there are eight cases of life equal to 17.5, which are very favorable conditions. During experimentation, it is important to take a broad range for each control factor so that a substantial number of favorable and unfavorable conditions are created. Much can be learned about the optimum settings of control factors when there is such diversity of data. Now we will show two simple and separate analyses of the life data for deter- mining the best levels for the control factors. 
The first analysis is aimed at determin- ing the effect of each control factor on the mean failure time. The second analysis, described in the next section, is useful for determining the effect of changing the level of each factor on the survival probability curve. Reliability Improvement Chap. 11 The life data was analyzed by the standard procedures described in Chapter 3 to determine the effects of the control factors on the mean life. The mean life for cach factor level and the results of analysis of variance are given in Table 11.4. These results are plotied in Figure 11.2. Note that in this analysis we have ignored the effect ‘of both types of censoring. The following conclusions are apparent from the plots in Figure 11.2: © 1-in. suction is as good as 2-in. suction. Therefore, it is unnecessary to increase suction beyond 2 in. * Slower X-Y feed gives longer life. * The effect of in-feed is small. TABLE 11.4 FACTOR EFFECTS AND ANALYSIS OF VARIANCE FOR ROUTER BIT LIFE* Level Meanst Sum of | Degrees of | Mean Factor 1 2 3 4 ‘Squares Freedom Square F A. Suction 5.94 5.94 0 1 oO 0.0 B, XY feed 1543 105.13 1 | aos.s | 4 C. In-feed 5446.44 8.00 1 8.00 | 03 D. Type of bit 6.03 2.63 3.63 11.38 367.38 % 122.46 | 48 E. Spindle position | 7.25 4.13 8.00 438 93.63, 3 aiai | 12 F. Suction foot 7.69 4.19 98.00 1 98.00 | 39 G, Stack height 8.56 3.31 220.50 1 220.50 | 8.7 H, Depth of slot 563 625 33 1 3.43 | OL L Speed 3.56 831 180.50 1 180.50 | 7.1 IxB 10.50 1 10.50 | 0.4 Ixc 10.13 1 10.13 | 04 IxG 171.13 1 A7NA3 | 6.7 BxG 4.50 | 450 | 02 Error 355.37 14 25.38 Total 1627.90 31 * Life in hundreds of inches. + Overall mean life = 5.94 hundreds of inches; starting conditions are identified by an underscore. Sec.11.7 Data Analysis 267 The starting bit is the best of the four types. The difference among the spindles is small. However, we should check the centering of spindles 2 and 4. Solid ring suction foot is better than the existing bristle brush type. Lowering the stack height makes a large improvement. ‘This change, however, raises a machine productivity issue. The depth of the slot in the backup material has a negligible effect. Higher rotational speed gives improved life. If the machine stability permits, even higher speed should be tried in the next cycle of experiments. The only 2-factor interaction that is large is the stack height versus speed interac- tion. However, this interaction is of the synergistic type. Therefore, the optimum settings of these factors suggested by the main effects are consistent with those suggested by the interaction. Optimum Control Factor Settings The best settings of the control factors, called optimum 1, suggested by the results above, along with their starting levels, are displayed side by side in Table 11.5. Using the linear model and taking into consideration only the terms for which the variance ratio is large (that is, the factors B, D, F, G, I and interaction I x G), we can predict the router bit life under the starting, optimum, or any other combination of control factor settings. The predicted life under the starting and optimum conditions are 888 in. and 2,225 in., respectively. The computations involved in the prediction are displayed in Table 11.5. 
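The predictions of 888 in. and 2,225 in. follow directly from the additive model: the overall mean of 5.94 (hundreds of inches) plus the contributions of the retained terms. The sketch below reproduces the two totals using the level contributions implied by the level means in Table 11.4; the labels and sign assignments are our reading of the tables, so treat this as an arithmetic check rather than new analysis.

```python
OVERALL_MEAN = 5.94   # hundreds of inches

def predict_life(contributions):
    """Additive-model prediction: overall mean plus the contributions of the
    factor levels (and the I x G interaction) with large variance ratios."""
    return OVERALL_MEAN + sum(contributions.values())

optimum_1 = {"X-Y feed 60 in/min": 1.81, "best bit type": 5.44,
             "solid-ring foot": 1.75, "3/16-in stack": 2.63,
             "40K rpm": 2.37, "speed x stack height": 2.31}

starting = {"X-Y feed 60 in/min": 1.81, "best bit type": 5.44,
            "bristle-brush foot": -1.75, "1/4-in stack": -2.63,
            "40K rpm": 2.37, "speed x stack height": -2.30}

print(round(predict_life(optimum_1), 2))   # 22.25 -> about 2,225 in.
print(round(predict_life(starting), 2))    # 8.88  -> about 888 in.
```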
Note that the contribution of the I x G interaction under starting conditions was computed as follows: (Contribution at 2, G2 due to I x G interaction) = (71,6, -W) ~ (M1, —W) ~ Omg, -B) = (3.38-5.94) — (8.31 —5.94) — (3.31-5.94) =-2.30 Note that m),¢, is the mean life for the experiments with speed / and stack height Gz. The contribution of the I x G interaction under the optimum conditions was com- puted in a similar manner. Because of the censoring during the experiment at 1,700 268 Reliability Improvement Chap. 11 15 g 6 £ 310 8 s a5 § 2 0 A, A, 8 8, © 0, D0, D, D, D 2, 6080, 1050/1 23 4 2 OO; OO A 8 8 Suction X-Y Feed In Feed Type of Bit (in) (invmin) —_(irvmin) B10 3 g 8 =6 2 = : 0 ~ EEE, FF, GG HM, I te 2 3 4, SR BB 316 1/4 60 100, 30K 40K ao“ gL 2 8 8 SRB, ee, SM, Spindle Suction Stack Slot. Speed Position Foot Height Depth (rpm) {in} (thou) Figure 11.2 Main effects of control factors on router bit life and some 2-factor interac- tions. Two-standard-deviation confidence limits on the main effect for the starting level are also shown. Sec.11.7 Data Analysis. 1564 |x B Interaction é & 3 3107 = 8, 2 § 5 257 B, 0 30K 40K I. Speed (rpm) 15-4 e 1x C Interaction 3 2 3 10-4 g 8 ° § & 11 ZK 0 30K 40K \. Speed (rpm) 18 Ix G Interaction 2 2 . 10 g 6, 2 5 i $5 ny A ° 30K 40K |. Speed (rpm) 18 GxB Interaction t 2 3 10 3 = B, 2 2 § 3 5 8, 0 36 4 G. Stack Ht (in) Figure 11.2 (Continued) Main effects of control factors on router bit life and some 2-factor interactions. 270 Reliability Improvement Chap. 11 TABLE 11.5 PREDICTION OF LIFE USING ADDITIVE MODEL” Starting Condition Optimum 1 ‘Optimum 2 Factor Setting | Contribution | Setting | Contribution | Setting ) Contribution A. Suction A? - A - Ay - B. X-Y feed By 181 By 181 B, -1.81 C. infeed Ga - Oi ~ C - D. Type of bit Ds 5.44 Ds 344 Ds 5.44 E. Spindle position | £,-E, - E\-E, - E\-Es - F, Suction foot F, -115 Fy 1.75 F, 115 G. Stack height G, 2.63 G 2.63 G 2.63 H. Depth of slot Hy - Hy - ay - 1. Speed b 2.37 It 2.37 h 237 1xG interaction | 1,6; -230 G, | 231 LG, 231 Overall mean 5.94 5.94 5.94 Total 8.88 22.25 18.63 * Life in hundreds of inches. . these predictions, obviously, are likely to be on the low side, especially the predic- tion under optimum conditions which is likely to be much less than the realized value. From the machine logs, the router bit life under starting conditions was found to be 900 in., while the verification (confirmatory) experiment under optimum conditions yielded an average life in excess of 4,150 in. In selecting the best operating conditions for the routing process, one must con- sider the overall cost, which includes not only the cost of router bits but also the cost of machine productivity, the cost of cleaning the boards if needed, etc. Under the ‘optimum conditions shown in Table 11.5, the stack height is 3/16 in. as opposed to 1/4 in. under the starting conditions. This means three panels are cut simultaneously instead of four panels. However, the lost machine productivity caused by this change can be made up by increasing the X-Y feed. If the X-Y feed is increased to 80 in. per minute, the productivity of the machine would get back approximately to the starting level. The predicted router bit life under these alternate optimum conditions, called Sec. 11.8 Survival Probability Curves 2m optimum 2, is 1,863 in., which is about twice the predicted life for starting conditions. Thus, a 50-percent reduction in router bit cost can be achieved while still maintaining machine productivity. 
An auxiliary experiment typically would be needed to estimate precisely the effect of X-Y feed under the new settings of all other factors. This would enable us to make an accurate economic analysis. In summary, orthogonal array based matrix experiments are useful for finding optimum control factor settings with regard to product life. In the router bit example, the experimenters were able to improve the router bit life by a factor of 2 to 4. Accelerated Life Tests Sometimes, in order to see any failures in a reasonable time, life tests must be con- ducted under stressed conditions, such as higher than normal temperature or humidity. Such life tests are called accelerated life tests. An impomant concem in using accelerated life tests is how to ensure that the control factor levels found optimum dur- ing the accelerated tests will also be optimum under normal conditions, This can be achieved by including several stress levels in the matrix experiment and demonstrating additivity. For an application of the Robust Design method for accelerated life tests, see Phadke, Swann, and Hill [P6] and Mitchell [M1]. 11.8 SURVIVAL PROBABILITY CURVES The life data can also be analyzed in a different way (refer to the minute analysis method described in Taguchi [T1] and Taguchi and Wu [T7}) to construct the survival probability curves for the levels of each factor. To do so, we look at every 100 in. of cut and note which router bits failed and which survived. Table 11.6 shows the sur vival data displayed in this manner, Note that a 1 means survival and a 0 means failure. The survival data at every time point can be analyzed by the standard method described in Chapter 3 to determine the effects of various factors. Thus, for suction levels A, and A, the level means at 100 in, of cut are 0.4375 and 0.6875, respec- tively. These are nothing but the fraction of router bits surviving at 100 in. of cut for the two levels of suction. The survival probabilities can be estimated in a similar manner for each factor and each time period—100 in., 200 in., etc. These data are plotted in Figure 11.3. These plots graphically display the effects of factor level changes on the entire life curve and can be used to decide the optimum settings of the control factors. In this case, the conclusions from these plots are consistent with those from the analysis described in Section 11.6. Plots similar to those in Figure 11.3 can be used to predict the entire survival probability curve under a new set of factor level combinations such as the optimum combination, The prediction method is described in Chapter 5, Section 5.5 in conjunc- tion with the analysis of ordered categorical data (see also Taguchi and Wu [T7]. Reliability Improvement Chap. 11 Survival Probability Probability 0 1000 2000 0 1000 2000 Inches of Router Bit Movement Figure 11.3. Effects of control factors on survival probability curves. Sec. 11.8 Survival Probabllity Curves 1.0; F. Suction Foot Survival 0 1000 2000 1.0 G. Stack Height G $3 \" e 2 ane Ke He KeXeNeKONeKeNe Kee ° 1000 2000 1.0 H. Depth of Slot s BO dex, S805 Hy 3 2 (HKONKHNKKANK, 0 1000 2000 1.0 . |, Speed 32 $ 28 ae 0 1000 2000 Inches of Router Bit Movement Figure 11.3. (Continued) Effects of control factors on survival probability curves. Chap. 11 Reliability Improvement 274 TABLE 11.6 SURVIVAL DATA AT VARIOUS TIME POINTS Survival at Various Time Points* (100s of inches) 16. 17. 4.15, % 1 WL 12 13. B45 8 2 L No. 
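The minute analysis described in this section boils down to computing, at each 100-in. checkpoint, the fraction of bits at each factor level that are still surviving. A small sketch follows, assuming the life readings of Table 11.3 (in hundreds of inches, with 17.5 denoting survival past the truncation point) and the corresponding factor-level column are available as lists; the names are ours.

```python
def survival_curve(lives, factor_column, level, checkpoints):
    """Fraction of units at the given factor level surviving past each
    checkpoint (lives and checkpoints both in hundreds of inches)."""
    group = [life for life, lvl in zip(lives, factor_column) if lvl == level]
    return [sum(1 for life in group if life > t) / len(group)
            for t in checkpoints]

# e.g. survival_curve(lives, suction_column, level=1, checkpoints=range(1, 18))
# reproduces values such as the 0.4375 survival fraction quoted for A1 at 100 in.
```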
10 u 2 B “4 15 16 ” 18 19 a B 7 29 31 2 * Entry 1 means survival of the bit; entry 0 means failure. Sec. 11.9 Summary 25 Note that in this method of determining life curves, no assumption was made regarding the shape of the curve—such as Weibull or log-normal distribution. Also, the total amount of data needed to come up with the life curves is small. In this exam- ple, it took only 32 samples to determine the effects of eight control factors. For a sin- gle good fit of a Weibull distribution, one typically needs several tens of observations. So, the approach used here can be very beneficial for reliability improvement projects. The approach of determining survival probability curves described here is similar to the Eulerian method of analyzing fluid flows. In the Eulerian method, one looks at fixed boundaries in space and examines the fluid masses crossing those boundaries. In the life study example, we look at fixed time points (here, measured in inches of cut) and observe which samples survive past those time points. The survival probability curves are constructed from these observations. In the Lagrangian method of analyzing fluid flows, one tracks a fluid particle and examines the changes in velocities, pressure, etc., experienced by the particle. Fluid flow equations are derived from this examina- tion. In the reliability study, the analogous way would be to observe the actual failure times of each sample and then analyze the failure data. The analysis of Section 11.6 was based on this approach. However, when there are many censored observations (in this example eight router bits did not fail by the time the experiment was stopped and readings were taken only at 100-in. intervals), the approach of this section, where sur- vival probability curves are estimated, gives more comprehensive information. 11.9 SUMMARY © Reliability improvement can be accomplished during the design stage in one of the three ways: (1) reduce sensitivity of the product’s function to the variation in product parameters, (2) reduce the rate of change of the product parameters, and (3) include redundancy. The first way is most cost-effective and it is the same as parameter design. The second way is analogous to tolerance design and it involves using more expensive components or manufacturing processes. The third way is used when the cost of failure is very large compared to the cost of providing spare components or the whole product. Finding an appropriate continuous quality characteristic and reducing its sensi- tivity to noise factors is the best way of improving reliability. When an appropriate quality characteristic cannot be found, then only life tests should be considered for life improvement. Matrix experiments using orthogonal arrays can be used to conduct life tests efficiently with several control factors. The S/N ratio calculated from a continuous quality characteristic can be used to estimate the average life of a new product. Let 7); be the S/N ratio for the new product and 1) be the S/N ratio for a benchmark product whose average life is 276 Reliability Improvement Chap. 11 known. Then the average life of the new product is r times the average life of the benchmark product, where The goal of the router bit case study was to reduce dust formation. Since there existed no continuous quality characteristic that could be observed conveniently, the life test was conducted to improve the router bit life. 
• Effects of nine factors, eight control factors and spindle position, were studied using an orthogonal array with 32 experiments. Out of the nine factors, two factors had four levels, and the remaining seven had two levels. Four specific 2-factor interactions were also studied. In addition, there were several physical restrictions regarding the factor levels. Use of Taguchi's linear graphs made it easy to construct the orthogonal array, which allowed the estimation of desired factor main effects and interactions while satisfying the physical restrictions.

• Only one router bit was used per experiment. Dust formation was observed every 100 in. of cut in order to judge the failure of the bit. The length of cut prior to formation of appreciable dust or breakage of the bit was called the bit life and it was used as the quality characteristic. Each experiment was terminated at 1,700 in. of cut regardless of failure or survival of the bit. Thus, the life data were censored.

• Effects of the nine factors on router bit life were computed and optimum levels for the control factors were identified. Under a set of optimum conditions, called optimum 1, a 4-fold increase in router bit life was observed, but with a 12.5-percent reduction in throughput. Under another set of optimum conditions, called optimum 2, a 2-fold increase in router bit life was observed, with no drop in throughput.

• The life data from a matrix experiment can also be analyzed by the minute analysis method to determine the effects of the control factors on the survival probability curves. This method of analysis does not presume any failure time distribution, such as the log-normal or Weibull distribution. Also, the total amount of data needed to determine the survival probability curves is small.

Appendix A
ORTHOGONALITY OF A MATRIX EXPERIMENT

Before defining the orthogonality of a matrix experiment we need to review the following definitions from linear algebra and statistics. Recall that η1, η2, ..., η9 are the observations for the nine rows of the matrix experiment given by Table 3.2. Consider the linear form, Li, given by

Li = wi1 η1 + wi2 η2 + ... + wi9 η9     (A.1)

which is a weighted sum of the nine observations. The linear form Li is called a contrast if the weights add up to zero, that is, if

wi1 + wi2 + ... + wi9 = 0.     (A.2)

Two contrasts, L1 and L2, are said to be orthogonal if the inner product of the vectors corresponding to their weights is zero, that is, if

w11 w21 + w12 w22 + ... + w19 w29 = 0.     (A.3)

Let us consider three weights w11, w12, and w13 corresponding to the three levels in column 1 of the matrix experiment given by Table 3.2 (Chapter 3). Then we call the following linear form, L1, the contrast corresponding to column 1:

L1 = w11 η1 + w11 η2 + w11 η3 + w12 η4 + w12 η5 + w12 η6 + w13 η7 + w13 η8 + w13 η9     (A.4)

provided all weights add up to zero. In this case, it implies

w11 + w12 + w13 = 0.     (A.5)

Note that in Equation (A.4) we use the weight w11 whenever the level is 1, weight w12 whenever the level is 2, and weight w13 whenever the level is 3.

An array used in a matrix experiment is called an orthogonal array if the contrasts corresponding to all its columns are mutually orthogonal. Let us consider columns 1 and 2 of the matrix experiment given by Table 3.2. Equation (A.4) is the contrast corresponding to column 1. Let w21, w22, and w23 be the weights corresponding to the three levels in column 2. Then, the contrast corresponding to column 2 is L2, where

L2 = w21 η1 + w22 η2 + w23 η3 + w21 η4 + w22 η5 + w23 η6 + w21 η7 + w22 η8 + w23 η9.     (A.6)
Of course, the weights in Equation (A.6) must add up to zero, that is,

w21 + w22 + w23 = 0.     (A.7)

The inner product of the vectors corresponding to the weights in the two contrasts L1 and L2 is given by

w11 w21 + w11 w22 + w11 w23 + w12 w21 + w12 w22 + w12 w23 + w13 w21 + w13 w22 + w13 w23
    = (w11 + w12 + w13)(w21 + w22 + w23) = 0.

Hence, columns 1 and 2 are mutually orthogonal. The orthogonality of all pairs of columns in the matrix experiment given by Table 3.2 can be verified in a similar manner. In general, it can be shown that the balancing property is a sufficient condition for a matrix experiment to be orthogonal.

In column 2 of the matrix experiment given by Table 3.2, suppose we replace level 3 by level 2. Then, looking at columns 1 and 2, it is clear that the two columns do not have the balancing property. However, for level 1 in column 1 there is one row with level 1 in column 2 and two rows with level 2 in column 2. Similarly, for level 2 in column 1, there is one row with level 1 in column 2 and two rows with level 2 in column 2. The same can be said about level 3 in column 1. This is called proportional balancing. It can be shown that proportional balancing is also a sufficient condition for a matrix experiment to be orthogonal.

Among the three weights corresponding to column 1, we can independently choose the values of any two of them. The third weight is then determined by Equation (A.5). Hence, we say that column 1 has two degrees of freedom. In general, a column with n levels has n − 1 degrees of freedom.
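The orthogonality check described in this appendix is easy to carry out numerically. The sketch below builds a contrast for each column of the standard L9 (3^4) array, using arbitrary (randomly chosen) weights that sum to zero for each column, and verifies that every pair of contrasts has zero inner product. The specific weights are illustrative and are not part of the appendix.

```python
import numpy as np

# Standard L9 (3^4) orthogonal array (rows = experiments, columns = factors).
L9 = np.array([
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
])

rng = np.random.default_rng(0)

def contrast_vector(column):
    """Build a contrast for one column: pick one weight per level, force the weights
    to sum to zero, and assign to each row the weight of that row's level (Equation A.4)."""
    levels = np.unique(column)
    w = rng.normal(size=levels.size)
    w -= w.mean()                      # make the weights add up to zero
    return w[np.searchsorted(levels, column)]

contrasts = [contrast_vector(L9[:, j]) for j in range(L9.shape[1])]

# Inner products between contrasts of different columns should all be zero.
for j in range(4):
    for k in range(j + 1, 4):
        print(f"columns {j+1} and {k+1}:", round(np.dot(contrasts[j], contrasts[k]), 12))
```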
Appendix B
UNCONSTRAINED OPTIMIZATION

Here we define in precise mathematical terms the problem of minimizing the variance of thickness while keeping the mean on target, and derive its solution.

Let z = (z1, z2, ..., z9)^T be the vector formed by the control factors; let x be the vector formed by the noise factors; and let y(x; z) denote the observed quality characteristic, namely the polysilicon layer thickness, for particular values of the noise and control factors. Note that y is nonnegative. Let μ(z) and σ²(z) denote the mean and variance of the response. Obviously, μ and σ² are obtained by integrating with respect to the probability density of x, and hence are functions of z only.

Problem Statement: The optimization problem can be stated as follows:

Minimize σ²(z) over z, subject to μ(z) = μ0.     (B.1)

This is a constrained optimization problem and very difficult to solve experimentally.

Solution: We postulate that one of the control factors is a scaling factor, say, z1. This implies that

y(x; z) = z1 h(x; z′)     (B.2)

for all x and z, where z′ = (z2, z3, ..., z9)^T, and h(x; z′) does not depend on z1. It follows that

μ(z) = z1 μ_h(z′)     (B.3)

and

σ²(z) = z1² σ_h²(z′)     (B.4)

where μ_h and σ_h² are, respectively, the mean and variance of h(x; z′).

Suppose z* = (z1*, z′*) is chosen by the following procedure:

(a) z′* is an argument that minimizes σ_h²(z′)/μ_h²(z′), and
(b) z1* is chosen such that μ(z1*, z′*) = μ0.

We will now show that z* is an optimum solution to the problem defined by Equation (B.1). First, note that z* is a feasible solution since μ(z1*, z′*) = μ0. Next consider any feasible solution z = (z1, z′). We have

σ²(z1, z′) = z1² σ_h²(z′) = [z1² μ_h²(z′)] [σ_h²(z′)/μ_h²(z′)] = μ²(z1, z′) [σ_h²(z′)/μ_h²(z′)] = μ0² [σ_h²(z′)/μ_h²(z′)].     (B.5)

Combining the definition of z* in (a) above and Equation (B.5), we have for all feasible solutions,

σ²(z1, z′) ≥ σ²(z1*, z′*).

Thus, z* = (z1*, z′*) is an optimal solution for the problem defined by Equation (B.1).

Referring to Equations (B.3) and (B.4), it is clear that

σ²(z1, z′)/μ²(z1, z′) = σ_h²(z′)/μ_h²(z′)

for any value of z1. Therefore, step (a) can be solved as follows:

(a′) choose any value for z1; then
(a″) find z′ that minimizes σ²(z1, z′)/μ²(z1, z′).

In fact, it is not even necessary to know which control factor is a scaling factor. We can discover the scaling factor by examining the effects of all control factors on the signal-to-noise (S/N) ratio and the mean. Any factor that has no effect on the S/N ratio, but a significant effect on the mean, can be used as a scaling factor.

In summary, the original constrained optimization problem can be solved as an unconstrained optimization problem, step (a), followed by adjusting the mean on target, step (b). For obvious reasons, this procedure is called a 2-step procedure. For further discussions on the 2-step procedure, see Taguchi and Phadke [T6]; Phadke and Dehnad [P4]; Leon, Shoemaker, and Kackar [L2]; Nair and Pregibon [N2]; and Box [B1]. The particular derivation given in this appendix was suggested by M. Hamami.

Note that the derivation above is perfectly valid if we replace z1 in Equation (B.2) by an arbitrary function g(z1) of z1. This represents a generalization of the common concept of linear scaling.
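As a numerical illustration of the 2-step procedure, the sketch below applies it to a made-up scalable model y = z1 h(x; z′) in which a single non-scaling factor z2 controls the relative variability and z1 acts as the scaling factor. The model, the candidate levels, and the target value are assumptions chosen for illustration; they are not the polysilicon deposition model.

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET = 3600.0                       # hypothetical target mean (e.g., thickness in angstroms)

def simulate(z1, z2, n=2000):
    """Made-up process: y = z1 * h(x; z2), where the noise x enters through h and
    z2 controls the relative (percentage) variability of h."""
    x = rng.normal(size=n)
    h = 10.0 * np.exp(0.3 * x / z2)   # larger z2 -> smaller relative variation
    return z1 * h

def sn_ratio(y):
    # eta = 10 log10(mu^2 / sigma^2); it does not depend on the scaling factor z1.
    return 10.0 * np.log10(y.mean() ** 2 / y.var())

# Step (a): search over the non-scaling factor z2 for the largest S/N ratio,
# holding z1 at an arbitrary value (here 1.0).
candidates = [1.0, 2.0, 4.0, 8.0]
best_z2 = max(candidates, key=lambda z2: sn_ratio(simulate(1.0, z2)))

# Step (b): use the scaling factor z1 to put the mean on target.
mean_at_unit_z1 = simulate(1.0, best_z2).mean()
z1_star = TARGET / mean_at_unit_z1

y = simulate(z1_star, best_z2)
print("chosen z2:", best_z2, " z1:", round(z1_star, 2),
      " mean:", round(y.mean(), 1), " S/N (dB):", round(sn_ratio(y), 2))
```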
Appendix C
STANDARD ORTHOGONAL ARRAYS AND LINEAR GRAPHS*

* The orthogonal arrays and linear graphs are reproduced with permission from Dr. Genichi Taguchi and with help from Mr. John Kennedy of American Supplier Institute, Inc. For more details of the orthogonal arrays and linear graphs, see Taguchi [T1] and Taguchi and Konishi [T5].

L4 (2^3) Orthogonal Array

Expt. No. | Columns 1 2 3
1 | 1 1 1
2 | 1 2 2
3 | 2 1 2
4 | 2 2 1

[Linear graph for L4: columns 1 and 2 shown as dots, joined by an edge representing column 3.]

L8 (2^7) Orthogonal Array

Expt. No. | Columns 1 2 3 4 5 6 7
1 | 1 1 1 1 1 1 1
2 | 1 1 1 2 2 2 2
3 | 1 2 2 1 1 2 2
4 | 1 2 2 2 2 1 1
5 | 2 1 2 1 2 1 2
6 | 2 1 2 2 1 2 1
7 | 2 2 1 1 2 2 1
8 | 2 2 1 2 1 1 2

[Interaction table for L8 and the two linear graphs for L8 not reproduced.]

L9 (3^4) Orthogonal Array

Expt. No. | Columns 1 2 3 4
1 | 1 1 1 1
2 | 1 2 2 2
3 | 1 3 3 3
4 | 2 1 2 3
5 | 2 2 3 1
6 | 2 3 1 2
7 | 3 1 3 2
8 | 3 2 1 3
9 | 3 3 2 1

[Linear graph for L9 not reproduced.]

L12 (2^11) Orthogonal Array [table not reproduced]
Note: The interaction between any two columns is confounded partially with the remaining nine columns. Do not use this array if the interactions must be estimated.

L16 (2^15) Orthogonal Array [table, interaction table, and linear graphs not reproduced]

L'16 (4^5) Orthogonal Array [table and linear graph not reproduced]
Note: To estimate the interaction between columns 1 and 2, all other columns must be kept empty.

L18 (2^1 x 3^7) Orthogonal Array [table and linear graph not reproduced]
Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. The interaction can be estimated from the 2-way table of columns 1 and 2. Columns 1 and 2 can be combined to form a 6-level column. Interactions between any other pair of columns are confounded partially with the remaining columns.

L25 (5^6) Orthogonal Array [table and linear graph not reproduced]
Note: To estimate the interaction between columns 1 and 2, all other columns must be kept empty.
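For the two-level arrays, the balancing property discussed in Appendix A can be checked directly. The sketch below builds an array equivalent to L8 (2^7) from three independent two-level columns and their element-wise products and verifies that every pair of columns is balanced. The -1/+1 coding and the column ordering are illustrative conventions and may differ from the tabulated array.

```python
import numpy as np
from itertools import product, combinations

# Three basic two-level columns (coded -1/+1) from the full 2^3 factorial.
rows = np.array(list(product([-1, 1], repeat=3)))
a, b, c = rows[:, 0], rows[:, 1], rows[:, 2]

# Seven columns: the basic columns plus their products (interaction columns).
columns = {1: a, 2: b, 3: a * b, 4: c, 5: a * c, 6: b * c, 7: a * b * c}

# Balancing property: every pair of columns contains each of the four
# level combinations (-1,-1), (-1,+1), (+1,-1), (+1,+1) exactly twice.
for i, j in combinations(columns, 2):
    pairs = list(zip(columns[i], columns[j]))
    counts = {combo: pairs.count(combo) for combo in set(pairs)}
    assert set(counts.values()) == {2}, (i, j)

print("all 21 column pairs are balanced")
```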
L27 (3^13) Orthogonal Array [table, interaction table, and linear graphs not reproduced]

L32 (2^31) Orthogonal Array [table, interaction table, and linear graphs not reproduced]

L'32 (2^1 x 4^9) Orthogonal Array [table and linear graph not reproduced]
Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. It can be estimated from the 2-way table of these columns. Columns 1 and 2 can be combined to form an 8-level column. Interactions between any two 4-level columns are confounded partially with each of the remaining 4-level columns.

L36 (2^11 x 3^12) Orthogonal Array [table not reproduced]
Note: Interaction between any two columns is partially confounded with the remaining columns.

L'36 (2^3 x 3^13) Orthogonal Array [table and linear graphs not reproduced]
Notes: (i) The interactions 1 x 4, 2 x 4, and 3 x 4 are orthogonal to all columns and hence can be obtained without sacrificing any column. (ii) The 3-factor interaction between columns 1, 2, and 4 can be obtained by keeping column 3 empty. Thus, a 12-level factor can be formed by combining columns 1, 2, 3, and 4, keeping only column 3 empty. (iii) Columns 5 through 16 in the array L'36 (2^3 x 3^13) are the same as the columns 12 through 23 in the array L36 (2^11 x 3^12).
L50 (2^1 x 5^11) Orthogonal Array [table and linear graphs not reproduced]
Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. It can be estimated from the 2-way table of these two columns. Columns 1 and 2 can be combined to form a 10-level column.

L54 (2^1 x 3^25) Orthogonal Array [table and linear graph not reproduced]
Notes: (i) Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. Also, these columns can be combined to form a 6-level column. (ii) The interactions 1 x 9, 2 x 9, and 1 x 2 x 9 appear comprehensively in columns 10, 11, 12, 13, and 14. Hence, the aforementioned interactions can be obtained by keeping columns 10 through 14 empty. Also, columns 1, 2, and 9 can be combined to form an 18-level column by keeping columns 10 through 14 empty.
L64 (2^63) Orthogonal Array [table not reproduced]
L'64 (4^21) Orthogonal Array [table not reproduced]

L81 (3^40) Orthogonal Array [table not reproduced]

REFERENCES

A1. Addelman, S. "Orthogonal Main Effect Plans for Asymmetrical Factorial Experiments." Technometrics (1962) vol. 4, pp. 21-46.
A2. Anderson, B. "Parameter Design Solution for Analog In-Circuit Testing Problems." Proceedings of IEEE International Communications Conference, Philadelphia, Pennsylvania (June 1988) pp. 0836-0840.
B1. Box, G. E. P. "Signal to Noise Ratios, Performance Criteria and Transformations." Technometrics (February 1988) vol. 30, no. 1, pp. 1-31.
B2. Box, G. E. P. and Draper, N. R. Evolutionary Operations: A Statistical Method for Process Improvement. New York: John Wiley and Sons, 1969.
B3. Box, G. E. P., Hunter, W. G., and Hunter, J. S. Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building. New York: John Wiley and Sons, 1978.
B4. Byrne, D. M. and Taguchi, S. "The Taguchi Approach to Parameter Design." ASQC Transactions of Annual Quality Congress, Anaheim, CA (May 1986).
C1. Clausing, D. P. "Taguchi Methods Integrated into the Improved Total Development." Proceedings of IEEE International Conference on Communications, Philadelphia, Pennsylvania (June 1988) pp. 0826-0832.
C2. Clausing, D. P. "Design for Latitude." Internal Memorandum, Xerox Corp., 1980.
C3. Cochran, W. G. and Cox, G. M. Experimental Design. New York: John Wiley and Sons, 1957.
C4. Cohen, L. "Quality Function Deployment and Application Perspective from Digital Equipment Corporation." National Productivity Review (Summer 1988) vol. 7, no. 3, pp. 197-208.
C5. Crosby, P. Quality Is Free. New York: McGraw-Hill Book Co., 1979.
D1. Daniel, C. Applications of Statistics to Industrial Experimentation. New York: John Wiley and Sons, 1976.
D2. Deming, W. E. Quality, Productivity, and Competitive Position. Cambridge: Massachusetts Institute of Technology, Center for Advanced Engineering Study, 1982.
D3. Diamond, W. J. Practical Experiment Design for Engineers and Scientists. Lifetime Learning Publications, 1981.
D4. Draper, N. and Smith, H. Applied Regression Analysis.
D5. Duncan, A. J. Quality Control and Industrial Statistics, 4th Edition. Homewood, Illinois: Richard D. Irwin, Inc., 1974.
F1. Feigenbaum, A. V. Total Quality Control, 3rd Edition. New York: McGraw-Hill Book Company, 1983.
G1. Garvin, D. A. "What Does Product Quality Really Mean?" Sloan Management Review (Fall 1984) pp. 25-43.
G2. Grant, E. L. Statistical Quality Control, 2nd Edition. New York: McGraw-Hill Book Co., 1952.
H1. Hauser, J. R. and Clausing, D. "The House of Quality." Harvard Business Review (May-June 1988) vol. 66, no. 3, pp. 63-73.
H2. Hicks, C. R. Fundamental Concepts in the Design of Experiments. New York: Holt, Rinehart and Winston, 1973.
H3. Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 3rd Edition. New York: Macmillan Publishing Company, 1970.
J1. Jessup, P. "The Value of Continuing Improvement." Proceedings of the IEEE International Communications Conference, ICC-85, Chicago, Illinois (June 1985).
J2. John, P. W. M. Statistical Design and Analysis of Experiments. New York: Macmillan Publishing Company, 1971.
J3. Juran, J. M. Quality Control Handbook. New York: McGraw-Hill Book Co., 1979.
K1. Kackar, R. N. "Off-line Quality Control, Parameter Design and the Taguchi Method." Journal of Quality Technology (Oct. 1985) vol. 17, no. 4, pp. 176-209.
K2. Kackar, R. N. "Taguchi's Quality Philosophy: Analysis and Commentary." Quality Progress (Dec. 1986) pp. 21-29.
K3. Katz, L. E. and Phadke, M. S. "Macro-quality with Micro-money." AT&T Bell Labs Record (Nov. 1985) pp. 22-28.
K4. Kempthorne, O. The Design and Analysis of Experiments. New York: Robert E. Krieger Publishing Co., 1979.
K5. Klingler, W. J. and Nazaret, W. A. "Tuning Computer Systems for Maximum Performance: A Statistical Approach." Computer Science and Statistics: Proceedings of the 18th Symposium on the Interface, Fort Collins, Colorado (March 1985) pp. 390-396.
L1. Lee, N. S., Phadke, M. S., and Keny, R. S. "An Expert System for Experimental Design: Automating the Design of Orthogonal Array Experiments." ASQC Transactions of Annual Quality Congress, Minneapolis, Minnesota (May 1987) pp. 270-277.
L2. Leon, R. V., Shoemaker, A. C., and Kackar, R. N. "Performance Measures Independent of Adjustments." Technometrics (August 1987) vol. 29, no. 3, pp. 253-265.
L3. Lin, K. M. and Kackar, R. N. "Wave Soldering Process Optimization by Orthogonal Array Design Method." Electronic Packaging and Production (Feb. 1985) pp. 108-115.
M1. Mitchell, J. P. "Reliability of Isolated Clearance Defects on Printed Circuit Boards." Proceedings of Printed Circuit World Convention IV, Tokyo, Japan (June 1987) pp. 50.1-50.16.
M2. Myers, R. H. Response Surface Methodology. Blacksburg, Virginia: R. H. Myers, Virginia Polytechnic Institute and State University, 1976.
N1. Nair, V. N. "Testing in Industrial Experiments with Ordered Categorical Data." Technometrics (November 1986) vol. 28, no. 4, pp. 283-291.
N2. Nair, V. N. and Pregibon, D. "A Data Analysis Strategy for Quality Engineering Experiments." AT&T Technical Journal (May/June 1986) vol. 65, no. 3, pp. 3-84.
P1. Pao, T. W., Phadke, M. S., and Sherrerd, C. S. "Computer Response Time Optimization Using Orthogonal Array Experiments." IEEE International Communications Conference, Chicago, IL (June 23-26, 1985) Conference Record, vol. 2, pp. 890-895.
P2. Phadke, M. S. "Quality Engineering Using Design of Experiments." Proceedings of the American Statistical Association, Section on Statistical Education (August 1982) Cincinnati, OH, pp. 11-20.
P3. Phadke, M. S. "Design Optimization Case Studies." AT&T Technical Journal (March/April 1986) vol. 65, no. 2, pp. 51-68.
P4. Phadke, M. S. and Dehnad, K. "Optimization of Product and Process Design for Quality and Cost." Quality and Reliability Engineering International (April-June 1988) vol. 4, no. 2, pp. 105-112.
P5. Phadke, M. S., Kackar, R. N., Speeney, D. V., and Grieco, M. J. "Off-Line Quality Control in Integrated Circuit Fabrication Using Experimental Design." The Bell System Technical Journal (May-June 1983) vol. 62, no. 5, pp. 1273-1309.
P6. Phadke, M. S., Swann, D. W., and Hill, D. A. "Design and Analysis of an Accelerated Life Test Using Orthogonal Arrays." Paper presented at the 1983 Annual Meeting of the American Statistical Association, Toronto, Canada.
P7. Phadke, M. S. and Taguchi, G. "Selection of Quality Characteristics and S/N Ratios for Robust Design." Conference Record, GLOBECOM 87 Meeting, IEEE Communications Society, Tokyo, Japan (November 1987) pp. 1002-1007.
P8. Plackett, R. L. and Burman, J. P. "The Design of Optimal Multifactorial Experiments." Biometrika, vol. 33, pp. 305-325.
P9. Proceedings of Supplier Symposia on Taguchi Methods, April 1984, November 1984, October 1985, October 1986, October 1987, and October 1988. American Supplier Institute, Inc., 6 Parklane Blvd., Suite 411, Dearborn, MI 48126.
R1. Raghavarao, D. Constructions and Combinatorial Problems in Design of Experiments. New York: John Wiley and Sons, 1971.
R2. Rao, C. R. "Factorial Experiments Derivable from Combinatorial Arrangements of Arrays." Journal of the Royal Statistical Society (1947) Series B, vol. 9, pp. 128-139.
R3. Rao, C. R. Linear Statistical Inference and Its Applications, 2nd Edition. New York: John Wiley and Sons, Inc., 1973.
S1. Scheffé, H. The Analysis of Variance. New York: John Wiley and Sons, Inc., 1959.
S2. Searle, S. R. Linear Models. New York: John Wiley and Sons, 1971.
S3. Seiden, E. "On the Problem of Construction of Orthogonal Arrays." Annals of Mathematical Statistics (1954) vol. 25, pp. 151-156.
S4. Seshadri, V. "Application of the Taguchi Method for Facsimile Performance Characterization on AT&T International Network." Proceedings of IEEE International Conference on Communications, Philadelphia, Pennsylvania (June 1988) pp. 0833-0835.
S5. Sullivan, L. P. "Reducing Variability: A New Approach to Quality." Quality Progress (July 1984) pp. 15-21.
S6. Sullivan, L. P. "Quality Function Deployment." Quality Progress (June 1986) pp. 39-50.
T1. Taguchi, G. Jikken Keikakuho, 3rd Edition. Tokyo, Japan: Maruzen, vol. 1 and 2, 1977 and 1978 (in Japanese). English translation: Taguchi, G. System of Experimental Design, edited by Don Clausing. New York: UNIPUB/Kraus International Publications, vol. 1 and 2, 1987.
T2. Taguchi, G. "Off-line and On-Line Quality Control System." International Conference on Quality Control, Tokyo, Japan, 1978.
T3. Taguchi, G. On-line Quality Control During Production. Tokyo, Japan: Japanese Standards Association, 1981. (Available from the American Supplier Institute, Inc., Dearborn, MI.)
T4. Taguchi, G. Introduction to Quality Engineering. Asian Productivity Organization, 1986. (Distributed by American Supplier Institute, Inc., Dearborn, MI.)
T5. Taguchi, G. and Konishi, S. Orthogonal Arrays and Linear Graphs. Dearborn, MI: ASI Press, 1987.
T6. Taguchi, G. and Phadke, M. S. "Quality Engineering through Design Optimization." Conference Record, GLOBECOM 84 Meeting, IEEE Communications Society, Atlanta, GA (November 1984) pp. 1106-1113.
T7. Taguchi, G. and Wu, Yu-In. Introduction to Off-Line Quality Control. Central Japan Quality Control Association, Meieki Nakamura-ku, Nagoya, Japan, 1979. (Available from American Supplier Institute, Inc., Dearborn, MI.)
T8. The Asahi, Japanese language newspaper, April 15, 1979. Reported by Genichi Taguchi during lectures at AT&T Bell Laboratories in 1980.
T9. Tomishima, A. "Tolerance Design by Performance Analysis Design: An Example of Temperature Control Device." Reliability Design Case Studies for New Product Development, edited by G. Taguchi, Japanese Standards Assoc., 1984, pp. 213-220 (in Japanese).
Y1. Yokoyama, Y. and Taguchi, G. Business Data Analysis: Experimental Regression Analysis. Tokyo: Maruzen, 1975.

INDEX

A
Accelerated life tests, 271
Accumulation analysis method, for determining optimum control factor settings, 122-128
Additive model, 42, 48-50; failure of, 90
Additivity, 133-134; role of orthogonal arrays in ensuring additivity, 146
Adjustment factor, 107
Advanced strategy, orthogonal array construction, 174, 182
Aluminum etching application, 8
American Supplier Institute, 9
Analysis of Means (ANOM), 46
Analysis of Variance (ANOVA), 3, 42, 51-59; ANOVA tables, interpretation of, 58-59; confidence intervals for factor effects, 58; degrees of freedom, 56; error variance, estimation of, 57; Fourier analysis, analogy with, 51-52; sum of squares, computation of, 53-55; variance ratios, 58
The Asahi (newspaper), 15
Asthmatic patient treatment, control factors, selection of, 144-145
Asymmetric loss function, 21
Automobiles, noise factors, 23
Average quality loss, 25-26, 29; reducing variance, methods of, 26

B
Balancing property, 44; proportional, 154, 279
Beginner strategy, orthogonal array construction, 171-172, 182
Blaine, Gary, 183
Branching design, linear graphs, 168-171; branching factor, 168
Breaking a line, linear graphs, 163

C
Central composite designs, Robust Design versus classical statistical experiment design, 179
Chemical vapor deposition (CVD) process, 41; matrix experiment, 42-44; balancing property, 44; finding frequency response function, 44; starting levels, 42
Classical statistical experiment design, compared to Robust Design, 174-175; data analysis differences, 180; experiment layout differences, 177-180; problem formulation differences, 175-177
Column merging method, 150, 168; router bit life improvement application, 261
Compound factor method, 150, 156-157
Compound noise factor, 72, 206-207
Computer-aided Robust Design, 183-211
Computer system parameter space mapping, 249
Computer systems, tuning of, 231-251; control factors/levels, 236-238; data analysis, 240-243; experimental procedure, 238-240; macro modeling, 234; matrix experiment, 238-240; micro modeling, 233-234; need for, 231; noise factors/testing conditions, 234-235; problem formulation, 232-234; quality characteristic, 235-236; related applications, 249; signal-to-noise ratio, 235-236; verification experiments, 240-246
Concept design, 33, 39
Confidence intervals, for factor effects, 58
Confirmation experiment, 60
Confounding, definition of, 157
Construction of orthogonal arrays. See Orthogonal arrays
Continuous-continuous (C-C) problems, signal-to-noise ratios for, 114-116
Continuous-digital (C-D) problems, signal-to-noise ratios for, 116
Continuous variables, as quality characteristics, 135
Control factors, 27, 31, 39; adjustment factor, 107; differential operational amplifier (op-amp), 194; dynamic systems, 216-218 (design, 222-223); ignoring interactions among, 176-177; polysilicon deposition process, 74-76; selecting levels of, 75; selection of, 144-146 (asthmatic patient treatment, 144-145; computer system tuning, 236-238; photolithography process, 145-146)
Control orthogonal array, 195; dynamic systems design, 222-224
Cost, elements of, 4, 11; manufacturing cost, 5, 11; operating cost, 5, 11; R&D cost, 5, 11; of Robust Design followed by tolerance design, 205
Covercoat process, branching design and, 168-170
Curve/vector response, quality characteristic, 113
Customer usage, as life cycle stage, 38

D
Data analysis: computer system tuning, 240-248; differential operational amplifier (op-amp), 198; dynamic systems design, 223-227; polysilicon deposition process, 80; reliability improvement, 265-271
Data analysis differences, Robust Design versus classical statistical experiment design, 180; significance tests, 180; two-step optimization, 180
Data analysis plan, 76-79, 80-90
Degrees of freedom, 56; counting of, 150-151
Designed experiments, 42. See also Matrix experiments
Design optimization, 194-202, 222-227; differential operational amplifier (op-amp), 194-202 (control factors/levels, 194-195; control orthogonal array, 195;
data analysis, 198; multiple quality characteristics; optimum settings, 198-202; simulation algorithm, 196); dynamic systems, 222-227 (control factors/levels, 222-223; control orthogonal array, 223; data analysis/optimum condition, 223-227)
Design parameters, estimated effects of, 6
Deterioration, noise factors, 23-24
Developing process, photographs, noise factors, 24
Differential operational amplifier (op-amp), 9; analysis of nonlinearity, 207; compound noise factor, 206-208; control factors, 194; control orthogonal array, 195; design optimization, 194-202; deviations in circuit parameter values, 186-188 (correlation, 186; shape of distribution, 186); noise factors, 186-188; offset voltage, 184; orthogonal array based simulation, 190-193; quality characteristic, 194; selecting major noise factors, 206; signal-to-noise ratio, 194, 208-209; simulation effort, reduction of, 205-207; tolerance design, 202-205
Digital-continuous (D-C) problems, signal-to-noise ratios for, 116-117
Digital-digital (D-D) problems, signal-to-noise ratios for, 117-121
Dots, linear graphs, 159
Dummy level technique, 150, 154-155; computer system tuning application, 238
Dynamic problems, 32, 114-121; Robust Design, 32; signal-to-noise ratios for, 114-121 (continuous-continuous (C-C) problems, 114-116; continuous-digital (C-D) problems, 116; digital-continuous (D-C) problems, 116-117)
