Statistical Engineering. An Algorithm For Reducing Variation in Manufacturing Processes. Jock MacKay, Stefan Steiner. ASQ (2005)
H1212CH00 FM.qxd 3/31/05 10:29 AM Page ii
TS155.S773 2005
658.5—dc22
2004030676
No part of this book may be reproduced in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written permission of the publisher.
Publisher: William A. Tony
Acquisitions Editor: Annemieke Hytinen
Project Editor: Paul O’Mara
Production Administrator: Randall Benson
ASQ Mission: The American Society for Quality advances individual, organizational, and
community excellence worldwide through learning, quality improvement, and knowledge
exchange.
Attention Bookstores, Wholesalers, Schools, and Corporations: ASQ Quality Press books,
videotapes, audiotapes, and software are available at quantity discounts with bulk purchases for
business, educational, or instructional use. For information, please contact ASQ Quality Press at
800-248-1946, or write to ASQ Quality Press, P.O. Box 3005, Milwaukee, WI 53201-3005.
To place orders or to request a free copy of the ASQ Quality Press Publications Catalog, including
ASQ membership information, call 800-248-1946. Visit our Web site at www.asq.org or
http://qualitypress.asq.org.
To Samm—R.J.M.
Contents
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi
Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Truck Pull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Engine Block Leaks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Camshaft Lobe Runout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Sand Core Strength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Crankshaft Main Diameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.6 Paint Film Build . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.7 Refrigerator Frost Buildup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–41
Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–41
Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–43
Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–43
Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–44
Chapter 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–45
Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–48
Chapter 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–49
Chapter 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–53
Chapter 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–56
Chapter 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–57
Chapter 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–59
Chapter 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–61
Chapter 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–63
Chapter 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–66
Chapter 18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–67
Chapter 19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–68
Chapter 21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–75
SUPPLEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CD–177
Table 13.1 Valve lifter clearance experiment plan and results. . . . . . . . . . . . . . . . . . . . . 195
Figure 13.2 Oil consumption by lifter clearance level. . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Figure 13.3 Crossbar dimension verification experiment results. . . . . . . . . . . . . . . . . . . . 197
Table 13.2 A two-level factorial experiment with three suspects and eight treatments. . . . . . . . . . . . . 201
Table 13.3 Suspects and levels for brake rotor verification experiment. . . . . . . . . . . . . . 201
Table 13.4 Brake rotor verification experiment results. . . . . . . . . . . . . . . . . . . . . . . . . . 202
Figure 13.4 Weight by treatment for the brake rotor verification experiment. . . . . . . . . . 202
Figure 13.5 Pareto chart of the effects for brake rotor verification experiment. . . . . . . . . 203
Table 13.5 Balance weight averages by thickness variation and tooling. . . . . . . . . . . . . 204
Figure 13.6 Interaction between tooling and core thickness variation. . . . . . . . . . . . . . . . 204
Figure 13.7 Window leaks verification experiment results. . . . . . . . . . . . . . . . . . . . . . . . 206
Table 13.6 Designed experiments terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Figure 14.1 Right caster daily averages by alignment machine. . . . . . . . . . . . . . . . . . . . . 215
Figure 14.2 Plot of primary seal fit by plastisol amount. . . . . . . . . . . . . . . . . . . . . . . . . . 216
Figure 14.3 Scatter plot of crossbar dimension versus barrel temperature. . . . . . . . . . . . 218
Figure 14.4 Scatter plot of flushness versus left front pad height. . . . . . . . . . . . . . . . . . . 220
Figure 14.5 Dot plot of seal strength for leakers and nonleakers. . . . . . . . . . . . . . . . . . . . 221
Figure 14.6 Plot of final base circle BC runout by heat treatment spindle. . . . . . . . . . . . 224
Figure 14.7 Flowchart to help decide how to proceed after finding a dominant cause. . . . . . . . . . . . . 226
Figure 15.1 Changing the process center. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Figure 15.2 Battery seal group comparison results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Table 15.1 Heat seal experiment candidates and levels. . . . . . . . . . . . . . . . . . . . . . . . . . 229
Table 15.2 Treatments and seal strength for battery seal experiment. . . . . . . . . . . . . . . . 229
Figure 15.3 Seal strength by treatment combination. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Figure 15.4 Pareto chart of effects for battery seal experiment. . . . . . . . . . . . . . . . . . . . . 230
Figure 15.5 Main effect plot for melt temperature. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Figure 15.6 Box plots of dip bump score by core wash solution. . . . . . . . . . . . . . . . . . . . 232
Table 15.3 Boss shrink defect scores. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Table 15.4 Experimental design and results for differential carrier experiment. . . . . . . 234
Figure 15.7 Pareto chart of the effects for piston shrink defect experiment. . . . . . . . . . . 236
Figure 15.8 Main effects of significant inputs for shrink defect experiment. . . . . . . . . . . 237
Figure 15.9 The effect of aligning substream centers on output variation. . . . . . . . . . . . . 239
Figure 16.1 Original (left) and new (right) relationship between the dominant cause and output characteristic. . . . . . 241
Table 16.1 Treatments for the engine block porosity desensitization experiment. . . . . . 243
Figure 16.2 Porosity by treatment for high and low pouring temperatures. . . . . . . . . . . . 243
Figure 16.3 Scrap rate by lubricant score. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Table 16.2 Candidate levels and scrap rates for low and high level of lubricant amount. . . . . . . . . . . . 245
Figure 16.4 Scrap rate for low and high level of lubricant amount versus treatment. . . . 245
Figure 16.5 Interaction plots for oil pan scrap desensitization experiment. . . . . . . . . . . . 246
Figure 16.6 Interaction plots of die temperature and binder force by lubricant amount (supplier C only). . . . . . 247
Table 16.3 Eight-run fractional factorial refrigerator desensitization experiment. . . . . . 248
Table 16.4 Plan and data for refrigerator frost buildup desensitization experiment. . . . . 249
Figure 16.7 Temperature on cooling plate by treatment. . . . . . . . . . . . . . . . . . . . . . . . . . 250
Figure 16.8 Interaction plots of candidates and the environment cause output in cooling plate temperature. . . . . . 250
Figure 16.9 Box plots of eddy current hardness measurements by day. . . . . . . . . . . . . . . 251
Figure 16.10 Scatter plot of eddy current versus Brinell hardness measurements. . . . . . . 252
Table 16.5 Candidate levels for eddy current measurement experiment. . . . . . . . . . . . . 253
Figure 16.11 Plot of eddy current versus Brinell hardness for freq. = 200, temp. = 35, and gain = 30. . . . . . 254
Figure 17.1 Feedforward control schematic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Table 17.1 Potato chip spots data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
Figure 17.2 Scatter plot of dark spot score versus sugar concentration (%). . . . . . . . . . . 261
Figure 17.3 Baseline histogram of imbalance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Figure 17.4 Vectoring to reduce imbalance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Figure 17.5 Histogram of simulated imbalances for vectored assemblies. . . . . . . . . . . . . 263
Table 17.2 Results of simulating selective fitting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Figure 18.1 Feedback control schematic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Figure 18.2 Box plot of tightness by front and axle cable batch. . . . . . . . . . . . . . . . . . . . 270
Table 18.1 Parking brake tightness adjuster experiment results. . . . . . . . . . . . . . . . . . . . 271
Figure 18.3 Diameter at stream 1 at Operation 270 by minute. . . . . . . . . . . . . . . . . . . . . 272
Figure 18.4 Diameter at stream 1 of Operation 270 by minute. . . . . . . . . . . . . . . . . . . . . 273
Figure 18.5 Piston diameters versus time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Figure 18.6 Run chart of flow rate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Figure 18.7 EWMA smoothing of paint flow rate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Figure 18.8 Right caster angle over time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Figure 18.9 Right caster average by shift. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Figure 18.10 A machining process with persistent shifts due to tooling. . . . . . . . . . . . . . . 280
Table 19.1 Candidates and levels for burn robustness experiment. . . . . . . . . . . . . . . . . . 286
Table 19.2 Experimental plan and data for burn robustness experiment. . . . . . . . . . . . . 287
Figure 19.1 Burn by treatment with added vertical jitter. . . . . . . . . . . . . . . . . . . . . . . . . . 287
Figure 19.2 Pareto plot of effects on average burn score. . . . . . . . . . . . . . . . . . . . . . . . . . 288
Figure 19.3 Main effect plot for back pressure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Table 19.3 Candidates and levels for iron silicon concentration robustness experiment. . . . . . . . . . . . 289
Table 19.4 Treatments and results for iron silicon concentration robustness experiment. . . . . . . . . . . . 290
Figure 19.4 Silicon concentration by treatment combination. . . . . . . . . . . . . . . . . . . . . . 291
Figure 19.5 Pareto chart of the effects on log(s) for iron silicon experiment. . . . . . . . . . 292
Figure 19.6 Main effects for iron silicon concentration robustness experiment. . . . . . . . 292
Table 19.5 Candidates selected for the pinskip robustness experiment. . . . . . . . . . . . . . 293
Table 19.6 Electroplating pinskip experimental plan and data. . . . . . . . . . . . . . . . . . . . . 294
Figure 19.7 Pareto analysis of effects in pinskips experiment. . . . . . . . . . . . . . . . . . . . . . 295
Figure 19.8 Cube plot for tank 1 concentration, tank 2 concentration, and tank 2 temperature. . . . . . . . . 295
Figure 19.9 Number of defective grills versus run order. . . . . . . . . . . . . . . . . . . . . . . . . . 296
Figure 20.1 Reducing variation by adding inspection limits. . . . . . . . . . . . . . . . . . . . . . . 301
Figure 21.1 Histogram of crossbar dimension in the validation investigation. . . . . . . . . . 309
Figure 21.2 Porosity scrap rate by month. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Acknowledgments
This book is the product of many years’ experience helping manufacturing organizations reduce variation. We want to thank all of our clients and colleagues at the Institute for Improvement in Quality and Productivity (IIQP). The IIQP is a not-for-profit
organization, started in 1985, committed to the development, communication, and applica-
tion of methods for quality and productivity improvement.
Former and current IIQP colleagues include Bovas Abraham, Dennis Beecroft, Jerry
Lawless, Jack Robinson, Jim Whitney, and Clif Young, among others. The numerous cor-
porate members of the institute over the years included AT Plastics, Bell Canada, BF
Goodrich Canada, BUDD Canada, Campbell Soup Company, Canada Post Corporation, C-
I-L, Continental Can Canada, Dofasco, Epton Industries, GSW Water Products Company,
General Motors Canada, Lear Seating Canada, Lear Seigler Industries, Metal Koting Con-
tinuous Colour Coat, Nortel Networks, Quebec and Ontario Paper Co., Research in Motion
(RIM), Stelco Steel, The Woodbridge Group of Companies, Uniroyal Goodrich Canada,
Wescast Industries, and Xerox Canada. Thanks also to other companies we have worked
with, including Fisher and Paykel, Focus Automation Systems, Ford Motor Company,
Imperial Oil of Canada, Seagram Americas, Stackpole, Toyota Motor Manufacturing
Canada, and many others. We learned from you all.
We have also benefited from helping to organize and attending the Continuous Improve-
ment Symposiums held by General Motors of Canada each year since 1987. The symposiums
provide an opportunity for employees and suppliers to showcase process improvement suc-
cess stories. Some of the examples in the book come from these symposiums.
We want to single out Mike Brajac, Pete Peters, and Mark Smith at General Motors of
Canada, with whom we have had a long-standing relationship. You are the best problem
solvers we know. We also thank Wescast Industries and especially Harry Schiestel, who
taught us a lot about foundries and how to solve problems.
We have had outstanding relationships with our colleagues at the University of Water-
loo who have contributed, perhaps unknowingly, in many ways to the book. We especially
thank Winston Cherry and Wayne Oldford for their penetrating questions and stimulating
discussions, and Dennis Beecroft, who taught us there is more to variation reduction than
statistics.
Also, our appreciation goes to the undergraduate and graduate engineering and statis-
tics students taking the inaugural offering of the University of Waterloo course STAT
435/835, Statistical Methods for Process Improvement, in the Winter 2004 term. The stu-
dents used a draft of the book and provided many helpful comments and suggestions to
improve the presentation.
In the early stages of writing this book, we were fortunate to be on sabbatical. During
fall 2003, Jock received support from CSIRO in Melbourne, Australia, with thanks to
Richard Jarrett and Geoff Robinson. Stefan spent a year starting in May 2002 at the Uni-
versity of Auckland, New Zealand, with thanks to Chris Wild.
We were guided through production by the staff of ASQ Quality Press and Kinetic
Publishing Services. Special thanks to Paul O’Mara, Laura Varela, and John Ferguson.
MINITAB is a trademark of Minitab in the United States and other countries and is
used herein with the owner’s permission. Microsoft Word and Microsoft Excel are trade-
marks of Microsoft.
Last, but certainly not least, we would like to acknowledge the wonderful support from
our families over the long road required for writing this book.
Preface
Reducing the variation in process outputs is a key part of process improvement. If you
have picked up this book, you probably do not need to be convinced of the truth of
this statement. For mass-produced components and assemblies, reducing variation
can simultaneously reduce overall cost, improve function, and increase customer satisfac-
tion with the product. Excess variation can have dire consequences, leading to scrap and
rework, the need for added inspection, customer returns, impairment of function, and a
reduction in reliability and durability.
We have structured the book around an algorithm for reducing process variation that we
call Statistical Engineering. The algorithm is designed to solve chronic problems on exist-
ing high- to medium-volume manufacturing and assembly processes. The algorithm will not
apply to urgent, short-term sporadic problems such as what to do when a defective item is
found. Instead, we look at the problem of reducing the frequency of such defective items.
The fundamental basis for the algorithm is the belief that we will discover cost-effective
changes to the process that will reduce variation if we increase our knowledge of how and
why a process behaves as it does. A key way to increase process knowledge is to learn
empirically—that is, to learn by observation and experimentation. We discuss in detail a
framework for planning and analyzing empirical investigations, known by its acronym
QPDAC (Question, Plan, Data, Analysis, Conclusion). We use the QPDAC framework at
many stages of the Statistical Engineering algorithm to help plan, analyze, and interpret the
results of appropriate investigations.
Using the algorithm, you are guided through a series of empirical investigations to a cost-
effective solution to the problem. The purpose and plan for each investigation depends on:
• The stage of the algorithm
• The accrued knowledge from earlier investigations
• Other engineering and process knowledge
We classify all effective ways to reduce variation into seven approaches. A unique
aspect of the algorithm forces early consideration of the feasibility of each of the
approaches. Selecting a working approach helps generate effective and efficient solutions.
The choice of approach affects the process knowledge required and hence how we proceed.
Some of the variation reduction approaches (but not all) require knowledge of a domi-
nant cause of variation. We present a low-cost strategy for finding a dominant cause based
on families of causes and the method of elimination. The method of elimination uses a
series of simple investigations, each of which is designed to eliminate a large number of
possibilities from those remaining.
We illustrate all aspects of the algorithm with many examples adapted from our expe-
rience. In some cases, we have disguised the data to protect confidentiality; in others, we
have taken some liberties with what actually happened to make a point.
Throughout the book, we use the statistical software package MINITAB for all calcu-
lations and most displays. To apply the Statistical Engineering algorithm, the user requires
a software package capable of making basic plots and finding simple numerical summaries.
By referring to the procedures in the software, we avoid algebraic expressions for the most
part. We describe the calculations with words and let the software deal with the numerical
implementation.
We avoid formal or complicated statistical analysis procedures. Whenever possible, we
use graphical displays to guide us to the correct interpretation of the data. We assume that
the reader has been exposed to basic statistical concepts and tools, such as standard devia-
tions, averages, histograms, run charts, box plots, scatter plots, process maps, and flow-
charts. We provide detailed explanations of more sophisticated analysis tools as needed,
including multivari charts, analysis of variance (ANOVA), regression, and designed exper-
iments. We include appendices to explain how to use MINITAB to produce the analysis for
all of the methods discussed.
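For readers working outside MINITAB, the same basic numerical summaries can be produced in almost any environment. The sketch below uses Python with made-up data; it is an illustration only, not a procedure from the book, which performs all calculations in MINITAB:

```python
import statistics

# Made-up output measurements; the book carries out equivalent steps
# in MINITAB.
y = [7.1, 6.8, 7.4, 7.0, 6.9, 7.3, 7.2, 6.7, 7.5, 7.0]

# Simple numerical summaries: average and standard deviation.
print(round(statistics.mean(y), 2), round(statistics.stdev(y), 2))

# A crude text histogram; in practice a plotting package would be used.
lo, hi, bins = min(y), max(y), 4
width = (hi - lo) / bins
for b in range(bins):
    left = lo + b * width
    in_bin = sum(left <= v < left + width or (b == bins - 1 and v == hi)
                 for v in y)
    print(f"{left:5.2f}-{left + width:5.2f} {'*' * in_bin}")
```

Any package offering averages, standard deviations, and basic plots suffices for the methods in the book.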
Important issues surround the management of process improvement projects. These are
the same issues that arise in any project management exercise. Priorities must be set, plans
and schedules made, resources provided, and so on. We do not deal with these issues in
detail; rather, we focus on the algorithm, the variation reduction approaches, and the tools
required to achieve variation reduction.
The Statistical Engineering algorithm is not meant to replace global improvement systems
such as Six Sigma. It is focused on and designed for process improvement in high- to medium-
volume manufacturing processes. We suggest that the algorithm, strategies, and methods be
incorporated into a general improvement system and used where appropriate.
TARGET AUDIENCE
The primary audience of this book is people who are involved in the improvement of man-
ufacturing processes. They include:
• Process engineers with responsibility for reducing variation, decreasing costs,
improving quality, and so on
• Six Sigma Green Belts, Black Belts, and Master Black Belts
• Trainers in process improvement methods
• Academics and students interested in quality and productivity improvement
• Teachers and students of courses in engineering statistics
While reading the text, we suggest you start an improvement project, or at least think about
your own process problems. The more analogies you can draw between our examples and
your processes, the better you will understand the material.
We advise the reader to try the exercises and explore the data sets to help gain confi-
dence in the use of the approaches and methods.
1 Introduction
This book presents a systematic algorithm for reducing variation. The algorithm is tailored to high- to medium-volume manufacturing processes where it is feasible to
measure the values of selected process inputs and outputs.
We use the word variation to mean both the deviation of the output from a target value
and the changing value of the output from part to part. For example, in a machining process
that produces V6 pistons, the target value for the diameter is 101.591 millimeters. The
measured diameter in millimeters of three successive pistons is
We can see variation in both senses since none of the pistons has the target diameter
and all have different diameters. We will formulate problems by defining appropriate per-
formance measures that capture the nature of the variation that we want to reduce. Exces-
sive variation leads to poor performance, low customer satisfaction, scrap and rework,
complex downstream control plans, and so on. If we can resolve such problems, we can
reduce costs and improve quality and performance.
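The two senses of variation can be sketched numerically. In the Python fragment below (an illustration only; the book uses MINITAB, and the three diameters are hypothetical stand-ins, since the measured values are not reproduced here), the first list captures deviation from target and the second number captures part-to-part change:

```python
import statistics

# The target diameter is 101.591 mm, as stated in the text; the three
# diameters below are hypothetical stand-ins for the measured values.
target = 101.591
diameters = [101.598, 101.583, 101.602]

# Variation in the first sense: deviation of each output from target.
deviations = [d - target for d in diameters]

# Variation in the second sense: part-to-part spread.
spread = statistics.stdev(diameters)

print(deviations)       # none is exactly zero
print(round(spread, 4))
```

A performance measure for a variation reduction problem typically combines both senses, for example the root mean squared deviation from target.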
The fundamental basis for the algorithm is our belief that by increasing knowledge of
how and why a process behaves as it does, we will discover cost-effective changes to the
process that will reduce variation. One way to increase process knowledge is to learn
empirically, that is, to learn by observation and experimentation. Statistics is the discipline
that teaches us how to learn empirically. Statistics provides the answers to questions such
as “How should we plan our process investigation?” and “How do we interpret the data that
we have?” The algorithm we propose relies heavily on statistical methods and tools com-
bined with existing engineering knowledge and theory. Using the algorithm, we will plan
and carry out one or more investigations to learn about process behavior.
In most cases, to be cost-effective, the proposed changes involve better process control
or alterations to process settings rather than fundamental design changes or replacing
process equipment. We have classified these low-cost changes into a set of generic varia-
tion reduction approaches and have structured the algorithm to force early consideration of
an approach.
The specific objectives of the book are to help you to:
1.1 Truck Pull
Front wheel alignment on light trucks is a set of characteristics that affect the handling of a
vehicle and the life of its tires. One component of the alignment is called pull. Pull is an
important characteristic because it indicates how well the truck will track on a standard high-
way. A driver can feel pull—a value close to target will produce a more drivable vehicle.
Pull is a torque, measured in Newton-meters, and is a function of right and left front
wheel camber and caster angles. For the vehicles discussed here, the relation between pull
and the alignment angles is
[Figure 1.1: histogram of pull for the current process (percent versus pull, –0.2 to 0.7 Newton-meters).]
The dashed vertical lines on Figure 1.1 show the specification limits for pull, set at
–0.12 and 0.58 Newton-meters. The target value is 0.23 Newton-meters. We see that almost every
truck has pull within the specification limits (in fact, there are a few pull values outside
these limits that are not visible in the histogram because of the large number of data points).
Any truck with pull that does not meet the specifications is repaired and remeasured before
shipment.
The goal of the project was to reduce the variation in pull around the target so that the
histogram would look like that in Figure 1.2. If this goal can be achieved, the process will
produce a greater proportion of trucks with pull close to the target value; hence, there will
be greater overall customer satisfaction. As well, the proportion of trucks needing rework
will be smaller, thus reducing cost.
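A performance measure for this goal can be sketched as follows. The Python fragment below uses simulated pull values (the real truck data are not reproduced here, and the book itself uses MINITAB); it computes the proportion of trucks within the specification limits and the spread of pull about the target:

```python
import random
import statistics

random.seed(1)

# Simulated pull values in Newton-meters (illustrative only).
# The process is centered on the target, as in the text.
target, lsl, usl = 0.23, -0.12, 0.58
pull = [random.gauss(target, 0.10) for _ in range(1000)]

# Proportion of trucks within the specification limits.
within = sum(lsl <= p <= usl for p in pull) / len(pull)

# A performance measure capturing variation about the target:
# the root mean squared deviation from 0.23 Nm.
rmsd = statistics.mean((p - target) ** 2 for p in pull) ** 0.5

print(f"within specs: {within:.1%}, spread about target: {rmsd:.3f}")
```

Reducing the spread about the target increases the proportion of trucks near 0.23 Newton-meters, which is the stated project goal.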
This is a problem in reducing variation in pull from truck to truck. The process is cur-
rently centered on the target. As shown by Figure 1.1, adjusting the process center to
increase (or decrease) pull on every truck will make the process worse, because then more
trucks will have pull outside the specification limits and fewer trucks will have pull near the
target value.
We have found that excessive variation from part to part is a widespread problem. For
example, many measurement systems exhibit substantive variation if the same part is
repeatedly measured, especially over a long period. This variation leads to poor process
control, disputes between customers and suppliers, the scrapping of good parts, and so on.
Often we understand enough about a process to have an easy method to keep the center of
the process near the target value. We rarely have such an adjustment to reduce the variation
from one part to the next.
[Figure: percent scrap due to leakers by month (months 1–6).]
at the beginning of the project is likely to be low, so that investigations require large sample
sizes to find a few defectives. However, we can sometimes translate a problem defined in
terms of a binary output into one with a continuous output. For example, the cause of some
of the leaks was the underlying wall thickness, a continuous characteristic. If an internal
wall gets too thin, then a leak will occur. For these leaks, we can translate reducing the pro-
portion of leakers into reducing variation in wall thickness about the target value. We
expect to be able to learn about wall thickness variation more easily. When possible, we
will translate a problem with a binary output into one with a continuous output. This is not
always possible. For example, leaks due to sand inclusions cannot (easily) be directly asso-
ciated with an underlying continuous characteristic, and we are forced to deal with the
binary output and large sample sizes.
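The sample-size contrast behind this advice can be shown with simple arithmetic; the 1% leak rate below is illustrative, not a figure from the book:

```python
# Illustrative arithmetic. With a binary output (leaks / does not
# leak) at a low baseline rate, an investigation must inspect many
# castings to observe even a handful of defectives.
p = 0.01        # hypothetical baseline proportion of leakers
needed = 10     # number of defectives we would like to observe
n_binary = round(needed / p)
print(n_binary)  # on the order of 1000 castings

# With the translated continuous output (wall thickness), every part
# measured contributes information, so a much smaller sample can
# estimate the variation directly.
```

This is why translating a binary output into a continuous one, when possible, makes the subsequent investigations far cheaper.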
Figure 1.6 Histogram of camshaft lobe BC runout (dashed line gives the specification limit).
project to improve the process. Here the goal was to change the process so that the right
tail of the histogram would be shifted to the left and a greater proportion of lobes would
have runout close to zero. Because runout must be positive, adjusting the process to lower
runout on all lobes was not feasible. In this context, we take variation reduction to mean
that we will make a higher proportion of lobes with runout close to the target value of 0
microns.
There are many other process characteristics with a one-sided specification and a phys-
ical lower bound. Examples include flatness, porosity, taper, and so on. We can improve the
process performance defined in terms of such characteristics by increasing the concentra-
tion of values near the target value. That is, we reduce the variation in the process output
about the target value.
[Figure: histograms of current (left) and desired (right) core strength distributions]
There was no obvious target for core strength. The specification for the core was that it
not be broken. The team planned to increase core strength by increasing the concentration
of resin in the core-making process. At first glance, this is an example of a problem where
we can improve a process by making a one-time adjustment that increases or decreases the
output characteristic on all parts, that is, by changing the process center. Later, the team
discovered that overly strong cores caused casting defects. They changed the goal to elim-
inate cores with low strengths. In other words, the new goal was to reduce variation in
strength from core to core.
It is often easy to find a low-cost solution to shift the process center. The challenge is
to avoid side effects—in which we replace one problem by another, as in the sand core strength
example.
Figure 1.8 Histogram of crankshaft diameters for the front position of the first main.
The process was off target, and it seemed a simple matter to adjust the upstream
grinders to reduce the average diameter closer to zero. Again, there were negative conse-
quences. The process operators knew that crankshafts with oversized journals could be
reworked but those with undersized journals were scrapped. They had deliberately centered
the process above the target to avoid costly scrap. Here the goal is first to reduce piece-to-
piece variation and shrink the width of the histogram. Then they could adjust the center of
the process to the target without the risk of scrapped parts.
[Figure: histogram of film build]
However, there was a heavy price to pay for this supposed solution due to poorer overall
appearance of the painted surface. Using the algorithm given in this book, the engineers
found a solution that allowed film build to be increased without the appearance defect
occurring.
PART I
Setting the Stage
In this first part of the book, we explore the meaning and consequence of process variation,
provide tools for quantifying variation in an understandable way, and discuss various
sources of variation. We also present an algorithm for reducing variation, centered
around seven variation reduction approaches. We explain the seven variation reduction
approaches and give examples. Finally, as the key to improvement is acquiring new process
knowledge, we discuss the QPDAC (Question, Plan, Data, Analysis, Conclusion) frame-
work, used to plan and analyze process investigations.
2
Describing Processes
This book is about how to reduce variation in process output. Increased costs and
decreased customer satisfaction are two of the negative consequences of excess vari-
ation. In this chapter, we provide some common language and tools to describe
processes and variation.
13
Sand molds are formed to determine the external shape of the manifold. Cores are molded
to create internal space in the casting. A core is placed in the mold and the molten iron is
poured. After cooling, the sand is shaken out, and the result is a rough casting. The casting
is finished by machining various surfaces and drilling holes. Throughout, operators make
measurements and process adjustments. They also inspect the castings for defects at sev-
eral points.
A process can be divided into subprocesses and is almost always a part of a larger
process. In the casting example, the melting of the iron, the creation of the mold, and the
core-making are all examples of subprocesses. The manufacturing process for the manifold
sits inside a system that includes the design process for both the part and the manufactur-
ing process; the sales, order, and billing processes; the delivery process; and so on.
A process map or flowchart is a good tool to describe a process, especially if we choose
an appropriate level of detail. For example, we can represent the major subprocesses or
operations within the casting process by the simple flowchart in Figure 2.2.
This flowchart shows the major subprocesses and the order in which they occur. We
can describe the subprocesses in much finer detail if we choose. The flowchart clarifies
which operations in the process are parallel and which are sequential. The chart also
shows the boundaries that we have selected for the process. We could have chosen to
include more upstream or downstream subprocesses as part of the manufacturing process.
The selection of process boundaries, like the choice of the level of detail, is driven by a
tradeoff between presenting facts and providing information. Too much detail can obscure
the information in the chart. We take these issues into consideration when designing a useful
flowchart. See Harrington (1987) for a detailed discussion of chart selection, construction,
and use.
One important feature of a process is that it is repeatable; each time it operates, the
process produces a unit. In the example, each finished casting is a unit. For a manufactur-
ing process, we can think of a unit as a part. For a measurement process, a unit is the act of
making a measurement.
We use the word characteristic to describe a feature or quality of a unit. For the cast-
ing process, characteristics of a manifold include its hardness, its dimensions, the pouring
temperature of the iron, the properties of the sand in the cores used for that casting, and so
on. The customers are the people or organizations that use the process units. Here, the cus-
tomers include the assemblers of the engine and the ultimate users of the vehicles that
incorporate the engines. Output characteristics, or more briefly outputs, are characteristics
of interest to customers. The properties of the casting that describe its performance in the
engine and ease of assembly in the engine plant are outputs. We call characteristics of the
casting such as the pouring temperature of the iron and the properties of the sand cores
input characteristics, or inputs for short. We also classify characteristics in terms of the val-
ues they can take.
The suppliers are the people or organizations that control the subprocesses that deter-
mine many of the inputs. In the casting process, some suppliers are the providers of the
sand for the cores and other raw materials, the equipment manufacturers, and so on. To
complete the picture, we call the people involved in the process its owners. In the example,
the owners are the process operators and managers. The owners are interested in the cost
associated with each unit as well as many other input and output characteristics.
We classify input characteristics as fixed or varying. An input is fixed if it changes
only when we deliberately make it change. For example, the target value for the pouring
temperature of the iron is a fixed input. The instruction to the operator to sample five
parts every hour, measure the hole locations, and make an appropriate adjustment is a
fixed input. The fixed inputs may be changed by the process owners. An input is varying
if its value changes from unit to unit or time to time without deliberate intervention. For
example, the dimensions of the cores change from casting to casting. Other varying
inputs, such as pouring temperature and raw material characteristics, change more slowly
over time.
Note that the target value or set point for pouring temperature is a fixed input, but the
actual pouring temperature is a varying input. We need to keep this idea clear because
sometimes inputs that we consider fixed actually vary. For example, two operators may
interpret the same set of instructions in different ways. The instructions are a fixed input, but
the implementation of the instructions is a varying input.
In Chapter 1, we saw that most problems involved reducing the unit-to-unit variation.
We define a cause with respect to this kind of variation.
A cause of variation in a process output is a varying input with the property that if all other
(varying) inputs were held constant, then the output changes when the input changes. Part of
this definition is conceptual since it is not possible to hold all other inputs constant for all units.
In the example, suppose the output is the hardness of the casting measured at a partic-
ular location. Hardness varies from casting to casting. The concentration of carbon in the
iron when it is poured into the mold is a cause of hardness variation. When this concentra-
tion changes, all other inputs being constant, the hardness of the casting will change. There
are many causes of hardness variation.
Now for our first controversial statement. A fixed input cannot be a cause of variation.
For example, the design of the product or process is not the cause of variation since the
design is a fixed input. Since we are interested in problems defined in terms of variation,
we will never say that the design is the cause of the problem. As you will see, we will
change one or more fixed inputs to solve the problem, but these are not the causes.
We use a simple mathematical model to describe a cause by specifying the values of all
varying inputs (again, this is only conceptually possible for a real process) for any one unit
produced by the process. We write the functional model

output = f (input1, input2, . . .)
With this model, input1, for example, is a cause if the output changes when the value
of input1 changes while all other inputs are held constant. Note that the function f ( )
depends implicitly on the values of the fixed inputs.
The model is useful because it helps us understand what is meant by the effect of a
cause. The effect of the cause (or varying input) is the change in the output produced by a
change in the input. The effect depends on the size of the change in the input, the initial
value of the input, and perhaps the values of the other inputs. A cause has a large effect if a
relatively small change in the input produces a relatively large change in the output. We
define a small change in the input and a large change in the output relative to the variation
we see in these characteristics under regular operation of the process. We simplify the lan-
guage if we call all varying inputs causes, even those with no effect.
For any process output, there are likely to be a large number of causes, each with an
effect. We assume that the Pareto principle1 (Juran et al., 1979) will apply and that only a
few causes will have large effects. We call these causes dominant. We base our strategies
and approaches to reducing variation on the assumption that there will be only one or two
dominant causes. We justify this focus on dominant causes more fully in Section 2.4.
We can model the effect of a dominant cause as

output = f (dominant cause) + noise
where f (dominant cause) is a function that captures the effect of the dominant cause and
the term noise captures the effect of all the other inputs. For a dominant cause, the range in
f (dominant cause) is greater than the output variation due to the noise.2
A dominant cause can be a single input or involve two or more inputs in a variety of
ways. Figure 2.3 shows a dominant cause for a binary output that is either good (G) or bad
(B). The horizontal line represents the normal range of values of the input. Small values of
the input to the left of the dotted line correspond to good output, large values to bad output.
Figure 2.3 A dominant cause with continuous input and binary output.
Figure 2.4 shows another example of a dominant cause. The left-hand plot in the figure
shows a continuous input and the right-hand plot a discrete input. In either instance, we see
the full range of variation for both the input and output on their respective axes.
If both the input and output are binary (or discrete with a few possible values), we can
depict a dominant cause using a table of percentages such as Table 2.1.
We hope to find a single dominant cause of variation such as shown in these examples. We
may fail for several reasons. First there may be no single dominant cause; instead, we may
find several causes, each with a relatively large effect. Second, we may find that the domi-
nant cause involves two (or more) inputs.3 Figure 2.5 shows two examples of a dominant
Figure 2.5 Two examples of a dominant cause involving two inputs with a binary output.
cause involving two continuous inputs, and a binary output where we denote good output as
G and bad output as B. In the example presented in the left panel, both inputs must be
simultaneously large in order to get bad output. In the right panel, good output results when
the two input values are both large or both small. The latter case may arise if the output is
a clearance between two assembled components whose critical dimensions are described
by inputs 1 and 2.
Causes can be classified in many ways.4 For us, the key issue is the size of the effects.
We want to find dominant causes that contribute substantially to the variation in the output.
[Figure 2.6: histogram of camshaft angle error]
The average is –21.3 and the standard deviation is 71.5, as given in the columns Mean and
StDev, respectively. See Appendix B for an interpretation of the other summary measures.
For the camshaft example, we can interpret the average and standard deviation using
the histogram of angle error given in Figure 2.6.
The average is the point on the horizontal axis where the histogram would balance if
we could cut it out of the page. In Figure 2.6, since the histogram is roughly symmetric
about zero, the average (the balance point) is close to 0°. The width of the histogram is
approximately six standard deviations when the histogram has a bell shape. Here, the width
of the histogram is 400 thousandths of a degree and 6 stdev = 429. For a bell-shaped his-
togram, almost all of the characteristic values will fall within the range
average ± 3 stdev
From this argument, we see that the standard deviation is a measure of the unit-to-unit
variation. The average is a measure of the process center. The distance from the average to
the target is a measure of how well the process is targeted and hence a measure of the off-
target variation. The average and standard deviation are sometimes combined in a capabil-
ity ratio,5 which can describe both kinds of variation simultaneously.
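As an illustration, the average, standard deviation, and the average ± 3 stdev interval can be computed directly. This sketch uses Python with made-up angle-error values, not the study data:

```python
import statistics

# Hypothetical angle errors (thousandths of a degree); illustrative only.
angle_errors = [-120, 35, -60, 80, -15, 140, -95, 10, -200, 55,
                -30, 70, -150, 25, 90, -45, 5, -110, 160, -75]

avg = statistics.mean(angle_errors)      # measure of the process center
stdev = statistics.stdev(angle_errors)   # measure of unit-to-unit variation

# For a bell-shaped histogram, almost all values fall in average +/- 3 stdev.
lower, upper = avg - 3 * stdev, avg + 3 * stdev
within = sum(lower <= y <= upper for y in angle_errors) / len(angle_errors)
print(f"average={avg:.1f}, stdev={stdev:.1f}")
print(f"interval=({lower:.1f}, {upper:.1f}), proportion within: {within:.2f}")
```

For this sample, every value falls inside the three-standard-deviation interval, matching the bell-shape interpretation given above.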
In the example, the target value for angle error is 0°. If the collected data represent the
long-term behavior of the process, we cannot reduce variation significantly by better cen-
tering the process, that is, by adjusting the average to the target. Here, we must reduce the
standard deviation to get a substantive reduction in the variation.6
Why is the standard deviation so large? Since the standard deviation measures the vari-
ation in angle error from lobe to lobe, there must be changes in varying inputs from lobe to
lobe in the sample that explain the angle error variation. To reduce the standard deviation,
we may first try to identify dominant causes; that is, varying inputs making major contri-
butions to the standard deviation. Then we can try to eliminate the effects of these causes.
We make this idea clear in the next section.
[Figure 2.7: histogram of BC runout]
In Chapter 1, we described a problem in which the goal was to reduce variation in base
circle (BC) runout, another characteristic of a camshaft. In that case, the target value was
zero and the histogram of runout values is shown in Figure 2.7.
The average and standard deviation (from MINITAB) are roughly 12.6
and 6.4 microns, respectively.
The histogram is not bell-shaped. The average ± 3 standard deviations is (–6.6, 31.8).
We see that some values are above 31.8 and none are close to –6.6. We no longer have the
interpretation that almost all of the values fall within three standard deviations of the aver-
age. However, we can still interpret the standard deviation as about 1/6 of the range since
(33.9 – 2.6) ÷ 6 = 5.2. Here, to reduce variation around the target, we need to shift the aver-
age to the left and reduce the standard deviation. If we identify the dominant cause, we may
find a low-cost way to reduce the variation.
Somewhat surprisingly, adjustment of the process center to the target can play a role in
reducing unit-to-unit variation. To see how this happens, consider again the angle error
data. Recall that there are 12 lobes on each camshaft with the positions numbered 1 to 12.
We can use a box plot (see Appendix C), as given in Figure 2.8, to compare the perform-
ance of angle error from lobe position to position.
The lobe averages range from roughly 45 (lobe 2) to –90 (lobe 12). If we adjusted the
process on each lobe separately so that the angle error average was on target for each lobe,
then the overall standard deviation would be reduced to 61.6 from 71.5. You can see this
result qualitatively by imagining all of the boxes in Figure 2.8 being shifted vertically to
Figure 2.8 Box plots of camshaft lobe angle errors by lobe position.
have a centerline on zero. Then the range of values shown by the whiskers on the box plots
and the overall standard deviation would be smaller.7
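The effect of centering each lobe position separately can be sketched numerically. The offsets and noise level below are hypothetical, chosen only to illustrate how per-position adjustment shrinks the overall standard deviation:

```python
import random
import statistics

random.seed(1)

# Hypothetical setup: each lobe position has its own offset (a position-to-
# position effect) plus common piece-to-piece noise; values are illustrative.
lobe_offsets = {1: 30, 2: 45, 6: -20, 12: -90}
data = {lobe: [off + random.gauss(0, 60) for _ in range(50)]
        for lobe, off in lobe_offsets.items()}

pooled = [y for values in data.values() for y in values]
sd_before = statistics.stdev(pooled)

# Adjust each lobe separately so that its average sits on the target (zero).
adjusted = [y - statistics.mean(values)
            for values in data.values() for y in values]
sd_after = statistics.stdev(adjusted)

print(f"overall sd before: {sd_before:.1f}, after per-lobe adjustment: {sd_after:.1f}")
```

Removing the position-to-position differences leaves only the common piece-to-piece noise, so the overall standard deviation drops, just as shifting the boxes in Figure 2.8 to a common centerline would.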
Data summaries such as the average, standard deviation, histogram, and box plots do
not show how the process output varies over time. To show this behavior, we use a run
chart, a simple plot of the output values against the order or time of collection. The run
chart for the angle-error data from lobe 12 is shown in Figure 2.9.
The run chart shows how the process output varies over time. We may see cycles and
smooth patterns on such a plot. The time structure of the output variation is important when
we define the problem baseline (Chapter 6) and when we consider feedback controllers
(Chapter 18). In Figure 2.9, we see most of the full range of variation from one camshaft to
the next. There is no obvious longer-term pattern. In the production of V6 pistons, the diam-
eter was recorded for one process stream at Operation 270 every minute for 200 minutes.
[Figure 2.9: run chart of angle error for lobe 12]

[Figure 2.10: run chart of piston diameter at stream 1]
The run chart is given in Figure 2.10. Here, the diameter shifts occasionally but relative to
the long term, the output variation is small over the short term.
We can apply these numerical summaries and plots to data for a continuous output. If
the output is binary (typically defective or not) or categorical with only a few values, we
use proportions to quantify process variation. For example, in the process that produces
manifolds described in Section 2.1, there are two manifolds produced in each mold. Dur-
ing an investigation, a team collected a sample of 40 castings, 20 from each cavity labeled
A and B. The team classified each casting as defective (1) or not (0). We summarize the
variation using the proportion defective. From MINITAB, we find that 2 ÷ 40 = 5% of
the castings in the sample are defective. The corresponding
percentages are 0% and 10% for cavities A and B, respectively. We do not find histograms
or bar charts helpful for these data. We can sometimes use a run chart to display clusters of
defectives over time.
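A minimal sketch of the proportion-defective summary, with the counts chosen to reproduce the percentages quoted above (Python is used here in place of MINITAB):

```python
# Forty castings classified defective (1) or not (0), 20 per cavity; the
# counts are chosen to reproduce the percentages quoted in the text.
castings = [("A", 0)] * 20 + [("B", 0)] * 18 + [("B", 1)] * 2

overall = sum(d for _, d in castings) / len(castings)
by_cavity = {}
for cavity, d in castings:
    by_cavity.setdefault(cavity, []).append(d)

print(f"overall proportion defective: {overall:.0%}")   # 5%
for cavity, ds in sorted(by_cavity.items()):
    print(f"cavity {cavity}: {sum(ds) / len(ds):.0%}")  # A: 0%, B: 10%
```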
We have presented a number of charts and statistics for quantifying and describing
process variation. We construct these summaries from a sample of units collected over time
from the process. We must be careful that we collect data that gives us an accurate picture of
the long-term process performance. For example, we would not have seen the structure of the
variation over time in Figure 2.10 as clearly if we had sampled 200 pistons over a few hours
rather than over 200 hours. We deal with this critical idea technically in Chapter 5 and
practically throughout the book whenever we describe a process investigation. The bottom
line is that we will not make progress without careful thought on how we collect process data.
Ynew = a + bYold
[Figure 2.11: idealized histogram of an output characteristic, showing the target, mean, and standard deviation]
Then, in the model, the mean and standard deviation are changed to a + b × mean and |b| × stdev, respectively.
For example, suppose the model for a temperature output has mean 1234°F and stan-
dard deviation 56.7°F. If we change from Fahrenheit to Celsius (°C = 0.556°F – 17.778),
then the new model has mean 668°C and standard deviation 31.5°C. Note that if we multi-
ply Y by negative one, the standard deviation in the model does not change.
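The rule for a linear change of units can be checked with a few lines of code; the helper function `transform` is our own, not part of any statistics package:

```python
# Linear change of units Ynew = a + b*Yold: the mean transforms the same
# way, and the standard deviation is scaled by |b|.
def transform(mean, sd, a, b):
    return a + b * mean, abs(b) * sd

# Fahrenheit to Celsius, using the rounded coefficients from the text.
mean_c, sd_c = transform(1234, 56.7, a=-17.778, b=0.556)
print(f"mean: {mean_c:.0f} C, sd: {sd_c:.1f} C")  # roughly 668 C and 31.5 C

# Multiplying by negative one leaves the standard deviation unchanged.
_, sd_neg = transform(1234, 56.7, a=0, b=-1)
assert sd_neg == 56.7
```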
We want to decompose the model for the output into pieces corresponding to the
effects of causes. Consider the following example. Suppose that two components A and B
are stacked and the height of the assembly y is the output of interest. This is illustrated in
Figure 2.12.
[Figure 2.12: components A and B stacked to form the assembly height]
We have a model for the height of each component as described in Table 2.2. The
means and standard deviations are based on data collected from the process that produces
each component.
Here XA and XB represent the effect of component A and component B on the stack
height. As such, we can construct a model for stack height from the two models of compo-
nent heights, namely
Y = XA + XB
Adding the models does not mean that we add the idealized histograms. Fortunately,
there are simple rules to describe how to combine the means and standard deviations. For
the means, we get

mean(Y) = mean(XA) + mean(XB)
In words, the mean of the sum is the sum of the means. This formula has important con-
sequences when we adjust a process center to target by changing a fixed input. For exam-
ple, we can use the model to predict the effect on the output if we shift the mean of
component A.
For the standard deviation, we have
sd(Y) = √(sd(XA)² + sd(XB)²)    (2.1)
In words, the standard deviation of the sum is the square root of the sum of the squares
of the standard deviations of the terms in the sum. There is a demonstration in the exercises
to convince you that this key formula is true. Equation (2.1) applies when the two compo-
nent heights vary independently.
Using the model, we can compute the mean and standard deviation of the assembly height from the component values in Table 2.2.
Equation (2.1) has many important consequences. The standard deviation of the sum is
much less than the sum of the standard deviations. This is good news when you are building
up assemblies, because the overall variation will be less than the sum of the component vari-
ation. However, it is bad news when it comes to reducing variation. To see why, suppose that
(for a price) we contemplate reducing the standard deviation of the height of component B
by 50% from 0.033 to 0.016. We can use the model and Equation (2.1) to predict the impact
on the variation of the assembly height. The effect on the standard deviation of the assem-
bly height is surprisingly small; the standard deviation becomes 0.048, a 15% reduction.
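Equation (2.1) makes this prediction easy to reproduce. In the sketch below, the component B standard deviation (0.033) is taken from the text, while the component A value (0.045) is an assumed stand-in for the Table 2.2 entry, which is not reproduced here:

```python
import math

def combined_sd(*sds):
    # Equation (2.1): independent additive effects combine as the square
    # root of the sum of squared standard deviations.
    return math.sqrt(sum(s * s for s in sds))

sd_a = 0.045  # assumed stand-in for the component A entry in Table 2.2
sd_b = 0.033  # component B, as quoted in the text

before = combined_sd(sd_a, sd_b)
after = combined_sd(sd_a, 0.016)  # component B variation cut roughly in half
reduction = 1 - after / before
print(f"sd before: {before:.3f}, sd after: {after:.3f}, reduction: {reduction:.1%}")
```

With these inputs, halving the component B variation shrinks the assembly standard deviation by only about 14–15%, in line with the figure quoted above.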
More generally, suppose we consider only two sources of variation, one attributed to a
particular cause and the second to all other causes. To be specific, suppose that the specific
cause (actually a group of causes) is the measurement system, so that all other causes are
responsible for the variation in the true value of the outputs. If the effects are additive and
independent, at least approximately, the overall standard deviation is

sd(total) = √(sd(due to cause)² + sd(due to all other causes)²)
Now suppose that the standard deviation due to the measurement system is 30% of the
overall standard deviation, that is, the ratio of sd(due to measurement) to sd(total) is 0.3.
What is the percentage reduction in overall variation if the variation due to the measure-
ment system is eliminated?
We constructed Figure 2.13 to give the percent reduction in overall standard deviation
if we could eliminate completely the contribution of an identified cause. From the plot, we
see that when the ratio of the sd(due to cause) to the sd(total) is 0.3, the potential gain is
about 5%. In other words, if the ratio of measurement system variation to total variation is
0.3, you can reduce the overall standard deviation by at most 5% if you replace the current
measurement system with one that is perfect.
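The curve in Figure 2.13 comes from rearranging the formula above: if the ratio is r, eliminating the cause leaves a fraction √(1 − r²) of the overall standard deviation. A short sketch:

```python
import math

def potential_gain(ratio):
    # Fractional reduction in the overall standard deviation if a cause
    # with ratio = sd(due to cause) / sd(total) is eliminated entirely.
    return 1 - math.sqrt(1 - ratio ** 2)

# The three ratios discussed in the text.
for r in (0.3, 0.59, 0.80):
    print(f"ratio {r:.2f}: potential gain {potential_gain(r):.0%}")
```

The computed gains are close to the approximate 5%, 20%, and 41% quoted in the surrounding discussion.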
[Figure 2.13: percent reduction in overall standard deviation versus sd(due to cause)/sd(total)]
In the stack height example, the ratio of sd(XB) to sd(Y) is 0.59, and thus, by eliminat-
ing the variation in component B completely we can reduce the overall standard deviation
by about 20%. Similarly, the ratio of sd(XA) to sd(Y) is 0.80, and by eliminating the varia-
tion in component A, we reduce the overall standard deviation by about 41%.
The message from Figure 2.13 is that we need to address dominant causes of variation
in order to make a significant reduction in variation. If we rank the potential gains (in terms
of percent reduction in overall standard deviation) by eliminating the contributions of var-
ious causes, we expect to see a Pareto chart like Figure 2.14. There is little opportunity for
gain in identifying causes with small contributions to the overall standard deviation. For
this reason, the proposed variation reduction algorithm (see Chapter 4) focuses on finding
and dealing with dominant causes.
[Figure 2.14: Pareto chart of the potential gain from eliminating each cause]
We can also use a model to describe how variation is transmitted through the process.
Suppose the scatter plot of the output versus some specific input is given by the left-hand
side plot in Figure 2.4. We adopt the model
Y = a + bX + noise,
where Y represents the output, a + bX the effect of the identified cause, X, and noise the rest
of the causes, presumably unidentified. The parameters a and b describe the linear relation-
ship between Y and X. From the model we can see that
sd(Y) = √(b²sd(X)² + sd(noise)²)
In this example, we denote sd(due to X) by |b| sd(X). Using the model, we can predict
how much of the standard deviation in Y is due to changes in X. For example if we could
hold X fixed (that is, make its standard deviation 0), then the standard deviation of Y will be
reduced to standard deviation of the noise. We can estimate this residual standard deviation
by fitting a line to the scatter plot in Figure 2.4 (see chapters 10 to 12). We can also con-
template changing a fixed input to change the slope b; again, we can use the model to pre-
dict the effect of this change.
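The residual standard deviation can be estimated exactly as described, by fitting a least squares line. The sketch below uses simulated data with a known slope and noise level so that the estimates can be checked; the true process values are of course unknown in practice:

```python
import random
import statistics

random.seed(2)

# Simulated process with a known linear cause: Y = a + b*X + noise.
a, b = 2.0, 1.5
xs = [random.gauss(10, 2) for _ in range(500)]
ys = [a + b * x + random.gauss(0, 1) for x in xs]

# Least squares fit of the line, then the residual standard deviation.
mx, my = statistics.mean(xs), statistics.mean(ys)
b_hat = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
a_hat = my - b_hat * mx
residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
residual_sd = statistics.stdev(residuals)

# If X could be held fixed, sd(Y) would shrink to about the residual sd.
sd_y = statistics.stdev(ys)
print(f"sd(Y)={sd_y:.2f}, estimated slope={b_hat:.2f}, residual sd={residual_sd:.2f}")
```

The residual standard deviation recovers the noise level (about 1 here), which is what sd(Y) would become if the cause X were held fixed.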
We can construct similar models for binary outputs. These are more complex and prove
less useful so we omit them here.
Key Points
• Process descriptions should be kept as simple as possible while still capturing
the needed information.
• Variation is an attribute of the process, not of single units produced by the
process.
• We can quantify and describe process variation using summaries such as aver-
age, standard deviation, histograms, and run charts.
• Reducing variation in a cause has a substantial effect on the overall variation
only if the cause is dominant.
• To determine the effect of various sources of variation on the overall variation,
we use sd(total) = √(sd(due to cause)² + sd(due to all other causes)²).
3
Seven Approaches
to Variation Reduction
A fool can learn from his own experiences; the wise learn from the
experience of others.
—Democritus, 460–370 B.C.
There are many ways to change fixed inputs in the process to reduce variation in an
output characteristic. In this chapter, we classify methods for variation reduction
into seven generic approaches (MacKay and Steiner, 1997–98):
1. Fixing the obvious based on knowledge of a dominant cause of variation
2. Desensitizing the process to variation in a dominant cause
3. Feedforward control based on a dominant cause
4. Feedback control
5. Making the process robust to cause variation
6. 100% inspection
7. Moving the process center closer to the target
We must identify a dominant cause of the variation for the first three approaches but not the
final four.
To implement any of the approaches, we need to change one or more fixed inputs. The
possible changes include:
• Changing a set point (for example, machine settings, specifications for an
input, supplier, and so on)
• Adding or changing a process step (for example, adding inspection, replacing a
gage, retraining an operator, rewriting instructions, and so on)
29
[Figure 3.1: output plotted against a continuously varying dominant cause]
We can sometimes eliminate variation in the dominant cause. In a drilling process, there
was a frequent problem with holes drilled in the wrong location. Plates were drilled on a
multispindle drill press. The defects were not found until assembly. Setting a plate upside-
down or rotating it from the correct orientation before drilling was the dominant cause. The
process engineer redesigned the fixture holding the plate so that it was impossible to mount
the plate in the press incorrectly. The variation in the dominant cause was eliminated.
Figure 3.1 shows a continuously varying dominant cause. We can also reduce variation
in a discrete cause, as shown in Figure 3.2. The dashed horizontal lines show the range of
output. In the left panel of Figure 3.2, we see that machine number is a dominant cause of
variation in the output. There are large differences in the average output level (process cen-
ter) for different machines. In the right panel of Figure 3.2, we see that we can reduce vari-
ation by aligning the average output for each machine. There may be an obvious method
for making such an adjustment.
For example, if a dominant cause is the difference in how two operators control the
process, then we can reduce variation by retraining the operators and clarifying the control
plan. In another example, a team identified the difference between two suppliers of valve
lifters as the dominant cause of high oil consumption in truck engines. The team could
eliminate most of the problem by switching to a single supplier or by establishing proce-
dures to reduce differences between the suppliers.
In an engine assembly plant, the original problem was the high frequency of rejects for
excessive noise at the valve-train test stands. The team discovered that there were large dif-
ferences in average measured noise level for each of the three parallel test stands, even
though the engines were haphazardly assigned to the stands for testing. After calibrating
each stand, they found the average noise levels were roughly equal. The real challenge for
the team was to keep the problem from recurring.
There are two conditions necessary for the fix-the-obvious approach to work. First, we
must be able to identify a dominant cause. We waste valuable resources and make little or
[Figure 3.2: output by machine, before (left) and after (right) aligning the machine averages]
temperature was due to unscheduled and scheduled (for example, lunch) downtime at
the pouring operation, where there was no method for maintaining iron temperature. The
cost of adding controls was prohibitive. The team looked for ways to desensitize the poros-
ity level to pouring-temperature variation. They discovered that changing the wash on the
sand cores substantially reduced the sensitivity of the porosity to low pouring temperature.
The foundry adopted this low-cost solution.
Process desensitization is a desirable approach since, once it is complete, no further
action is required. First, we must find a dominant cause of variation. We can use knowledge
of this cause to help select fixed inputs that we might change to desensitize the process. In
the engine block porosity example, once the team understood that low pouring temperature
was a dominant cause of porosity, they were led to consider changing the core wash and
core sand composition. Without knowledge of a dominant cause, it is unlikely that they
would have thought about changing these fixed inputs.
It is difficult to predict when desensitization will work. This is its great weakness. We
require a great deal of process knowledge to make an output less sensitive to variation in a
dominant cause. Which fixed inputs should we change? We require expensive designed
experiments to determine these inputs (if any exist) and their appropriate settings. The
experiments may fail to determine process settings that lead to improvement. Also, the new
process settings may lead to increased operating costs or negative side effects.
[Figure: output plotted against an input, two panels.]
[Figure 3.5: process output over time relative to the target, without feedback adjustment (left) and with feedback adjustment (right).]
The left panel of Figure 3.5 shows the output of a process without feedback adjustment. The dashed horizontal lines
indicate the range of output values. At any particular time, we can predict that the output will
be larger in the future. If we predict that the output is far from the target, we adjust the
process towards the target. The right panel of Figure 3.5 shows the dramatic reduction in
variation of the output due to this approach. Here, using the feedback control procedure, we
adjust the process whenever the output falls outside the adjustment limit (not shown, but
near the top dashed horizontal line in the right panel of Figure 3.5). The size of adjustment
is based on the deviation from the target of the last observed output value.
A team wanted to reduce variation in the film thickness in the painting of fascias. Too
much paint resulted in defects such as sags and runs, and too little paint created appearance
defects. The team discovered that a dominant cause of thickness variation was the flow rate
of the paint. They reformulated the film thickness problem in terms of flow rate. The team
then identified a dominant cause of variation in flow rate that was addressed by making
equipment modifications. After these changes, the flow rate continued to show a pattern in
the variation over time as shown in Figure 3.6.
[Figure 3.6: paint flow rate (roughly 95 to 110) plotted against time in minutes.]
The team defined a feedback control scheme that involved considering an adjustment
of flow rate every minute. If the flow rate fell outside prescribed limits, a valve was
adjusted to move the process center to the target value of 103. The feedback scheme pro-
duced a further substantial reduction in film thickness variation.
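The adjustment rule described above can be sketched as a simple dead-band feedback controller. The target of 103 comes from the case study, but the adjustment limits and the assumption of a one-to-one valve effect are hypothetical:

```python
def flow_rate_adjustment(observed, target=103.0, lower=100.0, upper=106.0):
    """Dead-band feedback rule: leave the process alone while the
    observed flow rate stays inside the prescribed limits; otherwise
    return the adjustment needed to move the center back to target.
    The limits (100, 106) are illustrative, not from the case study."""
    if lower <= observed <= upper:
        return 0.0
    # Size the adjustment by the deviation of the last observation
    # from the target, assuming a unit adjuster effect.
    return target - observed
```

In practice, the returned value would be translated into a valve change through a calibrated adjuster effect, as discussed later in this chapter.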
As a second example, in a project to reduce scrap due to variation in the diameter of
crankshaft journals, the team discovered that an automated 100% inspection gage drifted
substantially over time. They could not determine the cause of the drift, so they decided to
use a feedback controller to adjust the gage when appropriate. A reference part was meas-
ured several times just after the gage had been cleaned and calibrated. The team used these
measurements to establish a centerline and adjustment limits for the gage. Every four hours
the reference part was measured, and if the measured value fell outside the limits, the oper-
ators recalibrated the gage.
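The gage recalibration rule might be sketched as follows. The case study says only that a centerline and adjustment limits were set from repeated measurements of a reference part, so the number of measurements and the three-standard-deviation limit width here are assumptions:

```python
import statistics

def gage_limits(reference_measurements, k=3.0):
    """Set a centerline and adjustment limits from repeated
    measurements of a reference part taken just after the gage was
    cleaned and calibrated. The k-sigma width is an assumed convention."""
    center = statistics.mean(reference_measurements)
    s = statistics.stdev(reference_measurements)
    return center - k * s, center + k * s

def needs_recalibration(measured_value, limits):
    """Every four hours the reference part is measured; recalibrate
    the gage if the value falls outside the adjustment limits."""
    lower, upper = limits
    return not (lower <= measured_value <= upper)
```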
The use of setup procedures based on first-off measurements is another example of
feedback control. For example, an operator used the measured output of the first few units
produced after a tooling change to adjust the machine. Once a good setup was achieved,
no further process measurements were taken or adjustments made until the next tool
change.
We can apply feedback control successfully when three conditions are satisfied. First,
the process must exhibit a strong time structure in the output variation as discussed in
Chapter 2. Examples include drift due to tool wear and stratification due to batch-to-batch
variation. Second, there must be an adjustment procedure to move the process center.
Finally, the time to measure the output and adjust the process must be small relative to the
rate of change of the process.
The major advantage of feedback control is that it requires no knowledge of the domi-
nant cause of variation. We use only the measured output values. A drawback is the high
cost of process measurements and adjustments. Finally, due to the feedback nature of the
control, there is an inherent time delay. To identify when an adjustment is required, we
must first observe some output values that are significantly different from the target value.
Thus, feedback control is always reactive.
robustness approach. First they selected a run and a performance measure, the standard
deviation of the measured silicon concentration in five coins poured from the same
batch of iron. Second, they selected a number of fixed inputs, including mold shape,
sample polishing method, and sample temperature. They organized a large experiment
in which all of the selected inputs were changed in an organized fashion. For each com-
bination of the inputs, they prepared and measured five coins and then calculated the
performance measure. They found new settings for several inputs that reduced the
repeatability variation.
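A minimal sketch of computing the performance measure for such an experiment follows. The two factor combinations and the silicon concentration values are hypothetical; a real robustness experiment would cover many more combinations of the selected fixed inputs:

```python
import statistics

# Hypothetical silicon concentrations for five coins poured at each
# of two combinations of the fixed inputs (values are illustrative).
runs = {
    ("mold A", "polish 1"): [3.61, 3.55, 3.70, 3.48, 3.66],
    ("mold B", "polish 2"): [3.58, 3.60, 3.57, 3.61, 3.59],
}

# Performance measure: standard deviation of the five measurements
# within a run; smaller means better repeatability.
performance = {combo: statistics.stdev(values) for combo, values in runs.items()}
best = min(performance, key=performance.get)
```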
As a second example, in a painting operation (discussed in Chapter 1), there was exces-
sive variation in film build (paint thickness) from vehicle to vehicle at particular locations.
As a consequence, to meet the minimum film build specification, the process operators kept
the process center well above the lower specification. This resulted in high paint usage and
occasional visual defects such as runs. To reduce the variation, the project team used the stan-
dard deviation of film build over five consecutive panels (it was too expensive to use cars
for this investigation) to measure performance. Then they varied five fixed inputs, including
some process parameters and paint properties, to explore settings to reduce the variation. The
team discovered new settings of the fixed inputs that reduced the panel-to-panel variation by
a factor of two. The average film build was then reduced with substantial savings in paint and
rework costs.
Making the process robust will be successful if we can identify fixed inputs that can
be changed to improve the performance measure. This approach is similar to desensitiz-
ing the process to variation in a dominant cause. In both, the goal is to make the process
less sensitive to variation in the dominant cause. However, with the robustness approach,
we select and change fixed inputs without first identifying the dominant cause. Without
such knowledge, we find it more difficult to determine which fixed inputs to change and
by how much. There is considerable risk with the robustness approach that significant
resources will be used in a fruitless search for better process settings.
Note that we use the terms robustness and desensitization to label different approaches.
Many authors use them interchangeably.
anticipated if there are measurement or inspection errors. As a result, given the propensity
of people to make inspection errors, most successful applications use automated inspection.
Inspection on the output has a negative reputation among quality professionals. How-
ever, we often apply 100% inspection on a dominant cause. This is called source inspec-
tion, one form of error proofing (Shingo, 1986).
To apply the Move the Process Center approach, we must discover a fixed input or inputs
that we can change to shift the process center. Typically, we do this with an experiment in
which we investigate several fixed inputs simultaneously.
There are risks associated with adjusting the process center. First, we may not find a
fixed input that changes the process center and so waste the cost of the experiment. Second,
changing the fixed input may add expense or produce negative side effects.
To apply feedback or feedforward control, we need a fixed input that we can change to
adjust the output center a prescribed distance. We call such an input a process center
adjuster, or just an adjuster. We use the same tools to find an adjuster or a fixed input to
move the process center. With the Move the Process Center approach, we hope to make a
single shift. With feedback or feedforward control, we plan to adjust the process often.
In many circumstances, we already know of a fixed input that will change the process
center, but we may need to calibrate the size of the effect in order to target the process prop-
erly. We can use an experiment to determine the effect of the adjuster.
For many problems, moving the process center is identical to making the process
robust. In the shrinkage example, since the output has a physical lower limit of zero, shift-
ing the average and reducing the variation about zero are the same problem and the two
approaches are the same. In the shrink defect example, the two approaches are again iden-
tical. In each case, we are led to look for fixed inputs that we can change to reduce the
defect rate without finding a dominant cause.
Key Points
• To reduce variation, we must change fixed inputs. Without changing one or
more fixed inputs, the process performance will not improve.
• We consider seven variation reduction approaches:
1. Fixing the obvious using knowledge of a dominant cause of variation
2. Desensitizing the process to variation in a dominant cause
3. Feedforward control based on a dominant cause
4. Feedback control
5. Making the process robust
6. 100% inspection
7. Moving the process center
• Finding a dominant cause of variation is an important step in the first three
approaches.
• The appropriate choice of variation reduction approach depends on the problem
definition, the current state of knowledge about the process, and costs.
4
An Algorithm for Reducing Variation

In this chapter we provide an algorithm to address variation reduction problems such as
those described in Chapter 1. The algorithm is structured around the seven variation
reduction approaches introduced in Chapter 3.
We believe that variation reduction problems (or any problems) are best addressed
using a step-by-step method—that is, an algorithm. There are many advantages to adopting
an algorithm. Some people and teams are natural problem solvers who can follow their own
instincts, but most of us can use some guidance. We find the algorithm useful because it:
• Is easy to teach (systematic and structured).
• Can be managed.
• Helps to avoid silly mistakes or oversights.
• Provides documentation of progress and success (or failure).
• Helps to ensure that all possible solution approaches have been considered.
• Makes most people, especially when working in teams, better problem solvers.
On the enclosed CD-ROM, we describe in detail three case studies that are successful
applications of the algorithm. Here we outline the algorithm and discuss a range of imple-
mentation issues. In later chapters, we describe each stage in detail.
[Figure: flowchart of the variation reduction algorithm stages, including a check of the measurement system and a reformulate loop.]
Process managers provide the information necessary to start the algorithm. This infor-
mation should include a specified process, an improvement goal often expressed in mone-
tary terms, and a schedule. The managers also enable people to carry out the work. For
convenience, we refer to these people as the process improvement team or the team for short.
In the first stage of the algorithm, Define Focused Problem, the team narrows the
process boundaries and selects the particular output characteristic(s) needed to specify
the problem. For example, suppose the management statement of the problem is to
reduce scrap costs for a particular process by 50% within a month. The focused state-
ment may be to reduce a particular category of scrap, or scrap at a particular processing
step. The team determines the nature of the excess variation (off-target, unit to unit)
and establishes an appropriate baseline measure of the process performance. They then
set a goal for the project in terms of the baseline performance that is consistent with the
management goal.
One special feature of Statistical Engineering is the explicit and frequent use of the
information provided by the baseline investigation. We establish a baseline to help:
• Set the goal in terms of a particular output.
• Search for the dominant cause.
The goal is to select a working approach and then direct efforts to determine if this
approach is feasible. If it is, the team proceeds to the implementation stage; if not, they
reconsider the approaches.
We have divided the approaches into two groups based on whether their implementation
requires the identification of a dominant cause of variation. The approaches requiring the
identification of a dominant cause are the first three: fixing the obvious, desensitizing the
process, and feedforward control; the remaining four do not require this knowledge.
At the next stage of the algorithm, Assess Feasibility and Plan Implementation of
Approach, the team looks at the feasibility of the selected approach. They:
• Examine the process to see if it meets the conditions for the approach to be
effective.
• Determine what further knowledge is required.
• Plan and conduct investigations to acquire the knowledge.
• Determine the solution, that is, what and how fixed inputs will be changed.
• Estimate the benefits and costs of the proposed changes.
• Look for possible negative side effects.
We describe assessing the feasibility of each approach in chapters 14 to 20.
If the selected approach is feasible, the team proceeds to validate and implement the
solution. Otherwise, they must reconsider other variation reduction approaches.
We arrive finally at the Implement and Validate Solution and Hold Gains stage. Here
the team assesses the baseline performance of the changed process to ensure that the proj-
ect goal has been met. They must also examine other process outputs to make sure they
have not created a new problem in order to solve the original one. Finally the team imple-
ments and locks the change into the process or its control plan. We recommend the team
monitor the process output and audit the process change until they are certain the solution
is effective and permanent. As well, the team should document what they have learned and
identify future opportunities to reduce variation further. We discuss the Implement and
Validate Solution and Hold Gains stage of the algorithm in Chapter 21.
Process Issues
The process needs to be under reasonable control before starting a project using the formal
algorithm. We once were asked to assess the likelihood of success of a project to reduce
the amount of rework due to dirt defects on painted bumpers. In a quick walk-through of the
painting process with a painting expert, we saw fiberglass batts fall off the ceiling onto
the painting line, paint drips everywhere, operators sweeping in areas with unbaked
painted parts, and so on. The process was a mess, riddled with poor practices, even to our
untrained eyes. We suggested there was little value in using the algorithm with this uncon-
trolled process. Instead, we gave the simple message:
Fix the obvious!
We think that this is an important message both at the start and during any project to
reduce variation. Many times, we have learned after starting a project that scheduled equip-
ment maintenance has been abandoned, sometimes for years, or that operators are ignoring
the control plan for the process, often because no one told them that such a document
existed. A solution that results from any initial work needed to get a process under control
may be thought of as fixing the obvious. In general, we expect to apply the algorithm to
processes where:
• A control plan is being followed.
• Equipment is maintained.
• Gages are calibrated.
• Personnel are adequately trained.
• Housekeeping is addressed.
• Industry standard operating principles (see, for example, Todd et al., 1994) are
followed.
Quality standards, such as ISO 9000 (Hoyle, 2001) and QS-9000 (AIAG, 1998) pro-
vide a mechanism to establish reasonable control. Bhote and Bhote (2000) call this activity
process certification. We do not require that the process be under statistical control, that is,
stable as defined in terms of a control chart, to implement the algorithm. See the discussion
about classifying causes in the supplement to Chapter 2 for more explanation.
The algorithm is best suited to address chronic problems (long-standing adverse situa-
tions; Juran and Gryna 1980, p. 99) rather than problems that are sporadic. Sporadic prob-
lems, where the status quo is suddenly adversely affected, are difficult because they require
immediate attention and quick solution. Sporadic problems often lead to containment of
product and a corresponding large cost due to logistics, delay, and lost inventory. In this
context, applying the proposed stage-by-stage algorithm, with its contemplative nature, is
likely not an option. The algorithm has no mechanism for containment. However, many of
the specific tools and methods discussed in this book as part of the algorithm are useful to
find the cause of a sporadic problem and a solution.
The appearance of a sporadic problem is sometimes used to initiate a project whose
goal is to look at both the new sporadic problem and the related long-term chronic problem.
For example, a sudden large increase in brake rotor balance rejects from 25% to 50%
prompted a process improvement project whose goal was to reduce balance rejects to less
than the chronic rate. It is also important to realize that a recurring sporadic problem is best
thought of as a chronic problem. Firefighting to address sporadic problems is not effective
in the long run. Using the discipline of the proposed algorithm provides greater assurance
of finding a permanent solution.
To apply the algorithm, we need to measure process inputs and outputs in a timely
fashion. If we are unable to make such measurements, we may not gain the process knowl-
edge required to move through the stages of the algorithm. In a project to reduce warranty
costs, the team defined the focused problem in terms of the failure rate of wheel bearings
within the warranty period of three years. Their goal was to reduce this failure rate from 3%
to less than 0.3%. The key output is the time to failure of the bearing. Other than historical
data stored in the warranty database, the team recognized that it would be difficult to apply
the algorithm because it takes such a long time to measure the output for any new vehicle.
To proceed, they decided to replace the time to failure in the field by a surrogate measure,
the bearing failure time measured under extreme conditions in a laboratory.
In another situation, a shipping company set out to reduce the frequency of short and
wrong stock shipments to its only customer. There was a lag of up to six months in the cus-
tomer’s reporting of shipping errors and little confidence in the accuracy of the reports. The
long lag meant that these measurements were difficult to use for variation reduction. Before
the algorithm could be applied, the customer-based measurements were replaced by local
measurements based on the results of a daily preshipment audit. It was assumed that
changes to improve the process expressed in terms of the audit results would also improve
performance for the customer.
To apply the algorithm, we need a high- to medium-volume process that produces
parts or units frequently. We assume that we can conduct process investigations relatively
quickly and at low cost. If we cannot do so, then the algorithm is likely to fail or we will
be very limited in our choice of approaches. In the bearing failure example, each measure-
ment of failure time in the lab was very expensive, so that it was not feasible to carry out
many investigations to find the cause of the variation in failure time. The team decided to
make the process robust, the only feasible variation reduction approach.
The algorithm is designed to identify low-cost changes to the process or product that
will meet the goal of the project. We consider the application of the algorithm a failure if
the team cannot identify a low-cost solution. In some instances, the team has a solution in
mind (for example, expensive new equipment) and does not consider all of the possible
approaches at the Choose the Variation Reduction Approach stage. They do not adequately
consider lower-cost changes such as an improvement to the control plan or a change of
process setting to desensitize the process to variation in a dominant cause. On the other
hand, there may be no cost-effective solution.
We urge you to remember that in most variation reduction problems, the process is cur-
rently producing a high proportion of excellent output. Because the process does an excel-
lent job most of the time, there likely will be a low-cost way to make it even better.
Management Issues
Management makes critical contributions to the successful application of the algorithm.
Some specific management tasks are to:
• Select projects with the potential for a large impact on customer satisfaction or cost.
• Specify the project goal and schedule, and allocate resources to fit them.
• Form a small team with the necessary process and statistical knowledge.
• Support and review the project, and weigh the costs and benefits of proposed solutions.
• Ensure that lessons learned are disseminated throughout the organization.
It is difficult for managers to make good decisions in terms of these points if they do not
understand the algorithm and how it functions.
The algorithm is project-based, so we want to focus on projects with the potential to
have a large impact on customer satisfaction, cost, or both. We know that there are costs
associated with the algorithm and that the outcome is uncertain. We can use customer sur-
veys, market research, and warranty data to highlight important customer concerns or
desires. We need to link customer concerns to process outputs that can be measured at the
manufacturing site; we can use a tool such as quality function deployment (Revelle et al.,
1998). We can identify good cost reduction or productivity improvement projects by exam-
ining scrap and rework rates and using Pareto analysis (Juran et al., 1979) to rank possible
projects. In many cases, other factors will also influence the decision. For instance, manage-
ment may be planning to update or remove a production process. If such wholesale change
is imminent, we should not waste resources addressing identified problems for that process.
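A Pareto ranking of candidate cost reduction projects can be sketched as follows; the scrap categories and annual cost figures are hypothetical:

```python
# Hypothetical annual scrap costs by defect category; the Pareto
# ranking orders candidate projects by potential impact.
scrap_costs = {
    "dimensional": 230_000,
    "porosity": 120_000,
    "leaks": 45_000,
    "surface finish": 15_000,
}

total = sum(scrap_costs.values())
ranked = sorted(scrap_costs.items(), key=lambda item: item[1], reverse=True)

cumulative = 0
for category, cost in ranked:
    cumulative += cost
    print(f"{category:15s} ${cost:>7,d}  {100 * cumulative / total:5.1f}% cumulative")
```

A ranking like this directs attention to the vital few categories that account for most of the cost, before other factors such as planned process changes are weighed.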
When choosing a project (and also later when choosing a potential solution), managers
must consider the issue of global versus local optimization. Local optimization may occur
when we have a narrow focus and forget that the process of interest is likely part of a much
larger system. The output of one process is an input to the next. Reducing variation in an
output that has little effect on the downstream process is an example of local optimization.
Goldratt (1992) provides an interesting discussion of the local optimization issue (for
example, bottlenecks) in the context of cycle times and machine scheduling.
An example of local optimization occurred in the engine block leak example intro-
duced in Chapter 1. The team solved the problem of center leaks by adding several chap-
lets (small steel inserts) to better support the core. This solution reversed a previous process
change. The chaplets had been removed some months earlier to reduce cost. At that time, it
was not realized that the cost savings from removing the chaplets would be overwhelmed
by increased scrap costs. The truck pull problem described in Chapter 1 provides another
example. The problem was initiated because variation in truck pull is noticeable to the cus-
tomer. However, pull is a derived output characteristic that depends on other outputs such
as camber and caster. See Chapter 6 for further discussion. Local optimization would have
been a concern had management presented the problem as one to reduce variation in cam-
ber and caster. It turns out that camber variation has little influence on pull (relative to the
effect of caster). The team could have spent considerable time and effort reducing variation
in camber, but the (local) improvement would not have been noticeable to the customer.
The managers must specify the project goal, usually in monetary terms, and provide a
schedule. Without a goal, the team will not be able to decide if an approach is feasible, or
when the project is finished. The goal must be reasonable; the use of the algorithm will not
(often) produce miracles. Resources must be allocated to fit the schedule. A good reference
on project management is Lewis (2002).
Teams should include participants with expert knowledge of the selected process. As
well, the team must include at least one individual familiar with the algorithm and the asso-
ciated statistical methods. The core team should be relatively small; we recommend one to
three people. Many others will be consulted as necessary to help with specific tasks.
Successful teams must have strong management support during the conduct of the
project. Management must provide team members time away from other duties and help in
obtaining necessary resources such as testing time. Management can provide training in the
use of the algorithm if necessary, can facilitate access to experts, and can provide contacts
with customers and suppliers as needed. Management should conduct periodic reviews of
the project to keep the effort focused and to provide a mechanism to terminate the project
if the likelihood of success appears too small or if costs are too high.
Management also has a strong role to play in weighing the likely costs and benefits of pro-
posed solutions. The team can provide a business case to justify any suggested process/prod-
uct change. Management approval is necessary to authorize and pay for such changes.
At the conclusion of a project, management must ensure that important lessons learned
are disseminated throughout the organization. This may involve updating the corporate memory
by changing design guidelines, and so on. A useful source for information on learning
organizations is Senge (1990).
Key Points
5
Obtaining Process Knowledge Empirically

New process knowledge is an essential ingredient of the variation reduction algorithm
introduced in Chapter 4. Remember, the basis for the algorithm is the idea that we
can find a cost-effective solution by better understanding how and why the process
behaves as it does.
We use an empirical approach to acquire new process knowledge. Empirical means
derived from experiment and observation. That is, we learn about process behavior by carry-
ing out investigations (see Box, 1999). We follow a five-step process (we call this process
a framework to avoid confusion with the process we are trying to improve) to help plan,
execute, and draw conclusions from such investigations. We refer to the framework by
QPDAC, an acronym that stands for Question, Plan, Data, Analysis, and Conclusion (Oldford
and MacKay, 2001). The purpose of each step is:
Question: Develop a clear statement of what we are trying to learn.
Plan: Determine how we will carry out the investigation.
Data: Collect the data according to the plan.
Analysis: Analyze the data to answer the questions posed.
Conclusion: Draw conclusions about what has been learned.
The QPDAC framework highlights the important issues and forces us to think critically
about the inevitable trade-offs necessary in designing and conducting an empirical investiga-
tion. At the end of the chapter, we give a checklist of issues that you should address in every
such investigation. By so doing, you can be assured of solid conclusions at reasonable cost.
We expect to apply QPDAC several times in any application of the variation reduction
algorithm. We need to use empirical investigations to increase process knowledge in most
of the stages of the algorithm. Examples include:
• Quantifying the problem baseline (Chapter 6)
• Assessing the effectiveness of the measurement system (Chapter 7)
• Finding a dominant cause of variation (chapters 9–12)
• Verifying the dominant cause of variation (Chapter 13)
• Determining the feasibility of the variation reduction approaches (chapters 14–20)
• Validating that the proposed solution has addressed the original problem
(Chapter 21)
Question
In the Question step, we specify what we are trying to learn about the process. The goal is
to pose one or more clear questions that we can address in the subsequent steps of QPDAC.
Without clear questions, it is impossible to determine a good plan and draw appropriate
conclusions.
In the Question step, we need to answer the following:
• To what group of units do we want to apply the conclusions?
• What input and output characteristics are needed to specify the question?
• What attributes of the process specify the question?
• What is the question of interest?
To introduce some terminology, we call a unit an individual realization (or product) of
the process under investigation and the target population the group of units about which we
want to draw conclusions. We sometimes specify the target population in terms of the target
process that produces the units.
In the example, to establish a baseline performance measure, the team wanted to learn
how the piston-making process would behave if it was left to operate normally. A unit was
a piston and the target population was all pistons to be produced in the future under the cur-
rent operating conditions (the target process).
We use the process language from Chapter 2 that defines input and output characteristics.
In the example, we are only interested in the diameter of the piston in order to pose the
question. In more complex investigations, we will specify a number of input and output
characteristics to help define the questions.
To state clearly what we are trying to learn, we specify attributes of the target population.
An attribute is a function of the characteristics over all the units in the target population. In
the piston example, the team needed an attribute to quantify the output variation in the target
population. They decided to use the standard deviation of the diameters in this set of pistons.
Note that none of these pistons had yet been produced.
Attributes can be numbers such as averages, proportions and standard deviations, or
pictures such as histograms or scatter plots. We define attributes in terms of one or more
input and output characteristics.1 In many applications of QPDAC, we formulate several
questions, so we define several attributes.
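A small sketch of estimating two numerical attributes from a sample of measured diameters; the diameter values are invented for illustration, since attributes of the target population can never be computed directly:

```python
import statistics

# Hypothetical piston diameters (mm) measured on a sample of units.
diameters = [101.42, 101.38, 101.45, 101.40, 101.37, 101.44, 101.41]

# Sample estimates of two attributes of the target population:
center = statistics.mean(diameters)   # average diameter
spread = statistics.stdev(diameters)  # standard deviation of diameters
```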
We use the selected attributes to specify the question. In the piston example, to estab-
lish a baseline measure of process performance, the team asked the specific question:
What is the standard deviation of the diameters of pistons to be produced in the
future if we leave the process to operate as it is currently?
After applying QPDAC, the team was able to provide an answer to this question. They
then proceeded to look for new ways to operate the process that would reduce the variation.
This search involved several applications of QPDAC. When they found a promising new
method of operation, they asked and answered a question about another attribute of the target
population:
What is the standard deviation of the diameters of pistons to be produced in the
future if we operate the process under the new method?
By comparing the two attributes, the team gained valuable information helpful in making
the decision about whether or not to change the method of operating the process. The cost of
the change and potential side effects also entered into the decision. With knowledge of both
attributes, the team had the process knowledge to make a decision to permanently change the
operating method.
In the example, the target population contained a large unknown number of pistons, all
to be produced in the future. This is the common situation where it is impossible to examine
each unit in the target population. As a consequence, we will never know the target popu-
lation attributes exactly. Our goal in the final four steps of QPDAC is to learn enough about
the attributes of interest, subject to the constraints of time and cost, to make good decisions,
in spite of the inherent uncertainty.
The outcome of the Question step is one or more clear questions about well-defined
attributes of the relevant target population or process.
Plan
In the Plan step, we specify how we will answer the questions generated in the Question
step. The result of this step is a plan to gather a sample of units, to measure a prescribed set
of characteristics on these units, and to store the information collected. To get a detailed
plan, we need to determine:
• What are the units and population available for the investigation?
• How will we select units to be included in the sample?
• What characteristics of the sampled units will we measure, deliberately change,
or ignore?
• For those characteristics that we plan to measure, do we have confidence in the
measurement systems?
• For those input characteristics that we will deliberately change, how will we
make such changes?
• How will we deal with the logistical issues?
The study population is the collection of units from which we can choose the sample.
In the V6 piston diameter example, the team chose the study population to be all pistons
produced by the process in the next week. That is, they planned to collect the sample over
the next week of production.
What were the consequences of this choice? First, the team would take at least a week to
complete the investigation. Second, if the standard deviation of the diameters of pistons
produced in the given week was different from the corresponding standard deviation in the
long-term future (the target population), there would be an error in the conclusions. The team could
have reduced the likelihood of this study error by extending the time of sampling to a month
or even longer, but then they would pay a price in terms of time and cost.
We can only suspect that study error might be present. We cannot quantify the error
without complete knowledge of all units in both the study and target populations. If we
have this knowledge, then the investigation is pointless because we already have the answer
to our question. You should always think about the relationship between the target popula-
tion and a proposed study population in terms of a trade-off between possible study error
and cost. Remember that the study population is the set of units from which we will get our
sample. Even though we cannot quantify the study error, it is clear that some choices of
study population are much better than others. In the V6 piston diameter example, the team
would have been unwise to define the study population as the next 100 pistons to be
produced, since a dominant cause may well take longer than that to vary over its normal range.
In most applications of QPDAC, it is not feasible to examine every unit in the study
population because of cost and time constraints. Rather, we collect a sample of units using
a sampling protocol. The sampling protocol specifies how we select the sample and how
many units we choose. The goal of the sampling protocol is to produce a sample of units
with attributes that match those of interest in the study population. We define sample error
as the difference between the sample and the study population attribute. We cannot deter-
mine the sample error because we do not know the attribute in the study population.
We have numerous choices for a sampling protocol, including random sampling, hap-
hazard sampling, systematic sampling (for example, sample every 10th unit), and conven-
ience sampling (take what we can get).2 When specifying a sampling protocol, we need to
balance cost and convenience against possible sample error. For a given sampling method,
larger sample sizes are more likely to yield a smaller sample error. However, it is more
expensive to gather and deal with larger samples.
In the V6 piston manufacturing process, about 10,000 pistons were produced per day
for five days in the week. The team decided to commit resources to collect a sample of 500
pistons. The issue of sample size is a difficult one, and the choice is usually driven by
resource considerations. For convenience, the team decided to use a systematic sampling
protocol in which they would pick every 100th piston. They expected such a sample to give
a good representation of the week’s production. That is, they thought the standard deviation
of the diameters of the sampled pistons would be close to that of all the pistons produced
over the week. In other words, they expected little sample error.
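To make the sampling ideas concrete, the systematic protocol and the resulting sample error can be simulated in a few lines of Python. The production values below are invented for illustration; they are not the team's data.

```python
import random
import statistics

random.seed(1)

# Hypothetical week of production: 50,000 recorded diameters (microns),
# with piston-to-piston noise plus a slow drift over the week.
week = [590 + 3 * random.gauss(0, 1) + 2 * (i / 50_000) for i in range(50_000)]

# Systematic sampling protocol: every 100th piston gives a sample of 500.
sample = week[::100]

pop_sd = statistics.pstdev(week)      # the study population attribute
sample_sd = statistics.stdev(sample)  # the corresponding sample attribute
print(f"population sd = {pop_sd:.2f}, sample sd = {sample_sd:.2f}")
print(f"sample error = {sample_sd - pop_sd:.2f}")
```

Because the systematic sample is spread over the whole week, it captures the drift and the sample error is small; a convenience sample taken within one hour would miss the drift entirely.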
Next we decide what characteristics of the sampled units to measure, to ignore, or to
change. We must measure or record any input or output characteristic used in the definition
of the attribute of interest. However, it is often advantageous to measure additional character-
istics. In the V6 piston diameter baseline investigation, for very small cost, the team recorded
the time at which each piston in the sample was measured. They hoped that the pattern of vari-
ation of diameter over time would give valuable clues to be used later in the problem solving.
We typically ignore most characteristics of the sampled units; this is a conscious
choice. In the V6 piston diameter example, the team decided to record the diameter, the
time, and the day of measurement and to ignore all other input and output characteristics.
They made this choice because of the question of interest. However, as a general rule, if
there is little cost involved, we should record other inputs and outputs, especially if we have
automated measurement and data collection systems available. We may be able to use these
data later in the variation reduction algorithm.
In the Plan step, we specify how we will measure the selected output and input charac-
teristics on the sampled units. The gages, operators, methods, materials, and environment
all make up the measurement system. The difference between the measured and true value
is called measurement error. Due to measurement errors, the attribute calculated using the
measured values of the sampled units might differ from that using the true values. We need
to worry about the contribution of errors from the measurement system to the overall error.
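A short simulation shows why this matters: if the true values vary with standard deviation σ_p and an unbiased measurement error has standard deviation σ_m, the measured values vary with standard deviation sqrt(σ_p² + σ_m²). The numbers below are illustrative only, not from any investigation in this chapter.

```python
import math
import random
import statistics

random.seed(2)

true_sd, meas_sd = 3.0, 1.5   # hypothetical process and gage standard deviations

true_values = [random.gauss(590, true_sd) for _ in range(100_000)]
# Each measured value is the true value plus an independent measurement error.
measured = [v + random.gauss(0, meas_sd) for v in true_values]

observed_sd = statistics.stdev(measured)
expected_sd = math.sqrt(true_sd**2 + meas_sd**2)
print(f"observed sd = {observed_sd:.2f}, theory predicts {expected_sd:.2f}")
```

Even a modest measurement error inflates the variation we attribute to the process, which is why we assess the measurement system before trusting a baseline estimate.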
In some investigations, we deliberately change an input characteristic on one or more
units in the sample to understand the effect on the output characteristic. If no input charac-
teristics are deliberately changed, we call the plan observational. The plan to determine the
baseline performance of the piston process was observational since the team did not delib-
erately change any inputs as the sample pistons were collected. Of course, many inputs
changed from piston to piston in the sample. The key point is that the team let these
changes occur naturally and did not deliberately manipulate any input.
If we deliberately change one or more input characteristics on the sampled units, we
call the plan experimental. For example, as part of a project to reduce the proportion of
steel stampings that rusted during shipping, a team specially oiled a number of stampings
and shipped these parts in the same crates as stampings that were not oiled. The team hoped
to show that the oiling significantly reduced the frequency of rusting. This was an experi-
mental plan. The oiling procedure would not have happened without deliberate intervention.
In Chapter 13, we look at the details of experimental plans, which are often called designed
experiments.
We require an experimental plan for many questions, such as the previous example
regarding the effect of oiling on rust. To investigate the effects of fixed inputs that do not
normally vary, we must intervene and change them. Such interventions can cause difficulty
in the production process. For other questions, we may have a choice between an experi-
mental and observational plan. Where possible, we use observational plans because they
have the major advantage of not disrupting the current operation of the process. Also, observa-
tional plans are usually cheaper to conduct than experimental plans.
A final task in the Plan step is to organize the logistics of the investigation. This can be
nontrivial, especially for complex plans. We must consider who, how, where, and when as
related to the investigation. For example, we must decide who will collect the sample of
units, make the appropriate measurements, and record the data. We must take care to ensure
that everyone involved knows what they are supposed to do and that everyone not directly
involved knows what they are not supposed to do.
In the V6 piston diameter example, imagine the confusion generated if someone had
decided to make an unexpected change (that is, change a fixed input) to the process during
the week in which the team was collecting the sample. Remember the question:
What is the standard deviation of the diameters of pistons to be produced in the
future if we leave the process to operate as it is currently?
If the fixed input had been changed in the middle of the sampling period, the team
would have little confidence in the conclusions from their investigation. Worse, they may
not have known that such a change was made.
We have also seen the opposite problem. A team took great care to explain that they
were collecting data to establish a baseline and that no process changes should be made.
The operators took the team at their word and ignored the adjustment procedure in the con-
trol plan until a supervisor noticed that a large number of out-of-specification parts were
produced. The team had failed to explain that the current control plan was part of the
process and was to be executed as usual.
If the plan is experimental (that is, involves deliberate changes to the process), we must
make all interested parties aware of what is happening since we may put customers at risk.
The changes may have unexpected consequences, and we must take special care if we plan
to ship the product produced during the experiment.
Part of the logistics is to plan for data storage and processing. For small investigations,
we can write down the measurements as they are made. Usually we want to store the data
electronically for processing. We recommend a row/column spreadsheet format, where each
row represents a different unit and each column represents a different characteristic, as
described in Appendix A.
The output of the Plan step is, not surprisingly, a detailed plan for carrying out the
investigation. In most applications of QPDAC, we consider all the substeps in the plan but
not necessarily in the order that they are presented here. As well, we often iterate among the
substeps. Sometimes, in the middle of the planning, we are forced back to the Problem step
to clarify the question being asked.
In Figure 5.1, we summarize the connections among the target population and the data
and show where errors can occur. Our task is to create a plan that is not too complex or
expensive and yet controls the potential errors.3
Figure 5.1 Possible errors linking the target population, study population, sample, and data: study error (target population to study population), sample error (study population to sample), and measurement error (sample to data).
Data
In the Data step, we execute the plan and collect the data. This is often the most time-
consuming and costly part of the investigation. This is especially true for those investiga-
tions where we paid little attention to the plan, since there is a great opportunity for things
to go wrong. In the Data step, we also record any deviations from the plan.
In the example, the team measured 469 pistons and recorded the diameter, day, and
hour of measurement. The data were recorded in production order. Each diameter is
measured in millimeters; the deviation from 101 millimeters is multiplied by 1000 before
being recorded, so the data give the last three digits of the diameter in microns. The process
did not operate on the last part of the shift on day four. The data from the piston baseline
investigation are stored in the file V6 piston diameter baseline.
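The encoding just described can be sketched as follows; the helper names are ours for illustration, not names from the data file.

```python
def record(diameter_mm: float) -> float:
    """Recorded value in microns: deviation from 101 mm, times 1000."""
    return (diameter_mm - 101.0) * 1000.0

def full_diameter(recorded_microns: float) -> float:
    """Recover the full diameter in millimeters from a recorded value."""
    return 101.0 + recorded_microns / 1000.0

# A diameter of 101.590 mm is recorded as 590, and decodes back again.
print(round(record(101.590)))            # -> 590
print(round(full_diameter(590.0), 3))    # -> 101.59
```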
Analysis
The goal of the Analysis step is to use the data to answer the question(s) posed in the Ques-
tion step.
For most of the investigations discussed in this book, we use simple numerical and
graphical summaries. However, we also consider some more advanced statistical analysis
techniques provided by MINITAB. We introduce these analysis methods and tools as
needed in subsequent chapters.
The standard deviation of the sample data in the V6 piston diameter example is 3.32
microns. In Figure 5.2, we show a histogram and run chart of the sample piston diameters.
By plotting the data in several ways, we can detect unusual values (outliers). Outliers are
a concern because many numerical attributes, such as the average and standard
deviation, are sensitive to them. As a result, any outliers that greatly affect the calculated
attributes should be identified.4 When outliers are present, great care must be taken
that the conclusions drawn from the investigation are truly representative of the target
population.
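As a sketch (using simulated data, not the actual baseline sample), we can compute the summary attributes and flag outliers that inflate the standard deviation:

```python
import random
import statistics

random.seed(3)

# Simulated recorded diameters (microns) with one gross outlier appended.
diameters = [random.gauss(590, 3.3) for _ in range(468)] + [640.0]

mean = statistics.mean(diameters)
sd = statistics.stdev(diameters)

# Flag values more than 3 standard deviations from the mean, then see how
# much they inflate the estimated standard deviation.
outliers = [d for d in diameters if abs(d - mean) > 3 * sd]
kept = [d for d in diameters if abs(d - mean) <= 3 * sd]
sd_without = statistics.stdev(kept)

print(f"sd with outliers = {sd:.2f}, without = {sd_without:.2f}")
print(f"flagged: {[round(d, 1) for d in outliers]}")
```

A flagged value should be investigated, not silently deleted; it may be a recording error or a genuine, informative extreme.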
Figure 5.2 Histogram and run chart of the 469 sample piston diameters (recorded values in microns, plotted by frequency and by sampling order).
Conclusion
In the Conclusion step, we answer the questions posed about the target population attri-
butes of interest. We also discuss limitations of the answer due to possible errors, both as
envisaged in the Plan step and due to deviations from the plan in the Data step. In thinking
about errors, remember the three types: study error, sample error, and measurement error.
We use the results from the conclusion step to help us decide what to do next. We need to
interpret the conclusion and the associated risk of error in the context of the variation reduc-
tion algorithm and the problem itself. To do so, we use basic understanding of the operation
of the process, appropriate theory, and knowledge gained in earlier empirical investigations.
In the V6 piston diameter baseline investigation, the estimated process standard devia-
tion was 3.32 microns, which served as a baseline measure of process performance against
which the team eventually assessed the effectiveness of deliberate process changes. The
team was confident that the estimated standard deviation was a good representation of the
long-term variation in the current process, because their experience with the process sug-
gested that in one week, most of the (important) varying inputs that could have changed
would have changed. In addition, because of the well-designed plan, they expected that
sample error would have little impact.
They specified the numerical goal for the project in terms of the estimated standard
deviation and decided to proceed to the next stage of the variation reduction algorithm.
5.2 EXAMPLES
Here are two more examples to help reinforce the ideas and language of QPDAC.
The relevant process stations: Operations 270A and 270B (two parallel grinders), Operations 280–300, and final inspection at Operation 310.
Operators used the measurement system at Operation 270 to control the two grinders that
operate in parallel. The gage at final inspection operated in a controlled temperature environ-
ment. Every piston was inspected to ensure that the diameter and several other characteris-
tics were within specification.
During an application of the variation reduction algorithm, the team decided to investigate
the relationship between the two measurement systems to help understand the causes of
diameter variation at Operation 310.
The team defined a unit to be the act of taking a measurement on a piston. The target
population was the set of all such acts that would occur in the future under current conditions.
The output characteristic was the measured diameter. One key input was the measurement
system used.
To define an attribute useful for comparing the two measurement systems, think of
measuring the same piston twice, once with each measurement system. Repeat this over all
possible pistons and times (here we think of doing this only conceptually since the number
of possible pistons and times is very large). Finally, plot the measured diameters from
the Operation 270 gage against the corresponding measured diameter from the final gage.
The scatter plot (the attribute of interest) might look like Figure 5.4. Note that the closer the
plotted points fall to the 45° line shown on the plot, the better the agreement between
the two systems. A point on the line corresponds to equal measured diameters by both systems.
Of course, the scatter plot might be very different from the one that is shown.
The specific question was to estimate the scatter plot and determine some of its
properties.
Figure 5.4 Hypothetical target population scatter plot comparing two measurement systems.
The team decided to carry out the investigation the following day. Every hour, the oper-
ator at Operation 270 measured four pistons, two from each of the parallel grinders. A des-
ignated team member marked these pistons so that they were uniquely identified, recorded
the measured diameter, and set the four pistons aside. After six hours, he removed the pistons
to the final gage room and let them come to ambient temperature. During the next shift,
the final gage operator measured the 24 pistons and recorded the diameters.
We use the language introduced in the previous section to examine this plan in detail.
What is the study population? Measurements could be made on one day. As well, only the 24
selected pistons could be measured with both gages. There is a possibility of study error if the
relationship between the two systems changed over time (day to day) or was dependent on the
nature of the pistons being measured. To help control this source of study error, the team
made a good decision to use pistons from both Operations 270A and 270B.
The sampling protocol specifies which units (acts of making a measurement) will be
selected from the study population. Here, the team decided to use 24 pistons and measure
each twice at specified times. We can criticize the step in which all 24 pistons were measured
by the final gage over a short time. There may be sample error if the 24 pairs of diameters did
not represent the relationship between the two gages on the day of the investigation. In terms
of sample error, the team could have improved the plan by making the measurements with the
final gage throughout the day.
For each piston, the team recorded the diameter from both gages, the time of measure-
ment and the process stream (A or B) at Operation 270. They ignored all other process
characteristics such as other dimensional and physical properties of the pistons, the ambi-
ent temperature at Operation 270, and so on. The team did not change any inputs, so this
was an observational plan.
Since this was an investigation of the measurement systems, there was no need to consider
the issue of measurement systems separately.
The logistics of who does what and when were well organized, since one person was
assigned responsibility for executing the plan and storing the data in row/column format in
a spreadsheet, convenient for subsequent analysis.
The team executed the plan without difficulty. They collected and stored the data in the
file V6 piston diameter gage comparison.
The analysis consisted of simple numerical and graphical summaries of the data. We
show a scatter plot of the pairs of measurements from the sampled units in Figure 5.5 and
give the average diameter by gage over all 24 pistons in Table 5.1.
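The comparison of the two gages amounts to analyzing paired differences. The sketch below uses invented paired measurements, not the team's file, to show the calculation:

```python
import statistics

# Hypothetical paired recorded diameters (microns) for six pistons:
# (Operation 270 gage, final gage).
pairs = [
    (588.0, 589.6), (592.5, 594.1), (590.2, 591.5),
    (586.8, 588.4), (593.1, 594.7), (589.4, 590.8),
]

diffs = [final - op270 for op270, final in pairs]
bias = statistics.mean(diffs)     # systematic difference between the gages
spread = statistics.stdev(diffs)  # how consistent that difference is
print(f"final gage reads higher by {bias:.2f} microns on average (sd {spread:.2f})")
```

Working with the differences removes the piston-to-piston variation, so even a small systematic bias between gages stands out clearly.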
The team concluded there was a strong relationship between diameters measured on
the two gages. However, the final gage gave a measured diameter that was systematically
higher by about 1.5 microns on average. The difference was small but unexpected.

Figure 5.5 Scatter plot comparing sample diameters at the final and Operation 270 gages.

The team later explained the difference when they discovered that steel, not aluminum, masters
were used to remove the effect of temperature changes from the Operation 270 gage.
The team’s only reservation was that the investigation had been limited to one day and
there was some concern that the systematic difference between the two gages might
become larger on a hotter day.
The team used the knowledge gained to recommend a new calibration procedure for
the Operation 270 gage.
was some environmental factor such as ambient temperature or humidity that changed
from day to day and could not be easily controlled. More ghosting problems seemed to
occur on hot, humid days. She also knew that the ghosting appeared under the tape during
the baking of the color coat and that the problem seemed more frequent with certain
colors. Two other tape suppliers claimed that their products would not produce ghosting.
She decided to investigate whether the other sources of tape were robust to the effects of
the environmental factors and color. In the jargon of the variation reduction algorithm, she
chose the Desensitize the Process approach. The engineer decided to carry out an experi-
mental plan to compare the performance of the current tape, here denoted C, against the
two other possibilities, A and B.
A unit was a fascia and the target population was all fascias to be made in the future.
The output characteristic was the degree of ghosting and the key input characteristic was
the type of tape used to mask. The three attributes of interest were the proportions of
fascias with ghosting score 1 if the tape type was A, B, or C. The three questions of interest were:
What is the proportion of fascias in the future production with score 1 if tape A [or
B or C] is used?
The engineer knew that she required an experimental plan since the tape type was a fixed
input that could not change without intervention. Because of the high cost of scrap, she
decided to use fascias that had been scrapped upstream of the masking operation for her
study population. She also decided to use only those fascias produced on a hot, humid day
and to paint these fascias with the color having the most frequent ghosting problem. Fur-
thermore, she planned to mask three small areas, labeled I, II, and III on the primed surface
on both sides of each fascia with the three different tapes. See Figure 5.7.
The study unit was one of the small areas on the scrap fascia. The study population was
all such units available on the selected day. The study units and conditions under which
they were processed were very different from the target population. There was likely to be
study error because the conditions were selected to be extreme, so the degree of ghosting seen
in the investigation was likely to be high. As well, the engineer knew that she was unlikely
to get a large number of scrap fascias for her investigation, so she expected a relatively
small sample size.
Accordingly, she decided to revisit the Question, which was changed to the following:
Under extreme conditions, what is the average ghosting score for tapes A, B, and C?
Figure 5.7 The three masked positions, labeled I, II, and III, on each side of a fascia.
Note that these new questions involved averages, not proportions. The engineer thought
that if one tape performed better on average in the extreme conditions, then this tape was
likely to produce a higher proportion of fascias with score 1 in the future.
Going back to the Plan step, she decided to use all available scrap fascias (with a
maximum of 15) produced on one day. She would choose the day based on a weather forecast.
She planned to record the fascia number, the side, the position, the tape type, and the
ghosting score for each unit in the sample. She had already established a reliable measure-
ment system for ghosting.
Since this was an experimental plan, she had to decide how she was going to select
which tape was applied to each unit. Her plan was to use all three tapes on both sides of each
fascia available. Within a side, she would assign the tapes to the three positions at random.
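The randomization within each side can be sketched directly; the fascia count and the side labels below are our illustrative choices:

```python
import random

random.seed(4)

tapes = ["A", "B", "C"]

# For each side of each available fascia (up to 15), assign the three tapes
# to positions I, II, and III in a random order, as in the engineer's plan.
plan = {}
for fascia in range(1, 16):
    for side in ("left", "right"):
        order = random.sample(tapes, k=3)
        plan[(fascia, side)] = dict(zip(["I", "II", "III"], order))

print(plan[(1, "left")])   # e.g. which tape goes on each position
```

Randomizing position within each side guards against a position effect (say, more ghosting near one edge) being mistaken for a tape effect.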
The logistics were critical to the success of the plan. The engineer involved an operator
in deciding how to mask the small areas. She carefully labeled the tapes to avoid confusion.
She prepared 15 schematics like Figure 5.7 that showed which tape went on each position.
She planned to give these schematics to the operator one at a time to help him use the cor-
rect tape to mask each area in the sample. She would identify each fascia with a number
and write it on the schematic. When the operator assessed ghosting, he could write the
score on the sheet above each box.
She notified the process owners of the plan to ensure that the experimental fascias
would be set aside after painting and to avoid interference with normal production proce-
dures. A time to run the experimental parts could be set up once the day was selected.
The plan was executed without any hitches. Twelve fascias were processed and a total
of 72 ghosting scores were recorded. The data were transferred from the schematics to the
file fascia ghosting robustness. Table 5.2 is a summary of the data collected.
Table 5.2 Counts of ghosting scores by tape type.

Tape      1     2     3     4     5     Average
A        16     7     1     0     0     1.375
B         5     9     7     2     1     2.375
C         7     9     6     2     0     2.125
Under the extreme conditions of the plan, tape A was clearly superior with a much
lower average score. All but one area had a score of 1 or 2. The major limitation was study
error; it was possible that tape A would not perform better than tapes B or C under normal
operating conditions.
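The averages in Table 5.2 are weighted averages of the scores 1 through 5 by their counts, which we can verify directly:

```python
# Score counts from Table 5.2: number of masked areas at each ghosting
# score (1 = best) for each tape type; 24 areas per tape, 72 in all.
counts = {
    "A": [16, 7, 1, 0, 0],
    "B": [5, 9, 7, 2, 1],
    "C": [7, 9, 6, 2, 0],
}

averages = {
    tape: sum(score * k for score, k in zip(range(1, 6), c)) / sum(c)
    for tape, c in counts.items()
}
print(averages)  # {'A': 1.375, 'B': 2.375, 'C': 2.125}
```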
The engineer validated the performance of tape A under normal operating conditions
with another application of QPDAC. Since the costs of the three brands of tape were sim-
ilar, management accepted the engineer’s recommendation to switch to tape A. In the
longer term, the proportion of fascias with any detectable ghosting (score >1) fell to less
than 0.3%.
5.3 SUMMARY
We summarize the terminology used within QPDAC as follows:
QPDAC Terminology
Attribute: Numerical or graphical summary of the characteristics over a collection of units
Experimental plan: Plan where some inputs are deliberately changed on the sample units
Measurement error: Difference between the measured and true value of a characteristic
Measurement system: Gages, people, methods, material, and environment involved in measuring a characteristic
Observational plan: Plan where all inputs vary naturally on the sample units
Sample: Units selected from the study population and measured in the Data step
Sample error: Difference between the attribute in the study population and sample
Sampling protocol: Procedure by which the sample is selected from the study population
Study error: Difference between the attribute in the study and target populations
Study population: Collection of units available for investigation
Target population: Collection of units produced by the target process that we want to draw conclusions about, usually all units produced now and in the future
Unit: A part or the act of making a measurement
PART II
Getting Started
If you know a thing only qualitatively, you know it no more than vaguely. If you know
it quantitatively—grasping some numerical measure that distinguishes it from an
infinite number of other possibilities—you are beginning to know it deeply.
—Carl Sagan, 1932–1996
In this second part of the book, we explore the first three stages of the variation reduction
algorithm: Define Focused Problem, Check the Measurement System, and Choose
Working Variation Reduction Approach. We require a quantitative goal for the problem
and the full extent of variation of the output to help design and analyze investigations con-
ducted in the search for a dominant cause or a solution, and to check that a proposed solution
meets the problem goal. Next, we look at how to assess and improve measurement systems.
A measurement system is necessary to support all process investigations and may itself be
a dominant cause of variation. Finally, we discuss how to choose a working approach
among the seven variation reduction approaches. In any application, this choice is important
because it determines what we do next.
6
Defining a Focused Problem
Our plans miscarry because they have no aim. When you don’t know what
harbor you’re aiming for, no wind is the right wind.
—Lucius Annaeus Seneca the Younger, 4 B.C.–A.D. 65
There are two major tasks in the Define Focused Problem stage of the Statistical Engineering
algorithm. First, the team must translate the project and its goal into a problem
statement about excessive variation in a measurable output characteristic. The
problem should be specific so that it is likely that there will be only one or two dominant
causes. Second, the team must quantify the magnitude of the problem. We require this base-
line measure to:
• Help set the goal for the problem.
• Help design investigations and interpret the results as we proceed through the
algorithm.
• Allow validation that the problem has been solved.
The baseline is important in all subsequent stages of the algorithm.
69
The team started by turning the project into three problems. The engine plant inspected
every block for leaks using a pass/fail test. The team instituted a new test for failed blocks
that determined the location of the leak. Using a sample of 100 leaking blocks, they found
that 92% of the leakers fell into three classes based on the location of the leak, as shown in
Table 6.1.
The team suspected that each class of leak would have its own dominant cause. Hence
they defined three problems based on the three classes. In each case, they set the ambitious
goal of eliminating the entire leak class. If they could achieve these goals completely, they
would far surpass the project goal.
Some projects are defined in terms of processes that are replicated at different sites or
in terms of a class of products. We may choose to focus the problem by concentrating on a
single product or manufacturing line. In the truck alignment problem described in Chapter
1, several assembly plants built the same truck using the same components and assembly
process. To concentrate resources, upper management assigned the project to a team in one
plant. The idea was to apply the same solution to all plants.
In specifying a problem, we may need to define an output characteristic that can be
measured locally. A team at an engine assembly plant was charged with reducing warranty
claims due to excessive oil consumption. There were only a few claims, but the plant man-
agement initiated the project because of the potential damage to the long-term reputation
of the engine. To define a specific problem, the team first spent considerable time and effort
developing a dynamometer test that could reproduce in the lab the failure mode seen in the
field. This effort was necessary because the field failures were so rare. The team specified
the problem in terms of the dynamometer measurement. They were confident that if they
could reduce oil consumption in the dynamometer test, then they could eliminate the field
failures.
We try to specify the problem in terms of a continuous output to improve the effi-
ciency of the problem solving. If we use a discrete output, such as pass/fail, we require
larger sample sizes for all of the subsequent investigations. For example, in the engine
block porosity project described in Chapter 3, the project goal was to reduce the scrap
rate from about 4% to below 1%. The team invented a new output, a porosity score based
on the size, location, and number of holes on the surface. A block that was scrapped had
a high score, but more importantly for the problem solving, every block could be assigned
a porosity score that reflected the severity of the problem on a continuous scale. The team
defined the problem goal in terms of reducing the variation in the porosity score.
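A rough simulation (our own illustration, with invented numbers) shows the efficiency gain: at the same sample size, the mean of a continuous score is estimated with far better relative precision than a 4% scrap rate estimated from pass/fail counts.

```python
import random
import statistics

random.seed(5)

# Hypothetical porosity process: scores roughly normal, with blocks above
# a threshold (about 4% of them) scrapped.
mean_score, sd_score, threshold = 50.0, 10.0, 67.5   # P(score > 67.5) ~ 0.04
n, trials = 100, 2000

prop_estimates, mean_estimates = [], []
for _ in range(trials):
    scores = [random.gauss(mean_score, sd_score) for _ in range(n)]
    prop_estimates.append(sum(s > threshold for s in scores) / n)
    mean_estimates.append(statistics.mean(scores))

# Relative precision: standard error of the estimate divided by its target.
rel_se_prop = statistics.stdev(prop_estimates) / 0.04
rel_se_mean = statistics.stdev(mean_estimates) / mean_score
print(f"relative se: scrap rate {rel_se_prop:.2f}, mean score {rel_se_mean:.3f}")
```

With 100 units per investigation, the estimated scrap rate bounces around by roughly half its own size, while the mean score is pinned down to within a few percent; that is the practical payoff of inventing a continuous output.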
We need to be careful defining the output. For instance, in a problem with seat cover
appearance, the team measured shirring on a six-point scale using boundary samples.
Scores of 1 to 4 were acceptable, and high scores came from either too much or too little
shirring. With shirring score defined in this way, the team found it difficult to find a domi-
nant cause of shirring variation, because both high and low values of the cause led to high
scores. The choice of output forced the team into the Make Process Robust approach,
which failed. See the Chapter 19 exercises for further details.
We need to be able to measure the output characteristic quickly. We sometimes aggra-
vate or accelerate the usage conditions so that the problem occurs sooner. In the oil con-
sumption example, the team created a dynamometer test with aggravated conditions in an
attempt to quickly simulate use of the engine under extreme field conditions. There can be
considerable difficulties linking the original management project goal to a goal for the out-
put measured using an aggravated or accelerated test. We want to avoid study error where
the cause or the solution of the problem under the aggravated conditions is not the same as
under the normal conditions. To avoid study error, we need to check that the field failure
mode can be replicated using the aggravated test.
We can sometimes use aggravation to deal with a problem with a binary output. For
example, in a painting process, there were line defects on the roof of about 10% of the
vehicles painted. The team used panels with an increased clear coat film build to make
the defect occur more often. Using the panels under the aggravated conditions, they felt
they could find clues about the dominant cause more quickly since the defect occurred
more often. This advantage had to be balanced against the risk that the dominant cause
of line defects on panels with increased clear coat film build (that is, under the aggra-
vated conditions) was not the same as the dominant cause of lines on vehicle roofs with
the normal clear coat film build.
To illustrate the transition from a project to problem(s), consider a project to reduce
scrap in a process that produced piston rods. Based on past performance, the monthly scrap
rate averaged 3.2%. Management set a goal of reducing the rate to less than 1.6%. We show
a finished rod in Figure 6.1.
Looking at process records, the team found that scrap occurred at several processing
steps. To focus the problem, they used Pareto analysis on the records from one month, as
shown in Figure 6.2. They found that 62% of the scrap was identified at a grinding operation.
Further investigation revealed that about 90% of the grinding scrap was due to undersized
rods. After grinding, the thickness was measured on every rod at the four locations, marked
(with small white circles and faint numbers) in Figure 6.1. A rod was scrapped if the thick-
ness at any location was below the lower specification limit. Rods with thickness above the
upper specification limit were reworked. The team focused their attention on reducing
variation in rod thickness, a continuous output.
The team set the problem goal to produce all rods within the thickness specification.
Achieving this goal would eliminate 90% of the scrap at the grinder, or 56% (90% of 62%)
of the total scrap, and hence meet the project goal.
In summary, the key elements in focusing a project to one or more problems are:
• Identify and address the most important failure modes.
• Replace a binary or discrete output characteristic by a continuous one, if possible.
• Define the problem in terms of an output that can be measured locally and
quickly.
• Choose the problem goal to meet the management project goal.
If multiple problems arise from a single project, we recommend that the team address
the problems one at a time. We need to be careful that, in improving the process with
respect to one output, we do not make it worse with respect to another, unless the gain out-
weighs the loss.
We find it helpful to think about what, where, when, and to what extent the problem
exists. We can use a Pareto analysis to help focus the project to a problem that is both nar-
rowly defined and a major contributor to the concern that generated the original project.
Figure 6.2 Pareto analysis of scrap by operation.

Defect     Grind   Bore   Broach   Assembly   Others
Count         85     24       14          6        2
Percent     64.9   18.3     10.7        4.6      1.5
Cum. %      64.9   83.2     93.9       98.5    100.0

Figure 6.3 Hierarchy of projects, problems, and questions.
Figure 6.3 shows the hierarchy we propose to define projects, problems, and questions.
Company management initiates projects using concerns regarding quality and cost. The
improvement team focuses the project into one or more problems. The team then attacks
each problem using the Statistical Engineering variation reduction algorithm given in
Chapter 4. In each application of the algorithm, the team will ask many questions about the
process behavior. They address these questions using the QPDAC framework described in
Chapter 5.
Problem                                              Baseline measure(s)
Unit-to-unit variation too large, two-sided specs    Standard deviation, histogram, Ppk
Unit-to-unit variation too large, one-sided specs    Average, standard deviation, histogram, one-sided Ppk
In the rod thickness example, the team selected a histogram with specification limits
as the baseline. See Figure 6.4. We discuss the investigation to produce this histogram in
the next section. The team set the problem goal to reduce variation in thickness so that the
histogram would fall entirely between the specification limits 10 to 60.
The data are found in the file rod thickness baseline. The thickness was recorded in
thousandths of an inch, measured as the deviation from 0.900 inches.
The standard deviation in the baseline investigation was 11.0. Since the process is roughly
centered and the histogram is bell-shaped, the team needed to reduce the standard deviation
by about 25% so that six times the standard deviation matches the specification range.
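That arithmetic can be checked directly (a quick sketch; the limits 10 and 60 and the standard deviation 11.0 are taken from the example above):

```python
# Rod thickness example: specification limits and baseline standard deviation.
lower_spec, upper_spec = 10, 60
baseline_sd = 11.0

# For a roughly centered, bell-shaped output, the histogram fits within the
# specifications when six standard deviations match the specification range.
target_sd = (upper_spec - lower_spec) / 6      # about 8.3
reduction = 1 - target_sd / baseline_sd        # about 0.24, i.e. roughly 25%
```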
Figure 6.4 Histogram of rod thickness from the baseline investigation (specification limits 10 to 60).
There are many choices for a baseline measure. We once asked a process engineer how
he could tell if a planned process change would lead to an improvement. After some
thought, he told us that he would receive fewer phone calls per week from his customers.
In a problem with a system to measure silicon concentration in cast iron, the baseline was
the measurement system R&R (repeatability and reproducibility). The problem goal was to
reduce R&R to less than 20% of the process variation. See Chapter 7 and its supplement for
more information on assessing measurement systems.
As part of establishing the baseline for a continuous output, we determine the full extent
of variation in the output. We use the full extent of variation in planning and interpreting the
results of subsequent process investigations. When looking at a histogram of the output, we
define the full extent of variation as the range within which the vast majority of values lie.
The range (minimum to maximum) defines the full extent of variation when the sample size
is large (that is, the sample size is in the hundreds) and there are no outliers. For the rod
thickness example, the full extent of variation is 2 to 59 thousandths of an inch.
More generally, for a histogram with a bell shape (as given in Figure 2.11), the full
extent of variation corresponds to the range given by the average plus or minus three times
the standard deviation. Defined in this way, the full extent of variation covers most of the
output values.
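The two definitions of the full extent of variation can be sketched in code (Python is used here for illustration; the book's own analyses use MINITAB):

```python
import statistics

def full_extent_of_variation(values, bell_shaped=True):
    """Full extent of variation of an output: average plus or minus three
    standard deviations for a bell-shaped histogram, or the observed range
    for a large sample with no outliers."""
    if bell_shaped:
        avg = statistics.mean(values)
        sd = statistics.stdev(values)
        return avg - 3 * sd, avg + 3 * sd
    return min(values), max(values)
```

Either way, the resulting interval should cover the vast majority of the output values.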
Sometimes, we use the baseline investigation to generate clues about the dominant cause
of the variation. For the rod line example, 40 rods were measured at the four positions each
day for five days. We summarize the day-to-day performance using box plots in Figure 6.5.
We see most of the full extent of variation within each day. We can draw two important
conclusions from this observation:
• The dominant cause of variation changes within days and we can rule out
slowly varying causes that change from day to day or over a longer time frame.
• In subsequent investigations to search for a dominant cause, we can set the
study population to be rods produced on a single day.
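The reasoning behind these conclusions can be quantified by comparing the pooled within-day standard deviation to the overall standard deviation. The values below are fabricated for illustration (they are not the rod thickness data):

```python
import statistics

# Illustrative thickness values grouped by day (made up for this sketch).
by_day = [
    [25, 31, 42, 18, 36],
    [26, 44, 33, 39, 19],
    [30, 44, 21, 33, 26],
]

all_values = [v for day in by_day for v in day]
overall_sd = statistics.stdev(all_values)

# Pooled within-day standard deviation (equal sample sizes per day).
within_sd = statistics.mean(
    statistics.variance(day) for day in by_day
) ** 0.5

# A ratio near 1 means most of the variation appears within each day,
# ruling out causes that vary only from day to day or more slowly.
ratio = within_sd / overall_sd
```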
Figure 6.5 Rod thickness by day (dashed lines show specification limits).
We need to keep the investigation simple, so we pay little attention to the third purpose,
generating clues about the dominant cause.
We discuss common issues in baseline investigations using the QPDAC framework
with the rod line scrap reduction problem as the example.
Question
The target population is all units produced by the process now and in the future (assuming
no changes are made). When we focus the problem, we specify the output characteristic. In
the rod line example, the output is the rod thickness measured at any of the four locations.
The attribute of interest (we call this the baseline) is a measure of the process performance
related to the goal. In the rod example, the team chose the histogram of the thickness in the
target population as the baseline.
Plan
We need to specify a study population from which we will collect a sample of units. The
big issue is the period of time over which the study population extends. In the example, the
team decided that the study population was all rods ground within a week. They felt that a
single week was sufficient to see the full extent of variation in the target population. There
is always a trade-off between avoiding study error with a longer period and the cost and
time it takes to complete the baseline investigation.
Remember, the goal of the baseline investigation is to understand the process perform-
ance as the process currently operates. As a result, process adjustments should be made
according to the current control plan.
We choose a sampling protocol that spreads the sample units across the study population;
in other words, we sample across the time period selected. In most cases, a combination of
a systematic and haphazard method is used. If the output is measured on all units and the
data are stored automatically, we can use the whole study population. In the rod example,
40 rods were selected haphazardly throughout each day for five days.
We specify the sample size to be large enough to avoid substantial sample error. As a
rule of thumb, we suggest sample sizes of hundreds of units if the output characteristic is
continuous and thousands if the output is binary. These sample sizes may seem large, but
tables 6.3 and 6.4 demonstrate the difficulty of estimating a small proportion and the
advantage of a continuous output.
In Table 6.3, we show roughly how well we can estimate (with 95% confidence) a stan-
dard deviation as a function of the sample size.1
Sample size    Precision (95% confidence)
50             ±20%
100            ±14%
200            ±9%
500            ±6%
1000           ±5%
In the rod thickness example, we can estimate (with 95% confidence) the baseline standard
deviation in the study population to within about ±5% with a sample of size 800.
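The entries in Table 6.3 follow approximately from the large-sample result that an estimated standard deviation has 95% relative precision of about 1.96/√(2n); this sketch reproduces the table to within a percentage point (the exact values use a chi-squared interval):

```python
import math

def sd_relative_precision(n, z=1.96):
    """Approximate 95% relative precision (as a fraction) when estimating
    a standard deviation from a sample of size n."""
    return z / math.sqrt(2 * n)

# Roughly: n=50 -> 20%, n=100 -> 14%, n=200 -> 10%, n=500 -> 6%, n=1000 -> 4%
precisions = {n: sd_relative_precision(n) for n in (50, 100, 200, 500, 1000)}
```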
For a binary output, we cannot give such a simple table since the precision of the estimate
depends on both the sample size and the unknown defect rate. In Table 6.4, we give the relative
precision (95% confidence) for estimating a proportion defective for a given sample size.
In the example, the known baseline scrap rate is about 2%. Since the team exam-
ined only 200 rods in the investigation, the sample proportion of defectives is almost
useless as a baseline measure since it is so imprecise. If instead they had used a baseline
sample of 2000 rods, they would have estimated the proportion to within about
±0.6% (2% × 0.31).
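The ±0.6% figure follows from the usual normal-approximation confidence interval for a proportion; a sketch:

```python
import math

def proportion_half_width(p, n, z=1.96):
    """Half-width of the approximate 95% confidence interval for a
    proportion p estimated from a sample of n units."""
    return z * math.sqrt(p * (1 - p) / n)

# With 2000 rods and a 2% scrap rate:
half_width = proportion_half_width(0.02, 2000)   # about 0.006, i.e. +/-0.6%
relative = half_width / 0.02                     # about 0.31, as quoted

# With only 200 rods, the half-width is about 0.019, nearly as large as
# the 2% rate itself, which is why that estimate is almost useless.
```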
In the baseline investigation, we must measure the output and input characteristics that
define the problem. We can decide to measure other characteristics, such as the time the
unit is produced, to generate clues about a dominant cause. We do not need to assess the
measurement system at this point because we want to include variation due to the measure-
ment system in the baseline.
Data
The plan is implemented in the Data step. In the rod line example, the data were collected
according to the plan without incident. The data are given in the file rod thickness baseline.
We store the data in MINITAB using the row/column format convenient for statistical analy-
sis as shown in Figure 6.6. Each row corresponds to a single measurement, and the columns
give the corresponding day, batch, position, thickness, and rod number. See Appendix A for
more on row/column data storage.
Analysis
To quantify performance, we use the sample attribute corresponding to the population attri-
bute selected in the Problem step. In the rod line example, we gave the sample histogram
(Figure 6.4) and estimated the standard deviation as 11.0. We also look at other simple
numerical and graphical summaries such as box plots and run charts to look for unusual val-
ues and patterns in the data.
For example, suppose we collect data as in the rod line investigation and we see a run chart
as in Figure 6.7. Since there is an obvious trend, we would worry that the one-week study pop-
ulation was not long enough to capture the full extent of variation in the target population.
Similarly, if there were a large day effect, for example as illustrated by Figure 6.8, we
would worry about study error. In this case, five days is not long enough to obtain a good
Figure 6.7 Run chart of rod thickness showing an obvious trend.

Figure 6.8 Box plots of rod thickness by day showing a large day effect.
estimate of the long-term process variation (target population). Here, since the variation
within each day is so small, the effective sample size for estimating the overall variation is
closer to five (the number of days) than 800 (the number of thickness measurements). With
a large day effect, the additional measurements taken each day give information about the
within day variation but do not provide much more information about the overall variation.
Conclusion
In drawing conclusions from the baseline investigation, we report the estimates of the
process attributes of interest such as the standard deviation, histogram, and run chart. In
addition, we note any material limitations due to possible study or sample error. As illus-
trated by figures 6.7 and 6.8, we may discover these limitations from the sample data. We
may also have limitations due to concerns about the plan. If the sample size is small, we
worry about sample error. In the rod example, the baseline is given by Figure 6.4. The full
extent of variation is 2 to 59 thousandths of an inch.
Question
The purpose of the investigation is to:
• Estimate the baseline, an appropriately chosen attribute of the current process
• Determine the full extent of variation of the output characteristic
The team must select an appropriate baseline—for example, a histogram, a standard devia-
tion, or a proportion.
Plan
• Choose a study population covering a period long enough to see the full extent of variation
in the output.
• Determine what outputs and inputs to measure. The inputs should include the time of production.
• Select a sample well spread across the study population with respect to time and other
(possibly) important inputs such as machine, position, and so on. The sample size should be
hundreds of parts for continuous outputs and thousands of parts for binary outputs.
Data
Record the input and output values with one row for each output value measured (row/column
format).
Analysis
• Summarize the data using the appropriate sample performance measure(s). For:
–a continuous output, use the average, standard deviation, histogram, and run chart
–a binary output, use the proportion defective and a run chart
• Check for patterns in the output over time (and possibly other inputs).
• Check for outliers.
• Estimate the full extent of variation in the output.
Conclusion
• State the problem and goal in terms of the estimated performance measure(s).
• Determine the minimum time required to see the full extent of variation.
• Consider possible study and sample error.
Comments
We can use existing data to establish a baseline. With the QPDAC framework we can exam-
ine the data and how they were collected to ensure that we have confidence that the calcu-
lated baseline reflects the true long-term process performance. For example, suppose the
output is monitored using a control chart (Montgomery, 1996; Ryan, 1989). We can select
an appropriate time period (that is, a study population) from the recent past and use the con-
trol chart data over that period. We do not require the process to be stable, as defined by the
control chart, to estimate the problem baseline.2
Some projects will generate several problems, defined in terms of different outputs. We
may be able to plan and carry out a single investigation that simultaneously establishes the
baseline and full extent of variation for each problem. We need to be careful to avoid study
and sample error for each of the outputs.
6.4 EXAMPLES
We give two further examples of focusing the problem and estimating the baseline.
Truck Pull
We described a project in Chapter 1 where management identified truck alignment as a key
issue based on a Pareto analysis of warranty costs and customer quality surveys. They set a
project goal to match the performance of a competitor based on these surveys. The same
truck was built at several plants. To concentrate resources, the management established a
team at one plant to work on the project. They assumed that any remedy found at that plant
could be applied to the others.
From mathematical modeling of the truck geometry, the team knew that customers could
detect pull, a torque on the steering wheel. Pull is a function of the front wheel alignment as well
as many other characteristics such as tire design and pressure, and road camber and condition.
Within the assembly plant, pull was measured as a function of caster and camber only. We have
pull = 0.23*(left caster – right caster) + 0.13*(left camber – right camber) (6.1)
The specification limits for pull are 0.23 ± 0.35 Newton-meters. The plant measures
and records caster and camber on every truck. Any truck with pull outside the specification
limits is repaired before release.
To focus the problem, the team started by relating pull as measured in the plant to war-
ranty costs due to alignment problems. They divided the pull specification range into seven
classes, each of width 0.10. Using historical data on the alignment characteristics and a
warranty database, the team grouped trucks into the seven classes, and for each class cal-
culated the average warranty cost due to alignment issues. Figure 6.9 shows that warranty
costs are higher for the extreme classes, corresponding to pull values near the specification
limits. This suggests that warranty costs can be reduced by reducing variation in pull. The
team assumed reducing pull variation would improve customer satisfaction as measured by
the quality surveys.
To establish a baseline, the team selected data from the previous two months. The data
are given in truck pull baseline. They felt that this time period was long enough so that they
would see the full extent of the variation in the process. The histogram for these data is
shown in Figure 6.10, with the specification limits given by the dashed lines.
The team also constructed box plots of pull by day as shown in Figure 6.11. We see
clear evidence of some day-to-day variation, with drifts in the pull center over time.
Figure 6.9 Average warranty cost by pull class.

Figure 6.10 Histogram of pull from the baseline (dashed lines show the specification limits).

Figure 6.11 Box plots of pull by day.
The team set the problem goal to reduce the standard deviation of pull from 0.082 to
0.050. If they could achieve this goal, almost all trucks would leave the plant with pull in
classes 3, 4, and 5, and warranty costs would be reduced.
To further focus the problem, the team used Equation (6.1). Pull is a simple function of
cross caster (the difference between left and right caster) and cross camber (the difference
between left and right camber). The team examined summaries of the cross caster and cross
camber in the baseline data. Using Equation (6.1) and ignoring the small correlations (the
correlations between the right and left sides of both camber and caster are around –0.2, and
the correlation between cross camber and cross caster is around 0.2), we have

stdev(pull) = √[(0.23)² stdev(cross caster)² + (0.13)² stdev(cross camber)²]   (6.2)
The team could accomplish the goal if they reduced the variation in both right and left
caster by half. They decided to address right caster first, hoping that any process improve-
ments could also be made to the left side.
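Equation (6.2) also lets us check the team's reasoning. The cross-caster and cross-camber standard deviations below are illustrative assumptions chosen to be consistent with the baseline figures quoted in the text (pull standard deviation 0.082, right caster standard deviation 0.24), since the original data summary is not reproduced here:

```python
import math

def pull_sd(sd_cross_caster, sd_cross_camber):
    """Equation (6.2): standard deviation of pull, ignoring the small
    correlations among caster and camber."""
    return math.sqrt((0.23 * sd_cross_caster) ** 2 +
                     (0.13 * sd_cross_camber) ** 2)

# Assumed values: left and right caster each with sd 0.24 and roughly
# independent, so cross caster has sd sqrt(2) * 0.24; cross camber sd 0.20.
sd_cross_caster = math.sqrt(2) * 0.24
sd_cross_camber = 0.20

baseline = pull_sd(sd_cross_caster, sd_cross_camber)      # about 0.082
# Halving both left and right caster sd halves the cross caster sd:
improved = pull_sd(sd_cross_caster / 2, sd_cross_camber)  # about 0.047, below the 0.050 goal
```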
We give a histogram of right caster in Figure 6.12. There is a large number of trucks in
the data set, so we need to be careful when defining the full extent of variation. The team
decided not to use the observed range since there were several outliers, as shown in the box
plot in Figure 6.12. Rather, looking at the histogram and the numerical output, they defined
the full extent of variation as 3.80° to 5.25° (4.52 ± 3*0.24).
In summary, the team focused the project as defined by the management to reducing
variation in right caster from the baseline standard deviation of 0.24° to 0.12°. They connected
the management and problem goal through an investigation of the warranty database. They
could not connect the problem directly to the customer satisfaction surveys. They estab-
lished the full extent of right caster variation. They were confident that there was little
Figure 6.12 Histogram and box plot of right caster from baseline.
study error in their conclusions since they believed two months of data described the per-
formance of the alignment process in the long term.
Pump Noise
A finished vehicle audit at an assembly plant repeatedly detected unacceptably noisy wind-
shield washer pumps. The plant charged for the repairs and put great pressure on the pump
manufacturer to solve the problem. The pump manufacturer established a project to eliminate
complaints due to noisy pumps.
A team began by developing a noise measurement system that could be used in the
manufacturing facility before the pumps were assembled into vehicles. Each member of the
three-person team subjectively assessed 24 pumps in vehicles for noise using a five-point
scale. The vehicles selected were both acceptable and unacceptable to the vehicle assembly
plant. The average score for each pump was recorded. The 24 pumps were then removed
and the noise was measured using the new in-plant system. After some adjustments, the
team was able to achieve a strong correlation between the subjective human and the in-
plant measurement systems (see Figure 6.13). The data from the final measurement inves-
tigation are given in pump noise measurement. Using the subjective measure, the team
judged that a score of 4 or greater was unacceptable to the customer. Accordingly, they set
a limit of 8 as the upper noise specification as determined by the in-plant system.
The team was confident that they could detect noisy motors in their facility using the
new measurement system. Over a two-day period, they selected 100 motors haphazardly
from the current production and measured the noise level. The data are given in pump noise
baseline. The baseline histogram is given in Figure 6.14.
In the sample, 18% of the motors had a measured value exceeding the new specifica-
tion limit. The problem goal was to reduce this percentage to 0. The full extent of variation
in the baseline using the in-plant measurement system is 0 to 15.
Figure 6.13 Correlation between in-plant and subjective pump noise measurement systems.
Figure 6.14 Histogram of pump noise from the baseline investigation.
The team did not consider possible study or sample error. They assumed that the sample
of 100 motors from a two-day period would accurately describe the long-term performance
of the process. They were driven to proceed by the urgent nature of the problem.
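Table-6.4-style arithmetic shows why that was a risk; a quick sketch of the precision of the 18% estimate from only 100 motors:

```python
import math

p_hat, n = 0.18, 100
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
# half_width is about 0.075: the 18% baseline is known only to within
# roughly +/-7.5 percentage points, so sample error is a real concern.
```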
In either case, we must be careful that the time frame for the study population is long
enough to give an accurate picture of the current process performance.
We use the full extent of variation to help plan and analyze future process investigations.
In cases where there are outliers (unusual values) in the baseline investigation, we ignore
them in determining the full extent of variation, unless the outliers define the problem.
The full extent of variation for a binary output is given by the two possible values of the
output. The notion of the full extent of variation is not that helpful in planning investigations
in this case, other than to suggest we need to examine both defective and nondefective units.
Key Points
• A project should be translated into one or more specific problems, each of which:
–Has a single dominant cause
–Can be quantified in terms of a measured output characteristic
• The goal of each problem should be directly linked to the project goal.
• We estimate a problem baseline to quantify the goal, to assess a proposed solu-
tion, and to help in the search for a dominant cause and solution.
• For problems with a continuous output, we determine the full extent of variation
to help plan and analyze subsequent investigations.
7
Checking the Measurement System
When you can measure what you are speaking about, and express it in
numbers, you know something about it; but when you cannot measure it,
when you cannot express it in numbers, your knowledge is of a meager and
unsatisfactory kind. It may be the beginning of knowledge but you have
scarcely, in your thoughts, advanced to the state of science, whatever the
matter may be.
—William Thomson (Lord Kelvin), 1824–1907
Check the Measurement System is the second stage of the Statistical Engineering
algorithm. The purpose is to ensure that we have an effective measurement system
for the output characteristic that defines the problem. We assess the measurement
system for two reasons:
• The measurement system may be home to the dominant cause of variation.
• We use the measurement system to produce data in subsequent investigations in
the variation reduction algorithm.
The measurement system provides a window to view the process. If the window is
foggy, we are not able to see clearly what is going on. We must improve the measurement
system if we judge it to be inadequate. Sometimes, we can meet the problem goal just by
improving the measurement system.
There are many types of output characteristics and measurement systems. For exam-
ple, the output of the system may be as complex as a force versus time curve, or as simple
as a score on the scale from 1 to 5. In this chapter, we describe how to assess a nondestruc-
tive measurement system for a continuous characteristic. In the supplement, we discuss
variants on the plans to assess measurement systems for binary characteristics1 and for
destructive measurement systems.2
the output characteristic is the measurement error. We define two important attributes of the
measurement system for this target population:
• Measurement variation (also called precision, or better imprecision): the
standard deviation of the measurement errors
• Measurement bias (also called accuracy, or better inaccuracy): the average
measurement error
We can picture measurement variation and bias as in Figure 7.2. The idealized histogram
shows the measurement errors for all measurements in the target population. This is the set of
measurement errors if we repeatedly measured many parts over a long time period.
There are many other attributes of the measurement system that may be of interest.
For example, do measurement bias and variation depend on the true value of the part
being measured? Or do measurement bias and variation change over time? We can define
and estimate other attributes to look at these issues related to linearity and stability of the
system (Automotive Industry Action Group [AIAG], Measurement System Manual,
1995).3
For many gages, the supplier provides a standard deviation to indicate the capability of
the gage. However, these performance measures are determined under narrow conditions
and almost always underestimate the measurement variation as we have defined it over a
broad target population with varying parts, operators, environment, and so on.
For many problems, we do not assess measurement bias, especially when the problem
baseline is quantified in terms of a standard deviation. If we remove bias from the measure-
ment system and change nothing else, we shift the center of the output characteristic but do
not change the variation. If there are two measurement systems within the same process,
then we may look at the relative measurement bias of the two systems. See Section 7.3.
We start by discussing how to assess measurement variation.
Figure 7.2 Measurement bias and variation shown on an idealized histogram of measurement errors.
Since we are repeatedly measuring the same part, the variation in the measured values is
due solely to the variation in the measurement errors. We are assuming that the measuring
process does not change the true diameter, that is, that the measurement process is not destruc-
tive. If the five measurements were taken over conditions matching the target population, we
can use the standard deviation of the five values to estimate the measurement variation.
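That calculation is just the sample standard deviation of the repeated values. A sketch with five made-up diameter measurements of one part:

```python
import statistics

# Five repeated measurements of the same part (fabricated values).
repeats = [12.1, 12.4, 11.9, 12.3, 12.2]

# The true diameter does not change between measurements (the system is
# nondestructive), so this spread is due solely to measurement error.
measurement_sd = statistics.stdev(repeats)   # roughly 0.19
```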
We now describe an investigation to estimate the variation of the measurement system.
We use the camshaft journal measurement system as the example restricting our attention
to the final inspection gage and the front location on journal 1. In reality, the team looked
at all eight locations on both gages.
Question
The purposes of the investigation were to:
• Estimate the standard deviation of the measurement errors over a wide range
of parts, a variety of operators, changing environmental conditions, and a long
period of time
• Compare the measurement variation to the variation due to the rest of the process
Plan
The first step in the Plan is to define a study population. In other words, what measure-
ments can we possibly take in the investigation? We need to decide:
• Which parts to measure
• Which operators to include
• What time frame to use
We recommend using three parts with values spread across the full extent of output
variation seen in the baseline investigation. Here, the team used three camshafts with initially
measured diameter values –12.2, 0.9, and 12.8, as shown in Figure 7.3.
If several operators use the measurement system, then we recommend including at
least two operators in the study population. If the gage is automated so that it is known that
there is no operator impact, then we can use a single operator. In the camshaft example,
there were three operators, one from each shift.
Figure 7.3 Baseline performance for journal 1 front diameter. Arrows show diameter
values chosen for measurement investigation.
We can get some guidance from the baseline investigation to help select an appropriate
time frame for the study population. Ideally, we assess the measurement system over a
period long enough to see the full extent of variation given in the baseline. This ensures that
the dominant cause (if it is in the measurement family) has time to act over the course of
the investigation. In the example, the team selected a one-week period for the study popu-
lation. In our experience, most teams make the mistake of selecting a time frame that is far
too short because they want to finish the measurement assessment stage quickly.
We need to specify a sampling protocol to determine exactly what measurements will
be taken. We plan to measure each part a number of times. We recommend selecting at least
two time points within the chosen study period and having each operator measure each part
at least twice at each time point.
In the camshaft example, two days were chosen, one week apart. On both days, the
team planned to have each operator measure each camshaft three times. They would give
the parts to the operators in random order for each determination. A team member would
record the results. With this plan, each of the three parts is measured 18 times (2 days by 3
operators by 3 determinations). The sample has a total of 54 measurements.
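The counting in this plan is easy to check in code. The following Python sketch (an illustration of ours, not from the book; the function name and defaults are invented) enumerates the 54 planned measurements, with the parts presented in random order on each determination round:

```python
import itertools
import random

def build_plan(days=2, operators=3, determinations=3, parts=3, seed=1):
    """One row per planned measurement: (day, operator, determination, part)."""
    rng = random.Random(seed)
    plan = []
    for day, op, det in itertools.product(range(1, days + 1),
                                          range(1, operators + 1),
                                          range(1, determinations + 1)):
        order = list(range(1, parts + 1))
        rng.shuffle(order)  # random part order for each determination round
        for part in order:
            plan.append((day, op, det, part))
    return plan

plan = build_plan()
print(len(plan))  # 54 measurements: 2 days x 3 operators x 3 determinations x 3 parts
```

Each part appears 18 times in the plan (2 days by 3 operators by 3 determinations), matching the sample size described above.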
Other than day, operator, part number, and time, the team decided to record no other
characteristics. If they had suspected that the attributes of the measurement system would
change due to some environmental factor such as temperature, they could have recorded the temperature at each determination. If the measurement system is the home of the dominant
cause of the overall variation, the team can use these data to look for clues about why the
measurement variation is so large.
The plan should be executed under normal conditions. For example, the operators taking
the measurements should use their usual method, and gages and parts should not be spe-
cially cleaned.
It is a good idea to randomize the order of measurement within each time period so that
the operators cannot remember an earlier result for a particular part characteristic. This
point is especially important if there is a subjective element to the measurement system.
There are many other possible plans.4
Data
In the camshaft journal diameter example, the team executed the plan. The 54 measured
values and the corresponding time, operator, and part number were stored in the file
camshaft journal diameter measurement. Figure 7.4 shows some of the data stored in the
suggested row/column format (see Appendix A) in a MINITAB spreadsheet.
Analysis
We estimate the measurement variation by calculating the standard deviation of the meas-
urements made on each part. The variation in these measurements is due solely to variation
in the measurement errors. For the camshaft example, we calculate the average and standard deviation of the measurements on each part, and then pool the three within-part standard deviations to produce the estimate of the measurement variation, 0.756 microns. We can get this value directly from
a one-way analysis of variance (known far and wide as ANOVA) using MINITAB as
described in Appendix D.
We can also look at how the measurement variation changes over time, over parts, and
over operators. We find this analysis useful to look for outliers (see the supplement to
Chapter 5) and to generate clues about a dominant cause if the measurement variation is
too large.
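The "combine the within-part standard deviations" calculation is a pooled standard deviation, the square root of the ANOVA mean square error. A minimal Python sketch (our illustration; the data values below are made up and are not the camshaft file):

```python
import math

def pooled_sd(groups):
    """Pooled within-group standard deviation (sqrt of the one-way ANOVA MSE)."""
    ss = 0.0   # total within-group sum of squares
    df = 0     # total within-group degrees of freedom
    for g in groups:
        mean = sum(g) / len(g)
        ss += sum((x - mean) ** 2 for x in g)
        df += len(g) - 1
    return math.sqrt(ss / df)

# Toy data: repeated measurements on three parts (illustrative values only).
parts = [
    [-12.0, -12.4, -12.2, -11.9],
    [0.7, 1.1, 0.9, 0.8],
    [12.6, 13.0, 12.8, 12.9],
]
print(round(pooled_sd(parts), 3))  # 0.189
```

Because the variation within each part's repeated measurements is due solely to measurement error, this pooled value estimates the measurement variation directly.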
In Figure 7.5, we plot the measured values versus the part number (left panel) and,
more revealingly, the deviation from the part average versus part number (right panel). We
see that the variation due to the measurement system is roughly the same for the three parts
and that the measurement system can distinguish among these parts. There are no outliers.
In Figure 7.6, we show the results for the two weeks used in the example investigation.
We see no obvious changes in the measurement variation over time. If there were clear dif-
ferences over time, we would be concerned about study error and would recommend
repeating the plan over a longer time frame.
Figure 7.5 Diameter and diameter minus part average by part number.
Figure 7.6 Diameter and diameter minus part average by week.
Conclusion
We estimate the measurement variation to be 0.756 microns. We cannot interpret this stan-
dard deviation in terms of how close a measured value is likely to be to the true value because
we do not know if the measurement system has bias. Perhaps the easiest interpretation is to
say that if we measure the same part at two different times with different operators, then the difference in the two measurements is likely to fall in the range ±2√2 sd(measurement).5
In the camshaft example, this range is ±2.14 microns. Since we looked at only 54 meas-
urements, there is possible sample error in the estimate of the order ±20% (see
Table 6.2).
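Assuming the quoted range is ±2√2 times the measurement standard deviation (the difference of two independent measurements of the same part has standard deviation √2 sd(measurement), and a likely range is two of those either side of zero), the ±2.14 figure can be reproduced:

```python
import math

sd_meas = 0.756  # estimated measurement standard deviation, microns

# Difference of two independent measurements of the same part has
# standard deviation sqrt(2) * sd_meas; a likely range is +/- 2 of those.
half_width = 2 * math.sqrt(2) * sd_meas
print(round(half_width, 2))  # 2.14 microns
```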
To quantify the effect of measurement variation, recall from the model described in
Chapter 2 that we can partition the overall variation into two pieces—one due to the varia-
tion in the true values of the characteristic and the other due to the variation from the measurement system. That is, we have

sd(total) = √[sd(due to process)² + sd(due to measurement)²]   (7.1)
From the baseline investigation in the Define Focused Problem stage (discussed in
Chapter 6), we have an estimate of sd(total) that we used as a baseline to define the prob-
lem. We labeled this estimate stdev(total). From the measurement investigation, we esti-
mate sd(measurement). Then we can use Equation (7.1) to judge if the measurement system
is a major contributor to the overall variation.
In the camshaft example, the estimated total standard deviation is 6.055 (determined
from a baseline investigation) and the estimated standard deviation from the measurement
system is 0.756. Using Equation (7.1) with the estimates, we can solve for the contribution from the rest of the process, that is,

stdev(due to process) = √[stdev(total)² − stdev(measurement)²]

In the example, stdev(due to process) is √(6.055² − 0.756²) = 6.008. We see that the measurement system has very little impact on the overall variation.
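The subtraction in quadrature is easy to check numerically (a sketch using the estimates quoted in the example):

```python
import math

sd_total = 6.055  # baseline estimate of the overall standard deviation
sd_meas = 0.756   # estimated measurement standard deviation

# Variation adds in quadrature, so the rest-of-process contribution is:
sd_process = math.sqrt(sd_total ** 2 - sd_meas ** 2)
print(round(sd_process, 3))  # 6.008
```

Note how little the measurement system contributes here: removing it entirely would shrink the overall standard deviation only from 6.055 to 6.008.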
The effectiveness of the measurement system depends on the relative sizes of the vari-
ation due to the process and measurement. We summarize the measurement effectiveness
using the discrimination ratio D given by Equation (7.3):

D = stdev(due to process) / stdev(measurement)   (7.3)

Larger values of this ratio are better since we are better able to distinguish among parts using the measurement system.6
If D is less than about 2, the measurement system is the home of a dominant cause of variation. We should reformulate the problem in terms of the measurement system. In this case,
improving the measurement system may solve the original problem.
If D exceeds 3, then we know that the measurement system is not the home of a dom-
inant cause and we can proceed with the next stage of the algorithm and choose a working
variation reduction approach.
If D falls between 2 and 3, the measurement system is not the home of a dominant cause, but it should be improved (see Section 7.5). This recommendation may be
ignored depending on the nature of the problem and the difficulty and cost of improving the
measurement system. If the discrimination ratio is between 2 and 3, we may have difficulty
interpreting the data in future process investigations.
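These thresholds can be packaged as a small helper. This Python sketch is ours (the function names and wording of the verdicts are invented), applied to the camshaft estimates:

```python
import math

def discrimination_ratio(sd_total, sd_meas):
    """D = sd(due to process) / sd(measurement), from the quadrature partition."""
    return math.sqrt(sd_total ** 2 - sd_meas ** 2) / sd_meas

def interpret(d):
    if d < 2:
        return "reformulate: measurement system is the home of a dominant cause"
    if d <= 3:
        return "not dominant, but improve the measurement system if practical"
    return "measurement system is adequate; proceed to the next stage"

d = discrimination_ratio(6.055, 0.756)
print(round(d, 1), "->", interpret(d))  # 7.9 -> measurement system is adequate; ...
```

With D near 8, the camshaft gage comfortably clears the threshold of 3, consistent with the conclusion above.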
Plan
• Ensure that the measurements are made under normal operating conditions.
• Select:
–Three parts that cover the full extent of variation in the output
–Two or three time periods that cover a period over which you expect to see the full extent of
the variation in the output, if feasible
–Two or three operators, if multiple operators are normally used
–Two or three gages, if multiple gages are normally used
• Make three measurements under each combination of part, operator, time period, and gage.
Data
Record the measured output and corresponding operator, gage, time, and part number with
one measurement per row.
Analysis
• Calculate the average and standard deviation of the measurements by part number.
• Combine the within-part standard deviations to estimate the measurement variation.
• Plot the measurements and the deviation from part average by the part number, time period,
operator, and gage. Look for unusual patterns and outliers.
Conclusion
• Calculate the discrimination ratio, D = stdev(due to process)/stdev(measurement), where stdev(due to process) = √[stdev(total)² − stdev(measurement)²].
In the camshaft example, there were eight diameters measured on each part. We may
not be able to find three parts with the full extent of variation on each characteristic. Finding
and using different parts for each diameter was too complicated, so the team selected parts
based on journal 1 only.
We may be able to use the results from recent gage repeatability and reproducibility
(R&R) investigations to estimate the measurement variation.7 However, we must assess the
risk of study error since typical R&R investigations are conducted over a short time frame. In
many cases, the estimate from R&R substantially underestimates the measurement variation.
To estimate measurement bias, we need parts or standards with known values; with such parts we can simultaneously estimate both measurement variation and bias. For the camshaft journal diameter system and others that have specialized fixtures, we recommend the use of certified parts to avoid possible study error.
We follow the same plan as we used to estimate the measurement variation. In the
camshaft journal diameter example, the original problem was excessive scrap and rework.
The rework operators had noticed that some of the parts rejected by the final inspection
gage were acceptable when measured by the rework gage. The team decided to assess the
bias of both measurement systems. Here we consider only the final inspection system and,
as earlier, the front location on journal 1.
There were no certified parts available. Instead, the team used the three camshafts that
were described in Section 7.2. For each part, the journal diameter was measured five times
on the in-house coordinate measuring machine (CMM). The data are given in Table 7.1.
Part    1    2    3    4    5    Average    Standard deviation
The team took the average CMM values to be the true values. Note that the variation of
the CMM was about 0.20 microns. By averaging five readings, this variation was reduced
to 0.09 (0.2/√5; see "Use Combined Measurements" in Section 7.5). The team assumed
that the bias of the CMM was negligible.
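The √n reduction from averaging can be verified in a line of code (an illustrative sketch of the arithmetic above):

```python
import math

sd_single = 0.20  # CMM standard deviation for one reading, microns
n = 5             # readings averaged per part

# The standard deviation of an average of n independent readings is
# the single-reading standard deviation divided by sqrt(n).
sd_average = sd_single / math.sqrt(n)
print(round(sd_average, 2))  # 0.09
```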
To estimate the bias in the final gage, we calculate the measurement error for each of
the 54 diameters measured in the initial investigation. Recall that

measurement error = measured value − true value
The data file camshaft journal diameter measurement2 gives the measurement errors for
this investigation. We use the average of the observed measurement errors as the estimate
of bias.
From the numerical summary that follows, we estimate the measurement bias to be
–2.44 microns. The final gage gives consistently smaller measured diameters than the
CMM.
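Estimating bias is simply averaging the observed errors. A Python sketch (the error values below are invented for illustration and are not the book's data file):

```python
# measurement error = measured value - true (certified) value, in microns
errors = [-2.1, -2.6, -2.4, -2.7, -2.3, -2.5]  # invented values

# The average of the observed measurement errors estimates the bias.
bias = sum(errors) / len(errors)
print(round(bias, 2))  # -2.43
```

A consistently negative average, as in the camshaft example, indicates the gage reads systematically low relative to the reference system.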
We plot the measurement errors against the part number and week in Figure 7.7. The
plot shows that the measured values at the final gage were consistently smaller for all parts
and there was no change from one week to the next.
The team considered two possible limitations to this conclusion. First, was there some-
thing peculiar about the three parts used or the time during which the investigation was
conducted? Since the bias persisted across all three parts (see Figure 7.7) and did not
change much over the week, the team was confident that there was a systematic difference
between the final gage and the CMM. Second, was the bias due to the CMM? The team,
with all members from production, had more faith in their own system, specially designed to measure journal diameters, than in the CMM, an all-purpose gage. Recall
that there were eight gage heads to measure the two positions on the four journals in the
inspection system. We have given the results for one position on journal 1. The estimated
bias for the other seven heads was very small, less than 0.5 microns. Since the same CMM
program was used to certify all four journals, the team concluded that the observed bias on
journal 1 was not due to the CMM.
Given that the diameter specifications were ±12.5 microns and the process perform-
ance used up more than the full tolerance (Figure 7.3), the team knew that the bias was
large enough to cause inspection errors. They arranged maintenance on the gage, and a sub-
sequent bias investigation showed that the problem had been corrected. They implemented
a once-a-shift check of the system using a reference part (one of the parts certified by the
in-house CMM) to indicate if the bias problem recurred.
Note that the plan to investigate measurement bias is identical to that for estimating
measurement variation except that we use parts or standards with known values. We can
simultaneously estimate both bias and variation from such a plan.
When we have two or more measurement systems for the same characteristic, measure-
ment variation and bias can be the source of acrimony and confusion among the users. We have
seen battles between customers and suppliers, each insisting that the value produced by their
own measurement system is correct. In one instance, we saw a process in which a transmission
part was inspected on four different systems, the last being at the customer. The customer
Figure 7.7 Measurement error by part number (left) and week (right).
occasionally found parts out of specification even though they had been through three upstream
inspections. Needless to say there was a lot of finger pointing and high inspection costs.
The relative bias of two measurement systems is the difference in bias. We can esti-
mate the relative bias without using parts with known values. We need to measure a num-
ber of parts on both systems and compare the results. There are many possible plans. We
discussed an investigation to estimate the relative bias of two measurement systems for V6
piston diameter in Chapter 5.
We find it difficult to give a general rule to decide if bias is too large. We need to assess
the bias in terms of the original problem, the overall variation, and the consequences.
In the example, given the baseline variation, the team knew that the negative bias in the
final inspection gage was large enough that they were shipping oversized parts. The bias
also helped to explain why the repair gage accepted parts that were undersized at final
inspection.
the label. As the bottle was rotated, the system measured the maximum and minimum distance
and calculated the average height and crookedness (maximum height – minimum height). In
the measurement system assessment, the discrimination ratio for average height was 3.1.
Although the measurement system was not the dominant cause of the variation, the team
believed they would have difficulty in subsequent investigations because of the measurement
variation. They decided to use a feeler gage to measure the distance between the label and the
bottle bead. The discrimination ratio of the feeler gage system was 6.5. The team felt comfortable
proceeding to the next stage of the variation reduction algorithm using the feeler gage.
sufficiently small to support further process investigations. To complete this stage, the key
tasks are:
• Estimate the measurement variation and possibly bias.
• Calculate the discrimination ratio, D, as given by Equation (7.3).
• If the measurement variation or bias is too large, reformulate the problem in
terms of the measurement system.
• Fix obvious problems in the measurement system.
If we substantially increase the discrimination ratio, is it necessary to repeat the baseline
investigation using the altered measurement system?
If D < 2, then we reformulate the problem in terms of the measurement system and
use the results from the measurement system investigation as a baseline for the reformu-
lated problem. We will reassess the original baseline when we validate the solution in
the last stage of the Statistical Engineering algorithm. If D > 2, the measurement system
is not a dominant cause of variation. Thus, reassessing the baseline is unnecessary
because any change to the measurement system will have little impact on the full extent
of variation.
Key Points
• We must assess the measurement system for the output characteristic that
defines the problem because it may be the home of the dominant cause of vari-
ation and because it will be used in future process investigations.
• The key attributes of the measurement system are bias and variation. Bias is the
average and variation is the standard deviation of all measurement errors made
in the future.
• To estimate the measurement variation, we define a study population over a wide
range of conditions, including a variety of parts, operators, gages, and times.
• To estimate bias, we apply the same plan proposed for estimating measurement
variation using parts with known values.
• To assess the adequacy of the measurement system, we look at the impact of the
bias and compare the measurement variation to the standard deviation of the
true values of the characteristic.
• If necessary, for the purpose of problem solving, we may use a different meas-
urement system than that used to define the baseline performance.
H1212ch07.qxd 3/31/05 10:40 AM Page 104
8
Choosing a Working Variation
Reduction Approach
In this stage of the Statistical Engineering algorithm, we select a working approach
from the seven possibilities described in Chapter 3. You may be tempted to skim over
this stage, since our directions are somewhat vague. However, without a working
approach, it is not clear what to do next. If you do take further action, you are implicitly
adopting a working approach. We believe that explicit consideration of the approach at
this point will lead to a better choice and produce better results sooner.
At this stage of the algorithm, we may not know enough to choose the approach that
will eventually be implemented. Based on what we do know, using both engineering
knowledge and previous process investigations, we select the working approach and then
gather the required information to assess its feasibility. So at this point, we are trying to
pick the most feasible approach with incomplete knowledge. We may have to return to this
stage of the algorithm a number of times as we obtain more process knowledge. We fix
obvious faults in the process as we uncover them.
When a team arrives at the Choose Working Variation Reduction Approach stage, they
will have the following:
• The problem expressed in terms of an output characteristic and a baseline
measure of process performance
• A goal expressed in terms of the baseline measure tied to the project goal
• Some knowledge of the process and its behavior (for example, a process map, a
control plan, results of a baseline investigation and other past investigations,
applicable science and engineering knowledge, and so forth)
• Some knowledge of the constraints (for example, economic reality, span of
control, political and cultural constraints)
• Confidence in the measurement system unless the problem is defined in terms
of this system
for a dominant cause and then decide which of the three to adopt as a working approach.
We have seen many problem-solving efforts flounder because the team tried to change the
process without first identifying the dominant cause of the variation. If we know a domi-
nant cause, we have a much better chance of finding a cost-effective process change that
will meet the problem goal.
Despite our strong preference for trying to find a dominant cause, in some problems the
team may know enough to answer no to this question.
Comments
One other reason for not looking for a dominant cause is historical. If many teams have
failed to find the cause in previous projects, it may be inefficient to keep searching.
Having recommended looking for the cause in almost all cases, we believe the team
should consider the four non-cause-based approaches before reaching a decision to search
for a dominant cause. Based on current knowledge and potential costs, the team may see
that one of these approaches is feasible.
[Figure: histograms of the output (percent versus output) for the current process.]
The team knew they could increase average core strength by increasing the resin con-
centration in the core molding process. They could easily assess the cost of the change and
maintain it in the future. The team did not know how much extra resin was needed and how
well the stronger cores would behave in the casting process. They worried that it would not
be possible to shake out the core sand if the cores were too strong. Residual core sand stuck
in the casting is a serious defect.
Because of the available knowledge, the team adopted Move Process Center as their
working approach. They planned an experimental investigation to quantify the effects of
increasing the resin concentration on both core strength and casting quality.
The team decided to look for a design change that would increase the failure time on
the accelerated test to at least 200 hours for all seals. That is, they adopted Move Process
Center as their working approach. They rejected the approaches based on finding the
cause of the failure (what is different about the seals that explains the different failure
times seen in Figure 8.3) because it was too expensive to collect the necessary data. Based
on their knowledge of the failure, the team made a design change that they then tested by
repeating the five-piece investigation. They accepted the risk that the proposed design
change might prove ineffective in the field, where conditions were different from the
accelerated test.
Figure 8.4 Baseline histogram of camshaft lobe BC runout (dashed line shows
specification limit).
the process that would shift the histogram to the left. That is, they initially decided not to inves-
tigate one or more process changes that might meet the goal. This had been tried in the past
without any substantial progress. Instead they decided to look for the dominant cause of lobe-
to-lobe variation. Once the cause was found, they could reconsider which approach to select.
They were willing to accept the investigation costs with no certainty that they could find the
cause or that they would adopt one of the cause-based approaches.
inputs included feed rate, speed, coolant concentration, and so forth. Each experimental run
consisted of grinding five shafts. That is, the team selected robustness as their working
approach.
For each run the team calculated the logarithm of the standard deviation of the five
diameters and then analyzed these 16 performance measures. They found the combination
of levels of the fixed inputs that produced the smallest value of the performance measure.
The team then decided to confirm the improved settings of the fixed inputs by running
these levels for one week under otherwise normal production conditions. They found no
improvement in the process variation and abandoned the new settings because there was an
increase in cost.
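The performance measure for each run, the log of the within-run standard deviation, is a short computation. This sketch is ours (the five diameters are invented):

```python
import math

def log_sd(values):
    """Natural log of the sample standard deviation of one experimental run."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.log(math.sqrt(var))

run = [10.2, 9.8, 10.0, 10.1, 9.9]  # five consecutive shaft diameters (invented)
print(round(log_sd(run), 3))  # -1.844
```

Taking logs makes the variation measure behave more symmetrically across runs, which is why it is a common performance measure in robustness experiments; but, as the example shows, it only captures the short-term variation within each run.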
The team was disappointed in designed experiments as an improvement tool since they
had spent a great deal of effort for no reward. What went wrong? The major problem was
that the experiment set out to reduce short-term diameter variation (variation within five
consecutive shafts), and it turned out that this was a small component of the baseline varia-
tion. The team adopted the working approach without sufficient knowledge. In this case,
the project was doomed by this decision.
Comments
As shown by the speedometer cable shrinkage example, the robustness approach can be
successful, but we feel it is better to first look for the dominant cause. The more you know
about the dominant cause of variation, the greater the chance you will select fixed inputs to
change that will mitigate the variation in the dominant cause.
In the transmission shaft example, the team would have been much better off if they
had recognized that the dominant cause of variation acted over a longer time frame than
five consecutive parts. They would not have planned the experiment as they did with this
extra knowledge.
In the fascia film build example, introduced in Chapter 3, a team wanted to reduce varia-
tion in the paint flow rate. A baseline investigation showed the time pattern in the flow rate
given in Figure 8.5.
The baseline data provided evidence that feedback may be feasible. The flow rate vari-
ation was much smaller over the short term than over the long term. Since the team also
knew they could use a valve to adjust the flow rate, they adopted feedback control as the
working approach. They planned to assess the benefit by simulating the effect of the feed-
back control scheme on the baseline data. At the same time they could assess the costs of
the adjustments.
Figure 8.5 Flow rate over time (minutes).
of the front cable. The cables were delivered in batches and the average length differed sub-
stantially from batch to batch.
The team did not further explore front cable length; the cable arrived as part of an
assembly, so it could not be measured in the plant. Because the dominant cause changed
from batch to batch, they decided to consider feedback control.
The team knew that changing the depth of an adjustment nut would change parking
brake tightness. To quantify the effect of this adjuster they planned an investigation where
they would try different adjustment depths for a number of vehicles. In addition, the team
still needed to decide on an adjustment rule, that is, when an adjustment should be made
and by how much.
Comments
In the fascia film build and parking brake tightness examples, we saw a time pattern in the
run chart of the output from the baseline investigation. We recommend a team use a sam-
pling protocol in the baseline investigation that lets them see any systematic pattern in the
output over time.
Patterns in the variation over time can have many forms. Some processes are setup
dependent; that is, once a setup is complete, there is little variation from part to part. Many
machining and stamping processes exhibit this behavior. If the team discovers such a pattern,
and the output of the process can be measured immediately after setup and there is a low
cost and quick way to adjust or redo the setup, then feedback control could be selected as
the working approach.
In cases where feedback control is feasible, the output varies systematically over
time. This implies that the dominant cause also exhibits the same time pattern. With this
clue, it may be more economical to try to find the dominant cause than to adopt feedback
control as the working approach. If the dominant cause is found, the team can address the
cause directly, for example by applying feedback control to the cause. In the parking
brake tightness example, feedback control was applied to the output despite partial
knowledge of the dominant cause. This knowledge was used to help design the feedback
control scheme.
8.5 SUMMARY
The thought process involved in choosing a working approach is summarized in Figure 8.6.
Questions given at the same vertical height are addressed in either order or simultaneously
depending on the problem context.
Figure 8.6 Flowchart to help choose a working approach before a dominant cause is known.
Key Points
• We recommend the team select a working approach from the seven possible
variation reduction approaches to guide further efforts:
–Fixing the obvious using knowledge of a dominant cause
–Desensitizing the process to variation in a dominant cause
–Feedforward control based on a dominant cause
–Feedback control
–Making the process robust
–100% inspection
–Moving the process center
• The following questions can help to select the working approach:
–Can we find a dominant cause of the unit-to-unit variation?
–Can we meet the goal by shifting the process center without reducing variation?
–Can we reduce variation by changing one or more fixed inputs without knowl-
edge of a dominant cause?
–Does the process output exhibit a strong pattern in the variation over time?
• We strongly recommend the team search for a dominant cause of variation,
unless there is clear evidence that one of the non-cause-based approaches is
likely to be feasible.
• After choosing a working approach, we conduct further process investigations
to determine whether the selected approach is feasible.
• In making a choice of working approach, we try to assess potential implemen-
tation costs.
PART III
Finding a Dominant Cause of
Variation
How often have I said to you that when you have eliminated the impossible,
whatever remains, however improbable, must be the truth?
—Sir Arthur Conan Doyle (as Sherlock Holmes), 1859–1930
In many applications, identifying a dominant cause of variation leads to cost-effective
process improvement. We recommend finding the dominant cause using the method of
elimination and the idea of families of causes. With the use of observational plans and
leveraging (comparing extremes), the search for a dominant cause of variation can be inex-
pensive and nondisruptive to the production process. We discuss the tools and methods used
to support the search for a dominant cause. We introduce planned experiments to verify a
suspect cause is dominant.
9
Finding a Dominant Cause Using
the Method of Elimination
In most problems, the team will decide to try to find a dominant cause before assessing the
feasibility of a specific variation reduction approach. We recommend a search strategy
based on the method of elimination to find a dominant cause. That is, we concentrate on
ruling out possibilities rather than looking directly for the dominant cause. In this chapter, we
discuss the principles behind this strategy; in chapters 10 through 13, we provide investiga-
tion plans, analysis tools, and numerous examples to demonstrate how it can be implemented.
The method of elimination can be explained using the familiar game 20 Questions. In
one version of this game, we choose one of the 130,000 entries in the Canadian Oxford
Dictionary (2002). We then give you the dictionary and ask you to determine the selected
word using a series of yes/no questions. A poor strategy is to start asking about specific
words. Unless you are lucky, you are quite likely to be exhausted before you get to the cor-
rect word. A much better strategy is to divide the dictionary in half and ask if the unknown
word is in the first half. Whatever answer you get, you will have eliminated half the words
in the dictionary with a single question. If you divide the remaining words in half at each
iteration, you can find the unknown word with at most 17 questions (2¹⁷ is just greater than 130,000).
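The 17-question bound is just a base-2 logarithm, checked here in a brief sketch:

```python
import math

entries = 130_000  # approximate entry count of the dictionary

# Each yes/no question that halves the remaining words eliminates half
# the candidates, so ceil(log2(entries)) questions always suffice.
questions = math.ceil(math.log2(entries))
print(questions)  # 17, since 2**16 = 65,536 < 130,000 <= 2**17 = 131,072
```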
We apply the same idea to search for a dominant cause. We divide the set of all causes
into families and then conduct an investigation to rule out all but one family. We repeat the
exercise on the remaining family until a single dominant cause remains.
In any process, there are many varying inputs that could have a large effect on the output.
We partition the set of all such inputs into two or more families with common features such
as the time frame or location in which they act. Then we use available data, investigations,
and knowledge of the process to rule out all but one family as the home of the dominant cause.
[Cause family diagrams: for V6 piston diameter (Operation 270A), excessive diameter variation is split into the measurement system and rest-of-process families; for bottle average height, excessive variation in average height is split into the measurement system and rest-of-process families, the latter divided into bottle-to-bottle and time-to-time.]
A team was assigned the task of reducing warranty costs due to oil consumption in a truck
engine. The engine was built at two different plants that had common equipment, common
suppliers, and similar manufacturing processes. From the warranty database, the team
noted that while the two plants had produced roughly the same number of engines, over
90% of the more than 1500 claims were associated with engines built in one of the two
plants.
Here we divide the causes into two families: the within-plant and plant-to-plant families
(see Figure 9.4). The within-plant family contains causes that vary in the same way for each
plant, such as characteristics of components from a common supplier. The plant-to-plant
family contains causes that have different values in the two plants, such as characteris-
tics of components from different suppliers or differences in operating procedures.
Using the warranty data, the team ruled out the within-plant family. The dominant
cause lived in the plant-to-plant family. To continue the search for a dominant cause, the
team focused on the few varying inputs that were different in the two plants.
In this example, the team did not check the measurement system. The discrepancy in
warranty claims was so great that they assumed that there was a real difference between the
plants. The team made huge progress using the data available from the warranty system
because the plant-to-plant family had relatively few causes.
Surprisingly, another team working on the same problem from an engineering per-
spective had decided (incorrectly) that changing the design of the PCV valve could solve
the problem. The second team had implicitly adopted a robustness approach. By changing
the PCV valve, they hoped to reduce the effect of the unknown dominant cause of oil con-
sumption. Since there was a common supplier, no characteristic of the PCV valve could
be a dominant cause because there was no systematic difference in the valves from plant
to plant.
There are many ways to partition causes into families and different investigations and
tools to eliminate all but one family. We look at numerous examples in the next three chapters.
[Figure 9.4: diagnostic tree — excessive oil consumption split into the within-plant and plant-to-plant families.]
V6 Piston Diameter
Consider again the V6 piston diameter example, discussed earlier in this chapter, where the
problem was excess variation in piston diameter. The first investigation eliminated the
measurement system family. Next, the team looked at two location-based families as
shown on Figure 9.5.
[Figure 9.5: diagnostic tree — excessive diameter variation split into the measurement system family and the rest of the process.]
The team planned an investigation in which they measured the diameter of 96 pistons
after Operation 270 and at the final gage. This investigation eliminated the operations
downstream from Operation 270 as the home of the dominant cause.
The team concluded that the diameter measured just after Operation 270 was the dom-
inant cause of variation in the final diameter. This finding did not surprise the team, who
understood the functions of Operation 280 through 310. However, by verifying that the domi-
nant cause lived in Operation 270 or upstream, they had made substantial progress. The
order in which the pistons are measured at the final gage was not the same as the order in
which they were machined, whereas the order of production was preserved much better in
the machining operations up to Operation 270. The preservation of order made it much
easier to track pistons through the process and thus made further investigation easier.
The team was unable to directly address the diameter variation at Operation 270. They
decided to reformulate the problem and to search for a dominant cause of diameter varia-
tion as measured after Operation 270. At the same time, they compared the measurement
system at Operation 270 to the final gage. While the Operation 270 measurement system
was not the home of a dominant cause of variation, the team discovered the relative bias
between the Operation 270 and final gages as described in Section 5.2. They fixed this
problem by changing an offset in the Operation 270 gage. They also knew that this obvious
fix would not reduce diameter variation at either gage.
Next, the team decided to look at three families of causes of Operation 270 diameter
variation as documented on Figure 9.6.
The team planned another investigation with 96 pistons. For each piston, the diameter
was measured before and after Operation 270. Each parallel stream at Operation 270
Figure 9.6 Diagnostic tree for V6 piston diameter problem after Operation 270 investigation.
processed half the pistons. Based on the observed data, the team concluded that the Oper-
ation 270 stream-to-stream family was the home of the dominant cause. They eliminated
the upstream family. They concentrated their efforts on identifying what was different
between the two streams that could explain the observed difference in process behavior.
At Operation 270, the grinders were of the same design, and there was a common source
of unfinished pistons and a common control plan. However, there were different operators.
The team interviewed the operators and found that each operator ran his machine differently;
each was convinced that the method they used was superior to that in the written control plan.
Now the team suspected the dominant cause was a difference in the method of process
control in the two streams. The final step was to verify this suspicion. We give the com-
pleted diagnostic tree in Figure 9.7.
We use the diagnostic tree to document the search for the dominant cause. We can add
additional information to the diagram to support the logic of eliminating families. Remem-
ber that the tree is built up as we iteratively apply the method of elimination. We do not start
by constructing the whole diagram.
Figure 9.7 Final diagnostic tree for excessive variation in V6 piston diameter.
baseline investigation had shown that the dominant cause acted more slowly, for example,
week to week, they would have needed a longer time frame for the study population. To
obtain information about the bottle-to-bottle family, the sampling protocol must involve
selecting some bottles made consecutively. To help avoid sample error, the team selected
five consecutive bottles every hour throughout the study population.
In the Find and Verify a Dominant Cause stage of the algorithm, we assume the base-
line investigation captured the performance of the current and future process with little
study or sample error. With this assumption, if we see the full extent of variation in the output
in any subsequent investigation, we are confident the dominant cause has acted. See further
discussion of the full extent of variation later in this chapter.
Table 9.1 First-time reject rate by test stand (partial).

Test stand    First-time reject rate
1             12.8%
2             4.2%
3             2.8%
found” rejects. The test stand operators repeatedly measured a rejected engine without
teardown because they knew it would often pass with a second or third test.
A team was assigned the goal of reducing the reject rate. The baseline first-time reject
rate was about 6.6%. Before conducting an assessment of the measurement system, the team
used the available data to stratify the first-time reject rate by test stand over a one-week
period. The results are given in Table 9.1.
The dominant cause of the high reject rate lived in the measurement system family. The
team proceeded by looking at causes that could explain the stand-to-stand differences.
[Figure: diagnostic tree — excessive OP270 diameter variation.]
To eliminate one of the families, we track 30 consecutive pistons from each stream at
Operation 270 and measure the diameters at the final gage. What can we conclude if the
results are as given by Figure 9.9?
If we are careless, we conclude that there is a large difference stream to stream and rule
out the within-stream family. However, in the 60 pistons, the diameter varies only from 588
to 596 microns, about half the full extent of variation in the baseline investigation. The
dominant cause has not acted fully during this investigation, and we may be incorrect in rul-
ing out either family. We need to revisit the plan for the investigation. Perhaps we should
have collected 30 pistons from each stream spread out over a day rather than 30 consecutive
pistons to see the full extent of variation.
Figure 9.9 Box plot of diameter by Operation 270 stream from hypothetical investigation.
Figure 9.10 Box plot of diameter by Operation 270 stream from hypothetical investigation showing full extent of baseline variation.
pumps somewhat noisier when the pump housing was rotated relative to the motor housing
in the reassembly. The dominant cause of noise was in the assembly family.
If we exploit leverage, we can use small sample sizes in many investigations. We need
to ensure that the extreme output values are due to the dominant cause. If the extreme values
are due to a different failure mode, leveraging may mislead us. By defining a focused prob-
lem and avoiding outliers, we hope to avoid this occurrence.
Be Patient
To use the method of elimination, we need to plan, execute, and analyze the results of a number
of simple investigations. To be successful, we need to avoid the temptation to jump to a spe-
cific cause too soon. As consultants, we have met strong resistance from process managers
when we suggested that we plan to conduct several investigations to find the cause of variation.
The usual reaction is to ask that we get it over with quickly using a single investigation. In man-
agement review meetings, we have seen problem-solving teams struggle with the question,
“Well, have you found the cause (or solution) yet?”
The answer, “No, but we have eliminated a lot of possibilities,” can be hard to defend
if the problem is urgent and substantial resources have been committed to its solution. On
the other hand, trying but failing to find the cause with a single complicated investigation
is even harder to defend.
described in chapters 12 and 13. Alternatively, we can abandon the search and adopt a variation reduction approach that does not require the identification of a dominant cause. We
still use any knowledge gained about the family containing the dominant cause.
Key Points
• To find the dominant cause of variation, partition the possible causes into fami-
lies and use the results of an investigation to eliminate all but one family as the
home of the dominant cause. Iterate to further subdivide the remaining families
until a dominant cause is found.
• “Let the process or product do the talking.” Observational data can provide
strong clues about a dominant cause of variation.
• Document the search for a dominant cause using a diagnostic tree.
• Implementation suggestions:
–Check the measurement system first
–Use the QPDAC framework to help plan the investigations
–Use available data
–Use knowledge of the process
–Observe the full extent of variation
–Use leverage where possible
–Take the simplest path
–Be patient
–Reconsider the variation reduction approach
10
Investigations to Compare Two Families of Variation
The world is full of obvious things which nobody by any chance ever
observes.
—Sir Arthur Conan Doyle (as Sherlock Holmes), 1859–1930
In this chapter, we describe some plans and the corresponding analysis to compare two
families of causes. We concentrate on a single iteration of the method of elimination
where the goal is to eliminate one of the two families. For each example, we assume that
the team has successfully completed the Define Focused Problem and Check Measurement
System stages of the variation reduction algorithm and has decided to search for a dominant
cause of variation.
We use plans that meet the criteria discussed in Chapter 9:
• Keep it simple.
• Use available data.
• Observe the full extent of variation in the output.
• Exploit leverage if possible.
10.1 STRATIFICATION
With stratification (Kume, 1985), we divide parts into distinct groups based on their source.
For example, we may stratify by machine, mold, line, gage, plant, location within the part, sup-
plier, operator, and so on.
Using stratification, we can assess whether the dominant cause impacts each group in
the same way or each group in different ways. A dominant cause that impacts different
groups differently is said to act in the group-to-group family. If such a cause is present, we
will see large differences among the group averages and low variation within each group.
Casting Scrap
The baseline scrap rate for a casting defect was 2.4%. The castings were molded four at a
time with cavities labeled A, B, C, and D. The cavity label was molded into the casting.
The team decided to stratify the causes into the cavity-to-cavity family (also called the
among-cavity or between-cavity family in other references) and the within-cavity family.
We have:
• Cavity-to-cavity family: causes that affect different cavities differently
• Within-cavity family: causes that affect each cavity in the same way
The team classified 200 scrap castings by cavity as shown in Table 10.1. Because the
proportion of scrap castings from each cavity was about the same, the team ruled out the
cavity-to-cavity family. The dominant cause affected all cavities in the same way.
In Table 10.1, we compared the number of scrap castings from each cavity directly
because there was equal volume from all four cavities. If the cavity volumes were not the
same, we would have compared the percent scrapped from each cavity.
Table 10.1 Scrap castings by cavity.

Cavity    Number of scrap castings
A         58
B         51
C         43
D         48
Total     200
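The count-versus-percent point can be sketched directly. The scrap counts come from Table 10.1; the production volumes are hypothetical, chosen only to illustrate how the comparison would change with unequal volumes:

```python
# Scrap counts by cavity from Table 10.1; the volumes are hypothetical
# (equal here, as in the example).
scrap = {"A": 58, "B": 51, "C": 43, "D": 48}
volume = {"A": 10_000, "B": 10_000, "C": 10_000, "D": 10_000}

def scrap_rate_by_group(scrap, volume):
    """Percent scrapped per group; with equal volumes this ranking
    matches a direct comparison of the raw counts."""
    return {g: 100.0 * scrap[g] / volume[g] for g in scrap}

rates = scrap_rate_by_group(scrap, volume)
for g, r in sorted(rates.items()):
    print(f"cavity {g}: {r:.2f}% scrapped")
```

With unequal volumes, only the percentages (not the raw counts) support a fair cavity-to-cavity comparison.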
Rod Thickness
In Chapter 6, we looked at an example where the goal was to reduce thickness variation in
a connecting rod at a grinding operation. The team established a baseline and found the full
extent of thickness variation was 2 to 59 thousandths of an inch (recorded as a deviation
from a particular value).
In the baseline investigation, the team measured thickness of each sampled rod at four
positions. The data are given in the file rod thickness baseline. With these data, we can
compare the family of causes that produce different thickness within each position to the
family of causes that act from position to position.
In Figure 10.1, we plot thickness by position. Since we are using the baseline data, we
are guaranteed to see the full extent of variation for the output, given by the dashed horizon-
tal lines. Position 3 has a smaller average thickness than the other positions. There is a
cause in the position-to-position family that produces this difference. However, we can
see that this cause is not dominant by imagining the plot with the average thickness
Figure 10.1 Rod thickness stratified by position (dashed horizontal lines give the full extent of variation).
aligned for each position.1 The variation within each position is a substantial part of the over-
all variation. We cannot eliminate all the out-of-specification rods without reducing the
within-position variation. Of course, we will implement an obvious fix to better align
the thickness center at Position 3 if available.
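The "imagine the averages aligned" step can be mimicked numerically: subtract each position's mean and look at the spread that remains. The thickness values below are made up for illustration (the real data are in the file rod thickness baseline):

```python
from statistics import mean, stdev

# Hypothetical thickness readings by position; position 3 is centered
# lower, echoing the pattern in Figure 10.1.
thickness = {
    1: [28, 35, 22, 40, 31, 26],
    2: [30, 24, 38, 33, 27, 36],
    3: [18, 12, 25, 20, 15, 22],
    4: [29, 34, 25, 39, 28, 32],
}

all_values = [x for vals in thickness.values() for x in vals]
overall_sd = stdev(all_values)

# Align the position averages, then measure the remaining spread:
# the within-position variation.
centered = [x - mean(vals) for vals in thickness.values() for x in vals]
within_sd = stdev(centered)

print(f"overall sd: {overall_sd:.1f}, within-position sd: {within_sd:.1f}")
# If within_sd stays close to overall_sd, the position-to-position
# family cannot contain a dominant cause.
```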
Figure 10.2 Camshaft journal runout by lobe (dashed horizontal lines give the full extent of variation).
Comments
We can often use stratification with existing data for any multistream process if we can
identify the stream associated with each part. Some examples are parallel processing steps,
multiple suppliers, gages, product sources, operators, and so on. We saw two examples in
Chapter 9: first, when oil usage warranty claims were stratified by the plant where the
engines were built, and second when rejected engines were stratified by the test stand.
For a continuous characteristic, a box plot (using either the box summary or individual
values depending on the data volume) is a convenient way to present the data to display the
variation within groups and group to group. Adding dashed horizontal lines shows if we
have seen the full extent of variation. The easiest way to assess the contributions of the two
families is to shift the boxes (or individual values for each group) in your mind so that they
all have the same centerline. The variation that remains is due to the within-group family.
If this variation is large relative to the full extent of variation, we eliminate the group-to-group
family as the home of the dominant cause.3
Figure 10.5 Box plots of BC runout by day (dashed horizontal lines give the full extent of variation).
Figure 10.6 Box plots of camshaft journal runout by day for Lobe 8 (left) and Lobe 12 (right) (dashed horizontal lines give the full extent of variation).
However, recall that the dominant cause must explain the differences between end and
middle lobes as discussed in Section 10.1. To see if the lobe-to-lobe differences depended on
the day, the team examined BC runout by day for each lobe separately. Figure 10.6 shows
the plots for lobes 8 and 12. In the baseline, Lobe 12 exhibited the most runout variation,
while Lobe 8 gave the least.
Figure 10.6 suggests there was no interaction between causes that act lobe-to-lobe and
day-to-day since for each lobe, the pattern in runout is similar across days. The team con-
cluded the dominant cause acted within each day. Using this conclusion, they could plan
subsequent investigations over shorter time periods (i.e., one or two days) and expect to see
the full extent of variation.
[Figure: diagnostic tree — excessive porosity split into the measurement system and process families, the latter split block-to-block and time-to-time.]
[Figure 10.9: multivari chart of porosity by hour.]
We created the plot in Figure 10.9 using the multivari routine in MINITAB (see Appen-
dix C). The dots on the plot correspond to the individual porosity values. The five block
porosity averages are joined from hour to hour. We see large differences in the average
porosity over time relative to the variation of the porosity within most hours. Based on this
observation, we eliminate all causes, such as mold dimensions, that act from block to block.
The average porosity (and the variation) is highest in hours 5 and 8. The team noted that
these were times when the process was neglected due to lunch and preparations for the end
of the shift. Combining this process knowledge with the patterns seen in Figure 10.9, the
team looked for causes that behaved differently during these special times.4
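A numeric sketch of what the multivari chart summarizes: compare the hour-to-hour swings in average porosity to the block-to-block spread within each hour. The readings are made up, with hours 5 and 8 elevated as in the example:

```python
from statistics import mean

# Hypothetical porosity readings for five blocks per hour; the actual
# chart (Figure 10.9) was drawn with MINITAB's multivari routine.
porosity = {
    1: [110, 95, 120, 105, 100],
    2: [115, 100, 90, 108, 112],
    3: [98, 118, 104, 95, 110],
    4: [102, 96, 114, 107, 99],
    5: [310, 280, 350, 295, 330],   # lunch break
    6: [112, 99, 105, 118, 96],
    7: [101, 109, 94, 116, 103],
    8: [260, 340, 300, 285, 320],   # end-of-shift preparations
}

hour_means = {h: mean(v) for h, v in porosity.items()}
within_range = {h: max(v) - min(v) for h, v in porosity.items()}

for h in sorted(porosity):
    print(f"hour {h}: mean {hour_means[h]:.0f}, "
          f"within-hour range {within_range[h]}")
# Hour-to-hour swings (hours 5 and 8) dwarf the block-to-block spread,
# so causes acting block to block are eliminated.
```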
[Figure 10.10: process map — Grinding, Heat treatment, Straighten, Lapper.]
was 2.6 to 33.9. Using the process map (Figure 10.10), they first split the remaining causes
into two families:
• Causes that act downstream of heat treatment
• Causes that act in heat treatment or upstream
The team selected 32 parts over the course of one day. There were four camshafts from
each of the eight lobe grinders. One camshaft from each grinder was processed on each
heat treatment spindle. BC runout was measured after the heat treatment and after the final
step of the process on each lobe. The data for Lobe 12 are in the file camshaft lobe runout
variation transmission.
Figure 10.11 shows a plot of the Lobe 12 final BC runout versus the BC runout after
heat treatment. Similar plots were produced for the other lobes. Across all the lobes, the
final runout variation matched the baseline variation. Using Figure 10.11, we can separate
the two families:
• Downstream family—causes that act downstream of heat treatment
• Upstream family—causes that act in the heat treatment step or upstream
Figure 10.11 Lobe 12 BC runout after heat treatment and final (dashed lines give full extent of variation).
Figure 10.12 Lobe 12 BC runout before and after heat treatment (dashed lines give full extent of variation).
The plot shows a strong relationship between runout after heat treatment and at the end
of the process. If we hold fixed the runout after heat treatment, there is little variation in
final runout. We eliminate the downstream family that includes the straightening and lap-
ping operations as the home of the dominant cause. The variation in runout is coming from
the upstream family.
In the investigation, the team also measured runout for all lobes before the heat
treatment step. Figure 10.12 shows very little relationship between the BC runout before
and after heat treatment. If we hold fixed the runout before heat treatment, we see most
of the full extent of variation after heat treatment. The team concluded that the dominant
cause acts in the heat treatment. They eliminated all causes that act upstream of the heat
treatment.
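The "hold fixed the runout after heat treatment" reasoning amounts to examining residual variation around a fitted line. A minimal sketch with made-up runout values mimicking figures 10.11 and 10.12 (the real data are in the file camshaft lobe runout variation transmission):

```python
from statistics import mean, stdev

def residual_sd(x, y):
    """Fit y = a + b*x by least squares and return the spread of y
    with x held fixed (the residual standard deviation)."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return stdev(resid)

# Hypothetical runouts: final tracks after-heat-treatment closely.
after_ht = [6, 10, 14, 18, 22, 26, 30, 34]
final = [7, 11, 13, 19, 23, 25, 31, 33]

print(f"sd of final runout: {stdev(final):.1f}")
print(f"sd holding after-HT runout fixed: "
      f"{residual_sd(after_ht, final):.1f}")
# A small residual spread eliminates the downstream family.
```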
Comments
We can use plans similar to that used in the camshaft lobe runout example to partition the
causes upstream and downstream from an intermediate operation whenever we can meas-
ure the output after the intermediate operation.
In the chapter supplement, we look at other plans and analysis tools for comparing
upstream and downstream process step families that have more limited application.5
High door closing effort was a relatively frequent complaint in a new vehicle owner survey.
A team worked on the rear doors where there were more complaints. They measured door
closing effort using a velocity meter that determined the minimum velocity (meters per
second) necessary to close and latch the door. A baseline investigation showed there was
considerable door-to-door variation in the velocity and that doors with high velocity were
difficult to close.
Using the baseline data, the team selected two cars with extreme velocity values, one
high and one low. The task of complete disassembly and reassembly of the door was difficult,
so instead the team decided to remove and replace only the striker. For each car, this was
repeated five times using production operators and tooling. The data are shown in Fig-
ure 10.14. The initial values are shown by the horizontal lines and indicate the full extent of
velocity variation.
There was a large cause acting in both the assembly and components families. Within
each car, there was considerable variation as the striker was removed and replaced but
not the full extent of variation. The door-to-door average velocity differed considerably
due to differences in the components of the two door systems. The team kept both families
and worked on each separately. Since the assembly family consisted of only the position
of the striker, they first took the striker assembly variation as one large cause and
designed a new fixture to control this variation. Using the new fixture, there was a signif-
icant reduction in both the velocity average and standard deviation, but not enough to
meet the problem goal. They also searched for the dominant cause of the remaining vari-
ation in the components family.
Figure 10.14 Closing effort by velocity (initial values from the baseline given by horizontal lines).
Comments
Figure 10.15 shows three possible plots of the results of a disassembly-reassembly
investigation.
In the left panel of Figure 10.15, the dominant cause acts in the assembly family and in
the right panel it acts in the component family. In the middle panel, there is an interaction
between causes in the assembly and component families. For example, the presence of a
burr on a component may make it difficult to assemble the unit so that it performs consistently.
If the burr is absent, it is easy to assemble the product to get consistent performance.
There are several cautions when using this plan to compare assembly and component
families. It is best to use the original assembly process, tools, and people to avoid study error.
Otherwise there is a risk that any conclusions will not be relevant for the actual process. Also,
we must ensure that parts are not damaged or changed in the disassembly-reassembly.
Question
In the current process, does the dominant cause act in the assembly or component family?
Plan
• Select two assemblies with opposite and extreme output values relative to the full extent of
variation.
• Disassemble and reassemble each part three or more times. On each occasion measure the
output.
Data
Record the original output values and each new measurement together with the corresponding
part number, one measurement per row.
Analysis
• Plot the output by part. Use a special symbol to denote the original output values.
• Use the plot to compare the assembly variation within each part to the difference between
the two original output values.
Conclusion
If the assembly variation within each of the two parts is relatively:
• Large, the dominant cause acts in the assembly family
• Small, the dominant cause acts in the component family
Otherwise the dominant cause involves inputs in both the assembly and component families.
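A rough numeric version of the conclusion step above; the cutoff ratios and door-velocity data are hypothetical, chosen only to illustrate the three possible verdicts:

```python
from statistics import stdev

def dominant_family(reassembly_runs, originals, ratio=0.5):
    """Compare assembly variation (spread over repeated disassembly-
    reassembly of each unit) to the gap between the two original
    extreme units. The cutoffs are hypothetical rules of thumb."""
    spread = max(stdev(runs) for runs in reassembly_runs)
    gap = abs(originals[0] - originals[1])
    if spread >= ratio * gap:
        return "assembly"
    if spread <= 0.1 * gap:
        return "component"
    return "both"

# Made-up door-closing velocities for the high and low cars:
high_runs = [1.30, 1.40, 1.33, 1.38, 1.29]
low_runs = [1.08, 1.16, 1.12, 1.18, 1.06]
print(dominant_family([high_runs, low_runs], originals=(1.35, 1.12)))
```

With these values the reassembly variation is substantial but smaller than the car-to-car gap, so both families stay in play, as in the door closing effort example.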
10.5 COMMENTS
Binary Output
With a binary output characteristic and a low rate of defectives, we cannot look at the
part-to-part family. In the block leaker problem described in chapters 1 and 6, the team
made a false start. They sampled five consecutive blocks for 20 hours. They tested each
block for leaks and found two leakers in separate hourly periods. From these data they
could not eliminate either the block-to-block or hour-to-hour family because of the low
failure rate. With a high-volume process, we can separate the hour-to-hour and day-to-day
families by plotting the hourly rate of defectives by day.
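A sketch of that comparison for a hypothetical high-volume process (the counts and volumes are invented):

```python
from statistics import mean

def hourly_rates(defectives, volumes):
    """Hourly defective rates (percent) from counts and hourly volumes."""
    return [100.0 * d / v for d, v in zip(defectives, volumes)]

# Hypothetical counts: 8 hours on each of two days, 5000 blocks/hour.
day1 = hourly_rates([12, 15, 9, 14, 11, 13, 10, 12], [5000] * 8)
day2 = hourly_rates([31, 28, 35, 30, 27, 33, 29, 32], [5000] * 8)

print(f"day 1 mean rate {mean(day1):.2f}%, "
      f"day 2 mean rate {mean(day2):.2f}%")
# Day-to-day differences much larger than the hour-to-hour differences
# within a day point to the day-to-day family.
```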
plan matches the actual production order. If we call parts consecutive, then they should be
processed consecutively as far as possible.
In the engine block porosity problem, the team had to specially mark the molds so
that they could find the five consecutively molded blocks after the bank face machining.
Once the blocks were poured, they were not necessarily finished and machined in pouring
order. In a process that produced brake lines, the problem was a crack in a formed flange
that led to leakage when the line was coupled to a valve. The crack was found at final
inspection. The output characteristic was the size of the crack. The team considered an
investigation to compare the part-to-part and hour-to-hour families. They abandoned this
plan because the sequence of parts changed so many times in the process. Consecutive
parts at final inspection were not processed consecutively through earlier process steps.
The team could not isolate causes that acted in the part-to-part family because of this
loss of order.
Formal Analysis
In this chapter we have described simple investigations supported by graphical displays
such as scatter plots, box plots, multivari charts, and defect concentration diagrams. Since
we are searching for a dominant cause, we can almost always draw the conclusion to elim-
inate a family without formal analysis procedures. Rarely do we need more formal methods
such as the analysis of variance (ANOVA)7 or regression analysis (see Chapter 12).
Figure 10.16 Two multivari charts of the same data with less than full extent of variation (dashed horizontal lines give the full extent of variation from the baseline).
Figures 10.2 and 10.9 showed that output variation changed by lobe and by time, respectively.
Such patterns provide valuable clues about the dominant cause. For instance, in Figure 10.9,
the difference in porosity variation of the output at hours 5 and 8 may be due to:
• A difference in the average or range of a dominant cause at different time
intervals
• A different relationship between the cause and output in different time intervals
• A combination of these two reasons8
In both the examples, the change in variation was accompanied by a change in the
average output level. To examine a situation where only the output variation changes,
consider a hypothetical example. Suppose in the V6 piston diameter example discussed
in Chapter 9, we collect 30 pistons from each stream over one shift so that we see the full
extent of variation. We plot the results in Figure 10.17.
The average diameter is roughly the same for the two streams, so stream-to-stream
differences are not a dominant cause of variation. However, stream B shows substantially
greater diameter variation than does stream A. This pattern of unequal variation tells us a
lot about the dominant cause. The dominant cause must act within the stream B family. We
have eliminated all causes in the stream-to-stream and within-stream-A families. Note that
this conclusion is more specific than the usual interpretation from stratification where we
are able to rule out either the within-stream or stream-to-stream families.
What kinds of causes can explain the behavior seen in Figure 10.17?9 Two possibilities
are that there was a worn fixture in stream B or that the stream B operator made more fre-
quent adjustments.
Figure 10.17 Box plot of diameter by Operation 270 stream from hypothetical investigation (dashed horizontal lines give the full extent of variation from the baseline).
Key Points
• Simple investigations and graphical analysis tools can provide valuable clues
about the home of a dominant cause of variation.
• We often find it useful to divide the remaining possible causes into two families
of causes such as:
–Short-time and long-time
–Group-to-group and within-group
–Upstream and downstream
–Assembly and components
and then carry out a simple investigation to eliminate one family.
• To avoid study and sample errors, we must observe the full extent of variation in
the investigation.
1. If we have two families and stratified data, we can use analysis of variance (ANOVA) to
partition the overall standard deviation into two pieces, one associated with each family. We
look at the details in the supplement.
2. By understanding the possible reasons for an observed pattern in the results of an investigation,
we can eliminate many suspects. In the supplement, we look more closely at possible explanations
for the changing average and variation pattern seen in figures 10.2, 10.9, and 10.17.
3. We identify a dominant cause primarily using graphical displays. We do not use a formal
hypothesis test. In the chapter supplement we elaborate on why hypothesis tests are not
appropriate in the search for a dominant cause.
4. See note 2.
5. We used a simple plan to compare upstream and downstream families. We examine operations
swap and randomized sequencing, two more restrictive plans for separating these families.
6. By selecting units with extreme performance, we can compare the assembly and component
families with a very limited number of disassembly-reassembly cycles. We examine this
application of leverage in more detail in the supplement.
7. See note 1.
8. See note 2.
9. See note 2.
11
Investigations to Compare Three or More Families of Variation
Things which matter most must never be at the mercy of things which
matter least.
—Johann Wolfgang von Goethe, 1749–1832
In Chapter 10, we compared two families of causes using simple plans and analysis
tools. In this chapter, we consider investigations for simultaneously comparing three or
more families to identify the home of a dominant cause of variation. We also consider
the repeated application of investigations based on two families.
Family Description
Mold-to-mold Causes that explain differences in weight from one mold to the other
Cavity-to-cavity Causes that explain differences in weight among the cavities
Part-to-part Causes that explain differences in weight among consecutive parts
Hour-to-hour Causes that explain differences in weight from one hour to the next
[Figure: two molds, each with cavities A, B, C, and D, sampled repeatedly within each hour (Hour I, Hour II, ...).]
To isolate the effects of each family, we sample from the process in a systematic manner.
We include parts in the sample that are produced consecutively and over hours from all
molds and cavities. For example, we might select and measure five consecutive parts from
each mold and cavity (40 parts in total) every hour for several hours.
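The arithmetic of this plan is easy to check by enumeration; as an illustration (a Python sketch, not part of the book's MINITAB-based workflow):

```python
# One hour of the proposed plan: five consecutive parts from each
# combination of mold (1-2) and cavity (A-D) -- 40 parts per hour.
sample = [(mold, cavity, part)
          for mold in (1, 2)
          for cavity in "ABCD"
          for part in range(1, 6)]
print(len(sample))  # -> 40
```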
With the proposed sampling plan, we can also detect interactions between the families.
For example, we can see if mold-to-mold differences change from hour to hour.
We call this a multivari investigation (Seder 1950a, 1950b, 1990). See also Zaciewski and
Nemeth (1995) and Snee (2001). We design the sampling protocol to see the variation due to the
causes within each family of interest. We must be able to trace the parts according to the order and
location in which they are produced. In the example, each part was labeled by cavity and mold. We
may have more difficulty finding parts that were molded consecutively at the end of the process.
We use multivari charts (modified run charts) to display the results of a multivari investi-
gation. Using MINITAB, we can choose to display the effects of up to four families on the
multivari chart at the same time. Since charts with many families are difficult to interpret, we
prefer to use a series of charts, each with fewer families. We can use analysis of variance
(ANOVA) to quantify the effects of each family. ANOVA is useful when we cannot easily
determine the dominant family from the multivari charts.1
We expect the causes in some families to act systematically. For example, if the domi-
nant cause acts in the cavity-to-cavity family, we will see large differences in the average
output across the cavities. In other families such as the part-to-part, we expect the causes to
act haphazardly. We would be surprised to see large differences between the output averages
(averaged over all cavities, times, and molds) of the first and second part sampled in different
hours.2 The effects of causes that change from part to part will likely produce haphazard
variation. We need to plot the data carefully to avoid masking this type of variation.
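To make the systematic/haphazard distinction concrete, here is a small sketch in Python with made-up weights (illustrative only; the book's analyses use MINITAB). A dominant cavity-to-cavity cause shows up as spread among the cavity averages, while the part-to-part family appears as spread within each cavity:

```python
import statistics

# Made-up weights (grams) for five consecutive parts from each of
# cavities A-D; illustrative values, not the book's data.
weights = {
    "A": [10.2, 10.3, 10.1, 10.2, 10.4],
    "B": [11.8, 11.9, 11.7, 12.0, 11.8],
    "C": [10.9, 11.0, 10.8, 11.1, 10.9],
    "D": [9.5, 9.6, 9.4, 9.5, 9.7],
}

cavity_means = {c: statistics.mean(v) for c, v in weights.items()}
# Spread of the cavity averages: the systematic cavity-to-cavity effect.
between_sd = statistics.stdev(cavity_means.values())
# Average within-cavity spread: the haphazard part-to-part effect.
within_sd = statistics.mean(statistics.stdev(v) for v in weights.values())
```

With these numbers, between_sd is several times within_sd, pointing to the cavity-to-cavity family; ANOVA formalizes the same comparison.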
We provide three examples of multivari investigations to illustrate the sampling plan,
the presentation of the data, and the interpretation of the results.
[Figure: plot of side shift by day (days 1 to 5), with the excessive side shift indicated.]
In order to assess the variation contributed by each of these families, they used the
following sampling protocol that required tracking and measuring 96 parts:
1. Sample and measure four parts cast consecutively from each pattern.
2. Repeat step 1 for castings poured at hours one, three, and five of the day shift.
3. Repeat step 2 on two consecutive days.
With this sampling protocol, the team could isolate the contributions to the baseline
variation from each family in the diagnostic tree. For example, sampling four consecutive
parts showed whether the variation was dominated by a cause that changed quickly from
one part to the next, such as the relative positioning of the two halves of the mold onto the
guiding pins and bushings.
From the baseline investigation, the team expected the full extent of variation to occur
within the two days. The major difficulty in implementing the plan was tracing castings
molded consecutively. The team marked the sand molds so that they could find these castings
at the end of the shakeout and cleaning process.
Side and end shift for the 96 measured parts are stored in the file cylinder head scrap
multivari. For side shift, we see close to the full extent of variation in the histogram shown
in Figure 11.4, so we know the dominant cause acted during the investigation.
We look at a variety of plots. To simplify the data display, we start by creating a new input
called time that combines hour and day. See the section on multivari charts in Appendix C.
In Figure 11.5, the lines join the average side shift at the six times and the dots give the
16 side shift values at each time. We recommend that you always select the option (unfor-
tunately not the default) to plot the individual values on MINITAB multivari charts. From
Figure 11.5, we see:
• The time averages are almost constant; that is, there are no systematic differences
in the side shift between days or hour to hour within days.
• The variation of side shift within each time is close to the full extent of variation.
Figure 11.4 Histogram of side shift values from the multivari investigation (full extent of
variation in the baseline given by dashed lines).
The observed pattern cannot be explained by a cause that changes slowly from time to
time, so we are tempted to eliminate the hour-to-hour and day-to-day families. We wait to
do so until we look for possible interactions with the other families.
In the left panel of Figure 11.6, we look at a multivari chart with side shift stratified by
pattern. From the chart, we see large systematic differences among the patterns. The variation
within each pattern is only about 60% of the full extent of variation. The differences in the
pattern averages contribute substantially to the baseline variation. We conclude that the
pattern-to-pattern family contains a dominant cause.
To deal with the part-to-part family, we need to recognize that variation from consecu-
tive molds is likely to be haphazard. We create a new input called group with a different
value for each combination of the four patterns and six times. There are four side shift values
within each of the 24 groups corresponding to the consecutive castings in that group.
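The book constructs this group column in MINITAB; as an illustrative sketch in Python (one row per casting, ordered as sampled):

```python
# One row per casting: four patterns, six times, four consecutive
# castings per combination (96 castings, matching the sampling plan).
rows = [(pattern, time)
        for pattern in range(1, 5)
        for time in range(1, 7)
        for _ in range(4)]
# A distinct group number for each pattern x time combination:
# groups 1-6 are pattern 1 at times 1-6, groups 7-12 pattern 2, etc.
group = [(pattern - 1) * 6 + time for pattern, time in rows]
```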
Figure 11.5 Multivari chart of side shift by time.
Figure 11.6 Multivari charts for side shift versus pattern on left, side shift versus group
on right.
In the multivari chart in the right panel of Figure 11.6, the first six groups correspond
to the six time points for the first pattern. The next six correspond to the second pattern, and
so on. The chart shows that the variation within each group, as represented by the four dots,
is small relative to the baseline variation. Alternatively, we see that the side shift averages
vary substantially from group to group. The pattern-to-pattern family is included in the
group-to-group family, which explains the group-to-group differences in side shift average
seen in the right panel of Figure 11.6. We also see that the within-group variation is roughly
constant over all sampling periods, which means that there is no evidence of interaction
between the part-to-part family and the other families of causes. We eliminate the part-to-part
family from further consideration.3
We now look for possible interactions with the pattern-to-pattern family by constructing
multivari charts with pattern and time. In Figure 11.7, we show the side shift averages and
individual values for each time and pattern. The effects of the time family are small for each
pattern, so there is no evidence of an interaction involving the time family.
Figure 11.7 Multivari chart of side shift versus pattern and time.
Camshaft Journal Diameter
In Chapter 7, we introduced a project to reduce scrap and rework due to excess variation in
the diameter of camshaft journals. At final 100% inspection, the four journal diameters
(the maximum diameter as the part was rotated) were measured at two locations, front and
rear. The diameter specifications were ± 12.5 microns, measured from a target value. There
were also specifications on out-of-round and taper for each journal. Camshafts that did not
meet the specifications at the final gage were sent to a rework station. There the parts were
remeasured and either scrapped or reworked. At the beginning of the project, the average
monthly reject rate at the final gage was 4.7% of which 1.5% was scrap. Management set
a goal to cut both these rates in half.
The team focused on journal diameter variation. There was no data storage at the final
inspection gage. To establish a baseline, the team had two or three camshafts per hour set
aside before being measured at the final gage to get a sample of 20 parts per day. At the end
of the day, an operator measured and recorded the diameters for the four journals and two
locations on each camshaft. This process was repeated for five days. The 800 measured
values (from 100 camshafts) are stored in the file camshaft journal diameter baseline. The
average diameter was 2.41 microns, the standard deviation was 5.00 microns, and the
corresponding capability (Ppk) was 0.67. The operators had centered the process above zero
to avoid scrap at the expense of rework. There were six parts needing rework and one
scrapped in the baseline sample. The run chart of diameter over time did not show any clear
patterns. The full extent of variation was –12.5 to 17.5 microns. The team set a goal to
improve the process capability (Ppk) to more than 1.20.
The investigation of the final gage, described in Chapter 7, revealed that there was a sig-
nificant bias in the head that measured the front diameter on the first journal. The team
removed this bias with the expectation that the process variation would be reduced. They
decided to repeat the baseline investigation before proceeding. The data are stored in the file
camshaft journal diameter baseline2 and the performance is summarized in Figure 11.8. The
average, standard deviation, and Ppk for the second baseline investigation were 2.55, 4.53,
and 0.73, respectively. Five out of the 100 parts required rework because of oversize
diameters. The full extent of variation in the new baseline was –11.8 to 15.9 microns.
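The capability values quoted for the two baselines follow directly from the summary statistics; a short check (a Python sketch using the usual Ppk definition with the ±12.5-micron specification):

```python
# Ppk = min(USL - mean, mean - LSL) / (3 * sd), with diameters
# measured from target, so LSL = -12.5 and USL = +12.5 microns.
LSL, USL = -12.5, 12.5

def ppk(mean, sd):
    # The worse of the two one-sided capability ratios.
    return min(USL - mean, mean - LSL) / (3 * sd)

print(round(ppk(2.41, 5.00), 2))  # first baseline  -> 0.67
print(round(ppk(2.55, 4.53), 2))  # second baseline -> 0.73
```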
Before choosing a working approach, the team decided to look for a dominant cause of
journal diameter variation. We give a high-level process map in Figure 11.9.
Figure 11.8 Histogram of diameter from the second baseline investigation.
Figure 11.9 High-level process map: parts from grinder A or grinder B pass through
lapper 1 and lapper 2 to the final gage.
Blank parts arrived in batches from a supplier and were haphazardly assigned to one of
the two grinders. The grinders used a feedback control system in which 1 part in 10 was
measured (journal 1 only) and the grinders were adjusted if the diameter was out of specifi-
cation. The within-process specification limits at the grinder were set based on the assumption
that the lappers would reduce the diameter by 24 microns. Each grinder had its own gage.
The team planned a multivari investigation to examine the following families:
Family Description
Position-to-position (within-part) Causes that explain differences in diameter from one position to the other within the same camshaft
Grinder-to-grinder Causes that explain differences in diameter between grinders
Part-to-part Causes that explain differences in diameter among consecutive camshafts
Hour-to-hour Causes that explain differences in diameter from one hour to the next
Batch-to-batch Causes that explain differences in diameter from one batch of incoming blanks to the next
They decided to use three batches (about one day’s production) of blank parts and to
sample parts once every two hours. They selected five camshafts from each grinder within
an adjustment cycle (part numbers 1, 3, 5, 7, and 9). These parts were specially marked so
that they could be found at the final gage. They planned to collect 120 camshafts and make
960 diameter measurements in total. The team carried out the plan without any difficulties.
The data are recorded in the file camshaft journal diameter multivari. We see the full
extent of variation in the histogram shown in Figure 11.10. We know the dominant cause
acted during the investigation.
Figure 11.10 Diameter variation in the multivari (dashed lines give the full extent of
variation).
We give the multivari charts for position, grinder, and the combined hour and batch
families in Figure 11.11. We expect these families to show systematic differences in the
averages if they contain a dominant cause. There is a large difference between the two
grinder averages indicating a possible dominant cause. There is no evidence of a dominant
cause acting in the other families.
Since a dominant cause appears to act in the grinder-to-grinder family, we look for an inter-
action with the position or batch/hour families using the multivari charts in Figure 11.12.
Since the effect of grinder does not appear to depend on position or batch/hour, we see no evi-
dence of any interactions.
We expect the part-to-part variation to be haphazard.4 To display the magnitude of
this variation, we create a new input called group that indexes the 192 sampling points
(8 positions by 2 grinders by 3 batches by 4 hours). Since there appears to be a large
cause in the grinder-to-grinder family, we define the group input so that the first 96 values
correspond to grinder A and the second 96 values correspond to grinder B.
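Ordering the group index by grinder first keeps the grinder effect visible on the chart. A sketch of the indexing in Python (the numeric encoding of the eight positions is an assumption for illustration):

```python
# 192 sampling points: 2 grinders x 3 batches x 4 sampling times x
# 8 positions, ordered so that grinder A occupies groups 1-96 and
# grinder B occupies groups 97-192.
points = [(grinder, batch, hour, position)
          for grinder in ("A", "B")
          for batch in (1, 2, 3)
          for hour in (1, 3, 5, 7)        # sampled every two hours
          for position in range(1, 9)]    # stands in for 1f, 1r, ..., 4r
group = {pt: i + 1 for i, pt in enumerate(points)}
```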
We give the multivari chart of diameter by group in Figure 11.13.
This chart is difficult to interpret because there are so many groups. However, we can
see that the variation around the averages is large, which indicates substantial part-to-part
variation. The magnitude of this variation is roughly the same for each group of consecu-
tive diameter measurements at a particular position and from grinder to grinder, which indi-
cates a lack of interaction. We find ANOVA useful here to quantify the effects and augment
the multivari charts.5 We conclude there are dominant causes within the grinder-to-grinder
and part-to-part families.
The team concentrated their efforts on these two families in the search for the dominant
causes. The team was surprised by the grinder-to-grinder differences. They had not seen this
difference in regular production data because normally there was no traceability to the
grinder.
Figure 11.11 Multivari charts of diameter by position, by grinder, and by hour and batch.
Figure 11.12 Multivari charts of diameter by grinder versus position and batch/hour.
Figure 11.13 Multivari chart of diameter by group.
Fascia Cratering
Management assigned a team to reduce scrap and rework on front and rear fascias because
of cratering. This was a new product, and during the startup period the reject rate was 8.9%.
The goal was to reduce this rate to less than 1%. The crater defects were not visible directly
after molding but they could be clearly observed after the fascia was primed. The crater
rejects occurred over all shifts and were haphazardly distributed over time. The team developed
and checked a measurement system to count the number of craters after priming.
In a baseline investigation, the team found that about 80% of the fascias had no craters,
10% had fewer than 25, and the remaining 10% had more than 25 craters. They decided to
search for a dominant cause of the craters. From the baseline sample, the team constructed
a defect concentration diagram that revealed that 75% of the craters were located in the
front half of the fascia. However, the team could not see how to use this knowledge to rule
out any family of causes.
The fascias were produced on two different molding machines. The team planned a mul-
tivari investigation to examine the fascia-to-fascia, machine-to-machine, and time-to-time
families. They sampled five consecutive fascias from each of the two machines every four
hours. They continued the sampling for three shifts (24 hours) resulting in a sample of 60
fascias. To keep track of the consecutive parts and machine numbers, they marked each fascia
on the inside immediately after molding.
Figure 11.14 Multivari chart from first fascia cratering multivari investigation.
The data are given in fascia cratering multivari. The distribution of the number of
craters matches the baseline closely. To look at the fascia-to-fascia (part-to-part) family, we
create a new input group, with a different value for each combination of machine and time.
We give multivari charts for the time-to-time, machine-to-machine, and fascia-to-fascia
families in Figure 11.14.
There is no evidence that the dominant cause acts in the machine-to-machine or
time-to-time families. There is large fascia-to-fascia variation in some of the groups and
little in others.
An observant team member noticed that there were only two values plotted for each
group of five fascias. When the team examined the data more closely, they saw that there
was only one fascia in each group with a value different from 0. This was a very strong clue
about the dominant cause because the team could only think of one process step where
something happened only once in every five parts. The operator sprayed mold release every
10 pieces. After five pieces he applied a minor spray. Mold release was used to prevent tear-
ing when the fascia was removed from the mold.
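The clue can be checked mechanically: count, within each group of five consecutive fascias, how many have a nonzero crater count. A sketch with made-up counts (Python; not the book's data file):

```python
# Made-up crater counts for three groups of five consecutive fascias,
# showing the pattern the team noticed: at most one nonzero per group.
groups = [
    [0, 0, 42, 0, 0],
    [0, 17, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
nonzero = [sum(1 for c in g if c > 0) for g in groups]
print(nonzero)  # -> [1, 1, 0]
```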
In a follow-up investigation, the team sampled 10 consecutive fascias in the molding
sequence at three different times in the morning shift. The data are given in the file fascia
cratering multivari2. The team confirmed mold release as the dominant cause of cratering
as shown in the Figure 11.15. With knowledge of the dominant cause, the team needed to
choose a working approach. We continue the story in Chapter 14.
Figure 11.15 Multivari chart from second fascia cratering multivari investigation.
Note that in this example the fascia-to-fascia family of causes results in systematic varia-
tion and not haphazard variation as in most applications where we have a part-to-part family.
Question
For the current process, which of the selected families is home to a dominant cause?
Plan
• Specify a study population long enough to capture the full extent of variation.
• Select a sample using a systematic sampling protocol designed to capture the variation due
to the chosen families.
• Spread out over the study population.
Data
Record the output value and the corresponding families, one output per row.
Analysis
• Plot a histogram of the data. Check that the range of output values covers most of the full
extent of variation.
• For families likely to have a systematic effect, construct the single-family multivari charts, plotting
individual output values.
• For the part-to-part family (or other families expected to have a haphazard effect), create a new
input group so that the effect of the family is seen in the variation within each value of group.
Construct the multivari chart with group.
• For each family identified as having a large effect, construct two-family multivari charts to
look for evidence of interaction.
Conclusion
• Identify the dominant family (or families if the dominant cause acts in two or more families).
Comments
A multivari investigation is a powerful tool for assessing the contribution of families defined
based on time or location.
In planning a multivari investigation, keep the number of families to five or fewer so that
the sampling plan and analysis are not too complex. The camshaft journal diameter multivari
illustrates the difficulties.
In a multivari investigation, we do not use random sampling; instead we select parts
deliberately to estimate the effects of the specified families. We may need to sample addi-
tional parts beyond those collected for the normal control or monitoring of the process.
For any family in a multivari investigation, we need traceability. In the cylinder head
example, we were interested in the mold-to-mold family, that is, variation in consecutive
pieces from the same pattern. To assess this family, we must be able to identify the heads pro-
duced consecutively from each pattern when we measure the output. If we cannot trace parts
through the process easily, then for time-based families, we need first-in, first-out discipline at
each process operation so that the time sequence is not lost when we select parts at the end of
the process.
Multivari investigations are not effective for binary output characteristics unless the
rate of defectives is high. In the block leaker project, introduced in Chapter 1, the baseline
rate of leaking blocks was 2.2%. The team initially conducted a multivari investigation in
which five consecutively poured blocks were tracked through the finishing process every
two hours for several days. After 160 blocks, they found only one leaker. There was no useful
information in the multivari investigation for this rare defect.
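The block leaker numbers show why: with a 2.2% defect rate, 160 blocks carry very little information. A quick check (Python) of the expected count and the chance of seeing at most one leaker, assuming blocks leak independently:

```python
import math

n, p = 160, 0.022                  # sample size and baseline leak rate
expected = n * p                   # about 3.5 leakers expected
# Binomial probability of observing at most one leaker.
p_at_most_one = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
                    for k in (0, 1))
```

Finding a single leaker, while below expectation, is not even strongly unusual, so the multivari chart has almost nothing to display.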
Sometimes the recommended analysis process fails because we do not find a dominant
family when we look at one family at a time. In this case, we recommend constructing all
two-family multivari charts and looking for interactions.
If the multivari charts fail to reveal the dominant family, we suggest a formal analysis of
variance (ANOVA) to quantify the relative contributions of each family.6 We may then decide
to search for causes within one or more of the families with the largest contributions, or we
may abandon the partition of causes in these families and start over with a new set of families.
Most manufacturing processes consist of several steps. We can form families of causes
based on inputs that change within each of these steps. In Chapter 10 we looked at some
examples where the process was split into two families, causes acting upstream and down-
stream of a particular point in the process. Here we consider applying this idea repeatedly.
We consider two types of plans:
• Measure the output characteristic on the same parts after each processing step.
• Allow some parts to skip processing steps and measure the final output
characteristic.
The first type of plan can be applied more commonly than the second. Consider the following
examples.
V6 Piston Diameter
In Chapter 5, we described a problem to reduce variation in the diameters of machined alu-
minum pistons. In the baseline investigation, the final diameter after Operation 310 varied
between 581 and 601 microns, measured from a fixed value. In searching for a dominant
cause, the project team decided to define families based on a number of processing steps as
shown in Figure 11.16. The processing steps in Figure 11.16 include all those that affect
diameter.
To eliminate as many processing steps as possible, the team tracked 96 pistons through
the process, measuring the diameter after Operation 200 and after each of the steps shown
in the process map. In total they measured each piston diameter six times. The team carried
out the investigation over three days, 32 pistons at a time. During this period, the process
operators collected the marked pistons for measurement after each operation.
From the baseline investigation, the team expected that they would see close to the full
extent of variation over the three days. They decided to use only 32 pistons each day to
limit the disruption to the normal operations. The team used the in-process gages for the
measurements after Operation 200 to Operation 260 and the Operation 310 gage for the final
three measurements.
Figure 11.16 Process steps affecting diameter: Operation 210, Operation 260, Operation 270,
Operation 290, and Operation 310.
Figure 11.17 Scatter plots of outgoing versus ingoing diameter (by operation; dashed
horizontal lines give the full extent of variation).
The team converted all measurements, given in the file V6 piston diameter variation
transmission, to the same units as at the final gage. The team produced the scatter plots of
the incoming and outgoing diameter for each operation, as shown in Figure 11.17. They
added horizontal lines to show the full extent of variation to the plot that shows the final
diameter.
We interpret the scatter plots in Figure 11.17 starting from the end of the process. From
the top left diagram, we see the full extent of variation in the final output. We also see that
most of the variation in the final diameter is transmitted through Operation 310. If we could
eliminate the variation in the diameter after Operation 290 so that the output is, say, 595,
then from the graph we see there would be very little variation in the diameter after Operation
310. That is, little diameter variation is added in Operation 310; most of the variation is trans-
mitted from upstream. We can eliminate Operation 310 as the home of a dominant cause. From
the top right plot, we see similarly that most of the variation in the diameter after Operation
290 is due to variation at Operation 270 and, hence, we can also eliminate Operation 290.
The left plot in the second row of Figure 11.17 has a different pattern. At Operation 270
there is substantial variation added to the process. If the diameters after Operation 260 were
fixed at, say, 620, there would still be substantial variation after Operation 270. Hence we
eliminate all operations upstream of Operation 270 and conclude the dominant cause of the
variation in the final diameter is in the Operation 270 family. Next, the team focused their
efforts on Operation 270 to isolate the dominant cause of variation.
In this example, it was possible to measure the diameter of each piston at various
points throughout the process. The pistons were traceable through the process by imple-
menting special measures during the investigation. Also, in this example, the team measured
the diameter after each operation in a single investigation. They could have adopted
another strategy and, for example, measured the diameter after Operation 270 and after
Operation 310 only. Then, they would repeat the investigation, splitting the remaining
family. This way they would compare only two families at a time, as in Chapter 10. In the
example described here, it was easier to track the pistons through the process in a single
investigation.
We can use regression analysis to quantify the amount of variation added and transmitted
at each stage if the pictures do not tell a clear story.7
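The regression view can be sketched as follows (Python, with illustrative data, not the book's piston measurements): the fitted slope times the ingoing standard deviation estimates the transmitted variation, and the residual standard deviation estimates the variation added by the step.

```python
import statistics

# Illustrative ingoing (x) and outgoing (y) diameters for one
# operation, in microns from a fixed value.
x = [592.1, 595.3, 598.7, 590.4, 596.8, 594.2, 599.1, 591.6]
y = [593.0, 596.1, 599.9, 591.0, 597.5, 595.1, 600.2, 592.3]

mx, my = statistics.mean(x), statistics.mean(y)
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
resid = [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

transmitted_sd = abs(slope) * statistics.stdev(x)  # passed through the step
added_sd = statistics.stdev(resid)                 # added by the step
```

Here transmitted_sd is much larger than added_sd, so this step mostly transmits upstream variation and can be eliminated as the home of the dominant cause.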
The team decided to search for a dominant cause by first isolating where in the process the
defect occurred. They divided the causes into five families corresponding to the major pro-
cessing steps.
In the first investigation, the team masked three panels until after the e-coat step. Three
other panels were processed normally. The defect occurred on all six panels. The team
eliminated the incoming metal, the phosphate step, and the e-coat step from consideration.
In a second investigation, they processed another six panels normally up to the end of the
e-coat step. Three of the panels were processed through the primer stage but not the color
coat. The other three were not primed but color coated as normal. Clear coat was applied to
all the panels. Again the defect was present in all six panels. The team concluded that the
primer and color coat steps could be eliminated and hence the clear coat process was the
home of the dominant cause. This was not surprising given the first solution to the problem.
The two major substeps within the clear coat process were paint application and oven
cure. To split these two steps, three panels were removed from the test car after painting and
were cured in a laboratory oven. None showed the defect. The remaining three panels were
processed normally and as expected, all showed the defect. The dominant cause lived in the
clear coat cure oven. The team continued to split the process using similar trials. Eventually,
they isolated the airflow from exhaust ducts within a particular zone of the oven as the dom-
inant cause of the line defect.
The team then carefully tested a minor modification to the ducts that would change the
airflow pattern over the roof of painted vehicles. Twenty vehicles were painted in the modi-
fied process with high clear coat film build. There were no line defects and no other notice-
able negative side effects. The change was made permanent. The clear coat film build was
increased with a marked improvement in appearance. There were no occurrences of the line
defect in the new process.
In this problem, the team repeatedly split the process and homed in on the dominant
cause. Because of the nature of the process and the defect, they could skip or alter steps of
the process and still produce painted panels with or without the defect. This would not be
possible in many processes.
Question
For the current process, which process step is the home of a dominant cause?
Plan
• Specify a time frame over which we expect to see the full extent of variation in the output.
• After the first process step, select a sample of 30 or more parts spread over the time frame.
• Measure the characteristic corresponding to the output on each part in the sample after each
process step.
Data
Record the measured output values, one row for each part.
Analysis
• For each process step (after the first), plot the output after the step versus the output before
that step.
• For the plot showing the output from the final process step, add horizontal lines showing the
full extent of variation.
Conclusion
• If the full extent of variation is not observed, the process step containing the dominant cause
cannot be identified.
• Starting at the last process step, eliminate any step where most of the variation is transmitted.
The first process step where the variation is not transmitted is the home of the dominant cause.
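The elimination logic above can be sketched as a backward scan over per-step summaries (Python; the step names match the piston example, but the numbers and cutoff are illustrative, with the residual standard deviation from each step's regression standing in for the variation the step adds):

```python
# Walk from the last step backward; a step that mostly transmits
# variation (adds little) is eliminated. The first step that adds a
# dominant share of the variation is the home of the dominant cause.
steps = ["OP210", "OP260", "OP270", "OP290", "OP310"]
added_sd = {"OP210": 1.1, "OP260": 0.9, "OP270": 4.8,
            "OP290": 0.6, "OP310": 0.5}
dominant_sd = 3.0                    # illustrative cutoff
home = next(step for step in reversed(steps)
            if added_sd[step] > dominant_sd)
print(home)  # -> OP270
```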
Headrest Failure
Management initiated a project to reduce the frequency of customer complaints about seat
headrests that would not stay in a set position. In focusing the problem, the team demon-
strated that the problem could be solved by eliminating headrests that required a force of
less than 35 Newtons (N) to move through their full range of motion.
The team carried out a baseline investigation. To meet their goal, they wanted to reduce
variation in force rather than increase the average force to avoid headrests that would be
very hard to move; that is, they wanted to change only those headrests with low output.
They assessed the force measurement system and found that there was little contribution
from this system to the baseline variation. They decided to search for a dominant cause of
force variation.
The team selected two seat assemblies, one with high force (70.3 N) labeled H, and one
with low force (25.8 N) labeled L, relative to the baseline. The seat assembly consists of a
headrest and a seat. To assess the assembly process (as in Chapter 10), the team removed
and reinstalled each headrest three times in its original seat. The force was measured after
each reinstallation. The data are:
Because there were only small changes after reinstallation, the team ruled out the
assembly process and concentrated on the components. Next, they swapped the two headrests
between the seat assemblies and measured the force. For this and subsequent component
swaps we display the data in a table where:
• Diagonal cells contain the force measurements from the assembly-reassembly
phase.
• Off-diagonal cells contain the force measurements after the components are
swapped.
For the example the data are:
Headrest
Seat H L
H 70.3, 72.5, 74.2, 71.9 74.6
L 26.7 25.8, 25.9, 24.3, 25.4
Changing the headrest does not change the force. We say the performance follows the
seat, since the seat from the original high (low) force assembly results in high (low) force
even with a different headrest. The team eliminated the headrest family and looked at the
seat in more detail.
The seat was assembled from three components: guides, springs, and the actual seat.
The team disassembled and reassembled the two seats into the three components three
times and measured the force. The data are shown in the diagonal cells of the following
table. The team eliminated the assembly process for the three components as the home of the
dominant cause. During the seat disassembly-reassembly investigation, the team noticed
differences in the shape of the springs in the two assemblies. They next swapped the
springs in the two seat assemblies and measured the force. The data are given in the table.
Spring
Other components H L
The dominant cause acts in the spring family of causes. The diagnostic tree in Figure 11.19
summarizes the search for a dominant cause.
[Figure 11.19: diagnostic tree. Excess headrest force variation splits into the measurement
system and the rest of the process; the rest of the process into headrest and seat; the seat
into assembly and components; and the components into springs, guides, and headrest hardware.]
The team worked with the supplier to understand the differences in the springs. They
discovered that the supplier had reworked the springs in the low-force seats. Together, the
supplier and customer developed a change of the rework process so that the force associated
with the reworked springs exceeded 35 N. Customer complaints about the headrest moving
were virtually eliminated.
Power Window Buzz
There were frequent complaints about a noise fault called buzz in power window regulators
(the motor and linkage that opens and closes the window). The customer, an assembly
plant, demanded that the supplier implement 100% inspection to prevent noisy regulators
from reaching the assembly plant. Management assigned a team the goal of reducing or
eliminating the defect so that the costly inspection could be removed.
Trained listeners measured buzz subjectively on a scale of 1 to 7. They convinced their
customer that if the buzz score was less than 4 on all regulators, then they could eliminate
the 100% inspection.
The team did not carry out formal baseline or measurement system investigations. They
had informal baseline information from past experience since they knew the buzz noise
level varied between 1 and 7. They could ensure the dominant cause was acting in any
investigation by selecting regulators with noise levels at the extremes. The decision not to
check the measurement system was more risky. Large differences in measured noise for
equally noisy regulators would make finding a dominant cause difficult. To address this
concern, the team used a single listener to assign the noise levels for all regulators in the
subsequent investigations.
They decided to search for the cause of the buzz. They divided the causes into two
families, assembly and components. They selected a noisy regulator (score 7) and a quiet
regulator (score 1). They could take the regulator apart without damaging or changing
any of the major components. There was no change in buzz score when they disassembled
and reassembled the two regulators three times each using the normal assembly
sequence. The team eliminated the assembly process and concentrated on the components
family.
The regulator has six components: arm one, arm two, spring, motor, back-plate, and
sector. The team felt the motor was the likely home of the dominant cause of noise.
They swapped the motors in the two regulators and measured the noise. The motor with
the components from the originally quiet regulator had a score 7 and the motor with the
other noisy components had a score 1. Indeed, the dominant cause of buzz lived in the
motor.
The motor was a purchased part and comprised 18 components that could be disas-
sembled without damage. The team formed two groups of nine components each,
labeled G1 and G2, based on their limited knowledge of how the motor worked and the
ease of disassembly. They did not know the details of the assembly process used by the
motor supplier, so they ignored the motor assembly family for the moment. They next
swapped all of the G1 components simultaneously and remeasured the buzz. The
results were:
             G1 Noisy    G1 Quiet
G2 Noisy     7           7
G2 Quiet     2           1
The entries 7 and 1 on the diagonal are the initial buzz measurements from the two original
regulators. The off-diagonal entries are the buzz measurements for the regulators with all
components in G1 swapped. Note that to measure buzz, the motors were assembled
together with the other five components from the original quiet regulator. Since all compo-
nents other than the motor had been eliminated as possible homes of the dominant cause,
the choice of housing for the motor was not important.
The buzz followed G2 so all components in G1 were eliminated from consideration.
Now G2 was split into two groups, G21 and G22, and the team found that the dominant
cause lived in G22, a group of four components: three gears in the drive train and the
armature shaft.
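The group-splitting strategy is essentially a binary search: each swap halves the set of suspect components, so a motor with 18 components is localized in a handful of swaps. A rough sketch (hypothetical part names, and assuming a single component carries the dominant cause, unlike the two-gear interaction the team eventually found):

```python
# Sketch of the component-swap search as a binary search. Part names are
# hypothetical; we assume a single-component dominant cause.

def swap_search(components, culprit):
    """Locate the culprit component; return (component, number_of_swaps)."""
    suspects = list(components)
    swaps = 0
    while len(suspects) > 1:
        g1 = suspects[:len(suspects) // 2]   # group swapped between assemblies
        swaps += 1
        # In practice this step is a physical swap plus a measurement; here
        # we simply check whether the extreme output follows the swapped group.
        if culprit in g1:
            suspects = g1                                  # output followed G1
        else:
            suspects = suspects[len(suspects) // 2:]       # output stayed with G2
    return suspects[0], swaps

motor_parts = [f"part{i}" for i in range(1, 19)]   # 18 components, as in the motor
print(swap_search(motor_parts, "part13"))          # ('part13', 4)
```

Halving keeps the number of physical swaps near log2 of the component count, which is why the chapter recommends two groups per stage.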
When the first and second gears were swapped together, the buzz only occurred for the
combination of all the G22 components from the originally noisy motor. The data were:
                         First and second gears
Gear 3 and armature      Noisy      Quiet
Noisy                    7          3
Quiet                    2          1
Now the picture is less clear. This pattern suggests a dominant cause involving one (or both) of
the first and second gears, and one (or both) of the third gear and the armature. In other words,
the dominant cause involves one or more components in each of the two remaining families.
The team returned the components to their original motor. They next simultaneously
swapped the second and third gears with the results:
                           Second and third gears
First gear and armature    Noisy      Quiet
Noisy                      7          1
Quiet                      7          1
The buzz followed the second and third gears together. The dominant cause of buzz is an
interaction between these two gears. Swapping only one of these two gears does not pro-
duce a complete switch in the buzz. We show the diagnostic tree for the search to this point
in Figure 11.20.
[Figure 11.20: partial diagnostic tree. Regulator buzz splits into regulator assembly and
regulator components.]
The team focused their search for the dominant cause on the second and third gears.
They approached the motor supplier with their findings. The supplier provided a new qui-
eter type of motor to solve the problem. Given that this solution does not require an under-
standing of the cause of motor noise, the team would have been better off talking with the
supplier much earlier.
             G1 Noisy      G1 Quiet
G2 Noisy     7, 6, 7, 6
G2 Quiet                   1, 2, 2, 1
The first numbers in the diagonal cells correspond to the original values of the extreme
assemblies that we selected for the investigation. When we repeatedly disassemble and
reassemble, we see the variation due to the assembly process. In this case, we eliminate the
assembly process as the home of the dominant cause.
We obtain the off-diagonal elements by swapping the components in G1 between the
two assemblies. If the performance follows one of the two groups of components, the inter-
pretation of the results is straightforward. For example, we might have the data:
             G1 Noisy      G1 Quiet                G1 Noisy      G1 Quiet
G2 Noisy     7, 6, 7, 6    1           G2 Noisy    7, 6, 7, 6    7
G2 Quiet     7             1, 2, 2, 1  G2 Quiet    2             1, 2, 2, 1

In the left table, G1 is the home of the dominant cause; in the right table, the dominant
cause acts in G2. In other cases the performance does not follow either group. For instance,
consider the following four possible results:
             G1 Noisy      G1 Quiet                G1 Noisy      G1 Quiet
G2 Noisy     7, 6, 7, 6    1           G2 Noisy    7, 6, 7, 6    4
G2 Quiet     1             1, 2, 2, 1  G2 Quiet    4             1, 2, 2, 1

             G1 Noisy      G1 Quiet                G1 Noisy      G1 Quiet
G2 Noisy     7, 6, 7, 6    7           G2 Noisy    7, 6, 7, 6    4
G2 Quiet     4             1, 2, 2, 1  G2 Quiet    1             1, 2, 2, 1
In all of these cases the dominant cause involves (at least) one component from both G1 and
G2. We need to re-form the groups to determine where the dominant cause lies. We discuss
the situation where a dominant cause involves two or more components in more detail later
in this section. We also explore the general issue of dealing with a dominant cause involving
two (or more) inputs in Chapter 14.
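These interpretation rules can be stated mechanically. The sketch below is our own hypothetical helper (the quarter-range "closeness" threshold is an arbitrary choice, not from the book); it classifies a swap from the two off-diagonal cells, given the original high and low outputs:

```python
# Hypothetical classifier for a component-swap two-way table. Inputs are the
# two off-diagonal cells; "high" and "low" are the original extreme outputs.
# The quarter-range closeness threshold is our own arbitrary choice.

def classify_swap(high_assembly_after_swap, low_assembly_after_swap, high=7, low=1):
    def near(value, ref):
        return abs(value - ref) <= (high - low) / 4

    if near(high_assembly_after_swap, low) and near(low_assembly_after_swap, high):
        return "follows swapped group (dominant cause in G1)"
    if near(high_assembly_after_swap, high) and near(low_assembly_after_swap, low):
        return "stays with remainder (dominant cause in G2)"
    return "involves components in both groups"

print(classify_swap(1, 7))   # follows swapped group (dominant cause in G1)
print(classify_swap(7, 2))   # stays with remainder (dominant cause in G2)
print(classify_swap(4, 4))   # involves components in both groups
```

The three return values correspond to the three cases above: output follows G1, output stays with G2, or the dominant cause involves components from both groups.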
Summary
Suppose we have:
• Two assemblies with opposite and extreme output relative to the full extent of
variation
• Assemblies that can be disassembled and reassembled without damage
• Elimination of the assembly family as the possible home of a dominant cause
(see Chapter 10)
We search for the component that is the home of the dominant cause with a series of
small experiments where we divide the components into two groups and swap one group.
We analyze the data by looking at two-way tables. We assume that a dominant cause lives
with one or at most a pair of components. If we swap this component or pair, the output will
move across most of its full extent of variation.
The component swap procedure has two parts. In the first we assume the dominant
cause is in a single component family.
[Diagram residue: the original figure showed the possible first-swap outcomes in terms of
G1 and G2 at their low and high levels, and how the resulting output pattern identifies the
group containing the dominant cause.]
The procedure is more complicated if the dominant cause involves two components.
This procedure will fail if there is no dominant cause or if the dominant cause involves
three or more components.
Comments
A key requirement for component swap plans is that the product can be disassembled and
reassembled without damaging the parts. To avoid study error, the reassembly process
should match the assembly process in normal production as closely as possible. Once we
have ruled out the assembly process, we carry out the component swap investigation offline
to avoid interference with regular production.
By exploiting leverage, we use only two assemblies chosen to reflect the full extent
of variation. There is a risk we may select assemblies that are extreme due to a different
failure mode, and hence with a different dominant cause, than that of the problem we are
trying to address. To alleviate this risk we recommend confirming the conclusion with an
experiment that uses several extreme assemblies. For instance, to confirm the conclusion
in the headrest failure example, we can put good springs in five seat assemblies that orig-
inally required less than 35 Newtons force to move through their full range of motion.
If the new springs increase the force above 35 Newtons, we confirm the spring as the
dominant cause.
Since all components that make up the housing have been eliminated as a possible home
of the dominant cause, we can use a single housing for further disassembly-reassembly and
component swap investigations. We employed this idea in the power window buzz noise
example. Note that at the start of an investigation there is often no housing because all
components are suspect.
In the recommended component swap procedure, we assume the assembly family has
been eliminated using an investigation where we disassembled and reassembled down to the
individual components. One advantage of this plan is that we can use the production assembly
process. Alternatively, suppose we divide the assemblies into a number of subassemblies
(groups of components). In that case, we start by disassembling and reassembling down to
the subassembly level. Then, if this assembly process is eliminated, we apply the component
swap procedure to the subassemblies. A complication arises if the proposed component swap
procedure indicates the dominant cause is an interaction between two groups of sub-
assemblies. This interaction could be due to either components or the assembly process for
the subassemblies.
To illustrate, consider the window buzz noise example. The team found the dominant
cause acted in the motor, which had 18 individual components that could be grouped into
two subassemblies. Suppose the team started by disassembling and reassembling each
extreme motor into the two subassemblies and that this part of the assembly family was
eliminated. Next, the team would swap one of subassemblies between the two motors.
Suppose the dominant cause was found to act within one of the subassemblies. Then the
next step would be to check the assembly family for that subassembly before proceeding to
swap components from the identified subassembly.
There are many other component swapping plans.8 For example, we can use more than
two groups of components at each stage.9 We strongly recommend dividing the
remaining components into only two groups at each stage to keep the procedure simple and
to maximize the number of components eliminated with each swap.
Key Points
• We use a multivari investigation to compare time and location families. We
design a systematic sampling protocol to isolate the contribution of each family
to the baseline variation.
• We use a variation transmission investigation to eliminate process step families.
We trace parts through the process and measure the output characteristic after
each process step.
• We use component swap to eliminate families related to an assembly process
and the corresponding components. We repeatedly disassemble, reassemble, and
swap components in an organized way.
12
Investigations Based on Single Causes
In this chapter, we look at investigations to search for a dominant cause based on single
causes, that is, individual varying inputs. These plans are not particularly useful early
in the search because they do not result in the elimination of families with a large num-
ber of causes. However, we can use these investigations for available data or when the fam-
ily of remaining suspects is small. With these plans, we measure the output and selected
suspects for a number of parts.
Large variation in a measurement system for an input can mask a dominant cause. Ideally,
we would assess the measurement systems used to determine these inputs—see Chapter 7 for
details. This is often not done, either because of prior experience with the measurement system
or due to lack of resources.
In this chapter, we distinguish between plans for continuous and binary outputs because
they lead to different analysis tools.
Figure 12.1 Box plots of wall thickness by block type at locations 3 and 4.
Window Leaks
Based on the results of customer surveys, the management of a truck assembly plant iden-
tified rear window water leaks as a substantial customer concern and assigned a team to
address the problem. In the plant, 10 trucks per day were leak-tested in a special chamber
that simulated extreme conditions. Using one month’s data to establish a baseline, the team
found that the rear window leak rate was 8% in the aggravated test. The team assumed that
if they reduced this rate there would be substantial reduction in customer complaints. Using
Pareto analysis by location, the team identified the upper ditch as the source of about 50%
of the leaks. They set the problem goal to eliminate upper ditch rear window leaks.
The team showed that the measurement system was acceptable for classifying trucks as
either leaking or not and for determining the location of the leak. However, the system
could not consistently measure the severity of the leak. For this reason, the team used the
measurement system only to classify a truck as a leaker or a nonleaker of the upper ditch
rear window.
The team decided on a group comparison with eight leakers and eight nonleakers. They
selected eight trucks that had failed the leak test with an upper ditch leak. Obtaining trucks that
did not leak was more difficult since there was great pressure to immediately ship any good
truck. The team found eight nonleakers from trucks that had been set aside for other problems.
They measured nine input characteristics thought to be related to water leaks. The data
are given in the file window leaks comparison. The team plotted the data by group for each
of the nine suspects. We show the result for primary seal fit in the left panel of Figure 12.2.
There was no clear separation between leakers and nonleakers for any input. They also cre-
ated scatter plots for all pairs of suspects with a different plotting symbol for leakers and
nonleakers.1 The plot for primary seal gap and quality of plastisol application, in the right
panel of Figure 12.2, shows that all leaking trucks had both a large primary seal gap and
poor plastisol application, while that combination never occurred for nonleaking trucks.
The team concluded that the dominant cause of upper ditch leaks was the combination
of a large primary seal gap and poor plastisol application. In addition, they eliminated the
other seven inputs from consideration. This conclusion was based on a small number of
trucks and required verification.
Figure 12.2 Plot of primary seal fit by group (left) and by plastisol amount (right).
Group Comparison
Question
In the current process, which, if any, of the suspects is a dominant cause?
Plan
• Select a group of at least six parts for each of the two output values.
• Measure all suspects for each part.
Data
Record the output (group) and input values, one row for each part.
Analysis
• Plot the values of each suspect stratified by group.
• If no clear dominant cause is evident, create scatter plots of all possible pairs of inputs using
different plotting symbols for each group.
Conclusion
• If there is clear separation of the suspect values between the groups, the suspect is the
dominant cause.
• If there is a separating line on a scatter plot that divides the groups, a combination of the two
suspects is the dominant cause.
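The two checks in the Analysis and Conclusion steps can be mimicked numerically. The sketch below uses illustrative values only (not the window leaks file), and the assumption that larger values mean a worse seal gap and worse plastisol application is ours:

```python
# Sketch of the group-comparison checks with illustrative numbers only.

def separates(group_a, group_b):
    """True if one suspect's values show no overlap between the two groups."""
    return max(group_a) < min(group_b) or max(group_b) < min(group_a)

def corner_separates(defective_pts, good_pts):
    """True if some corner region (both suspects beyond thresholds) contains
    every defective part and no good part; a two-suspect combination cause."""
    xs = {x for x, _ in defective_pts + good_pts}
    ys = {y for _, y in defective_pts + good_pts}
    for tx in xs:
        for ty in ys:
            if all(x >= tx and y >= ty for x, y in defective_pts) and \
               not any(x >= tx and y >= ty for x, y in good_pts):
                return True
    return False

gap_leak, gap_ok = [5, 4, 6], [5, 1, 2]        # hypothetical seal-gap values
plast_leak, plast_ok = [6, 5, 7], [1, 6, 2]    # hypothetical plastisol scores
print(separates(gap_leak, gap_ok))                    # False: no single-input split
print(corner_separates(list(zip(gap_leak, plast_leak)),
                       list(zip(gap_ok, plast_ok))))  # True: the pair separates
```

This mirrors the window leaks conclusion: no single suspect separates the groups, but the combination of two suspects does.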
Comments
Group comparison is especially useful if the problem is defined in terms of a rare defect.
Low-frequency problems are difficult to solve because it is hard to get information about
the dominant cause of the defect. We can collect the two groups of parts over time until we
have sufficient numbers to make the comparison.
We sometimes try a group comparison early in the search for a dominant cause where
we measure many continuous characteristics on each part in the two groups. If we can find
one characteristic that separates the defectives and the good parts, we can reformulate the
problem in terms of the continuous characteristic. The reformulated problem may be much
easier to solve because we can now see variation in the output on each part.
We can use group comparison if the output is binary or if the output is continuous and
we select parts from the extremes to get the full extent of variation. In this instance, we create
two groups and ignore the measured values of the output. We need to be careful if we use
leverage in this way. One danger is that the extreme parts used in the investigation may be
due to different failure modes and thus have different causes. This highlights the impor-
tance of focusing the problem so that there is only a single dominant cause. Suppose, in the
engine block leaks example, the team had not focused the problems based on the location
of the leak. If they had then carried out the group comparison described previously, they
would not have seen large differences in wall thickness between the two groups, because
the leakers group would likely contain blocks that leak at other locations.
In a group comparison, the groups are formed using the output. As a result, we can only
compare inputs that can be measured or determined after the part is produced. For instance,
in the window leaks example, it was not possible to compare process inputs such as win-
dow installer or machine settings because these cannot be determined after a truck is leak
tested.
We require at least six to eight parts per group to avoid falsely identifying a dominant
cause. We examine some alternate plans and analyses in the supplement to this chapter.2
Crossbar Dimension
There was excess variation in a key crossbar dimension of an injection-molded part. In the
baseline investigation, the team estimated the standard deviation and the full extent of vari-
ation of the dimension to be 0.46 and –0.3 to 2.0 thousandths of an inch. The problem goal
was to reduce the standard deviation to less than 0.25.
The team showed that the measurement system was highly capable. They decided to
search for a dominant cause. They conducted a multivari investigation where five consec-
utive parts from the single mold were sampled every 30 minutes for four hours. See the
exercises for Chapter 10. The team found that the time-to-time family contained the dom-
inant cause.
Next they planned an input/output relationship investigation. Forty shots of the process
were selected haphazardly over a two-day period. For each shot, the team measured the
crossbar dimension of the part and recorded five inputs: die temperature, nozzle tempera-
ture, barrel temperature, hydraulic pressure, and cavity pressure. All these suspects were
thought to exhibit time-to-time variation.
The data are given in the file crossbar dimension input-output. The first step in the
analysis is to check that the output varied over the full extent of variation as in the baseline.
We see from the summary that this is the case.
The second step is to fit a regression model3 (Montgomery et al., 2001) that includes
all of the inputs simultaneously. See Appendix E for the MINITAB directions. The resid-
ual standard deviation is 0.2515, which is substantially less than the baseline standard
deviation. The residual standard deviation is an estimate of the variation in crossbar dimen-
sion if we could hold all of the inputs in the regression model fixed. The small value of
residual standard deviation indicates that one or more of the inputs is a dominant cause of
the variation.
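The second step is an ordinary least-squares fit, and the comparison of residual standard deviation against the baseline can be sketched with numpy. The data below are synthetic stand-ins (we do not reproduce the crossbar dimension file); only the logic of the comparison is the point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: 40 shots, five inputs, with input 2 (think "barrel
# temperature") dominating the output, as in the example.
X = rng.normal(size=(40, 5))
y = 0.9 * X[:, 2] + rng.normal(scale=0.2, size=40)

def residual_sd(X, y):
    """Residual standard deviation of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(np.sqrt(resid @ resid / (len(y) - A.shape[1])))

baseline_sd = y.std(ddof=1)
print(residual_sd(X, y) < 0.6 * baseline_sd)   # True: a dominant cause is present
```

A residual standard deviation far below the baseline, as here, is the signal that one or more of the modeled inputs is a dominant cause.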
In the third step, we look at the scatter plots of crossbar dimension versus each of the
inputs.4 Two of the plots are shown in Figure 12.3.
There is a strong relationship between barrel temperature and crossbar dimension. If we
could hold barrel temperature fixed at 77°, for example, we see from the plot that crossbar
dimension would vary only by about 0.5. There is strong evidence that barrel temperature
is a dominant cause.
Figure 12.3 Plot of crossbar dimension versus barrel temperature and hydraulic pressure.
Analysis of Variance
Source DF SS MS F P
Regression 1 9.1979 9.1979 142.13 0.000
Residual Error 38 2.4591 0.0647
Total 39 11.6570
The residual standard deviation is 0.25, much smaller than the baseline value 0.46. If
the team could eliminate the effect of barrel temperature they would meet their goal.
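The residual standard deviation quoted here comes straight from the ANOVA table: it is the square root of the residual mean square (residual SS divided by its DF). A quick arithmetic check:

```python
import math

# Residual mean square = residual SS / residual DF, from the ANOVA table.
residual_ms = 2.4591 / 38
residual_sd = math.sqrt(residual_ms)

print(round(residual_sd, 2))   # 0.25, well below the baseline 0.46
```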
Truck Pull
Consider again the truck pull problem introduced in Chapter 1. The team decided to focus
on right caster since caster variation had a much larger effect on pull than camber variation.
The baseline standard deviation for right caster was 0.24˚. Previous investigations ruled out
the measurement system. The assembly process was difficult to investigate so the team set
that family aside for the moment and concentrated on the component families. A feed-
forward controller compensated for variation in the frame geometry. This left the other
components of the alignment system, namely the knuckle and upper and lower control arms,
as possible homes of the dominant cause. The search for the cause is illustrated in the diag-
nostic tree in Figure 12.4.
At this point the team had several choices. They could have tried assessing the assembly
and component families, as discussed in Chapter 11, but they could not use the production
assembly process. Moreover, at the time of the investigation, there was a proposal to bar-code
the control arms with dimensional data that could be fed into the feedforward controller
already in use to compensate for the frame geometry (see Chapter 16). With this proposed
process change, all control arms would be measured and bar-coded at the supplier’s plant.
Before implementing such an expensive proposal, the team wanted assurance that it would
be worthwhile. If the control arm inputs were not dominant causes, there would be little
reduction in pull variation from this costly change.
The team decided to explore specific characteristics of the control arms and knuckle. Based
on the proposal for the feedforward control scheme, they selected dimensional characteristics of
the components, one for each control arm and two for the knuckle, thought to affect caster.
Figure 12.4 Diagnostic tree for search for the cause of excessive caster variation
(excessive caster angle variation splits into measurement and process; process into
components and assembly; components into knuckle, frame, and upper and lower control arms).
For the investigation, the team selected 30 sets of components (two control arms and a
knuckle) from regular production over six days. From the baseline investigation, the team
expected to see the full extent of variation in caster angle with this sampling scheme. They
measured the four inputs on each set of components and the right caster angle on the
assembled truck. The data are given in the file truck pull input-output. The standard devia-
tion of the caster angle in the investigation is 0.22, somewhat smaller than the baseline
standard deviation.
We fit a regression model to describe the relationship between caster angle and the four
inputs. From the MINITAB output, the residual standard deviation is s = 0.162, a moderate
reduction from the baseline variation 0.24. Next we look at the scatter plots of the inputs by
caster angle. We cannot see any
strong relationships between caster angle and the individual component dimensions in
Figure 12.5.
To quantify the contribution of each component dimension, we fit regression models with
one input at a time. We rank the inputs based on the residual standard deviation in Table 12.1.
Figure 12.5 Scatter plots of right caster versus the component characteristics.
None of the component dimensions is a dominant cause when considered singly. When
we fit regression models with pairs of inputs, we see in Table 12.2 that the two control arm
dimensions together produce the smallest residual standard deviation.
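The one-at-a-time and pairwise rankings behind Tables 12.1 and 12.2 can be sketched the same way. The data below are synthetic (made-up coefficients, not the truck pull file): two of four inputs jointly drive the output, and ranking regressions by residual standard deviation picks out that pair:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

# Made-up stand-in: 30 component sets, four inputs, with inputs 0 and 1
# (think "the two control arms") jointly driving the output.
X = rng.normal(size=(30, 4))
y = 0.3 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.15, size=30)

def residual_sd(cols):
    """Residual sd of the regression of y on the listed input columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(np.sqrt(r @ r / (len(y) - A.shape[1])))

singles = sorted(range(4), key=lambda c: residual_sd([c]))
pairs = sorted(itertools.combinations(range(4), 2),
               key=lambda p: residual_sd(list(p)))
print(pairs[0])   # (0, 1): the pair with the smallest residual sd
```

As in the example, no single input fits well on its own, but the correct pair produces the smallest residual standard deviation.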
We conclude that if we could completely remove the effects of the two control arms by
using the feedforward controller, we would reduce the caster variation by about 20%. This
reduction could be achieved only if the feedforward controller worked perfectly. Also, the
sample size in this investigation is small and there is considerable uncertainty in the esti-
mates of the standard deviations.
The team used these results to argue that the proposed bar-coding of the control arms
would not be cost-effective. They decided instead to investigate the feasibility of feedback
control on caster angle (see Chapter 17).
Manifold Sand Scrap
In the production of cast-iron exhaust manifolds, a foundry attributed many defects to the sand
system used to create the molds. Historically, the sand-related defect rate was 2% with
substantial variation shift to shift. Management assigned a project team to reduce sand-
related scrap.
The team used a run chart of the scrap rate by shift as a baseline. This chart was pro-
duced daily as part of the management information system. They did not formally investigate
the measurement system used to determine whether a manifold had a sand-related defect.
They believed from past experience that this system was reliable. They decided to search
for a dominant cause of the defects.
Many characteristics of the molding sand such as temperature, compactness, permeability,
moisture level, green strength, and percent friability were routinely measured during produc-
tion. The team decided to use the available data to determine if any of these individual sand
characteristics was a dominant cause. They chose not to use broad families of causes as is usu-
ally done early in the search because the data were already available and the cost was low.
It was not easy to get the data into a usable form. The sand characteristics were not
measured for each casting. There were substantial and varying time lags between the meas-
urement of a sand characteristic and the use of the sand to make a mold. For each casting,
the time of casting was known only up to the nearest hour. Because of these traceability
difficulties, the team used the hourly scrap rate as the output and the average of the sand
characteristics over the hour as the inputs.
The team selected 91 hours of production and made the linkages between hourly scrap
rate and average sand characteristics. In each hour, the plant produced between 30 and 180
castings. The data are given in the file manifold sand scrap input-output. The average and
standard deviation of the hourly scrap rate were 0.017 and 0.023. The average rate 1.7%
matches the baseline well.
To analyze the data, the team ignored the changes in volume and focused solely on the
proportion of sand scrap in each hour. Fitting a regression model to the hourly scrap rates
with all of the sand characteristics, the residual standard deviation was 0.022, a very small
reduction from the baseline value 0.023. None of the sand characteristics was a dominant
cause. This is confirmed by looking at scatter plots of the sand scrap proportion versus the
inputs. Some of the plots are given in Figure 12.6.
Figure 12.6 Scatter plot of sand scrap proportion versus moisture and compactness.
The team noticed in the plot of hourly scrap rate versus temperature, shown in Figure 12.7,
that there was a nonlinear relationship. Using MINITAB, they fit a quadratic regression model
that included the square of the temperature.
Figure 12.7 Sand scrap proportion versus sand temperature with quadratic fit.
While the observed pattern makes physical sense, temperature is not a dominant cause.
Holding sand temperature fixed (which would be very expensive) would not reduce the sand
scrap proportion substantially.
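A quadratic fit like the team's can be sketched with numpy's polyfit. The hourly data below are invented to mimic the situation (a real but weak curvature in temperature, swamped by other variation); the residual-to-baseline ratio shows why temperature is not a dominant cause:

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented hourly data: a weak quadratic effect of sand temperature on the
# scrap proportion, swamped by other variation (as in the example).
temp = rng.uniform(88, 112, size=91)
scrap = 0.00005 * (temp - 100) ** 2 + np.abs(rng.normal(scale=0.02, size=91))

coeffs = np.polyfit(temp, scrap, deg=2)     # scrap = a*temp^2 + b*temp + c
fitted = np.polyval(coeffs, temp)
resid_sd = (scrap - fitted).std(ddof=1)

# Ratio near 1 means holding temperature fixed would buy almost nothing.
print(resid_sd / scrap.std(ddof=1))
```

Even a statistically convincing curved relationship leaves the residual standard deviation close to the baseline, which is the relevant test for a dominant cause.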
In the end the team was forced to conclude the investigation was a failure. They were
not able to find a dominant cause, and they could not eliminate any of the suspects because
the linkage between sand characteristics and the defect rate was poorly established.
This example demonstrates the risk of using an investigation based on individual
causes early in the search for a dominant cause. Even if the team could have eliminated the
measured sand characteristics as dominant causes, they would have made little progress, as
there were a large number of remaining suspects.
Input/Output Relationship
Question
For the current process, which, if any, of the suspects is a dominant cause?
Plan
• Select a time frame in which we expect to see the full extent of variation in the output.
• Select 30 or more parts spread across the time frame.
• For each part, measure the output and all suspects.
Data
Record the output and suspect values, one row for each part.
Analysis
• Use regression to model the output as a function of all suspects simultaneously. For categor-
ical suspects, use indicator variables.5
• Plot the output versus each one of the suspects. For continuous suspects, fit the correspon-
ding simple regression model to quantify any strong linear relationships. For categorical sus-
pects, use one-way ANOVA.
Conclusion
• If the residual standard deviation in the first regression model is much less than the baseline
standard deviation, one or more of the suspects is a dominant cause.
• If there is a dominant cause, identify it from the scatter plots.
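As a rough sketch of the regression step (the data below are simulated for illustration only; they are not from any of the investigations in this chapter):

```python
import numpy as np

# Simulated data: output y for 30 parts, with suspect x1 built in as a dominant cause.
rng = np.random.default_rng(0)
x1, x2, x3 = rng.normal(size=(3, 30))
y = 2.0 * x1 + 0.2 * x2 + rng.normal(scale=0.3, size=30)

# Model the output as a function of all suspects simultaneously.
X = np.column_stack([np.ones(30), x1, x2, x3])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
resid_sd = np.sqrt(resid @ resid / (30 - 4))  # residual standard deviation

# Stand-in for the standard deviation found in the baseline investigation.
baseline_sd = np.std(y, ddof=1)
print(resid_sd, baseline_sd)
```

If the residual standard deviation is much smaller than the baseline value, one or more of the suspects is a dominant cause; the scatter plots then identify which.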
Comments
In all cases, the goal is to get the full extent of variation so that we know the dominant cause
has acted.
With the first option, we need to have a large enough sample spread across appropriate times and locations to meet the goal. We can use the results of the baseline investigation to help to define the sampling scheme. This option is the most common and was used in the examples discussed in this section.
The second option requires more effort. For example, in the truck pull example, the
team had information about the historical variation in the four inputs. They could have
selected components with values for these dimensions at both ends of their historical
range. This would have required extra measurement to find the appropriate components. Using extreme values for the inputs, we employ leverage and can use a smaller
sample size. If we do not see the full extent of output variation with this plan, none of
the selected inputs is a dominant cause.
The final option corresponds closely to the group comparison plan. For example, in the
window leaks problem, had the team been able to measure the severity of the leak, they
could have fit a regression model to the data. By choosing parts with extreme output values,
we are sure to see the full extent of variation. However, we may not be able to determine
the values of many suspects due to lack of traceability.
We need to be careful interpreting the MINITAB regression results. In the truck pull
example, each of the four inputs is statistically significant because the p-value is small
(less than 5%). This does not mean that any of the inputs is a dominant, or even large,
cause, as we saw in the example. The same difficulty occurs in group comparisons when
the input averages for the two groups are statistically significantly different, and yet there
is no evidence that the input is a dominant cause. Hypothesis tests are not a useful tool
for identifying a dominant cause. See further discussion of this issue in the Chapter 10
supplement.
Regression models can accommodate binary outputs, discrete inputs, quadratic terms,
interactions between inputs, and so on.6 We can transform or combine the input values. We
need to be careful using these models because we can be misled by nonlinear relationships,
outliers, and influential observations.
Key Points
• Group comparison and input/output investigations assess single causes and are
most useful in the latter stages of the search for a dominant cause.
• With a binary output, we can compare inputs measured on parts using a group
comparison. We use plots of the individual values or box plots to look for large
differences in the inputs for the two groups of parts defined by the output values.
• With continuous output, we use regression models and scatter plots to show the
input/output relationships and isolate dominant causes. We calculate the residual
standard deviation from a simple regression fit to assess whether a particular
suspect is a dominant cause.
Chapter 13
Verifying a Dominant Cause
Approach each new problem not with a view to finding what you hope will be
there, but to get the truth, the realities that must be grappled with. You may
not like what you find. In that case you are entitled to try to change it. But do
not deceive yourself as to what you do find to be the facts of the situation.
—Bernard M. Baruch, 1870–1965
We recommended the method of elimination and a series of simple observational
investigations to isolate a dominant cause. Before proceeding with the next stages
of the Statistical Engineering algorithm, we want to be sure that the suspected
cause, here called a suspect, is dominant. We call this verification.
In many applications of the algorithm, we have sufficient evidence from the search to be
sure we have found a dominant cause and we require no further verification. For example, in
the fascia cratering problem discussed in Chapter 11, the team found that craters occurred only
on every fifth and tenth fascia taken from the mold. They concluded that the dominant cause
was the application of mold release. They did not verify this conclusion because no other cause
matched the observed pattern of craters. Similarly, in the V6 piston diameter example (Chapter 11), the team concluded that a dominant cause of diameter variation at the final gage was the diameter after Operation 270. The variation transmission investigation showed that pistons with
large (small) diameters after Operation 270 were large (small) at the final gage. The team
could explain the observed pattern in only one way and decided not to verify their conclusion.
Why do we need to verify? In the search, we might have inadvertently ruled out a family
that contains the dominant cause. More commonly, we may have selected the suspect from
the remaining family of causes using our best judgment. If we are wrong, there may be other
causes in the family that are dominant. Consider the problem of excess crossbar dimension
variation discussed in Chapter 12. There, the team concluded that barrel temperature was the
dominant cause based on the results of an input/output relationship investigation. They
decided that verification was necessary because it was possible that the actual dominant cause
was another (unidentified) cause in the same family (time-to-time) as barrel temperature.
To verify that a suspect is a dominant cause, we use an experimental plan, often called
a designed experiment, where the value of the suspect is deliberately manipulated. Good
references on experimental plans include Box et al. (1978), Ryan (1989), Wheeler (1990),
and Montgomery (2001).
Experimental plans are also important tools, as we will see in later chapters, for helping
to assess the feasibility and determine how to implement several of the variation reduction
approaches. In this chapter, we introduce the language and principles of experiments. We
start with plans to verify a single suspect dominant cause. We then introduce more complex
plans used to isolate a dominant cause from a short list of suspects.
[Diagram: six rows of five x's. Each row is a run and each x is a repeat within the run; the first three rows are replicates at the low level and the last three are replicates at the high level.]
Figure 13.1 Runs, replicates, and repeats for an experiment with a single suspect at two
levels.
To protect against some other input changing systematically as we change the suspect,
we randomize the order of the runs, if feasible.2 This use of randomization is one of the
main differences between observational and experimental investigations. We can randomize
the order of replicates but not repeats. For example, we give the random order and the oil
consumption for each run in Table 13.1.
Level   Clearance   Order   Oil consumption
Low     5.0         4       23
Low     7.0         3       24
Low     8.5         6       29
High    21.0        1       76
To analyze the results, we rely on tables of averages and graphical summaries such as
box plots or scatter plots. If the suspect is a dominant cause of variation, the output characteristic should vary over most of its full extent of variation when the value of the suspect
changes from its high to low level. In Figure 13.2, we see that changing the valve lifter
clearance has a large consistent effect on the oil consumption. The team concluded that
valve lifter clearance was a dominant cause of the variation in oil consumption and that low
clearance values led to less oil consumption.
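The table-of-averages analysis can be sketched in a few lines. The low-level oil consumptions come from Table 13.1; the second and third high-level values and the full extent of variation are hypothetical stand-ins:

```python
# Oil consumption (grams per hour) by valve lifter clearance level.
low = [23, 24, 29]       # low-clearance runs from Table 13.1
high = [76, 70, 81]      # first value from Table 13.1; the rest are hypothetical

avg_low = sum(low) / len(low)
avg_high = sum(high) / len(high)
effect = avg_high - avg_low  # change in output from low to high clearance

# Judge the effect against the full extent of variation (hypothetical range).
full_extent = 120 - 20
print(avg_low, avg_high, effect, effect / full_extent)
```

A suspect is a dominant cause only when this effect covers most of the full extent of variation seen in the baseline.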
[Figure 13.2: oil consumption (20 to 120) plotted by lifter clearance level (high, low).]
There were several limitations to this conclusion. First, the team did not have a good
idea of the baseline in terms of the accelerated dynamometer test. They had measured only
one engine from the field failures. Repeated testing of this engine in the dynamometer gave
an average oil consumption of 117 grams per hour. The team was reassured because the oil
consumption at the high clearance level was close to this average. Second, the verification
experiment used only a single engine. The clear results for this engine may not carry over
to others. Finally, the team did not have a direct connection between oil consumption in the field and the result of the accelerated test on the dynamometer. In spite of these limitations, they proceeded, assuming that lifter clearance was the dominant cause.
Crossbar Dimension
In the crossbar dimension example, discussed in Chapter 12, the team identified barrel tem-
perature as the dominant cause. For the verification experiment, the team chose the low
level for barrel temperature as 75° and the high level as 80°. This covered close to the full
range of barrel temperatures seen in the earlier investigation.
Barrel temperature could be controlled for the experiment and changed in a few minutes. As a result, the verification experiment used only two runs. The barrel temperature
was set, 25 parts were made to ensure the temperature had stabilized, and the next 10 parts
were selected and measured. There are two runs with 10 repeats per run and no replication.
The data from the experiment are given in the file crossbar dimension verification and
are presented in Figure 13.3. Barrel temperature has a large effect on crossbar dimension
relative to the full extent of variation (given by the dashed lines). The team concluded that
barrel temperature was the dominant cause of crossbar dimension variation.
The lack of randomization was not an important limitation here since previous investigations had shown that the dominant cause acted in the time-to-time family. Over the short time frame of the verification experiment, it is unlikely that we will see the full range of variation in the output unless barrel temperature is a dominant cause. There is insufficient time
for other causes in the time-to-time family to change substantially.
Figure 13.3 Crossbar dimension verification experiment results (dashed lines show the full extent of dimension variation).
The lack of replication may have serious consequences. If the dominant cause is an
interaction between barrel temperature and some other cause in the time-to-time family, we
may see little effect on crossbar dimension when we change the barrel temperature, depending on the unknown level of the second cause. To remove this uncertainty, we need to take
the expensive step of replicating the whole experiment over several time periods, allowing
the second cause time to change.
Engine Block Porosity
In the engine block porosity example, a wash was added to the ladles to protect the refractory (surface). Since the wash was water-based, the team suspected it was a source of porosity when the process was restarted. The team could
not cheaply manipulate the pouring temperature, but they could change and control the level
of ladle wash. They added the full amount of wash to three ladles selected at random and half
the amount to the other three for two lunch breaks. In each case, they measured the porosity
of the first 30 blocks poured (five from each ladle). The data are given in the file engine block
porosity verification, and the results are discussed in the Chapter 13 exercises. Based on the
results and process knowledge, the team eliminated ladle wash as a suspect. They declared
pouring temperature a dominant cause, even though they did not verify it.
Most verification experiments are conducted over a short time using only a few parts. In
the oil consumption example, the team used the same engine for all experimental runs.
They assumed that the conclusions about valve lifter clearance for this engine would
apply more broadly. The team could have strengthened the conclusion by doing the same
six-run experiment with three different engines. This change to the plan would have increased
the cost and complexity of the experiment.
The more that we know about the nature of the output variation, the easier it is to select
a study population. For example, if we know the time-to-time family contains the dominant
cause, we can conduct the experiment over a short time period in which the process does not
vary materially. There is a small chance of another cause in the time-to-time family changing
substantially during the experiment. However, since the suspect is also in the time-to-time
family, we may have difficulty manipulating its levels within this short time period.
We select the two levels of the suspect for the verification experiment at the low and high
end of its range of values in the regular process. If the suspect is a dominant cause, then
changing from the low to high level will produce the full extent of output variation. To
determine the levels, we have to know the range of variation of the suspect cause. To
acquire this knowledge, we may have to carry out a small investigation on the suspect.
There is some risk in making the levels of the suspect too extreme. First we may induce a
different failure mode into the process. Second, we may fool ourselves in concluding that the
suspect is a dominant cause because the very extreme levels, which rarely occur in the regular
process, may induce the full extent of variation in the output.
For a verification experiment, we strongly recommend using only two levels per suspect.
Extra levels give little additional information about whether or not the suspect is dominant
and increase the complexity of the experiment.
In some cases we conduct a verification experiment over a short time frame without the protection provided by randomization and replication.3 In planning the experiment—that is,
defining a run, choosing the number of runs, and so on—we need to assess the risk that a
dominant cause, other than the suspect, acts within the time of the experiment. The nature
of the variation in the output over time is a key piece of information to help assess this risk.
If the dominant cause acts over the long term, as in the crossbar dimension example, we can
plan a verification experiment over a relatively short time with two runs, one at each level
of the suspect. If the dominant cause acts in the part-to-part family, we can use two runs
with a moderate number of repeats. That way, if the suspect is not a dominant cause, the true
dominant cause (that acts in the part-to-part family) will have time to generate close to the
full extent of variation in the runs at both the high and low level of the suspect.
When we do not know the nature of the variation over time, as in the oil consumption
example, we need to be careful. We should use replication—that is, several runs for each
level of the suspect—and randomize the order in which the runs are conducted. The key
question is, “Is there some other unknown cause that might change from run to run in a way
that matches the pattern of change of the suspect?” If the answer is yes, then we randomize
the order of the runs with at least three replicates per level to reduce the risk.
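Such a randomized plan can be sketched in a few lines (the level labels and seed are arbitrary):

```python
import random

# Three replicates at each level of the suspect, run in random order.
runs = ["low"] * 3 + ["high"] * 3
random.seed(1)        # fixed seed only so the sketch is reproducible
random.shuffle(runs)  # randomization guards against a drifting unknown cause

for order, level in enumerate(runs, start=1):
    print(order, level)
```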
If the output is binary, we suggest many repeats within each run of the experiment. We can
then see if changing the suspect produces a large change in the proportion of defectives
within each run. We can assess the importance of the change by comparing the proportions
to the baseline. We need runs with many repeats to estimate the proportion of defectives at
each level of the suspect. We may have difficulty holding the normally varying suspect constant for a long run.
In the analysis of a verification experiment, we always check that the observed variation in
the output characteristic is a substantial proportion of the full extent of variation seen in the
baseline. Otherwise the identified suspect is not a dominant cause. We do not recommend
a formal hypothesis test to see if changing the suspect produces a statistically significant
change in the output. Such a change may be too small to be helpful.
What should we do if the results of the experiment show the suspect is not dominant? First,
review the plan and conduct of the experiment in light of the information gathered during
the search for the suspect. Some possible questions are:
• What is the family of remaining suspects? Are there other suspects in this
family? Are these suspects eliminated by the experiment?
• Could we have missed an interaction with another cause that did not change
during the verification experiment?
If the answers to these questions are not helpful, then we have few options. First, review the
diagnostic tree. Did we rule out any families or causes during the search without sufficient evidence? If the answer is yes, we may have to reinitiate the search. Second, review the
approaches (Chapter 8) and choose a working approach that is not cause-based. If none
seem feasible, then we may decide that there is no dominant cause for this problem as formulated and go back to the beginning and reformulate the problem. Finally, we may abandon the problem and accept that Statistical Engineering has failed.
In a full factorial experiment, the treatments consist of all possible combinations of the suspects at their low and high levels. For example, we use the low level of suspect C on four runs and the high level of suspect C on four runs.
To analyze the results of factorial experiments, we rely on comparing averages and
graphical displays.
Table 13.3 Suspects and levels for brake rotor verification experiment.
Suspect Low level High level
To obtain the cores for the experiment, the team measured thickness variation and sorted
cores until they had 16 with high thickness variation and 16 with nominal thickness variation
from each set of tooling. The experiment required careful planning since the balance weight
of each casting can only be determined after shipping and machining. The castings were
tagged for identification after production and tracked through the subsequent process.
The team randomized the casting order of each treatment as given in Table 13.4. The
64 rotors were cast and machined as planned. The experimental plan and the measured balance weights are given in the file brake rotor balance verification. The average weight of
the eight repeats for each run is given in Table 13.4.
To analyze the data, we plot the weights by treatment in Figure 13.4. We see there are
large differences in the balance weights produced by different treatments and relatively little
difference within each treatment. We also notice that we have seen roughly the full extent of
variation in balance weight given by the problem baseline (the dashed line on Figure 13.4). If
this were not the case, we would conclude that we have not found a dominant cause.
Figure 13.4 Weight by treatment for the brake rotor verification experiment (dashed line
gives the full extent of weight variation).
We analyze the experimental results using the within-run average weight for each treatment. We estimate and rank the effects of the suspects. A main effect due to a particular suspect is the difference in average output (high-low) for that input. For example, using the
data in Table 13.4, the main effect for tooling is:
An interaction effect measures the change in the main effect for one input as a second input
changes. For example, the interaction effect for tooling and thickness is:
The first term on the left side of the equation (except for an extra factor of 1⁄2) compares
the effect of changing the tooling when the thickness variation is 30 thousandths. The
second term measures the effect of changing the tooling when the thickness variation is
nominal. Half the difference in the two effects is the interaction.
We can similarly define interactions for three inputs and more. Interactions are important
if the effect on the output of changing levels in one suspect depends on the level of another
suspect. In terms of finding a dominant cause, we may get extreme values of the output
only if both inputs are at their high level. In this instance, we say that the dominant cause
is an interaction between the two suspects.
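These definitions translate directly into the usual contrast calculations for a two-level factorial. A sketch with coded levels (-1 = low, +1 = high) and hypothetical run averages, not the brake rotor data:

```python
import itertools

# 2^3 full factorial in coded units for suspects A, B, C.
design = list(itertools.product([-1, 1], repeat=3))
# Hypothetical within-run average outputs, one per treatment, in design order.
y = [10, 12, 11, 13, 30, 33, 31, 52]

def effect(contrast):
    """Average output where the contrast is +1 minus average where it is -1."""
    hi = [yi for ci, yi in zip(contrast, y) if ci == 1]
    lo = [yi for ci, yi in zip(contrast, y) if ci == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

a, b, c = zip(*design)
main_A = effect(a)                      # main effect of suspect A
ab = [ai * bi for ai, bi in zip(a, b)]  # product column gives the AB interaction
inter_AB = effect(ab)
print(main_A, inter_AB)
```

The AB contrast computed this way equals half the difference between the A effect at the high level of B and the A effect at the low level of B, matching the definition above.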
In this experiment with eight treatment combinations, we can estimate seven effects.
The number of effects estimable is always one less than the number of treatments. We fit a
full model with all possible effects and construct a Pareto chart (see Appendix F for
MINITAB instructions) of the unsigned effects to distinguish between those that are large
and small. In Figure 13.5, we see that thickness variation, tooling, and the interaction
between thickness variation and tooling are relatively large effects. The vertical dashed line
in Figure 13.5 is added by MINITAB based on a test of significance. We ignore this dashed
line in our interpretation.
[Figure 13.5: Pareto chart of the unsigned effects, largest to smallest: C, A, AC, AB, ABC, BC. Legend: A = Tooling, B = Position, C = Thickness.]
Figure 13.5 Pareto chart of the effects for brake rotor verification experiment.
[Interaction plot: average weight by tooling (4-gang, 6-gang) at the nominal and 30-thousandths levels of thickness variation.]
To verify a dominant cause from a list of suspects, we do not recommend the use of one-at-a-time experiments, where we change only a single suspect at a time. These traditional experiments are a poor choice because they cannot find important interactions between the suspects. For more information on the dangers of one-at-a-time experiments, see Montgomery (2001).
Sometimes, we can test all the suspects simultaneously. In the window leaks example
discussed in Chapter 12, the team isolated the primary seal gap and the plastisol application
as suspects. To verify, the team selected eight trucks that had passed the aggravated leak
test and reinstalled the rear window using different windows and seals, ensuring that the
primary seal gap exceeded 1.5 millimeters and that the plastisol score was between 3 and 4.
Seven of the eight trucks leaked when they were retested. The conclusions from the verification experiment and group comparison are summarized in Figure 13.7. All the experimental runs came from the upper left quadrant and the results verified that the combination
of poor plastisol application (low score) and high primary seal gap resulted in leakers. The
team proceeded to improve the plastisol application process by ensuring that all trucks
received a plastisol score greater than 4.5.
Figure 13.7 [summary diagram: primary seal fit versus plastisol amount. Experimental runs in the upper left quadrant were leakers; the group comparison found nonleakers in the other quadrants.]
Question
In the current process, are any of the suspects a dominant cause?
Plan
• For each suspect, choose two levels at the extremes of their normal range.
• Define the runs using available information about the time-based family that is home of the
dominant cause.
Data
Carry out the experiment. Record the output, suspect levels, treatment number, and order for
each run. Use a separate row for each repeat.
Analysis
• Plot the output against the treatment number. Add horizontal lines showing the full extent of
variation in the output.
• For several suspects, fit a full model and construct a Pareto plot of the main and interaction
effects. For large interactions, create a multivari chart with the corresponding inputs.
Conclusion
• If the output does not show the full extent of variation, then none of the suspects is a dominant cause.
• A suspect with a large effect relative to the full extent of variation is a dominant cause.
• If there is a large interaction, the dominant cause involves two (or more) suspects.
(Full) factorial experiment  An experimental plan where all possible combinations of the input levels are used to define the treatments.
Interaction effect The change in the main effect for one input as a second input changes.
Main effect The change in the average output produced by a change in one input.
Repeat More than one part (or measurement) made within a run.
Replication Carrying out more than one run for each treatment.
Run Assigning the treatment, running the process, and measuring the output.
Key Points
• We use an experimental plan to verify that one or more suspects obtained using
the method of elimination is a dominant cause.
• For each suspect we choose two levels at the extremes of its normal range.
• In the verification experiment, we recommend:
–For a single suspect: three or more replicates for each level
–For two suspects: two or more replicates at each of the four treatments
–For three or four suspects: a full factorial experiment
• We identify a dominant cause by examining main effects and two input interactions.
• In the analysis, we check that the full extent of variation in the output is seen
over the runs of the experiment. Otherwise, none of the suspects varied in the
experiment is a dominant cause.
2. In the supplement, we explain how randomization and replication can help to reduce the risk of falsely identifying a suspect as the dominant cause. We also explain how we implicitly use blocking, the third fundamental principle of experimental design, to help to avoid this risk.
3. See note 2.
4. In the supplement, we briefly discuss an alternative verification experiment based on a
technique called variables search.
PART IV
Assessing Feasibility and
Implementing a Variation Reduction
Approach
Opportunity is missed by most people because it is dressed in overalls and
looks like work.
—Thomas Edison, 1847–1931
In this final part of the book, we address revisiting the choice of working variation reduction approach, the issues around the assessment and implementation of each approach, and the validation of a solution. The choice of working approach may need to be reconsidered in light of the process knowledge obtained in a search for a dominant cause or in assessing feasibility of a particular approach. We provide detailed how-to directions for assessing feasibility, including consideration of costs, and for implementing each of the seven approaches. We also discuss validating a solution.
Chapter 14
Revisiting the Choice of Variation
Reduction Approach
After conducting a search for a dominant cause, the team should reconsider their
choice of working variation reduction approach. In this chapter, we consider the
options and how to choose an approach.
If a specific dominant cause has not been found, the team must select one of the three
non-cause-based approaches, resume the search for a more specific cause, or abandon the
project. If a dominant cause has been found, the team can now consider the feasibility of any
one of the variation reduction approaches. They will have accumulated substantial information
about the behaviors of the output and the causes that can be used in selecting the approach.
There are four options that directly use the knowledge of the dominant cause:
• Fix the obvious.
• Compensate for the variation in the dominant cause (desensitization or
feedforward control).
• Reformulate the problem in terms of the dominant cause and reapply the
algorithm (that is, reduce variation in the cause).
• Continue the search for a more specific dominant cause. This is an informal
version of reformulation.
The variation reduction approaches that do not require knowledge of a dominant cause
should also be considered:
• Use feedback control on the output.
• Make the process robust to variation in the unknown dominant cause.
• Implement or tighten 100% inspection on the output.
Given the knowledge accrued during the search for the cause, the non-cause-based approaches may be more or less feasible. The final option is to abandon the project and devote
the resources to another problem with a greater likelihood of success.
We explore how to decide among these choices. Since there are few general rules, we
present a large number of examples. We consider costs and likelihood of success. There is
considerable risk and uncertainty in each choice, but we cannot proceed without making
this decision.
Truck Pull
In Chapter 10 and elsewhere, we described a project to reduce variation in a truck wheel
alignment process. The team had access to alignment data that was automatically collected
for every truck produced. Each truck was measured on one of four alignment machines. At
one point during the project, the team looked at the right caster data, stratified by the alignment machine. The daily averages are plotted in Figure 14.1.
The team was surprised to see the persistent differences among the four gages. The trucks
enter the gages haphazardly, so the observed differences must be due to differences in the
alignment machines. The alignment machine was a dominant cause of variation in right
caster. This cause was not acting during the baseline investigation or the initial investigation
of the measurement system. The team took immediate action to recalibrate the four gages to
remove the systematic differences. To prevent recurrence, they established a daily monitoring
program to correct such differences. There was a small cost associated with the daily check.
Figure 14.1 [daily averages of right caster (3.8 to 4.8) by day (15 to 30), stratified by alignment machine.]
Fascia Cratering
In the fascia cratering example discussed in Chapter 11, the team found that major cratering occurred on every tenth fascia. They concluded that the dominant cause was the mold release spray. With this knowledge, an obvious (to the team) and immediate short-term
solution was to wipe the mold after the spray. This solution had the potential to introduce
dirt in the mold and met with resistance from the mold operators. The obvious fix was not
feasible. The team decided instead to investigate different mold sprays. That is, they adopted
desensitization as the working approach.
Window Leaks
As reported in Chapter 12, a team found that the dominant cause of truck rear window
upper ditch leaks was an interaction between the primary seal gap and the plastisol appli-
cation to the seams. We reproduce the results in Figure 14.2.
The pattern on the plot suggested an obvious fix. The instructions were changed so that the operators who applied the plastisol brushed it on to ensure coverage of the critical seams. Because of this change, the rear window upper ditch leaks were totally eliminated. Note that the variation in primary seal gap was not changed. The process was
made less sensitive to this variation. The team also decided to pursue another solution
based on changing the primary seal to reduce the variation in gap. The operating costs of
this proposed solution were less than those of the obvious fix. However, management postponed implementation of this alternative solution because it required robotic application
and substantial capital expenditure.
Figure 14.2 [scatter plot: primary seal fit (0.5 to 2.5) versus plastisol amount (3 to 6), leakers and nonleakers marked.]
Hubcap Damage
Because of customer complaints, a team was assigned a goal of reducing the incidence of
wheel trim and hubcap damage. The team discovered that the dominant cause of broken
retaining legs and other damage was a combination of cold weather and contact with curbs.
Through comparison with competitors, they found that an obvious fix was to change the
hubcap material and design. The team replaced the brittle existing ABS hubcap with a new
Comments
The Fix the Obvious approach is a catchall category that makes use of the other variation
reduction approaches. For example:
• 100% Inspection (for example, loose assembly nuts)
• Moving the output center (for example, rear intake wall leakers, cylinder head
side shift, truck pull)
• Desensitizing the process (for example, window leaks)
When applying the Fix the Obvious approach, the team should ensure the problem
does not recur. In the truck pull, camshaft journal diameter, and cylinder head shift examples, the fix was obvious but all three problems were likely to recur unless a new control or regular maintenance scheme was put in place.
Fix the Obvious is the preferred choice because it applies when there is a clear solution
to the problem. There is little uncertainty that the fix will be effective. The main consider-
ations are the cost of the fix relative to the gain and potential negative side effects.
Crossbar Dimension
In Chapter 12, we discussed the problem of reducing variation in a crossbar dimension of a
molded part. From the baseline investigation, the full extent of variation was –0.3 to 2.0
thousandths of an inch. The team found that barrel temperature was a dominant cause as
shown in Figure 14.3.
At first, the team considered reformulating the problem. Reducing variation in barrel
temperature would result in reduced variation in crossbar dimension. Looking closely at
Figure 14.3, the team realized there was evidence of a nonlinear relationship between barrel
temperature and crossbar dimension. We have added a quadratic fit to the scatter plot to
make this conclusion clearer. The variation in crossbar dimension is greater as the barrel
temperature varies from 74° to 77° than it is if the barrel temperature ranges from 77° to
80°. The team decided to try to desensitize the process to barrel temperature variation by
increasing the average barrel temperature.
The team could not predict the benefits of increasing the average barrel temperature
from the data shown in Figure 14.3 since they needed to extrapolate beyond the normal
range of barrel temperature. The team could assess the direct cost of the change but they
were unsure if the process would gracefully tolerate barrel temperatures much above 80°.
That is, there might be negative side effects. Further investigation was required.
Figure 14.3 Crossbar dimension versus barrel temperature.
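The value of moving to the flatter part of the curve can be sketched numerically. The quadratic below is hypothetical (the fitted coefficients behind the scatter plot are not reproduced here); it illustrates only that the same barrel temperature variation transmits less dimension variation where the slope is smaller:

```python
import random
import statistics

random.seed(0)

# Hypothetical quadratic relationship, flattest near 80 degrees.
# Illustrative only -- not the fitted model from the book.
def dimension(temp):
    return 1.8 - 0.08 * (temp - 80.0) ** 2

# Standard deviation of dimension transmitted by barrel temperature
# variation around a given average temperature.
def transmitted_sd(avg_temp, sd_temp=1.0, n=5000):
    temps = [random.gauss(avg_temp, sd_temp) for _ in range(n)]
    return statistics.stdev([dimension(t) for t in temps])

print(round(transmitted_sd(75.5), 2))  # steep part of the curve
print(round(transmitted_sd(79.0), 2))  # flatter part: much smaller
```

Increasing the average temperature leaves the variation in the cause untouched; it only shrinks the slope through which that variation is transmitted to the output.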
Comments
There is uncertainty about finding an effective solution if the team decides to compensate
for the effects of a dominant cause. In the refrigerator frost buildup example, the team did
not know if they could find affordable changes to the design that would desensitize the
refrigerator to changes in the usage and environmental inputs. They could not quantify the
benefits until they had investigated the changes. They needed to carry out further process
investigations with no certainty of an efficient and effective resolution. Fortunately, they
were able to find good design changes. Otherwise, they would have had to absorb the high
costs of investigation and reconsider the other variation reduction approaches.
Sunroof Flushness
Customers were dissatisfied if the sunroof was not flush with the roof of the car. In the sun-
roof installation process, there was a 90% rework rate due to flushness variation. A team set
out to reduce the rework costs.
Flushness was the difference in height between the sunroof seal and the metal roof. It
was measured using digital calipers at six points, three at the front and three at the back of
the sunroof. A baseline investigation showed that flushness variation was largest at the two
front corners. Front corner flushness ranged between –3.5 and 4 millimeters, with an esti-
mated standard deviation of 1.25 millimeters. The team established a goal of reducing the
front corner flushness standard deviation to less than 0.5.
Based on engineering knowledge, the team suspected the dominant cause of flushness
variation was either crown height or attachment pad height. When the roof panel was
adapted to allow installation of a sunroof, six attachment pads were added. The team carried
out an investigation using two sets of six vehicles with extreme flushness. For each vehicle,
they removed the sunroof module and measured the attachment pad heights and roof crown
height at the front and back. The data are given in the file sunroof flushness input-output.
Here we report the results for left front flushness only. The team used regression analysis
Figure 14.4 Scatter plot of flushness versus left front pad height.
and the scatter plot shown in Figure 14.4 to demonstrate a clear connection between the
corner flushness and pad height closest to the corner. Roof crown height was eliminated as
a suspect.
The team assumed that pad height was the dominant cause of the variation in flushness
without verifying this conclusion. They decided to reformulate the problem in terms of pad
height. They carried out a baseline investigation and found the standard deviation in pad
height was 1.18 millimeters.
To determine a goal for the reformulated problem, they used the fitted equation from
the regression analysis:

flushness = a + b*pad height + noise

Analysis of Variance

Source            DF        SS        MS         F       P
Regression         1    45.348    45.348    436.87   0.000
Residual Error    10     1.038     0.104
Total             11    46.386
where a + b* pad height represents the effect of the dominant cause and noise the effects
of all other causes. The model constants a (intercept) and b (slope) describe the approximate
linear relationship between the output flushness and the dominant cause pad height. Based
on the model (see Chapter 2), and assuming that the dominant cause varies independently
of the other causes, we have

sd(flushness) = √(b² × sd(pad height)² + sd(noise)²)

From the regression analysis, we estimate b as 1.05. From the two baseline investigations,
we have the estimates 1.25 and 1.18 for the standard deviations of flushness and pad height.
Substituting the three estimates into the given equation, we get an estimate for the standard
deviation of the noise of about 0.17 millimeters.
Now, we can use the equation to translate the goal of reducing the flushness standard
deviation variation to 0.5 into a goal in terms of pad height variation. We use the estimate
for the slope b from the regression analysis and the estimated standard deviation of noise
from the given calculation. Substituting the estimated value for b and the goal for flushness
standard deviation, we get

0.5 = √(1.05² × sd(pad height)² + sd(noise)²)

Solving, using the estimated standard deviation of the noise, gives 0.45 as the goal for pad
height standard deviation. To meet the original goal for
flushness, in the reformulated problem we need to reduce pad height variation by over 60%.
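The reformulation arithmetic can be laid out in a few lines. This sketch assumes, as above, that the dominant cause varies independently of the other causes, and uses the estimates quoted in the text (b = 1.05, baseline standard deviations 1.25 for flushness and 1.18 for pad height):

```python
import math

b = 1.05              # estimated slope from the regression
sd_flushness = 1.25   # baseline standard deviation of flushness (mm)
sd_pad = 1.18         # baseline standard deviation of pad height (mm)
goal_flushness = 0.5  # problem goal for flushness standard deviation (mm)

# sd(flushness)^2 = b^2 sd(pad height)^2 + sd(noise)^2
sd_noise = math.sqrt(sd_flushness**2 - (b * sd_pad)**2)

# Solve the same equation for the pad height goal when sd(flushness) = 0.5
goal_pad = math.sqrt(goal_flushness**2 - sd_noise**2) / b

print(round(sd_noise, 2), round(goal_pad, 2))  # → 0.17 0.45
```

The goal of about 0.45 is roughly 38% of the baseline 1.18, the reduction of over 60% noted above.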
Figure 14.5 Dot plot of seal strength for leakers and nonleakers.
In this example, reformulation is especially useful since the original output is binary
and the dominant cause is continuous. It should be easier to find a solution to the reformu-
lated problem than the original since, in any investigation, more information is available
from each battery when we measure a continuous rather than a binary output. Since
seal strength is a destructive measurement, this gain is offset by increased costs.
Comments
We choose to reformulate only after we have considered the other options. Reformulation
of a problem is not a solution. That is, we cannot continue to reformulate indefinitely.
Eventually one of the variation reduction approaches must be applied. If we can repeatedly
find a dominant cause, we can reformulate the problem a number of times. However, the
potential benefit arising from solving the reformulated problem is reduced in each iteration
by the variation not explained by the identified dominant cause(s). Also, the final cause
may be outside local control.
With reformulation, the dominant cause (an input) becomes the output. We then establish
a baseline for the new output. We set the goal for the reformulated problem by exploiting
what we have learned about the relationship between the cause and the original output. We
can set this goal formally, as in the sunroof flushness example, or informally, as in the battery
seal failure example.
There is little value in reformulating a problem in terms of a cause that is not dominant.
Suppose, as in the sunroof flushness example, we found a cause that is linearly related to
the output so that we have the equation

output = a + b*cause + noise

If the cause is not dominant, then the standard deviation of the noise, the variation due
to all other causes, is a large fraction of the standard deviation of the output. Even if we com-
pletely eliminate the effect of the cause, we do not reduce the output standard deviation
substantially.
Figure 14.6 Plot of final base circle BC runout by heat treatment spindle.
an adjuster seemed difficult. The team decided to look for a more specific cause. They
proceeded by comparing the heat treatment spindles, looking for reasons why they were
different. In the end they were unable to find a specific cause. Out of desperation they
arranged for overdue maintenance on the heat treatment operation. This somehow elimi-
nated the spindle-to-spindle difference. Next, the team had to worry about how to prevent
recurrence of the problem. They decided to monitor the runout after heat treatment on two
parts per day from each spindle. Any large spindle-to-spindle differences triggered main-
tenance of the heat treatment operation.
Comments
We distinguish between reformulation and continuing the search for a more specific cause
based on whether or not we determine a new baseline. We are more likely to reformulate in
cases where the cause is a continuous characteristic different from the output. If the dominant
cause is a discrete characteristic, like machine number or process stream, we typically
continue the search for a more specific dominant cause without reformulating.
Window Leaks
In the truck window leaks problem, the dominant cause of upper ditch leaks was the
combination of a large primary seal gap and poor plastisol application to the seams. The
proposed and implemented solution was a process change that required operators to brush
the plastisol to ensure coverage of the critical seals. With this solution, the team addressed
only one of the inputs involved in the dominant cause.
14.6 SUMMARY
The thought process in choosing how to proceed after finding a dominant cause is summarized
in Figure 14.7.
Figure 14.7 Flowchart to help decide how to proceed after finding a dominant cause.
Key Points
• There are four options once a dominant cause has been found:
–Fix the obvious.
–Compensate for the cause using desensitization or feedforward control.
–Reformulate the problem in terms of the dominant cause and reapply the algo-
rithm (that is, reduce variation in the cause).
–Continue searching for a more specific dominant cause.
• Subject to cost constraints and concerns over side effects, implementing an obvi-
ous fix is the most desirable option.
• There is a risk of choosing an approach that will be neither effective nor efficient
because of the lack of knowledge. However, we cannot proceed without making
a choice.
15
Moving the Process Center
This time, like all times, is a very good one, if we but know what to do
with it.
—Ralph Waldo Emerson, 1803–1882
Moving the process center is the only approach that directly addresses an off-target
process center. The goal of the approach is to find a way to move the process out-
put center either closer to the target or in the desirable direction if higher or lower
is better. An example for a higher is better output is shown in Figure 15.1. We do not need
to identify a dominant cause to apply this approach.
The only requirement for moving the process center is an adjuster; that is, a fixed
process input that can be changed to move the process output center.
Figure 15.1 Baseline (left) and desired (right) process output for a higher-is-better characteristic.
In many processes, the team will have an available adjuster and thus an obvious fix to
the problem. To find an adjuster, we recommend an experimental plan. We use engineering
knowledge and experience with the process to choose fixed inputs, called candidates, to
vary in the experiment. There is a risk that an adjuster will not be found.
The potential costs of moving the process center include the costs of:
• An experiment to find an adjuster, if necessary
• A one-time change to the adjuster
• The ongoing operation of the process at the new setting for the adjuster
To assess the potential benefit of moving the process center, we imagine shifting the
baseline process center to a new value.
Figure 15.2 Seal strength for nonleaking and leaking batteries.
reformulated the problem in terms of tensile strength. The tensile strength was measured on a sam-
ple of 100 batteries selected from one week’s production. The data are stored in the file battery
seal strength baseline and the results are shown in the histogram in the left panel of Figure 15.1.
From the group comparison, the baseline investigation, and engineering knowledge, the team set
the minimum acceptable seal strength at 320 pounds, which corresponds to increasing the cen-
ter of the process output by about 60 pounds. The desired process histogram is shown in the
right panel of Figure 15.1. At this point, the team did not attempt to determine if this shift in the
process center would meet the initial project goal of reducing the leak rate to less than 0.5%.
The team set out to find an adjuster. Based on engineering and process knowledge, they
chose three candidates (fixed inputs) at two levels each. The further apart the levels, the
more likely the experiment will detect an effect and the greater the risk of negative side
effects. They planned a factorial experiment with eight runs. The candidates and levels are
given in Table 15.1.
In each of the eight experimental runs, five batteries were produced. The runs were
conducted in random order. Treatment 7 corresponds to the existing process. The data are
given in the file battery seal strength move center and in Table 15.2. There were no leaking
seals in the 40 batteries produced in the experiment.
Table 15.2 Treatments and seal strength for battery seal experiment.

Treatment   Order   Melt temperature   Melt time   Elevator speed   Seal strength
Figure 15.3 Seal strength by treatment for the battery seal experiment.
In the analysis, we first plot the data by treatment as in Figure 15.3. We see, encouragingly,
that many treatments have average strength greater than treatment 7, the current operating
condition, and many meet the desired minimum of 320 pounds.
To isolate possible adjusters, we use MINITAB to fit a full model with all possible main
and interaction effects. Then we look at the Pareto plot of the effects in Figure 15.4. Only
the effect for melt temperature is large. Also, because none of the interaction effects is
large, we conclude that melt temperature is an adjuster.
We can use the main effect plot for melt temperature, given in Figure 15.5, to assess
quantitatively the effect of changing melt temperature. Decreasing the melt temperature by
50°F increases the average seal strength by about 90 pounds.
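In a two-level factorial, a main effect is just the average response at the high level of a candidate minus the average at its low level. The run averages below are hypothetical (the data in the file battery seal strength move center are not reproduced here), chosen only to be consistent with the roughly 90-pound temperature effect just described:

```python
# Hypothetical run averages of seal strength (pounds) for a 2^3 factorial
# in (melt time, melt temperature, elevator speed), coded -1/+1.
# Illustrative values only -- not the actual experimental data.
runs = [
    ((-1, -1, -1), 440.0), ((+1, -1, -1), 455.0),
    ((-1, +1, -1), 350.0), ((+1, +1, -1), 365.0),
    ((-1, -1, +1), 450.0), ((+1, -1, +1), 445.0),
    ((-1, +1, +1), 360.0), ((+1, +1, +1), 345.0),
]

# Main effect = average at the high level minus average at the low level.
def main_effect(factor):
    hi = [y for levels, y in runs if levels[factor] == +1]
    lo = [y for levels, y in runs if levels[factor] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

for name, k in [("time", 0), ("temperature", 1), ("speed", 2)]:
    print(name, main_effect(k))  # only temperature is large, about -92.5
```

Only the temperature effect stands out; that is the pattern that identifies melt temperature as an adjuster.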
Figure 15.4 Pareto chart of the effects for the battery seal experiment.
Figure 15.5 Main effect plot for melt temperature (750°F and 800°F).
We do not know from this two-level experiment if the effect of the adjuster is linear.
The team decided to use a melt temperature of 770°F, a little over halfway between the low
and high levels in the experiment. They did not want to decrease the melt temperature too
much for fear of creating other problems related to material flow. To validate the solution,
they changed the melt temperature to 770°F for one shift and carefully monitored the process
for leaks and negative side effects. The results were very promising. Since the cost of the
change was negligible, the team implemented the new melt temperature setting and moni-
tored the process for two months. The leak rate was reduced to 0.6%.
Figure 15.6 Box plots of dip bump score by core wash solution.
In the experiment, for each core wash, 50 cores were processed on each of two days.
There were 10 runs, two replicates of each treatment, and 50 repeats for each run. Within
each day, the order of treatments was randomized. The dip bump score for each of 500
cores was recorded. The data are given in the file engine block leaks move center. The cores
from the experiment were recycled. Core wash A is the current wash.
The experimental results are summarized in Figure 15.6. There were virtually no
dip bumps with core washes B and D. The team recommended changing the process to use
core wash B since it was cheaper than D.
During the validation of core wash B, the foundry scrapped a half shift worth of engine
blocks and shut down the engine assembly line. This disastrous event occurred because, over
the longer time frame used in the validation investigation, the new water-based core wash
separated and failed to effectively coat the cores. The experiment to assess the different core
washes was conducted over such a short time that the separation problem had not occurred.
The team had not anticipated that continuous mixing was required with the new wash.
Rather than changing the process to incorporate the necessary mixing, the team decided to
look again at core wash D (very carefully this time), since it was similar in formulation to
the original core wash. In the end, the foundry switched to core wash D and cylinder bore
leaks were virtually eliminated.
The problem goal was to reduce the average defect score. To establish a baseline and
to simultaneously assess the measurement system, the team selected a sample of 200 cast-
ings from one day’s production. The operator scored each casting twice. The data are in the
file differential carrier shrink defect baseline and are summarized in Table 15.3.
                         Second measurement
                        0      1      2      3      4
First measurement  0  105      5      0      0      0
                   1    3     40      1      0      0
                   2    0      3     30      1      0
                   3    0      0      0     10      0
                   4    0      0      0      0      2
The operator was very consistent in scoring the castings, and the team judged the meas-
urement system to be acceptable. The average score (of the first measurements) in this
investigation was 0.75, and 12 out of the 200 castings were defective. The problem goal
was to reduce the average score to close to 0, since then there would be almost no defective
castings.
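The baseline summaries can be recovered directly from Table 15.3. A short check in Python (the cutoff of 3 for calling a casting defective is our assumption; it matches the 12 defectives reported):

```python
# counts[i][j]: castings scored i on the first measurement and j on the
# second, from Table 15.3. Baseline summaries use the first measurement.
counts = [
    [105, 5, 0, 0, 0],
    [3, 40, 1, 0, 0],
    [0, 3, 30, 1, 0],
    [0, 0, 0, 10, 0],
    [0, 0, 0, 0, 2],
]

first = [sum(row) for row in counts]  # totals by first-measurement score
n = sum(first)
avg_score = sum(score * t for score, t in enumerate(first)) / n
defectives = sum(t for score, t in enumerate(first) if score >= 3)  # assumed cutoff

print(n, avg_score, defectives)  # → 200 0.75 12
```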
The team took the risky decision not to search for a dominant cause. Instead, they
adopted the Move the Process Center approach and planned an experiment. The team
selected nine candidates as potential adjusters. These candidates included three iron
chemistry levels (denoted A, B, and C), iron temperature (D), pouring time (E), concen-
tration of in-mold alloy (F), two molding inputs from changing the dies that make the
sand molds (G and H), and squeeze pressure (J). The team selected the low and high levels
of each candidate using engineering judgment. We code the two levels of each candidate
–1 and +1.
To define a run of the experiment, the team planned to set the candidate levels and oper-
ate the casting process for 40 minutes. From past experience with the process, they expected
to see castings with the shrink defect within that time period. Within each run, they planned
to select 20 castings and score them for shrink. They also planned to randomize the order of
the runs.
There are 2⁹ = 512 possible candidate combinations or treatments. It was not feasible
to carry out an experiment with this many treatments, so the team decided to use a fractional
factorial design with only 16 runs (see the chapter supplement). The 16 treatment
combinations must be carefully selected from the 512 possibilities. We use MINITAB to make the
best selection (see Appendix F). We give the experimental design and results in Table 15.4.
The team conducted the experiment without incident and stored the data in the file
differential carrier shrink defect move center.
Table 15.4 Experimental design and results for differential carrier experiment.

                        Candidates                  Shrink defect score frequency   Average
Treatment  Order   A   B   C   D   E   F   G   H   J     0    1    2    3    4      score
    1        3    +1  –1  –1  +1  +1  +1  –1  –1  +1     7    5    6    2    0      1.15
    2        4    –1  +1  +1  +1  –1  +1  –1  –1  –1     5    7    6    2    0      1.25
    3        2    –1  +1  –1  –1  +1  +1  –1  +1  –1     1    7    8    3    1      1.80
    4        5    –1  –1  +1  –1  +1  +1  +1  –1  –1     6   10    4    0    0      0.90
    5        6    –1  –1  –1  –1  –1  –1  –1  –1  +1     4    7    7    2    0      1.35
    6       10    –1  –1  +1  +1  +1  –1  –1  +1  +1     8   11    1    0    0      0.65
    7       15    +1  +1  +1  –1  +1  –1  –1  –1  –1     0    5    6    9    0      2.20
    8       16    +1  +1  +1  +1  +1  +1  +1  +1  +1     4    8    7    1    0      1.25
    9       12    +1  +1  –1  +1  –1  –1  –1  +1  –1     1    9    7    3    0      1.60
   10        7    –1  +1  –1  +1  +1  –1  +1  –1  +1     7   10    3    0    0      0.80
   11        8    –1  –1  –1  +1  –1  +1  +1  +1  –1    12    8    0    0    0      0.40
   12        9    –1  +1  +1  –1  –1  –1  +1  +1  +1     4    7    7    2    0      1.35
   13       14    +1  –1  –1  –1  +1  –1  +1  +1  –1     4    7    5    3    1      1.50
   14       13    +1  +1  –1  –1  –1  +1  +1  –1  +1     1    8    6    5    0      1.75
   15        1    +1  –1  +1  –1  –1  +1  –1  +1  +1     1    8    6    4    1      1.80
   16       11    +1  –1  +1  +1  –1  –1  +1  –1  –1     6    9    5    0    0      0.95
In Table 15.4, we give the frequency of scores and the average score for the 20 castings
in each run. We are looking for a treatment that gives a low average score. Here, due to the
discreteness of the scores, a plot of individual casting scores versus treatment is not very
informative. Instead, we look at the performance measure defined as the average score
over each run. In general, a performance measure is a statistic calculated over all the repeats
within each run to assess the performance of the process for that treatment. There are promising
treatments, such as number 11, where the average score is small.
From Table 15.4, we can see one important property of the selected design. For each
candidate, 8 of the 16 runs have the level –1 and the other 8 have the level +1. Due to this
balance, we can assess the main effect of any candidate by comparing the averages of these
two sets of runs, as we did in Chapter 13.
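Both the balance just described and the confounding discussed next can be checked directly from Table 15.4. The book does this analysis in MINITAB; the Python sketch below only makes the arithmetic explicit:

```python
# Candidate levels (A, B, C, D, E, F, G, H, J) and run average scores,
# copied from Table 15.4.
names = "ABCDEFGHJ"
design = [
    [+1, -1, -1, +1, +1, +1, -1, -1, +1],
    [-1, +1, +1, +1, -1, +1, -1, -1, -1],
    [-1, +1, -1, -1, +1, +1, -1, +1, -1],
    [-1, -1, +1, -1, +1, +1, +1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1, -1, +1],
    [-1, -1, +1, +1, +1, -1, -1, +1, +1],
    [+1, +1, +1, -1, +1, -1, -1, -1, -1],
    [+1, +1, +1, +1, +1, +1, +1, +1, +1],
    [+1, +1, -1, +1, -1, -1, -1, +1, -1],
    [-1, +1, -1, +1, +1, -1, +1, -1, +1],
    [-1, -1, -1, +1, -1, +1, +1, +1, -1],
    [-1, +1, +1, -1, -1, -1, +1, +1, +1],
    [+1, -1, -1, -1, +1, -1, +1, +1, -1],
    [+1, +1, -1, -1, -1, +1, +1, -1, +1],
    [+1, -1, +1, -1, -1, +1, -1, +1, +1],
    [+1, -1, +1, +1, -1, -1, +1, -1, -1],
]
avg = [1.15, 1.25, 1.80, 0.90, 1.35, 0.65, 2.20, 1.25,
       1.60, 0.80, 0.40, 1.35, 1.50, 1.75, 1.80, 0.95]

# Balance: every candidate is at +1 in exactly 8 of the 16 runs.
assert all(sum(1 for row in design if row[j] == +1) == 8 for j in range(9))

# Main effect: average score at the high level minus average at the low level.
def effect(j):
    hi = sum(y for row, y in zip(design, avg) if row[j] == +1)
    lo = sum(y for row, y in zip(design, avg) if row[j] == -1)
    return (hi - lo) / 8

for j, name in enumerate(names):
    print(name, round(effect(j), 4))  # A, B, D, and G stand out

# Confounding: the A column equals the elementwise product of the F and J
# columns, so the A effect cannot be separated from the F*J interaction.
assert all(row[0] == row[5] * row[8] for row in design)
```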
The price we pay for using a fractional design (rather than a full factorial design) is that
we lose information about interactions. We say that certain effects are confounded, because we
cannot estimate them separately. When we use MINITAB to generate the design, it produces
a list of the confounded effects, called the alias structure. For the differential
carrier experiment, the main effects and two-input interactions that are confounded are:
I
A + F*J
B + G*J
C + H*J
D + E*J
E + D*J
F + A*J
G + B*J
J + A*F + B*G + C*H + D*E
A*B + C*E + D*H + F*G
A*C + B*E + D*G + F*H
A*D + B*H + C*G + E*F
A*E + B*C + D*F + G*H
A*G + B*F + C*D + E*H
A*H + B*D + C*F + E*G
In the list, the main effects are labeled by a single letter and the interactions by pairs
such as A*B. To determine the confounding, we see from the second line, for example, that
the main effect of input A is confounded with the two-input interaction between F and J. As
a consequence, if the average difference between the eight runs with A at level –1 and A at
level +1 is large, the difference may be due to simultaneously changing the levels of inputs
F and J or to changing the levels of input A. We cannot separate these two possibilities with
the data from the experiment. Because of this confounding, we need to confirm any promising
adjuster found in the experiment. Third- and higher-order interactions are also confounded
with the given effects. MINITAB will provide the complete confounding (aliasing) struc-
ture if desired. We call this experimental plan a resolution III design because some main
effects are confounded with two-input interactions but not with other main effects. In a
resolution IV design, some main effects are confounded with three-input interactions but
no two-input interactions; in a resolution V design, some main effects are confounded with
four-input interactions but no three-input interactions. Since we assume that the higher the
order of an interaction, the more likely it is to be negligible, we want a design with as high
a resolution as possible. For a given number of candidates, MINITAB selects, by default,
one of the fractional factorial designs with the highest possible resolution.
The Pareto chart of the effects for the full model is given in Figure 15.7. We use the first
effect in the list of confounded effects as the label for the string of confounded effects so
that, for example, D corresponds to the combined effects of input D, the two-input interaction
E*J, and other three-input and higher interactions not shown in the confounding list. We
see in Figure 15.7 that effects for inputs A, B, D, and G are large relative to the others. If we
assume that the two-input and higher-order interactions are negligible, then inputs A, B, D,
and G are adjusters.
We give the main effects plots for inputs A, B, D, and G in Figure 15.8. Since a lower
average score is better, the desirable levels of the four inputs are –1 for inputs A and B and
+1 for inputs D and G. The team decided to investigate further the consequences of chang-
ing the levels of inputs A and B, the two iron chemistries, to the low level used in the
experiment.
To confirm the findings, the team set inputs A and B to their new levels, kept all other
inputs at their current levels, and operated the process for four hours. Of the 3108 castings
produced, only 12 were scrapped for shrink defect at machining. This scrap rate of 0.38%
was a substantial improvement over the original rate of 5%. There were no side effects, so
the team adopted the proposed process change.
Figure 15.7 Pareto chart of the effects for the differential carrier shrink defect experiment.
Figure 15.8 Main effects of significant inputs for shrink defect experiment.
Question
Does changing any of the candidates move the process center substantially?
Plan
• Select a study population where we expect to see the full extent of variation.
• Choose two levels for each candidate as extreme as is feasible.
• Define a run for the experiment. If the output is binary, a run must contain many repeats.
• Choose a design with at least eight runs to specify the treatments. For:
–Three or fewer candidates, use a full factorial design
–Four or more candidates, select a fractional factorial design with at least resolution III
• Spread the runs across the study population. Randomize the order as much as is feasible.
• Make everyone potentially impacted aware of the plan.
Data
• Carry out the experiment. For a:
–Continuous output, record the output value, the levels of the candidates, treatment num-
ber, and run order, one row for each repeat
–Binary output, record the proportion of defectives, the levels of the candidates, treatment
number, and run order for each run
Analysis
• Plot the output values for each treatment. Look for promising treatments.
• If there are two or more candidates, use a full model and create a Pareto plot of the effects.
• Check for possible confounding in the important effects if the design is a fractional factorial.
Conclusion
• A candidate with a large effect is an adjuster.
• For fractional factorial designs, verify an adjuster, if necessary, to resolve the ambiguity
due to confounded effects.
When changing the process center we need to watch carefully for negative side effects.
In the core strength example introduced in Chapter 1, the team found they could eliminate
core breakage by increasing core strength. They discovered an adjuster, the amount of resin
in the sand mix, which they used to increase the average strength. However, the stronger
cores led to more casting defects and the approach was abandoned. We may be able to
avoid undesirable side effects by using two or more adjusters simultaneously rather than
one adjuster as in the battery seal failure and engine block leaks examples. This is a good
reason for searching for several adjusters simultaneously.
For off-target process center problems, we can see the expected benefit of changing the
process center using the histogram from the baseline investigation. Based on the problem
goal and baseline, we know how far we want to move the process center and the expected
benefits of such a move. We also need to assess the costs of changing the adjuster and the
operating costs at the new level.
In some problems the two approaches, Move the Process Center and Make the Process
Robust, are identical. The differential carrier shrink defect problem is a good example.
Whatever the cause of the shrink defect, the change in the fixed chemistry inputs made the
process more robust to the cause.
In multistream processes, if we find that the process centers vary from stream to
stream, we can reduce the overall variation by moving the centers of each stream to a com-
mon target, as shown in Figure 15.9. In the left panel, we show box plots of the output by
stream and overall. In the right panel, we show the same plots after we (roughly) align the
substream centers. Aligning the substream centers results in a significant reduction in the
overall output variation.
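The benefit of aligning stream centers is easy to demonstrate by simulation. The centers and the within-stream standard deviation below are hypothetical, chosen only to mimic the pattern in Figure 15.9:

```python
import random
import statistics

random.seed(1)

# Four streams with a common within-stream standard deviation of 0.5 but
# different centers (hypothetical values).
centers = [-1.5, -0.5, 0.5, 1.5]
streams = [[random.gauss(c, 0.5) for _ in range(200)] for c in centers]

overall = [x for s in streams for x in s]
# Align each stream to a common target (here 0) by subtracting its center.
aligned = [x - statistics.mean(s) for s in streams for x in s]

print(round(statistics.stdev(overall), 2))  # inflated by stream differences
print(round(statistics.stdev(aligned), 2))  # close to the within-stream 0.5
```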
In some circumstances, an output center may drift over time. In that case, a one-time
adjustment of the process center will not solve the problem over the long term. We may
need to use another variation reduction approach, such as feedback control.
Figure 15.9 Output by stream before (left) and after (right) aligning the stream centers.
Key Points
• To move the process center, we need to find one or more adjusters, normally
fixed inputs. We change the settings of the adjusters to move the output center.
16
Desensitizing a Process to Variation
in a Dominant Cause
The goal of desensitization is to find and change fixed inputs that flatten the relationship
between the output characteristic and the dominant cause (see Figure 16.1). That is, we
find an interaction that we exploit to desensitize the process output to changes in the
dominant cause.1 To explore desensitization, we choose a number of fixed inputs to investi-
gate, based on knowledge of the dominant cause and the process. We use an experimental
plan to determine if these candidates and their new levels will make the process less sensitive
to variation in the dominant cause.
With this approach we do not address the dominant cause directly. We continue to live
with its variation. By changing the relationship between the cause and the output, we
reduce the effect of the cause.
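A toy model makes the mechanism concrete. Here a candidate's setting changes the slope through which the dominant cause acts; all numbers are hypothetical and purely illustrative:

```python
import random
import statistics

random.seed(2)

# Dominant cause: its variation is left completely unchanged.
cause = [random.gauss(0.0, 1.0) for _ in range(4000)]

def output(x, setting):
    # Interaction between the candidate setting and the cause: the slope of
    # the cause-output relationship depends on the setting (hypothetical).
    slope = {"current": 2.0, "alternative": 0.4}[setting]
    return 10.0 + slope * x + random.gauss(0.0, 0.3)  # 0.3 = all other causes

for setting in ("current", "alternative"):
    sd = statistics.stdev([output(x, setting) for x in cause])
    print(setting, round(sd, 2))  # the flatter slope transmits far less variation
```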
Figure 16.1 Original (left) and new (right) relationship between the dominant cause and
output characteristic.
The only requirement for desensitization is to find new settings of the candidates that
make the output less sensitive to variation in the dominant cause.
The costs of desensitization include:
• An experiment to find the new process settings that make the process less
sensitive to variation in the dominant cause
• A one-time change to the process settings
• The ongoing operation of the process with the new settings
There is no information about whether this approach will be feasible until the experimental
investigation is complete. This is a drawback, since the cost of the experimentation may be
high and the returns uncertain. The benefit can be assessed using the relationship between the
cause and the output. If we could totally eliminate the effect of the dominant cause, the max-
imum benefit is given by the residual variation in the output due to all other causes.
1 Regular Regular
2 Regular Alternative
3 Alternative Regular
4 Alternative Alternative
temperature) and another three blocks just after lunch (low pouring temperature). The
experiment was replicated over three days so that a total of 72 blocks were measured for
porosity. The 12 blocks before and after lunch were divided into three groups of four.
Within each group, the team randomized the order of the four treatments. Since temperature
increased steadily after lunch, each treatment saw roughly the same range of temperatures.
The team managed the randomization easily because the treatments only affected the cores. The
data are given in the file engine block porosity desensitization.
We start the analysis with a plot of porosity versus treatment number with different plot-
ting symbols for the high and low temperatures, as shown in Figure 16.2. For any treatment,
we can see if there is substantially less porosity than in the current process, here given by
Treatment 1. All three new treatments are a large improvement over the existing process
with smaller average porosity and less variation as the pouring temperature changes.
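This treatment-by-temperature comparison can be sketched in a few lines; the row layout and porosity values below are invented stand-ins for the file engine block porosity desensitization.

```python
# Sketch of the Figure 16.2 analysis: average porosity for each treatment at
# the two pouring-temperature levels. All data values here are invented.
from collections import defaultdict

def mean_by_group(rows):
    """rows of (treatment, temperature, porosity) -> {(trt, temp): mean}."""
    groups = defaultdict(list)
    for trt, temp, porosity in rows:
        groups[(trt, temp)].append(porosity)
    return {key: sum(v) / len(v) for key, v in groups.items()}

rows = [
    (1, "low", 340), (1, "high", 120),  # current process: large temperature effect
    (2, "low", 90),  (2, "high", 80),   # alternative treatments: small effect
    (3, "low", 85),  (3, "high", 75),
    (4, "low", 70),  (4, "high", 65),
]
means = mean_by_group(rows)
# Treatment 1 is far more sensitive to pouring temperature than treatments 2-4
print(means[(1, "low")] - means[(1, "high")])  # → 220
```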
The team decided to investigate the new core wash further to ensure that there were no
side effects such as an increase in other casting defects. There was no cost to this change.
The alternative core wash was eventually adopted and the scrap rate due to porosity
decreased from about 4% to less than 1%.
In this example, the team designed a full factorial experiment and drew their conclu-
sions from a simple plot. We give three more examples to illustrate some complications that
can arise in a desensitization experiment.
Figure 16.2 Porosity by treatment for high and low pouring temperatures.
The baseline oil pan scrap rate was 8%. Applying the method of elimination, the team dis-
covered that a dominant cause of scrap was the amount of lubricant applied to the blanks
prior to the draw press. A roll coater with two steel cylinders applied the lubricant. There
were no controls on the roll coater and no ongoing measurement of the lubricant amount.
During a verification experiment, the team used a visual five-point score for the amount
of oil applied. They specially prepared 80 blanks with lubricant amounts at each of the five
levels. The team could control the dominant cause during the experiment but not in normal
production. To verify the cause, the team stamped the 80 blanks for each oil level. We see
from Figure 16.3 that the scrap rate increased markedly as the lubricant score increased.
There was no obvious cost-effective way to control the amount of lubricant applied to
the blanks, so the team decided to try to desensitize the process. There was a lot of uncer-
tainty about whether this approach would be effective, but they decided that it was worth the
cost of the investigation in the hopes of avoiding a solution with high capital cost.
Based on knowledge of the cause, the team chose three candidates for the desensitiza-
tion experiment: lubricant supplier, die temperature, and binder force. After consultation,
three lubricant suppliers (the current supplier is labeled A and two other suppliers B and C)
provided the lubricant they judged the best suited for the application. Supplier A suggested
the lubricant in current use. The team selected the two levels for the other candidates using
engineering judgment. They decided to use only the two extreme levels 1 and 5 for the
amount of oil on the blanks, the dominant cause.
The team defined a run to be the consecutive stamping of 80 blanks as in the verification
experiment. Since the output was binary, each run required sufficient repeats so that some
defectives were produced. The team used the scrap rate for each run as a performance meas-
ure. The experiment had 24 runs. There were 12 treatments defined by the levels of the three
candidates and two runs per treatment, one for each level of the lubricant score. With this
plan, each treatment would see the full range of variation in the dominant cause.
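The crossed plan can be enumerated directly; the "low"/"high" codings for die temperature and binder force are placeholder labels, not the team's actual settings.

```python
# Sketch of the crossed plan: 12 treatments (3 suppliers x 2 die temperatures
# x 2 binder forces), each run at both extreme levels of the dominant cause
# (lubricant scores 1 and 5), giving 24 runs in total.
from itertools import product

suppliers = ["A", "B", "C"]
die_temps = ["low", "high"]        # placeholder codings
binder_forces = ["low", "high"]    # placeholder codings
lube_scores = [1, 5]               # two extreme levels of the dominant cause

treatments = list(product(suppliers, die_temps, binder_forces))
runs = [trt + (score,) for trt in treatments for score in lube_scores]

print(len(treatments), len(runs))  # → 12 24
```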
Figure 16.3 Scrap rate (%) by lubricant score.
Table 16.2 Candidate levels and scrap rates for low and high level of lubricant amount.
Since die temperature was difficult to change, all runs with the low temperature were
conducted first. Within the low-temperature runs, the six treatments were applied in random
order. Once they set the levels of the candidates, the team stamped the two sets of 80 blanks,
one set with oil level 5 and one with oil level 1.
We give the experimental plan, including the run order, and the results in Table 16.2
and in the file oil pan scrap desensitization.
Figure 16.4 Scrap rate for low and high level of lubricant amount versus treatment
(dashed horizontal line gives the baseline scrap rate).
Figure 16.5 Interaction plots for oil pan scrap desensitization experiment.
We start the analysis by plotting the scrap rate by treatment with different plotting symbols
for the two lubricant scores. In Figure 16.4, we see that there are promising treatments with
low scrap rates for both levels of the dominant cause.
We can look at the effects in more detail using the interaction plots between the candi-
dates and the lubricant score. In Figure 16.5, we see that the scrap rate is:
• Less sensitive to changes in the lubricant amount if the die temperature is low
• Substantially lower with binder force at the low level for both lubricant scores
• Much lower and less sensitive to changes in lubricant amount for lubricant C
Since lubricant C looks so promising, we next consider the interactions between binder force
and die temperature by lubricant amount for supplier C only. We subset the data set in
MINITAB (see Appendix A). From the interaction plots in Figure 16.6, we conclude that low
binder force is the preferred setting for lubricant C to reduce the scrap rate. Die temperature
has little effect.
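The subset-and-summarize step (done in MINITAB in the text) can be sketched in plain Python; the run values below are invented, not the experimental results.

```python
# Sketch of the supplier-C subset analysis behind Figure 16.6: mean scrap rate
# by (binder force, lubricant amount) for one supplier only. Data invented.
def subset_means(runs, supplier):
    """Mean scrap rate by (binder_force, lube_amount) for the given supplier."""
    groups = {}
    for sup, force, lube, scrap in runs:
        if sup == supplier:
            groups.setdefault((force, lube), []).append(scrap)
    return {key: sum(v) / len(v) for key, v in groups.items()}

runs = [
    ("C", "low", 1, 0.03), ("C", "low", 5, 0.04),
    ("C", "high", 1, 0.06), ("C", "high", 5, 0.09),
    ("A", "low", 1, 0.10), ("A", "low", 5, 0.20),
]
print(subset_means(runs, "C"))
```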
During the investigation, the team measured other important outputs to make sure that
changing the process settings would not result in other problems. They found that lubricant
C was also better than the original lubricant in terms of steel flow at the deep end of the
pan. The team decided to change to the lubricant from supplier C. After the change, the
long run scrap rate was reduced to about 3%. There was a small increase in cost that was
far outweighed by the savings in scrap costs.
Figure 16.6 Interaction plots of die temperature and binder force by lubricant amount
(supplier C only).
In this resolution IV design, the main effect of input D4 is confounded with the three-input
interaction of D1, D2, and D3. Thus pairs of two-input interactions such as D1*D2 and D3*D4
are also confounded.
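This alias pattern can be checked numerically. The generator D4 = D1*D2*D3 is an assumption consistent with the confounding just described, not taken from the book's design matrix.

```python
# With half-fraction generator D4 = D1*D2*D3 and the usual -1/+1 coding, the
# product column D1*D2 equals D3*D4 in every run, so those two interactions
# cannot be separated by the experiment.
from itertools import product

runs = []
for d1, d2, d3 in product([-1, 1], repeat=3):
    d4 = d1 * d2 * d3  # assumed generator for the eight prototypes
    runs.append((d1, d2, d3, d4))

aliased = all(d1 * d2 == d3 * d4 for d1, d2, d3, d4 in runs)
print(len(runs), aliased)  # → 8 True
```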
The team set up the testing laboratory with two sets of environmental conditions:
Normal Conditions to mimic use in temperate climate
Extreme Conditions to match the worst anticipated usage conditions with high
temperature, high humidity, and frequent door openings
For each prototype listed in Table 16.3, the team planned two runs in the laboratory,
one at normal and one at extreme environmental conditions. We call this a crossed design
since we use both levels of the dominant cause for each of the treatments. There are 16
combinations of the 5 inputs (D1 to D4 and environment) out of the 32 possible treatments.
We use MINITAB to determine the confounding structure for this fractional design.2 We
assume third- and higher-order interactions are negligible, so we attribute any large differ-
ences to main effects and two-input interactions. We reproduce the confounding structure
with any effects involving three or more of the fixed inputs (candidates) erased.
I
D1
D2
D3
D4
environment
D1*D2 + D3*D4
D1*D3 + D2*D4
D1*D4 + D2*D3
D1*environment
D2*environment
D3*environment
D4*environment
D1*D2*environment + D3*D4*environment
D1*D3*environment + D2*D4*environment
D1*D4*environment + D2*D3*environment
Table 16.4 Plan and data for refrigerator frost buildup desensitization experiment.
Figure 16.7 Cooling plate temperature by treatment (1 to 8) for low and high
environmental conditions.
In Figure 16.8, we look at the interaction plots for each candidate versus the environ-
mental cause. We hope to find an interaction that gives consistently low cooling plate
temperatures for both levels of the environment, as seen in treatment 6.
Figure 16.8 Interaction plots of the candidates and the environmental cause; the output is
cooling plate temperature.

Using Figure 16.8, the team concluded that they should change D2 and D4 to their
new levels. However, changing to the new level of D4 added significant cost. As a result, the
team decided to investigate how much the cooling plate temperature variation could be
reduced if only D2 was changed to its new level. This treatment was not included in the
original fractional design. When they produced a prototype with only D2 changed and sub-
jected it to both levels of the environmental cause, they found that changing this input alone
did not sufficiently reduce the cooling plate temperature and its variation as the environ-
mental input was changed.
Despite the added cost, the team recommended adopting a refrigerator design with
both D2 and D4 changed to their new levels. After the change, the frequency of complaints
about frost buildup from the tropical market was substantially reduced.
Eddy Current
In a process that produced castings later machined into brake rotors, there was 100%
inspection of hardness using a measurement system based on eddy currents. This system
was fast and nondestructive. Despite the 100% inspection, there were frequent complaints
from the customer, a machining operation, about castings out-of-specification with respect
to hardness. The customer used the Brinell method to measure hardness, a standard proce-
dure that is partially destructive and time consuming, not suitable for 100% inspection.
There was a high reject rate at the 100% eddy current inspection station. The operators
measured all rejected castings a second time. The plant shipped castings that passed at least
one test and scrapped those that failed both tests.
Management assigned a team to reduce scrap costs and customer complaints by reducing
casting hardness variation. To establish a baseline, each day for one week, the team col-
lected a haphazard sample of 100 first-time eddy current measurements for a total of 500
hardness measurements. The data are available in the file eddy current baseline and are
plotted by day in Figure 16.9. The hardness specification is 4.3 to 4.7.

Figure 16.9 Box plots of eddy current hardness measurements by day (dashed horizontal
lines give the specification limits).

The baseline standard deviation is 0.1365 and the full extent of hardness variation is
4.52 ± 3*0.14 (average ± 3 standard deviations), or about 4.1 to 4.9.
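The "full extent of variation" arithmetic can be written out directly, using the baseline numbers quoted in the text:

```python
# Full extent of variation = average plus or minus three standard deviations.
mean, sd = 4.52, 0.1365

low, high = mean - 3 * sd, mean + 3 * sd
print(round(low, 2), round(high, 2))  # → 4.11 4.93
```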
The team assessed the measurement system using a standard gage R&R procedure as
described in the supplement to Chapter 7. The measurement system standard deviation was
0.021, so it appeared the measurement system was adequate to proceed to the next stage of
the algorithm.
Because of the past complaints from their customer, the team also carried out an inves-
tigation to compare the eddy current and Brinell hardness measurement systems. They
selected 30 castings with widely varying eddy current measurements, then tagged and
shipped these castings to the customer. The customer measured the Brinell hardness for
each casting. The data are plotted in Figure 16.10, and given in the file eddy current Brinell
measurement.
The real problem is now apparent. There is poor correlation between the two measure-
ment systems. Although the earlier gage R&R investigation showed that the eddy current
system was repeatable, for any set of castings with the same Brinell hardness, the eddy
current system gave widely varying values. The team reformulated the problem to improve
the correlation between the two measurement systems. Said in another way, the team decided
to reduce the variation in eddy current measurements among castings with the same Brinell
hardness.
The team discovered that day-to-day fluctuations in iron chemistry and the level of dirt
on the castings were dominant causes of the variation of the eddy current measurement of
castings with the same Brinell hardness. They did not look for a more specific cause in the
chemistry family. Castings were cleaned by shot blasting. The level of cleaning varied
since the shot blast machine did not run using a first-in, first-out protocol. The cleaning
times ranged from 5 to 19 minutes.
In regular production, iron chemistry was expensive to control, and it was difficult to
remove all the dirt from the casting before measurement. As a result, the team decided to
look for a way to run the eddy current measurement system that was less sensitive to vary-
ing levels of dirt and iron chemistry.
Figure 16.10 Scatter plot of eddy current versus Brinell hardness measurements.
For the desensitization experiment, the team chose three candidates corresponding to
settings on the eddy current system. For each candidate they selected high and low levels
on either side of the current settings. The team planned a full factorial with eight treatments
using the levels given in Table 16.5.
To ensure that each treatment was assessed under the extreme levels of the two dominant
causes, the team selected 16 castings from each of two days’ production. Each casting was
cleaned for 5 minutes, measured with the eddy current system, cleaned for another 14 minutes,
and then measured again. The team measured every casting using all eight treatments at both
points in the cleaning cycle. Finally they measured the Brinell hardness of each casting at two
different positions on the casting. The team knew that the Brinell hardness measurements had
little measurement variation relative to the variation in the eddy current machine. In the analy-
sis, they used the average of the two Brinell values as the true hardness of the casting.
The team hoped that the two days would represent the day-to-day variation in iron
chemistry. This was a risky decision. With only two days, the full day-to-day variation in
chemistry might not be captured. If the team had found the specific cause within the chemistry
family, they could have more easily ensured that each treatment was subject to the full
range of the dominant cause. We give the data from the experiment in the file eddy current
desensitization. There are 16 eddy current measurements and the average Brinell reading
for each of 32 castings.
The analysis of this experiment is complicated. The goal in the analysis is to find a com-
bination of levels for the candidates that yields a strong relationship between the Brinell
hardness values and the values given by the eddy current system, consistently over the four
combinations of the levels of the dominant causes. We do not worry about measurement
bias, since if the bias is consistent it can be easily removed.
In the analysis, for each of the eight treatments we plot the hardness as measured by the
eddy current system versus the (average) Brinell hardness for all four combinations of the
dominant causes. For example, in Figure 16.11, we show the plot for the treatment
frequency = 200, temperature = 35, and gain = 30. This plot is typical of all eight
treatments. We see from Figure 16.11 that there is a weak relationship between the two
methods of measurement. For a given Brinell hardness, there is large variation in the eddy
current measurements.
The experiment failed to find settings of the candidates that make the eddy current
measurement system less sensitive to changes in chemistry and dirt. In all cases, there was
no relationship that could be used to predict the Brinell hardness from the eddy current
measurements. Had one or more of the treatments shown promise, the team could have
conducted a more formal analysis as discussed in the chapter supplement.3
Figure 16.11 Plot of eddy current versus Brinell hardness for freq. = 200, temp. = 35, and
gain = 30.
In the end, the team concluded that the experiment was not a failure. They had shown
that the eddy current system did not work well and could not be easily improved. There was
little value in the 100% inspection. By removing the eddy current system, the foundry was
forced to concentrate on reducing variation in the process instead of relying on imperfect
100% inspection.
Question
Which combination, if any, of the candidate levels results in a process less sensitive to vari-
ation in the dominant cause?
Plan
• Select a design for the candidates. For:
–Three or fewer candidates, use a full factorial design
–Four or more candidates, use a fractional factorial design with resolution III or higher
• Define a treatment number for each combination of the candidates.
• Select two levels for the dominant cause at the extremes of its normal range.
• Use a crossed design where, for each treatment, there are runs for both levels of the dominant
cause.
• Define a run of the experiment including treatment assignment and the number of repeats.
Data
Carry out the experiment. Record the output value, the levels for candidates, the level of the
dominant cause, treatment number, and run order, one row for each repeat.
Analysis
• Plot the output for each treatment number. Use a different plotting symbol for the two levels
of the dominant cause. Look for promising treatments.
• Construct all cause by candidate interaction plots.
Conclusion
Examine the plots to determine the levels of the candidates (process settings) that make the
output less sensitive to variation in the dominant cause.
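The Plan steps above can be sketched as a small design builder. The candidate names and levels below are placeholders, not from any particular example.

```python
# Build a full factorial over the candidates, number the treatments, and cross
# each treatment with both levels of the dominant cause.
from itertools import product

def crossed_plan(candidates, cause_levels):
    """candidates: {name: [levels]}; returns (treatment_no, settings, cause) rows."""
    names = list(candidates)
    plan = []
    for trt_no, combo in enumerate(product(*candidates.values()), start=1):
        for cause in cause_levels:
            plan.append((trt_no, dict(zip(names, combo)), cause))
    return plan

plan = crossed_plan(
    {"candidate 1": ["regular", "alternative"],   # placeholder names and levels
     "candidate 2": ["regular", "alternative"]},
    cause_levels=["low", "high"],
)
print(len(plan))  # → 8  (4 treatments x 2 cause levels)
```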
The dominant cause varies in normal operation of the process. During a desensitization
experiment, however, we need to hold the cause fixed at its low or high level for a run of
the experiment. If this is not possible, we may need to resort to a robustness experiment as
described in Chapter 19.
We strongly recommend the crossed design to ensure that all interactions between the
individual candidates and the dominant cause are separately estimable, even if a fractional
design is used for the candidates.
In many instances, we choose not to completely randomize the order of the runs. In the
refrigerator example, eight prototypes were built and then simultaneously tested under the
two levels of the (composite) dominant cause. Because of high cost, it was not feasible to
randomize the order in which each refrigerator was exposed to each level of the cause.
There are some special considerations when planning a desensitization experiment for
binary output. There must be enough repeats in each run so that some defectives occur on
at least half the runs in the experiment. Otherwise the experiment will provide little useful
information. See tables in Bisgaard and Fuller (1995a, 1995b, and 1996) for some guidance
concerning the sample sizes necessary. The analysis of an experiment with a binary output
can be based on the proportion defective within each run, as in the oil pan scrap example.
We start the analysis of a desensitization experiment with a plot of the individual output
values by treatment number, as in Figure 16.2. We use a different plotting symbol for each
level of the dominant cause. For complex problems such as the eddy current measurement
system, we use scatter plots or other graphical summaries to visually characterize the per-
formance of the process for each treatment. From these summaries, we can assess if desen-
sitization is feasible. In a full factorial experiment, if none of the treatments are promising,
we cannot desensitize the process using the selected levels of the candidates. If we have a
fractional design, or if one or more treatments appear promising, we look at interaction plots
between the individual candidates and the cause as in Figure 16.5.
We may find process settings that are less sensitive to variation in the dominant cause, but
that result in an undesirable shift of the process center or some other side effect. In this case, we
may look for an adjuster to move the center (see Chapter 15) or we may formulate a new prob-
lem to deal with the side effect. In the crossbar dimension example discussed in Chapter 12, bar-
rel temperature was identified as a dominant cause. The team found that increasing the set point
for barrel temperature (but not controlling the variation) increased the average size but substan-
tially reduced the variation in the crossbar dimension. They then needed to move the process
center and also solve a new problem called burns, a visual defect that occurred on some parts
molded with the higher barrel temperature. See the exercises for chapters 16 and 18.
We recommend experiments with only two levels for each candidate. If we are successful
in finding a candidate that desensitizes the process, we may optimize using a follow-up
experiment with the identified candidate at several levels and the dominant cause at two
levels. If a candidate is categorical, such as the supplier in the oil pan scrap example, we
may use more than two levels in the desensitization experiment.
We have seen many teams proceed directly to the desensitization approach without first
identifying a dominant cause. They then conduct an experiment where they change both
candidates and suspects. In this way, they hope to identify a dominant cause and desensitize
the process at the same time. This is a poor strategy. The experiment will be large and com-
plex since there is little knowledge about which varying input, if any, is a dominant cause.
There is also little information to help select the fixed inputs as candidates. It is more effective,
both in terms of cost and the likelihood of finding a solution, to search first for a dominant
cause (or at least clues about the dominant cause family) using the method of elimination
with observational investigations, and then to consider a desensitization experiment.
We sometimes analyze the results of desensitization experiments using statistical mod-
els. Model building is an advanced topic that we do not cover here. See experimental
design references such as Montgomery (2001) and Wu and Hamada (2000). Additional
analysis using models may be necessary if we wish to predict the performance for candi-
date levels not used in the experiment.
The idea of desensitizing a process to a dominant cause was popularized by Taguchi
(1985, 1986). He calls a suspect a noise factor. Taguchi (1985) cites several examples,
including the famous Ina tile case. In other examples, Taguchi proceeds without knowledge
of a dominant cause. This corresponds to the robustness approach described in Chapter 19.
We provide a more detailed discussion of the designs and analysis suggested by Taguchi in
the supplement to Chapter 19.
Key Points

• A desensitized process is one that is less sensitive (in terms of output variation)
to changes in an identified dominant cause.
17
Feedforward Control Based on a Dominant Cause

We use a feedforward controller to reduce the effect of an identified dominant cause
of variation. The basic idea is to measure the cause and then predict the output
value. If the predicted output is not close to the target value, we make an adjust-
ment to the process to compensate for the predicted deviation. Figure 17.1 is a schematic of
the implementation.
For feedforward control to be effective, the requirements are:
• A known relationship between the output and the dominant cause
• A reliable system to measure the dominant cause
• A timely way to adjust the process center
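A minimal sketch of this logic, with an invented linear predictor and adjustment threshold (neither comes from an example in the book):

```python
# Feedforward control: measure the dominant cause, predict the output, and
# adjust the process only when the predicted deviation from target matters.
def feedforward(cause_value, target, predict, threshold):
    """Return the adjustment to apply (0.0 means no adjustment required)."""
    deviation = predict(cause_value) - target
    return -deviation if abs(deviation) > threshold else 0.0

def predict(cause):
    """Assumed known linear relationship between cause and output."""
    return 2.0 * cause + 1.0

print(feedforward(3.0, target=5.0, predict=predict, threshold=0.5))  # → -2.0
```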
Figure 17.1 Schematic of feedforward control.
H1212ch17.qxd 3/31/05 10:48 AM Page 260
When making potato chips, dark spots on the chips were undesirable to the customers. In
the existing process, the dark spot problem was an uncommon but significant concern. A
team was assigned to reduce the occurrence of dark spots. Dark spots were measured on a
scale from 1 to 10 for each lot of chips. The rating was subjective, but by using the same
assessor, the measurement system added little variation. The team used existing production
records to quantify the baseline. The lot average score was 1.83 and the full extent of vari-
ation was a range of scores from 1 to 8. They also discovered that there was strong time-to-
time variation in the occurrence of dark spots.
Stratifying by potato batch and investigating further, they suspected that the sugar con-
centration in the incoming potatoes was the dominant cause. To verify the suspicion, the
team produced chips using five different batches of potatoes chosen to have a wide range
of sugar concentrations. From each batch of potatoes, three lots of chips were produced.
The resulting data are given in Table 17.1 and plotted in Figure 17.2.
Batch    Sugar concentration    Dark spot scores
1        0.3%                   1, 1, 2
2        0.4%                   2, 1, 3
3        0.5%                   4, 5, 5
4        0.6%                   5, 4, 4
5        0.7%                   5, 6, 7
Figure 17.2 Scatter plot of dark spot score versus sugar concentration (%).
The team decided to implement a feedforward control scheme. They noticed that a sig-
nificant number of dark spots occurred when the sugar concentration was 0.5% or greater.
They knew they could reduce the average sugar concentration in a batch of potatoes by
storing them for several weeks at temperatures above 13°C. Any batch of potatoes with
initial sugar concentration above 0.4% was stored until the sugar concentration decreased.
Otherwise the potatoes were processed immediately. The dark spot problem was virtually
eliminated (Nikolaou, 1996). There were added storage and logistic costs but no other
reported negative side effects due to the storage.
Figure 17.3 Histogram of imbalance (percent) from the baseline investigation.
The team had several choices of approach. They could reformulate the problem again and
try to reduce the imbalance in each of the two components. They could implement 100%
inspection and rework transmissions with imbalance greater than 2.0, a very expensive
option. Instead, they decided to assess the feasibility of feedforward control. In the assembly
process, they planned to measure the center of gravity of each component and then assemble
the two parts so that there was a maximum offset (that is, the two centers of gravity were set
180° apart). Assembling in this way is called vectoring. See Figure 17.4.
To estimate the possible improvement from vectoring, the team measured the center of
gravity for 100 pairs of the two components selected over two days to match the baseline
investigation. We give the data in the file steering wheel vibration feedforward.
The team calculated the imbalance for each sampled pair if vectoring had been used in
the assembly. The histogram for these simulated data is shown in Figure 17.5. For these
data, 3.5% of the transmissions had imbalance greater than 2.0, a marked improvement
over the baseline.
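Because vectoring sets the two centers of gravity 180° apart, the component imbalance vectors oppose each other, and the assembly imbalance is simply the absolute difference of the component imbalances. A sketch with invented pair values:

```python
# Simulated imbalance of assemblies built at maximum (180-degree) offset.
def vectored_imbalance(m1, m2):
    """Imbalance of an assembly built with the two centers of gravity opposed."""
    return abs(m1 - m2)

pairs = [(1.2, 0.9), (0.4, 0.5), (2.1, 0.3)]  # invented component imbalances
print([round(vectored_imbalance(a, b), 2) for a, b in pairs])  # → [0.3, 0.1, 1.8]
```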
Component 1 Component 2 Vectored assembly
X
X
X X
Figure 17.4 Vectoring to reduce imbalance (the X shows the location of the center of
gravity for each component).
Figure 17.5 Histogram of imbalance (percent) after vectoring, simulated from the
feedforward investigation data.
Since vectoring alone cannot meet the goal, the team considered selective fitting fol-
lowed by vectoring, a more complicated and expensive option. The idea was to select a pair
of components with similar imbalances and then vector them. There are two steps:
• Sort the second component into bins based on its measured imbalance.
• Measure the first component and select a matching second component from the
appropriate bin.
The team needed to determine how best to sort the components (that is, determine the
number of bins and their boundaries) and whether sorting was worth the trouble.
To quantify the possible benefits of selective fitting with different bin structures, the
team used the data from the feedforward investigation to simulate several scenarios. They
explored the effect of sorting by creating two or three bins for the second component.
With two bins, the components were divided based on whether their imbalance was less
than or greater than 0.75. This is roughly the middle of the range for both components in
the sample. With three bins, the boundaries were 0.33 and 0.90. The simulation results are
presented in Table 17.2. The benefits of selective fitting are clear. The goal can be reached
by using a combination of sorting components and then vectoring. We give a more
detailed explanation of how this simulation was done in the supplement.1
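The two-bin scheme can be sketched as follows. The greedy matching rule and the component values are invented simplifications, not the team's actual simulation.

```python
# Selective fitting with two bins (boundary 0.75) followed by vectoring:
# sort second components into bins, match each first component to one from
# the same bin, then take the absolute difference (vectored assembly).
def bin_of(imbalance, boundary=0.75):
    return 0 if imbalance < boundary else 1

def selective_fit(firsts, seconds, boundary=0.75):
    """Greedily match each first component with a second from the same bin."""
    bins = {0: [], 1: []}
    for m in seconds:
        bins[bin_of(m, boundary)].append(m)
    assemblies = []
    for m1 in firsts:
        bin_list = bins[bin_of(m1, boundary)]
        m2 = bin_list.pop() if bin_list else (bins[0] or bins[1]).pop()
        assemblies.append(abs(m1 - m2))  # vector the matched pair
    return assemblies

print([round(x, 6) for x in selective_fit([0.2, 1.1], [1.0, 0.3])])
```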
The team decided to implement feedforward control using selective fitting with two
bins and vectoring. They arranged for the off-line measurement of the center of gravity
(imbalance and angle) of the second component. The angle was marked on each compo-
nent. The components were then sorted into two bins depending on whether the imbalance
exceeded 0.75. In the assembly process, the center of gravity for the first component was
measured. This component was then matched with a mating second component from the
appropriate bin and the two selected components were vectored.
Truck Pull
Analysis of Variance
Source DF SS MS F P
Regression 4 77.827 19.457 627.82 0.000
Residual Error 95 2.944 0.031
Total 99 80.771
Truck frame geometry is a good predictor of the left caster value. We make the prediction
by substituting the known frame geometry summaries into the equation:
left caster = –18.6 + 1.24 left front + 0.677 right front + 0.140 left rear + 0.156 right rear
The standard deviation of the left caster for the 100 vehicles in the investigation is 0.903. The
residual standard deviation is 0.176. If we could eliminate the effect of the variation in
the frame geometry, we could reduce the variation in left caster by a factor of five. There
was a similar pattern for the other alignment characteristics.
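The benefit estimate compares the output standard deviation with the residual standard deviation after fitting the prediction equation. A sketch with one synthetic predictor standing in for the four frame-geometry summaries; all values are simulated, not the truck data.

```python
# Fit a least-squares line to simulated (cause, output) data, then compare the
# output standard deviation with the residual standard deviation.
import random
import statistics

random.seed(1)
cause = [random.gauss(0, 1) for _ in range(100)]
output = [1.2 * c + random.gauss(0, 0.18) for c in cause]  # cause effect + other causes

# Least-squares slope and intercept for a single predictor
mx, my = statistics.mean(cause), statistics.mean(output)
sxx = sum((x - mx) ** 2 for x in cause)
slope = sum((x - mx) * (y - my) for x, y in zip(cause, output)) / sxx
intercept = my - slope * mx

residuals = [y - (intercept + slope * x) for x, y in zip(cause, output)]
ratio = statistics.stdev(output) / statistics.stdev(residuals)
print(round(ratio, 1))  # roughly a factor-of-five style potential reduction
```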
In reality, a mathematical prediction model was built using knowledge of the geometry
of the suspension. The regression model mimicked the mathematical model well.
Since the potential benefit was large, the team recommended implementing feedforward
control. For each frame, the supplier measured and bar-coded the frame geometry sum-
maries. Then at the truck assembly plant they:
• Read the geometry summaries from the bar code and used the values to predict
caster and camber.
• Built positioning components based on the prediction to adjust the caster and
camber.
• Used the custom components to position the suspension components on the
frame during the assembly.
After implementing feedforward control, the standard deviation in left caster values, given
in the file truck pull validation, was 0.25. The full reduction in left caster variation predicted by
the regression model was not achieved because there was some error in building and using
the positioning components and there were also small errors in the mathematical model.
There was a large benefit for the extra cost in terms of reduced warranty claims. The
truck pull variation was reduced by roughly 70% through the use of the feedforward
controller.
Predictions should be accurate for the full range of values normally seen for the dominant
cause. Informal methods and simple models may give poor predictions and more compli-
cated models are sometimes warranted.2
Feedforward control is effective only if based on a dominant cause. By definition, we
cannot predict the output well with a cause of variation that is not dominant. If we base an
adjustment on a poor prediction, we may increase rather than decrease the output variation.
If we measure the dominant cause for every unit, we must be able to apply the adjustment
to each unit. This can be a difficult task in a complex process, especially if the adjust-
ment takes place far downstream from the measurement of the cause. In the truck pull
example, the team solved this difficulty by bar-coding the frames.
The frequency of adjustment depends on how the dominant cause varies. If the domi-
nant cause acts in the part-to-part family, we may need to make an adjustment for every
part. If the dominant cause acts in the time-to-time family, we can make adjustments less
often, as in the potato chip spots example. Since frequent adjustment adds to the cost and
complexity of the process, the feasibility of feedforward control depends on the nature of
the variation of the dominant cause. If adjustment costs are large, we may decide to only
adjust when the difference between the predicted output and the target is large.
In the truck pull example, we use the residual standard deviation after fitting a regres-
sion model relating the output and the dominant cause to estimate the benefits of feedforward
control. This standard deviation underestimates the remaining variation because there are
likely measurement, prediction and adjustment errors. In cases where the dominant cause
acts batch-to-batch, such as in the potato chip spots example, we use the within-batch vari-
ation as an optimistic estimate of the benefit. We can sometimes use available process data
to simulate the performance of a proposed feedforward controller to assess its benefit, as in
the steering wheel vibration example.
Feedforward control is related to feedback control, described in Chapter 18. In both
approaches we reduce variation by adjusting the process based on a prediction of the output.
The fundamental difference is that, with feedforward control, we predict the output using
the dominant cause, while with feedback control, we predict the future output using previ-
ously observed output values.
Feedforward control can also be thought of as a way of desensitizing the process to
variation in a dominant cause, as discussed in Chapter 16. In the desensitization approach,
we searched for process settings that permanently made the process less sensitive to vari-
ation in a dominant cause. With a feedforward controller, the desensitization requires
repeated adjustments to compensate for the different values of the dominant cause.
There is little published work on feedforward controllers available in the statistical
literature. The most comprehensive work is Box and Jenkins (1976). See also Jenkins (1983).
Key Points
• A feedforward controller compensates for the value of a dominant cause by pre-
dicting the output from the measured value of the cause and then adjusting the
process based on the deviation between the prediction and the target.
18 Feedback Control

Feedback control is used to compensate for a predictable pattern in the output char-
acteristic due to known or unknown causes. We reduce variation by predicting the
next output value using previously observed output values. Then we adjust the
process center to compensate for the deviation between the predicted output and the target.
We do not need to identify a dominant cause of variation with this approach. We illustrate
how to implement a feedback control scheme in Figure 18.1.
For feedback control to be effective, the requirements are:
• The dominant cause must act in the time-to-time family.
• There must be a timely way to adjust the process center.
The potential costs of feedback control include:
• An investigation to explore the time pattern in the output, if not known
• An experiment to find an adjuster, if one is not known
• The ongoing costs of measuring the output and making the adjustment
Figure 18.1 Feedback control scheme.
We can assess the potential benefits of feedback control once we know the time pat-
tern of variation in the output. With feedback control we can at best hope to eliminate the
time-to-time component of the output variation.
Figure 18.2 Box plot of tightness by front and axle cable batch.
To implement the approach, they needed to find an adjuster. The team knew that changing
the depth of an adjustment nut would change parking brake tightness. To calibrate the adjuster,
they selected five vehicles of varying tightness to cover the full extent of parking brake tightness
variation. The parking brake system on each vehicle had been originally installed with the
adjustment nut at a depth of 24 millimeters. The team reinstalled the parking brake system on
each vehicle with the nut at the four depths 22, 23, 25, and 26 millimeters and measured the
corresponding tightness. The data are given in Table 18.1 and in the file parking brake tight-
ness adjuster calibration.
We use MINITAB to fit a regression model to relate the average tightness to the depth
with results:
Analysis of Variance
Source DF SS MS F P
Regression 1 7.7440 7.7440 161.33 0.001
Residual Error 3 0.1440 0.0480
Total 4 7.8880
The estimated slope is 0.88, so changing the adjustment nut depth by one millimeter will
change the tightness, on average, by about 0.88 clicks.
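Given the calibrated slope, the nut-depth change needed for any desired tightness change follows directly. A sketch, with our own generic least-squares helper (not MINITAB output):

```python
def ols_slope(x, y):
    """Least-squares slope of y on x, as in the calibration regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

SLOPE = 0.88  # clicks of tightness per millimeter of nut depth (from calibration)

def depth_change(desired_tightness_change, slope=SLOPE):
    """Nut-depth change (mm) needed for a desired tightness change (clicks)."""
    return desired_tightness_change / slope
```

So a one-click increase in tightness calls for roughly a 1.14 mm increase in nut depth, matching the rule the team used.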
There was little cost to measuring tightness and to changing the depth of the adjust-
ment nut. Since the dominant cause was in the batch-to-batch family, the team concluded
that feedback control was feasible.
To implement the approach, the team decided to measure the tightness of the first five
vehicles assembled for each new batch of front cable assemblies. These measurements
were made as soon as possible once the parking brake was installed. They made an adjustment
if any of these measurements was outside the specification limits (5 to 8). The amount of
the adjustment was based on the difference between the target tightness (6.5) and the average
tightness for the five vehicles. To increase the average tightness by one click, they increased
the depth of the adjustment nut by roughly 1/0.88 = 1.14 millimeters. For ongoing protec-
tion, and to ensure that the feedback control was effective, they changed the control plan so
that within a batch of front cables, the parking brake tightness was measured on every tenth
vehicle. If an out-of-specification vehicle was found, then all vehicles for that batch were
inspected and reworked if necessary. After implementation, there was a marked reduction
in the amount of rework required because of parking brake tightness variation.
V6 Piston Diameter
We discussed the problem of excess diameter variation in the production of V6 aluminum
pistons in chapters 2, 5, 9, and 11. From the problem baseline investigation, the team learned
that the process was centered on target with standard deviation 3.32 microns. The problem
goal was to reduce the standard deviation to less than 2.0 microns. The team determined that
the measurement system was adequate and, using a variation transmission investigation (see
Chapter 11), that the home of the dominant cause was the intermediate Operation 270. They
reformulated the problem to reduce the variation in the diameter as measured at Operation 270.
At Operation 270, there were two parallel grinders. To better understand the perform-
ance of the process, the team measured one piston a minute from each grinder for 200
minutes. They ensured that no adjustments were made to the grinders while the data were
collected. The data for stream 1 are given in the file V6 piston diameter 270. Diameter is
recorded in microns as the deviation from 87 millimeters. Thus, a measured diameter of
87.595 millimeters is recorded as 595. We show the run chart for stream 1 in Figure 18.3.
Figure 18.3 Run chart of diameter at stream 1 of Operation 270 by minute.
The plot for the second stream is similar. Once the data were stratified by stream, the team
saw that the short-term variation in the diameter was small relative to the time-to-time
variation.
Based on the run chart, the team decided to look at the feasibility of feedback control
separately for each stream. They found this approach attractive because they knew that
the operator could make an adjustment at Operation 270 within a short time and with
immediate effect.
The team selected an informal feedback scheme similar to precontrol.1 The rules for
each stream were:
• Every 15 minutes, measure the diameter of two consecutive pistons after
machining.
• Compare the average diameter to the adjustment limits 592.7 and 600.7 microns.
• If the average falls outside the limits, adjust the process to the target 596.7 microns.
The size of the adjustment, if any, is the difference between the observed average diameter
and the target.
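Assuming perfect adjustments, the rules above can be replayed against a recorded minute-by-minute series. This sketch uses our own function and variable names; the synthetic series in the test stands in for the data in the file V6 piston diameter 270.

```python
LOW, HIGH, TARGET = 592.7, 600.7, 596.7  # adjustment limits and target (microns)

def simulate_scheme(series, check_every=15):
    """Replay the precontrol-like rules on a minute-by-minute diameter series.

    Every `check_every` minutes, average two consecutive diameters; if the
    average falls outside the adjustment limits, shift the process center
    by (target - average), assuming the adjustment is perfect.
    """
    offset = 0.0
    adjusted = []
    for minute, diameter in enumerate(series):
        adjusted.append(diameter + offset)
        if minute % check_every == check_every - 1 and minute >= 1:
            avg = (adjusted[-1] + adjusted[-2]) / 2
            if not LOW <= avg <= HIGH:
                offset += TARGET - avg
    return adjusted
```

A process stuck above the upper adjustment limit is pulled back onto target at the first check and left alone afterwards.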
The team derived the adjustment limits by working backwards using information about
the relationship between the diameter at Operation 270 and the final diameter. To meet the
overall goal, they needed the final diameter to be well within the specification limits of 591
±10 microns.
To assess the potential gain from this approach, the team used the data in the file V6 pis-
ton diameter 270 to simulate the effect of feedback control.2 They assumed the adjustment
was perfect in the simulation, which would not be true in practice. We see the results of the
simulation in Figure 18.4.
Figure 18.4 Diameter at stream 1 of Operation 270 by minute (solid line gives the original series, dashed line gives the adjusted series).
We estimate the benefit of the proposed feedback scheme by comparing the adjusted
and unadjusted series. Using MINITAB, we get:
Based on the simulation, we expect the feedback controller to reduce the standard
deviation of the diameter at Operation 270 from 3.3 to about 1.7. We combine this information
with the results from the variation transmission investigation (see the supplement to Chap-
ter 11) to predict that after implementing the feedback controller, the standard deviation in
the final diameter would be reduced to
The team changed the control plan at Operation 270 and trained the operators to use the
new control scheme. They also arranged for a periodic process audit to ensure the new
scheme was being used.
After implementation, the team repeated the baseline investigation to validate the
improvement. The standard deviation of the finished diameter was reduced from 3.32 to 2.20
microns.
The estimate of measurement variation was 0.42 microns. If the diameter was within
specification at the original measurement, we cannot explain a measured value 13.2 by the
short-term measurement system variation.
The team next decided to assess the stability of the measurement system by measuring the
same two pistons every 15 minutes for 12 hours, starting immediately after calibration. The
data are given in the file V8 piston diameter measurement stability and plotted in Figure 18.5.
Over the 12-hour period, the variation was much larger than expected based on the
results from the short-term measurement investigation. The range of diameter values for
each piston was around 8 microns. This drift could easily be responsible for the out-of-
specification piston found in the audit.
The team immediately increased the frequency of calibration to every two hours. They
decided not to look for the cause of the drift but instead implemented a feedback control
scheme. Immediately after calibration and then every 15 minutes, the operator measured
the same piston and recorded its diameter. If the change from the initial measurement was
more than 1.8 microns (in either direction), the process was stopped and the gage was
recalibrated. Note that the short-term standard deviation of the difference in two measurements
on the same piston is 0.42√2 = 0.59, so a difference of 1.8 microns indicates that the
measurement system has drifted.
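The 1.8-micron trigger is roughly a three-standard-deviation rule for a difference of two measurements, which is why a flagged change is attributed to gage drift rather than measurement noise. As a sketch (function and constant names are ours):

```python
import math

MEAS_SD = 0.42                    # short-term measurement sd (microns)
DIFF_SD = MEAS_SD * math.sqrt(2)  # sd of the difference of two measurements
TRIGGER = 1.8                     # recalibration trigger (microns)

def needs_recalibration(initial_reading, current_reading):
    """Flag drift when the check piston's reading moves too far either way."""
    return abs(current_reading - initial_reading) > TRIGGER
```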
The changes added to the operating cost of the inspection gage. However, no further
out-of-specification pistons were found in audits. There was a positive side effect. The
scrap rate due to out-of-specification diameters decreased by about 50%, which more than
made up for the loss of cycle time of the gage. Before the change, the gage rejected many
good pistons. As well, there were fewer adjustments to the upstream machining process
that had been driven by the drift in the inspection gage.

Figure 18.5 Measured diameters for the two pistons over the 12-hour period.
The EWMA prediction of the next output value is a weighted average of the current and
past values:

predicted value = αyt + α(1 – α)yt–1 + α(1 – α)²yt–2 + …

where α (the Greek letter alpha) is a constant and 0 < α ≤ 1. In the prediction, the most
recent value yt gets the highest weight α, the next most recent point, yt–1, gets the second
highest weight α(1 – α), and so on. You can use a bit of probably forgotten high school
algebra to show that the weights add to one.
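The algebra is a geometric series: α + α(1 – α) + α(1 – α)² + … = α / (1 – (1 – α)) = 1. A quick numerical check of the weights and of the recursive form of the prediction (helper names are ours):

```python
def ewma_weights(alpha, n):
    """First n EWMA weights: alpha * (1 - alpha)**k for k = 0, 1, ..., n-1."""
    return [alpha * (1 - alpha) ** k for k in range(n)]

def ewma_predict(series, alpha):
    """EWMA prediction of the next value from the series (newest value last)."""
    pred = series[0]
    for y in series[1:]:
        pred = alpha * y + (1 - alpha) * pred  # recursive update
    return pred
```

A constant series is predicted exactly, since the weights sum to one.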
Figure 18.6 Flow rate by time (minutes).

Figure 18.7 Flow rate with EWMA predicted values (MAPE: 1.53881, MAD: 1.56194, MSD: 3.82725).
We use MINITAB (see Appendix C) to estimate the parameter α from the data collected
over a period of time when there were no adjustments. For the flow rate data, we get the
results shown in Figure 18.7. The estimated value for α is 0.169. We also show the series of
predicted values in Figure 18.7 to demonstrate how the EWMA prediction captures the
long-term pattern in the series and smoothes the short-term variation.
Once the controller is operational, we start to make adjustments and see the series that
includes the effects of earlier adjustments. Let yt* be the observed output with the controller
in operation. After some algebra,4 we can show that the appropriate adjustment at time t is
α(yt* – Target) in the direction of the target. In other words, if yt* is greater than the target
value, we move the process center down by α(yt* – Target). Here we assume adjustments
are made after every measurement.
If we make perfect adjustments, we get an estimate of the process standard deviation
when the controller is in use from the MINITAB results. For this example, the estimated
standard deviation is 1.96 (√MSD from Figure 18.7), a substantial reduction from 3.23, the
standard deviation of the unadjusted series, which serves as a baseline here.
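Under the same perfect-adjustment assumption, the EWMA adjustment rule is a short loop. The synthetic step-change series in the test stands in for the flow rate data; all names here are ours.

```python
def ewma_feedback(uncontrolled, alpha, target):
    """Apply the adjustment alpha * (y_star - target) after each observation.

    `uncontrolled` is the series the process would produce with no control;
    the returned series is what we would observe with the controller on,
    assuming each adjustment is perfect and takes effect immediately.
    """
    offset = 0.0
    observed = []
    for y in uncontrolled:
        y_star = y + offset                   # output with controller running
        observed.append(y_star)
        offset -= alpha * (y_star - target)   # partial compensation toward target
    return observed
```

A sudden level shift is pulled back to the target at rate α per observation, so the controller removes slow drift while leaving short-term noise largely untouched.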
To implement this feedback control scheme, the team calibrated the valve and auto-
mated the adjustment so that it could be carried out each minute. The standard deviation of
the film build was markedly reduced.
Truck Pull
In Chapter 1 and elsewhere, we discussed a problem with excess pull variation in trucks. Pull,
an alignment characteristic, is a linear function of left and right camber and caster. The base-
line data truck pull baseline include the values for left and right caster and camber on more
than 28,000 trucks produced over a two-month period. As a result of the problem focusing
effort discussed in Chapter 6, the team determined that reducing caster variation was the
highest priority. They searched unsuccessfully for a dominant cause of this variation.
The team then decided to investigate feedback control as an approach to reduce the
caster variation. We illustrate the assessment of feasibility using right caster. There were similar
results for left caster. The target for right caster is 4.5°. A summary of the baseline data is:
The process is well centered with standard deviation 0.243°. We see in the plot of right
caster angle over time (see Figure 18.8) that there is some drift in the process near the middle
of the series.
The team knew they could adjust caster using the same custom components that were
built for the feedforward control scheme discussed in Chapter 17. Since these components
were manufactured and assembled approximately two hours before caster was measured,
feedback control based on adjustments after each truck was not feasible. The team decided
to investigate the effects of making an adjustment once per shift. The process is run with
three eight-hour shifts per day.
As a first step, they looked at a one-way analysis of variance as described in the supple-
ment to Chapter 11. The idea was to separate the baseline variation into two components,
variation within shifts and variation from shift to shift. The partial results from MINITAB are:
The estimate of the within-shift standard deviation, labeled MS(Error) in the MINITAB
results, is √0.043 = 0.207. If we could use feedback control to make all the shift averages
equal, we would expect a reduction in the baseline standard deviation from 0.243 to 0.207,
a modest improvement.
Summarizing the caster shift averages gives:
Figure 18.8 Right caster angle over time.
To obtain the shift average summary from the available data, we first store the caster aver-
ages by shift (see Appendix B). Across the 132 shifts, the standard deviation of the shift
averages is 0.126. We plot the shift averages and the predicted averages from the EWMA
in Figure 18.9. The smoothing constant is α = 0.944, so the most recent shift is given very
high weight in the prediction of the subsequent shift average. If we use feedback based on
the EWMA to adjust the process at the start of the shift, then we estimate that the standard
deviation of the shift averages will be reduced to 0.077 (√0.00598), less than half the baseline
shift-to-shift variation, 0.126. However, since the within-shift variation is so large, this
reduction has little impact on the overall variation.
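The arithmetic behind "little impact" is that independent components combine as variances, so shrinking the smaller component barely moves the total. A check using the numbers from the analysis above:

```python
def overall_sd(within_sd, between_sd):
    """Overall standard deviation when independent components add as variances."""
    return (within_sd ** 2 + between_sd ** 2) ** 0.5

baseline = overall_sd(0.207, 0.126)   # close to the 0.243 baseline caster sd
after_fb = overall_sd(0.207, 0.077)   # shift-to-shift part cut by feedback
```

Even halving the shift-to-shift component only moves the overall standard deviation from about 0.243 to about 0.221, because the within-shift component dominates.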
At this point, the team decided that the potential gain of feedback control would not be
worth the cost. Given the two-hour time lag and the imperfect adjuster, they knew they would
not be able to achieve the small gain predicted using the baseline data.
The team abandoned the problem because no other approach was feasible.
Figure 18.9 Caster shift averages and EWMA predicted values (smoothing constant α = 0.944).
[Figure: dimension by tool]
We must be able to measure the output quickly compared to the time period that defines
the family containing the dominant cause of variation. In an iron foundry, there was a one-hour
delay between the time the iron was sampled and the time the chemistry measurements
became available. Since the within-hour variation in the process chemistry was relatively
large, feedback control was not feasible. In the truck pull example, there was a lag of two
hours between the point of adjustment and the measurement of caster. Hence the team considered
feedback control schemes that would adjust caster at most every two hours.
We can make similar comments about the action of the adjuster. If the time for the
adjustment to take effect exceeds the time period that defines the family containing the
dominant cause, we cannot use feedback control effectively.
We recommend determining the pattern of variation before looking for an adjuster
since the cost of finding an adjuster may be substantial. If needed, we can search for an
adjuster using an experimental plan as described in Chapter 15. Sometimes we need to
calibrate a known adjuster, as in the parking brake example, so that we can make adjust-
ments of different sizes to the process center as required.
To specify a feedback control scheme, we need:
• A sampling plan to determine when and how to measure the output
• A rule to predict future output values and to decide what adjustment, if any, is
required
• A procedure to make the adjustment
The design of the sampling plan involves a trade-off between the cost of measurement
and the ability to predict the future output values. In the V6 piston diameter example, the
team decided to measure two pistons every 15 minutes. In the fascia film build example, the
team measured flow rate every minute.
To predict future output values, we can use an exponentially weighted moving average
or a simple method such as the value of the most recent observation. In the supplement, we
discuss the choice of prediction method more fully.5
Adjusting after every observation may be undesirable due to the added process complexity
and adjustment costs. A simple alternative is to use a dead band or bounded feedback
adjustment (Box and Luceno, 1997). With dead bands, we do not adjust the process if the
predicted deviation from the target is small. Compared to adjusting after every observation,
a dead band scheme results in a smaller decrease in variation and a large reduction in the
number of adjustments.
The size of an adjustment was the difference between predicted output and the target in
the examples discussed here. However, in cases where a process is subject to regular drift
either upward or downward, say due to tool wear, the adjustment may be to the opposite
side of the specification limit rather than to the target. This idea is illustrated in the discus-
sion of feedback control in Chapter 3.
Once we have designed the feedback scheme, we can assess the potential benefits in
several ways if we have a series of historical data that matches the sampling protocol of
the scheme. We can simulate the effect by applying the adjustments to the historical series.
If we base the control scheme on an EWMA, we can use MINITAB to estimate the stan-
dard deviation of the adjusted process. If we have multivari data with short-term and time-
to-time families, we can use a one-way ANOVA to estimate the variation within the
short-term family. This is the best we could hope to achieve with a feedback control scheme
designed to eliminate the time-to-time variation. In all cases, we assume adjustments are
made without error and so, if this is not the case, we overestimate the benefit of a feedback
control scheme.
A comprehensive treatment of feedback control from a statistical perspective is given
in Box and Luceno (1997) and Del Castillo (2002). There are many variations of feedback
control. See Tucker et al. (1993), Box and Jenkins (1994), and Box and Luceno (1997) for
further details. Specific examples include acceptance control charts (Duncan, 1986) and
precontrol (Shainin and Shainin, 1989; Juran, Gryna, and Bingham, 1979).6 There is also
an extensive engineering literature on feedback control.7
If a dominant cause acts part to part, using feedback control will result in increased
variation. This is called tampering by Deming (1992, pp. 327–328). The effect of tampering
is shown by Deming’s famous funnel experiment.
Key Points
• Feedback control is based on predicting future output values using the current
and past output values and adjusting if appropriate.
• Feedback control may be effective if the (unknown) dominant cause acts in the
time-to-time family.
• Feedback control requires an adjuster that moves the process center quickly, has
low cost to change, and has small effect on other outputs.
• We may need to calibrate the adjuster with an experiment because we need to
make repeated adjustments of different sizes.
• To implement feedback control, we require a sampling plan and a rule to decide
when and how large an adjustment to make.
3. In the supplement, we compare EWMA to some other formal prediction methods. We also
show that using partial compensation of the current deviation from target to determine the
adjustment is appropriate under some assumptions.
4. See note 3.
5. See note 3.
6. See note 1.
7. We describe the connection between the simple feedback control schemes discussed here and
engineering proportional-integral-differential (PID) controllers.
19 Making a Process Robust

To make a process robust, we must find changes to fixed inputs that make the process
output less sensitive to variation in the unknown dominant cause. The robustness
approach is similar to desensitization, described in Chapter 16. However, now we do
not have knowledge of a specific dominant cause.
The requirements for robustness to be effective are:
• The unknown dominant cause acts in the short-term family of variation.
• New settings of fixed inputs reduce the effect of the unknown dominant
cause, resulting in less output variation.
To assess the feasibility of robustness, we choose a number of fixed inputs (candidates) and
an experimental plan to determine if changing the levels of these candidates will make the
process robust. The first requirement is important because we must define a run to be long
enough to see the full extent of variation of the output under the current candidate settings. As a
consequence, the dominant cause will vary over its full range within each run. Hence, for every
run, we will be able to see if the candidate settings make the process robust to the variation in
the unknown dominant cause. If the dominant cause acts in the time-to-time family, it will likely
not be feasible to conduct such an experiment since the runs would need to be too long.
The costs of the robustness approach include:
• An experiment to search for the new settings of the candidates
• A one-time change to the process settings
• The ongoing operation with the new process settings
There is no information about whether this approach will be feasible until the experimental
investigation is complete. There is a risk of running a high-cost experiment with no return. We
cannot assess the benefits of this approach until we find the new settings for the candidates.
Crossbar Dimension
We discussed the problem of reducing variation in a crossbar dimension in Chapter 12 and
the exercises for Chapter 16. The team raised the barrel temperature set point to make the
process less sensitive to variation in barrel temperature, the dominant cause. When validat-
ing the solution, they showed the variation in the crossbar dimension was substantially
reduced but, with the new setting, there was an increase in the frequency of a mold defect
called burn. They decided to address the burn defect as a new problem. Using a multivari
investigation, they showed that the dominant cause of burn acted in the part-to-part family,
but the specific dominant cause was not found. They suspected that the defect occurred
when the mold cavity filled too fast. In any case, since the suspect dominant cause could
not easily be controlled, the team decided to try the robustness approach.
The team planned an experiment with four fixed inputs (candidates): injection speed,
injection pressure, back pressure, and screw speed. These candidates were selected because
of their influence on fill time and other potential dominant causes in the part-to-part family.
They selected two levels for each candidate as given in Table 19.1.
The team decided to define a run as five consecutive parts. Since they knew the domi-
nant cause acted in the part-to-part family, they expected it to act within each run. Each run
was carried out once the process stabilized after changing the values of the candidates.
The team selected a fractional factorial experiment with the eight runs given in Table 19.2.
Since there was no proper baseline investigation, the team assigned the letters to the candidates
so that one of the treatments (treatment 5) corresponded to the current process settings (see the
Chapter 15 supplement for details). The confounding pattern of the chosen design is given as
follows. In the resolution IV design, pairs of two-input interactions are confounded.
Alias Structure
A + BCD
B + ACD
C + ABD
D + ABC
AB + CD
AC + BD
AD + BC
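The alias structure can be verified mechanically: with the generator D = ABC (so the defining relation is I = ABCD, an assumption consistent with the listed aliases), the column of level products for ABD matches the column for C, AB matches CD, and so on.

```python
from itertools import product

# Eight runs of the 2^(4-1) design, levels -1/+1, with generator D = ABC.
RUNS = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

def column(word):
    """Column of level products for a 'word' such as 'ABD' over the runs."""
    index = {"A": 0, "B": 1, "C": 2, "D": 3}
    cols = []
    for run in RUNS:
        prod = 1
        for letter in word:
            prod *= run[index[letter]]
        cols.append(prod)
    return cols
```

Two effects are confounded exactly when their columns are identical, which is what the checks below confirm for each listed alias pair.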
The burn on each part was classified into one of four categories of increasing severity.
Levels 1 and 2 were acceptable, while levels 3 and 4 resulted in scrap. The order of the runs
was randomized. The experimental results are given in Table 19.2 and the file crossbar
dimension robustness.
Treatments 2 and 3 are promising relative to the current process as given by treatment 5.
We plot the burn scores against treatment in Figure 19.1. Because the data are discrete, we add
jitter in the vertical direction (see Appendix C). We use MINITAB (see Appendix F) to stack
the treatment and burn score columns to create one row for each repeat in the experiment.
Table 19.2 Experimental plan and data for burn robustness experiment.

Figure 19.1 Burn scores by treatment.

Figure 19.2 Pareto chart of the effects on average burn (A: injection speed, B: injection pressure, C: back pressure, D: screw speed).
We use the average burn as the performance measure (that is, the response in MINITAB)
for the analysis. We look for candidate settings that make the performance measure as
small as possible.
Fitting a full model with all possible effects, we get the Pareto plot of the effects for the
average burn score in Figure 19.2. We see that only factor C (back pressure) has a large effect.
The team assumed the three-input interaction (ABD) confounded with C was negligible.
We show the main effect plot for factor C in Figure 19.3. The low level of back pres-
sure gives lower burn scores on average.
The team decided to reduce the back pressure to 75 and leave the other fixed inputs at
their original values. We finish this story in Chapter 21.
Figure 19.3 Main effect plot of back pressure on average burn.
ratio was 1.5 and the estimated stdev(measurement) was 0.33. The cause of the measure-
ment variation was unknown but acted over the short term.
The measurement process has three major steps:
• Sample the molten iron and pour coins.
• Machine and polish the coins.
• Use a spectrometer to determine the concentration of silicon in the coins.
In the measurement system investigation, the team saw most of the measurement vari-
ation in the silicon concentration of coins poured consecutively from the same stream of
iron, then prepared and measured together. Rather than search for the dominant cause of
the measurement variation, the team decided to try to make the measurement process
robust to the unknown cause. They knew that the spectrometer was highly repeatable and
hence chose candidates from the other steps of the measurement process. Table 19.3 gives
the five candidates and their levels.
The definition of a run is a key stage in the planning of a robustness experiment. Since
the coin-to-coin variation was dominant, the team decided to sample iron and prepare five
coins as quickly as possible for each run. They expected the dominant cause of measure-
ment variation would act within each run.
The team used the resolution III fractional factorial experiment with eight runs and
treatments given in Table 19.4. The confounding pattern is:
A + BD + CE
B + AD + CDE
C + AE + BDE
D + AB + BCE
E + AC + BCD
BC + DE + ABE + ACD
BE + CD + ABC + ADE
Table 19.4 Treatments and results for iron silicon concentration robustness experiment.
A resolution III design can estimate main effects, assuming that two-input interactions are
negligible.
To conduct the experiment, the team manufactured two new molds and selected two
old molds. They set the mold temperature to 300° for one of the new molds and to 400° for
the second, and similarly for the two old molds. They then quickly sampled iron and poured
40 coins, 10 for each combination of mold design and temperature. Since all coins were
produced from essentially the same iron, they assumed that the true concentration of sili-
con in each coin was the same. Next, the team prepared the 40 coins according to the exper-
imental plan. The data are given in the file iron silicon concentration robustness and
Table 19.4.
We are looking for a treatment combination that has little variation. As a first step in the
analysis, we plot the measured silicon concentration by treatment as shown in Figure 19.4.
We see that treatments 4 and 8 look promising since they have relatively little variation.
We summarize the performance of the measurement system for each run using log(s),
where s is the standard deviation of the five measurements within each run. We use the log
transformation to better meet the assumptions underlying the model for the data. The
smaller the within-run variation, the smaller the performance measure. We show the values
of the performance measure for each treatment in Table 19.4. As expected, treatments 4 and
8 have the smallest values. From the original investigation of the measurement system, the
baseline performance is log(0.33) = –1.11.
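With the repeat measurements for each run stored by treatment, the performance measure is a one-line calculation. The values below are illustrative, not the book's data (which are in the file iron silicon concentration robustness).

```python
import math
import statistics

# Illustrative repeat measurements, five per run; the actual data are
# in the file "iron silicon concentration robustness".
measurements = {
    1: [2.6, 2.2, 2.4, 2.5, 2.3],
    2: [2.1, 2.4, 2.2, 2.6, 2.3],
}

# Performance measure: natural log of the within-run standard deviation.
# Smaller values mean less measurement variation.
log_s = {t: math.log(statistics.stdev(vals))
         for t, vals in measurements.items()}

# Baseline measurement standard deviation for comparison.
baseline = math.log(0.33)  # about -1.11
```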
Figure 19.5 shows the Pareto chart of the effects when fitting a full model. The largest
effects are the main effects of candidates B (mold design) and D (surface finish).
To draw conclusions, we summarize the results using the main effects plots given in
Figure 19.6. With the performance measure log(s), smaller is better, so we see that switching
to the new mold design and polishing the samples is beneficial.
Figure 19.4 Measured silicon concentration by treatment for the iron silicon experiment.
Figure 19.5 Pareto chart of the effects on log(s) for iron silicon experiment (A: mold temp., B: mold design, C: depth of cut, D: surface finish, E: sample temp.; the largest bars are B, D, BC, and BE).
To validate that the performance of the measurement system would be improved with
the new mold design and polished surface finish, the team carried out a simple investigation.
They sampled 30 coins from the same stream of iron and measured the silicon concentra-
tion using the new levels for the mold design and surface finish and the current levels for the
other candidates. Notice that they had not used this treatment in the experiment. The standard
deviation of the 30 measurements was 0.15, substantially less than the baseline 0.33. With the
new setting we expect the measurement discrimination ratio to increase to 3.3.
The team made a risky decision to use a single batch of iron in the robustness experiment.
By using only one silicon concentration, there was a danger that the conclusions from the
experiment would not generalize to the range of silicon concentrations seen in the process.
Figure 19.6 Main effects for iron silicon concentration robustness experiment (log(s), ranging from about –1.78 to –1.30, plotted for mold temperature, mold design, depth of cut, surface finish, and sample temperature).
A Tank 1 dip
B Tank 2 dip
C Tank 1 concentration
D Tank 2 concentration
E Tank 2 temperature
F Tank 3 temperature
There was some confusion about the appropriate levels for the candidates, since they all
varied somewhat in the normal process. For each candidate, the team decided to use the set
point as prescribed in the control plan as one level (–1) and a new set point outside the nor-
mal range of variation as the second level (+1). They used their judgment to choose the
direction of the change.
For convenience, they defined a run as a rack of 48 grills, all processed simultaneously.
The performance measure for each run was the number of defective grills in the rack.
In the past, without much success, the organization had made various attempts to
reduce the pinskip defect rate by changing the levels of one fixed input at a time. Now,
the team hoped to find a helpful interaction among the candidates. Accordingly, they
planned a resolution V experiment with 32 treatments that would allow for the separate
estimation of all main effects and two input interactions. This was not possible with six
candidates in a 16-run experiment. They planned to conduct the experiment over four
days, giving the process time to settle down after changing the candidate levels. The
candidates C, D, E, and F (the tank concentrations and temperatures) take the longest
time to change, so they ordered the runs to minimize the number of changes of these
candidates.
The experimental plan and the number of defective grills for each run are shown in
Table 19.6 and are given in the file electroplating pinskip defect robustness.
The experiment was more time-consuming than expected. The team could not com-
plete the planned eight runs in each day. They found the tank temperatures and concentra-
tions hard to change quickly. In the end, the experiment was conducted over five days.
From Table 19.6, there are many promising treatments with no pinskip rejects in the
rack. Using MINITAB and a full model, we get the Pareto chart of effects shown in
Figure 19.7. There are three related interactions (CD, CE, and DE) that are large. We present
a cube plot in Figure 19.8 for the candidates C, D, and E. The plot gives the average number
of grills rejected per rack for the eight combinations of the three candidates.
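The cube-plot values are simply cell means of the run results over the eight (C, D, E) combinations. A sketch with hypothetical reject counts (the actual data are in the file electroplating pinskip defect robustness):

```python
from collections import defaultdict

# Each tuple: levels of C (tank 1 conc.), D (tank 2 conc.),
# E (tank 2 temp.), and the number of defective grills in the rack.
# Counts here are hypothetical.
runs = [
    (-1, -1, -1, 1), (-1, -1, -1, 0),
    (1, -1, -1, 6), (1, -1, -1, 6),
    (-1, 1, -1, 3), (-1, 1, -1, 3),
    (1, 1, -1, 1), (1, 1, -1, 1),
]

# Average rejects per rack for each corner of the cube.
totals = defaultdict(list)
for c, d, e, defects in runs:
    totals[(c, d, e)].append(defects)
cube = {corner: sum(v) / len(v) for corner, v in totals.items()}
```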
Figure 19.7 Pareto chart of the effects for the electroplating pinskip defect experiment (A: tank 1 dip, B: tank 2 dip, C: tank 1 conc., D: tank 2 conc., E: tank 2 temp., F: tank 3 temp.; the largest effects are the interactions CD, CE, and DE).
Figure 19.8 Cube plot of average number of pinskip rejects per run for tank 1
concentration, tank 2 concentration, and tank 2 temperature.
The lower the number of rejected grills the better. From the cube plot, the team was
surprised to see that the process at the current levels (shown as the lower left-hand point in
the plot) had an average of 0.5 rejects per rack, the best observed performance (tied with
two other combinations). This was surprising since the current process averaged about 1.6
defective grills per rack.
The team concluded that there was no reason to change the process settings based on
the results of the experiment. They decided to abandon the project.
We cannot say for certain what went wrong. One possibility is that none of the
selected candidates interacted with the unknown dominant cause. Alternately, if we examine
Figure 19.9, a plot of the number of defective grills in the rack versus the order of the runs,
we see that there was a large burst of defectives during the middle portion of the experi-
ment. If the dominant cause acts in the time-to-time family, then since each run consisted
of a single rack, there was little chance of the cause acting within each run and hence no
chance to see if any of the candidates could make the process robust. Defining a run as a
single rack of grills is not appropriate if the dominant cause acts in the time-to-time family.
The lesson here is that the team made a number of poor decisions. They should not
have jumped directly to the robustness approach. They had little process knowledge to
help choose the candidates. They had no assurance that the dominant cause would act
within each run or even in the week used for the experiment. They would have been
better off first investigating the nature of the process variation and generating more
clues about the dominant cause before selecting any particular variation reduction
approach.
Figure 19.9 Number of defective grills in the rack by run order for the electroplating pinskip experiment.
Question
Which, if any, combination of the candidate levels reduces the sensitivity of the process to
variation in unknown dominant causes?
Plan
• Define a run long enough to see the full extent of variation in the output in the existing
process.
• Determine the number of parts (repeats) to be measured within each run and the perform-
ance measure(s).
• Determine the number of runs.
• Select a fractional factorial design of resolution III or higher.
• Randomize the order of the runs as much as is feasible.
• Make everyone potentially impacted aware of the plan.
Data
Carry out the experiment. Record the output values, the levels for candidates, treatment num-
ber, and run order, one row for each run. Use a separate column for each repeat.
Analysis
• Plot the output by treatment to look for promising treatments. To make this plot, temporar-
ily arrange the data with a separate row for each repeat.
• Calculate the performance measure(s) across the repeats for each run.
• Use a full model and a Pareto chart to analyze the performance measure(s) looking for
large main and interaction effects.
• For large effects, construct main and interaction effects plots.
Conclusion
• Identify the levels of the candidates that lead to the best performance.
• Check that the change of settings does not produce substantive negative side
effects.
• Estimate the costs of changing the settings and the new ongoing operating costs.
• Estimate the benefit of the change of settings.
If we can accomplish all these tasks, and the benefits outweigh the costs, we proceed to the
validation stage of the algorithm.
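The Analysis steps above (calculate the performance measure, fit a full model, rank the effects as in a Pareto chart) can be sketched for a two-level design by estimating each effect as a contrast; the design and log(s) values below are illustrative.

```python
# Each effect in a two-level design is the difference between the
# average performance measure at its +1 and -1 levels.
# Columns: A, B; last entry: log(s) for the run (illustrative values).
design = [
    (-1, -1, -1.2), (1, -1, -1.3), (-1, 1, -1.6), (1, 1, -1.7),
]

def effect(col):
    hi = [row[-1] for row in design if row[col] == 1]
    lo = [row[-1] for row in design if row[col] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effects = {'A': effect(0), 'B': effect(1)}

# Ranking the absolute effects reproduces the ordering shown in a
# Pareto chart of effects.
pareto = sorted(effects, key=lambda k: abs(effects[k]), reverse=True)
```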
Since we have little or no knowledge of the dominant cause, we are given little guid-
ance on what candidates to select and how to pick their levels. We recommend using as
many candidates as feasible and a fractional factorial design.
The definition of a run is a key step in planning the robustness experiment. To identify set-
tings that reduce the effect of the unknown dominant cause, we need the cause to act within
each run. To define a run, we can use knowledge of the time pattern of output variation from a
multivari, baseline, or other investigations. In the iron silicon concentration example, the team
knew that measuring the same coin several times in a short period of time would show most of
the variation in the measurement system. They defined a run to be five consecutive measure-
ments on coins poured from the same iron. In the electroplating pinskip defect example, the
team had no knowledge of the time family containing the dominant cause. They specified a run
as a single rack of grills. This would have been an appropriate choice if the dominant cause
acted within a rack. However, if the dominant cause acted in the rack-to-rack family or slowly
over time, then with this definition of a run, the experiment was doomed to fail.
If the time-based family in which the dominant cause acts is unknown, we recommend a
multivari investigation before proceeding with a robustness experiment. When the dominant
cause acts slowly, we need to have long runs so that the cause acts within each run. Long runs
add to the cost and complexity of the experiment, so in this case, determining robust process
settings may be infeasible. On the other hand, feedback control may be feasible.
For each run in the experiment, we calculate a performance measure to assess the behav-
ior of the process output within the run. We define the performance measure based on the goal
of the problem. There are many possible performance measures.1 In the iron silicon concen-
tration example, the performance measure was the standard deviation (actually the logarithm
of the standard deviation) of the output values measured during the run. The problem goal
was to reduce the measurement system variation. In the crossbar dimension example, the
team chose the average burn score over the run to measure process performance. By lower-
ing the average score, the team hoped to address the problem goal of reducing the frequency
of burn defects. The team could have selected an alternate performance measure such as the
proportion of parts scrapped due to burn (as in the electroplating pinskip defect example) in
each run. With this choice they would have required longer runs since the output is binary.
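The run-length requirement for a binary output can be made concrete: if each part is defective with probability p, the run must be long enough that at least one defective is likely. A sketch, with an assumed target probability of 0.95:

```python
import math

def run_length(p, prob_at_least_one=0.95):
    """Smallest run size n with P(at least one defective) >= target,
    assuming independent parts with defect rate p."""
    # 1 - (1 - p)**n >= target  =>  n >= log(1 - target) / log(1 - p)
    return math.ceil(math.log(1 - prob_at_least_one) / math.log(1 - p))

# At a 5% defect rate a run of 59 parts suffices; at 1%, 299 parts.
```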
We can define and analyze several performance measures (for example, average and
standard deviation) within the same experiment. For instance, to make a measurement sys-
tem more robust, we may simultaneously analyze bias and measurement variation as two
performance measures. In the camshaft lobe runout example, introduced in Chapter 1, the
goal was to reduce the average and variation in runout. If the team had adopted the robust-
ness approach, in the experiment, they would have calculated both these performance
measures within each run. With two or more performance measures we may be forced to
make a compromise in the choice of settings.
We need to have enough repeats within each run to get a good estimate of the process
performance. It is hard to give a firm rule, but more repeats are better. If measuring the out-
put is expensive we can use a relatively long run but measure a sample of parts within the run.
After the experiment, we can assess the costs and benefits of the approach. We can esti-
mate the performance measure for the new process settings. There are costs associated with
changing the candidate levels and the ongoing operating costs at the new levels. We also
need to check for negative side effects. Changing candidate levels may shift the center of
the process in an undesirable direction. In that case there may be additional costs related to
finding an adjuster and operating the process at a different level of the adjuster.
The robustness approach is often selected to reduce the rate of defectives. In the robustness
experiment for a binary output, we need a run to be long enough so that each run will likely
contain one or more defective parts. If defectives are rare, we may be able to aggravate the
process to increase the defect rate for the purposes of the experiment. Then we hope that results
obtained under the aggravated conditions are relevant for the standard process. For example,
there were field failures of exterior electrical boxes after several years due to corrosion. The
team conducted a robustness experiment on the painting process using scored panels in a salt
spray chamber (a highly aggravated condition) to see if changing fixed inputs would increase
the durability.
There is a strong connection between the robustness approach and the desensitization
approach discussed in Chapter 16. In the latter case where we control the known dominant
cause in the experiment, we can determine the interactions between the candidates and the
cause directly in the experiment. With the robustness approach, we can only observe the
interaction indirectly through the performance measure.
There is also a connection between the robustness and moving the process center
approaches. In the electroplating pinskip defect example, the output was binary, and the
goal of the problem was to reduce the proportion of defective grills. We can view the
approach taken as either moving the process center or robustness. In either case, we search
for changes to fixed inputs to achieve the goal without knowledge of a dominant cause.
We recommend desensitization over robustness if at all possible. In other words, we
recommend first finding the dominant cause of the variation. We have had little success
applying the robustness approach. In most circumstances one of the other variation reduc-
tion approaches is preferred. Selecting the robustness approach is a last hope.
The idea of analyzing a performance measure such as the within-run standard deviation
was first suggested by Bartlett and Kendall (1946). Nair and Pregibon (1988) give a motivation
for using log(s). Taguchi (1986) popularized process robustness and called it parameter
design.
Key Points
20
100% Inspection
If you put off everything till you’re sure of it, you’ll get nothing done.
—Norman Vincent Peale, 1898–1993
The simplest yet most controversial variation reduction approach is 100% inspection.
We compare the value of the output characteristic of each part to inspection limits. We
then scrap, downgrade, or rework any part with output value outside the inspection
limits. Figure 20.1 shows how adding inspection limits reduces the output variation of accepted
parts. The inspection limits are tighter than the customer-driven specification limits. To use
100% inspection, we do not need to know the dominant cause or understand the nature of
the process variation.
Figure 20.1 Histogram of output showing lower and upper inspection limits.
The requirements for 100% inspection to reduce variation are minimal. We need a meas-
urement system for the output characteristic with little measurement bias and variation.
The costs associated with implementing 100% inspection are:
• The cost of measuring every part
• The cost of dealing with rejected parts
We can use the baseline investigation to help assess the second cost. For a continuous
output, we use the baseline histogram to estimate the increase in the number of rejects due
to adding or tightening inspection limits. 100% inspection is more effective for eliminating
outliers than for reducing the standard deviation.
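The book estimates the increase in rejects directly from the baseline histogram; a normal approximation gives the same idea in closed form. A sketch (the normality assumption and the limits below are illustrative):

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal(mu, sigma) at x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def reject_fraction(lower, upper, mu, sigma):
    """Estimated fraction of output outside the inspection limits."""
    return normal_cdf(lower, mu, sigma) + (1 - normal_cdf(upper, mu, sigma))

# Tightening limits from mu +/- 3 sigma to mu +/- 2 sigma raises the
# estimated reject rate from about 0.27% to about 4.6%.
wide = reject_fraction(-3, 3, 0, 1)
tight = reject_fraction(-2, 2, 0, 1)
```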
Treatment A B C n1 (blocked ports missed) n2 (good manifolds rejected)
1 Low Low Low 1 7
2 Low Low High 2 9
3 Low High Low 3 3
4 Low High High 1 6
5 High Low Low 0 2
6 High Low High 0 12
7 High High Low 2 13
8 High High High 1 7
Since passing a manifold with a blocked port was a serious error, the team considered
only treatments 5 and 6 where all manifolds with blocked ports were detected. They
selected treatment 5 since it had a lower rate of rejecting manifolds without a blocked port.
They changed the operation and control plan so that:
• The ultrasound system inspected all manifolds and automatically rejected any
classified as having a blocked port.
• All rejected manifolds were inspected by hand to determine if a port was
blocked or not.
• At the start of each shift, one of the manifolds with naturally blocked ports was
inspected by the ultrasound system to ensure it was functioning properly.
In this example, defects were very rare, so the costs were limited to the cost of inspection.
The inspection cost was less than the loss of goodwill when the customer found manifolds
with blocked ports.
With the new system, there was no need for the audit of inspected parts or for containment.
There was an increase in scrap and rework costs. However, there were no further complaints
from the customer about cracks and splits.
Broken Tooling
In a machining process, the process engineer discovered that the dominant cause of broken
drills in an automated drilling operation was improperly machined parts from upstream in
the process. He found that these parts occurred when the production line restarted after a
shutdown and decided to use 100% inspection before the drilling operation. He installed a
limit switch that detected the poorly machined parts and stopped the line when such a part
was found before the drills were damaged. The cost of the inspection was low, and there
were substantial savings in tooling and downtime.
This example is classic error proofing, 100% inspection applied to the dominant cause.
• Estimate the costs of measurement, loss of volume, and dealing with rejects.
• Estimate the benefits.
If we can accomplish all these tasks, and the benefits outweigh the costs, we proceed to the
validation stage of the algorithm.
Many processes already have 100% inspection because the shipment of a defective or
out-of-specification part has critical consequences for the customer. For a part such as the
exhaust manifold, we consider improving the existing inspection system. In the rocker cover
example, the team added a 100% inspection system to the process to check for proper torque.
We can apply 100% inspection if the output can be compared to the inspection limits
or standards (sometimes called boundary samples) with small measurement error. The key
prerequisite is a stable, nondestructive measurement system with low variation and bias.
100% inspection does not eliminate defects or out-of-specification parts. We need to
change the control plan to deal with the rejects. In one disastrous example, rejected parts
were set aside in a box to be reworked. However, because of poor organization and labeling,
the box was shipped to a customer.
Applying 100% inspection to the output is rarely the most cost-effective approach.
There is less variation in the parts shipped to the customer, but there are increased costs due
to the larger number of rejects. We think 100% inspection is best suited for situations where
other variation reduction approaches have not proven feasible or effective. Two potential
applications are processes with rare defects or when inspection and rework costs are low.
In the manifold blocked port example, the defect was so rare that the team could not dis-
cover the dominant cause of the defect using empirical methods. The only feasible
approach was to design an effective low-cost 100% inspection system to detect the rare defect.
Applying 100% inspection to a dominant cause is called source inspection (Shingo 1986).
See Shimbun (1988) for numerous examples. In this case, the costs of dealing with the parts
that fall outside the inspection limits on the cause are likely to be small, since we are applying
the inspection upstream from the final output. In the rocker cover oil leak example, it was
much cheaper to detect and repair the poorly tightened bolts than to deal with the leaks after
the engine had been assembled.
We often go directly to source inspection on a dominant cause without reformulating
the problem in terms of the cause. This is an example of the Fix the Obvious approach
described in Chapter 14.
Most successful applications of 100% inspection use automated measurement. Human
inspectors make mistakes at the best of times, and with rare defects, they are unable to
remain focused. Using multiple inspectors is not a solution since each inspector may
become complacent with the belief that any problem will be found by one of the others.
100% inspection is poorly regarded as a variation reduction approach in the quality
improvement literature. For example, one of Deming’s 14 points exhorts industry not to rely
on mass inspection to “control” quality (Deming, 1992). Despite its unfavorable image, we see
100% inspection applied frequently because immediate process improvement is required and
no other improvement approach is feasible without more process knowledge and investigation.
A common modification of 100% inspection is inspection sampling where not every
part is measured. One alternative is to define lots that are accepted or rejected based on the
quality of a sample taken from the lot. Accepted lots are shipped, and rejected lots are 100% inspected.
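The acceptance probability of such a lot-sampling scheme depends on the lot's defect rate, the sample size, and the acceptance number; a sketch assuming binomial sampling (the sample size of 50 and acceptance number of 1 are illustrative):

```python
from math import comb

def accept_probability(p, n, c):
    """P(accept lot) when n parts are sampled, the lot is accepted if
    at most c defectives are found, and each sampled part is defective
    with probability p (binomial approximation)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# A good lot (1% defective) is almost always accepted; a poor lot
# (10% defective) is usually rejected.
good = accept_probability(0.01, 50, 1)
poor = accept_probability(0.10, 50, 1)
```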
Key Points
• We use 100% inspection when defects are critical and rare, so that it is difficult and
costly to determine a dominant cause, or when the inspection costs are very low.
• 100% inspection applied to a reformulated problem is one form of error proofing.
21
Validating a Solution
and Holding the Gains
We have now reached the final stage of the Statistical Engineering algorithm. We have a
proposed solution to the problem, a change to one or more fixed inputs of the process.
We may propose changes to process settings, the addition or deletion of a process step,
the use of a new supplier, changes to the process control plan, and so on. Although this sounds
silly, we remind you that the process cannot be improved without making changes to fixed inputs.
There are two remaining tasks:
• Validate the proposed solution to see that the goal is met and that there are no
substantial negative side effects.
• If the solution is validated, ensure that the process change is made permanent
so that the gains are preserved.
solution. We can use knowledge we have gained in the process investigations to suggest how
long we need to observe the original process until we would see the full extent of variation.
We plan the validation investigation to last at least that long so that we expect to see the new
full extent of variation with the proposed process changes.
We may find a proposed solution inadequate for a number of reasons related to taking
shortcuts in the algorithm with the hope of saving time and money. First, the solution may
be based on a cause that is not dominant. In that case, the process improvement will be
small (see Chapter 2). If the algorithm was followed closely, this cannot occur, since the
algorithm requires verification of the dominant cause. Second, we may not meet the goal
because of optimistic assumptions. For example, we assume a perfect adjustment method
in assessing feedback control that is not realized in practice.
If the solution leads to an improvement that is not sufficient to meet the goal, we need
to decide whether or not to continue with the implementation. We can make the decision by
reevaluating costs and benefits. To meet the goal, we need to go back to reconsider the pos-
sible approaches and look for further changes to fixed process inputs.
If we have reformulated the problem, we must assess the process change against the base-
line for the original output. We will have made assumptions about the links between the out-
put and the dominant cause based on uncertain knowledge. When we validate using the
original output, we can check that these assumptions were correct.
We give two examples of validation.
Crossbar Dimension
In the production of an injection-molded contactor crossbar, the problem was excessive vari-
ation in a crossbar dimension. In the baseline investigation, the team established the full
extent of variation of the dimension (measured as the deviation from nominal) as –0.3 to 2.3
thousandths of an inch, with standard deviation 0.46. After some investigation (see Chapter 12),
they discovered that barrel temperature was the dominant cause of variation. The team pro-
ceeded using the desensitization approach (see the Chapter 16 exercises). They found that
increasing the barrel temperature set point reduced the effect of the variation in barrel
temperature. Since this change increased the crossbar dimension center, they used a known
adjuster to reduce the average dimension to the target value zero.
In a preliminary validation investigation, the team found that increasing the average
barrel temperature resulted in burn defects. They did not search for a dominant cause of the
burn defect. Instead, they used the robustness approach to find process settings that eliminated
the defect while at the same time allowing the increased barrel temperature. See Chapter 19
for further discussion.
The team next proceeded to a full validation with the proposed process settings. In the
validation investigation, 300 parts were selected over two shifts. This plan matched the base-
line investigation. The team measured the crossbar dimension and inspected each part for the
burn defect. The data are given in the file crossbar dimension validation. The histogram of the
crossbar dimension from the validation investigation, given in Figure 21.1, shows the reduced
variation. The standard deviation in the crossbar dimension was reduced to 0.23, and the burn
defect occurred on only 2 of the 300 parts. With the new settings, the process performance
met the project goal.
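The validation comparison amounts to recomputing the baseline summaries on the new data. A sketch with the dimension deviations held in a list (the values are illustrative; the actual 300 measurements are in the file crossbar dimension validation):

```python
import statistics

# Deviations from nominal for the validation parts (illustrative).
validation = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 0.4, -0.3, 0.1, 0.0]

baseline_sd = 0.46
validation_sd = statistics.stdev(validation)

# The change is validated if the new standard deviation meets the goal
# and the side-effect defect rate stays acceptable.
burn_defects, parts = 2, 300
defect_rate = burn_defects / parts
improved = validation_sd < baseline_sd
```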
Figure 21.1 Histogram of crossbar dimension from the validation investigation.
Figure: Porosity scrap by month (months 0–25).
After validating a solution and implementing a process change, we need to ensure that the
improvement is preserved over the long term. Most aspects of holding the gains are manage-
ment issues outside the scope of this book.
In our experience, there are two imminent dangers:
• The recommended process change is not fully implemented or later reversed.
• Other fixed inputs are (later) changed in a way that reduces the effectiveness of
the solution.
The difficulty in ensuring the implemented change is not reversed depends on the nature
of the change. To change the design of a product or process, we should adhere to formal
design change procedures that can be expensive and take considerable time. Such changes are
not easily reversed. Changes to the process control plan, on the other hand, are easy to make
but also easy to undo or ignore. In these cases, we need to document the changes and ensure
that appropriate training is conducted. We also recommend process audits to verify that con-
trol plan changes continue to be used.
In many processes, fixed inputs change slowly over time due to wear or aging of equipment.
We have discussed several problems where the obvious fix requires maintenance activity.
While the problem has a high profile, the maintenance is carried out and the variation is
reduced. Over time, as the original problem is forgotten, maintenance
effort decreases and the projected gains slip away. In such cases, the team can increase the
chance of holding the gains by implementing a formal monitoring procedure. This is a sim-
ple form of feedback control added to the proposed solution. We can use a summary of
process performance over time such as a run chart, a regular calculation of process capabil-
ity, or a control chart (Ryan, 1989; Montgomery, 1996). In all cases, we need a plan to act
if evidence from the monitoring suggests the gains are disappearing.
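One simple monitoring scheme of this kind is an individuals control chart: estimate short-term variation from the average moving range and flag any point outside three-sigma limits. A sketch (the monitored values are illustrative):

```python
import statistics

def individuals_chart_limits(values):
    """Control limits for an individuals chart, with sigma estimated
    from the average moving range (divided by d2 = 1.128 for n = 2)."""
    center = statistics.mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = statistics.mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

# Flag any monitored value outside the limits as evidence that the
# gains may be slipping away.
history = [10.1, 9.8, 10.2, 10.0, 9.9, 10.1, 10.3, 9.7]
lcl, center, ucl = individuals_chart_limits(history)
out_of_control = [x for x in history if not lcl <= x <= ucl]
```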
At the end of a project, we need to preserve any new knowledge of process behavior
that can be of use to a wider audience. We need to document any process or design changes
with the supporting reasons for the change. We do not want improvements undone during
future cost-reduction exercises or in the solution to another problem.
We can retain information by:
• Logging design changes in a design guide
• Documenting all variation reduction projects on a searchable company
intranet
Documenting projects and problems is useful but expensive and time-consuming. In
our experience most process engineers do not enjoy this activity. See knowledge manage-
ment books such as Davenport et al. (1997) and O’Dell et al. (1998) for suggestions.
We give three examples that illustrate the issues.
Truck Pull
As described in Chapter 10, in the early phases of the problem to improve the truck align-
ment process, the team examined right caster data stratified by the four alignment meas-
urement machines that operated in parallel. The team was surprised to see the persistent
material differences among averages of the four gages. Because trucks were assigned an
alignment gage haphazardly, the gage was a cause of variation in right caster.
The team took immediate action to recalibrate the four gages to remove the systematic
differences. To prevent a recurrence, they established a monitoring program to compare the
daily averages of each measured characteristic for each gage. If they found significant dif-
ferences, they recalibrated the four gages. Without such monitoring, there is every reason
to believe that the problem would recur.
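The daily gage comparison can be as simple as computing each gage's average and triggering recalibration when the spread between gage averages exceeds a chosen threshold. A sketch (the readings and the 0.3 threshold are hypothetical, not from the book):

```python
import statistics

# One day's right caster readings, grouped by alignment gage
# (illustrative values).
daily = {
    'gage1': [4.1, 4.3, 4.2],
    'gage2': [4.0, 4.2, 4.1],
    'gage3': [4.6, 4.8, 4.7],
    'gage4': [4.1, 4.0, 4.2],
}

averages = {g: statistics.mean(v) for g, v in daily.items()}
spread = max(averages.values()) - min(averages.values())

# Recalibrate when the gage averages differ by more than the
# hypothetical threshold of 0.3.
recalibrate = spread > 0.3
```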
Fascia Dimension
At a fascia supplier, management assigned a team to address complaints from their customer,
a car assembly plant, about difficulties installing the fascias. After consulting with the cus-
tomer, the team determined that the average of a key dimension was off target and that there
were too many large fascias. We discussed this example in the exercises to chapters 6 and 15.
From the baseline investigation, the team estimated the dimension average, measured
from nominal, to be 7.3, with a full extent of variation of 2 to 12. After some further inves-
tigation, they proposed changing the cure and cycle times in the molding process to move
the process center. They made changes to the control plan and carried out a short validation
investigation. The average dimension was reduced to 3.1, with a range of 1 to 6. Within a
week in actual production, the new control plan was being ignored. The line operators realized
that the proposed solution was not as effective as initially thought.
Mistakes were made in this problem. The main difficulty was that the baseline investi-
gation was conducted over too short a time period and did not capture the full extent of output
variation. The team also failed to notice an upward drift in the dimension over the course
of the baseline investigation. Subsequently, the validation investigation also covered too
short a time period. Because of the poor plan and analysis for the baseline investigation, the
team was led to an inappropriate approach and solution.
program. No one had examined the potential costly side effects of removing the dust shield,
despite a design change protocol that required the assessment of such side effects before
the change could be implemented.
Key Points
• We need to validate process changes to ensure the long-term reduction of
process variation.
• In the solution validation, watch for unexpected negative side effects on other
output characteristics.
• To have a lasting impact, the implemented change needs to be made permanent.
References
Abraham, B., and J. Ledolter. 1983. Statistical Methods for Forecasting. New York: John Wiley
& Sons.
Agrawal, R. 1997. Variation Transmission in Multistage Industrial Processes. Statistics Ph.D.
thesis, University of Waterloo, Ontario, Canada.
Agrawal, R., J. F. Lawless, and R. J. MacKay. 1999. “Analysis of Variation Transmission in
Manufacturing Processes, Part II.” Journal of Quality Technology 31:143–154.
Amster, S., and K. L. Tsui. 1993. “Counterexamples for the Component Search Procedure.”
Quality Engineering 5:545–552.
Asher, R. K., Sr. 1987. “Statistical Process Control.” Products Finishing 51:20–22.
Åström, K. J. 1970. Introduction to Stochastic Control Theory. Mathematics in Science and
Engineering Series. Vol. 70. New York: Academic Press.
Åström, K. J., and B. Wittenmark. 1989. Adaptive Control. Reading, MA: Addison-Wesley.
Automotive Industry Action Group (AIAG). 1995a. Measurement Systems Analysis. Second
edition. Southfield, MI: AIAG.
Automotive Industry Action Group (AIAG). 1995b. Statistical Process Control. Second edition.
Southfield, MI: AIAG.
Automotive Industry Action Group (AIAG). 1998. Quality Systems Requirements QS-9000. Third
edition. Southfield, MI: AIAG.
Bartlett, M. S., and D. G. Kendall. 1946. “The Statistical Analysis of Variance: Heterogeneity and
the Logarithmic Transformation.” Journal of the Royal Statistical Society. Series B, 8:128–138.
Bhote, K. R., and A. K. Bhote. 2000. World Class Quality. Second edition. Saranac Lake, NY:
American Management Association (AMACOM).
Bisgaard, S., and H. Fuller. 1995a. “Sample Size Estimates for Designs with Binary Responses.”
Journal of Quality Technology 27:344–354.
———. 1995b. “Analysis of Factorial Experiments with Ordered Categories as the Response.”
Quality Engineering 8: 199–207.
———. 1996. “Correction to ‘Sample Size Estimates for Designs with Binary Responses’
(1995, Vol. 27, pp. 344–354).” Journal of Quality Technology 28:496.
Box, G.E.P. 1988. “Signal to Noise Ratio, Performance Criteria, and Transformations (with
discussion).” Technometrics 30:1–40.
———. 1999. “Statistics as a Catalyst to Learning by Scientific Method Part II: A Discussion.”
Journal of Quality Technology 31: 16–29.
Box, G.E.P., W. G. Hunter, and J. S. Hunter. 1978. Statistics for Experimenters: An Introduction to
Design, Data Analysis and Model Building. New York: John Wiley & Sons.
Box, G.E.P., and G. M. Jenkins. 1976. Time Series Analysis: Forecasting and Control. Revised
edition. Englewood Cliffs, NJ: Prentice-Hall.
Box, G.E.P., G. M. Jenkins, and G. C. Reinsel. 1994. Time Series Analysis: Forecasting and Control.
Third edition. Englewood Cliffs, NJ: Prentice-Hall.
Box, G.E.P., and T. Kramer. 1992. “Statistical Process Monitoring and Feedback Adjustment:
A Discussion.” Technometrics 34:251–285.
Box, G.E.P., and A. Luceno. 1997. Statistical Control by Monitoring and Feedback Adjustment.
New York: John Wiley & Sons.
Boyles, R. A. 2001. “Gage Capability for Pass-Fail Inspection.” Technometrics 43: 223–229.
Brassard, M. 1988. The Memory Jogger: A Pocket Guide of Tools for Continuous Improvement.
Second edition. Metheun, MA: Goal/QPC.
Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and Regression
Trees. Boca Raton, FL: Chapman and Hall/CRC.
Breyfogle, F. W., III. 1999. Implementing Six Sigma: Smarter Solutions Using Statistical Methods.
New York: John Wiley & Sons.
Burdick, R. K., and G. A. Larsen. 1997. “Confidence Intervals on Measures of Variability in R&R
Studies.” Journal of Quality Technology 29:261–273.
Cameron, A. C., and P. K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge, England:
Cambridge University Press.
Canadian Oxford Dictionary: The Foremost Authority on Current Canadian English. 2002. Edited
by Katherine Barber. Oxford, England: Oxford University Press.
Cleveland, W. S. 1979. “Robust Locally Weighted Regression and Smoothing Scatterplots.”
Journal of the American Statistical Association 74:829–836.
Coleman, D. E., and D. C. Montgomery. 1993. “A Systematic Approach to Planning for a Designed
Industrial Experiment.” Technometrics 35:1–26.
Cotter, C. S. 1979. “A Screening Design for Factorial Experiments with Interactions.” Biometrika
66:317–320.
Davenport, T. H., and L. Prusak. 1997. Working Knowledge: How Organizations Manage What
They Know. Harvard, MA: Harvard Business School Publishing.
Deming, W. E. 1992. Out of the Crisis. Seventh printing. Cambridge, MA: MIT Center for Advanced
Engineering Study.
De Mast, J. 2003. “Quality Improvement from the Viewpoint of Statistical Method.” Quality and
Reliability Engineering International 19:255–264.
———. 2004. “A Methodological Comparison of Three Strategies for Quality Improvement.”
International Journal of Quality and Reliability Management 21(2): 198–213.
De Mast, J., K.C.B. Roes, and R.J.M.M. Does. 2001. “The Multi-Vari Chart: A Systematic
Approach.” Quality Engineering 13:437–448.
De Mast, J., W.A.J. Schippers, R.J.M.M. Does, and E. Van den Heuvel. 2000. “Steps and Strategies
in Process Improvement.” Quality and Reliability Engineering International 16:301–311.
Del Castillo, E. 2002. Statistical Process Adjustment for Quality Control. New York: John Wiley &
Sons.
Duncan, A. J. 1986. Quality Control and Industrial Statistics. Fifth edition. Homewood, IL: Richard
D. Irwin.
Evans, J. R., and W. M. Lindsay. 1993. The Management and Control of Quality. Second edition.
St. Paul, MN: West.
Farnum, N. R. 1994. Modern Statistical Quality Control and Improvement. Belmont, CA: Duxbury.
Feder, P. I. 1974. “Some Differences Between Fixed, Mixed, and Random Effects Analysis of
Variance Models.” Journal of Quality Technology 6:98–106.
Goldratt, E. M. 1992. The Goal. Second revised edition. Great Barrington, MA: North River.
Grubbs, F. E. 1954. “An Optimum Procedure for Setting Machines or Adjusting Processes.”
Industrial Quality Control, July. Reprinted in the Journal of Quality Technology 15(4):186–189.
Hahn, G. J. 1984. “Experimental Design in the Complex World.” Technometrics 26:19–31.
Hamada, M., and J. A. Nelder. 1997. “Generalized Linear Models for Quality Improvement
Experiments.” Journal of Quality Technology 29:292–304.
Harrington, H. J. 1987. The Improvement Process: How America's Leading Companies Improve
Quality. New York: McGraw-Hill.
Harry, M. J. 1997. The Vision of Six Sigma: A Roadmap for Breakthrough. Fifth edition. Phoenix,
AZ: TriStar.
Harry, M., and R. Schroeder. 2000. Six Sigma: The Breakthrough Strategy Revolutionizing the
World’s Top Corporations. New York: Doubleday.
Hastie, T. J., and R. J. Tibshirani. 1990. Generalized Additive Models. Boca Raton, FL: Chapman
and Hall/CRC.
Hosmer, D. W., and S. Lemeshow. 2000. Applied Logistic Regression. Second edition. New York:
John Wiley & Sons.
Hoyle, D. 2001. ISO 9000 Quality Systems Handbook. Fourth edition. Woburn, MA: Butterworth
and Heinemann.
Ingram, D. J. 1993. “A Statistical Approach to Component Swapping.” American Society for Quality
Control (ASQC) Quality Congress Transactions, Milwaukee, WI, 85–91.
Ishikawa, K. 1982. Guide to Quality Control. Second revised edition. Tokyo: Asian Productivity
Organization.
Jenkins, G. M. 1983. “Feedforward-Feedback Control Schemes.” In Encyclopedia of Statistical
Sciences, edited by S. Kotz and N. L. Johnson. New York: John Wiley & Sons.
Juran, J. M. 1988. Juran on Planning for Quality. New York: Free Press.
Juran, J. M., and F. M. Gryna, Jr. 1980. Quality Planning and Analysis. Second edition. New York:
McGraw-Hill.
Juran, J. M., F. M. Gryna, and R. S. Bingham, eds. 1979. Quality Control Handbook. Third edition.
New York: McGraw Hill.
Kalbfleisch, J. G. 1985. Probability and Statistical Inference. Vol. 2, Statistical Inference.
Second edition. New York: Springer-Verlag.
Kotz, S., and N. Johnson. 2002. “Process Capability Indices: A Review 1992–2000.” Journal of
Quality Technology 34:2–53.
Kume, H. 1985. Statistical Methods for Quality Improvement. Tokyo: Association for Overseas
Technical Scholarships (AOTS).
Lawless, J. F., R. J. MacKay, and J. A. Robinson. 1999. “Analysis of Variation Transmission in
Manufacturing Processes, Part I.” Journal of Quality Technology 31:131–142.
Ledolter, J., and A. Swersey. 1997a. “Dorian Shainin’s Variables Search Procedure: A Critical
Assessment.” Journal of Quality Technology 29:237–247.
Ledolter, J., and A. Swersey. 1997b. “An Evaluation of Pre-control.” Journal of Quality Technology
29:163–171.
Lewis, J. P. 2002. Fundamentals of Project Management: Developing Core Competencies to Help
Outperform the Competition. Second edition. Saranac Lake, NY: American Management
Association (AMACOM).
Liberatore, R. L. 2001. “Teaching the Role of SPC in Industrial Statistics.” Quality Progress,
July, 89–94.
Logothetis, N. 1990. “A Perspective on Shainin’s Approach to Experimental Design for Quality
Improvement.” Quality and Reliability Engineering International 6:195–202.
Mackertich, N. A. 1990. “Pre-control vs. Control Charting: A Critical Comparison.” Quality
Engineering 2:253–260.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Second edition. Boca Raton,
FL: Chapman Hall/CRC.
Mease, D., V. N. Nair, and A. Sudjianto. 2004. “Selective Assembly in Manufacturing: Statistical
Issues and Optimal Binning Strategies.” Technometrics 46:165–175.
Meyer, R. D., D. M. Steinberg, and G.E.P. Box. 1996. “Follow-up Designs to Resolve Confounding
in Multifactor Experiments (with discussion).” Technometrics 38:303–332.
MINITAB. 2000a. MINITAB User’s Guide 1: Data, Graphics, and Macros. Release 13 for
Windows. State College, PA: MINITAB.
———. 2000b. MINITAB User’s Guide 2: Data Analysis and Quality Tools. Release 13 for
Windows. State College, PA: MINITAB.
Montgomery, D. C. 1996. Introduction to Statistical Quality Control. Third edition. New York: John
Wiley & Sons.
———. 2001. Design and Analysis of Experiments. Fifth edition. New York: John Wiley & Sons.
Montgomery, D. C., E. A. Peck, and G. G. Vining. 2001. Introduction to Linear Regression Analysis.
Third edition. New York: John Wiley & Sons.
Nair, V. N. 1992. “Taguchi’s Parameter Design: A Panel Discussion.” Technometrics 34:127–162.
Nair, V. N., and D. Pregibon. 1988. “Analyzing Dispersion Effects from Replicated Factorial
Experiments.” Technometrics 30: 247–257.
Nelson, L. S. 1985. “Sample Size Tables for Analysis of Variance.” Journal of Quality Technology
17:167–169.
Neter, J., M. H. Kutner, C. J. Nachtsheim, and W. Wasserman. 1996. Applied Linear Statistical
Models. Fourth edition. Chicago: Irwin.
Nikolaou, M. 1996. “Computer-aided Process Engineering in the Snack Food Industry.” Proceedings
Chemical Process Control Conference (CPC-V), Tahoe City.
Odeh, R. E., and M. Fox. 1975. Sample Size Choice: Charts for Experimenters with Linear Models.
Second edition. Homewood, IL: Irwin.
O’Dell, C. S., N. Essaides, and N. Ostro. 1998. If Only We Knew What We Know: The Transfer of
Internal Knowledge and Best Practice. New York: Free Press.
Oldford, R. W., and R. J. MacKay. 2001. Stat 231 course notes, Fall 2001, University of Waterloo,
Ontario.
Parmet, Y., and D. M. Steinberg. 2001. “Quality Improvement from Disassembly-Reassembly
Experiments.” Communications in Statistics: Theory and Methods 30:969–985.
Phadke, M. S. 1989. Quality Engineering Using Robust Design. Englewood Cliffs, NJ: Prentice-Hall.
Prett, D. M., and C. E. Gracia. 1988. Fundamental Process Control. Boston: Butterworths.
Quinlan, J. 1985. “Product Improvement by Application of Taguchi Methods.” Third Supplier
Symposium on Taguchi Methods, American Supplier Institute, Dearborn, MI, 367–384.
Robinson, G. K. 2000. Practical Strategies for Experimenting. New York: John Wiley & Sons.
Ross, P. J. 1988. Taguchi Techniques for Quality Engineering: Loss Function, Orthogonal
Experiments, Parameter and Tolerance Design. New York: McGraw-Hill.
Ryan, B. F., B. L. Joiner, and T. Ryan, Jr. 2000. Minitab Handbook. Fourth edition. Belmont, CA: Duxbury.
Ryan, T. P. 1989. Statistical Methods for Quality Improvement. New York: John Wiley & Sons.
Satterthwaite, F. E. 1954. “A Simple, Effective Process Control Method.” Rath & Strong Inc.
Report 54-1, Boston, MA.
Scholtes, P. R. 1998. The Team Handbook: How to Use Teams to Improve Quality. Madison, WI:
Oriel.
Seder, L. A. 1950a. “Diagnosis with Diagrams: Part I.” Industrial Quality Control, January, 11–19.
———. 1950b. “Diagnosis with Diagrams: Part II.” Industrial Quality Control, February, 7–11.
———. 1990. “Diagnosis with Diagrams.” Quality Engineering 2:505–530 (reprinted from original
in Industrial Quality Control, 1950).
Senge, P. M. 1990. The Fifth Discipline: The Art and Practice of the Learning Organization. New
York: Currency/Doubleday.
Shainin, D., and P. Shainin. 1988. “Better Than Taguchi Orthogonal Tables.” Quality and
Reliability Engineering International 4:143–149.
———. 1989. “Pre-control Versus X̄ & R Charting: Continuous or Immediate Quality
Improvement?” Quality Engineering 1:419–429.
Shainin, R. D. 1992. “Technical Problem Solving Strategies: A Case Study.” 46th Annual Quality
Congress Proceedings, American Society for Quality Control (ASQC), Milwaukee, WI,
876–882.
———. 1993. “Strategies for Technical Problem Solving.” Quality Engineering 5(3):433–448.
Shimbun, N. K., ed. 1988. Poka-yoke: Improving Product Quality by Preventing Defects.
Cambridge, MA: Productivity Press.
Shingo, S. 1986. Zero Quality Control: Source Inspection and the Poka-yoke System. Stamford,
CT: Productivity Press.
Snee, R. D. 2001. “My Process Is Too Variable: Now What Do I Do?: How to Produce and Use a
Successful Multi-vari Study.” Quality Progress, December, 65–68.
Spiers, B. 1989. “Analysis of Destructive Measuring Systems,” Forty-third Annual Quality
Congress Proceedings, American Society for Quality Control (ASQC), Milwaukee, WI,
22–27.
Steiner, S. H., and R. J. MacKay. 1997–1998. “Strategies for Variability Reduction.” Quality
Engineering 10: 125–136.
Taguchi, G. 1986. Introduction to Quality Engineering: Designing Quality into Products and
Processes. Tokyo: Asian Productivity Organization.
———. 1987. System of Experimental Design: Engineering Methods to Optimize Quality and
Minimize Costs. White Plains, NY: UNIPUB/Kraus International.
Taguchi, G., and Y. Wu. 1985. Introduction to Off-line Quality Control. Nagoya, Japan: Central
Japan Quality Control Association.
Taylor, W. A. 1991. Optimization and Variation Reduction in Quality. New York: McGraw-Hill.
Tippett, L.H.C. 1934. Applications of Statistical Methods to the Control of Quality in Industrial
Production. Brighton, UK: Manchester Statistical Society.
Todd, R. H., D. K. Allen, and L. Alting. 1994. Manufacturing Processes Reference Guide. New
York: Industrial Press.
Traver, R. W. 1985. “Pre-control: A Good Alternative to X̄-R Charts.” Quality Progress,
September, 11–14.
Traver, R. W. 1995. Manufacturing Solutions for Consistent Quality and Reliability. New York:
American Management Association.
Tukey, J. W. 1959. “A Quick, Compact, Two-sample Test to Duckworth’s Specifications.”
Technometrics 1:31–48.
VanderWiel, S. A., and S. B. Vardeman. 1994. “A Discussion of All-or-None Inspection Policies.”
Technometrics 36: 102–109.
Wheeler, D. J., and R. W. Lyday. 1989. Evaluating the Measurement Process. Second edition.
Knoxville, TN: SPC.
———. 1990. Understanding Industrial Experimentation. Second edition. Knoxville, TN: SPC.
———. 1992. “Problems with Gage R&R Studies.” American Society for Quality Control (ASQC)
Quality Congress Transactions, Milwaukee, WI, 179–185.
Wu, C.F.J., and M. Hamada. 2000. Experiments: Planning, Analysis, and Parameter Design
Optimization. New York: John Wiley & Sons.
Zaciewski, R. D., and L. Nemeth. 1995. “The Multi-vari Chart: An Underutilized Quality Tool.”
Quality Progress 28:81–83.
Index
Note: All page references following CD refer to text on the compact disk.
bell-shaped histograms, 19, 75, CD-187–89
    Gaussian models and, 23
best subsets regression, CD-342–44
Bhote, A. K., 46, CD-234, CD-253–54, CD-259, CD-265
Bhote, K. R., 46, CD-234, CD-253–54, CD-259, CD-265
binary characteristic, 15
binary output
    difficulties with, 4
    dominant causes for, 17
    full extent of variation, 87
    group comparisons and, 179–82
    logistic regression for, CD-261
    measurement system assessment, CD-205–9
    multivari investigations, 162
    sample size and, 5, 77
    summaries for, 22
    use of aggravation, 71
    verification experiments, 199
Bingham, S., 282
Bisgaard, S., 256
block bore diameter. See examples
blocking, CD-264
bottle label height. See examples
boundary samples, 305
Box, G.E.P., 51, 194, 267, 281–82, CD-198, CD-260, CD-267, CD-288, CD-290, CD-295, CD-298, CD-333
box plots, CD-319
Boyles, R. A., CD-209
brainstorming, CD-225–26
brake rotor balance. See examples
Brassard, M., CD-225
Breakthrough Cookbook, CD-191
Breyfogle, F. W., CD-191
broken tooling, 304

C
Cameron, A. C., CD-261
camshaft journal diameter. See examples
camshaft lobe angle error, 21
camshaft lobe runout. See examples
candidate
    adjuster, 228, 238
    for desensitization, 241, 254
    for robustness, 285, 297
capability ratios, 19, 73, CD-183, CD-194
casting thickness. See examples
categorical characteristic, 15, 22
cause-and-effect diagrams, CD-227
cause of variation, 16
    classification, CD-182
    dominant, 16
center points, use of, CD-275–76
characteristics
    calculating derived, CD-304
    classification, 15
    definition, 14
    numerical summaries, CD-311–15
Cleveland, W. S., CD-288
Coleman, D. E., 200
common cause, CD-182
component family, 141, 143, 167–76
    use of leverage, CD-237
component-swap investigation, 167–76
    alternative plans, CD-257
    summary, 174–75
    using three groups, CD-254
concentration diagrams, 140
conclusion step in QPDAC, 58
    checklist, 65
confidence intervals, CD-201–3
    sample size, 77
confounding, 235
    aliased effects and, CD-272
    elimination of, CD-264
continuous characteristic, 15
    numerical summary, CD-311
continuous output
    baseline sample size, 77
    dominant cause and, 17
    input/output investigation, 183–192
contrast matrix, CD-268
control chart. See run chart
    acceptance control chart, 282
    as a baseline, 81
    dominant cause and, CD-182
    feedback control and, CD-290
    process monitoring, 310
control factors, CD-182, 278
controlled processes, adjustments to, CD-294
convenience sampling, 54
cost/benefit analysis, 47, 49, 217
I
indicator variables, 190, CD-261
infrastructure, supportive, 47
Ingram, D., CD-235, 253
input characteristic, 15
input/output relationship investigation, 183–92, CD-181
    summary, 191
inspection limits, 301, 304
inspection, 100%, 38–39, 301–6
interaction effect, 203, CD-181
    confounding in fractional factorials, CD-272
interaction plots, 246, CD-249
internal cause, CD-182
investigation summary
    assembly versus components, 144
    baseline, 80–81
    component swap, 174–75
    desensitization, 254–55
    group comparison, 181–82
    input/output relationship, 190–91
    measurement, 97–98
    moving process center, 237–38
    multivari, 161–62
    robustness, 297

L
Lawless, J. F., CD-235
least squares, CD-195
Ledolter, J., CD-265, CD-288, CD-290
Lemeshow, S., CD-261
levels for inputs in experiments, 194
    dominant cause in desensitization, 254
    move process center, 237
    verification of suspects, 198
leverage, 128
    comparing assembly and component families, 141, CD-237
    component swap process, 167, 176
    outliers and, CD-200
Lewis, J. P., 48
Liberatore, R. L., CD-198
Lindsay, M., CD-225
linearity, definition, CD-212
local optimization, 48
locally weighted scatterplot smoother (LOWESS), CD-288
logistic regression, CD-261
logistics in planning investigations, 56, 63
Logothetis, N., CD-192
LOWESS (locally weighted scatterplot smoother), CD-288
Steiner, S. H., 29
stoplight control, CD-289–90
stratification, 131–34
study error
    avoidance of, 71, 124
    definition, 54, 64
    description, CD-198
study population, 54, 64, 198
subprocess, 13
subsetting data, CD-304–5
sunroof flushness. See examples
supplier, definition, 15
suspect, 121, 124, 208
    choosing levels for, 198
    interactions between, 203
    verification experiments for, 193–209
Swersey, A., CD-265, 290
systematic sampling, 54, CD-196–98

T
Taguchi, G., 257, 299, CD-182, 192, 253, 278, 297–98
target population, 52
target process, 52
Taylor, W. A., CD-192
team. See process improvement team
time-based families. See multivari investigation
    comparison of, 135–38
    order preservation, 144–45
    traceability, 144–45
time series models, CD-288
tin plate strength, CD-209–210
Tippett, L.H.C., CD-253
Todd, R. H., 46
traceability, 144–45
transmission shaft diameters. See examples
Traver, R. W., CD-290
treatments, 200, 208
Trivedi, K., CD-261
trivial many, CD-179
truck pull. See examples
Tsui, L., CD-253
t-tests, CD-234
Tukey, J. W., CD-234

U
unit, definition, 14, 64
upstream family. See variation transmission investigation

V
V6 piston diameter. See examples
V8 piston diameter. See examples
validation of a solution, 307–9
valve train test stands. See examples
VanderWiel, S. A., 306
Vardeman, B., 306
variables search, CD-265
variation
    approaches to reducing, 29–40
    causes of, 15–23, CD-182
    definition, 1
    displaying, 18–23
    explaining changes in, CD-232–33
    quantifying, 18–23
    reduction algorithm, 41–50
    unequal levels of, 146
variation transmission investigation, 163–67
    regression analysis for, CD-250–53
    summary, 166–67
varying inputs, 15–16
vectoring, 262
verification of a dominant cause, 193–209
    investigation summary, 206–7
vital few, CD-179

W
weatherstrip torsional rigidity, CD-72–74
wheel bearing failure time. See examples
Wheeler, D. J., 194
window leaks. See examples
Wittenmark, B., CD-295
working approach flowchart, 114, 226
Wu, C. F. J., 257, CD-267, 281

Z
Zaciewski, R. D., 150
Case Studies
We provide three case studies to illustrate the use of the Statistical Engineering
variation reduction algorithm. We hope that the case studies help you to:
• Better understand the stages of the algorithm and how to move among them.
• Understand the considerations in the choice of a working approach.
• Select appropriate investigations and analysis tools in the search for a
dominant cause.
• Assess the feasibility of an approach and implement it.
As W. Edwards Deming somewhat paradoxically said, “You cannot learn by example.”
We recognize that there are no other processes or problems for which the case studies are
perfect models. We recommend, when reading the case studies, that you think about where
your own processes and problems are similar to and different from those in the cases.
Case Study I
Brake Rotor Balance
An iron foundry produced veined brake rotors (Figure I.1) that were machined at a
separate location. The machining plant 100% inspected the rotors for balance and
welded a weight into the veins if the imbalance was too severe. We call a rotor need-
ing added weight a balance reject.
The historic rate of balance rejects was approximately 25%. The foundry initiated the
project because the reject rate jumped to 50%. This increase in rework coincided with a
move from a four-cavity to a six-cavity (called a four-gang and six-gang) core mold to
increase productivity in the foundry. The cores were set in the mold to create the veins
when the rotor was cast.
The foundry was convinced that the change to the six-cavity mold was not the cause of the
increase in balance rejects. Their confidence was based on previous experience and on a full
analysis of the six-cavity mold that had shown all dimensions well within specification.
The increased reject rate could not be explained by any other changes made at either the
foundry or the machining operation. As it stood, each party blamed the other. To address
the increased rework, the machining operation planned to add another rework station. The
foundry formed a team with the goal of reducing the reject rate to at most the historical level.
To determine imbalance, the machining plant measured the center of gravity (a distance
and direction from the rotor center), which was then translated into a weight (in ounces) and
orientation needed to balance the rotor. If needed, the weights were welded to the veins on
the inside of the rotor, as seen in Figure I.2. A balance reject was any rotor needing weight
greater than 0.5 ounces. To focus the problem, the team selected the balance weight as the
output. They knew that if they could reduce the weight, they could eliminate the rework,
regardless of the orientation.
To establish the baseline in terms of the balance weight, the team selected 300
machined rotors spread out over the previous week’s production. The data are given in
brake rotor balance baseline. The baseline histogram and run chart are given in Figure I.3.
In the baseline, 46% of the rotors required rework. The run chart of the balance weights
suggests no obvious pattern over time.
Figure I.3 Histogram and run chart of balance weights in the baseline investigation.
There was one outlier in the baseline sample. If we exclude this rotor, the standard
deviation is reduced slightly to 0.49. Ignoring this casting, the full extent of variation of the
imbalance weight is about 0 to 2.25. The team set the goal to reduce variation in the bal-
ance weight so that at least 75% of the rotors had weight less than 0.5.
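Baseline summaries of this kind are straightforward to compute. The sketch below uses a small invented sample of balance weights; the actual investigation used 300 rotors (file brake rotor balance baseline).

```python
# A minimal sketch of the baseline summaries: sample standard deviation,
# rework rate (weight above 0.5 ounces), and full extent of variation.
# The weights list is invented for illustration.
import statistics

weights = [0.1, 0.3, 0.4, 0.45, 0.5, 0.6, 0.7, 0.9, 1.1, 1.4, 1.8, 2.2]

sd = statistics.stdev(weights)                  # estimate of overall variation
rework_rate = sum(w > 0.5 for w in weights) / len(weights)
full_extent = (min(weights), max(weights))      # range of observed weights

print(full_extent, round(rework_rate, 2))       # (0.1, 2.2) 0.58
```

With the real data, the same summaries give the 46% rework rate, the 0.51 standard deviation, and the 0 to 2.25 full extent of variation quoted above.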
The team moved on to the Check the Measurement System stage. They wanted to
ensure they had a reliable way to measure balance. There were three gages in parallel used
to measure balance weight. See the process map in Figure I.4.
For the investigation, the team selected three rotors with initial measured weights of 0.10,
0.54, and 1.12. They measured the three rotors twice using each of the three gages on three sep-
arate days. There was little operator effect since the gages were mostly automated. There were
54 measurements in total. The data are given in the file brake rotor balance measurement.
We see in Figure I.5 that the measurement system can easily distinguish the three rotors.
We calculate the summary measures by rotor (part).
Figure I.4 [Process map: three balance gages in parallel, followed by the add-balance-weight operation.]
Figure I.5 [Plot of measured weight by rotor (1 to 3) for the measurement investigation.]
There are no obvious outliers in these data. We estimate the standard deviation of the
measurement system as
√((0.0685² + 0.0697² + 0.0584²) / 3) = 0.06
The measurement system was judged to be acceptable since this is much less than the over-
all standard deviation, estimated in the baseline investigation to be 0.51. The discrimination
ratio D is
√(0.51² − 0.06²) / 0.06 = 8.4
Next, the team had to select a working approach. They first considered the non-cause-
based approaches. They ruled out 100% inspection, since that was the current approach
and was too costly. They also eliminated feedback control, since there is no strong pattern
in the variation over time in the baseline data and they had no idea of how to adjust the
weight. Robustness or Move the Process Center (equivalent approaches in this case) were
possibilities but, without more process knowledge, were not likely to succeed. The team
decided to search for the dominant cause of variation in the balance weight.
The team first looked at easily available data to see what causes could be eliminated.
They recorded, on a defect concentration diagram, the location (in increments of 30°) of the
welded rework weight for the 140 balance rejects from the baseline investigation. The dots
on the part schematic in Figure I.6 show the nonsymmetrical pattern of balance weights
observed. Since the machining process is rotationally symmetric and the casting process is
oriented, the team eliminated all causes in the machining operation. With this simple inves-
tigation the team made tremendous progress with little cost and in a short time.
The team next planned a group comparison. They selected 30 balance rejects (average
weight 1.02) and 30 balanced brake rotors (average weight 0.15) for the comparison. They
then measured 26 foundry-determined characteristics on each machined rotor. Note that
machining destroyed many foundry characteristics. The input characteristics included vein
thickness at eight different locations (four orientations at inner and outer edges of the
rotor), three derived thickness variation measures (variation at inner and outer locations,
and the overall variation), core offset at two orientations, four diameters related to the core
size, and five hole diameters (on the inner part of the rotor, see Figure I.1). The data are
given in the file brake rotor balance comparison.
The team identified two input characteristics, thickness variation and core position (off-
set), that were substantially different for balanced and unbalanced rotors. See Figure I.7. The
results for thickness variation were more compelling than for core position. However, based on
engineering knowledge, both inputs were plausible dominant causes of imbalance variation.
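One way such a group comparison can be screened numerically is to contrast the two groups on each input and rank inputs by a standardized mean difference. This is a sketch only: the input names echo the study, but the values and the scoring rule are invented.

```python
# Rank candidate inputs by how well they separate the balanced and
# unbalanced groups. Each input maps to (balanced values, unbalanced values).
import statistics

def separation(balanced, unbalanced):
    """Standardized difference between group means (a simple ranking score)."""
    pooled_sd = statistics.stdev(balanced + unbalanced)
    return abs(statistics.mean(unbalanced) - statistics.mean(balanced)) / pooled_sd

inputs = {
    "thickness variation": ([0.02, 0.03, 0.04, 0.03], [0.09, 0.11, 0.10, 0.12]),
    "core offset":         ([0.10, 0.12, 0.11, 0.13], [0.14, 0.16, 0.15, 0.17]),
    "hole diameter":       ([5.01, 5.02, 5.00, 5.03], [5.02, 5.01, 5.00, 5.03]),
}
ranked = sorted(inputs, key=lambda k: separation(*inputs[k]), reverse=True)
print(ranked[0])  # thickness variation
```

In the study itself the ranking came from comparative plots (Figure I.7) rather than a single score, and the team treated the top candidates as suspects to be verified, not as confirmed causes.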
The team decided to verify these suspects hoping that they could then reformulate the
problem, since the suspects could be measured in the foundry. This would save time and
effort in future investigations since they would no longer need to trace rotors between the
Figure I.7 Plot of thickness variation and offset 1 in balanced and unbalanced rotors.
foundry and the machining operation. They planned and conducted a verification experi-
ment to confirm that core thickness variation and core position were substantive causes of
the balance weight variation and that the six-cavity mold was not. They used two levels for
each input and a full factorial design. The team selected the nonnominal levels for core
position and thickness variation at the high end of their normal range of variation. We give
the eight treatments and input levels in Table I.1. For each run, they planned to produce eight
castings. The order of the treatments was randomized.
The 64 experimental castings were tagged and shipped to the machining plant to be
processed and measured. We give the data in the file brake rotor balance verification and
the average weight over the eight castings for each run in Table I.1.
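For a two-level full factorial such as this one, each main effect and interaction estimate is the difference between the average response at the high and low levels of the corresponding contrast. The sketch below uses invented run averages standing in for Table I.1.

```python
# Estimating effects for a generic two-level (2^3) full factorial.
from itertools import product

runs = list(product([-1, 1], repeat=3))          # (A, B, C) in standard order
avg_weight = [0.2, 1.1, 0.3, 1.0, 0.4, 0.3, 0.2, 0.4]  # invented run averages

def effect(contrast):
    """contrast maps a run (a, b, c) to -1 or +1, e.g. lambda a, b, c: a * c."""
    hi = [y for r, y in zip(runs, avg_weight) if contrast(*r) == 1]
    lo = [y for r, y in zip(runs, avg_weight) if contrast(*r) == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effects = {
    "A (tooling)": effect(lambda a, b, c: a),
    "B (position)": effect(lambda a, b, c: b),
    "C (thickness)": effect(lambda a, b, c: c),
    "AC": effect(lambda a, b, c: a * c),
}
# With these invented averages, B is small while A, C, and AC are large,
# mirroring the pattern the team saw in the Pareto chart of effects.
```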
We plot the weights by treatment in Figure I.8. We see that some treatments have very
little variation and result in a low average weight. We also see roughly the full extent of the
variation in weight given by the dashed line on Figure I.8.
Figure I.8 [Plot of weight by treatment (1 to 8), with the full extent of variation shown as a dashed line.]
Figure I.9 Pareto chart of the effects (A: tooling, B: position, C: thickness) for brake rotor verification experiment.
From the Pareto plot of the effects, given in Figure I.9, we see the effect of core posi-
tion is small, so we eliminate it as a suspect. Since the tooling and thickness variation have
a relatively large interaction, we look at the effect of these two suspects simultaneously.
We give the main and interaction effects plots in figures I.10 and I.11, respectively.
From the interaction plot, the team concluded that low thickness variation using the
four-cavity mold produced the optimal results (the weights required were so small that the
balance specification was met without rework). Thus, the dominant cause of the imbalance
problem was in the core molding process.
This search for the dominant cause is summarized in the diagnostic tree shown in
Figure I.12.
The team made the obvious fix and recommended that the foundry go back to the orig-
inal four-cavity core mold. When this change was implemented, the rate of balance rejects
immediately dropped to its historical levels. The team had met the original project goal.
The major lesson learned in the project was the effect of the thickness variation on the
balance weight. The verification experiment showed that thickness variation in the cores was
a dominant cause of balance weight variation in the original process that used the four-cavity
mold.
Figure I.10 Main effects plot for brake rotor verification experiment.
Figure I.11 Interaction effects plot for brake rotor verification experiment.
The team was puzzled about the interaction between the thickness variation and the
number of cavities in the mold. Knowledge of a dominant cause provided the opportunity to
improve the process further. There was no immediately known method for adjusting the core-
making process to reduce thickness variation. The team considered looking for an adjuster in
the core-making process but rejected this approach since they did not expect to be successful.
Figure I.12 Summary of the method of elimination for brake rotor example.
The team had the opportunity to implement a new core-making process for the veined
rotor. The equipment was already available in the plant but not in use. The team knew that
the cold box process was dimensionally stable, and they expected much less thickness vari-
ation with this process. This is the Fix the Obvious approach. With the implementation of
the cold box method, the process was greatly improved. Over the next four months the rate
of balance rejects dropped to 0.2%, a large reduction from the 50% at the start of the project.
The machining plant eliminated the expensive rework stations and scrapped the few bal-
ance rejects in the new process.
Highlights
Some strengths and weaknesses of this case are:
• The use of the available data together with the knowledge of the symmetry of
the machining operation to eliminate all causes in the machining plant.
• In the group comparison, the team could have selected two smaller sets of
rotors that were more extreme with respect to the balance weights.
• The carefully planned and conducted verification experiment.
• The application of the knowledge gained about thickness variation (the dominant
cause) to select the cold box process.
Case Study II
Rod Thickness
A plant manufactured approximately 12,000 connecting rods per day for use in an
engine assembled in the plant. The rod, illustrated in Figure II.1, connects the piston
(at the small or pin end of the rod) to the crankshaft (at the large or crank end of the rod).
The plant received forged blanks and machined the rods in a large number of process steps.
Management identified the rod line for a variation reduction project because the overall
scrap cost was greater than budget. The yearly scrap cost was in excess of $400,000, and the scrap
rate was 3.2% over the previous four months. Management set a goal to reduce the scrap rate
to 1.6% in its annual business plan. The rod line production manager assigned a team to the
project.
Looking at scrap records, the team found that scrap occurred at several stages in the process
and for several reasons. To focus the problem, they used Pareto analysis on the records
for one month. The results, in Figure II.2, showed that 65% of the scrap occurred at a
grinding operation. At this operation, the team discovered that about 90% of the scrap was
due to rods with their crank end thickness less than specification. The team focused their
attention on reducing variation in rod thickness.
Defect      Grind   Bore   Broach   Assembly   Others
Count          85     24       14          6        2
Percent      64.9   18.3     10.7        4.6      1.5
Cum%         64.9   83.2     93.9       98.5    100.0
Figure II.2 Pareto chart for scrap by operation for the rod line.
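The percent and cumulative percent rows of the Pareto analysis can be reproduced directly from the monthly counts:

```python
# The scrap counts behind Figure II.2, with percent and cumulative percent
# computed as in a Pareto analysis.
counts = {"Grind": 85, "Bore": 24, "Broach": 14, "Assembly": 6, "Others": 2}
total = sum(counts.values())

cum = 0.0
rows = []
for op, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    pct = 100 * n / total
    cum += pct
    rows.append((op, n, round(pct, 1), round(cum, 1)))
    print(f"{op:10s}{n:5d}{pct:7.1f}{cum:7.1f}")
```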
At the grinder, the final thickness of the rod was set in a two-pass operation. An oper-
ator loaded the parts into the machine. In the first pass, one side of the rod was ground in
three steps. The rod was then turned over by the machine operator, and the second side was
ground. The grinder had a rotary table with 20 pallets that passed under four different
grinding wheels. An internal control system automatically adjusted the grinding wheels
based on thickness measurements taken by a series of gages internal to the grinder.
After grinding, an in-line gage measured the thickness of every rod at four positions
(given by the white circles on the crank end of the rod, also faintly numbered 1 through 4,
as shown on Figure II.1). The specifications were 0.91 to 0.96 inches at each location.
The gage automatically rejected to a rework station a rod with any of the four thickness
measurements not meeting the specification. At the rework station, an operator remeasured
the rejects using a different gage and scrapped undersized rods.
The rod line ran on three shifts, five days per week. To establish the baseline, the team
sampled 200 rods chosen in two batches of 20 rods for each of five days on the day shift
only. They recorded the thickness measurements for the four positions using the in-line
gage that did not normally store the data. We give the data in the file rod thickness baseline.
Thickness is given as a deviation from 0.9 in thousandths of an inch.
Figure II.3 gives a histogram of thickness, where the dashed vertical lines are the
specification limits. Of the 200 rods, 10 rods had thickness values less than the lower
specification limit for at least one of the positions. This was somewhat higher than expected,
given the historical scrap rate. There are no obvious outliers. A numerical summary of the baseline data (across all positions) gave a standard deviation of 11.023 thousandths of an inch.
Figure II.3 Baseline histogram of rod thickness (dashed vertical lines give the specification limits).
The team set the objective to produce all rods within the thickness specification. If this
ambitious goal could be achieved, they would eliminate 90% of the scrap at the grinder or
58% of the total rod line scrap, hence meeting the project goal.
The process was well centered, so to meet the goal the team needed to reduce the standard
deviation to 8.5 from around 11 thousandths of an inch, while keeping the process centered
on target. The full extent of the thickness variation was 2 to 59 thousandths. In Figure II.4
we show the baseline data over time (batch), which suggests there was no systematic drift
in the process.
The next step was to assess the measurement system. The in-line gage used four sets of
transducers to measure rod thickness at the four positions. In effect there were four gages.
The team selected three rods to roughly cover the full extent of the thickness variation.
The team was concerned about a relative bias among the four gages. Since it was
impossible to measure the same position on the different gages, they sent the three rods to
Figure II.4 Box plot of rod thickness by batch for baseline investigation (dashed horizontal lines give the specification limits).
a precision laboratory to have the thicknesses determined with little measurement bias and
variation. The precision lab values are given in Table II.1.
The team chose two operators, one from the day shift and one on the afternoon shift.
For three days, each operator measured the three rods three times at each of the four positions. In total, there were 216 measurements, 54 per gage. Since we know the true values,
we give the measurement errors in the file rod thickness measurement.
A summary of the measurement errors for each position (gage) is:
There is no evidence of substantial bias in any of the four gages, since the average errors
are close to zero.
Because there is no relative bias among the gages and we have the measurement errors, we estimate the measurement system variation by the standard deviation of the 216 measurement errors. As given, we obtain the estimate 1.512.
The baseline variation was 11.023. We can estimate the process variation, using Equation (7.2), by √(11.023² – 1.512²) = 10.92 and the discrimination ratio as D = 10.92 / 1.512 = 7.2.
The team decided that the in-line gage was not a dominant cause of the variation and was
adequate to move to the next stage of the Statistical Engineering algorithm.
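The estimate above can be reproduced in a few lines:

```python
import math

# Measurement system assessment: given the baseline (total) standard
# deviation and the estimated measurement standard deviation, the process
# variation and discrimination ratio follow as in Equation (7.2).
stdev_total = 11.023   # baseline standard deviation
stdev_meas = 1.512     # standard deviation of the 216 measurement errors

stdev_process = math.sqrt(stdev_total**2 - stdev_meas**2)
D = stdev_process / stdev_meas

print(round(stdev_process, 2), round(D, 1))  # → 10.92 7.2
```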
Figure II.5 Box plot of thickness by position from the baseline data.
When considering the choice of a working variation reduction approach, the team
quickly rejected the approaches that did not require knowledge of a dominant cause. The
process was already well targeted, there was no evidence of an exploitable time pattern in
thickness, 100% inspection was already present, and making the process robust seemed
difficult. The team decided to search for a dominant cause.
The team first looked at the available data from the baseline investigation to see what
family of causes could be eliminated. We show a box plot of thickness by position in
Figure II.5. There is a difference between position 3 and the other positions. However, the
variation within positions was close to the full extent of variation, so position is not the dominant cause. All but one of the scrapped rods in the baseline sample were undersized at position 3. The team did not know of an obvious fix to move the thickness center for position 3.
The team decided to carry out a multivari investigation to compare pallet-to-pallet,
position-to-position, day-to-day, and part-to-part families. They planned to select three
consecutive rods from six different pallets (pallets occur in pairs, so three pairs of pallets
were chosen) on four different days. They measured thickness at all four positions to con-
firm the findings from the baseline investigation. In total, they sampled 72 rods and made
288 measurements. The data are given in the file rod thickness multivari.
We show the results of the multivari analysis for the pallet-to-pallet, position-to-position,
and day-to-day families in Figure II.6. We see that none of these families is home to a domi-
nant cause, although again the thickness was lowest at position 3. To examine the part-to-part
family, we form a new input group corresponding to all 96 combinations of day, position, and
pallet. We use ANOVA to isolate the variation within the groups. The results are:
Source DF SS MS F P
group 95 14444.32 152.05 3.93 0.000
Error 192 7433.33 38.72
Total 287 21877.65
We estimate the within group variation due to causes in the part-to-part family as √38.72 = 6.22, a substantial component of the baseline variation 11.023. The team concluded:
• There was substantial part-to-part variation within the same pallet, position,
and day.
• Position 3 was systematically thinner than the other positions.
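The within-group estimate comes straight from the ANOVA table: it is the square root of the error mean square.

```python
import math

# The part-to-part (within group) standard deviation is the square root of
# the error mean square from the one-way ANOVA above.
ss_error, df_error = 7433.33, 192
ms_error = ss_error / df_error        # ≈ 38.72, the Error MS in the table
stdev_within = math.sqrt(ms_error)
print(round(stdev_within, 2))          # → 6.22
```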
However, later, when the team examined the histogram of the multivari data in Figure II.6,
they noticed that the range of variation in thickness from the multivari did not cover the full
extent of variation. There were no undersized rods with thickness values below the lower
specification limit of 10. As a result, they decided that their conclusions from the multivari
were premature.
This was an unexpected result. The plan for the multivari investigation matched the
baseline investigation quite closely. By sampling rods over a number of days, the team had
hoped that they would observe undersized rods. The lack of undersized rods suggested that
either the dominant cause acts in pallets not chosen for the multivari or that it acts only
occasionally in time. Based on this thinking, the team decided to conduct another multivari
investigation.
Figure II.6 Summary of the data from the first multivari investigation (vertical dashed lines give the full extent of variation).
In the second multivari, they planned to sample three consecutive rods from a different
set of six pallets than those used in the first multivari investigation. This time they repeated
the sampling scheme five times within a single shift. This shorter time frame seemed suffi-
cient from the baseline results, as shown in Figure II.4. In total, the team sampled 90 rods
and made 360 measurements. The data are given in the file rod thickness multivari2.
The team first checked that their new sampling plan generated the full extent of thick-
ness variation. A histogram of the new multivari data, shown in Figure II.7, shows that the
full extent of variation was captured. The dominant cause of thickness variation must have
acted during the course of the investigation.
We can see the effects of the families of variation for time, pallet, and position in the
one-input-at-a-time multivari charts in Figure II.8. We see that a dominant cause of thick-
ness variation is acting time to time. Within each time period, the variation is substantially
less than the full extent of variation. All of the undersized rods occur at time 3. The differences among the positions are smaller than previously.
We expect causes in the rod-to-rod (that is part-to-part) family to act haphazardly. To
examine the rod-to-rod family, we define a new input group that uniquely numbers the 120
combinations of position, time, and pallet. The within-group standard deviation (Pooled
StDev in the MINITAB one-way ANOVA) is 4.29. The rod-to-rod family is not dominant.
During the multivari investigation, 40 undersized rods were produced in a matter of
two minutes at time 3. The tool setter suspected that there was a problem with the feedback
control scheme that automatically adjusted the grinding wheels. For the finishing wheel,
the controller used the measurements from an internal gage and the following rules:
• Measure every part at all four positions.
• If the thickness at any position for two consecutive parts exceeds 50 thousandths of an inch, lower the finishing grinding wheel for three seconds at a fixed rate.
• Ignore the measurements from the next three parts until the adjustment has
taken effect.
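The rules above can be sketched as a small simulation. The function and the stream representation are illustrative, not the plant's controller code; each part is represented by its four thickness readings, and the third rule is implemented as a part count (the plant's actual implementation used a timer).

```python
# Sketch of the finishing-wheel control rules as stated above. Names and the
# stream representation are illustrative; each part is a tuple of four
# thickness readings in thousandths of an inch.
ADJUST_LIMIT = 50   # thousandths; threshold in the second rule
SKIP_PARTS = 3      # parts ignored while an adjustment takes effect

def compensations(parts):
    """Return indices of parts at which the finishing wheel is lowered."""
    lowered_at = []
    consecutive = 0
    skip = 0
    for i, thicknesses in enumerate(parts):
        if skip > 0:                      # third rule: ignore next three parts
            skip -= 1
            continue
        if max(thicknesses) > ADJUST_LIMIT:
            consecutive += 1              # over the limit at some position
        else:
            consecutive = 0
        if consecutive >= 2:              # two consecutive parts over the limit
            lowered_at.append(i)
            consecutive = 0
            skip = SKIP_PARTS
    return lowered_at

# Example: parts at indices 1 and 2 exceed the limit, triggering one adjustment
print(compensations([(45,)*4, (52,)*4, (53,)*4, (51,)*4, (44,)*4]))  # → [2]
```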
Figure II.7 Histogram of thickness from the second multivari investigation.
Figure II.8 Multivari charts using data from the second multivari investigation.
The last rule was necessary because of the time delay between the grinding by the fin-
ishing wheel and the measurement of thickness by the internal gage. The rule was based on
the feedback timer and did not count parts. Upon closer inspection, the team discovered
that, occasionally, the last rod ground before compensation was treated as the first rod after
the wheel was lowered, thereby fooling the equipment into thinking a second compensation
was needed. Combined with the systematic difference among the positions, the double
compensation was a dominant cause of undersized rods.
The team adopted the Fix the Obvious approach. They adjusted the logic of the controller to
count the parts processed after compensation instead of using the timer. This change prevented
the double compensation. The team also looked for ways to adjust the process to better center the
four positions. They initiated maintenance on the alignment of the grinding wheels to reduce the
position-to-position differences. Since they expected the equipment to deteriorate over time,
they set up a monitoring procedure to detect when the position-to-position differences became
large. Each day, the operator recorded the thicknesses for five consecutive parts and plotted the
position averages on a run chart. The process engineer checked the chart on a regular basis to see
if there were systematic differences among the four positions.
In the validation stage, the team assessed the effect of the process changes by compar-
ing performance before and after the changes. We give a histogram of thickness values
observed when the plan for the original baseline investigation was repeated in Figure II.9.
Figure II.9 Histogram of thickness from the validation investigation.
The data are given in the file rod thickness validation. No undersized rods were observed
in the validation investigation.
The scrap rate substantially decreased. In the first month after eliminating the double
compensation, the scrap rate was 1.7%. The project team fell short of the initial goal, but
the project was judged to be successful by the rod line management.
Highlights
Some strengths and weaknesses of this case are:
• The team did a good job of focusing the problem, linking the problem goal
“eliminate undersized rods” to the project goal “reduce rod line scrap by at
least half.”
• The team adopted a good solution to the assessment of the four gages in the
measurement system investigation. In this case, they could not measure the same
part/position on each gage, so there was no way to tell if there were relative biases
among the gages without measuring rods with known thicknesses. Note that a
relative bias could have explained why the average thickness for position 3 was
smaller than the other positions.
• The team did not make the best use of the information from the baseline
investigation. Given that the full extent of thickness variation occurred within a
single day (actually a shift), they should have planned the first multivari within
a single shift. The time-to-time variation is captured by repeatedly sampling
within the shift, as in the second multivari investigation.
• The team made the mistake of not initially checking that the first multivari
investigation captured the full extent of thickness variation.
• The team was wise to change the control plan to monitor the process for
position-to-position differences, since these were expected to recur, given the
nature of the grinding process.
Case Study III
Crankshaft Main Diameter
An engine plant machined approximately 1500 V8 crankshafts per day on three shifts.
There was 100% inspection at an automated final gage that measured more than 60
characteristics to ensure that the customer, the engine assembly operation, received
a high-quality product. We show a schematic of the crankshaft in Figure III.1.
At the beginning of this project, the monthly scrap rate ranged between 6 and 8%,
averaging 7.2% over the previous four months. The first-time-through rate, the ratio of the
number of parts that were accepted by the final gage to the total number of parts
processed, was highly variable, and sometimes as low as 20%. Parts rejected at the final
gage were scrapped or reworked and remeasured using an off-line gage.
The goals of this project were to reduce the overall scrap rate to 4.5% or less and
increase the first-time-through rate at the final gage to at least 75%. Management hoped to
achieve these goals without any substantial capital expenditure.
Figure III.1 Schematic of the crankshaft (mains numbered 1 through 5).
The team began by determining a more specific goal that could support the overall
project goal. The reason was recorded for every crankshaft rejected by the final gage. The
team used Pareto analysis to determine that about 85% of rejects and 73% of the scrap were
related to the main diameters. The five main diameters are numbered in Figure III.1.
The final gage measured the diameters at three positions—front, center, and rear—on
each of the five mains. The specification limits for diameter were ±4.0 thousandths of an
inch measured from nominal. The rejects related to main diameter were due to undersized,
oversized, and excess taper from the front to the rear of the main. A crankshaft with any
main diameter less than –4.0 was scrapped. Taper was the difference between front and rear
diameters on each main. The taper specifications were ±2.0 thousandths of an inch. Most
parts rejected for taper could be reworked.
To achieve the project goal, the team decided to concentrate efforts on reducing varia-
tion in main diameter to eliminate scrap due to undersized diameters and to reduce rework
due to taper and oversized diameters.
The team extracted baseline data on main journal diameters for 2000 crankshafts over
four days (500 per day selected haphazardly throughout the day) from the final gage. The
data are given in the file crankshaft main diameter baseline. Because of the large number
of measurements, the data are stored in three different formats on the same worksheet:
• In the first 21 columns by crankshaft
• In the next three columns by individual measurements for diameter
• In the last two columns by individual measurements for taper
In the baseline investigation, the scrap rate was 6.9%, with 53% rework, of which 40%
was due to excess taper. We show the histograms of all measured diameters and calculated
taper values for the 2000 crankshafts in Figure III.2.
Figure III.2 Baseline histograms of main diameters and taper over all mains and positions (dashed vertical lines show the specification limits).
There are many diameters and taper values outside the specification limits. Numerical summaries of the baseline data gave an average diameter of 0.88 with standard deviation 1.99 thousandths of an inch.
The team saw that if the process was centered on target, they needed to reduce the standard
deviation of diameter to less than 1.30 to meet their goal of producing no diameters out of
specification. The full extent of diameter variation was about –5.1 to 6.9 (that is, 0.88 ± 3 × 1.99) thousandths of an inch.
There are so many diameter observations that it is difficult to look for patterns over
time. We give the box plot of diameter by day in Figure III.3. The team also looked at plots
of diameter over time for each of the individual positions. There are no day-to-day patterns.
The next step was to assess the final measurement system. There were 15 different gages
and the team could not measure the same part on each of them; for example, they could not
measure the number 1 main front diameter on the number 3 rear gage. These no-contact
gages used a common air pressure system. The team selected six different crankshafts to be
measured three times each on three days. The team chose six crankshafts to roughly cover
the full extent of variation because here they were simultaneously investigating 15 different
measurement systems. It was not worth the effort to find three crankshafts that gave the full
extent of diameter variation for all 15 positions simultaneously. There was no operator effect
because the measurement process was automated. There were a total of 54 measurements on
the front, center, and rear positions for each of the five mains. The data are given in the file
crankshaft main diameter measurement.
Figure III.3 Box plot of diameter by day from the baseline investigation.
The estimates of the measurement variation (standard deviation) of the 15 gages ranged
from 0.28 to 1.06. The MINITAB ANOVA results for the two extreme cases are:
Because there were 15 gages, there may have been relative biases among the gages.
These biases would contribute to the variation in diameter across all mains as determined
in the baseline. We cannot assess these biases with the data from the measurement sys-
tem investigation. The team was concerned about how to determine if the measurement
system was a dominant cause of the overall variation. They looked at the box plot of
diameter by position from the baseline investigation, given in Figure III.4, and noted that
there were systematic position-to-position differences.
These differences could be due to the process or the gages. The team decided to assess each
gage against the baseline variation within its own position. We give the results in Table III.1.
Figure III.4 Box plot of diameter by position from the baseline investigation.
For some positions, the measurement system was a dominant cause of the variation.
This was a shocking discovery, since the gage was thought to be reliable based on the
required gage R&R investigations. The team needed to improve the measurement system
immediately. The small discrimination ratios helped to explain why some operators remea-
sured scrapped crankshafts a second time. Given the observed measurement variation, it
was not surprising that these crankshafts were sometimes found to be acceptable. Surprisingly,
no one found the need to remeasure crankshafts that were first time passes through the
measurement system.
To improve the final gage, the team decided to look for a dominant cause of measure-
ment variation. They started with the existing data from the measurement investigation.
They examined the data from positions 4 front (one of the best gages) and 1 rear (one of the
worst) in more detail. In Figure III.5, we give plots of the estimated measurement errors
(measurements minus the position average) by day. We see for 1 rear that the measurement
error is substantially different from one day to the next. The dominant cause of the meas-
urement variation for position 1 rear acted in the day-to-day family.
To explore this behavior further, the team measured the same crankshaft once per day
for 19 days. The data are given in the file crankshaft main diameter measurement stability.
The results were striking and surprising. We present results only for the front position for
all five mains here. The results were similar for the other positions. In Figure III.6, and in
the subsequent numerical results, we see that the final gage was unstable. The day-to-day
variation in the gage was much greater than expected based on the existing short-term
R&R results.
Figure III.5 Measurement errors (measurement minus crankshaft average) by day for positions 4 front and 1 rear.
The team decided to address the measurement system instability with feedback control.
The controller was feasible because:
• Quick adjustment of the measurement system was available by changing an offset.
• The shifts in the measurement process were persistent, as seen in Figure III.6.
The team developed a procedure to monitor the stability of the gages and a reaction plan.
Each day, an operator measured a reference part and plotted the results on a control chart,
one for each of the 15 positions. The centerlines and adjustment limits were based on the
within-day measurement variation (that is, short-term variation) from the initial measure-
ment investigation. If a plotted point fell outside the control (adjustment) limits, the gage
was cleaned and remastered. The control charts provide an ongoing record of the perfor-
mance of the gages. If the gages performed consistently over time on the reference part, the
team had confidence in using data from the gages to make process decisions. After the feed-
back system was implemented, the team validated the improvement by repeating the initial
measurement investigation. They found that discrimination ratios, calculated as in
Table III.1, all exceeded 3.0.
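The daily reference-part check can be sketched as a simple limits calculation. The three-sigma multiplier and all numeric values below are assumptions for illustration; only the structure (limits set from the within-day, short-term variation) follows the case.

```python
# Sketch of the daily reference-part check: adjustment limits are set from
# the within-day (short-term) measurement standard deviation. The three-sigma
# multiplier and all numeric values are assumptions for illustration.
def adjustment_limits(center, stdev_short_term, k=3.0):
    return center - k * stdev_short_term, center + k * stdev_short_term

lo, hi = adjustment_limits(center=0.0, stdev_short_term=0.4)
daily_readings = [0.1, -0.3, 0.2, 1.5, 0.0]   # hypothetical reference readings
flagged = [d for d in daily_readings if not lo <= d <= hi]
# a flagged reading triggers cleaning and remastering of the gage
print(flagged)  # → [1.5]
```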
The team was ready to proceed to the next step of the Statistical Engineering algorithm,
confident in the measurement system and knowing they had already improved the process
substantially. They were tracking the first-time-through rate at the final gage and saw a
marked increase. The team considered the possible variation reduction approaches and
decided to look for a dominant cause of the variation in main diameter.
Figure III.6 Front position diameter for the five mains when measuring the same crankshaft.
Because of the faulty measurement system, they debated whether they should repeat
the baseline investigation. They decided to proceed without doing so. They continued to
use the original full extent of variation, –5.1 to 6.9, recognizing that this was likely too
wide given the improvements they had made.
We show a simplified flowchart of the crankshaft production system in Figure III.7,
highlighting the operations that affected the main diameters. The team believed initially
that most of the variation in main diameters at the final gage was caused by differences in
the four journal grinders. There were little data to support this belief.
The team decided to explore the families of variation defined by the process steps using
a variation transmission investigation. The main diameters were measured with an in-process
contact gage between the grinding and lapping operations. The team planned to measure a
sample of parts with the in-process gage and then track the parts through the process and
remeasure them at the final gage. They could use this variation transmission investigation to
eliminate the lapper or the upstream process as the home of the dominant cause.
Based on their experience with the final gage, the team decided to next investigate the
in-process gage. They carried out an investigation of this system with the same plan used
for the final gage. We do not give the data here. They found the measurement variation to
be relatively small (estimated standard deviation 0.31) and no evidence of instability.
The team knew that both the in-process and final measurement systems were used in
an informal way to control the grinders. For that reason, the team next checked the relative
bias of the two systems. They measured six crankshafts on the final gage. Then, they
returned the parts to the in-process gage and remeasured the diameter. Using the same six
parts, they repeated this process two more times. In the data, the intermediate (in-process)
diameters have been increased by 2.25 units to reflect the fact that the nominal diameter is
different at the two gages by 2.25 thousandths. The data are given in the file crankshaft
main diameter relative bias.
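The relative-bias check reduces to averaging the paired differences. The paired readings below are hypothetical, not the values in the file; the in-process readings are taken as already shifted by the 2.25-thousandth nominal difference.

```python
# Sketch of the relative-bias check between the two gages. The in-process
# readings have already been shifted by the 2.25-thousandth nominal
# difference; all paired values here are hypothetical, not the file data.
final = [1.4, 0.2, -0.8, 2.1, 0.5, 1.0]
in_process_adjusted = [0.9, -0.4, -1.2, 1.6, 0.1, 0.4]

differences = [f - g for f, g in zip(final, in_process_adjusted)]
relative_bias = sum(differences) / len(differences)
# a consistently nonzero average difference is the bias the team removed by
# changing offsets on the in-process gage
print(round(relative_bias, 2))  # → 0.5
```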
Figure III.8 Scatter plot of final versus adjusted intermediate diameter for the first main.
From the scatter plot for the first main shown in Figure III.8, we see that there is a bias
between the two gages. There are similar patterns on the other mains. The team changed
some of the offsets on the in-process gage to remove the bias.
The team now had confidence in the two measurement systems and recognized that
they had made another improvement in the process. The first-time-through rate increased
again. They proceeded with the planned variation transmission investigation to separate the
effects of the lapper from those of upstream operations.
The team selected four crankshafts from each of the four grinders on four different
days. Using the two gages, they measured the diameters of the 64 parts before and after lapping.
The data are given in the file crankshaft main diameter variation transmission.
The final diameters varied between –4.3 and 5.5. This is somewhat less than the full
extent of variation but, given that the process had been improved, the team was confident
that the dominant cause had acted during the investigation.
In Figure III.9, we plot the final versus intermediate (in-process) diameters across all mains and positions. We see that the intermediate diameter is a dominant cause of variation in the final diameter. The lapper transmits the upstream variation.
The team was not surprised because the lapper was a so-called dumb machine. It had
no gauging or compensation and lapped for a fixed number of rotations on each main. The
team expected the lapper to remove about 2.25 thousandths from the diameter. A numerical summary of the difference in diameters (final minus intermediate) gave an average of –4.17 with standard deviation 0.51.
Figure III.9 Final diameter versus intermediate diameter over all positions.
Because of the offset between the two gages, the average change in diameter is –4.17
+ 2.25 = –1.92, close to what was expected. The standard deviation of the change in diameter,
0.51, is small.
In Figure III.10, we look at the change in diameter by main and, more critically, by
position within main. We see that the lapper removed systematically different amounts of
material within mains, especially for mains 2, 4, and 5. The lapper is home to a dominant
cause of the taper variation.
All parts met the taper specification of ±2.0 thousandths before lapping but not after.
Using the results from Figure III.10, the team arranged for maintenance on the lapper
to balance all shoes so that, on average, the change in diameter was consistent from main
to main and from position to position within each main. They also changed the control plan
for the lapper. Once per day, an operator measured a crankshaft before and after lapping,
using the final gage, and plotted the change in taper for each main on a control chart. Any
changes in lapper performance could be quickly identified and remedied as required.
The team next returned to the problem of excess diameter variation. They had identified
the intermediate diameter as a dominant cause of the variation. The dominant cause was not
verified since the conclusion made physical sense and seemed clear-cut.

H1212CS03.qxd 4/1/05 10:20 AM Page CD–33

[Figure III.10: diameter difference (final minus intermediate) plotted by position: 1 center, 1 front, 1 rear, through 5 center, 5 front, 5 rear.]

To select a working approach, the team briefly considered using the lapper to adjust the final diameter based on
the observed intermediate diameter. They rejected feedforward control because of the cost.
They would need a smart lapper that could remove a varying amount of material on each main
after receiving input from the intermediate gage. They also rejected desensitization, since
they did not believe any change to the lapper settings could mitigate the variation in the inter-
mediate diameter. They decided to reformulate the problem in terms of the dominant cause.
The team used the data from the variation transmission investigation to set the baseline for the intermediate diameter; the standard deviation was 1.68, and the histogram is given in Figure III.11.
Figure III.11 Histogram of intermediate diameter (dashed lines are the specification
limits ±4.0).
Analysis of Variance
Source DF SS MS F P
Regression 1 2540.6 2540.6 9758.54 0.000
Residual Error 958 249.4 0.3
Total 959 2790.0
The residual standard deviation is 0.51; this estimates the standard deviation of the final diameter if the intermediate diameter could be held fixed. The slope of the regression equation is 0.968, the expected change in average final diameter for each unit change in intermediate diameter.
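A regression like the one summarized above can be reproduced on synthetic data. The sketch below assumes made-up inputs with a true slope near 1 and residual scatter near 0.5; it only shows how the slope and residual standard deviation are extracted, and does not recreate the crankshaft results.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the study: intermediate diameters, with the final
# diameter tracking them closely plus residual noise. Illustrative values only.
intermediate = rng.normal(0.0, 1.7, size=960)
final = 0.97 * intermediate + rng.normal(0.0, 0.5, size=960)

# Least-squares fit of final on intermediate (highest-degree coefficient first).
slope, intercept = np.polyfit(intermediate, final, 1)

# Residual standard deviation: the variation left in the final diameter
# if the intermediate diameter could be held fixed.
residuals = final - (intercept + slope * intermediate)
resid_sd = residuals.std(ddof=2)  # two fitted parameters

print(round(slope, 2), round(resid_sd, 2))
```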
The original goal was to reduce the standard deviation of the final diameter to about 1.30. Using the formula from Chapter 2, the required standard deviation for the intermediate diameter must satisfy

1.30 = sqrt((0.968 × stdev(intermediate))² + 0.51²)

Solving, we find that the required standard deviation for the intermediate diameter is 1.24. The team set a goal to reduce the standard deviation of the intermediate diameters to 1.25 or less.
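The arithmetic behind this goal can be checked directly from the regression summary:

```python
import math

goal_final_sd = 1.30  # target standard deviation for the final diameter
slope = 0.968         # regression slope: final versus intermediate diameter
resid_sd = 0.51       # residual standard deviation from the regression

# Invert goal^2 = (slope * sd_intermediate)^2 + resid_sd^2.
sd_intermediate = math.sqrt(goal_final_sd**2 - resid_sd**2) / slope
print(round(sd_intermediate, 2))  # -> 1.24
```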
The team had already completed the Check the Measurement System stage for the in-
process gage, so they proceeded to consider a working approach. The team expected that
they could find a dominant cause of the variation. However, they first considered tighten-
ing the within-process specifications, that is, using 100% inspection. The current practice
was to scrap parts if any diameter was less than –4.5 and to process all others with the hope
that the lapper might bring the part back into specification. The team considered reducing
the specifications to ±3.0 and insisting that any part outside of specification be scrapped or
reworked at the intermediate gage. The team believed that the operators at the intermediate
gage and the grinding process could meet these specifications on an ongoing basis. The
suggestion was met with hostile resistance from the operators. Without further variation
reduction effort, the team saw that by tightening the specifications, they would transfer the scrap
from the final gage to the intermediate gage with little savings.
The team next considered feedback control. In the current informal system, each of the four
grinder operators used their judgment plus the data from the intermediate gage. The final
gage operators also informed the grinder operators when there were a large number of
crankshafts out of specification. The team suspected that grinder adjustments were made
only when scrap parts were produced.
The operator could quickly make an adjustment that affected all mains simultaneously
but required the help of a tool setter and considerable downtime to adjust the diameter for
a single main. The team decided to adopt feedback control as a working approach and to
investigate further how the diameters varied over time and position.
Since taper at the intermediate gage was small and the adjuster applied simultane-
ously across all 15 positions, the team concentrated on the center diameter for each main.
They also decided to look at one grinder only since all grinders had the same adjustment
mechanism.
The team looked back at the data from the variation transmission investigation for the
five center positions. We see in Figure III.12 that there were small average differences
from day to day and main to main. Within each day, we see most of the full extent of vari-
ation of the diameter. The team decided to investigate the time-to-time variation within a
single shift.
Figure III.12 Box plots of intermediate center diameter by day and main.

The team sampled and measured three consecutive crankshafts from a single grinder every 20 minutes across the eight-hour shift. The sample was about half the crankshafts produced by that grinder. They recorded the diameters from the five center positions only. During the investigation, the grinder operator was encouraged to use his usual ad hoc feedback control method. There were no adjustments during the shift. We give the data for the 63 crankshafts (21 time points and three crankshafts per time) in the file crankshaft main diameter feedback.
The overall standard deviation of the diameters was 1.45, greater than the target value
1.25 but less than 1.68, the standard deviation of the diameter before lapping in the variation
transmission study. The team used ANOVA to assess the effects of time and mains:
Source DF SS MS F P
main 4 91.187 22.797 15.96 0.000
time 20 155.538 7.777 5.44 0.000
Error 290 414.203 1.428
Total 314 660.927
The residual standard deviation is sqrt(1.428) = 1.19. If the time-to-time and main-to-main
differences could be eliminated, the within-grinder variation could be reduced to less than
the target 1.25. Since the easy adjustment applied simultaneously to all mains, the team
looked at the variation of the average diameter (averaged over the five mains) versus time
using a multivari chart. See Figure III.13.
There was a strong trend that dominated the part-to-part variation in average diameter, suggesting that feedback control would be effective. For each main, the team also plotted
the average diameter (over each time period) versus time as shown in Figure III.14. There
were persistent differences among the mains with main 3 being large and main 4 being
small. The average diameters over all parts by main are:
main    average diameter
1              0.5048
2              0.4762
3              1.1587
4             –0.4968
5              0.6635
[Figure III.13: multivari chart of average diameter (over the five mains) versus time, periods 1–21.]
Figure III.14 Average diameter at each time period versus time period stratified by main.
The team decided to build a feedback controller using the average across all mains as the output. Separately for each grinder, they plotted a run chart of the average of three consecutive parts every hour. The operators were trained to look at the chart and make an adjustment if the average fell outside the range ±2.5. For political reasons, the team did not change the in-process specifications. Since the diameters drifted upwards, the target for the adjusted process was –2.0. The team carried out a small experiment on each grinder to verify that changing the adjuster produced the desired effect.
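The hourly adjustment rule can be sketched as a small function. The structure (average the three parts, act only when the average falls outside ±2.5, re-center at the target of –2.0) follows the text; the function name and the convention of returning the adjustment size are our own.

```python
def grinder_adjustment(diameters, limit=2.5, target=-2.0):
    """Hourly feedback rule (sketch): average the three measured parts;
    if the average falls outside +/- limit, return the adjustment that
    re-centers the process at the target; otherwise make no change."""
    avg = sum(diameters) / len(diameters)
    if abs(avg) > limit:
        return target - avg   # signed size of the adjustment to apply
    return 0.0                # within the limits: leave the grinder alone

# Drifted high: the rule calls for a downward adjustment toward -2.0.
print(grinder_adjustment([2.8, 3.1, 2.6]))
# Within limits: no adjustment.
print(grinder_adjustment([0.4, -0.2, 0.1]))
```

Adjusting only outside the ±2.5 band avoids overcontrol: reacting to every small part-to-part fluctuation would add variation rather than remove it.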
To deal with the problem of systematic differences among the mains, the team changed
the section of the control plan that dealt with the setup of the grinding wheels. The tool set-
ter had to ensure that the main averages of the first five parts after setup differed by no more
than one-thousandth of an inch.
Once the changes were in place, the team sampled 100 crankshafts, 25 from each grinder,
over one day from the intermediate gage. The standard deviation of the center diameters
was 1.28, close to the goal of 1.25.
In summary, the team added the following changes to the process control plan:
• Monitor the stability of the final gage using a reference part and a control chart.
Clean and master the gage as necessary.
• Monitor the performance of the lapper on a daily basis by measuring the change
in diameter (before minus after lapping) on one part for all mains and record the
results on control charts, one chart for each main. Watch for evidence of taper
change. Rebalance the lapper shoes if taper becomes an issue.
• Adjust each grinder based on a feedback control scheme that uses the average
across the five center positions and three consecutive parts at the in-process gage.
• Change the grinder setup procedure to ensure that all main averages are close to
equal after setup.
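The daily lapper check in the second bullet amounts to an individuals control chart on the change in diameter for each main. A sketch, using hypothetical historical readings and conventional 3-sigma limits (the text does not state which limit rule was used):

```python
import statistics

# Hypothetical history of the daily change-in-diameter reading (before
# minus after lapping) for one main; illustrative values, not study data.
history = [2.3, 2.1, 2.4, 2.2, 2.0, 2.3, 2.5, 2.2, 2.1, 2.4]

center = statistics.mean(history)
sigma = statistics.stdev(history)

# Conventional 3-sigma limits for an individuals chart.
ucl = center + 3 * sigma
lcl = center - 3 * sigma

def out_of_control(reading):
    """Flag a daily reading that signals a change in lapper performance."""
    return reading > ucl or reading < lcl

print(round(center, 2), round(lcl, 2), round(ucl, 2))
```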
The changes provided the process management with ongoing, timely information to better
manage the process. The process engineer reacted to trends and anticipated problems
before they occurred.
There were many benefits to this project. The taper rework was virtually eliminated by
maintenance on the lapper. The scrap rate was reduced from 7.2% to 2.0% as measured
over a one-month period. Each percent reduction in scrap saved approximately $250,000
per annum. The first-time-through rate at the final gage was held at over 90%. The crank-
shaft line had a large gain in productivity and a large reduction in scrap costs.
Highlights
Some strengths and weaknesses of this case are:
• Addressing both the taper and diameter scrap and rework problems at the same time resulted in a complicated problem. The team needed to look simultaneously at 20 outputs (15 diameters and 5 tapers).
• The team wisely checked the stability of both the final and intermediate gages.
• To prevent recurrence of the problem, the team implemented feedback control
using reference parts for ongoing monitoring of the gages and lapper.
• Not establishing a new baseline after the team made substantial improvements
to the final gage was risky since it was harder to tell if a dominant cause acted
in later investigations.
H1212 Steiner Exercises.qxd 4/1/05 10:23 AM Page CD–39
Exercises
CHAPTER 2
2.1 The word variation is used in other contexts to describe a difference between a real-
ized and target value such as in budget variation. How does this use compare to vari-
ation as discussed in Chapter 2?
2.2 We have heard the following comment many times from manufacturing engi-
neers: “The cause of the variation is the product design—what can you expect me
to do?” Discuss the comment in light of the definition of cause in Section 2.2.
2.3 Profile A is a measure of deviation of the actual from the ideal shape of a camshaft
lobe over one region (A) of the lobe. The target value is zero and the upper specifi-
cation limit is 250 microns. Use the data in the file camshaft lobe runout baseline to
summarize the variation in this output. Do all lobes exhibit the same variation? Is
there any time pattern in the variation?
2.4 Construct histograms and run charts for output 1 and output 2 given in the data file
chapter 2 exercise 4. Find the average and standard deviation for each output.
Assume the target value and upper specification limit for these lower-is-better out-
puts are 0 and 35.
a. Is the variation the same for each output?
b. Is the nature of the variation over time the same for each output?
2.5 You may convince yourself that the formulas for combining means and standard
deviations given in Section 2.4 are true with the following numerical demonstration
you can conduct in MINITAB. Generate two columns of 100 values sampled from
some model (in MINITAB: Calc > Random Data > your choice of model and
CD–41
parameters, for example, Normal with mean and standard deviation 0 and 1, respec-
tively). Then, calculate two new data columns. Let one column be the sum of the
original two columns and the other the difference.
a. Find the standard deviation and average for each of the four columns.
b. Calculate the sum and difference of the averages for the first two columns. How do these compare to the averages of the other two columns, respectively?
c. Calculate the standard deviation for the sum and difference using the “square root
of sum of squares” formula given by 2.1. How do the results compare to the stan-
dard deviations for the last two columns?
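The same demonstration can be run outside MINITAB. For sample variances the identity Var(a ± b) = Var(a) + Var(b) ± 2Cov(a, b) holds exactly, and the square-root-of-sum-of-squares formula (2.1) is the special case where the covariance term is negligible, as it is for independently generated columns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two columns of 100 values sampled from a standard normal model.
a = rng.normal(0, 1, 100)
b = rng.normal(0, 1, 100)

total = a + b
diff = a - b

# Averages add and subtract exactly.
assert np.isclose(total.mean(), a.mean() + b.mean())
assert np.isclose(diff.mean(), a.mean() - b.mean())

# Sample variances obey Var(a +/- b) = Var(a) + Var(b) +/- 2 Cov(a, b) exactly.
cov = np.cov(a, b, ddof=1)[0, 1]
assert np.isclose(total.var(ddof=1), a.var(ddof=1) + b.var(ddof=1) + 2 * cov)
assert np.isclose(diff.var(ddof=1), a.var(ddof=1) + b.var(ddof=1) - 2 * cov)

# With independent columns the covariance is small, so the
# square-root-of-sum-of-squares approximation is close.
approx_sd = np.sqrt(a.var(ddof=1) + b.var(ddof=1))
print(round(total.std(ddof=1), 2), round(approx_sd, 2))
```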
2.6 At a project review, the team presented the following summary of their investigation
based on standard deviations.
Measurement system 30
Identified cause 50
Unidentified causes 81
a. The reviewing manager questioned the numbers in the second column of the
table because they did not add to 100. Is there an error? Explain.
b. By what percentage can the process standard deviation be reduced by eliminating
the contribution of the identified cause?
c. Is the identified cause a dominant cause?
2.7 In Chapter 1, we discussed a project to reduce variation in pull, an alignment char-
acteristic of light trucks. Recall that
Pull = 0.23*(right caster – left caster) + 0.13*(right camber – left camber)
and that the data for two months’ production are stored in the file truck pull baseline.
The data are summarized in the following table.
a. Use the formula for pull and the results for how averages and standard deviations
combine to predict the average and standard deviation for pull given by the last
row in the table indirectly from the component averages and standard deviations.
b. Suppose you had the resources to reduce the variation in one of the alignment
angles by 50%. Which angle would you choose? By how much, approximately,
would the pull standard deviation be reduced?
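For part (a), the pull average follows exactly from the component averages, and the pull standard deviation follows from the square-root-of-sum-of-squares rule if the four angles are treated as independent. The component standard deviations below are placeholders for illustration, since the summary table is not reproduced here:

```python
import math

# Coefficients from the pull equation.
COEF = {"right_caster": 0.23, "left_caster": -0.23,
        "right_camber": 0.13, "left_camber": -0.13}

def pull_mean(means):
    """Average pull from the component averages (exact)."""
    return sum(c * means[k] for k, c in COEF.items())

def pull_sd(sds):
    """Standard deviation of pull, assuming independent components."""
    return math.sqrt(sum((c * sds[k]) ** 2 for k, c in COEF.items()))

# Placeholder component standard deviations (degrees), illustration only.
sds = {"right_caster": 0.9, "left_caster": 0.9,
       "right_camber": 0.5, "left_camber": 0.5}
print(round(pull_sd(sds), 3))
```

For part (b), rerunning `pull_sd` with one angle's standard deviation halved shows which angle gives the larger reduction.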
CHAPTER 3
3.1 For a problem of interest to you, speculate about the likely costs and feasibility of implementing each of the possible variation reduction approaches.
3.2 Variation in the location of a drilled hole in a machined casting can cause poor fits
when the part is bolted to an engine. To reduce this variation, an engineer considers
a variety of possible approaches.
a. A vision system is available that can measure location on 100% of the parts and
reject those that it judges to be out of specification. What are the advantages and
disadvantages of such an approach?
b. Institute a feedback controller by measuring two parts every hour. If hole location
on either part is outside of specification, stop and adjust the process. When is
such a scheme likely to be effective?
c. A third choice is to find a dominant cause of the variation. What are the advan-
tages and disadvantages of this strategy?
d. If a dominant cause can be discovered, what options does the engineer have?
CHAPTER 5
5.1 Briefly discuss the advantages and disadvantages of the following—be sure to think
of potential errors as described within the QPDAC framework.
a. To estimate the baseline performance of a grinding process, 100 consecutive pis-
tons were sampled and the diameters were measured.
b. To investigate a proposed change to a chemical process, the investigators tried the
change in a pilot process rather than the production process.
5.2 In the camshaft lobe BC runout problem described in Chapter 1, the team selected
50 parts (10 per day over 5 days) and measured the BC runout for each of the 12
lobes on each camshaft to quantify the baseline. The 600 runout measurements are
stored in the file camshaft lobe runout baseline original. Conduct an analysis of
these data. Are your conclusions different from those in Chapter 1? Why?
5.3 To assess a measurement system used to check the diameter of an engine bore, an
investigator plans to repeatedly measure the same four (of the eight) bores on five
blocks sampled from a shift of production.
CHAPTER 6
6.1 In Chapter 1, we described a problem in terms of the lobe geometry of camshafts.
The data are given in the file camshaft lobe runout baseline. Quantify the problem
baseline for the following output.
a. BC runout
b. Angle error
6.2 Many programs such as Excel cannot easily handle missing observations.
MINITAB is an exception. Missing values are often stored using a special numeri-
cal code (–99 is common). These special codes can result in much confusion and
lead to incorrect conclusions. Consider the data rod thickness baseline with missing
observation. In the file, there are two outputs. The output thickness_–99 uses a
numerical code of –99 for missing observations, while thickness_missing uses the
MINITAB missing observation symbol (*). Quantify the baseline for these two out-
puts. Which data summaries show the missing observation and which do not?
6.3 The baseline investigation for the V6 piston diameter example was described in
Chapter 5. The data are given in the file V6 piston diameter baseline. Suppose the
data were collected so that all the pistons from a given hour were collected at the
start of the hour. Now the data come in subgroups as defined by hour. What sum-
maries used in the baseline analysis are affected by the subgrouping? When taking
the subgrouping into account are the conclusions any different than those derived in
Chapter 5?
6.4 Based on customer complaints concerning installation difficulties, a team investi-
gated variation of a key fascia dimension. To establish a baseline, they measured the
dimension on 147 fascias sampled from one month’s production. The data are given
in the file fascia dimension baseline. Using appropriate summaries of the data,
quantify the baseline. Are there any concerns?
CHAPTER 7
7.1 In a process improvement problem to improve the quality of a roof panel, the meas-
urement system (specially designed for the project) counted the number of updings
on each panel. To assess the measurement system, the number of updings on 20 bad
panels and 20 good panels was counted twice. The data are given in roof panel
updings measurement.
a. Can this investigation be used to assess the measurement variation of the count-
ing process? Explain.
b. Can this investigation be used to assess the bias of the counting process? Explain.
c. The same operator counted all panels. Does the order in which he makes the
counts matter? It is most convenient to count the same panel twice in a row. Is this
a good idea?
d. A scatter plot of the first versus the second measurement is given as follows. Note that some plotting symbols correspond to more than one pair of measurements. What does the scatter plot tell you about the counting process?

[Scatter plot: second count versus first count, with separate plotting symbols for good and bad panels.]
e. This investigation was conducted over one hour. What are the advantages and
disadvantages of spreading the two measurements on each panel over two days?
f. Can the counting process discriminate between good and bad panels?
7.2 To monitor the process that produces engine blocks, piston bore diameters are meas-
ured on every block because they are key characteristics. Each engine block has
eight bores. The bore diameter is measured at three different heights in each bore
(bottom, middle, and top) and at two different orientations at each height. Because
the measurement process is automated, there are no operators. A measurement
investigation was conducted over a day where the diameter of every bore on four
blocks was measured four times each. The main concern was out-of-round, given by
10,000 times the difference of the two diameters at a particular height. The data are
given in the file block bore diameter measurement. From a baseline investigation the
out-of-round standard deviation was 22.8.
a. Determine the discrimination ratio. Is the measurement system adequate?
b. What would have been the advantage and disadvantage of conducting the meas-
urement investigation over a longer time period?
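For part (a), the calculation can be sketched as follows, assuming the discrimination ratio is defined as stdev(process)/stdev(measurement), with the process standard deviation obtained by quadrature subtraction from the baseline. The measurement standard deviation shown is a placeholder that would first have to be estimated from the data file.

```python
import math

baseline_sd = 22.8     # out-of-round standard deviation from the baseline
measurement_sd = 10.0  # placeholder: estimate this from the investigation data

# Variation attributable to the process itself (quadrature subtraction).
process_sd = math.sqrt(baseline_sd**2 - measurement_sd**2)

# Discrimination ratio: how well the gage separates parts.
D = process_sd / measurement_sd
print(round(D, 2))
```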
7.3 The following MINITAB results and graphs arise from a measurement system
investigation in which two different operators measured five parts three times each.
The five parts were selected with initial measured values spread out over the full
extent of variation, 0 to 8. The data are given in the file chapter 7 exercise 3. The two
operators worked different shifts so the parts were saved from one shift to the next.
The results include an edited ANOVA analysis as suggested in the supplement to
Chapter 7 and the default gage R&R analysis in MINITAB.
Analysis of Variance for measurement
Source DF SS MS F P
part 4 230.819 57.705 81.25 0.000
Error 25 17.754 0.710
Total 29 248.573
Gage R&R
%Contribution
Source VarComp (of VarComp)
Total Gage R&R 0.900 8.62
Repeatability 0.425 4.07
Reproducibility 0.475 4.55
operator 0.475 4.55
Part-To-Part 9.547 91.38
Total Variation 10.447 100.00
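The percentage columns in this printout are simple functions of the VarComp column. The sketch below reproduces the %Contribution figures and also computes %study-variation values (square roots of the contribution ratios), which is why the roughly 29% total gage R&R quoted in part (e) does not come from adding percentages. Results can differ from the printout in the last digit because the printed variance components are rounded.

```python
import math

# Variance components copied from the printout above.
var_comp = {
    "Total Gage R&R": 0.900,
    "Repeatability": 0.425,
    "Reproducibility": 0.475,
    "Part-To-Part": 9.547,
}
total = 10.447

for name, vc in var_comp.items():
    contribution = 100 * vc / total      # %Contribution: variances add to 100
    study = 100 * math.sqrt(vc / total)  # %Study Var: sds do not add to 100
    print(f"{name:15s} {contribution:6.2f} {study:6.2f}")
```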
[Gage R&R (ANOVA) graphs for measurement: components of variation; R chart by operator (R-bar = 1.141, UCL = 2.937); X-bar chart by operator (mean = 4.165, UCL = 5.332, LCL = 2.997); measurement by part; measurement by operator; operator*part interaction plot.]
a. What do the given results tell us about the bias and variation of the measurement
system?
b. In the gage R&R results, the X-bar chart by operator is out of control. What does this mean?
c. In the gage R&R results, why is the sum of the % study variation column not 100%?
d. What is the discrimination ratio (D) for this system? How does the part selection
procedure influence this ratio?
e. The gage R&R is about 29%, yet D is small. Why?
f. The results suggest a small operator-to-operator difference. This observed differ-
ence may be due to a difference in method or a drift of the system over the two
shifts. How can you separate these two possibilities?
7.4 To assess the variation in the system designed to measure camshaft lobe geometry
over time, the same camshaft was measured daily for a month. At each measure-
ment, a number of characteristics (for example, angle error, BC runout, taper, and so
on) on each lobe were determined. The data are given in the file camshaft lobe
runout measurement stability. Is there evidence of time-to-time variation in this
measurement system?
7.5 In a process that produced V8 pistons, problems occurred when pistons in inventory
were remeasured (for an audit) and found to be out of specification. Since the
process used 100% final inspection, this could only occur if there was a problem
with the measurement system. This was puzzling because a recent gage R&R inves-
tigation at the final gage had concluded that the measurement system was accept-
able. As a result, the team decided to conduct a long-term measurement investigation.
Two pistons were chosen to span the range of diameter normally seen. Each piston
was measured four times a day (spread out over the day) for 12 days. During that
time the regular gage calibration was performed every four hours. The data are given
in the file V8 piston diameter measurement stability.
a. Does the measurement system drift over time?
b. What effect does the regular gage calibration have?
7.6 Consider the brake rotor balance example described in the case studies. In the meas-
urement investigation, three rotors were specially selected: one well balanced,
another poorly balanced, and the final rotor requiring weight near the specification
limit of 0.5. The three rotors were measured twice by each of the three gages on
three separate days. There is no operator effect since the gages are automated. The
54 measurements are given in brake rotor balance measurement. The analysis given
in the case study focuses on the measurement of the weight needed to balance the
rotor. However, the location (or orientation) of the weight needed to move the rotor’s
center of gravity is also important. Can the measurement system consistently deter-
mine the orientation of the required balance weight? From the baseline investiga-
tion, the orientation of the weight was roughly uniform from 0° to 360°.
7.7 If necessary, measurement variation can be reduced by applying the Statistical Engi-
neering algorithm. Describe how each of the seven variation reduction approaches
might be used to improve a measurement system.
CHAPTER 9
9.1 Think of a process and problem you know well. Define various families of causes.
9.2 The following plot shows the results of a process investigation aimed at finding a
dominant cause. The dashed lines give the full extent of variation in the output as
defined by the problem baseline. Can the input be ruled out as a dominant cause of
variation?
[Scatter plot of the output versus the input; dashed lines show the full extent of variation in the output.]
CHAPTER 10
10.1 The flow chart that follows shows the major steps in an assembly process to set the
wheel alignment of a truck.
[Flow chart: knuckle components through assembly to camber measurement at one of four aligners.]
The characteristic of interest is right camber with specification 0.5 ± 0.5°. Cam-
ber is measured on every truck by one of the four gages (aligners). The process per-
formance for right camber is shown as follows based on about 6200 consecutive
trucks.
[Histogram of right camber (rcamb) for the approximately 6200 consecutive trucks.]
f. The following plot shows the camber variation for the first nine trucks in the
data set. What families can be ruled out as the home of a dominant cause using
these data?
[Run chart of rcamb for the first nine trucks.]
g. In a special study, one key characteristic of the lower control arm was measured
for 30 trucks. The other components were specially selected to ensure that they
were well within specification. Based on the plot that follows, is the lower con-
trol arm characteristic a dominant cause of right camber variation? Explain.
[Scatter plot of right_camber versus lower_arm for the 30 trucks.]
10.2 Consider again the camshaft lobe runout problem introduced in Chapter 1. Each
camshaft has 12 lobes with a number of characteristics of interest. In a search for a
dominant cause, we may compare the lobe-to-lobe and camshaft-to-camshaft fami-
lies of variation. Using the problem baseline data given in the file camshaft lobe
runout baseline, explore the relative sizes of the two families for the following char-
acteristics and decide which family, if any, can be eliminated.
a. Profile A
b. Profile B
c. Profile C
10.3 In the manufacture of an injection molded part, a key crossbar dimension exhibited
excess variation. The problem baseline estimated the standard deviation of the
crossbar dimension as 0.46 with full extent of variation –0.3 to 2.0. The goal was to
reduce the standard deviation to less than 0.25. An investigation showed the meas-
urement system to be highly capable.
Next the team conducted a multivari investigation where five consecutive parts
were sampled every 30 minutes for four hours. Analyze the data given in crossbar
dimension multivari. Which family of variation can be eliminated as the home of the
dominant cause?
10.4 As described in Chapter 7, in a process that placed labels on bottles, the team
searched for an acceptable measurement system. The file label height measurement
contains the data from an investigation in which three operators using a hand feeler
gage measured three specially chosen bottles three times on two different days.
The bottles were chosen to roughly cover the range of label height values seen in the
process. From a baseline investigation an estimate of the overall standard deviation
was 0.022. The results of a one-way ANOVA are:
Analysis of Variance for height
Source DF SS MS F P
part 2 0.0413263 0.0206631 263.10 0.000
Error 51 0.0040054 0.0000785
Total 53 0.0453317
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ----+---------+---------+---------+--
1 18 0.06966 0.00965 (-*-)
2 18 0.10950 0.00930 (-*)
3 18 0.13705 0.00749 (-*)
----+---------+---------+---------+--
Pooled StDev = 0.00886 0.075 0.100 0.125 0.150
[Scatter plot: output versus start particle count.]
10.6 In the engine block porosity example discussed in Chapter 10, the team found the
occurrence of high porosity coincided with production directly after breaks. To
explore this clue further, they conducted another investigation in which the porosity
of 50 consecutive blocks was measured. The first six blocks were taken from
directly before the lunch break, and the next 44 blocks were the first ones produced
after the break. The data are given in the file engine block porosity run chart. What
does the observed pattern in the run chart tell us about the dominant cause?
[Run chart of subsurface porosity for the 50 consecutive blocks.]
10.7 High silicon concentration in cast iron is undesirable as it was found to be a dominant cause of fluidity variation. However, measuring the silicon level can be difficult. The measurement process consisted of sampling the molten iron by pouring sample coins for testing. The coins were then machined and polished before being spectrochemically analyzed. The full extent of variation in percent silicon as measured in the current process was 1 to 4%. The measurement system was investigated by repeatedly measuring three different coins that roughly covered the full extent of variation in the observed percent silicon. Two operators measured each of the three coins
three times on each of two days. The data are given in the file iron silicon concen-
tration measurement. Analysis of the measurement results estimated the measure-
ment standard deviation as 0.33. The corresponding estimate of the process standard
deviation was 0.5; thus the discrimination is too small at around 1.5. The team needs
to improve the measurement system. Using the existing measurement investigation
data, are there any clues about the dominant cause of the measurement variation?
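The discrimination figure quoted above follows from a simple ratio; a quick check using the estimates in the exercise (discrimination is taken here as the ratio of process to measurement standard deviation, with values much below 3 signaling an inadequate system):

```python
sd_meas = 0.33      # estimated measurement standard deviation
sd_process = 0.5    # estimated process-only standard deviation

discrimination = sd_process / sd_meas   # about 1.5, well below the usual guideline of 3
```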
CHAPTER 11
11.1 In a multivari investigation, two consecutive pieces are selected from each of three
pallets once per hour. Sketch the appearance of the multivari chart that shows all
three families at the same time if a dominant cause lies in the following family. Use
the following multivari chart template in which the dashed lines indicate the full
extent of variation.
a. Pallet-to-pallet family
b. Part-to-part family
c. Hour-to-hour family
d. An interaction between the part-to-part and pallet-to-pallet families
[Template: multivari chart of output (10–30) versus hour (1–4), with separate plotting symbols for pallets 1, 2, and 3]
11.2 In the engine block leakers example, introduced in Chapter 1, the baseline defect
rate was 2–3%. The team conducted a multivari investigation where three consecu-
tive blocks were taken at twelve different times throughout the day. The investiga-
tion continued for three production days giving a total of 108 castings. Each block
was tested for leaks. The data are given in the file engine block leaks multivari. What
can you conclude?
11.3 At an intermediate operation the team planned a multivari investigation in which three
consecutive parts were taken from each of two machines operating in parallel once
every hour for two days. Consider two different processes. In the first process, the order
of the parts coming from upstream is preserved, while in the second process the order
is jumbled. When interpreting the resulting multivari chart (think specifically about the
part-to-part family), what difference does it make which process we are observing?
11.4 In a multivari investigation, the diameter of a transmission shaft was measured at
four positions (left and right side at two different orientations) for three consecu-
tively sampled shafts each hour. The data are available in the file transmission shaft
diameter multivari.
a. What conclusion can you draw from the multivari charts that follow?
[Figure: multivari charts of diameter (0.249–0.251) versus hour (1–5) and versus position (1–4)]
b. Using the data assess whether the dominant cause acts in the shaft-to-shaft family.
11.5 In the production of engine blocks, bore diameters are key characteristics. Bore
diameter is measured at three heights and two orientations in each of the eight bores
in each block. The team used Statistical Engineering to address a problem of excess
bore diameter variation. The baseline investigation found a standard deviation of
3.04 and the full extent of variation of –9 to 9 as measured from nominal in microns.
There were no strong differences between the different bores, heights, or positions.
Another investigation concluded that the measurement process was adequate. To
isolate the processing step where the dominant cause acts, the team selected 30
engine blocks haphazardly from a day’s production. In the investigation the bore
diameter (measured from nominal at that processing step) in the first bore at the top
position and first orientation was measured at each of five processing steps in the
machining part of the process. The data are given in the file block bore diameter
variation transmission. Which processing step is home to the dominant cause?
11.6 In the paint film build example described in Chapter 3, a baseline investigation
found the standard deviation was 0.315, with an average of 16.2 thousandths of an
inch. The full extent of variation was 15.2 to 18.5. To search for a dominant cause,
the team conducted a multivari investigation where they measured the film build at
five positions on five cars in a row every hour for two shifts (16 hours). This resulted
in a total of 400 film build measurements. The data are given in the file paint film
build multivari. Based on the plots that follow, what conclusions can you draw? We
define group as (hour – 1) × 5 + position.
[Figure: histogram of film build (15–19); multivari charts of film build versus position (1–5), versus hour (1–16), and versus group (1–80)]
11.7 A team wanted to reduce the number of updings on a roof panel. Updings are small
outward dents in the metal surface caused by contamination. A baseline investiga-
tion found that the total number of updings in 20 consecutive panels ranged between
5 and 438. To search for a dominant cause the team conducted a multivari investiga-
tion where the number of updings was counted for 20 consecutive roof panels from
three sections of seven different pallets of steel sheets. Originally, the plan was to
repeat this data collection over two separate days. However, the team found the full
extent of variation from the baseline was observed on the first day so they stopped
collecting data. The data are given in the file roof panel updings multivari.
a. Analyze the data using multivari charts and draw conclusions.
b. When the number of updings was counted they were classified into one of the
nine locations as numbered in the schematic that follows. Analyze the multivari
data using a concentration diagram based on the given schematic.
Roof panel
7 8 9
4 5 6
1 2 3
Front
11.8 The baseline investigation for the sand core example discussed in Chapter 1
involved taking five samples over a single day of five consecutive shots of four cav-
ities each. The data are given in sand core strength baseline. What conclusions can
you draw?
CHAPTER 12
12.1 Vehicle plant and warranty complaints for loose spark plug wires at the spark plug
end prompted an improvement project. As a result of several investigations, the fam-
ily of causes related to push forces on the wires was the home of a dominant cause.
A further investigation then compared eight loose and eight good connections. For
each of the 16 connections, the team measured the terminal position of wires and
terminal runout of the spark plug in millimeters. The data are given in the file spark
plug connection comparison. What do the data tell us about the dominant cause?
12.2 A sunroof installation process suffered from a 90% rework rate due to a lack of
flushness. Flushness is defined as the difference in height between the sunroof seal
and the metal roof. It is measured using digital calipers at six points (three at the
front and three at the back). A baseline investigation showed that flushness problems
were most pronounced at the two front corners with full extent of variation between
–3.5 to 4 mm and standard deviation 1.25 millimeters. A goal was established of reducing
the front corner flushness standard deviation to 0.5 and the range to –2 to 1 millimeters.
Based on engineering knowledge, the team felt that only two types of characteristics
could lead to flushness variation, namely roof crown height and attachment pad
height. When the roof is adapted to allow installation of a sunroof, six installation
pads are added. Based on this knowledge, the team selected six vehicles with large
positive flushness and six vehicles with large negative flushness on both front cor-
ners. The sunroof modules were removed and the six attachment pad heights and
roof crown height were measured at the front and back. The data are given in the file
sunroof flushness input-output. What conclusions can you draw?
12.3 An example related to sand defects in manifolds was discussed in Chapter 12.
Before the problem mentioned in Chapter 12, the team carefully monitored the
process for a shift. Each of 970 manifolds was classified as either scrap (due to sand
issues) or not scrap. In addition, many inputs relating to each manifold were recorded,
including discrete inputs such as mold number and continuous inputs such as pour
time. In some cases the linkages were difficult to establish, and the team
did the best they could. The data are given in the file manifold sand scrap comparison.
What conclusions can you draw?
CHAPTER 13
13.1 In a verification experiment there were two suspects at two levels. The low and high
levels for each suspect were chosen based on the extremes from historical variation.
The results of the first three runs of the experiment are shown in the following table.
Suspect 1   Suspect 2   Run order   Output
Low         Low         2           13
Low         High        3           16
High        Low         1           17
High        High        4           ?
Given that the full extent of output variation is 12 to 30, what conclusions can you
draw about the dominant cause?
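One way to reason about the unrun fourth treatment: if the two suspects acted additively (no interaction), its result could be predicted from the three completed runs. A sketch:

```python
low_low, low_high, high_low = 13, 16, 17    # results of the first three runs

# Additive (no-interaction) model: add the effect of raising suspect 1
# (17 - 13 = 4) and of raising suspect 2 (16 - 13 = 3) to the low-low result.
predicted_high_high = low_low + (high_low - low_low) + (low_high - low_low)

full_extent = (12, 30)   # full extent of output variation
# Even this predicted extreme (20) falls far short of the upper end (30).
```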
13.2 In the engine block porosity example discussed in the text and exercises of Chapter
10, a dominant cause of porosity acted immediately following scheduled breaks in
production. Based on this evidence, the team identified two suspects: iron pouring
temperature and the addition of ladle wash. During work stoppages, iron that
remained in the six pouring ladles cooled off because there was no external heat
source. At the start of the break, ladle wash was added to the ladles to protect the
refractory (surface). The team could not easily manipulate the pouring temperature,
but they could change the amount of ladle wash. They conducted a verification
experiment in which they added the normal amount of wash to ladles 1, 3, and 5 and
half the normal amount to the other three ladles over two lunch breaks. At each
break, they measured the porosity of the first 30 blocks poured (five from each
ladle). The data are given in the file engine block porosity verification.
a. What have we learned about the identity of the dominant cause of porosity?
b. Explain how the effects of ladle number and the presence or absence of ladle
wash are confounded. Does this matter?
13.3 The manufacture of a tube assembly required a protective nylon sleeve to be posi-
tioned and bonded to a tube. The bond strength of this tube assembly was occasion-
ally tested using a destructive test where the sleeve was subject to increased tensile
shear load until failure. In the current process, the average pull-off force was around
15 pounds, but roughly 8% of assemblies tested had a pull-off force less than the
desired minimum of five pounds. The team decided to try to solve the problem by
reducing the variation in pull-off force rather than by increasing the average pull-off
force. A number of investigations were conducted to find the dominant cause. A
multivari investigation suggested that the dominant family of causes was tube-to-
tube. At this point, the team decided to conduct an experiment to search for a domi-
nant cause using the limited process information they had gathered. They planned a
factorial experiment with three suspects—clearance between the sleeve and tube,
amount of adhesive, and cure time—all consistent with the tube-to-tube family clue.
The team chose the low and high levels of each suspect to roughly match their range
in regular production. The levels of clearance were achieved by sorting sleeves and
tubes. There were two replicates of each treatment, and the run order was random-
ized. The data are given in the file nylon bond strength verification and summarized
in the following table:
[Table: treatments and bond strength results]
ulant added by the automated delivery system seemed to vary. The desired amount
of inoculant was obtained by slowly shaking the inoculant onto a plate. The plate
was designed to tip automatically when the required weight of inoculant was pres-
ent. The team saw that the location on the plate where the inoculant fell varied, and
they thought that this might influence when the plate tipped and thus how much
inoculant was delivered. The team decided to verify inoculant amount as the dominant
cause of nodularity variation. In the verification experiment, they produced a total of
20 castings at each of two levels of inoculant amount, 12.3 and 13.5 grams. For the
experiment the inoculant was carefully weighed and added by hand. The experiment
consisted of eight runs of five castings each. The order of the eight runs (four at each
level) was randomized. For each of the 40 castings the percent nodularity was deter-
mined. The data are given in the file steering knuckle strength verification and are
summarized in the table that follows:
a. What considerations should the team have used in choosing the two levels for
inoculant?
b. Why was randomizing the order of the runs important?
c. Has the team verified the dominant cause of nodularity variation?
CHAPTER 14
14.1 In the camshaft lobe runout example, the team searched for a dominant cause of
variation. As discussed in Chapter 10, they conducted a variation transmission
investigation where runout was measured directly before heat treatment and after
the final step of the process, on the same 32 parts selected over the course of one day.
In the investigation the grinder (one of eight) and heat treatment spindles (one of
four) used were also recorded. The data are given in the file camshaft lobe runout
variation transmission. They found that a dominant cause of variation was the
BC runout just after heat treatment and, more specifically, as shown in the plot that
follows, that heat treatment spindle was a dominant cause.
[Figure: final BC runout (10–30) versus heat treatment spindle (1–4)]
In this example, the team decided not to reformulate the problem but to look for
a more specific cause.
a. Discuss the advantages and disadvantages of the decision not to reformulate.
b. Suppose the team had reformulated the problem based on heat treatment spindle
and that the original goal was to reduce the final runout standard deviation to less
than 4.5. Using the results from a one-way ANOVA model based on heat treat-
ment spindles, derive a goal for the new problem based on differences among
spindle averages.
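For part b, the reformulated goal follows from the variance decomposition sd_final² = sd_between² + sd_within². The within-spindle value below is a hypothetical placeholder; substitute the pooled standard deviation from your one-way ANOVA:

```python
import math

goal_sd_final = 4.5   # original goal for the final runout standard deviation
sd_within = 3.9       # hypothetical pooled within-spindle standard deviation

# Largest spread in spindle averages compatible with the original goal:
goal_sd_between = math.sqrt(goal_sd_final**2 - sd_within**2)
```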
14.2 In Chapter 11, the team found that the piston diameter directly after operation 270
was a dominant cause of final V6 piston diameter variation. The relationship is illus-
trated in the scatter plot that follows. The data are given in the file V6 piston diame-
ter variation transmission.
[Figure: scatter plot of diameter after OP310 (584–599) against diameter after Operation 270]
The team decided to look further for a more specific dominant cause. Suppose,
however, they had wanted to reformulate the problem in terms of the operation 270
diameter. Determine an appropriate goal for the reformulated problem. Recall that
the goal for the original problem was to reduce the final diameter standard deviation
to less than 2.0.
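The reformulation arithmetic can be sketched as follows. If final diameter is regressed on the operation 270 diameter, then sd_final² ≈ slope² × sd_270² + sd_residual². The slope and residual standard deviation below are hypothetical stand-ins for the values estimated from the file:

```python
import math

goal_sd_final = 2.0   # goal for the final diameter standard deviation
slope = 1.0           # hypothetical regression slope of final on OP270 diameter
sd_resid = 1.0        # hypothetical residual standard deviation

# Goal for the operation 270 diameter standard deviation:
goal_sd_270 = math.sqrt(goal_sd_final**2 - sd_resid**2) / slope
```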
CHAPTER 15
15.1 Based on customer complaints about assembly difficulty, a team investigated fascia
dimension variation. A baseline investigation found that some fascias were too
large. The team felt that reducing the average size of the fascias could solve the
problem (that is, they adopted the Move the Center approach). They planned a (full)
factorial experiment with two candidates, cycle time and cure time, each at two
levels to look for an adjuster. They chose the levels for each candidate based on
engineering judgment. The results of the experiment are given in the file fascia
dimension move center and in the following table. For each treatment, the team con-
ducted four runs producing 10 fascias for each run. The order of the 16 runs was ran-
domized over a day. In the data, we give only the average fascia dimension from
each run and not the individual values.
[Table: treatment, run order, tonnage, lubrication, blank size, prebending, safety margin]
Press tonnage was very difficult to change, so all eight runs with low press tonnage
were carried out first. Within each group of eight runs, the order was randomized.
The data are given in the file sheet metal move center.
a. Analyze the experimental data to see if any of the candidates is an adjuster.
b. Does the restriction on randomization required for this experiment make any dif-
ference to the conclusions we can draw?
15.3 In the sand core strength example introduced in Chapter 1, too many cores were
breaking during handling. A suggested solution was to increase the core strength
(and thereby reduce core breakage) by increasing the resin concentration. It was
known that increasing the resin would result in a stronger core. However, the precise
nature of the relationship—that is, how much the core strength increases for a given
change in resin concentration—was not known. An experimental investigation was
planned to quantify the relationship. Three levels of resin concentration (1.3, 1.6,
1.9% by weight) were chosen based on engineering judgment. In the experiment, 40
cores for each level of resin were produced; 15 were measured for strength (using a
destructive test) and the remaining 25 were processed to look for casting problems.
The experiment consisted of three runs with 15 repeats. The order of the runs was
not randomized. The data are given in the file sand core strength move center.
a. What can you conclude about the relationship between resin concentration and
core strength?
b. The team used only three runs with 15 repeats for each run. Discuss the advan-
tages and disadvantages of this plan compared with using five replicates for each
treatment with three repeats each.
CHAPTER 16
16.1 In a sonic welding operation, problems arose due to poor weld strength, measured
as pull-off force. The goal was to reduce the variation and increase the average pull-
off force. The second goal is not addressed here. From the baseline, the full extent
of variation for pull-off force was 0.9 to 3.0. The team discovered that the dominant
cause acted in the time-to-time family. While they could not be more specific, the
team felt that the dominant cause was related to material hardness, which was out-
side their control. They decided to try to desensitize the process to variation in the
dominant cause.
The team planned a fractional factorial experiment with four candidates at two
levels each in eight treatments. Using the results of regular process monitoring, they
identified three time periods when weld strength was low, medium, and high relative
to the baseline. In each period, they randomized the order and then produced a part
with each of the eight treatments. The pull-off force data and plan are given in the
file sonic weld desensitization and the table that follows. The three values in the
columns Order and Pull-off force correspond to the three different time periods. The
original settings of the candidates correspond to treatment 2.
a. Explain why the team believed the dominant cause acted over the three runs for
each treatment.
b. What levels of the candidates do you recommend to reduce the variation in pull-
off force?
c. Another way to assess the results of this experiment is to summarize the output
across each treatment using log standard deviation. Using this performance
measure, do your conclusions differ from part b?
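For part c, each treatment is summarized by the log of the standard deviation of its three pull-off forces. A minimal sketch with hypothetical forces for a single treatment:

```python
import math
import statistics

forces = [1.4, 2.1, 2.6]       # hypothetical pull-off forces across the three periods
s = statistics.stdev(forces)   # sample standard deviation over the runs
log_s = math.log(s)            # performance measure (smaller is better)
```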
16.2 In the crossbar dimension example discussed in Chapter 12, the team found that the
dominant cause of dimension variation was barrel temperature. Because it was hard
to control in regular operation, the team decided to try to make the process less sen-
sitive to barrel temperature variation. In the current process, barrel temperatures
ranged over roughly 4°C. The team planned a half fraction factorial experiment with
three candidates—target barrel temperature, injection pressure, and material—at
two levels each, as shown in the following table. The current injection pressure and
target barrel temperature were 1000 and 75, respectively. Note that although the
variation in barrel temperature was the dominant cause, the target barrel temperature
is a fixed input. Five crossbars were produced and measured in each run. For each
treatment, there were two runs, one at the target barrel temperature plus 2°C and the
other at the target barrel temperature minus 2°C. The data are given in the file
crossbar dimension desensitization and in the table as follows.
[Table columns: Treatment; Target barrel temperature; Injection pressure; Material; Dimensions at barrel temperature –2°C; Dimensions at barrel temperature +2°C]
Since the average dimension can be easily adjusted, we focus the analysis on
finding a way to make the process less sensitive to barrel temperature variation.
a. What levels of the candidates do you recommend?
b. Injection pressure and material were chosen as candidates based on engineering
judgment. Looking again at the results presented in Chapter 12, what motivates
the choice of target barrel temperature as a possible candidate?
16.3 In Chapter 16, we describe a desensitization experiment for the refrigerator frost
buildup example where each refrigerator is subjected to only two extreme levels of
environmental causes. Here we consider a hypothetical experiment in which each
refrigerator is exposed to all eight combinations of the environmental causes:
Cause combination           1    2    3    4    5    6    7    8
Ambient temperature (°C)   26   26   26   26   32   32   32   32
Relative humidity (%)      70   70   90   90   70   70   90   90
Door openings per hour      4    8    4    8    4    8    4    8
The experimental plan had 64 runs. To conduct the experiment, all eight refriger-
ators were simultaneously placed in a test chamber and exposed to each cause com-
bination in the given order. The cooling plate temperatures are given in the file
refrigerator frost buildup desensitization2 and in the following table:
In the table, we have coded the new and original settings for the candidates as N and O,
respectively. What conclusions can you draw? Remember, the goal is to desensitize
cooling plate temperature to changes in the environmental conditions.
16.4 There were excessive failures in the accelerated life testing of electric motors. Using
a group comparison investigation, the team found that unevenness in the commuta-
tor shaft surface was a dominant cause of these failures. The team next reformulated
the problem to one of reducing the unevenness in the commutator shaft.
[Table: experimental plan and smoothness results]
a. What is the confounding structure of the design? What limitations does this
introduce?
b. What conclusions can you draw?
c. What would be the advantages and disadvantages of measuring the time to fail-
ure using the accelerated life test for each run rather than judging the smoothness
of the commutator surface after machining?
CHAPTER 17
17.1 In an investigation, 100 trucks were selected from regular production over two weeks.
The frame geometry as given by four summaries (left and right front, left and right rear)
and the alignment outputs left and right camber and caster were determined for all 100
trucks. The data are given in the file truck pull feedforward. In Chapter 17 an analysis
determined that feedforward control based on frame geometry was feasible for left
caster. Repeat the analysis for the other outputs: right caster, left camber, and right
camber.
17.2 Engine assembly problems occurred due to a poor fit between the pistons and the
engine bore. The dominant cause of poor fit was found to be variation in the clear-
ance, the difference between the (minimum) bore diameter and the (maximum) pis-
ton diameter. To solve this problem, the team thought about using the feedforward
(selective fitting) approach. The idea was to measure each piston diameter and place
them into bins of similar diameter. Then, after each bore diameter was measured, a
piston would be selected from the appropriate bin. To assess this proposal the diam-
eter measurements for 469 pistons and bores, as measured from nominal, are given
in the file block bore diameter feedforward. Quantify the expected reduction in
clearance variation when using one (that is, no selective fitting), two, three, or four
bins of pistons. A suggestion is to define the bins by dividing the range in piston and
bore diameters (roughly –10 to 10 microns) into equal widths.
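The binning calculation can be sketched by simulation. The normal distributions below are assumptions standing in for the measurements; the same logic applies to the data in the file:

```python
import numpy as np

rng = np.random.default_rng(2005)
n = 50_000
# Assumed: piston and bore diameters (from nominal, microns) roughly N(0, 3).
pistons = rng.normal(0, 3, n)
bores = rng.normal(0, 3, n)

def clearance_sd(n_bins):
    """Std dev of bore - piston when both are matched within equal-width bins."""
    edges = np.linspace(-10, 10, n_bins + 1)
    p_bin = np.clip(np.digitize(pistons, edges) - 1, 0, n_bins - 1)
    b_bin = np.clip(np.digitize(bores, edges) - 1, 0, n_bins - 1)
    diffs = []
    for k in range(n_bins):
        p, b = pistons[p_bin == k], bores[b_bin == k]
        m = min(len(p), len(b))          # pair off as many as both bins allow
        diffs.append(b[:m] - p[:m])
    return np.concatenate(diffs).std()

# More bins mean tighter matching and smaller clearance variation.
sds = [clearance_sd(k) for k in (1, 2, 3, 4)]
```

Applying `clearance_sd` to the measured diameters in the file quantifies the actual reduction for each number of bins.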
17.3 In the V6 piston diameter example discussed in Chapter 11, the team found that pis-
ton diameter after Operation 270 was a dominant cause of the final diameter. The data
are given in the file V6 piston diameter variation transmission. This suggested that
feedforward control might be a feasible approach.
a. What are the requirements for feedforward to be feasible in this context?
b. If feedforward were feasible, assess the potential benefit using the results of the
variation transmission investigation.
c. Could the team also use the diameter after Operation 200, rather than the diame-
ter after Operation 270, as the input to a feedforward controller?
CHAPTER 18
18.1 The bias of the system used to measure camshaft journal diameters tended to increase
over time. The cause of this increase was not determined. Instead, the team introduced
a feedback controller. At the start of each shift, a master journal with known diameter
was measured. If the measured diameter deviated from the known value by a large
amount, the measurement system was recalibrated.
a. How could we sensibly define a large deviation in this context?
b. What would happen to the measurement variation if the measurement device
were recalibrated every time the master journal was measured, rather than only
when the deviation from the known dimension was large?
18.2 In a machining process, the dominant cause of dimension variation acted in the
setup family. That is, the dimension variation within a particular setup was small rel-
ative to the variation from one setup to the next. The existing control plan called for
a complete process adjustment back to the target based on the first observation after
each setup. There were more than 200 parts machined between setups. The baseline
dimension standard deviation was 0.31. The team decided to explore a new feedback
control scheme based on the average for the first five observations after each setup.
In an offline investigation, they carried out 10 setups and produced 20 parts after
each setup without any adjustment. The dimension data, scaled so that the target is
zero, are given in the file machining dimension feedback.
a. Use a one-way ANOVA to estimate the standard deviation if the process could be
adjusted so that the dimension averages across all setups were equal.
b. Use simulation to compare the performance of the existing feedback controller
with the proposed controller that makes a complete adjustment based on the aver-
age for the first five observations after each setup.
c. In general, we may design a feedback controller by averaging the output from
the first n observations after each setup. What considerations help you decide
how many observations should be used to estimate the process average after
each setup?
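Part b can be explored by simulation. The setup and within-setup standard deviations below are hypothetical values chosen so that their combination (√(0.28² + 0.13²) ≈ 0.31) matches the baseline; in practice both would be estimated from the offline investigation:

```python
import numpy as np

rng = np.random.default_rng(1)

def adjusted_sd(n_obs, sd_setup=0.28, sd_within=0.13, n_setups=2000, n_parts=200):
    """Std dev of output after a complete adjustment to target based on
    the average of the first n_obs observations following each setup."""
    remaining = []
    for _ in range(n_setups):
        offset = rng.normal(0, sd_setup)               # setup-to-setup shift
        parts = offset + rng.normal(0, sd_within, n_parts)
        adjustment = parts[:n_obs].mean()              # estimated offset
        remaining.append(parts[n_obs:] - adjustment)   # parts made after adjusting
    return np.concatenate(remaining).std()

sd_one = adjusted_sd(1)    # existing plan: adjust on the first observation
sd_five = adjusted_sd(5)   # proposed plan: adjust on the first five
```

Averaging five observations estimates the setup offset more precisely, so less noise is fed back into the adjustment.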
18.3 In a machining process, there was excess variation in the diameter of a precision
ground shaft. The shaft diameter was measured for all shafts using a complex auto-
mated gage (that also measured other outputs). Upon investigation, the team discov-
ered that the dominant cause acted in the measurement family. In particular, the
measurement bias changed from day to day, consistent with the pattern observed in
the baseline. To explore this bias change further the team planned an investigation
where the diameter of the same shaft was measured each hour for four days. A total
of 32 diameter measurements were made. The data are given in the file precision
shaft diameter feedback, with the output being the diameter measured from nomi-
nal. The results show a gradual drift. The team speculated that the drift was caused
by changes in some (unidentified) environmental conditions. They decided to
reduce the measurement variation using a feedback controller.
a. What type of feedback controller (that is, what prediction equation and what
adjustment rule) would you recommend in this application?
b. Suppose the team decided to use a feedback controller based on EWMA fore-
casts with the smoothing parameter alpha equal to 0.4. What kind of a reduction
in the measurement variation could they expect?
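For part b, the expected benefit can be gauged by simulating a drifting bias and applying the EWMA-based adjustment. The linear drift and noise level below are assumptions standing in for the pattern in the file:

```python
import numpy as np

rng = np.random.default_rng(7)
hours = np.arange(32)
# Assumed: a slow linear drift in bias plus measurement noise stands in
# for the 32 hourly measurements in the investigation.
y = 0.1 * hours + rng.normal(0, 0.3, 32)

def ewma_residuals(y, alpha=0.4):
    """Deviations left after adjusting each measurement by an EWMA forecast."""
    forecast = y[0]
    res = []
    for obs in y[1:]:
        res.append(obs - forecast)                        # adjusted measurement
        forecast = alpha * obs + (1 - alpha) * forecast   # update the forecast
    return np.array(res)

reduction = ewma_residuals(y, alpha=0.4).std() / y.std()  # well below 1
```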
CHAPTER 19
19.1 In the paint film build example introduced in Chapter 3, the baseline standard devi-
ation in film build (paint thickness) was 0.67 thousandths of an inch. With this vari-
ation, to ensure a minimum film build of 15-thousandths of an inch, the process was
centered at 17. The goal was to reduce the standard deviation to 0.35, thereby allow-
ing for a reduction in the average film build.
The dominant cause of film build variation was found using a multivari investiga-
tion to act in the car-to-car family. Despite further effort, the dominant cause was not
found. The team decided to adopt the process robustness approach. Based on process
experience, candidates and their corresponding levels were chosen as follows:
[Table: candidates and levels, including Temperature at 30 and 50]
The team selected a fractional factorial resolution V experiment with the 16 treat-
ments given as follows. To reduce the cost of the experiment, panels were used
rather than cars. With this choice there was a risk of study error.
[Table columns: Treatment; Anode dimension; Conductivity of paint; Temperature; X voltage; Z voltage]
For each run, five panels were painted. The order of the treatments was random-
ized. Since the dominant cause acted car to car, the team believed the unknown dom-
inant cause would act within each run. Film build was measured at five locations on
each panel. The data for one location are given in the file paint film build robustness
and in the table that follows.
a. Analyze the data using the standard deviation of film build over the five con-
secutive panels to measure performance. Is it possible to make the process
robust to noise variation? What levels of the candidates do you suggest?
b. The team had a way to adjust the process center. However, we can also use the
robustness experiment to look for an adjuster. Analyze the data using the average
film build over the five consecutive panels to measure performance. Are any of
the candidates adjusters?
c. In the experiment, the film build at a particular location on five consecutive cars
(panels) was used to define a run. Suppose, instead, that the five observations
came from five fixed locations on a single door. What, if any, changes are needed
in the analysis presented in part a?
19.2 In a trim plant, customer complaints about seat appearance prompted management
to assign a team the task of reducing shirring variation. The team proceeded without
using Statistical Engineering and made a number of mistakes. Seat cover shirring
was scored on a scale of 1 to 6 using boundary samples by how much cloth was
gathered by the parallel stitching. Shirring scores of 1 to 4 were acceptable with 1
being the best. Scores of 5 or 6 resulted from either too much or too little shirring. A
review of historical data suggested that the observed shirring score over a week covered
all six possible values. Next, the team informally checked the measurement
system. They found the measurement system added little variation. The team
decided not to look for a dominant cause. Rather they moved directly to assessing
the feasibility of making the process robust. They used brainstorming to select six
candidates with two levels each as follows:

Candidate        Level 1    Level 2
Seam width       9 mm       11 mm
Bun thickness    +5 mm      –5 mm
The team planned a resolution III fractional factorial experiment with 16 runs
(one for each treatment) as follows:
Each run consisted of three seats (repeats). The runs were conducted in the treat-
ment order given in the table. The data are given in the file seat cover shirring
robustness and reproduced as follows:
Treatment  Order  Seat 1  Seat 2  Seat 3  Average
 1          13      3       1       2       2.0
 2          16      1       2       1       1.3
 3           7      2       2       2       2.0
 4           6      2       2       2       2.0
 5          10      2       1       1       1.3
 6           1      3       1       3       2.3
 7          11      4       2       1       2.3
 8          15      2       2       4       2.7
 9           5      1       2       2       1.7
10           3      4       5       2       3.7
11          14      3       3       2       2.7
12           9      2       3       3       2.7
13           8      1       2       2       1.7
14           2      2       2       3       2.3
15           4      1       4       2       2.3
16          12      2       3       1       2.0
a. Explain why choosing the process output as a measure of variation (that is, high
scores come from either too much or too little shirring) was a poor one.
b. The goal is to find process settings that lower the average shirring score. Can we
use any of the candidates to achieve the goal?
c. Each run consisted of three seats. Discuss this choice in the context of a robust-
ness experiment.
For the last two parts of this question, suppose the first three candidates (leather
thickness, leather toughness, and seam width) used in the robustness experiment
were normally varying rather than fixed inputs.
d. How should the levels of the first three inputs have been chosen?
e. Discuss changes you would make to the analysis you conducted in part b.
19.3 Torsional rigidity of the weather stripping was the dominant cause of door assembly
problems. Management set a goal of reducing standard deviation in torsional rigid-
ity to 0.3. A baseline investigation found the variation in torsional rigidity was
roughly 0.55 mm and that the dominant cause acted over the short term and certainly
within any half hour. The team looked briefly for a dominant cause of rigidity vari-
ation without success. Next, they planned a robustness experiment with four candi-
dates at two levels each, chosen based on engineering judgment. The candidates and
levels are:
Candidate        Level 1    Level 2
Extruder RPM     22         26
Tension (pre)    1          5
Water flow       2          6
The team planned a full factorial experiment with 16 runs, one for each treatment.
The correspondence between treatments and candidate levels is given in the table
that follows.
Treatment    A    B    C    D
 1          –1   –1   –1   –1
 2          –1   –1   –1    1
 3          –1   –1    1   –1
 4          –1   –1    1    1
 5          –1    1   –1   –1
 6          –1    1   –1    1
 7          –1    1    1   –1
 8          –1    1    1    1
 9           1   –1   –1   –1
10           1   –1   –1    1
11           1   –1    1   –1
12           1   –1    1    1
13           1    1   –1   –1
14           1    1   –1    1
15           1    1    1   –1
16           1    1    1    1
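The treatment table above is a standard 2⁴ full factorial design and can be generated programmatically; a minimal sketch in Python (levels coded –1/+1, with the last candidate varying fastest, as in the table):

```python
from itertools import product

# All 16 treatments of a 2^4 full factorial design; the last
# candidate varies fastest, matching the table above.
design = list(product([-1, 1], repeat=4))

for treatment, levels in enumerate(design, start=1):
    print(treatment, levels)
```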
Each run consisted of running the process for half an hour after the candidate lev-
els had been reached. Within each run, 10 weather-strip samples were selected
spread out over the half hour. The order of the runs was randomized. The torsion
rigidity of each of the 10 weather-strip samples for each treatment is given in
columns s1 to s10 of the table that follows and in the file weatherstrip torsional
rigidity robustness.
Treatment  Order    s1    s2    s3    s4    s5    s6    s7    s8    s9   s10
 1           13   10.3  13.0  11.5  11.8  10.7   9.9  10.7  11.5  11.0  11.1
 2            6   11.5  13.0  10.4  11.1  10.9  10.6  12.0   9.3   9.2   9.3
 3            9   11.6  13.0  10.4  16.0  10.3  10.8  11.5  11.0  11.3  10.9
 4            1   11.5  11.7  10.4  11.7  14.0  11.7  10.4  11.7  10.4  10.4
 5            3   14.0  11.7  11.7  19.0  11.9  11.7  12.1  13.0  11.1  11.0
 6           11   22.0  15.0  18.3  11.7  20.3  21.0  12.6  13.6  14.7  15.1
 7            5    9.1   9.6  10.2   9.8   9.0   9.7  10.0  12.0   9.0   8.8
 8           14   10.0   9.1  10.6  10.4  10.8  11.0  11.1  10.8  10.5  10.8
 9            2   11.7  12.5  11.9  11.7  20.0  14.0  10.4  11.5  11.7  20.0
10           10   10.3  11.6  10.5  10.6  13.0  14.0  11.7  10.3  15.0  11.8
11            7   10.3  10.5  11.0  11.4   9.8  10.4  11.7  11.8  11.5  11.9
12           15   11.6  11.0  11.4  11.3  12.0  10.6  10.9  10.7  10.7  10.7
13           16   10.6  10.7  11.6  10.6  10.7  22.0  11.0  10.4  10.4  23.0
14            8    9.1  10.4  10.6  11.4  10.9  10.4  10.8  10.9  11.0  11.6
15           12   10.3  11.0  12.0  12.1  10.5  10.7  11.3  11.4  10.8  10.9
16            4   10.4  10.4  10.4  10.5  10.9  11.4   9.0   9.6   9.8  10.2
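For a robustness experiment like this one, the performance measure for each run is the within-run standard deviation (often analyzed as log s). A minimal sketch in Python, using the treatment 1 rigidity values transcribed from the table above:

```python
import math
import statistics

# Torsional rigidity for treatment 1 (columns s1 to s10 above)
run1 = [10.3, 13.0, 11.5, 11.8, 10.7, 9.9, 10.7, 11.5, 11.0, 11.1]

s = statistics.stdev(run1)   # within-run standard deviation
log_s = math.log(s)          # log(s) is a common response for the analysis
print(round(s, 2), round(log_s, 2))
```

Repeating this for all 16 runs and relating log s to the candidate levels identifies settings that reduce the within-run variation.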
CHAPTER 21
21.1 Discuss whether lessons learned are properly maintained in corporate memory in
your organization. What could be done to improve the situation?
21.2 In the paint film build example described in Chapter 19, the team found new process
settings that resulted in reduced car-to-car variation in film build. To validate the
proposed solution, 80 cars were painted over one day with the settings given in the
following table. These were the best settings found in the robustness investigation.
The film build values from five specific positions on one door for each of the cars
are available in the file paint film build validation.
Candidate       Setting
Temperature     30
a. The baseline film build standard deviation was 0.68. The problem goal was to
reduce the standard deviation to 0.35, and the robustness experiment results sug-
gested that changing settings would reduce the standard deviation to about 0.37.
Has the solution been validated?
b. What, if anything, do the validation results tell us about the home of the dominant
cause in the remaining variation?
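Part a comes down to estimating the standard deviation from the 80-car validation data and comparing it with the baseline (0.68), the goal (0.35), and the prediction (about 0.37). A sketch in Python; the film build values here are synthetic stand-ins for the file paint film build validation, and the assumed mean of 16 is hypothetical:

```python
import random
import statistics

random.seed(2)
# Synthetic stand-in: 80 cars x 5 positions of film build
# (hypothetical mean 16, sd 0.37 as predicted by the experiment)
film_build = [random.gauss(16.0, 0.37) for _ in range(80 * 5)]

sd = statistics.stdev(film_build)
print(round(sd, 2))
# Compare against baseline 0.68, goal 0.35, prediction ~0.37
```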
21.3 In the truck pull example described in Chapter 17 and Exercise 17.1, a feedfor-
ward controller was implemented to compensate for the effect of truck-frame
geometry on pull. After the feedforward system had been operating successfully for
some time, management decided to review its operation. The four frame geometry
measurements and left and right caster and camber were recorded for roughly a
month of production consisting of over 6600 trucks. The data are given in the file
truck pull validation.
a. The standard deviations for caster and camber before implementation of the feed-
forward controller can be estimated from the 100-truck investigation described in
Chapter 17. From the same investigation, the team predicted the possible reduc-
tion in standard deviation using a feedforward controller. A summary is given in
the following table.
Do the results of the investigation validate the reduction in left and right caster
variation due to the feedforward controller?
b. For each of the two caster characteristics, conduct a regression analysis to see if
the feedforward controller can be improved. Recall that the feedforward controller
should be compensating for variation in the frame geometry.
c. Repeat the analysis in parts a and b for left and right camber.
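Part b asks for a regression of caster on the frame geometry measurements; a sketch of that analysis in Python using least squares. The data here are synthetic stand-ins for the file truck pull validation, and the column meanings are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in: four frame geometry measurements per truck
geometry = rng.normal(size=(n, 4))

# Hypothetical left caster: a perfect feedforward controller would
# leave no dependence on geometry, so any large coefficient below
# indicates room to improve the controller
left_caster = 0.5 * geometry[:, 0] + rng.normal(scale=0.1, size=n)

# Least-squares fit of caster on geometry, with an intercept
X = np.column_stack([np.ones(n), geometry])
coef, *_ = np.linalg.lstsq(X, left_caster, rcond=None)
print(np.round(coef, 2))
```

A statistically significant coefficient on any geometry measurement shows residual dependence that the feedforward controller could still remove.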
Exercise Solutions
CHAPTER 2
2.1 The word variation is used in other contexts to describe a difference between a
realized and target value such as in budget variation. How does this use com-
pare to variation as discussed in Chapter 2?
We define two types of variation, an off-target component and a part-to-part compo-
nent. Budget variation only describes the off-target component of variation.
2.2 We have heard the following comment many times from manufacturing engi-
neers: “The cause of the variation is the product design—what can you
expect me to do?” Discuss the comment in light of the definition of cause in
Section 2.2.
Product design cannot be the cause of part-to-part variation because it does not
change from part to part. It is, however, possible that changes to the product design
may solve the problem, though in most cases, the process produces good parts. This
suggests improvement is possible with the existing design.
2.3 Profile A is a measure of deviation of the actual from the ideal shape of a
camshaft lobe over one region (A) of the lobe. The target value is zero and the
upper specification limit is 250 microns. Use the data in the file camshaft lobe
runout baseline to summarize the variation in this output. Do all lobes exhibit
the same variation? Is there any time pattern in the variation?
We use a combination of plots and numerical summaries. Some typical plots follow.
The histogram (and the subsequent numerical summary) suggests the full extent of
variation is 59 to 292.
[Histogram of profile A, percent scale]
Stratifying by lobe we see the average and variation in profile A is roughly the
same across all 12 lobes.
[Plot of profile A by lobe (1–12)]
We need to be careful looking at the data by time. As stored, the 1296 profile
A values come from 108 camshafts. The camshafts are given in production
sequence.
We can plot profile A for each of the 12 lobes by camshaft in order of produc-
tion. The plot shows the 12 values for each camshaft. There are no strong time
effects that influence all lobes in the same way.
[Plot of profile A by camshaft in production order]
[Run chart of profile A by index]
The observations are not ordered by time because each group of 12 values
comes from the 12 lobes on a single camshaft. In this plot, large lobe-to-lobe
effects could be misinterpreted.
H1212 Steiner ExrSol.qxd 4/1/05 10:25 AM Page CD–80
2.4 Construct histograms and run charts for output 1 and output 2 given in the data
file chapter 2 exercise 4. Find the average and standard deviation for each out-
put. Assume the target value and upper specification limit for these lower-is-
better outputs are 0 and 35.
a. Is the variation the same for each output?
b. Is the nature of the variation over time the same for each output?
[Histograms and run charts of output 1 and output 2]
a. The standard deviations and histograms are the same for both outputs.
b. Output 2 is ordered from smallest to largest over time.
2.5 You may convince yourself that the formulas for combining means and stan-
dard deviations given in Section 2.4 are true with the following numerical
demonstration you can conduct in MINITAB. Generate two columns of 100 val-
ues sampled from some model (in MINITAB: Calc Æ Random Data Æ your
choice of model and parameters, for example, Normal with mean and standard
deviation 0 and 1, respectively). Then, calculate two new data columns. Let one
column be the sum of the original two columns and the other the difference.
a. Find the standard deviation and average for each of the four columns.
b. Calculate the sum and differences of the averages for the first two columns.
How do these compare to the average of the other two columns respectively?
c. Calculate the standard deviation for the sum and difference using the
“square root of sum of squares” formula given by Equation (2.1). How do the
results compare to the standard deviations for the last two columns?
b. The sum and difference of the averages for the first two columns equal the aver-
age of the sum and difference columns, respectively.
c. For the sum (and difference) of C1 and C2 we have
In the example data, √(1.126² + 0.9145²) = 1.45, which closely matches the
standard deviation of the last two columns.
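The same demonstration can be run outside MINITAB; a minimal sketch in Python (random data, so your numbers will differ slightly):

```python
import math
import random

random.seed(1)
n = 100
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
total = [a + b for a, b in zip(x, y)]
diff = [a - b for a, b in zip(x, y)]

def stdev(v):
    m = sum(v) / len(v)
    return math.sqrt(sum((u - m) ** 2 for u in v) / (len(v) - 1))

# "Square root of sum of squares" approximation from Equation (2.1)
approx = math.sqrt(stdev(x) ** 2 + stdev(y) ** 2)
print(round(stdev(total), 3), round(stdev(diff), 3), round(approx, 3))
```

The match is close but not exact, because the two simulated columns have a small sample correlation even though the model assumes independence.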
2.6 At a project review, the team presented the following summary of their investi-
gation based on standard deviations.
Source                 Percent of overall standard deviation
Measurement system     30
Identified cause       50
Unidentified causes    81
a. The reviewing manager questioned the numbers in the second column of the
table because they did not add to 100. Is there an error? Explain.
b. By what percentage can the process standard deviation be reduced by elimi-
nating the contribution of the identified cause?
c. Is the identified cause a dominant cause?
a. There is no error. The percentages are calculated on the standard deviation scale.
Recall that standard deviations combine using the square root formula as
illustrated by Equation (2.1). The total is given by √(0.30² + 0.50² + 0.81²) ≈ 1.
b. If we eliminate the contribution of the identified cause, the remaining variation
is given by √(0.30² + 0.81²) ≈ 0.86, so the process standard deviation could be
reduced by roughly 14%.
c. No.
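The arithmetic behind parts a and b can be checked directly; a minimal sketch in Python:

```python
import math

# Component standard deviations from the team's summary
meas, identified, unidentified = 0.30, 0.50, 0.81

# Standard deviations combine by squaring, adding, and taking the root
total = math.sqrt(meas**2 + identified**2 + unidentified**2)

# Remaining variation if the identified cause is eliminated
remaining = math.sqrt(meas**2 + unidentified**2)
reduction = 1 - remaining / total   # roughly 0.13-0.14

print(round(total, 2), round(remaining, 2), round(reduction, 2))
```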
a. Use the formula for pull and the results for how averages and standard devi-
ations combine to predict the average and standard deviation for pull given
by the last row in the table indirectly from the component averages and stan-
dard deviations.
b. Suppose you had the resources to reduce the variation in one of the align-
ment angles by 50%. Which angle would you choose? By how much, approx-
imately, would the pull standard deviation be reduced?
√(0.23² × (0.224² + 0.243²) + 0.13² × (0.13² + 0.129²)) ≈ 0.08
The derived standard deviation for pull is not exactly 0.082, because the align-
ment angles do not vary independently; there is a small correlation. Note that the
averages of the components play no role in the standard deviation.
There is some rounding error, so this value does not match the pull average in the
table.
b. The largest reduction in overall standard deviation would be achieved by reduc-
ing the variation in the right caster. Reducing the variation in right caster by 50%
would reduce the variation in pull to roughly
√(0.23² × (0.224² + (0.243/2)²) + 0.13² × (0.13² + 0.129²)) ≈ 0.063
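Both calculations use the same combination formula, so a small helper makes them easy to verify; a sketch in Python (coefficients 0.23 and 0.13 from the pull formula, component standard deviations from the table):

```python
import math

def pull_sd(l_caster, r_caster, l_camber, r_camber):
    """Approximate pull standard deviation from component standard
    deviations, assuming the alignment angles vary independently."""
    return math.sqrt(0.23**2 * (l_caster**2 + r_caster**2)
                     + 0.13**2 * (l_camber**2 + r_camber**2))

print(round(pull_sd(0.224, 0.243, 0.130, 0.129), 3))      # current process
print(round(pull_sd(0.224, 0.243 / 2, 0.130, 0.129), 3))  # right caster halved
```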
CHAPTER 3
3.1 For a problem of interest to you speculate about the likely costs and feasibility
of implementing each of the possible variation reduction approaches.
The solution is dependent on the chosen problem.
3.2 Variation in the location of a drilled hole in a machined casting can cause poor
fits when the part is bolted to an engine. To reduce this variation, an engineer
considers a variety of possible approaches.
a. A vision system is available that can measure location on 100% of the parts
and reject those that it judges to be out of specification. What are the advan-
tages and disadvantages of such an approach?
b. Institute a feedback controller by measuring two parts every hour. If hole
location on either part is outside of specification, stop and adjust the process.
When is such a scheme likely to be effective?
c. A third choice is to find a dominant cause of the variation. What are the
advantages and disadvantages of this strategy?
d. If a dominant cause can be discovered, what options does the engineer have?
d. With knowledge of a dominant cause, the engineer can consider the variation
reduction approaches that require knowledge of a dominant cause, namely:
fixing the obvious, desensitizing the process, and feedforward control. Any of
the non-cause-based approaches are also still options.
CHAPTER 5
5.1 Briefly discuss the advantages and disadvantages of the following—be sure to
think of potential errors as described within the QPDAC framework.
a. To estimate the baseline performance of a grinding process, 100 consecutive
pistons were sampled and the diameters were measured.
b. To investigate a proposed change to a chemical process, the investigators
tried the change in a pilot process rather than the production process.
a. The given plan results in quick and easy data collection. However, there is a
danger that process variation will be underestimated since 100 consecutive
pistons may be more similar than 100 pistons chosen over a longer time
frame (study error). Also, 100 pistons is a relatively small sample. See
Table S6.1 in the supplement to Chapter 6 to get a better idea of the uncer-
tainty in the estimate of the process standard deviation from a small sample
size (sample error).
b. An investigation on the pilot process would be cheaper and easier than using the
production process. The main concern is whether results from the pilot process
can be scaled up to the regular process (study error).
5.2 In the camshaft lobe BC runout problem described in Chapter 1, the team
selected 50 parts (10 per day over 5 days) and measured the BC runout for
each of the 12 lobes on each camshaft to quantify the baseline. The 600 runout
measurements are stored in the file camshaft lobe runout baseline original.
Conduct an analysis of these data. Are your conclusions different from those in
Chapter 1? Why?
From the MINITAB results that follow, we see that the variation in BC runout as
measured by the standard deviation is now much larger at 31.7. However, look-
ing at the numerical or graphical summary, we see that the data contain a large
outlier. In particular, observation number 74 is 1130. A transcription error put
the decimal point in the wrong place. The value should have been 11.3. This mis-
take was readily identified and corrected before proceeding with the rest of the
analysis.
[Histogram and run chart of BC runout, showing the outlier at 1130]
5.3 To assess a measurement system used to check the diameter of an engine bore,
an investigator plans to repeatedly measure the same four (of the eight) bores on
five blocks sampled from a shift of production.
a. Discuss the advantages and disadvantages of using 10 rather than 5 blocks.
b. In the investigation, all the blocks produced over one shift were available for
study. Give two considerations that the investigators should take into
account in making the choice of available blocks.
c. The plan was to make all measurements in a single day. Discuss the advan-
tages and disadvantages of making the measurements over a longer time
period.
d. When would the investigator be better off devoting the available resources to
measuring all eight bores on fewer engine blocks?
a. Using 5 blocks rather than 10 would be cheaper and quicker. However, with
fewer blocks there is a greater chance of sample error. The performance of the
measurement system on the chosen 5 blocks may be different (better or worse)
than its performance on other blocks.
b. The investigator needs to trade off cost and convenience with the risk of study error.
The key question is whether the performance of the measurement system is likely
to be different when examining blocks from other days. It may be, for example, that
engine bores from a single day have similar dimensions and that the measurement
system works well only for some dimensions.
c. Again the investigator needs to trade off cost and convenience with the risk of study
error. Here the key question is whether the properties of the measurement system
change substantially over time. Measurement systems, like other processes, change
over time unless properly maintained. As we shall see in Chapter 7, we recom-
mend, if at all possible, that measurement investigations compare the measure-
ments over a longer time period.
d. Having fewer engines makes the logistics of the investigation easier. In the meas-
urement investigation to repeatedly measure the bores on the same block, the
team had to move the block back into the process before the gage. Measuring all
the bores on fewer blocks would be a good idea if we expected the measurement
system to be sensitive to the bore number (though if this were known, another
option would be to focus on the known problem bores). If, on the other hand, the
dimensions of all the bores on a particular block were similar, using fewer blocks
might prevent us from seeing the full range of bore dimensions in the measure-
ment investigations.
5.4 You are a manager with the responsibility to decide if you should change the
supplier for a tooling insert. You receive a report from your process engineer
who has conducted an investigation into a new insert. He gives you the follow-
ing verbal report and recommendation:
Our current insert has an average life of 1105 parts. To assess the perform-
ance of the new supplier, we asked them to supply 10 inserts. We checked
the inserts out on one of our machines last week and got an average of
1300 pieces. Since the cost is the same, I think we should switch to the new
inserts.
Using the QPDAC framework, think of five questions you would ask about
the conduct of the investigation before you might accept the recommendation.
CHAPTER 6
6.1 In Chapter 1, we described a problem in terms of the lobe geometry of
camshafts. The data are given in the file camshaft lobe runout baseline. Quan-
tify the problem baseline for the following output.
a. BC runout
b. Angle error
a. We summarize the data using a histogram and a run chart as follows. The run chart
plots the 12 lobe BC runout values for the 108 camshafts in production order.
[Histogram of BC runout and run chart of BC runout over time]
The baseline standard deviation is 6.4, and the full extent of variation is 2.6 to
33.9 microns. We see no obvious patterns over time. We can also look at run charts
for each lobe separately.
b. Looking at angle error we use the same summaries as in part a.
[Histogram and run chart of angle error]
6.2 Many programs such as Excel cannot easily handle missing observations.
MINITAB is an exception. Missing values are often stored using a special
numerical code (–99 is common). These special codes can result in much confu-
sion and lead to incorrect conclusions. Consider the data rod thickness baseline
with missing observation. In the file, there are two outputs. The output thick-
ness_–99 uses a numerical code of –99 for missing observations, while thick-
ness_missing uses the MINITAB missing observation symbol (*). Quantify the
baseline for these two outputs. Which data summaries show the missing obser-
vation and which do not?
If we forget we are using a special code for a missing observation, it is not readily
apparent in the numerical summary and the –99 inflates the standard deviation by
about 9%. The code for the missing observation is easily identified in the graphical
display if the code is extreme relative to the usual values. Using a missing observa-
tion symbol is much preferred over a special numerical code.
Variable N N* Mean Median TrMean StDev
Thickness_–99 800 0 34.426 36.000 34.832 11.981
Thickness_miss 799 1 34.593 36.000 34.861 11.018
[Histograms of thickness_–99 and thickness_missing]
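The effect of a –99 missing-value code is easy to demonstrate; a minimal sketch in Python with a few hypothetical thickness values:

```python
import statistics

# Hypothetical thickness values; one further observation is missing
observed = [36.0, 34.5, 35.2, 33.8, 36.4, 34.9]

# Storing the missing value as -99 drags the mean down and
# inflates the standard deviation
with_code = observed + [-99.0]

# Omitting the missing value (MINITAB's * symbol) leaves the
# summaries of the observed data unchanged
print(round(statistics.stdev(observed), 2))
print(round(statistics.stdev(with_code), 2))
```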
6.3 The baseline investigation for the V6 piston diameter example was described in
Chapter 5. The data are given in the file V6 piston diameter baseline. Suppose
the data were collected so that all the pistons from a given hour were collected
at the start of the hour. Now the data come in subgroups as defined by hour.
What summaries used in the baseline analysis are affected by the subgrouping?
When taking the subgrouping into account are the conclusions any different
than those derived in Chapter 5?
The subgrouping in the data affects the interpretation of the run chart. The run chart
(as given in Chapter 5) assumes the observations are equally spaced in time. Since
the data are collected so that the five observations in each subgroup are much closer
together in time than the observations from subsequent hours, a better display would
plot the diameters by hour (time).
[Plot of diameter by hour]
There appears to be no special pattern over time. The conclusions for the baseline
would be unchanged.
We summarize the baseline data numerically and using a histogram and run chart.
Variable N Mean Median TrMean StDev SE Mean
dimension 147 7.303 7.000 7.256 2.261 0.186
[Histogram and run chart of fascia dimension]
There is some evidence of an increase in the fascia dimension over the month of
the baseline investigation. This suggests the investigation was not conducted over a
long enough period of time. We need to plan a new baseline investigation.
CHAPTER 7
7.1 In a process improvement problem to improve the quality of a roof panel, the
measurement system (specially designed for the project) counted the number of
updings on each panel. To assess the measurement system, the number of
updings on 20 bad panels and 20 good panels was counted twice. The data are
given in roof panel updings measurement.
a. Can this investigation be used to assess the measurement variation of the
counting process? Explain.
b. Can this investigation be used to assess the bias of the counting process?
Explain.
c. The same operator counted all panels. Does the order in which he makes the
counts matter? It is most convenient to count the same panel twice in a row.
Is this a good idea?
d. A scatter plot of the first versus the second measurement is given as follows.
Note that some plotting symbols correspond to more than one pair of
measurements. What does the scatter plot tell you about the counting process?
[Scatter plot of second count versus first count, for good and bad panels]
e. This investigation was conducted over one hour. What are the advantages
and disadvantages of spreading the two measurements on each panel over
two days?
f. Can the counting process discriminate between good and bad panels?
a. Yes. By measuring each panel twice we can assess measurement variation. How-
ever, we may underestimate the variation if important causes of measurement
variation do not vary sufficiently during the investigation.
b. No. We cannot assess bias since we do not know the true upding count for
each panel.
c. It is best to randomize the order of the panels. The danger with measuring the
same panel twice in a row is that on the second count the operator will remember
and be influenced by the first count.
d. The points lie close to the 45° line and there is a clear separation between good
and bad panels. This suggests the measurement system is adequate.
e. Using two days for the investigation reduces the risk of study error but takes
longer.
f. Yes. See part d.
7.2 To monitor the process that produces engine blocks, piston bore diameters are
measured on every block because they are key characteristics. Each engine
block has eight bores. The bore diameter is measured at three different heights
in each bore (bottom, middle, and top) and at two different orientations at each
height. Because the measurement process is automated, there are no operators.
A measurement investigation was conducted over a day where the diameter of
every bore on four blocks was measured four times each. The main concern was
out-of-round, given by 10,000 times the difference of the two diameters at a par-
ticular height. The data are given in the file block bore diameter measurement.
From a baseline investigation the out-of-round standard deviation was 22.8.
[Plot of out-of-round by block, bore, and height]
From the baseline investigation, the team knew that the dominant cause acted within a day.
Thus, a single day was long enough for the measurement investigation.
7.3 The following MINITAB results and graphs arise from a measurement system
investigation in which two different operators measured five parts three times
each. The five parts were selected with initial measured values spread out over
the full extent of variation, 0 to 8. The data are given in the file chapter 7 exer-
cise 3. The two operators worked different shifts so the parts were saved from
one shift to the next. The results include an edited ANOVA analysis as sug-
gested in the supplement to Chapter 7 and the default gage R&R analysis in
MINITAB.
Analysis of Variance for measurement
Source DF SS MS F P
part 4 230.819 57.705 81.25 0.000
Error 25 17.754 0.710
Total 29 248.573
Gage R&R
%Contribution
Source VarComp (of VarComp)
Total Gage R&R 0.900 8.62
Repeatability 0.425 4.07
Reproducibility 0.475 4.55
operator 0.475 4.55
Part-To-Part 9.547 91.38
Total Variation 10.447 100.00
[Gage R&R (ANOVA) charts for measurement: components of variation, measurement by part, R chart by operator, X-bar chart by operator, measurement by operator, and operator*part interaction]
a. What do the given results tell us about the bias and variation of the measure-
ment system?
b. In the gage R&R results, the X-bar chart by operator is out of control. What does
this mean?
c. In the gage R&R results, why is the sum of the % study variation column
not 100%?
d. What is the discrimination ratio (D) for this system? How does the part selec-
tion procedure influence this ratio?
e. The gage R&R is about 29%, yet D is small. Why?
f. The results suggest a small operator-to-operator difference. This observed
difference may be due to a difference in method or a drift of the system over
the two shifts. How can you separate these two possibilities?
a. We estimate the standard deviation due to the measurement system as 0.84, the
pooled standard deviation from the ANOVA. This calculation corresponds to
the formula given in Section 7.2. We cannot assess measurement bias because
the true dimensions are not known. As an aside, it is always a good idea to also
plot the measured values versus part number as follows.
[Plot of measurement by part]
b. The points plotted on the X-bar chart are the average measurement for a particular
part and operator. The control limits are determined by the variation of the
measurements within the 10 part and operator combinations. If the measure-
ment system is able to distinguish among the parts, there should be many points
outside the control limits. Here, out-of-control signals indicate a good measure-
ment system.
c. The column partitions the percent variation due to the measurement system and
the parts with the overall standard deviation as the divisor. To combine standard
deviations we need to square and add. We see that 29.35² + 95.59² ≈ 100².
d. To calculate the discrimination ratio, we can estimate the overall standard
deviation from the full extent of variation as 8/6 ≈ 1.33 (we are assuming a bell-
shaped histogram in this calculation). Thus we can estimate the standard
deviation due to the process as 1.03 (= √(1.33² − 0.84²)). The corresponding
estimate for the discrimination ratio D is 1.22 (= 1.03/0.84). The selection of the
parts plays no role in the calculation of D. We assume that the variation within
each part is the same regardless of true size.
e. The denominator of the gage R&R calculation is the variation observed in the
investigation. In this case, the overall standard deviation is 2.93, much larger than
the estimated baseline variation, which decreases the R&R. The reason for the
large overall variation is the part selection procedure.
f. With the plan, there is no way to distinguish the operator and shift effects (we say
the effects are confounded). If we want to be able to separate the effects, we need
another investigation in which both operators measure parts at the same time over
several shifts.
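The discrimination ratio calculation in part d can be written out directly; a minimal sketch in Python (the small differences from 1.03 and 1.22 above come from rounding 8/6 to 1.33):

```python
import math

sd_meas = 0.84            # pooled standard deviation from the ANOVA
full_extent = 8.0         # full extent of variation, 0 to 8

# Overall standard deviation, assuming a bell-shaped histogram
sd_total = full_extent / 6

# True part-to-part (process) standard deviation
sd_process = math.sqrt(sd_total**2 - sd_meas**2)

# Discrimination ratio
D = sd_process / sd_meas
print(round(sd_process, 2), round(D, 2))
```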
7.4 To assess the variation in the system designed to measure camshaft lobe geom-
etry over time, the same camshaft was measured daily for a month. At each
measurement, a number of characteristics (for example, angle error, BC runout,
taper, and so on) on each lobe were determined. The data are given in the file
camshaft lobe runout measurement stability. Is there evidence of time-to-time
variation in this measurement system?
Checking for time-to-time variation is difficult due to the large number of output
characteristics and lobes. We plot each output versus day, stratifying by lobe number.
[Plots of radius error and BC runout versus day, stratified by lobe (1–12)]
Examining a run chart of any characteristic versus the order in the MINITAB
worksheet may be misleading since much of the observed pattern may be due to dif-
ferences between lobes.
Most of the characteristics show a pattern similar to the one observed for BC
runout, where the gage appears stable except for the results on day 17. Radius error
has unusual results for day 8.
a. To assess drift we plot the diameter for each piston over time.
[Run chart: piston diameters versus time (0–50).]
There is clear evidence of a drift over time. Note that since the pistons were cho-
sen to span the normal range of diameters, the amount of drift seen is substantial.
b. The gage was recalibrated a number of times during the investigation. However,
looking at the run chart, the regular calibration appears to have no effect.
7.6 Consider the brake rotor balance example described in the case studies. In the
measurement investigation, three rotors were specially selected: one well bal-
anced, another poorly balanced, and the final rotor requiring weight near the
specification limit of 0.5. The three rotors were measured twice by each of the
three gages on three separate days. There is no operator effect since the gages
are automated. The 54 measurements are given in brake rotor balance measure-
ment. The analysis given in the case study focuses on the measurement of the
weight needed to balance the rotor. However, the location (or orientation) of the
weight needed to move the rotor’s center of gravity is also important. Can the
measurement system consistently determine the orientation of the required bal-
ance weight? From the baseline investigation, the orientation of the weight was
roughly uniform from 0° to 360°.
With an output like orientation measured in degrees we must be careful when ana-
lyzing the data. The data are circular since 0° is the same as 360°. Plotting the data
for orientation by the three rotors, we see that for these data we can ignore the
circular nature of the output since none of the measured orientations are near 0°.
[Plot: measured orientation (degrees) by rotor (1–3); all values fall roughly between 120° and 220°.]
of rotor hardness. The team explored changing a number of settings to see if it was
possible to make the measurement system less sensitive to variation in the amount of
dirt on the casting, the dominant cause of measurement variation. Ultimately, in this
example, the approach failed (see the discussion in Chapter 16).
Feedforward Control
The use of feedforward control to improve a measurement system is difficult to
imagine. One idea is to use different gages depending on the class of part we are to
measure. This would be effective in reducing measurement variation if each meas-
urement system performed better for one of the classes of parts. For instance, sup-
pose we wish to measure the thickness of foam. Say the current measurement
system (hand calipers) worked well for narrow foam, but not very well for thick
foam. Then, to implement feedforward, when foam of different grades required
measurement we would choose the best gage for the job.
Feedback Control
We can use a feedback control scheme if a measurement system drifts. To set up the
controller, we select a reference part and measure it repeatedly in the short term to
determine the average and the standard deviation. Then to implement, we measure
the same part on a periodic basis and adjust the measurement system if the charac-
teristic value is materially different from the average. We use the short-term varia-
tion to determine if the change is material.
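The reference-part check described above can be sketched as follows. This is an illustration only; the reference values and the three-standard-deviation threshold are assumptions for the sketch, not taken from the text.

```python
def needs_adjustment(measured, ref_avg, short_term_sd, k=3.0):
    # Flag the measurement system for adjustment when the reference part
    # reads more than k short-term standard deviations from its average.
    return abs(measured - ref_avg) > k * short_term_sd

# Hypothetical reference part: average 25.0, short-term sd 0.2.
print(needs_adjustment(25.3, 25.0, 0.2))  # within limits -> False
print(needs_adjustment(26.0, 25.0, 0.2))  # drifted -> True
```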
Robustness
Maintenance on a measurement system can be thought of as the Robustness
approach. We hope to reduce the measurement variation without knowledge of the
dominant cause. Another idea to reduce measurement variation that uses the robust-
ness idea is to take two or more measurements on each part and average the obtained
results. This will reduce the measurement variation, at the expense of increased
measurement costs, if the repeated measurements are independent.
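The benefit of averaging independent repeated measurements can be seen in a small simulation (the true value and measurement standard deviation below are invented for the sketch): averaging n independent readings reduces the measurement standard deviation by a factor of √n.

```python
import numpy as np

rng = np.random.default_rng(1)
true_value = 10.0
meas_sd = 0.8  # measurement standard deviation for a single reading

# Simulate 10,000 parts measured once, and 10,000 parts measured twice
# with the two readings averaged.
singles = true_value + rng.normal(0, meas_sd, size=10_000)
pairs = true_value + rng.normal(0, meas_sd, size=(10_000, 2))
averaged = pairs.mean(axis=1)

print(singles.std())   # close to 0.8
print(averaged.std())  # close to 0.8 / sqrt(2), about 0.57
```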
Inspection
Inspection is not an option to improve a measurement process. We have no way to
determine which measurements should be rejected.
Move the Process Center
Moving the process center is trivial for a measurement process. We just add some
value to all measured values. The Move Process Center approach is appropriate to
eliminate a known and persistent measurement bias.
CHAPTER 9
9.1 Think of a process and problem you know well. Define various families of
causes.
The solution depends on the process and problem chosen.
9.2 The following plot shows the results of a process investigation aimed at finding
a dominant cause. The dashed lines give the full extent of variation in the out-
put as defined by the problem baseline. Can the input be ruled out as a domi-
nant cause of variation?
Output
Input
The simple answer is no, because we have not seen the full extent of variation.
However,
• If the full range of variation in the input is bigger than that observed, the input
may be a dominant cause.
• If the full range of variation in the input has been observed, the input is not a
dominant cause on its own but may be dominant in combination with another
input (one that has not varied over its full range of variation in the investigation).
For instance, the input/output relationship may look like the plot that follows,
where different plotting symbols are used to distinguish between two possible val-
ues for another input. In the original investigation, the level of the other input was
restricted to the values given by solid circles.
Output
Input
CHAPTER 10
10.1 The flow chart that follows shows the major steps in an assembly process to set
the wheel alignment of a truck.
[Flow chart: assembly steps from knuckle components through aligner measurement. Histogram: percent of trucks versus right camber (rcamb), 0–2.]
[Plot: rcamb, values 0.0–1.0.]
f. The following plot shows the camber variation for the first nine trucks in the
data set. What families can be ruled out as the home of a dominant cause
using these data?
[Run chart: rcamb for the first nine trucks (index 1–9), values roughly 0.4–0.8.]
g. In a special study, one key characteristic of the lower control arm was meas-
ured for 30 trucks. The other components were specially selected to ensure
that they were well within specification. Based on the plot that follows, is the
lower control arm characteristic a dominant cause of right camber variation?
Explain.
[Scatter plot: right_camber (0.3–0.6) versus lower_arm (–2 to 2).]
a. No. The measurement variation contributes to the range of values in the his-
togram but there is no way to tell from this investigation how much.
b. We need to conduct an investigation of the measurement system where right cam-
ber is measured more than once on a number of trucks.
c. If we are able to disassemble and reassemble the alignment components repeat-
edly without damage, we can assess the assembly operation. If disassembly/
reassembly does not change the right camber values much, we eliminate the
assembly operation as the home of a dominant cause. Repeated disassembly and
reassembly using the production assembly process is difficult in this application.
d. If we were able to track which trucks contained the upper control arms from the dif-
ferent suppliers, we could stratify the right camber values by control arm supplier.
If the control arm supplier was a dominant cause, the average right camber values
in the two groups would be quite different, relative to the full extent of variation.
e. We are seeing roughly the full extent of variation in the output (right camber)
within a short time frame. With this evidence we eliminate the family of causes
that acts over the long term.
f. We have not observed the full extent of variation in right camber. It is premature
to draw conclusions about the home of the dominant cause.
g. While the relationship between the lower control arm characteristic and right
camber appears strong, we have not observed the full extent of variation in the
right camber. We may ask if the range of lower control arm characteristic values
observed in the investigation is typical of the process.
10.2 Consider again the camshaft lobe runout problem introduced in Chapter 1.
Each camshaft has 12 lobes with a number of characteristics of interest. In a
search for a dominant cause, we may compare the lobe-to-lobe and camshaft-to-
camshaft families of variation. Using the problem baseline data given in the file
camshaft lobe runout baseline, explore the relative sizes of the two families for
the following characteristics and decide which family, if any, can be eliminated.
a. Profile A
b. Profile B
c. Profile C
[Multivari charts: profile A and profile B versus lobe (1–12).]
c. For profile C, the middle lobes exhibit less variation (and a lower average) than
the lobes on the ends. The dominant cause is an interaction between lobe number
and some, as yet, unidentified cause that acts in the lobe-to-lobe family. Note that
ANOVA results corresponding to this plot will not reflect the differences in vari-
ation by lobe. The ANOVA analysis focuses on differences between the profile C
averages across lobes.
[Multivari chart: profile C versus lobe (1–12).]
[Plot: dimension versus time (1–8).]
10.4 As described in Chapter 7, in a process that placed labels on bottles, the team
searched for an acceptable measurement system. The file label height measure-
ment contains the data from an investigation in which three operators using a
hand feeler gage measured three specially chosen bottles three times on two
different days. The bottles were chosen to roughly cover the range of label
height values seen in the process. From a baseline investigation an estimate of
the overall standard deviation was 0.022. The results of a one-way ANOVA are:
Analysis of Variance for height
Source DF SS MS F P
part 2 0.0413263 0.0206631 263.10 0.000
Error 51 0.0040054 0.0000785
Total 53 0.0453317
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ----+---------+---------+---------+--
1 18 0.06966 0.00965 (-*-)
2 18 0.10950 0.00930 (-*)
3 18 0.13705 0.00749 (-*)
----+---------+---------+---------+--
Pooled StDev = 0.00886 0.075 0.100 0.125 0.150
We start our analysis by calculating the heights minus the part average since we
want to look for sources of variation other than part. Plotting height minus part aver-
age versus operator and day we get the plots given as follows:
[Plots: height minus part average versus operator (1–3) and day (1–2).]
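The centering step, height minus the part average, can be sketched as follows on a hypothetical subset of the data (the values below are illustrative, not the actual contents of the label height measurement file).

```python
from collections import defaultdict

# Hypothetical records: (part, operator, height).
data = [(1, 1, 0.071), (1, 2, 0.068), (1, 3, 0.070),
        (2, 1, 0.110), (2, 2, 0.108), (2, 3, 0.111)]

# Average height for each part.
readings = defaultdict(list)
for part, _, height in data:
    readings[part].append(height)
part_avg = {p: sum(v) / len(v) for p, v in readings.items()}

# Height minus part average isolates variation from sources other than part.
centered = [(part, op, round(h - part_avg[part], 4)) for part, op, h in data]
print(centered)
```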
10.5 A process improvement problem was initiated to reduce the number of updings
on a roof panel. Updings are small outward dents in the metal surface caused by
contamination. The team discovered that the dominant cause was contamina-
tion before the forming process step. In an investigation, the team measured the
particle count on coils directly after arrival from the steel supplier and again
after blanking and stamping (before the forming process). They measured at
the tail, middle, and head of four different coils. The data are given in the file
roof panel updings variation transmission. What does the following scatter plot
tell us about the dominant cause? The plotting symbols correspond to the four
different coils.
[Scatter plot: end particle count versus start particle count, with plotting symbols for the four coils.]
Assuming the investigation has seen the full extent of variation in the output (which
we are not given in the exercise description), we conclude the dominant cause acts
in the raw blanks. The blanking, stamping, and forming process steps all transmit
but do not add variation.
10.6 In the engine block porosity example discussed in Chapter 10, the team found
the occurrence of high porosity coincided with production directly after breaks.
To explore this clue further, they conducted another investigation in which the
porosity of 50 consecutive blocks was measured. The first six blocks were taken
from directly before the lunch break, and the next 44 blocks were the first ones
produced after the break. The data are given in the file engine block porosity
run chart. What does the observed pattern in the run chart tell us about the
dominant cause?
[Run chart: subsurface porosity for the 50 consecutive blocks (index 1–50).]
All six blocks from before lunch have extremely low porosity. Immediately follow-
ing the break, the first six blocks have the highest porosity. The porosity gradually
trends downward. This pattern implies the dominant cause is some input that
changes abruptly during the breaks and then gradually returns to the original level.
The return time is greater than 44 blocks. The team concluded that pouring temper-
ature was a suspect dominant cause. The temperature of the iron decreased during
breaks because there was no external heat source in the pouring ladles. The pouring
temperature gradually increased after a break since the pouring ladles are frequently
replenished with hot iron.
10.7 High silicon concentration in cast iron is undesirable as it was found to be a
dominant cause of fluidity variation. However, measuring the silicon level can be
difficult. The measurement process consisted of sampling the molten iron by
pouring sample coins for testing. The coins are then machined and polished
before being spectrochemically analyzed. The full extent of variation in percent
silicon as measured in the current process was 1 to 4%. The measurement sys-
tem was investigated by repeatedly measuring three different coins that roughly
covered the full extent of variation in the observed percent silicon. Two operators
measured each of the three coins three times on each of two days. The data are
given in the file iron silicon concentration measurement. Analysis of the meas-
urement results estimated the measurement standard deviation as 0.33. The
corresponding estimate of the process standard deviation was 0.5; thus the dis-
crimination is too small at around 1.5. The team needs to improve the measure-
ment system. Using the existing measurement investigation data, are there any
clues about the dominant cause of the measurement variation?
We start our analysis by calculating the silicon concentration minus the average for
each coin since we want to look for sources of measurement variation other than dif-
ferences from coin to coin. Some plots of the data are given as follows.
[Plots: silicon concentration minus coin average versus time (1–2) and operator (1–2).]
We see a small difference between the two operators, but no dominant cause. The
dominant cause is acting over the short term. Using other investigations, the team
was unable to determine the dominant cause of measurement variation. In Chapter
19, we describe a robustness investigation used to look for changes in this measure-
ment process that would result in less variation.
CHAPTER 11
11.1 In a multivari investigation, two consecutive pieces are selected from each of
three pallets once per hour. Sketch the appearance of the multivari chart that
shows all three families at the same time if a dominant cause lies in the follow-
ing family. Use the following multivari chart template in which the dashed lines
indicate the full extent of variation.
a. Pallet-to-pallet family
b. Part-to-part family
c. Hour-to-hour family
d. An interaction between the part-to-part and pallet-to-pallet families
[Multivari chart template: output versus hour (1–4), with plotting symbols for pallets 1–3; dashed lines at roughly 8 and 35 indicate the full extent of variation.]
We give examples of the possible multivari charts, where the output name matches
the question part. We suppose the full extent of variation is 8 to 35 units.
[Multivari charts: outputs A through D versus hour (1–4), with plotting symbols for pallets 1–3, illustrating dominant causes in the pallet-to-pallet, part-to-part, and hour-to-hour families and a part-by-pallet interaction.]
Note in the multivari chart for output D, the output has the most variation for
pallet 3.
11.2 In the engine block leakers example, introduced in Chapter 1, the baseline
defect rate was 2–3%. The team conducted a multivari investigation where
three consecutive blocks were taken at twelve different times throughout the
day. The investigation continued for three production days giving a total of 108
castings. Each block was tested for leaks. The data are given in the file engine
block leaks multivari. What can you conclude?
[Multivari chart: leaker (0 or 1) versus day (1–3), with plotting symbols for times 1–12.]
There is only a single leaker in the 108 blocks. This was not surprising consider-
ing the low baseline leak rate. This plot does not provide much information. In gen-
eral, multivari investigations are not effective for binary output characteristics
unless large sample sizes are used and the proportion defective is plotted.
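To see why binary outputs need large samples, consider the standard error of an observed proportion: with the 2–3% baseline leak rate, a handful of blocks per group tells us almost nothing. A short sketch:

```python
import math

def proportion_se(p, n):
    # Standard error of an observed proportion from n binary outcomes.
    return math.sqrt(p * (1 - p) / n)

# With only 3 blocks per time point and a roughly 2.5% leak rate, the
# standard error swamps the proportion itself; hundreds of parts per
# group are needed before group-to-group differences become visible.
for n in (3, 36, 500):
    print(n, round(proportion_se(0.025, n), 3))
```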
a. What conclusion can you draw from the multivari charts that follow?
[Multivari charts: diameter versus hour (1–5) and versus position (1–4).]
b. Using the data assess whether the dominant cause acts in the shaft-to-shaft
family.
a. From the two multivari charts, we see that a dominant cause acts in the hour-to-
hour family.
b. To assess the shaft-to-shaft family we define a new variate group to uniquely
identify the 20 groups of three shafts that are consecutive. The multivari chart
using group (= [hour – 1] × 4 + position) is:
[Multivari chart: diameter versus group (1–20).]
The variation from shaft to shaft inside each group is small, so the shaft-to-shaft
family does not contain a dominant cause.
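The group variate used above can be computed directly; hours run 1–5 and positions 1–4 in this investigation.

```python
def group(hour, position):
    # Unique index for each group of three consecutive shafts:
    # hours 1..5 and positions 1..4 map onto groups 1..20.
    return (hour - 1) * 4 + position

print(group(1, 1), group(1, 4), group(2, 1), group(5, 4))  # 1 4 5 20
```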
11.5 In the production of engine blocks, bore diameters are key characteristics. Bore
diameter is measured at three heights and two orientations in each of the eight
bores in each block. The team used Statistical Engineering to address a problem
[Scatter plots: yfinal versus y250, y250 versus y200, y200 versus y140, and y140 versus y110 (diameters measured after successive operations).]
Starting from the end of the process, we see that little variation is added between
operation 250 and the final diameter measurement, and also little variation is added
between operations 200 and 250. Based on the plot in the lower left hand side, the
dominant source of variation acts between operations 140 and 200 (including possi-
bly operation 200 itself).
11.6 In the paint film build example described in Chapter 3, a baseline investigation
found the standard deviation was 0.315, with an average of 16.2 thousandths of
an inch. The full extent of variation was 15.2 to 18.5. To search for a dominant
cause, the team conducted a multivari investigation where they measured the
film build at five positions on five cars in a row every hour for two shifts (16
hours). This resulted in a total of 400 film build measurements. The data are
given in the file paint film build multivari. Based on the plots that follow, what
conclusions can you draw? We define group as (hour – 1) × 5 + position.
[Plots: histogram of film build (15–19); multivari charts of film build versus position (1–5), hour (1–16), and group (1–80).]
The histogram of the multivari data covers more than the full extent of variation, so
we know the dominant cause has acted. From the multivari charts, we conclude that
the dominant cause acts in the hour-to-hour family. We can use a one-way ANOVA
to estimate the within-hour standard deviation as 0.47, much less than the baseline
standard deviation 0.68. In other words, if we could make the average thickness the
same at each hour, we could reduce the variation substantially.
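The within-hour estimate quoted above comes from the "Error" line of a one-way ANOVA by hour. The decomposition can be sketched on simulated stand-in data (the hour-to-hour spread of 0.5 below is invented; the real analysis uses the paint film build multivari file):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated film build: 16 hours, 25 readings per hour, with drifting
# hour averages (the dominant cause) on top of within-hour noise of 0.47.
hour_means = 16.2 + rng.normal(0, 0.5, size=16)
data = {h: hour_means[h] + rng.normal(0, 0.47, size=25) for h in range(16)}

# Pooled within-hour standard deviation (the ANOVA "Error" line).
ss_within = sum(((y - y.mean()) ** 2).sum() for y in data.values())
df_within = sum(len(y) - 1 for y in data.values())
pooled_sd = np.sqrt(ss_within / df_within)

all_values = np.concatenate(list(data.values()))
print(pooled_sd, all_values.std(ddof=1))  # within-hour sd < overall sd
```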
11.7 A team wanted to reduce the number of updings on a roof panel. Updings are
small outward dents in the metal surface caused by contamination. A baseline
Roof panel
7 8 9
4 5 6
1 2 3
Front
a. The multivari data covers the full extent of updings variation. The multivari charts
suggest a dominant cause acts in the pallet-to-pallet family of causes. There is no
section-to-section effect and no evidence of an interaction between the pallet and
section families.
[Multivari charts: updings versus pallet (1–7) and versus section (1–3); updings versus pallet with plotting symbols for section; histogram of updings.]
b. Plotting the number of updings by location using the concentration diagram does
not show any clear patterns. We can eliminate the location-to-location family.
Roof panel (upding counts by section, front edge at bottom):
Section 7: 249  Section 8: 240  Section 9: 304
Section 4: 220  Section 5: 194  Section 6: 246
Section 1: 204  Section 2: 175  Section 3: 158
11.8 The baseline investigation for the sand core example discussed in Chapter 1
involved taking five samples over a single day of five consecutive shots of four
cavities each. The data are given in sand core strength baseline. What conclu-
sions can you draw?
Multivari charts using one input at a time follow. Since we are analyzing the base-
line data, the core strength values by definition cover the full extent of variation.
We see no strong cavity, time, or shot effects. However, there is much greater vari-
ation at time 4 than the other times. The core-to-core family is a large source of
variation only at time 4.
[Multivari charts: core strength versus time (1–5), cavity (1–4), and shot (1–5).]
Exploring the data further, we make multivari plots by time together with shot
and cavity in turn. As there is no clear pattern, the shot and cavity families can be
eliminated.
[Multivari charts: core strength versus time (1–5), with plotting symbols for shot (1–5) and, separately, for cavity (1–4).]
We need to look for a dominant cause consistent with the extra core-to-core vari-
ation at time 4.
CHAPTER 12
12.1 Vehicle plant and warranty complaints for loose spark plug wires at the spark
plug end prompted an improvement project. As a result of several investiga-
tions, the family of causes related to push forces on the wires was the home of a
dominant cause. A further investigation then compared eight loose and eight
good connections. For each of the 16 connections, the team measured the termi-
nal position of wires and terminal runout of the spark plug in millimeters. The
data are given in the file spark plug connection comparison. What do the data
tell us about the dominant cause?
Plotting the terminal runout and wire position by connection quality, as given, sug-
gests wire position is a dominant cause. Note that there is one value in each group
where wire position does not explain the connection quality. It is possible there is
also a secondary cause. Before proceeding the team should verify that wire position
is a dominant cause.
[Plots: terminal runout (0.55–0.85) and wire position (18–21) stratified by connection quality.]
12.2 A sunroof installation process suffered from a 90% rework rate due to a lack of
flushness. Flushness is defined as the difference in height between the sunroof
seal and the metal roof. It is measured using digital calipers at six points (three
at the front and three at the back). A baseline investigation showed that flush-
ness problems were most pronounced at the two front corners with full extent of
variation –3.5 to 4 mm and standard deviation 1.25 millimeters. A goal of
reducing the front corner flushness variation to 0.5 and a range of –2 to 1 mil-
limeters was established. Based on engineering knowledge, the team felt that
only two types of characteristics could lead to flushness variation, namely roof
crown height and attachment pad height. When the roof is adapted to allow
installation of a sunroof, six installation pads are added. Based on this knowl-
edge, the team selected six vehicles with large positive flushness and six vehicles
with large negative flushness on both front corners. The sunroof modules were
removed and the six attachment pad heights and roof crown height were meas-
ured at the front and back. The data are given in the file sunroof flushness
input-output. What conclusions can you draw?
To start the analysis, we fit a multiple regression for both left front flushness and
right front flushness with all the eight inputs. The residual variation for the two
regression models was 0.35 (left) and 0.32 (right). Since both the residual variations
are less than the target of reducing flushness variation to 0.5, the results appear
promising. Plotting the data shows that a dominant cause of the right front flushness
variation is the right front pad height. Similarly, a dominant cause of the left front
flushness variation is the left front pad height.
[Scatter plots: right front flushness versus right front pad height, and left front flushness versus left front pad height.]
The regression results given for left front flushness show that by eliminating the
dominant cause we could reduce the flushness standard deviation to 0.30 for the
left side and 0.35 for the right side.
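A regression fit of this kind can be sketched with simulated stand-in data (the coefficient and noise level below are invented; the real analysis uses the sunroof flushness input-output file):

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for the sunroof data: 12 vehicles, 8 input characteristics,
# with the first input (say, front pad height) as the dominant cause.
n, p = 12, 8
X = rng.normal(0, 1, size=(n, p))
flushness = 1.2 * X[:, 0] + rng.normal(0, 0.3, size=n)

# Least-squares fit with an intercept; the residual standard deviation
# estimates the flushness variation that would remain if the modeled
# inputs were held fixed.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, flushness, rcond=None)
resid = flushness - A @ coef
resid_sd = np.sqrt((resid ** 2).sum() / (n - p - 1))
print(resid_sd)
```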
12.3 An example related to sand defects in manifolds was discussed in Chapter 12.
Before the problem mentioned in Chapter 12, the team carefully monitored the
process for a shift. Each of 970 manifolds was classified as either scrap (due to
sand issues) or not scrap. In addition many inputs relating to each manifold,
including some discrete inputs such as mold number and continuous inputs
such as pour time were recorded. In some cases the linkages were difficult to
establish, and the team did the best they could. The data are given in the file
manifold sand scrap comparison. What conclusions can you draw?
There are no clear differences in the distributions of any of the input values when we
stratify by scrap or not scrap. The box plots that follow are typical.
[Box plots: pour time, permeability, and percent LOI stratified by scrap (0 = not scrap, 1 = scrap).]
The team concluded that linking input values to individual manifolds was too dif-
ficult. Many of the inputs are only measured a few times each shift and there is great
uncertainty about the time linkages between the condition of each manifold and the
measured input values. The team next conducted the investigation described in
Chapter 12.
CHAPTER 13
13.1 In a verification experiment there were two suspects at two levels. The low and
high levels for each suspect were chosen based on the extremes from historical
variation. The results of the first three runs of the experiment are shown in the
following table.
Suspect 1   Suspect 2   Run order   Output
Low         Low         2           13
Low         High        3           16
High        Low         1           17
High        High        4           ?
Given that the full extent of output variation is 12 to 30, what conclusions
can you draw about the dominant cause?
Suspects A and B alone are not dominant causes. However, without making the last
run, we cannot tell if there is a dominant cause that involves both inputs. If the run
with both A and B at the high level gives an output value near 30, the dominant cause
involves both inputs. If the last run gives an output value near 17, then neither sus-
pect is a dominant cause.
13.2 In the engine block porosity example discussed in the text and exercises of
Chapter 10, a dominant cause of porosity acted immediately following sched-
uled breaks in production. Based on this evidence, the team identified two sus-
pects: iron pouring temperature and the addition of ladle wash. During work
stoppages, iron that remained in the six pouring ladles cooled off because there
was no external heat source. At the start of the break, ladle wash was added to
the ladles to protect the refractory (surface). The team could not easily manip-
ulate the pouring temperature, but they could change the amount of ladle wash.
They conducted a verification experiment in which they added the normal
amount of wash to ladles 1, 3, and 5 and half the normal amount to the other
three ladles over two lunch breaks. At each break, they measured the porosity
of the first 30 blocks poured (five from each ladle). The data are given in the file
engine block porosity verification.
a. What have we learned about the identity of the dominant cause of porosity?
b. Explain how the effects of ladle number and the presence or absence of ladle
wash are confounded. Does this matter?
a. We use a box plot to stratify the porosity by the amount of ladle wash. We also
plot porosity against block number for the two levels of ladle wash.
[Plots: porosity by amount of ladle wash (full versus half), and porosity versus block number for the two wash levels.]
Ladle wash is not a dominant cause of porosity. Because the porosity decreased
as the block number increased, the team concluded that pouring temperature was
a dominant cause. They accepted the risk that some other unknown cause, con-
founded with pouring temperature, was actually responsible for the change in
porosity.
b. In the experiment, the effects of ladle number and the presence or absence of
ladle wash are confounded, since ladles 1, 3, and 5 always had full ladle wash and
ladles 2, 4, and 6 always had half ladle wash. As shown in the following plot, the
porosity scores were consistent across the odd- and even-numbered ladles. Based
on engineering knowledge, the team believed there was no other cause that
matched the change in the amount of ladle wash. Thus, in this case the confound-
ing does not limit the conclusions in any meaningful way.
[Plot: porosity by ladle number (1–6).]
c. Observing the process before breaks was not necessary because the team was try-
ing to determine if wash or temperature was the dominant cause of the porosity.
These suspects changed at, or after, the break.
[Plots: bond strength by treatment combination (1–8), and a Pareto chart of effects for A (clearance), B (adhesive), C (cure time), and their interactions, with A by far the largest effect.]
We see that input A (clearance) has by far the largest effect. A plot of bond
strength by clearance (input A), given as follows, clearly shows that clearance is a
dominant cause of bond strength variation, and that low clearance gives higher aver-
age bond strength.
[Plot: bond strength by clearance level (–1, 1).]
13.4 Steering knuckles are produced in a gray iron casting process. Around 2% of
castings were scrapped because the percent nodularity was too small. In this
example the team did not clearly establish a problem baseline. The team
thought the cause must be related to the inoculation of the molten iron using a
silicon-based alloy. The inoculant was added as the iron was poured to increase
nodularity (and thus casting strength). Based on observing the process, the
team noticed that the amount of inoculant added by the automated delivery
system seemed to vary. The desired amount of inoculant was obtained by slowly
shaking the inoculant onto a plate. The plate was designed to tip automatically
when the required weight of inoculant was present. The team saw that the loca-
tion on the plate where the inoculant fell varied, and they thought that this
might influence when the plate tipped and thus how much inoculant was deliv-
ered. The team decided to verify inoculant amount as the dominant cause of
nodularity variation. In the verification experiment, they produced a total of 20
castings at each of two levels of inoculant amount, 12.3 and 13.5 grams. For the
experiment the inoculant was carefully weighed and added by hand. The exper-
iment consisted of eight runs of five castings each. The order of the eight runs
(four at each level) was randomized. For each of the 40 castings the percent
nodularity was determined. The data are given in the file steering knuckle
strength verification and are summarized in the table that follows:
a. What considerations should the team have used in choosing the two levels
for inoculant?
b. Why was randomizing the order of the runs important?
c. Has the team verified the dominant cause of nodularity variation?
a. The team chose the levels to cover the range of inoculant values seen in regular
production. Over a single shift, the team determined the range 12.3 to 13.5
grams by weighing the amount of inoculant obtained using the automatic deliv-
ery system on a separate scale.
b. The team chose an experiment with four replicates of each level of the suspect and
five repeats for each run. Here, since the team had no information about the time
family that contains the dominant cause, the randomization is extremely impor-
tant. There is a danger that some unknown input may change in the same way as
the suspect. Four replicates may not be sufficient. The repeats do not help.
c. Plotting the percent nodularity by amount of inoculant gives:
[Figure: plot of percent nodularity by amount of inoculant]
Because there was no baseline, we cannot tell if we have seen the full extent of
nodularity variation in the experiment. However, since all of the castings with per-
cent nodularity less than 80 would be scrapped, the team concluded that the
amount of inoculant was the dominant cause. An obvious fix was to replace the
plate with a cone. With the plate, not all the inoculant collected at the center. Using
a cone, all the inoculant collects at the bottom, and the cone does not tip too early.
CHAPTER 14
14.1 In the camshaft lobe runout example, the team searched for a dominant cause
of variation. As discussed in Chapter 10, they conducted a variation transmis-
sion investigation where runout was measured directly before heat treatment
and after the final step of the process, on the same 32 parts selected over the
course of one day. In the investigation the grinder (one of eight) and heat treat-
ment spindles (one of four) used were also recorded. The data are given in the
file camshaft lobe runout variation transmission. They found that a dominant
cause of variation was the BC runout just after heat treatment and, more specif-
ically, as shown in the plot that follows, that heat treatment spindle was a dom-
inant cause.
[Figure: plot of final BC runout by heat treatment spindle]
In this example, the team decided not to reformulate the problem but to look
for a more specific cause.
a. Discuss the advantages and disadvantages of the decision not to reformulate.
b. Suppose the team had reformulated the problem based on heat treatment
spindle and that the original goal was to reduce the final runout standard
deviation to less than 4.5. Using the results from a one-way ANOVA model
based on heat treatment spindles, derive a goal for the new problem based on
differences among spindle averages.
a. The advantages of not reformulating the problem are mainly that a new baseline
for the runout after heat treatment (that is, the dominant cause, or the output in
the new problem) does not need to be established, and we need not determine a
goal for the new problem. The main disadvantage of not reformulating is that we
still need to measure the final runout in future investigations. Had we reformu-
lated, we would use the runout after heat treatment. In this case, it was not
cheaper or easier to measure runout after heat treatment.
b. We use a one-way ANOVA model to assess the possible reduction in runout if we
could align all the heat treatment spindles perfectly. The appropriate ANOVA
results are:
Analysis of Variance for final BC
Source DF SS MS F P
heat treatment 3 894.7 298.2 17.53 0.000
Error 28 476.3 17.0
Total 31 1371.0
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ----------+---------+---------+------
1 8 15.538 2.819 (----*----)
2 8 16.688 5.491 (----*----)
3 8 28.900 2.751 (----*----)
4 8 18.675 4.729 (----*----)
----------+---------+---------+------
Pooled StDev = 4.124 18.0 24.0 30.0
By perfectly aligning the heat treatment averages, we predict the standard devia-
tion could be reduced to 4.1. Hence, to meet the goal we need to ensure that the heat
treatment averages are close to equal.
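The predicted 4.1 is the pooled (within-spindle) standard deviation, that is, the square root of the error mean square in the ANOVA table above. Checking the arithmetic in Python:

```python
import math

# Error sum of squares and degrees of freedom from the one-way ANOVA table.
ss_error = 476.3
df_error = 28

pooled_sd = math.sqrt(ss_error / df_error)  # square root of the error mean square
print(round(pooled_sd, 3))  # 4.124, matching the pooled StDev in the output
```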
14.2 In Chapter 11, the team found that the piston diameter directly after operation
270 was a dominant cause of final V6 piston diameter variation. The relation-
ship is illustrated in the scatter plot that follows. The data are given in the file
V6 piston diameter variation transmission.
[Figure: scatter plot of final diameter by diameter after Operation 270]
The team decided to look further for a more specific dominant cause. Sup-
pose, however, they had wanted to reformulate the problem in terms of the oper-
ation 270 diameter. Determine an appropriate goal for the reformulated
problem. Recall that the goal for the original problem was to reduce the final
diameter standard deviation to less than 2.0.
Using the regression model relating final diameter to the Operation 270 diameter, to meet the goal of reducing the final diameter variation to less than 2.0 we solve to get stdev(OP270 diameter) < 1.58. A reasonable goal for the reformulated problem is therefore to reduce the diameter variation at operation 270 to less than 1.6.
Note that in the variation transmission investigation, the diameter variation at OP270
was 3.4, so the goal requires more than a 50% reduction in diameter variation.
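The arithmetic behind such a goal follows from the regression relationship stdev(final)^2 = slope^2 * stdev(OP270)^2 + stdev(residual)^2. A hedged sketch in Python; the slope and residual standard deviation shown are illustrative placeholders, not the fitted values from the investigation:

```python
import math

def op270_goal(final_goal, slope, resid_sd):
    """Solve stdev(final)^2 = slope^2 * stdev(OP270)^2 + resid_sd^2
    for the largest stdev(OP270) that still meets the final-diameter goal."""
    return math.sqrt(final_goal**2 - resid_sd**2) / abs(slope)

# Illustrative values only; the actual slope and residual standard deviation
# come from the regression fitted to the variation transmission data.
goal = op270_goal(2.0, 1.0, 1.2)
```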
CHAPTER 15
15.1 Based on customer complaints about assembly difficulty, a team investigated
fascia dimension variation. A baseline investigation found that some fascias
were too large. The team felt that reducing the average size of the fascias could
solve the problem (that is, they adopted the Move the Center approach). They
planned a (full) factorial experiment with two candidates, cycle time and cure
time, each at two levels to look for an adjuster. They chose the levels for each
candidate based on engineering judgment. The results of the experiment are
given in the file fascia dimension move center and in the following table. For
each treatment, the team conducted four runs producing 10 fascias for each
run. The order of the 16 runs was randomized over a day. In the data, we give
only the average fascia dimension from each run and not the individual values.
a. We first plot dimension by treatment. We see that there are differences between
the average dimensions for different treatments. In particular, treatment 3, with
low cycle time and high cure time, leads to smaller average dimensions.
[Figure: plot of average dimension by treatment]
Next, we fit a full model and look at the Pareto plot of the effects.
[Figure: Pareto plot of the effects of cycle time (A), cure time (B), and their interaction]
Both cycle time and cure time have large effects; hence, both are potential
adjusters. The interaction between these two candidates is small. From the fol-
lowing main effect plots, we see that lowering cycle time and increasing cure
time will reduce the average dimension.
[Figure: main effects plots of average dimension for cycle time and cure time]
15.2 An experiment was carried out to investigate four candidates to search for an
adjuster of the formability safety margin of galvanized sheet metal trunk lids.
The purpose was to increase the average safety margin from the baseline value
of 10.7. In the experiment, each candidate was tested at two levels, selected to
be near the edge of what was physically possible—see the table that follows.
Note that none of the treatments corresponded to the existing process settings.
[Table: experimental plan with columns for treatment, run order, tonnage, lubrication, blank size, prebending, and safety margin]
Press tonnage was very difficult to change so all eight runs with low press
tonnage were carried out first. Within each group of eight runs, the order was
randomized. The data are given in the file sheet metal move center.
a. Analyze the experimental data to see if any of the candidates is an adjuster.
b. Does the restriction on randomization required for this experiment make any
difference to the conclusions we can draw?
a. We start the analysis by looking at a plot of the safety margin by treatment. In this
case, since there is only a single output for each treatment, it is difficult to draw
conclusions from the plot. We do see that there are treatments with very different
values for the safety margin.
[Figure: plot of safety margin by treatment combination]
Next, we fit a full model and examine a Pareto plot of the effects.
[Figure: Pareto plot of the effects of A (tonnage), B (lubrication), C (blank size), D (prebending) and their interactions]
D has a large main effect and AB a large interaction effect. We follow up with
plots of safety margin by prebending (input D) and an interaction plot of tonnage
by lubrication.
[Figure: plot of safety margin by prebending, and interaction plot of tonnage (310, 375) by lubrication]
We see that prebending leads to higher safety margin on average and that the
safety margin is much more sensitive to the presence or absence of lubrication at
the high level of tonnage than at the low level. The average safety margin for the
two runs with prebending, low press tonnage, and lubrication was (18 + 16)/2 = 17,
considerably higher than the baseline average 10.7. This finding needs to be con-
firmed. See the discussion in part b.
b. Without the randomization it is more likely that the effect attributed to press ton-
nage is due to some other (unknown) input. This could occur if the level of the
unknown cause happened to change at (or near) the moment in time when press
tonnage was changed. When interpreting the experimental results we cannot tell
if this occurred, or not.
15.3 In the sand core strength example introduced in Chapter 1, too many cores
were breaking during handling. A suggested solution was to increase the core
strength (and thereby reduce core breakage) by increasing the resin concentra-
tion. It was known that increasing the resin would result in a stronger core.
However, the precise nature of the relationship—that is, how much the core
strength increases for a given change in resin concentration—was not known.
An experimental investigation was planned to quantify the relationship. Three
levels of resin concentration (1.3, 1.6, 1.9% by weight) were chosen based on
engineering judgment. In the experiment, 40 cores for each level of resin were
produced; 15 were measured for strength (using a destructive test) and the
remaining 25 were processed to look for casting problems. The experiment con-
sisted of three runs with 15 repeats. The order of the runs was not randomized.
The data are given in the file sand core strength move center.
a. What can you conclude about the relationship between resin concentration
and core strength?
b. The team used only three runs with 15 repeats for each run. Discuss the
advantages and disadvantages of this plan compared with using five repli-
cates for each treatment with three repeats each.
[Figure: plot of core strength by resin concentration]
a. We expect, on average, a 0.86 unit increase in core strength for each 0.1
increase in the percent resin. The team had quantified the effect of resin concen-
tration. At the 2% level of resin, one of the 25 cores led to a defective casting, so
the team knew they should not raise the resin concentration to this level. As a
result, although the team had now quantified how to move the process center, the
approach was abandoned because of the fear of increased scrap due to core-
related defects.
b. Using five replicates for each treatment would allow randomization of the order.
This would help protect against some unknown input changing in the same way
as resin in the experiment. A disadvantage is that it may be more difficult and
expensive to change the resin level more often.
CHAPTER 16
16.1 In a sonic welding operation, problems arose due to poor weld strength, meas-
ured as pull-off force. The goal was to reduce the variation and increase the
average pull-off force. The second goal is not addressed here. From the baseline,
the full extent of variation for pull-off force was 0.9 to 3.0. The team discovered
that the dominant cause acted in the time-to-time family. While they could not
be more specific, the team felt that the dominant cause was related to material
hardness, which was outside their control. They decided to try to desensitize the
process to variation in the dominant cause.
The team planned a fractional factorial experiment with four candidates at
two levels each in eight treatments. Using the results of regular process moni-
toring, they identified three time periods when weld strength was low, medium,
and high relative to the baseline. In each period, they randomized the order and
then produced a part with each of the eight treatments. The pull-off force data
and plan are given in the file sonic weld desensitization and the table that fol-
lows. The three values in the columns Order and Pull-off force correspond to
the three different time periods. The original settings of the candidates corre-
spond to treatment 2.
a. Explain why the team believed the dominant cause acted over the three runs
for each treatment.
b. What levels of the candidates do you recommend to reduce the variation in
pull-off force?
c. Another way to assess the results of this experiment is to summarize the out-
put across each treatment using log standard deviation. Using this perform-
ance measure, do your conclusions differ from part b?
a. The team knew that the dominant cause acted time to time. Since they selected
periods with low, medium, and high weld strength under the current conditions,
they were confident that the dominant cause acted over the three periods. In the
experiment, this assumption was verified since the range of pull-off force values
for treatment 2, the current process settings, covered the full extent of variation.
b. To start the analysis we look at a plot of the pull-off force by treatment.
[Figure: plot of pull-off force by treatment, with separate symbols for the three time periods]
There are several promising treatments with little variation in pull-off force.
To desensitize the process we are looking for a special kind of interaction
between time (a surrogate for the dominant cause) and the candidates. The inter-
action plots for time versus the candidates (using the average pull-off force as the
response) are
[Figure: interaction plots of average pull-off force by time period for candidates A, B, C, and D]
We see that changing the level of candidate C flattens the relationship between
the pull-off force and time. More specifically, with the low level of candidate C,
the time-to-time variation in pull-off force is much reduced. Changing the other
candidates does not help. We recommend changing to the low level of candidate
C, and leaving the remaining candidates at their current levels.
c. Fitting a full model in the four candidates and analyzing the results using the per-
formance measure log(s) gives the Pareto effects plot that follows. The plot indi-
cates that candidate C has the largest effect.
[Figure: Pareto plot of the effects of candidates A, B, C, and D on log(s)]
From the main effect plot that follows, we see that the low level of C reduces
the variation in pull-off force. This is the same conclusion as in part b. In most
cases the conclusions obtained with the two analysis methods will be the same.
We prefer the analysis that looks directly at the candidate by cause interaction
plots rather than the analysis based on log(s).
[Figure: main effect plot of log(s) for candidate C]
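As a hedged illustration of the log(s) performance measure, with made-up pull-off force values for two treatments (the real values are in the file sonic weld desensitization):

```python
import math
import statistics

# Hypothetical pull-off forces over the three time periods for two
# treatments; the real data are in the sonic weld desensitization file.
pull_off = {
    2: [1.0, 2.1, 2.9],  # current settings: large time-to-time variation
    5: [2.0, 2.2, 2.1],  # a promising treatment: little variation
}

# log(s): log of the within-treatment standard deviation across the periods.
log_s = {t: math.log10(statistics.stdev(y)) for t, y in pull_off.items()}
```

A treatment with a smaller log(s) is less sensitive to the time-to-time cause.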
16.2 In the crossbar dimension example discussed in Chapter 12, the team found
that the dominant cause of dimension variation was barrel temperature.
Because it was hard to control in regular operation, the team decided to try to
make the process less sensitive to barrel temperature variation. In the current
process, barrel temperatures ranged over roughly 4°C. The team planned a half
fraction factorial experiment with three candidates—target barrel temperature,
injection pressure, and material—at two levels each, as shown in the following
table. The current injection pressure and target barrel temperature were 1000
and 75, respectively. Note that although the variation in barrel temperature was
the dominant cause, the target barrel temperature is a fixed input. Five cross-
bars were produced and measured in each run. For each treatment, there were
two runs, one at the target barrel temperature plus 2°C and the other at the tar-
get barrel temperature minus 2°C. The data are given in the file crossbar dimen-
sion desensitization and in the table as follows.
[Table: experimental plan with columns for treatment, target barrel temperature, injection pressure, material, and the dimensions at barrel temperature -2°C and +2°C]
Since the average dimension can be easily adjusted, we focus the analysis on
finding a way to make the process less sensitive to barrel temperature variation.
a. What levels of the candidates do you recommend?
b. Injection pressure and material were chosen as candidates based on engineer-
ing judgment. Looking again at the results presented in Chapter 12, what
motivates the choice of target barrel temperature as a possible candidate?
a. [Figure: plot of dimension by treatment, with separate symbols for the low and high barrel temperatures]
We examine the nature of the interaction between the candidates and the
dominant cause with the following plots (where the response is the average
dimension):
[Figure: interaction plots of average dimension by barrel temperature deviation (low/high) for target barrel temperature (75, 79), injection pressure, and material]
If we increase the target barrel temperature and, to a lesser degree, use the new
material, we see that the relationship between crossbar dimension and barrel tem-
perature is flatter. In other words, dimension is less sensitive to barrel tempera-
ture variation at the higher target barrel temperature. Changing the injection
pressure has no effect. The benefits of changing both the target barrel tempera-
ture and the material were later validated with another investigation. Note that the
observed variation in treatment 3 has likely overestimated the variation when
actually running the process at these settings, since we have forced barrel temper-
ature to near its expected extremes with the new target temperature.
b. The nonlinear relationship between barrel temperature and dimension is shown
in the left panel of Figure 12.3. In the plot, it appears there is less dimension vari-
ation for high than for low barrel temperatures.
16.3 In Chapter 16, we describe a desensitization experiment for the refrigerator frost
buildup example where each refrigerator is subjected to only two extreme levels of
environmental causes. Here we consider a hypothetical experiment in which each
refrigerator is exposed to a number of environmental conditions to ensure that any
chosen new design works well under any conditions, not just extreme conditions.
The experimental design for the four candidates—D1, D2, D3, and D4—is
the same as in Chapter 16. Here we plan to test each of the eight refrigerators
(treatments) under all eight possible combinations of the usage or environmen-
tal inputs as given in the following table:
                           Cause combination
Varying input              1   2   3   4   5   6   7   8
Ambient temperature (°C)   26  26  26  26  32  32  32  32
Relative humidity (%)      70  70  90  90  70  70  90  90
Door openings per hour     4   8   4   8   4   8   4   8
The experimental plan had 64 runs. To conduct the experiment, all eight
refrigerators were simultaneously placed in a test chamber and exposed to each
cause combination in the given order. The cooling plate temperatures are given
in the file refrigerator frost buildup desensitization2 and in the following table:
In the table, we have coded the new and original settings for the candidates as
N and O, respectively. What conclusions can you draw? Remember, the goal is to
desensitize cooling plate temperature to changes in the environmental conditions.
We start the analysis by plotting the cooling plate temperatures for all of the eight
environmental conditions (cause combinations) by treatment.
[Figure: cooling plate temperature by treatment, over all eight cause combinations]
Treatment 7 gives consistently low cooling plate temperatures over all combina-
tions of the environmental inputs. Treatment 7 corresponds to using the new levels
for D2 and D4 and leaving D1 and D3 at their original settings. To examine how
changing each candidate flattens the relationship between the changing environ-
mental conditions and cooling plate temperature, we examine the appropriate inter-
action plots. To create an interaction plot of the environmental conditions versus D1
and D3, for example, we create a new column D1D3 with the four levels for the pair
D1 and D3.
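Creating the combined column is just a matter of pasting the two candidate columns together; a minimal sketch in Python (the level labels follow the plot legends):

```python
# Levels of D1 and D3 for four hypothetical runs; the real design
# columns are in the refrigerator frost buildup desensitization2 file.
d1 = ["New", "New", "Orig", "Orig"]
d3 = ["New", "Orig", "New", "Orig"]

# Four-level column coding every (D1, D3) combination, matching the
# legend labels New_new, New_orig, Orig_new, Orig_orig.
d1d3 = [f"{a}_{b.lower()}" for a, b in zip(d1, d3)]
```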
[Figure: interaction plots of average cooling plate temperature by cause combination for D1, D2, D3, D4 and for the pairs D1D3 and D1D4 (levels New/Original)]
Not all possible interaction plots are given. The other plots give similar conclu-
sions. From the interaction plots, we see that changing a single candidate or a pair
of candidates does not reduce the variation in average cooling plate temperature. In
particular, all combinations of the candidates have a relatively large cooling plate temperature for at least some of the cause combinations.
16.4 There were excessive failures in the accelerated life testing of electric motors.
Using a group comparison investigation, the team found that unevenness in the
commutator shaft surface was a dominant cause of these failures. The team next
reformulated the problem to one of reducing the unevenness in the commutator
shaft. The surface unevenness is measured on a scale of 1 (smooth) to 10 (rough).
With further investigation, the team determined that the dominant cause of the
variation in the (final) smoothness was the shaft profile before machining. The
team adopted the Desensitization approach. They decided to conduct a frac-
tional factorial experiment with eight treatments using four candidates. For
each of the eight treatments there were two runs, one that used a shaft with a
premachined smooth profile, and a second that used a rough profile. The exper-
imental plan and data are given in the file electric motor failure desensitization
and the table that follows. The order of the runs was randomized.
[Table: experimental plan and smoothness measurements for the eight treatments]
a. What is the confounding structure of the design? What limitations does this
introduce?
b. What conclusions can you draw?
c. What would be the advantages and disadvantages of measuring the time to
failure using the accelerated life test for each run rather than judging the
smoothness of the commutator surface after machining?
a. The confounding structure is:

I
Depth + Grind*Rotation*Feed
Grind + Depth*Rotation*Feed
Rotation + Depth*Grind*Feed
Feed + Depth*Grind*Rotation
Profile
Depth*Grind + Rotation*Feed
Depth*Rotation + Grind*Feed
Depth*Feed + Grind*Rotation
Depth*Profile
Grind*Profile
Rotation*Profile
Feed*Profile
Depth*Grind*Profile + Rotation*Feed*Profile
Depth*Rotation*Profile + Grind*Feed*Profile
Depth*Feed*Profile + Grind*Rotation*Profile
b. [Figure: plot of smoothness by treatment, with separate symbols for rough and smooth initial profiles]
The interaction plots for profile versus each of the four candidates follow.
Only rotational speed flattens the relationship between smoothness and the initial
shaft profile.
[Figure: interaction plots of average smoothness by initial profile for each of the four candidates]
We conclude that setting the rotational speed to 2400 will desensitize smooth-
ness to changes in the initial shaft profile.
c. With the existing experiment there is a risk that the treatment that results in a
smoother commutator surface may not translate into better performance in the
accelerated test. However, the chance of this occurring is small, since the team
verified that the commutator surface unevenness was a dominant cause of electric
motor failures. Using the accelerated test for the experiment would have been
more expensive.
CHAPTER 17
17.1 In an investigation, 100 trucks were selected from regular production over two
weeks. The frame geometry as given by four summaries (left and right front, left
and right rear) and the alignment outputs left and right camber and caster were
determined for all 100 trucks. The data are given in the file truck pull feedforward.
In Chapter 17 an analysis determined that feedforward control based on frame
geometry was feasible for left caster. Repeat the analysis for the other outputs:
right caster, left camber, and right camber.
In each case, the residual variation (given by s in the regression results) is much
smaller than the variation for the given output in the 100 trucks. We assume this
closely matches the baseline variation since the data for the feedforward investiga-
tion were collected over a relatively long time. Feedforward control using the frame
geometry has the potential to greatly reduce pull variation. As described in Chapter 17,
the team was able to use a model to predict camber and caster based on the truck frame
geometry inputs and compensate for the effect if necessary.
17.2 Engine assembly problems occurred due to a poor fit between the pistons and
the engine bore. The dominant cause of poor fit was found to be variation in the
clearance, the difference between the (minimum) bore diameter and the (maxi-
mum) piston diameter. To solve this problem, the team thought about using the
feedforward (selective fitting) approach. The idea was to measure each piston
diameter and place them into bins of similar diameter. Then, after each bore
diameter was measured, a piston would be selected from the appropriate bin. To
assess this proposal the diameter measurements for 469 pistons and bores, as
measured from nominal, are given in the file block bore diameter feedforward.
Quantify the expected reduction in clearance variation when using one (that is,
no selective fitting), two, three, or four bins of pistons. A suggestion is to define
the bins by dividing the range in piston and bore diameters (roughly –10 to 10
microns) into equal widths.
The range in piston diameters roughly matches –10 to 10, and the variation in
bore and piston diameters is very similar. The standard deviation of the clearance for
the given random pairing between piston and bore is 4.3. This provides an estimate
of the process performance without selective fitting.
To quantify the expected variation in clearance using selective fitting we use sim-
ulation. We have two options: we can sample from the existing data to simulate the
effect of selective fitting or we can model the piston and bore diameter variation and
sample from the model. In this case, since both the piston and bore diameters have
a bell-curve shape, as shown by the histograms that follow, they are well modeled by
Gaussian random variables.
[Figure: histograms of bore and piston diameters]
To implement the first simulation option where we sample from the existing data,
we wrote a MINITAB macro that randomly selected a number of bores and corre-
sponding pistons assuming a given number of bins, and calculated the clearance.
See Appendix A for more information on writing MINITAB macros. The MINITAB
macro that assessed the effect of using two bins, defined as pistons and bores with
negative or positive diameter for this example, follows.
MACRO
#with the data file open call the MACRO from the command line as below
# %'macro2piston.txt' 'piston' 'bore'
let cpiston=1
let I2=1
let temp=1
let allvals=0
let I=I+1
endwhile
let I2=I2+1
endwhile
Alternatively, we can simulate choosing random bores and finding the matching
pistons from the Gaussian models in some programming language. We next give
Matlab (see http://www.mathworks.com) code that allows us to explore the effect of
any number of bins.
function []=selfitkbins(k,num)
%simulate the effect of using k bins for the piston/bore selective fitting example
%num equals the number of simulated bore/piston combinations
%the estimated distributions of piston and bore diameters are Gaussian with
%the following means and standard deviations
pm=-0.12; ps=3.2;
bm=-0.37; bs=2.9;
%we use k bins of pistons, dividing the natural range -10 to 10 for the
%pistons into intervals of equal width
%(a better choice would be to use bins of roughly equal frequency)
%we use the same bin definitions for both pistons and bores;
%this works because here the two distributions are close to the same
%bin intervals
bins=[-inf,-10+(20/k):(20/k):10-(20/k),inf];
The benefits of selective fitting are large even for only two bins. The team
needed to decide whether the substantial costs and logistical difficulties of imple-
menting feedforward control warranted adopting the approach. To implement
selective fitting, all piston and bore diameters must be measured. In addition, pis-
tons need to be placed in the appropriate bin. Logistical problems can occur if one
of the bins runs out. In this example, the team decided that while feedforward was
technically feasible other approaches should be considered due to concerns about
the high cost of implementing selective fitting.
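For comparison, here is a sketch of the same simulation in Python, reusing the Gaussian model parameters from the Matlab code above; the bin-running-out complication is handled crudely by skipping a bore whose matching bin is empty:

```python
import random
import statistics

random.seed(1)

# Gaussian models for piston and bore diameters (microns from nominal);
# means and standard deviations taken from the Matlab code above.
PM, PS = -0.12, 3.2
BM, BS = -0.37, 2.9

def bin_index(x, k):
    """Index of the equal-width bin over -10 to 10 that contains x."""
    edges = [-10 + 20 * i / k for i in range(1, k)]
    return sum(x > e for e in edges)

def clearance_sd(k, num=20000):
    """Estimate the clearance standard deviation with k piston bins."""
    piston_bins = [[] for _ in range(k)]
    for _ in range(3 * num):  # pre-generate a pool of binned pistons
        p = random.gauss(PM, PS)
        piston_bins[bin_index(p, k)].append(p)
    clearances = []
    for _ in range(num):
        b = random.gauss(BM, BS)
        pool = piston_bins[bin_index(b, k)]
        if pool:  # crude handling of a bin running out: skip the bore
            clearances.append(b - pool.pop())
    return statistics.stdev(clearances)
```

With this sketch, clearance_sd(1) comes out near the 4.3 reported without selective fitting, and clearance_sd(2) is already substantially smaller.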
17.3 In the V6 piston diameter example discussed in Chapter 11, the team found that
piston diameter after Operation 270 was a dominant cause of the final diameter.
The data are given in the file V6 piston diameter variation transmission. This
suggested that feedforward control might be a feasible approach.
a. What are the requirements for feedforward to be feasible in this context?
b. If feedforward were feasible, assess the potential benefit using the results of
the variation transmission investigation.
c. Could the team also use the diameter after Operation 200, rather than the
diameter after Operation 270, as the input to a feedforward controller?
[Figure: scatter plot of final diameter by diameter after Operation 270]
b. Based on the regression model, using feedforward control and assuming perfect compensation would reduce the standard deviation in the final diameter to 1.2. This is a large reduction from the baseline standard deviation of 3.32.
c. The diameter after Operation 200 is not a dominant cause of the final diameter
variation. We can see this clearly in the regression results and plot given as follows:
The regression equation is
diameter after OP310 = 225 + 0.325 diameter after OP200
[Figure: plot of diameter after OP310 by diameter after OP200]
Feedforward control based on the diameter after Operation 200 would not
work. The Operation 200 diameter does not provide a good prediction of the
final diameter. The regression results suggest that if we implement perfect feed-
forward control based on the Operation 200 diameter, the final diameter variation
could be reduced only to 2.86 from the baseline 3.32. One advantage of using the Operation 200 diameter rather than the Operation 270 diameter is that there would be more opportunity downstream to find an adjuster.
CHAPTER 18
18.1 The bias of the system used to measure camshaft journal diameters tended to
increase over time. The cause of this increase was not determined. Instead, the
team introduced a feedback controller. At the start of each shift, a master journal
with known diameter was measured. If the measured diameter deviated from the
known value by a large amount, the measurement system was recalibrated.
a. How could we sensibly define a large deviation in this context?
b. What would happen to the measurement variation if the measurement
device were recalibrated every time the master journal was measured, rather
than only when the deviation from the known dimension was large?
18.2 In a machining process, the dominant cause of dimension variation acted in the
setup family. That is, the dimension variation within a particular setup was
small relative to the variation from one setup to the next. The existing control
plan called for a complete process adjustment back to the target based on the
first observation after each setup. There were more than 200 parts machined
between setups. The baseline dimension standard deviation was 0.31. The team
decided to explore a new feedback control scheme based on the average for the
first five observations after each setup. In an offline investigation, they carried
out 10 setups and produced 20 parts after each setup without any adjustment.
The dimension data, scaled so that the target is zero, are given in the file
machining dimension feedback.
a. Use a one-way ANOVA to estimate the standard deviation if the process
could be adjusted so that the dimension averages across all setups were equal.
b. Use simulation to compare the performance of the existing feedback con-
troller with the proposed controller that makes a complete adjustment based
on the average for the first five observations after each setup.
c. In general, we may design a feedback controller by averaging the output
from the first n observations after each setup. What considerations help you
decide how many observations should be used to estimate the process aver-
age after each setup?
[Plot of dimension by setup, setups 1 through 10.]
From the plot we can see the setup-to-setup variation. Using a one-way
ANOVA we can partition the overall variation into the within-setup component
and the variation between setups. The edited results of the ANOVA from
MINITAB are:
One-way ANOVA: dimension versus setup
The pooled standard deviation 0.2008 estimates the within-setup variation, the
dimension standard deviation if we could adjust the process to keep the setup
averages equal. This is substantially lower than the baseline value 0.31.
We get:

Adjustment scheme        Estimated process standard deviation
None                     0.34
First observation        0.29
Average of first five    0.22
By using the average of five parts, we can nearly reach the estimated minimum
value 0.20. These comparisons are valid if the number of parts produced between
setups is much greater than 20. If only about 20 parts were produced between
setups, we should include the unadjusted parts (the first one or first five) in the
estimated standard deviation.
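The simulation in part b can be sketched as follows. The within-setup and setup-to-setup standard deviations below are assumptions chosen to roughly match the ANOVA estimates, not values computed from the file machining dimension feedback.

```python
import numpy as np

rng = np.random.default_rng(7)
n_setups, parts_per_setup = 400, 200
sd_within, sd_setup = 0.20, 0.25   # assumed components of variation

def simulate(n_adjust):
    """Complete adjustment based on the average of the first n_adjust parts
    after each setup (n_adjust = 0 means never adjust). The first n_adjust
    parts of each setup are produced before the adjustment, so they are
    included unadjusted, as discussed in the solution."""
    dims = []
    for _ in range(n_setups):
        offset = rng.normal(0, sd_setup)                       # setup-to-setup shift
        parts = offset + rng.normal(0, sd_within, parts_per_setup)
        if n_adjust > 0:
            estimate = parts[:n_adjust].mean()                 # estimate of the offset
            parts[n_adjust:] = parts[n_adjust:] - estimate     # adjust remaining parts
        dims.extend(parts)
    return float(np.std(dims, ddof=1))

for n_adjust, label in [(0, "none"), (1, "first observation"), (5, "average of first five")]:
    print(f"{label:>21}: {simulate(n_adjust):.2f}")
```

Averaging five observations reduces the adjustment error (its standard deviation falls by the factor one over the square root of five), which is why the five-part scheme approaches the within-setup minimum.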
c. We need to ask questions like:
How much variation is there within each setup? If there is little variation, we
need fewer observations.
How many parts are machined between setups? If there are only a small num-
ber of parts (or if the cost of poor parts is high), we may wish to make an
initial adjustment based on a small number of observations and then
adjust again once a few more observations are available.
What are the adjustment costs? If the adjustment cost is large, we may decide
to adjust only if the deviation from target is large.
18.3 In a machining process, there was excess variation in the diameter of a precision
ground shaft. The shaft diameter was measured for all shafts using a complex
automated gage (that also measured other outputs). Upon investigation, the
team discovered that the dominant cause acted in the measurement family. In
particular, the measurement bias changed from day to day, consistent with the
pattern observed in the baseline. To explore this bias change further the team
planned an investigation where the diameter of the same shaft was measured
each hour for four days. A total of 32 diameter measurements were made. The
data are given in the file precision shaft diameter feedback, with the output
being the diameter measured from nominal. The results show a gradual drift.
The team speculated that the drift was caused by changes in some (unidenti-
fied) environmental conditions. They decided to reduce the measurement vari-
ation using a feedback controller.
a. What type of feedback controller (that is, what prediction equation and what
adjustment rule) would you recommend in this application?
b. Suppose the team decided to use a feedback controller based on EWMA fore-
casts with the smoothing parameter alpha equal to 0.4. What kind of a
reduction in the measurement variation could they expect?
a. The change in the diameter is gradual (that is, does not have sudden shifts). As a
result, a feedback controller based on the EWMA forecasts is reasonable. The
adjustments should be made so that if the one-step-ahead forecast is correct and
the adjustment is perfect, the next value will be zero (that is, diameter equals the
nominal value).
b. Using simple exponential smoothing with alpha equal to 0.4 gives:
[Single exponential smoothing plot of diameter over time, actual versus predicted.]
Smoothing constant Alpha: 0.400
MAPE: 124.149
MAD: 0.428
MSD: 0.290
From these results, we predict that with the feedback controller, the measure-
ment standard deviation would be reduced to roughly 0.54, the square root of the
MSD (√0.290 ≈ 0.54). Note that given the small sample size, we are not confident
of this prediction.
If instead we try to optimize the value of alpha, MINITAB gives:
[Single exponential smoothing plot of diameter over time, actual versus predicted, with the optimized smoothing constant.]
Smoothing constant Alpha: 0.859
MAPE: 105.760
MAD: 0.342
MSD: 0.193
However, because of the small sample size, we are not confident that alpha
= 0.86 would be substantially better than alpha equal to 0.4 when applied to the
actual process.
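The behavior of such an EWMA feedback controller can be seen in a small simulation. The random-walk drift model and its parameters below are assumptions standing in for the data in precision shaft diameter feedback.

```python
import numpy as np

rng = np.random.default_rng(3)
n, alpha = 200, 0.4
drift = np.cumsum(rng.normal(0, 0.12, n))   # slowly drifting measurement bias
noise = rng.normal(0, 0.15, n)              # measurement repeatability

adjusted = []
bias_hat = 0.0                              # EWMA one-step-ahead forecast of the bias
for t in range(n):
    observed = drift[t] + noise[t] - bias_hat    # reading after full compensation
    adjusted.append(observed)
    raw = observed + bias_hat                    # reconstruct the uncompensated reading
    bias_hat = alpha * raw + (1 - alpha) * bias_hat  # update the EWMA forecast

unadjusted_sd = np.std(drift + noise, ddof=1)
adjusted_sd = np.std(adjusted, ddof=1)
print(f"sd without feedback: {unadjusted_sd:.2f}, with feedback: {adjusted_sd:.2f}")
```

Each adjusted value is the one-step-ahead forecast error, which is why the square root of the MSD from the smoothing output estimates the standard deviation achievable with the controller.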
CHAPTER 19
19.1 In the paint film build example introduced in Chapter 3, the baseline standard
deviation in film build (paint thickness) was 0.67 thousandths of an inch. With
this variation, to ensure a minimum film build of 15-thousandths of an inch, the
process was centered at 17. The goal was to reduce the standard deviation to
0.35, thereby allowing for a reduction in the average film build.
The dominant cause of film build variation was found using a multivari
investigation to act in the car-to-car family. Despite further effort, the dominant
cause was not found. The team decided to adopt the process robustness
approach. Based on process experience, candidates and their corresponding
levels were chosen as follows:
[Table of candidates and levels: anode dimension, conductivity of paint, temperature (30, 50), zone X voltage, and zone Z voltage.]
For each run, five panels were painted. The order of the treatments was ran-
domized. Since the dominant cause acted car to car, the team believed the
unknown dominant cause would act within each run. Film build was measured
at five locations on each panel. The data for one location are given in the file
paint film build robustness and in the table that follows.
a. Analyze the data using the standard deviation of film build over the five
consecutive panels to measure performance. Is it possible to make the
process robust to noise variation? What levels of the candidates do you
suggest?
b. The team had a way to adjust the process center. However, we can also use
the robustness experiment to look for an adjuster. Analyze the data using the
average film build over the five consecutive panels to measure performance.
Are any of the candidates adjusters?
c. In the experiment, the film build at a particular location on five consecutive
cars (panels) was used to define a run. Suppose, instead, that the five obser-
vations came from five fixed locations on a single door. What, if any, changes
are needed in the analysis presented in part a?
a. Plotting the film build by treatment suggests that treatments 11, 12, 15, and 16
are promising.
[Plot of film build by treatment, treatments 1 through 16.]
[Pareto chart of the effects on log(s), in decreasing order: BD, D, B, BC, CE, AE, BE, AD, DE, AC, C, A, CD, AB, E. Legend: A = anode dimension, B = conductivity, C = temperature, D = zone X voltage, E = zone Z voltage.]
To draw conclusions we also look at the following main effects and interac-
tion plots.
[Main effects and interaction plots for log(s) by anode dimension, conductivity, temperature, zone X voltage, and zone Z voltage.]
Recall that smaller log(s) is better. From the interaction plots, the combination
of high zone X voltage, high conductivity, and low temperature is best. We are
fortunate that high conductivity is best in both of the large interactions. The high
levels of zone X voltage and conductivity, and the low level of temperature, corre-
spond to treatments 11 and 12. Using the new process settings is expected to
reduce the baseline standard deviation of 0.67 to 0.37, since the average log(s) for
treatments 11 and 12 is –1.03 and e^(–1.03) ≈ 0.37.
With the new process settings, the average film build is about 15.9. The team
had a way to adjust the film build average. The reduction in variation suggested a
reduction in the average film build to around 16.25 from the current 17. This
would translate into a 4% reduction in paint volume used. After implementing
this solution the dominant cause acted within a door. There is ongoing effort to
address this source of variation.
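The log(s) performance measure used in this analysis can be computed per treatment as in the sketch below. The 16-by-5 array of film-build values is simulated, standing in for the file paint film build robustness.

```python
import numpy as np

rng = np.random.default_rng(5)
# Simulated stand-in: film build for 16 treatments x 5 consecutive panels.
film = rng.normal(16.5, 0.6, size=(16, 5))

s = film.std(axis=1, ddof=1)   # within-run standard deviation, one per treatment
log_s = np.log(s)              # log(s) performance measure: smaller is better

# Back-transforming an average log(s) with exp() predicts the achievable
# standard deviation, as the solution does for treatments 11 and 12.
best = int(np.argmin(log_s)) + 1
print(f"most promising treatment: {best}")
print(f"predicted sd at that treatment: {np.exp(log_s.min()):.2f}")
```

Taking logs makes the within-run standard deviations behave more like a normally distributed response, which justifies analyzing them with the usual factorial effects machinery.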
b. Looking for an adjuster, we analyze the experimental results using film build (or
average film build in each run) as the output. The Pareto plot of the effects is:
[Pareto chart of the effects on average film build, in decreasing order: A, AC, D, DE, CD, B, C, BC, AD, AB, BE, BD, CE, E, AE. Legend: A = anode dimension, B = conductivity, C = temperature, D = zone X voltage, E = zone Z voltage.]
Input A (anode dimension) is an adjuster. From the plot of film build by anode
dimension, given as follows, we see that lowering the average dimension lowers
the average film build.
[Plot of film build by anode dimension (3.1, 3.9).]
c. When measuring the same five locations on each door, we may expect a system-
atic difference between the door locations. In the analysis we would want to keep
track of the location and use an analysis as in Chapter 16. Using a performance
measure like log(s) is not advised.
19.2 In a trim plant, customer complaints about seat appearance prompted manage-
ment to assign a team the task of reducing shirring variation. The team pro-
ceeded without using Statistical Engineering and made a number of mistakes.
Seat cover shirring was scored on a scale of 1 to 6 using boundary samples by
how much cloth was gathered by the parallel stitching. Shirring scores of 1 to 4
were acceptable with 1 being the best. Scores of 5 or 6 resulted from either too
much or too little shirring. A review of historical data suggested that the
observed shirring score over a week covered all six possible values. Next, the
team informally checked the measurement system. They found the measure-
ment system added little variation. The team decided not to look for a dominant
cause. Rather they moved directly to assessing the feasibility of making the
process robust. They used brainstorming to select six candidates with two levels
each as follows:
[Table of six candidates and levels, including seam width (9 mm, 11 mm) and bun thickness (+5 mm, –5 mm).]
Each run consisted of three seats (repeats). The runs were conducted in the
treatment order given in the table. The data are given in the file seat cover
shirring robustness and reproduced as follows:
Treatment  Order  Seat 1  Seat 2  Seat 3  Average
 1         13     3       1       2       2.0
 2         16     1       2       1       1.3
 3          7     2       2       2       2.0
 4          6     2       2       2       2.0
 5         10     2       1       1       1.3
 6          1     3       1       3       2.3
 7         11     4       2       1       2.3
 8         15     2       2       4       2.7
 9          5     1       2       2       1.7
10          3     4       5       2       3.7
11         14     3       3       2       2.7
12          9     2       3       3       2.7
13          8     1       2       2       1.7
14          2     2       2       3       2.3
15          4     1       4       2       2.3
16         12     2       3       1       2.0
a. Explain why choosing the process output as a measure of variation (that is,
high scores come from either too much or too little shirring) was a poor one.
b. The goal is to find process settings that lower the average shirring score. Can
we use any of the candidates to achieve the goal?
c. Each run consisted of three seats. Discuss this choice in the context of a
robustness experiment.
For the last two parts of this question, suppose the first three candidates
(leather thickness, leather toughness, and seam width) used in the robustness
experiment were normally varying rather than fixed inputs.
d. How should the levels of the first three inputs have been chosen?
e. Discuss changes you would make to the analysis you conducted in part b.
b. We first plot the shirring values versus the treatment number. In the plot that fol-
lows, we have added some jitter in the vertical direction (see Appendix C) to
eliminate the problem of overplotting.
[Jittered plot of shirring score by treatment.]
There are some promising treatments that result in a low average shirring
score. Next, we fit a model with all possible terms and present the results using a
Pareto plot of the effects.
[Pareto chart of the effects on average shirring score, in decreasing order: D, AB, E, C, ACE, AC, F, A, CE, AE, EF, DE, BE, B, ABE. Legend: A = leather thickness, B = seam width, C = leather toughness, D = machine, E = steam, F = bun thickness.]
There are no large effects. We may be tempted to choose one of the treatments
that resulted in a low average shirring score as the new process settings. However,
consider the answer to part c.
As an aside, the particular fractional factorial design selected has the confound-
ing structure given as follows. The inputs are labeled A, B, and so on, using the
same order as in the preceding tables. The design is resolution III. This is a poor
choice, since with 16 runs and six candidates a resolution IV design is possible.
I – A*D*F – B*C*F
A – D*F + B*C*D
B – C*F + A*C*D
C – B*F + A*B*D
D – A*F + A*B*C
E
F – A*D – B*C
A*B + C*D – A*C*F – B*D*F
A*C + B*D – A*B*F – C*D*F
A*E – D*E*F
B*E – C*E*F
C*E – B*E*F
D*E – A*E*F
E*F – A*D*E – B*C*E
A*B*E + C*D*E
A*C*E + B*D*E
c. In a robustness experiment, we do not know the dominant cause, and we define a
run over sufficient time so that the unknown dominant cause has time to act. In
this example, the team did not know the time family of the dominant cause. It
seems unlikely that three repeats (seats) for each run is sufficient. This means that
the dominant cause has likely not acted in (most of) the treatments. With this
choice of run, the experiment will not be able to identify process settings that are
robust. The experiment was doomed to failure because of poor planning.
d. The levels for each suspect should be selected at the extremes of the normal range
of values. This requires measuring the value of the three suspects over a sufficient
time span to see the full range before planning the robustness (desensitization)
experiment.
e. The goal of the experiment would change to finding settings of the last three
inputs that result in a lower average shirring score. It no longer makes sense to find
the best settings of the suspects, since they are varying in the normal process. We
would refer to only the last three inputs as candidates. The appropriate perform-
ance measure would be the average shirring score across all runs (and repeats)
with a given combination of the three candidates. There are now only eight treat-
ments in terms of the candidates. For example, we would average the shirring
scores across runs 1 and 10. There is still no guarantee that the experiment will
provide useful results. We do not know if the three suspects include important, let
alone dominant, causes. This sort of an experiment, where both candidates and
suspects are mixed together, is a common mistake when teams know about
designed experiments but do not follow a structured problem-solving algorithm.
19.3 Torsional rigidity of the weather stripping was the dominant cause of door
assembly problems. Management set a goal of reducing standard deviation in
torsional rigidity to 0.3. A baseline investigation found the variation in torsional
rigidity was roughly 0.55 mm and that the dominant cause acted over the short
term and certainly within any half hour. The team looked briefly for a dominant
cause of rigidity variation without success. Next, they planned a robustness
experiment with four candidates at two levels each, chosen based on engineer-
ing judgment. The candidates and levels are:
[Table of four candidates and levels, including extruder RPM (22, 26), tension (pre) (1, 5), and water flow (2, 6).]
The team planned a full factorial experiment with 16 runs, one for each treat-
ment. The correspondence between treatments and candidate levels is given in
the table that follows.
1 –1 –1 –1 –1
2 –1 –1 –1 1
3 –1 –1 1 –1
4 –1 –1 1 1
5 –1 1 –1 –1
6 –1 1 –1 1
7 –1 1 1 –1
8 –1 1 1 1
9 1 –1 –1 –1
10 1 –1 –1 1
11 1 –1 1 –1
12 1 –1 1 1
13 1 1 –1 –1
14 1 1 –1 1
15 1 1 1 –1
16 1 1 1 1
Each run consisted of running the process for half an hour after the candi-
date levels had been reached. Within each run, 10 weather-strip samples were
selected spread out over the half hour. The order of the runs was randomized.
The torsion rigidity of each of the 10 weather-strip samples for each treatment
is given in columns s1 to s10 of the table that follows and in the file weatherstrip
torsional rigidity robustness.
Treatment  Order  s1    s2    s3    s4    s5    s6    s7    s8    s9    s10
1 13 10.3 13.0 11.5 11.8 10.7 9.9 10.7 11.5 11.0 11.1
2 6 11.5 13.0 10.4 11.1 10.9 10.6 12.0 9.3 9.2 9.3
3 9 11.6 13.0 10.4 16.0 10.3 10.8 11.5 11.0 11.3 10.9
4 1 11.5 11.7 10.4 11.7 14.0 11.7 10.4 11.7 10.4 10.4
5 3 14.0 11.7 11.7 19.0 11.9 11.7 12.1 13.0 11.1 11.0
6 11 22.0 15.0 18.3 11.7 20.3 21.0 12.6 13.6 14.7 15.1
7 5 9.1 9.6 10.2 9.8 9.0 9.7 10.0 12.0 9.0 8.8
8 14 10.0 9.1 10.6 10.4 10.8 11.0 11.1 10.8 10.5 10.8
9 2 11.7 12.5 11.9 11.7 20.0 14.0 10.4 11.5 11.7 20.0
10 10 10.3 11.6 10.5 10.6 13.0 14.0 11.7 10.3 15.0 11.8
11 7 10.3 10.5 11.0 11.4 9.8 10.4 11.7 11.8 11.5 11.9
12 15 11.6 11.0 11.4 11.3 12.0 10.6 10.9 10.7 10.7 10.7
13 16 10.6 10.7 11.6 10.6 10.7 22.0 11.0 10.4 10.4 23.0
14 8 9.1 10.4 10.6 11.4 10.9 10.4 10.8 10.9 11.0 11.6
15 12 10.3 11.0 12.0 12.1 10.5 10.7 11.3 11.4 10.8 10.9
16 4 10.4 10.4 10.4 10.5 10.9 11.4 9.0 9.6 9.8 10.2
a. The goal of the experiment is to find process settings that result in less variation
in torsional rigidity (as the unknown dominant cause acts). The performance
measure log(s), where s is the standard deviation of the rigidity values across the
10 repeats within each run, is appropriate. A secondary performance measure is
the average rigidity across each run. If we find process settings that result in less
rigidity variation, we may need to adjust the process center back to the target.
b. Plotting the torsional rigidity values by treatment, we get:
[Plot of torsional rigidity by treatment.]
There are some treatments that have much less rigidity variation than others.
Fitting a complete model (using the log(s) performance measure), we obtain the
following Pareto plot of the effects:
[Pareto chart of the effects on log(s), in decreasing order: C, ACD, ABC, D, AD, AC, BC, ABCD, BCD, A, AB, CD, ABD, BD, B. Legend: A = heat, B = extruder RPM, C = tension, D = water flow.]
Candidate C (tension) has a large effect. Looking at the main effects plots sug-
gests that the high level of tension gives substantially lower rigidity variation.
[Main effects plots for log(s).]
However, the average torsional rigidity variation at the high level of tension is
0.8 (e^(–0.2)), much higher than the baseline variation of 0.55. In fact, even the best
treatment, number 12, has a variation of 0.47, only marginally better than the
baseline. The experiment identified a candidate (tension) whose level can be
changed to alter rigidity variation, but it failed to find more robust process
settings. The team was curious about this result and decided to investigate even
higher tension levels. This investigation also failed since high tension led to other
negative side effects and little reduction in rigidity variation. The approach was
abandoned.
CHAPTER 21
21.1 Discuss whether lessons learned are properly maintained in corporate memory
in your organization. What could be done to improve the situation?
The answer depends on the organization.
21.2 In the paint film build example described in Chapter 19, the team found new
process settings that resulted in reduced car-to-car variation in film build. To
validate the proposed solution, 80 cars were painted over one day with the set-
tings given in the following table. These were the best settings found in the
robustness investigation. The film build values from five specific positions on
one door for each of the cars are available in the file paint film build validation.
Candidate Setting
Temperature 30
a. The baseline film build standard deviation was 0.68. The problem goal was
to reduce the standard deviation to 0.35, and the robustness experiment
results suggested that changing settings would reduce the standard deviation
to about 0.37. Has the solution been validated?
b. What, if anything, do the validation results tell us about the home of the
dominant cause in the remaining variation?
a. We analyze the data as in the baseline investigation with the following numerical
and graphical summaries.
Variable N Mean Median TrMean StDev SE Mean
film build 400 16.208 16.222 16.212 0.302 0.015
[Histogram of film build, and plot of film build by hour, hours 1 through 16.]
The overall standard deviation has been reduced to 0.302, a substantial reduc-
tion from the baseline standard deviation 0.68, and exceeding both the problem
goal of 0.35 and the expected improvement. The average film build is 16.2. Fur-
ther efforts were made to bring the average to 16, a little more than 3 × 0.30 above
the minimum acceptable film build of 15 units. With these changes, the ultimate
paint cost savings were about 6%.
b. As suggested by the plot that follows, the team could now look for further
improvement by reducing the position-to-position variation in film build.
[Plot of film build by position, positions 1 through 5.]
21.3 In the truck pull example described in Chapter 17 and Exercise 17.1, a feed-
forward controller was implemented to compensate for the effect of truck-
frame geometry on pull. After the feedforward system had been operating
successfully for some time, management decided to review its operation. The
four frame geometry measurements and left and right caster and camber were
recorded for roughly a month of production consisting of over 6600 trucks. The
data are given in the file truck pull validation.
a. The standard deviations for caster and camber before implementation of the
feedforward controller can be estimated from the 100-truck investigation
described in Chapter 17. From the same investigation, the team predicted the
possible reduction in standard deviation using a feedforward controller. A
summary is given in the following table.
Do the results of the investigation validate the reduction in left and right
caster variation due to the feedforward controller?
b. For each of the two caster characteristics, conduct a regression analysis to see
if the feedforward controller can be improved. Recall that the feedforward
controller should be compensating for variation in the frame geometry.
c. Repeat the analysis in parts a and b for left and right camber.
[Histograms and run charts of left and right caster (Lcast and Rcast) over the roughly 6600 trucks.]
The standard deviations for left and right caster (0.25 and 0.23, respectively) are
much smaller than they were before implementation of the feedforward controller.
The reduction in standard deviations did not completely meet the predictions,
which were based on the overly optimistic assumption that perfect compensation
was possible. Nevertheless, the feedforward controller was a great success.
b. To check the operation of the feedforward controller for the caster characteristics,
we fit a regression model for left caster (and right caster) as a function of the
truck frame characteristics—left front, right front, left rear, and right rear. If the
feedforward controller is working properly, there will be at most a weak relation-
ship between caster and the frame geometry characteristics.
From the solution for part a, the standard deviations for left and right caster are
0.2461 and 0.2314 when using the feedforward controller. The corresponding
residual standard deviations in the regression analysis are 0.2351 and 0.2176.
The residual standard deviations estimate the process standard deviation if we
could hold the frame geometry fixed or totally compensate for its effect. Because
the residual standard deviations are so close to the caster standard deviations with
the existing feedforward controller, the effects of frame geometry on the outputs
are very small. In other words, allowing for the inevitable small measurement and
adjustment errors, the existing feedforward controller cannot be improved further
using the frame geometry characteristics.
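The check in part b can be sketched as a regression of each caster characteristic on the four frame-geometry inputs: if the residual standard deviation is close to the output standard deviation, little further improvement is available. The data below are simulated stand-ins for the file truck pull validation, with coefficients chosen to mimic a well-functioning controller.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 6600
frame = rng.normal(0, 1, size=(n, 4))   # four frame geometry inputs (simulated)
# Under a well-functioning feedforward controller, the output depends only
# weakly on the frame geometry; 0.24 is an assumed residual sd.
caster = 0.02 * frame[:, 0] + rng.normal(0, 0.24, n)

# Least squares fit of caster on the four frame inputs (plus intercept).
X = np.column_stack([np.ones(n), frame])
coef, *_ = np.linalg.lstsq(X, caster, rcond=None)
resid_sd = np.std(caster - X @ coef, ddof=X.shape[1])
output_sd = np.std(caster, ddof=1)
print(f"output sd: {output_sd:.3f}, residual sd: {resid_sd:.3f}")
```

A residual standard deviation nearly equal to the output standard deviation is exactly the pattern the solution describes: the frame geometry explains almost none of the remaining caster variation.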
To check the operation of the feedforward controller for camber, we fit a sep-
arate regression model for left camber and right camber, as a function of the truck
frame characteristics.
The regression equation is
lcmb = –1.69 + 0.0761 left front + 0.00141 right front + 0.0727 left rear +
0.0669 right rear
Chapter Supplements
Chapter 2 Supplement
Describing Processes
Defect    Grind   Bore   Broach   Assembly   Others
Count     85      24     14       6          2
Percent   64.9    18.3   10.7     4.6        1.5
Cum %     64.9    83.2   93.9     98.5       100.0
Figure S2.1 Pareto chart for scrap by operation for the rod line.
We use the Pareto principle extensively to choose projects and focus the problem. The
Pareto principle implies that, for any problem, there are one or two dominant causes.
See, for example, Kalbfleisch (1985). Replacing Y with the output and X with the input, we
have

sd(output) = √[ sd(E(output | input))² + E(sd(output | input)²) ]
The first term under the square root sign can be thought of as the variation in the output
explained by the input. The second term is the residual variation that would remain if we
could hold the input constant. The input is a dominant cause if the first term is large compared
to the second.
Formally, we call a cause dominant if it explains more than half the variation. This
implies the residual standard deviation must be less than 1/√2 ≈ 0.71 times, or roughly
70% of, the overall standard deviation, sd(output).
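The decomposition and the dominance criterion can be checked numerically with simulated data; the model and parameters below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.choice([-1.0, 1.0], size=n)     # a single input (candidate dominant cause)
y = 3.0 * x + rng.normal(0, 1.0, n)     # output; residual sd is 1.0 by construction

total_var = y.var()
# First term: variation of the conditional averages (explained by the input).
cond_means = np.array([y[x == v].mean() for v in (-1.0, 1.0)])
explained = cond_means.var()
# Second term: average of the conditional variances (residual variation).
residual = np.mean([y[x == v].var() for v in (-1.0, 1.0)])

print(round(total_var, 1), round(explained + residual, 1))  # the two agree
# x is dominant here: the residual sd is well under 0.71 of sd(output).
print(round(np.sqrt(residual / total_var), 2))
```

This is the law of total variance: the overall variance splits into the variance of the conditional means plus the mean of the conditional variances.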
A formal derivation of dominant cause when the output is discrete or binary is difficult.
For a binary output, we use an informal definition and say an input is dominant if for some
level of the input the proportion defective is substantially reduced.
Figure S2.2 Plots illustrating the three cases for a dominant cause.
Case one is illustrated in the top left panel of Figure S2.2, where there is a single input
(input 1) that explains the variation in the output. Here the second input has no effect on the
output. In case two, as given by the top right panel of Figure S2.2, there are two large
causes (inputs), but the effect of each input does not depend on the level of the other input
(we say the effects of the two inputs on the output are additive). A dominant cause involv-
ing two inputs is presented in the bottom panel of Figure S2.2. This last case is the most
complicated. We say a dominant cause involves two inputs if, for both inputs, we can
observe close to the full extent of output values as the input changes for some value of the
other input. In the plot, for the high level of input 1, changing the level of input 2 results in
a large difference in the output, while similarly, for the high level of input 2, changing the
level of input 1 results in a large difference in the output. There are gray areas among the
three cases.
A dominant cause involving two or more inputs is related to the idea of an interaction.
We say there is an interaction between input 1 and input 2 in their effect on the output if the
relationship between input 1 and the output depends on the level of input 2 and vice versa.
Note that there can be an interaction between input 1 and 2 in their effect on the output even
if there is no correlation between input 1 and 2 (that is, if input 1 and 2 vary independently
of one another). We see the interaction between input 1 and input 2 clearly in the bottom
panel of Figure S2.2, since the effect of changing from the low to high level of input 2 is
much greater when input 1 is at its high level than when it is at its low level.
We return to the issue of the consequences of a dominant cause involving two (or more)
inputs in Chapter 10 and its supplement, where we discuss finding such dominant causes,
and in Chapter 14, where we consider possible solutions for problems with such a domi-
nant cause.
Ppk = min(USL – avg, avg – LSL) / (3 × stdev)

where stdev is the standard deviation of the process output. Larger values of Ppk indicate
smaller variation. The index is increased by better locating the average near the middle of
the specification limits and/or by reducing the variation among the parts (that is, decreasing
stdev). Note how Ppk is a function of both kinds of variation (off-target and part-to-part variation)
and gets bigger as either kind of variation gets smaller.
For the angle error data discussed in the context of the camshaft lobe runout example
in Chapter 2, the average and standard deviation are given as –21.3 and 71.5, respectively.
The specification limits for angle error are ±400; thus the specification limit closest to the
average is –400. Hence, the value of the capability index Ppk is
Ppk = (–21.3 – (–400)) / (3 × 71.5) = 1.77
Such a large value of Ppk indicates that the closest specification limit is far from the
average relative to the process variation, as described by three times the output standard
deviation. Since the histogram for angle error data is bell shaped, this means that it is highly
unlikely that any of the observed angle errors will be outside of the specification limits.
This can be clearly seen in the histogram of angle error given in Figure S2.3, where we
added dashed lines to show the specifications.
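The Ppk calculation above can be scripted directly; a minimal sketch:

```python
def ppk(avg, stdev, lsl, usl):
    """Distance from the process average to the nearer specification limit,
    measured in units of three standard deviations."""
    return min(usl - avg, avg - lsl) / (3 * stdev)

# Angle-error summary from the camshaft lobe runout example above.
value = ppk(avg=-21.3, stdev=71.5, lsl=-400, usl=400)
print(round(value, 2))   # 1.77, matching the text
```

Because the average is negative, the lower limit –400 is the nearer one, and the numerator is –21.3 – (–400) = 378.7.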
Like all other summary measures of process performance, capability ratios depend on
how the data are collected. If the sample is collected over a short time or when conditions
are exceptional, the calculated capability ratio may be very misleading about the process
performance in general.
We define the root mean squared deviation (RMSD) of the data from the target by
RMSD = √[ ((y1 – T)² + (y2 – T)² + … + (yn – T)²) / n ]
Note that RMSD is a measure of the average variation from the target. To relate RMSD to
the two kinds of variation, we need the following result:
(y1 – avg) + … + (yn – avg) = 0
That is, the sum of deviations of a set of numbers from their average is zero. We can show
this result by noting that we can add the pieces before subtracting. That is,
(y1 – avg) + … + (yn – avg) = (y1 + … + yn) – (avg + … + avg)
                            = n × avg – n × avg
                            = 0
Now to decompose the RMSD into the two kinds of variation, we split each term in the
numerator into three pieces:

(yi – T)² = [(yi – avg) + (avg – T)]² = (yi – avg)² + 2(yi – avg)(avg – T) + (avg – T)²

We can add the three pieces separately. The sum of the first pieces is
H1212Supp2.qxd 4/1/05 10:26 AM Page CD–185
(y 1
- avg ) + ... + ( yn - avg )
2 2
Note that in the second and third pieces (avg – T) is a constant that is the same for all terms
in the sum (i.e., it is the same for all i). The sum of the second pieces is
The third piece is the sum of n constants and is n(avg – T)2. Combining the pieces we have
RMSD = √{[(y1 − avg)² + ... + (yn − avg)² + n(avg − T)²] / n}

     = √{[(n − 1) × stdev² + n × (avg − T)²] / n}

     ≈ √[stdev² + (avg − T)²]
In words, the root mean squared deviation from the target is approximately the square root of the sum of squares of the measures of the two kinds of variation. The relative contributions of the two kinds of variation can be assessed from this formula. For example, if the stdev is large and the avg is close to the target, we can make very little gain by moving the process average closer to the target.
If the target T is an ideal value for the output, then there is likely to be some cost to the
supplier or loss to the customer when the actual output deviates from the target. The RMSD
has the property that it is approximately linear in the deviation from target yi - T when yi
is relatively far from T (relatively here means compared to the other output values). Close
to the target, the contribution of yi - T to the RMSD is small. In other words, RMSD is a
reasonable surrogate for the cost associated with a deviation from target.
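The approximation RMSD ≈ √(stdev² + (avg − T)²) is easy to verify numerically. The sketch below uses simulated data; the sample values and target are illustrative, not from the book:

```python
import random
import statistics

# Illustrative data: 200 output values centered near 12, with target T = 10
random.seed(1)
y = [random.gauss(12, 2) for _ in range(200)]
T = 10.0

n = len(y)
rmsd = (sum((v - T) ** 2 for v in y) / n) ** 0.5

# decomposition into the two kinds of variation
avg, stdev = statistics.mean(y), statistics.stdev(y)
approx = (stdev ** 2 + (avg - T) ** 2) ** 0.5

# rmsd and approx differ only by the factor (n - 1)/n applied to stdev**2,
# so they agree closely for moderate n
```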
For the angle error data, there are 1296 values in all: 108 camshafts, each measured at 12 lobe positions. The overall variation is determined by the sum of squared deviations

(y1,1 − avg)² + ... + (y108,12 − avg)²

where avg is the average of the 1296 values. For any term in the sum, if avgj is the average of the angle errors at lobe position j, we can write the squared deviation from the average as three pieces:
(yij − avg)² = [(yij − avgj) + (avgj − avg)]²
             = (yij − avgj)² + 2(yij − avgj)(avgj − avg) + (avgj − avg)²
For each lobe position, we can add the pieces separately. Note that the sum of the middle
piece is zero because it is a constant times the sum of deviations from the jth position aver-
age. Looking at the sum for position j, we have
(y1,j − avg)² + ... + (y108,j − avg)² = [(y1,j − avgj)² + ... + (y108,j − avgj)²] + 108(avgj − avg)²
In this equation, the first part of the sum on the right is directly related to the standard devi-
ation of the output within position j, denoted stdevj. That is,
(y1,j − avgj)² + ... + (y108,j − avgj)² = 107 × stdevj²

Adding over the 12 lobe positions, we have

(y1,1 − avg)² + ... + (y108,12 − avg)²
   = 107 × (stdev1² + ... + stdev12²) + 108 × [(avg1 − avg)² + ... + (avg12 − avg)²]
   = 107 × (stdev1² + ... + stdev12²) + 108 × 11 × stdev(avg1, ..., avg12)²
Dividing both sides by 1295 (= 108 × 12 − 1) and taking the square root, we get, to a close approximation,

stdev ≈ √[(stdev1² + ... + stdev12²)/12 + stdev(avg1, ..., avg12)²]

If we could control the process so that the average angle error was constant for each position, then the overall standard deviation would be reduced from 71.5 to the within-position component √[(stdev1² + ... + stdev12²)/12].
In general, we can partition the overall standard deviation into variation within group
and group to group. Aligning the group averages reduces the standard deviation by elimi-
nating the variation due to group-to-group differences.
The division of the standard deviation into parts is the basis for the analysis of variance
(ANOVA), a powerful numerical analysis method used in the book and discussed in
Appendix D.
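The within-group/group-to-group partition can be checked directly. In this sketch the data are made up (three balanced groups of four values); the identity itself is exact for balanced groups:

```python
import statistics

# Three groups of four values each (balanced, as in the camshaft example)
groups = [[5.1, 4.8, 5.3, 5.0], [6.2, 6.0, 5.9, 6.3], [4.1, 4.4, 4.0, 4.3]]
k, m = len(groups), len(groups[0])
pooled = [v for g in groups for v in g]

# total sum of squares about the grand average
total_ss = (k * m - 1) * statistics.stdev(pooled) ** 2

# within-group piece: (m - 1) * stdev_j**2, summed over the groups
within_ss = sum((m - 1) * statistics.stdev(g) ** 2 for g in groups)

# group-to-group piece: m * (k - 1) * stdev(group averages)**2
avgs = [statistics.mean(g) for g in groups]
between_ss = m * (k - 1) * statistics.stdev(avgs) ** 2

# total_ss equals within_ss + between_ss exactly for balanced groups
```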
(Figure: the Gaussian curve, plotting frequency against the output characteristic, with the mean at the center of the symmetric curve and the standard deviation measuring its spread.)
We prefer the name Gaussian curve to the more usual Normal curve because of the
baggage that the word normal carries. We have been questioned many times by process
engineers and production managers who look at the output from some statistical software
that indicates that the process data is not “normal” and, hence, believe that something is
wrong in the process. There is nothing abnormal or unusual about a process with a histogram
of a different shape.
The area under the curve over any interval describes what proportion of the output values
fall in that interval. We can specify the curve by two parameters, the mean (or center of
the symmetric curve) and the standard deviation, a measure of the variation or spread of the
values. The mean and standard deviation associated with the model are directly analogous
to the avg and stdev of the histogram.
The Gaussian model has the beautiful property that it predicts that a fixed percentage
of the output will fall within the interval (mean ± c standard deviation) for any constant c.
For example, the percentages are approximately:
Within one standard deviation of the mean 68%
Within two standard deviations of the mean 95%
Within three standard deviations of the mean almost 100%
There is a strong connection between the interpretation of capability measures such as
Ppk and these percentages. If the process average is on target and we can describe the output
by a Gaussian model, we can directly relate the magnitude of Ppk and the proportion of
output that is out of specification:
These numbers should not be taken too seriously because a Gaussian model will never
describe the process output perfectly.
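Both the coverage percentages and the link between Ppk and the out-of-specification proportion follow from the Gaussian model: for a process centered on target, each specification limit sits 3 × Ppk standard deviations from the mean. A sketch using the standard error function:

```python
import math

def within(c):
    # proportion of a Gaussian output within (mean +/- c standard deviations)
    return math.erf(c / math.sqrt(2))

def out_of_spec(ppk):
    # proportion outside specs for a centered Gaussian process with the given Ppk
    return 1 - math.erf(3 * ppk / math.sqrt(2))

print([round(within(c), 3) for c in (1, 2, 3)])  # [0.683, 0.954, 0.997]
print(round(100 * out_of_spec(1.0), 2))          # 0.27 (percent)
```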
In the book we rarely rely on formal statistical analysis. However, much of the formal
analysis that we do present depends on the assumption that we can apply a Gaussian model.
We often assume we can capture the effects of measured inputs in the mean of the model.
The effects of all other varying inputs are lumped together in the standard deviation. For
example, to formally analyze the position-to-position differences of the angle errors in the
camshaft lobes, we assume that a Gaussian model can describe the output of the process
where each position has a separate mean to explain the effect of position. We also assume
that the variation within each position is the same since this variation captures the effect of
all other varying inputs. Thus we assume that the standard deviation within each position is
the same. We can describe the model pictorially using Figure S2.5 where we show the models
for only three of the lobes to make it easier to interpret.
(Figure S2.5: Gaussian models, frequency versus output, for lobes 4, 6, and 10.)
Chapter 4 Supplement
An Algorithm for Reducing
Variation
Understand the process → Eliminate errors → Remove slack → Reduce variation → Plan for continuous improvement
Each stage is further divided into a number of substages, used in completing the stage.
Scholtes provides many other processes to support the algorithm, especially to help with
the problem selection and the team operation.
Both the Six Sigma and Scholtes algorithms are designed to apply to processes in gen-
eral and to many different kinds of problems. They are not specialized to reducing variation
in high- to medium-volume manufacturing processes, as is Statistical Engineering.
In our view, these algorithms (including the one we propose) share two common elements:
• They are based on the diagnostic and remedial journeys described by Juran and
Gryna (1980) and Juran (1988), as given in the figure that follows. The idea is
that if we know the cause of the problem, we are more likely to find efficient
and effective remedies or solutions.
Chapter 5 Supplement
Obtaining Process Knowledge
Empirically
S5.1 ATTRIBUTES
In applying QPDAC to plan investigations, we specify many numerical attributes other than
averages, standard deviations, and proportions. Here we give some examples that can apply
to the target and study population or the corresponding sample.
Stratification
The first complication is stratification. For example, we may have a process with distinct
streams, such as two suppliers of the same component, three production teams (one for each
shift), four different parallel operations within a machining operation, and so on. In the truck
alignment process described in Chapter 1, there are four gages operating in parallel to measure
caster and other alignment characteristics (see Figure S5.1).
We can stratify the output by gage and define attributes such as the average and standard
deviation for each gage. If the averages are different, we can reduce the overall standard
deviation by improving the calibration process for the gages. If the standard deviation within
one gage is much larger than within the others, we can look within that gage to understand
why it is behaving differently. We can estimate these attributes by applying QPDAC.
(Figure S5.1: the truck alignment process, showing components such as the upper control arm, knuckle, and assembly, measured at one of four parallel gages.)
We exploit the idea of stratification and specifying attributes by strata to generate clues
about the dominant cause. See chapters 9 and 10.
Combined Attributes
Sometimes we express a question in terms of an attribute defined as a combination of two or more attributes. For example, in the supplement to Chapter 2, we defined the process performance measure Ppk as

Ppk = min(USL − avg, avg − LSL) / (3 × stdev)

where LSL and USL are the lower and upper specification limits. Ppk is a function of the population average and standard deviation. Another example is the capability index Cpk, where

Cpk = min(USL − avg, avg − LSL) / (3 × stdev(within))

Cpk is a function of the average and a different attribute, the within-subgroup standard deviation.
To define within-subgroup standard deviation, we specify what we mean by a subgroup.
We can choose as we please; often we define a subgroup as five consecutive parts from the
process or all parts within a specified (short) period of time. In each case, we are stratifying
the population of units into a large number of groups with no overlap. If all of the subgroups
have the same size, then we define the within-subgroup standard deviation as

stdev(within) = √[(stdev1² + stdev2² + ...) / k]

where stdev1, stdev2, ... are the standard deviations within subgroups 1, 2, and so on, and k is the number of subgroups. We average the squares and then take the square root to match the model behavior as discussed in Chapter 2.
We trust that you have noticed that Ppk and Cpk are different attributes because their
denominators are not the same. Hence questions that we ask in terms of Ppk and Cpk are also
different. We can see the connection between the denominators by noting the following result that we demonstrated in the supplement to Chapter 2:

stdev ≈ √[stdev(within)² + stdev(avg1, avg2, ...)²]

where stdev(avg1, avg2, ...) is the standard deviation of the subgroup averages.
Here we have stratified the population into subgroups. We have produced a very gen-
eral formula, because the definition of a subgroup was up to us. We interpret the result by
noting that the overall standard deviation is made up of a within-subgroup and a subgroup-
to-subgroup component. In looking for a dominant cause of variation, we use this interpre-
tation to eliminate from consideration causes associated with small components of the
overall variation.
Note that the overall standard deviation and the within-subgroup standard deviation are
almost equal when the standard deviation of the subgroup averages is small, that is, when
all of the averages are about the same. In other words, the two attributes Ppk and Cpk will be
almost the same when most of the variation in process output occurs within the subgroups,
not subgroup to subgroup.
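A short sketch makes the distinction concrete. The subgroup data and the ±3 specification limits below are hypothetical; because the subgroup averages differ, the total standard deviation exceeds the within-subgroup value and Ppk falls below Cpk:

```python
import statistics

# Three subgroups of five consecutive parts; specs assumed at -3 and +3
subgroups = [
    [0.1, -0.2, 0.0, 0.3, -0.1],
    [1.1, 0.9, 1.2, 0.8, 1.0],
    [-0.9, -1.1, -1.0, -0.8, -1.2],
]
lsl, usl = -3.0, 3.0

pooled = [v for g in subgroups for v in g]
avg = statistics.mean(pooled)
stdev_total = statistics.stdev(pooled)

# within-subgroup stdev: average the squared stdevs, then take the square root
stdev_within = (sum(statistics.stdev(g) ** 2 for g in subgroups) / len(subgroups)) ** 0.5

ppk = min(usl - avg, avg - lsl) / (3 * stdev_total)
cpk = min(usl - avg, avg - lsl) / (3 * stdev_within)
# here cpk > ppk because the subgroup averages are far apart
```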
Σ(yi − a − bxi)²

where the sum is over all units in the population. We denote the fitted line in the population
by the equation
y = α + βx
The Greek letters α (alpha) and β (beta) are numerical attributes of the population. The
slope β describes how much, on average, y will increase for a unit increase in x. Given a
sample of parts from the study population, we estimate these attributes with the correspon-
ding sample quantities. See Appendix E for more details on how to fit the line and create
plots like Figure S5.2 for the data in the sample.
(Figure S5.2: scatterplot of the output y versus the input x, with the fitted line.)
Looking at Figure S5.2, we see that if x were held constant, the variation in y would be
much smaller than when x is allowed to vary. For example, if x is held near 10, then from
the scatterplot, we see that y varies from about 25 to 40. Overall, y varies from about 10 to
60. We use this idea to quantify the contribution of a single cause to the overall variation.
We can partition the standard deviation of y into two components:

stdev(y) = √{[β × stdev(x)]² + stdev(rest)²}
The second component under the square root sign describes the variation that would
remain if x were held fixed. We can picture this attribute on the scatterplot as the variation
around the fitted line. The first component describes the contribution of the input x to the
variation in the output y. If x is a dominant cause of variation in the output, stdev(rest) is
small compared to stdev(y).
Given a sample from the population, we can estimate stdev(y) and the component contri-
butions using MINITAB. Then we can decide if x is a major contributor to the variation in y.
This is the easy part. Now we must find the parts corresponding to the sampled identifica-
tion codes. This can be difficult if the process is complex and parts do not appear in the
expected order. Finding the sampled units can take a long time and be expensive.
We once consulted on a project (by telephone only) where we naively recommended
selecting a random sample of 50 water heaters from a lot of 2200 heaters. Each heater had
a unique serial number. The heaters were stored in boxes in a warehouse. We provided a list
of 50 serial numbers corresponding to a random sample of heaters. The sampling protocol
fell apart immediately because it was not possible to locate the serial number without open-
ing the box, and furthermore, it was very difficult to get at certain heaters due to the method
of storage. In the end, a convenience sample was selected from locations spread throughout
the warehouse. Note that when we discuss experimental plans in more detail, we will use
random assignment extensively. This is a different use of randomization not to be confused
with random sampling.
Sample size is a major issue. The most frequent question asked of a statistical consultant
is, “How large a sample should I use?” The answer, like that to all statistical questions, is,
“It depends!” For a given protocol, the larger the sample, the greater the chance of small
sample error. However, even with a very large sample, we may be unlucky and have sub-
stantial sample error. To determine sample size, the first consideration is how the sampling
fits into the rest of QPDAC. For instance, if there are likely to be large study and measure-
ment errors, it makes little sense to try to get very small sample error. Second, there are cost
and time constraints that often outweigh any consideration of sample error. We can often
answer the sample size question based on what the team can afford.
There are some formal procedures that can be used to determine sample sizes if we
have random sampling (or another sampling protocol where we pretend that the sampling
will be random) and simple attributes in the Question step. We do not discuss these for-
mally here. See Odeh and Fox (1975), Nelson (1985), and Neter et al. (1996). Also
MINITAB has a “power and sample size” function. To understand how these methods work,
consider the following example.
We were once asked how large a sample would be needed to determine if there were
any defective rails in a suspect lot of 5700 rails. The defect was a vertical crack in the head
of the rail that might cause derailment of a train. To detect the defect, the railway used a
destructive test. The management wanted to be 99% confident that there were no defective
rails in the lot. To answer the question, from a statistical perspective, the situation requir-
ing the largest sample size would occur if there were exactly one defective rail in the lot.
Suppose this was the case, that is, that there was one defective in the lot and we plan to
choose a sample of n rails at random. We want to be 99% sure that we find the defective
rail. Simple calculations show that we need to sample 99% of the rails. This was com-
pletely infeasible; sampling could not provide the information required in this case. The
only feasible solutions were to scrap the lot of rails or to accept the risk of an accident.
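The "simple calculations" are worth spelling out: with exactly one defective in a lot of N rails, a random sample of n contains it with probability n/N, so 99% confidence requires sampling at least 99% of the lot:

```python
import math

N = 5700            # rails in the suspect lot
confidence = 0.99   # required chance of finding the single defective rail

# P(defective rail is in the sample) = n / N, so we need n >= confidence * N
n_needed = math.ceil(confidence * N)
print(n_needed)  # 5643 rails, i.e. 99% of the lot
```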
In many QPDAC applications, we will use data that are already available; that is, data
that have been collected for some other purpose. For example, we may examine scrap
records or measurements taken as part of the process control plan. These data can be helpful
but it is wise to think about how they were collected. Are we likely to be misled because of
large study or sample error?
In summary, we most often recommend systematic sampling to cover the study population. Systematic sampling is relatively easy to implement. Choose a sample size as large as you can afford, remembering that there are likely to be more investigations before the problem solving is complete.
S5.4 OUTLIERS
An outlier is an unexpectedly large or small value. The occurrence of outliers in any empirical
investigation can have a large effect on the conclusions. We try to identify outliers and con-
trol their impact.
An outlier can have a large influence on the estimate of the standard deviation, one of
the attributes we use most often. We use an artificial example to illustrate. Suppose we take
a sample of 100 parts from a process and measure an output characteristic (denoted y). We
present the data summary from MINITAB and the process histogram in Figure S5.3.
Descriptive Statistics: y
Variable N Mean Median TrMean StDev SE Mean
y 100 5.081 5.000 5.049 1.059 0.106
The first value in the data set is 11.4, an apparent outlier. We can study the effect of the
outlier on the sample standard deviation by replacing 11.4 by a number of values between
1 and 20 (denoted x) and then recalculating the sample attribute. We plot the results in
Figure S5.4.
Figure S5.4 Effect of changing a single value on the sample standard deviation.
We see that the standard deviation is very sensitive to changes in a single value in a sam-
ple of size 100. If the first value in the sample had been 5 (close to the average) rather than
11.4, the standard deviation would have been reduced from 1.06 to 0.85, a 20% reduction.
What should we do if we find an outlier? We first look at the data collection procedure
to ensure that no blunder occurred, such as incorrectly entering the data into the spread-
sheet. If a blunder occurred and it cannot be easily corrected, we have a number of options.
We can remove the observation from the analysis. This is the preferred approach for inves-
tigations such as estimating the problem baseline (see Chapter 6) where there are many
observations under similar conditions. For other investigations such as designed experi-
ments, where we make relatively few measurements under a number of combinations of
inputs, we can repeat the observation or replace the outlier by an estimate of its value made
from the remaining observations.
If we cannot attribute the outlier to a blunder, we have a much more difficult decision.
First, if eliminating or changing the outlier has no effect on conclusions then we may ignore
its presence. If the outlier does have an effect, the key question is whether the dominant
cause of the outlier is the same as the cause of the problem we are trying to address. This
question is unfortunately not answerable without considerable process knowledge. It is pos-
sible that the outlier comes from a different, perhaps rare, failure mechanism that we are not
trying to address. On the other hand, the outlier could be an extreme example of the
action of the dominant cause of the current problem. In the former case, the presence of the
outlier will make finding the cause and solving the problem more difficult, since the outlier
will cloud the effects we are looking for. In the latter case, the outlier may provide a great
deal of useful process knowledge. Consideration of outliers is especially critical when using
the idea of leverage, as defined in Chapter 9.
Chapter 6 Supplement
Defining a Focused Problem
The estimated standard deviation is 3.32 microns. We can get an approximate 95% confidence interval for the study population standard deviation of the form

(c1 × stdev, c2 × stdev)

where c1 and c2 are found in Table S6.1. The given constants can be derived using a Gaussian
assumption (as discussed in Chapter 2) to model the distribution of output values. The
degrees of freedom are related to the sample size. Technically, in the previous calculation,
the degrees of freedom are one less than the sample size. In general, for large sample sizes
(greater than 30), you can replace the degrees of freedom by the sample size.
In the V6 piston diameter example, the confidence interval for the study population
standard deviation is (3.08, 3.59) microns. We can be confident that the standard deviation
in the study population is in this range. Even though we used a large sample size, we can
only be confident that the attribute is within about ±7% of the estimate. In general, from
Table S6.1 we note that the process standard deviation is poorly estimated when the sam-
ple size is small.
Table S6.1 Constants c1 and c2 for approximate 95% confidence intervals for the standard deviation.

Degrees of freedom    c1      c2
10                    0.69    1.75
20                    0.77    1.44
30                    0.80    1.34
40                    0.82    1.28
50                    0.84    1.24
60                    0.85    1.22
80                    0.87    1.18
For the average, the approximate 95% confidence interval has the form avg ± 2 × stdev/√n or, in this case, 590.85 ± 0.30. In other words, we can be confident that the study population average diameter falls in the range (590.55, 591.15) microns.
For many other attributes, the form of the confidence interval is the same as for the average, that is,

estimate ± 2 × stdev(estimate)

where stdev(estimate) is the standard error of the estimate. We can often find both the estimated attribute and the standard error of the estimate in the MINITAB summary.
For estimating a small proportion, such as the proportion defective, we get a rough idea of the precision of the estimate from a confidence interval of the form

p̂ ± 2√[p̂(1 − p̂)/n]

where p̂ is the estimated proportion and n is the sample size.
is no sample error. The baseline performance measure was the pull standard deviation calcu-
lated over the two-month period. We assumed that this two-month period was long enough so
that the standard deviation over this study population would describe how the process would
perform into the future if no changes were made. In the language of QPDAC, we assumed
that there was little study error.
For the purpose of discussion, suppose that a control chart had been in place at the four
gages at the start of the project. There are many sampling protocols that could be used to
generate control charts. The process may be stable using some of these protocols and unstable
using others. In other words, stability is not solely a property of the process; it also depends
on the sampling and charting procedure used. Suppose that in our example, the chart was
based on pull measured on five consecutive trucks, once per shift. In Figure S6.1, we show the X̄ chart based on the first five trucks produced after 2:00 A.M., 10:00 A.M., and 4:00 P.M.
sampled from the data set truck pull baseline for all days.
The process is unstable with respect to this charting procedure. That is, there are
causes (varying inputs) that change from shift to shift within the two-month period that
produce the systematic patterns seen on the chart. The effects of these causes contribute to
the overall standard deviation. Our assumption about study error is that in the future, the
control chart would show similar patterns of instability. This is an assumption and the control
chart here provides no help in deciding if the assumption is reasonable or not. If we were
concerned with the assumption and if the data were available, we could examine a sequence
of two-month periods in the past to see if such a time frame captures most of the process
variation.
In summary, the selection of a study population with a time frame long enough to well
represent the future is a key issue in establishing a baseline. Stability within the selected
study population is not the issue. We emphasize the word within because the baseline standard
deviation describes the whole two-month period.
Whether the process is stable or not, we may use control chart data to quantify the
baseline and provide clues about the dominant cause.
Figure S6.1 X̄ chart for truck pull.
Chapter 7 Supplement
Checking the Measurement System
From the baseline investigation we have a good estimate of P(pass). This is analogous to obtaining an estimate of stdev(total) from the baseline in a problem with a continuous output. The key formula relating the properties of the measurement system and the manufacturing process is

P(pass) = P(pass|good)P(good) + P(pass|bad)P(bad)    (S7.2)

The proportions P(good) and P(bad) = 1 − P(good) are properties of the manufacturing process. The misclassification rates P(fail|good) = 1 − P(pass|good) and P(pass|bad) are properties of the measurement system. For a binary output, Equation (S7.2) corresponds to
the partition of the overall standard deviation into components due to the measurement system
and the manufacturing process for a continuous output.
We can derive Equation (S7.2) by noting that

P(pass) = P(pass and good) + P(pass and bad)

and

P(pass and good) = P(pass|good) × P(good),  P(pass and bad) = P(pass|bad) × P(bad)
If instead the baseline performance is specified by P(good | pass), we can rewrite Equa-
tion (S7.1) as
to separate the effects of the measurement system from those of the rest of the process.
In summary, to assess the binary measurement system we estimate the misclassification
rates P(pass|bad) and P(fail| good), attributes of the measurement system in the population
of all future measurements. If these proportions are too high, then the measurement system
is not adequate and must be improved before proceeding to the next stage of the Statistical
Engineering algorithm.
We illustrate these ideas using an example in which credit cards were classified pass or
fail by an automated visual inspection system. The aim was to find visual defects such as
missing parts of the intended design, surface scratches, bleeding of the colors, fuzzy letters
and numbers, and so on. The measurement system used a digital image of the front of each
card to calculate hundreds of summary measures based on comparing the picture to a tem-
plate of the ideal card. If none of the summary measures fell outside the prespecified ranges,
the measurement system passed the card. In the baseline investigation, the team monitored
the measurement system for more than one week and found an 89.5% pass rate. About 30,000
cards were measured each hour. The goal was to reduce the proportion of bad cards being
shipped to the customer P(pass|bad) without substantially decreasing the pass rate P(pass).
Plan 1
We start with equal-sized groups of passed and failed parts. We select each group over a
period of time and over a range of conditions so that they represent the long-run popula-
tions of passed and failed parts. Then we determine the true state (good or bad) of each part.
In the credit card example, the team selected 40 passed and 40 failed cards per day for five days. Then human inspectors classified each card as good or bad. The data are summarized in the following table.

                 Good    Bad    Total
Passed cards      195      5      200
Failed cards       16    184      200

From these data, we estimate P(good|pass) = 195/200 = 0.975 and P(good|fail) = 16/200 = 0.080.
With a derivation like the one used to obtain Equation (S7.2), we can express the proportion of good parts as

P(good) = P(good|pass) × P(pass) + P(good|fail) × P(fail)

Substituting the estimates for P(good|pass) and P(good|fail) and the overall pass rate 0.895, we estimate the proportion of good cards in the population as

P(good) = 0.975 × 0.895 + 0.080 × 0.105 = 0.881
The team decided that the measurement system needed improvement because the
estimate for P(pass | bad) was large. If the team could reduce this misclassification rate,
the proportion of bad cards shipped to the customer would decrease from 0.025.
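Restating the Plan 1 arithmetic as a sketch (the pass rate and counts are from the example above):

```python
# Plan 1: indirect estimate of P(good) from the conditional proportions
p_pass = 0.895
p_good_given_pass = 195 / 200   # 0.975
p_good_given_fail = 16 / 200    # 0.080

p_good = p_good_given_pass * p_pass + p_good_given_fail * (1 - p_pass)
print(round(p_good, 3))  # 0.881
```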
Plan 2
Here we start with equal-sized groups of good and bad parts. We select each group over a
period of time and over a range of conditions so that they represent the long-run populations
of good and bad parts. Using the credit card example, the operators examined cards until they had 200 good cards and 200 bad cards. Next, the cards were classified by the measurement system with the following results.

               Pass    Fail    Total
Good cards      190      10      200
Bad cards         8     192      200
From these data, we can directly estimate the misclassification rates as P(fail| good) =
10/200 = 0.05 and P(pass|bad) = 8/200 = 0.04. Since we also know P(pass) = 0.895, we
can estimate P(good) using Equation (S7.2):
Substituting the estimates, we have 0.895 = 0.95 × P(good) + 0.04 × (1 − P(good)), and solving, we find

P(good) = (0.895 − 0.04) / (0.95 − 0.04) = 0.940
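The same solve as a sketch (counts from the example above):

```python
# Plan 2: solve P(pass) = P(pass|good) * P(good) + P(pass|bad) * (1 - P(good))
p_pass = 0.895
p_pass_given_good = 190 / 200   # 0.95
p_pass_given_bad = 8 / 200      # 0.04

p_good = (p_pass - p_pass_given_bad) / (p_pass_given_good - p_pass_given_bad)
print(round(p_good, 3))  # 0.94
```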
Comments
The second plan gives direct estimates of the misclassification rates and an indirect estimate of P(bad|pass), whereas the first plan provides indirect estimates of the misclassification rates and a direct estimate of P(bad|pass). Which plan is preferred depends on costs and the way the baseline is specified. The second plan will not be feasible in many cases because it will be difficult to get representative samples of good and bad parts.
If we decide that the misclassification rates are too high, we can investigate families
of causes such as operator-to-operator or time-to-time by repeating the investigation over
different time periods or with different operators.
The estimates produced by these plans will be imprecise unless we have large samples.
In the first plan using the credit card example, we can be confident that the estimate
P(pass | bad) = 0.188 is within about ±0.06 of the actual misclassification rate. To increase
the precision of the estimate, we need to determine the characteristic (good or bad) of more
than 200 passing and failing cards.
See Boyles (2001) and AIAG (1995a) for more details on alternative ways of assessing
variation due to binary measurement systems. The AIAG method is not feasible in the credit
card example because it assumes a measurable continuous output has been discretized to
give the binary output.
Here we discuss the planning and analysis of an investigation to assess a destructive meas-
urement system for a continuous characteristic. A measurement system is destructive if we
change the true value of the characteristic by making the measurement. Tensile strength is a
good example. We cannot make repeated measurements on the same part when the system
is destructive.
To assess measurement variation, we select parts that we hope have identical or very
similar characteristic values. If we have many similar units, we recommend the plan and
analysis in sections 7.2 and 7.3. If there are only pairs of similar units, we can use an inves-
tigation like an Isoplot (see Section S7.4).
The following example of assessing a destructive measurement system for the tensile
strength of tin-plated steel is motivated by Spiers (1989). In the baseline investigation, the
team found that the overall variation in strength was 2.5 KSI (thousands of pounds per
square inch) relative to the tolerance ±6.0 KSI measured from nominal.
The next step was to check the measurement system. To determine tensile strength, the
operator:
1. Cut a standard-sized sample from a sheet of steel
2. Calculated the cross-sectional area of the sample with a micrometer
3. Pulled the sample apart in a tensometer
The tensile strength is the ratio of the maximum load to the cross-sectional area.
In the measurement investigation, 30 standard-sized pieces were cut from three differ-
ent sheets of steel, parallel to the rolling direction and one-quarter distance from the edge
of the sheet to eliminate variation in thickness and tensile strength due to edge or crown
effects. The sheets were chosen to cover the full extent of variation in strength seen in the
baseline investigation. The 30 pieces from each sheet were then randomly divided into
three groups of 10. The three appraisers each measured the tensile strength of all 30 pieces
(10 pieces from each sheet). We give the data in the file tin plate strength measurement.
In the analysis, we assume that all the pieces cut from the same sheet have the same
tensile strength. This is analogous to measuring the same part a number of times in the
nondestructive measurement case. Figure S7.1 shows the measured tensile strength versus
the sheet number. A numerical summary is:
stdev(due to measurement) = √((1.5² + 1.413² + 1.406²)/3) = 1.44

where 1.5, 1.413, and 1.406 are the standard deviations of the measurements within each sheet.
Using the estimate stdev(total) of 2.50, obtained from the baseline investigation, and Equation
(7.2), we estimate

stdev(due to process) = √(2.50² − 1.44²) = 2.04 and D = 2.04/1.44 ≈ 1.4

Since this ratio is small (that is, D < 3), we cannot be sure if the fault lies in the measurement sys-
tem or if there is large variation within the samples taken from the same sheet.
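These calculations are easy to script. The following Python sketch (our own code; the book works in MINITAB) pools the within-sheet standard deviations and forms the discrimination ratio; the numbers 1.5, 1.413, 1.406, and 2.50 are taken from the example above:

```python
import math

def pooled_stdev(stdevs):
    """Pool standard deviations from groups of equal size:
    the square root of the average within-group variance."""
    return math.sqrt(sum(s ** 2 for s in stdevs) / len(stdevs))

# Within-sheet standard deviations from the tin plate investigation
stdev_meas = pooled_stdev([1.5, 1.413, 1.406])             # about 1.44

# Baseline (total) variation and the square root formula
stdev_total = 2.50
stdev_process = math.sqrt(stdev_total ** 2 - stdev_meas ** 2)  # about 2.04

# Discrimination ratio, Equation (7.2)
D = stdev_process / stdev_meas                             # about 1.4
```

With D ≈ 1.4, well below 3, the calculation reproduces the team's conclusion that the measurement system needs improvement before the search for a dominant cause continues.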
In the example, the team concluded that they should improve the measurement system
before continuing further with the Statistical Engineering algorithm. In further analysis of the
measurement results, as shown in Figure S7.2, the team found that some of the measurement
variation was due to the differences between the operators, with operator three exhibiting
more variation than the other two operators.
To estimate measurement bias with a destructive measurement system, it is necessary
to use parts with known characteristic value. These standard parts are difficult to obtain
unless there is some other (better) nondestructive measurement system.
We estimate the repeatability as 0.593. Since the estimate of the overall measurement stan-
dard deviation was found earlier to be 0.756, the repeatability is a large component.
We give the operator averages in the following MINITAB summary:
• There is no time-to-time component in the plan. Most R&R studies are carried
out as quickly as possible and miss the day-to-day or week-to-week component
of the measurement variation. The R&R likely underestimates the overall
measurement system variation.
• The use of 10 parts selected over a short time is not sufficient to produce a reliable
estimate of the variation of the true part dimensions in the process. Rather, we
should use the available baseline information.
To illustrate, we use the camshaft journal diameter context.
Data generated to be realistic are given in the data file camshaft journal diameter measure-
ment gageRR. We simulated the use of three operators who measure each of the 10 parts
twice for a total of 60 measurements.
The plan we recommended in Section 7.2 requires the same number of measure-
ments in total as the typical R&R investigation but uses fewer parts and takes longer. We
can add multiple gages and time periods to the R&R plan—see the AIAG Measurement
System Analysis manual (1995)—but then the standard MINITAB gage R&R routine
cannot be used.
In any case, with the traditional gage R&R analysis, stdev(due to process) is poorly
estimated. In Table 6.3, the relative precision for estimating a standard deviation with a
sample size of 50 is about ±20%. For a sample size of 10, the relative precision is approxi-
mately ±50%. In other words, there is huge uncertainty.
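The quoted precision figures can be approximated from the large-sample standard error of a sample standard deviation, se(s) ≈ σ/√(2(n − 1)). A Python sketch (our own approximation, not the book's Table 6.3):

```python
import math

def relative_precision(n, z=1.96):
    """Approximate 95% relative precision (as a fraction) of a sample
    standard deviation based on n observations, using the large-sample
    result se(s) ~ sigma / sqrt(2 * (n - 1))."""
    return z / math.sqrt(2 * (n - 1))

for n in (10, 30, 50):
    print(n, round(100 * relative_precision(n)))
```

The resulting values, roughly ±46%, ±26%, and ±20%, line up with the ±50%, ±25%, and ±20% quoted in the text for sample sizes 10, 30, and 50.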
Alternatively, we recommend estimating the overall variation in the baseline investigation,
which uses hundreds of parts. We can then calculate the variation in the true values, stdev(due
to process), using the square root formula

stdev(due to process) = √(stdev(total)² − stdev(due to measurement)²)
To use this idea in the MINITAB gage R&R analysis, we need to enter the historical
standard deviation. The detailed MINITAB results for the gage R&R are:
Gage R&R
Source VarComp
In the example we have 100/D = 14.42, or D = 6.9. Thus, we conclude that the measure-
ment system is adequate. Note that by providing an estimate for stdev(overall) we have not
used the estimate for stdev(due to process) derived just from the gage R&R (6.89 in the
example).
The estimate for stdev(due to measurement) derived from the gage R&R results differs
from the 0.756 we obtained in Section 7.2 based on the same data. The gage R&R results are
determined using a random effects ANOVA model, while the Section 7.2 results come from a
fixed effects ANOVA model with different assumptions. See Feder (1974) for a discussion of
the difference between fixed and random effects models. For our purposes, neither model is
ideal. However, since we are looking for large effects, either model would suffice.
The AIAG standard for measurement systems is that the measurement variation must
be less than 30% of the overall variation. This criterion corresponds to a discrimination
ratio D, as defined by Equation (7.2), greater than 3.2. We recommend a more liberal cutoff
point for problem solving, since by using the baseline data we have a better estimate for
stdev(process). The more stringent AIAG standard applies to measurement systems used
for process control and part inspections.
The gage R&R analysis should be supplemented by graphical displays to check assump-
tions underlying the numerical calculations. The useful default plots automatically provided
by MINITAB are given in Figure S7.4.
We can examine the various graphs looking for evidence of outliers and interactions
between operators and parts. In the example, we see no concerns.
Figure S7.4 MINITAB default gage R&R (ANOVA) plots: components of variation, R chart by
operator, Xbar chart by operator, measurement by part, measurement by operator, and the
operator*part interaction.
Isoplot
From the ANOVA results, we estimate stdev(due to measurement) as 0.829 and we can estimate
stdev(due to process) as √(6.603² − 0.829²) = 6.55. The plan uses 30 parts to estimate the
process variation. This is an improvement over the gage R&R. With only 30 parts, the rel-
ative precision for estimating a standard deviation is approximately ±25%. We prefer to
derive an estimate for stdev(due to process) by combining the results of the baseline and
measurement investigations.
We also plot the first versus second measurement on each of the selected parts as in
Figure S7.5. This scatter plot is sometimes called an Isoplot. We can see the process variation
by looking at the spread of values on either axis. Note that a point on the 45° line corre-
sponds to the two measurements on that part being identical. If all the plotted points are
clustered tightly around the 45° line, then we know that the measurement system standard
deviation (or the appropriate component, depending on the plan) is small relative to the
process variation.
Figure S7.5 Scatter plot of the measurement results (dashed line represents
“measurement 1” = “measurement 2,” the 45° line).
The scatter plot is valuable because we can easily see the comparison of the measurement
variation and process variation. We can also identify any outliers on the plot.
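The Isoplot also supports a quick numerical summary. If m1 = true + e1 and m2 = true + e2 are two measurements on the same part, then var(m1 − m2) = 2 var(e), so the standard deviation of the differences, divided by √2, estimates the measurement standard deviation. A Python sketch on simulated data (our own illustrative numbers, not the book's file):

```python
import math, random

random.seed(1)

# Simulated example: true part values with stdev 6.5,
# measurement errors with stdev 0.8 (both assumptions of ours)
true = [random.gauss(0, 6.5) for _ in range(30)]
m1 = [t + random.gauss(0, 0.8) for t in true]
m2 = [t + random.gauss(0, 0.8) for t in true]

def stdev(x):
    """Sample standard deviation."""
    xbar = sum(x) / len(x)
    return math.sqrt(sum((v - xbar) ** 2 for v in x) / (len(x) - 1))

# Spread perpendicular to the 45 degree line reflects measurement
# error alone, since var(m1 - m2) = 2 * var(measurement)
stdev_meas = stdev([a - b for a, b in zip(m1, m2)]) / math.sqrt(2)

# Spread along either axis reflects the overall variation
stdev_total = stdev(m1)
print(stdev_meas, stdev_total)
```

In a tight Isoplot, stdev_meas comes out small relative to stdev_total, matching the visual impression of points clustered around the 45° line.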
The Isoplot method has several deficiencies:
• There is no time-to-time component in the plan. Most Isoplot investigations are
carried out as quickly as possible and miss the day-to-day or week-to-week
component of measurement variation.
• Since all second measurements are conducted under identical conditions, only one
family of causes that influence measurement variation is assessed at a time.
• The use of 30 parts and 30 repeated measurements is not enough to get good
estimates of the process and measurement variation, respectively.
Bhote and Bhote (2000) recommended the use of the Isoplot and a minimum measure-
ment discrimination ratio D of 5 before proceeding to the next stage of their process-
improvement algorithm. We believe this is too conservative.
To detect differences in the two groups, we compare the observed sample averages in
the two groups. Under conceptual repeated sampling and measuring, the standard error
of an average decreases by a factor 1/√n as the sample size n increases. As a result,
the sample size needed to maintain the desired power is greater by a factor
(Figure: frequency distributions of measurement error for operators 1, 2, and 3.)
The overall system bias can be zero even if each operator bias is not. Also, the differ-
ences in operator bias contribute to measurement variation. This is another example of the
combination of group-to-group and within-group variation that we discussed in Chapter 2
and its supplement.
In looking for opportunities to improve a measurement system, we can stratify by
operator. Suppose that we identify large differences among the operator-specific attributes.
Then, we know that these differences arise because of differences in the methods among
the operators. If we can standardize the method, we will see substantial improvement in
the overall measurement system attributes.
We use the camshaft journal diameter data from the plan described in Section 7.2 to
look at differences among operators. In this plan, each operator measured each part six
times. Since we do not know the true value of the diameters, we calculate the deviation
“diameter minus part average”
for each of the 54 measured diameters. We show these data plotted by operator in Figure S7.7.
The three operators have roughly the same measurement variation, but some (a small
component) of the total variation comes from differences in relative bias between the operators,
since the average “diameter minus part average” differs between operators.
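Stratifying the deviations by operator is a short calculation in any tool. A Python sketch with made-up deviation values (the real data are in the camshaft journal diameter file):

```python
import statistics

# Hypothetical "diameter minus part average" values by operator;
# the averages estimate relative bias, the standard deviations the
# operator-specific measurement variation
deviations = {
    1: [0.3, -0.1, 0.2, 0.0, -0.2, 0.1],
    2: [-0.4, -0.2, -0.3, -0.1, -0.5, -0.3],
    3: [0.2, 0.4, 0.1, 0.3, 0.0, 0.2],
}

for op, vals in deviations.items():
    print(op, round(statistics.mean(vals), 2),
          round(statistics.stdev(vals), 2))
```

A spread in the per-operator averages (here, roughly 0.05, −0.3, and 0.2) signals relative bias; a spread in the per-operator standard deviations signals differing measurement variation.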
We may see many different patterns in plots such as Figure S7.7 that will lead us to
different actions such as:
• No differences among operator measurement bias and variation—look
elsewhere for opportunities to improve the system.
• Different measurement variation or different relative biases for each operator—
look for systematic differences among the methods used by the different
operators.
We can do the same analysis with other inputs to the measurement process such as gages,
environmental inputs, and time.
Chapter 9 Supplement
Finding a Dominant Cause Using the
Method of Elimination
Figure S9.1 Example of conventional brainstorming to list all potential dominant causes.
The cause-and-effect diagram is a hindrance for the method of elimination. The fami-
lies are not constructed with the idea of elimination. In the output shaft example, we can-
not think of an investigation that would eliminate all but one of the five families. If the main
branches are considered families, there are too many families to start and time is wasted
getting the detailed list of causes within each main branch. We have been involved in proj-
ects where considerable time was spent arguing about the labels for the main branches of
the cause-and-effect diagram and which causes went where.
The second difference is the method of prioritizing the causes. With the method of
elimination, at each step there are only a few families, and we can let the process do the voting
by carrying out an appropriate investigation. On the other hand, with a cause-and-effect
diagram and multivoting, for example, there is no guarantee that the dominant cause is
anywhere near the top or even on the final list.
In the method of elimination, we assume there is a single dominant cause of the baseline
variation. This assumption can be wrong in two ways:
• There are two (or perhaps three) causes, each with a large effect.
• There is no dominant cause, only a large number of causes, each with a
relatively small effect.
If there are two (or three) dominant causes, each with a large effect, that live in different
families, we cannot rule out either of those families. We then investigate the families
separately to search for the dominant causes. Once we isolate the dominant causes, we
may need to choose a different variation reduction approach to address each of them.
The second case in which there is no dominant cause contradicts the Pareto Principle.
We believe this to be a rare occurrence. We may find a dominant family but no single dominant
cause within that family. Or we may rule out all but one of the families and then find, at the
next split, that there are two dominant families. As we partition the causes further, we will
never be able to explain the full extent of variation found in the baseline investigation with
a family of only a few causes.
In our experience, the search for the dominant cause is often abandoned in these cases.
The consequence is that none of the cause-based variation reduction approaches is an
available remedy. We must select a working approach that does not require the identifica-
tion of the cause.
Chapter 10 Supplement
Investigations to Compare Two
Families of Variation
The important result in the ANOVA output is the Pooled StDev, which is an estimate of
the process standard deviation if all of the position averages are equal. In this example, the
Pooled StDev equals
Pooling the standard deviations within each position is the method that we used to estimate
the measurement system variation in Chapter 7, where we pooled the standard deviations
within each part.
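In both settings, pooling averages the within-group variances, weighting by degrees of freedom, and takes the square root. A Python sketch with illustrative numbers:

```python
import math

def pooled_stdev(stdevs, sizes):
    """Pool within-group standard deviations, weighting each group's
    variance by its degrees of freedom (n_i - 1)."""
    num = sum((n - 1) * s ** 2 for s, n in zip(stdevs, sizes))
    den = sum(n - 1 for n in sizes)
    return math.sqrt(num / den)

# Hypothetical within-position standard deviations for four positions,
# each based on 54 parts (the values are illustrative only)
s = pooled_stdev([5.2, 4.8, 5.5, 5.0], [54, 54, 54, 54])
print(s)
```

With equal group sizes, the weighting drops out and the result is simply the root of the average within-group variance.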
Since the baseline standard deviation for rod thickness was 11.023 (see Chapter 6), we
can make substantial improvement by aligning the position averages. From the plots pre-
sented earlier, we need to increase the thickness center for position 3. We can adopt the
Move the Process Center approach to learn how to make the change. We can use the ANOVA
results to assess the maximum benefit from this process change.
We consider a second example to illustrate some further complexities in interpreting
ANOVA results.
Block Porosity
Consider the block porosity example in Chapter 10. The data are given in the file engine
block porosity multivari. Recall that porosity was measured on five blocks molded consec-
utively, once per hour over one shift. The goal was to identify which of the two families,
hour-to-hour or mold-to-mold, was the home of the dominant cause. We apply a one-way
ANOVA with porosity as the output and hour as the only term in the model. The results are:
The Pooled StDev, 72.3, at the bottom of the MINITAB results gives an estimate of the
variation due to causes that act in the mold-to-mold family.
Using the overall standard deviation, normally available from the baseline investigation,
we estimate the component of the variation due to the hour-to-hour family as

stdev(due to hour-to-hour family) = √(stdev(total)² − 72.3²)
Here the team did not conduct a baseline investigation for the new measure of
porosity. They judged that they had seen the full extent of variation during the one-day
investigation, because the proportion of scrapped blocks matched the historical level.
As a result, we estimate the total variation by looking at the variation in porosity over
the course of the investigation. We can use MINITAB to produce this estimate:
stdev(total) = 107.3.
An alternative is to take into account the special structure in the data. We can indirectly
estimate the total variation using the available data by specifying hour as a random effect
in the one-way ANOVA. Here the MINITAB results are:
Based on this result, we estimate the variation associated with the hour-to-hour family
as √7017 = 83.8, and the total variation as √(7017 + 72.3²) = 110.7.
In this example, the two methods give similar estimates for the total variation.
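For a balanced one-way layout, the random-effects calculation can be sketched by hand: the between-group variance component is estimated by (MS between − MS within)/n, where n is the group size. A Python illustration with made-up data (not the porosity file):

```python
import math

def variance_components(groups):
    """Estimate within- and between-group variance components from a
    balanced one-way layout, as a random-effects ANOVA would:
    between component = (MS_between - MS_within) / n."""
    g, n = len(groups), len(groups[0])
    grand = sum(sum(grp) for grp in groups) / (g * n)
    means = [sum(grp) / n for grp in groups]
    ms_within = sum(sum((x - m) ** 2 for x in grp)
                    for grp, m in zip(groups, means)) / (g * (n - 1))
    ms_between = n * sum((m - grand) ** 2 for m in means) / (g - 1)
    var_between = max(0.0, (ms_between - ms_within) / n)
    return ms_within, var_between

# Tiny illustration: three "hours," four observations each
within, between = variance_components([[10, 12, 11, 13],
                                       [20, 22, 21, 23],
                                       [15, 17, 16, 14]])
total_stdev = math.sqrt(within + between)
print(within, between, total_stdev)
```

The total standard deviation is then the root of the sum of the two variance components, which is how the 110.7 in the text is assembled from 7017 and 72.3².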
Here, neither family is dominant. This is not surprising since we only resort to ANOVA
when the conclusion from the plots is not obvious. However, we see that if we could elim-
inate the mold-to-mold variation (or at least a substantial portion of it), we could substan-
tially reduce the overall variation. The major advantage of using ANOVA when there is no
obvious dominant family is that we can quantify the contributions of the various families.
We can then decide which family we want to pursue. We expect there to be a dominant
cause within each family.
Note also that the ANOVA Pooled StDev does not capture the interaction between causes
in the two families, which is apparent in Figure 10.9. This interaction was the most impor-
tant discovery in the investigation. We can see the interaction in the ANOVA results since
the StDev values for each hour are very different (for example, 26.3 for hour 1 and 149.6
for hour 8). The calculated Pooled StDev masks these differences.
Figure S10.1 Possible explanations for engine block porosity results (o—observations for
hours 2, 5, and 8; x—observations for other hours).
In the left-hand panel, the range of values for the dominant cause for hours 2, 5, and 8
is different than the range for the other hours. In the right-hand panel, the relationship
between the cause and porosity depends on the time period, and in hours 2, 5, and 8 the
process is much more sensitive to variation in the dominant cause. This pattern in the right
panel indicates there is a strong interaction involving the plotted cause and some other
input that changes from hour to hour.
In the example, the range of the dominant cause was different in hours 2, 5, and 8. At
these times, the pouring temperature was lower and varied substantially, since during breaks
the iron cooled.
There is a similar explanation for a change in variation between groups when there is
no corresponding change in the process center, as in the hypothetical V6 piston diameter
illustrated in Figure 10.17. The two possibilities are illustrated in Figure S10.2, where the
horizontal and vertical axes give the full range of variation for the dominant cause and the
output. We have:
• Left panel: The variation in the dominant cause is different for streams A and B
• Right panel: The effect of changes in the dominant cause is much greater for
stream B than stream A
We can also have a combination of the two scenarios.
If we observe such patterns, we cannot rule out the within-strata or strata-to-strata families.
However, we do get a strong clue about the nature of the dominant cause. To proceed, for
example in the V6 piston investigation, we consider causes within stream B and look for dif-
ferences in causes or effects between the two streams. We have a big advantage because we
know it is possible to run the process (that is, stream A) in a way that results in substantially
less output variation.
Figure S10.2 Possible explanations for pattern in Figure 10.17 (observations from streams
A and B are denoted with x and o, respectively).
We see that:
• There are statistically significant differences among the position average
thicknesses, since the p-value 0.000 in the F-test is so small
• Some of the 95% confidence intervals for the position means do not overlap in
the plot
We conclude that there are position-to-position differences among the averages. We
cannot conclude, however, that position is a dominant cause.
We see the same difficulties, as illustrated here for ANOVA, with nonparametric tests
based on end-counts (due to Tukey, 1959) as recommended by Bhote and Bhote (2000) and
others. The tests can identify groups with statistically significant differences but do not tell
us if the dominant cause acts in the group-to-group family.
In general, with large sample sizes, a formal hypothesis test will show that there are
statistically significant differences among group averages. The small difference may have
no practical significance, however. We are searching for a dominant cause that explains a
large part of the variation in the output, not for causes with small but statistically significant
effects. The dominant cause will produce significantly different averages, but the converse
is not necessarily true. For this reason, we do not rely on hypothesis tests to help in the
search for a dominant cause.
Operations Swap
Operations swap (Ingram, 1993) is feasible if:
• Differences between two parallel processes are a dominant cause of variation.
• We can swap parts between the two parallel processes at some intermediate
processing step.
The idea is simple. We swap the production path of parts moving through the process at
a given processing step (for example, halfway through the process). Figure S10.3 illustrates
the plan with two parallel lines and two operation steps. The solid lines show the usual
process path through lines 1 and 2. The dashed lines show how we propose to swap the path
of parts between operations A and B temporarily for the purposes of the investigation.
We keep track of the parts that are swapped. Then, we compare the output of the two
streams of swapped parts with the performance of the regular processes. We know
there was a substantial difference in performance between the two production lines
before the swapping. If the performance of the swapped parts that went through operation A
in line 1 and operation B in line 2 matches the performance of line 1 before the swapping
investigation, and similarly for line 2, then we can conclude that the dominant cause of the
difference between the two lines is in operation A or earlier.
An operations swap investigation can be logistically difficult. We need to continue swap-
ping, tracking swapped parts, and measuring finished parts (though it is not necessary to
swap all parts in the given time period) until we see the full extent of variation in the output.
Swapping parts from line 1 to line 2 and vice versa is not necessary. We could draw
conclusions just by swapping in one direction. However, the double swap is useful to
ensure nothing else has changed in the process and, for logistical reasons, to balance the
load on the two production lines.
Randomized Sequencing
Randomized sequencing (Asher, 1987) is feasible if:
• The process output exhibits a known pattern of variation—for example, a drift—
or sustains shifts.
• We can change the processing order of parts at an intermediate operation and
later identify the order used.
Suppose the output drifts with a predictable pattern over each day. To conduct the
randomized sequencing plan, we:
• Select a sample of parts spread out over one day after operation A.
• Record the time for each sampled part and set it aside.
• Randomize the order of these parts.
• Process and track the parts through operation B.
• Measure the output.
This plan is illustrated in Figure S10.4.
If the dominant cause lives in operation A or upstream, we will see the predicted drift
when we plot the output against the time of processing through operation A. If we do not
see the predicted drift, then we know the dominant cause lies in operation B.
Figure S10.4 Randomized sequencing: parts leave operation A in order ABCDEFGHIJ and are
processed through operation B in the randomized order BCFGDEJIHA.
Suppose, for the moment, that the assembly variation is dominant—that is, stdev(due
to assembly) is much larger than stdev(due to components). When we select two parts with
extreme output values, we can be confident that the reason the values are extreme is that the
contributing values from the assembly are also extreme. This must be true since, by our
hypothesis, the contributions from the components are not highly variable and hence cannot
produce the extreme values.
When we disassemble and reassemble the parts, the contribution of the components
does not change. However, the assembly contributions will change, and since we started
with extreme values, we expect that there will be large changes to the output.
On the other hand, if the dominant cause lives with the components—that is, stdev(due
to components) is much larger than stdev(due to assembly)—then the reason for the extreme
values is a large component contribution. When we disassemble and reassemble such parts,
we will continue to see extreme values for the output since the component contribution is
not changed.
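A small simulation makes the argument concrete (all numbers are our own illustration, not from the book):

```python
import random, statistics

random.seed(3)

# output = component contribution + assembly contribution,
# with assembly dominant (stdevs 1.0 and 4.0 are our assumptions)
sd_comp, sd_asm = 1.0, 4.0

parts = [(random.gauss(0, sd_comp), random.gauss(0, sd_asm))
         for _ in range(500)]

# Select the part with the most extreme output value ...
c_ext, a_ext = max(parts, key=lambda p: abs(p[0] + p[1]))

# ... and "disassemble and reassemble" it repeatedly: the component
# contribution c_ext is kept, the assembly contribution is redrawn
reassembled = [c_ext + random.gauss(0, sd_asm) for _ in range(10)]

# Assembly dominates, so the reassembled outputs change substantially
# and are rarely as extreme as the original
print(abs(c_ext + a_ext), statistics.pstdev(reassembled))
```

Swapping the two standard deviations (components dominant) gives the opposite behavior: the reassembled outputs stay close to the original extreme value.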
Chapter 11 Supplement
Investigations to Compare Three or
More Families of Variation
Casting Thickness
At a foundry, a team was assigned the task of reducing variation in thickness, specified at
four locations on each casting. A baseline investigation found that the full extent of variation
in thickness (as measured from nominal) was –20 to 35 thousandths of an inch. The team
found that the measurement system was acceptable and decided to look for a dominant
cause of thickness variation.
They conducted a multivari investigation in which they measured thickness at the four
positions for three consecutive castings from each of the six cavities in the mold. They
sampled each hour for six hours on two different days to get a total of 864 thickness mea-
surements on 216 castings. The data are available in the file casting thickness multivari.
This investigation will allow us to look at the following families and their interactions:
• Casting-to-casting
• Position-to-position
• Cavity-to-cavity
• Hour-to-hour
• Day-to-day
Plotting a histogram of the multivari data, as shown in Figure S11.1, we see close to the
full extent of variation from the baseline.
We give a number of multivari charts in Figure S11.2 looking at the individual families.
We have defined a new input time that sequentially numbers the 12 time periods representing
all the possible combinations of hour and day and a new input group to uniquely identify the
sets of three consecutive thickness measurements within a position at the different time
periods. We used group = (time – 1)*24 + (cavity – 1)*4 + position.
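As a quick check, the group formula assigns a distinct label from 1 to 288 to each combination of time (1 to 12), cavity (1 to 6), and position (1 to 4):

```python
# The mixed-radix formula from the text: group = (time - 1)*24
# + (cavity - 1)*4 + position
groups = {(t - 1) * 24 + (c - 1) * 4 + p
          for t in range(1, 13)
          for c in range(1, 7)
          for p in range(1, 5)}
print(len(groups), min(groups), max(groups))
```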
The multivari charts are not easy to interpret, since there is no single dominant cause.
We can attribute a large amount of variation to the position-to-position, cavity-to-cavity, and
casting-to-casting families (as shown by the multivari chart by group). In this example, we
can quantify the components of variation that can be attributed to the various families. Using
the model from Chapter 2, if the causes in separate families act independently, we have

stdev(total)² = stdev(due to family 1)² + stdev(due to family 2)² + ...
In the ANOVA calculation, the total sum of squares is similarly partitioned into sums
of squares associated with the various families. See Neter et al. (1996) and Box et al.
(1978) for a general background on ANOVA. See also de Mast et al. (2001) for a more for-
mal discussion of the connection between multivari charts and ANOVA.
We propose to fit an ANOVA model with all possible terms involving the inputs whose
effects are expected to be systematic. For example, we might expect a systematic differ-
ence (in the average thickness) among the four positions. We would not expect systematic
variation among the measurements taken at one position from consecutive castings. That
is, we would be surprised if the average of the first measurement over all cavities and times
was substantially different from the average of the second measurement.
Figure S11.1 Histogram of thickness (measured from nominal) from the multivari
investigation.
Figure S11.2 Multivari charts of thickness by cavity, by position, by time, by group, and by
cavity with a separate line for each position.
See Appendix D for details on how to fit the ANOVA model in MINITAB. We get the
following results when we fit an ANOVA model with time, cavity, and position (the three
inputs expected to have systematic effect) and all interactions among these three inputs.
Source DF SS MS F P
time 11 9008.82 818.98 31.81 0.000
cavity 5 16994.99 3399.00 132.00 0.000
position 3 16697.24 5565.75 216.15 0.000
time*cavity 55 1544.02 28.07 1.09 0.311
time*position 33 7580.19 229.70 8.92 0.000
cavity*position 15 5363.43 357.56 13.89 0.000
time*cavity*position 165 1634.06 9.90 0.38 1.000
Error 576 14832.00 25.75
Total 863 73654.75
To compare the relative sizes of the families, we look at the sum of squares (SS) col-
umn. We see large, roughly equal-sized effects due to position, cavity, and error. This
matches our conclusion from the multivari charts. The calculated sums of squares tell us
approximately how much we could expect to reduce the total sum of squares if we could
eliminate all variation due to the given family.
It is not easy to translate these sums of squares into estimates for the standard deviation
attributable to each family, but there is a rough correspondence. Recall that we are looking
for dominant causes. For example, if we eliminate cavity-to-cavity variation, the total sum
of squares is reduced by 16994.99 to 56659.76 and the estimated overall standard deviation
is reduced from 9.24 (√(73654.75/863)) to 8.10 (√(56659.76/863)).
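The arithmetic in this paragraph is easy to verify directly:

```python
import math

# Drop the cavity sum of squares from the total, and convert each
# total sum of squares to a standard deviation (SS / N, then root)
ss_total, ss_cavity, n = 73654.75, 16994.99, 863

before = math.sqrt(ss_total / n)                # about 9.24
after = math.sqrt((ss_total - ss_cavity) / n)   # about 8.10
print(round(before, 2), round(after, 2))
```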
The error sum of squares is relatively large at 14832. This sum of squares includes the
effect of the casting-to-casting family (recall that the casting-to-casting family is expected
to have haphazard effect) and all interactions between the casting-to-casting family and the
other families. We look more at this family in the next section.
As there was no single dominant family, the team proceeded by trying to address each
of the three large families separately.
• Three consecutive blocks selected from the process each hour for 80 hours
• Three heights and two orientations at each height for each bore
We give the 11,520 observations in the file block bore diameter multivari. To reduce
the number of families, we define a new input, called position, with six values, which
labels the two different orientations at three heights. If the position-to-position variation is
found to be large, we can look more closely at this family.
We find multivari charts, such as Figure S11.3, difficult to interpret because of the
large number of time periods used in the investigation. The chart does show that there are
several odd values about a third of the way through the data set that we should investigate
further.
Figure S11.3 Multivari chart of diameter (diam1000) by time, with a separate line for each of
the eight bores.
To try to understand the results, we fit a full ANOVA model using all families (except
block-to-block) and their interactions:
Source DF SS MS F P
time 79 24602.93 311.43 56.34 0.000
bore 7 951.65 135.95 24.60 0.000
position 5 10614.80 2122.96 384.09 0.000
time*bore 553 15568.13 28.15 5.09 0.000
time*position 395 5344.71 13.53 2.45 0.000
bore*position 35 4417.75 126.22 22.84 0.000
time*bore*position 2765 2935.01 1.06 0.19 1.000
Error 7680 42449.46 5.53
Total 11519 106884.44
The two largest sums of squares correspond to Error and time. We conclude that
dominant causes of variation act in the block-to-block family (or in an interaction involv-
ing the block-to-block family) and the time-to-time family. The team proceeded to look for
dominant causes within each of these families.
batch, hour, and position, and all possible interactions among these inputs. This model
allows us to quantitatively assess the relative sizes of the effects due to each of the families
and interactions among the families.
Source DF SS MS F P
grinder 1 11855.39 11855.39 1230.72 0.000
batch 2 2.15 1.07 0.11 0.894
position 7 298.55 42.65 4.43 0.000
hour 3 12.40 4.13 0.43 0.732
grinder*batch 2 22.54 11.27 1.17 0.311
grinder*position 7 47.14 6.73 0.70 0.673
grinder*hour 3 53.57 17.86 1.85 0.136
batch*position 14 223.45 15.96 1.66 0.060
batch*hour 6 85.25 14.21 1.48 0.184
position*hour 21 238.67 11.37 1.18 0.261
grinder*batch*position 14 92.57 6.61 0.69 0.789
grinder*batch*hour 6 112.93 18.82 1.95 0.070
grinder*position*hour 21 153.74 7.32 0.76 0.770
batch*position*hour 42 487.06 11.60 1.20 0.180
grinder*batch*position*hour 42 300.26 7.15 0.74 0.886
Error 768 7398.07 9.63
Total 959 21383.75
From the ANOVA results, we see (as expected) that the largest sum of squares is asso-
ciated with the family of causes that differ between the two grinders. The sums of squares
for any other systematic family (that is, batch, hour, position, and interactions) are much
smaller. The error sum of squares, representing the part-to-part family, is also fairly large.
We explore the part-to-part family further in the next section.
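As a rough sketch of this analysis outside MINITAB (the data are simulated, and the column names and cell counts are our own assumptions, not the book's data), the full fixed-effects model can be fit in Python with statsmodels:

```python
# Sketch: a multivari ANOVA with all systematic families and their
# interactions, using simulated data. The residual row captures the
# part-to-part family. Column names and the 5-parts-per-cell layout
# are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
# Balanced layout: 2 grinders x 3 batches x 8 positions x 4 hours x 5 parts.
cells = pd.MultiIndex.from_product(
    [range(2), range(3), range(8), range(4), range(5)],
    names=["grinder", "batch", "position", "hour", "part"],
).to_frame(index=False)
# Simulate a grinder-to-grinder dominant cause plus part-to-part noise.
cells["diameter"] = 5.0 * cells["grinder"] + rng.normal(0, 1, len(cells))

# Full fixed-effects model with all interactions among the families.
model = smf.ols(
    "diameter ~ C(grinder) * C(batch) * C(position) * C(hour)", data=cells
).fit()
table = anova_lm(model)
print(table[["df", "sum_sq"]])  # compare sums of squares across families
```

With a strong grinder-to-grinder difference built into the simulation, the grinder row dominates the sums of squares, mirroring the structure of the MINITAB table above.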
Comments
We can use a random-effects ANOVA model that ignores the systematic nature of the
effects as an alternative analysis. See Feder (1974) for a background on random-effects
models. However, it is incorrect to assume a random effect for a family like position in the
casting thickness example since there are only four positions in total. The random-effects
H1212Supp11.qxd 4/1/05 10:31 AM Page CD–246
ANOVA model provides estimates of the variance components attributable to the various
families. In other words, the random-effects model provides estimates of stdev(due to a
particular family).
In the casting thickness example, the results from the ANOVA model and the multivari
charts match closely. The ANOVA has the advantage of quantifying the contributions of the
various families and, in one step, allowing us to look at interactions among the families.
However, note that ANOVA models look for variation caused by differences in average
output across different levels of the inputs. They are not good at detecting other patterns,
such as changes in output variation across the levels of the inputs, or outliers. We
recommend first looking at graphical summaries of the data, such as multivari charts, before
fitting an ANOVA model.
If there are no families included in the multivari investigation that can be expected to
have a haphazard effect, the proposed analysis strategy where we fit an ANOVA model with
all inputs and interactions will not work in MINITAB. Although the appropriate sums of
squares (which we use to draw conclusions) could still be calculated, MINITAB will not pro-
ceed because there are no degrees of freedom left to estimate the error sum of squares. If
there are no haphazard families, we recommend leaving out one of the families in the
ANOVA model. The error sum of squares is then attributable to the left-out family plus all
interactions between the left-out family and other families in the model. If the error sum of
squares is large, we redo the analysis leaving out a different family, continuing until we find
the home of a dominant cause.
For large data sets or multivari investigations with many families, we have difficulty
displaying the effect of families, such as part-to-part, that are expected to have a hap-
hazard rather than systematic effect. To explore the variation due to a haphazard family,
we define a new input that uniquely numbers the groups of consecutive parts. For
instance, in the cylinder head example from Chapter 11 we defined group as (time – 1)*4
+ pattern. Similarly, in the fascia cratering example we used group = (time – 1)*2 +
(machine – 1).
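The group-numbering bookkeeping just described is simple arithmetic; a minimal sketch (function names are ours):

```python
# Sketch: unique group numbers for haphazard families. Each combination
# of the systematic inputs gets its own group label.

def cylinder_head_group(time: int, pattern: int) -> int:
    """Group number for the cylinder head example: time = 1, 2, ...;
    pattern = 1, ..., 4. Uses group = (time - 1)*4 + pattern."""
    return (time - 1) * 4 + pattern

def fascia_group(time: int, machine: int) -> int:
    """Group number for the fascia cratering example: machine = 1 or 2.
    Uses group = (time - 1)*2 + (machine - 1)."""
    return (time - 1) * 2 + (machine - 1)

print(cylinder_head_group(time=1, pattern=1))  # first group is 1
print(fascia_group(time=3, machine=2))
```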
We illustrate the analysis to explore haphazard families with two examples.
Casting Thickness
The casting thickness example was introduced in the previous section. We expect the effect of
the casting-to-casting family to be haphazard. We define a new input group that numbers the
288 different sets of consecutive thickness measurements from each different combination of
time, cavity, and position—that is, we use group = (time – 1)*24 + (cavity – 1)*4 + position.
Fitting a one-way ANOVA model with group gives:
The ANOVA divides the overall variation (as measured by the total standard deviation
√(73654.7/863) = 9.24) into two components.
Pooled StDev is an estimate of the variation attributable to the casting-to-casting fam-
ily and all interactions between the casting-to-casting family and other families. In other
words, if we could eliminate all group-to-group differences due to changes in time, cavity,
and position, we estimate that the process standard deviation would be reduced to 5.074.
That is, there is a dominant cause acting in the group-to-group family. We examined the
components of this family (time, cavity, and position) in the first section of this supplement.
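A sketch of this decomposition on simulated data (not the book's measurements; the group count and the standard deviations are made up to resemble the example):

```python
# Sketch: how a one-way ANOVA by group splits the overall standard
# deviation into a group-to-group and a within-group (casting-to-casting)
# component. All numbers here are simulated.
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per_group = 288, 3
group_means = rng.normal(0, 8, n_groups)          # group-to-group variation
data = group_means[:, None] + rng.normal(0, 5, (n_groups, n_per_group))

grand = data.mean()
ss_total = ((data - grand) ** 2).sum()
df_total = data.size - 1
total_stdev = np.sqrt(ss_total / df_total)        # like sqrt(73654.7/863)

# Pooled within-group standard deviation: estimates the casting-to-casting
# family plus its interactions with the other families.
ss_error = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum()
df_error = n_groups * (n_per_group - 1)
pooled_stdev = np.sqrt(ss_error / df_error)
print(total_stdev, pooled_stdev)
```

With the within-group standard deviation set to 5, the pooled estimate comes out near 5 while the total standard deviation is much larger, the same pattern as in the example.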
Here, the pooled standard deviation is relatively small. In other cases, there may be a
dominant cause in the part-to-part family. To demonstrate the methodology, suppose there
is a dominant cause in the casting-to-casting family and we explore the casting-to-casting
family further. We define a new output characteristic, the within-group standard deviation
for each group. We then repeat the multivari analysis using the new output group stdev. To do
this we need to define new input characteristics that correspond to the levels of the other
families associated with each group. In the example, we define stime, scavity, and sposi-
tion, which correspond to the time, cavity, and position associated with each group, respec-
tively. See the discussion on multivari charts in Appendix C for details on how to do this
easily with MINITAB.
In the multivari chart, we are looking for any systematic change in the casting-to-casting
variation over the other families. Note that these changes reflect an interaction with the
casting-to-casting family. Looking at the multivari charts in figures S11.4 and S11.5 we
conclude that the largest effect is due to a time-by-cavity interaction. At some particular
Figure S11.4 Multivari charts for casting thickness group standard deviation.
Figure S11.5 Multivari chart for group stdev, showing cavity by time interaction.
times and for some cavities, the variation in thickness for three consecutive castings
changes substantially. We are looking for a cause that can explain such process behavior.
The interpretation of the results is difficult because the output is already a measure of
variation. We see that for some times (for example, time 4) the variation in each group is small
relative to other times. Also, for some times (for example, times 3, 5, and 12) the variation
within a group varies for different cavities, while at all other times the variation within a group
is similar for all cavities. From the interaction plot in Figure S11.5, we see that at some times,
cavities 5 and 6 result in either more or less variation within a group than the other cavities.
This matches the conclusion from fitting an ANOVA model with all but the three-input inter-
action term, as given:
Source DF SS MS F P
stime 11 705.022 64.093 22.19 0.000
scavity 5 22.070 4.414 1.53 0.184
sposition 3 14.738 4.913 1.70 0.169
stime*scavity 55 715.179 13.003 4.50 0.000
stime*sposition 33 288.226 8.734 3.02 0.000
scavity*sposition 15 70.297 4.686 1.62 0.073
Error 165 476.570 2.888
Total 287 2292.102
The conclusion is that the dominant cause acts casting to casting, but its effect varies
with cavity and time. The team now must ask, “What causes vary in a way that matches this
pattern?”
Figure S11.6 Multivari chart for group stdev for camshaft journal diameter multivari.
We conclude from Figure S11.6 that there is no evidence of any effects that change the
within-group standard deviation. The dominant cause of the part-to-part variation acts in a
similar way over all grinders, batches, and times.
where b0 and b1 are unknown model parameters that represent the y-intercept and slope,
respectively, of a straight line that summarizes the relationship between the two diameter
measurements. The parameters b0 and b1 are estimated based on the available data.
Fitting a regression model (see Appendix E) to the data gives:
Analysis of Variance
Source DF SS MS F P
Regression 1 836.67 836.67 558.58 0.000
Residual Error 94 140.80 1.50
Total 95 977.47
The estimated model parameters for b0 and b1 are given as 64.3 and 0.884, respectively.
The estimated slope parameter (0.884) can be interpreted as the change in the average diam-
eter after Operation 310 for every unit change in the diameter after Operation 270. Often, as
in this example, the y-intercept parameter estimate will have no useful interpretation, since
it tells us the average diameter after Operation 310 if the diameter after Operation 270 is
zero. The regression equation is a straight line that best summarizes the relationship between
the piston diameters after Operation 270 and Operation 310. As we see in Figure S11.7, in
this case the relationship is strong. The value of s = 1.224 from the MINITAB output is an
estimate of the standard deviation in the final diameter if we hold the diameter after
Operation 270 fixed.

Figure S11.7 Plot of final diameter versus diameter after Operation 270.

Figure S11.8 Plot of final diameter versus diameter after Operation 200.

We can compare this standard deviation to the baseline value 3.32 to
determine that the input (in this case the diameter after Operation 270) is a dominant cause.
In the same example, we can also fit a regression model using the diameter after
Operation 200 as the input. From MINITAB, we get the plot in Figure S11.8 and the fol-
lowing numerical results:
From these results, it is clear that the relationship between the diameter after Operation
200 and the final diameter (after Operation 310) is much weaker than between the diame-
ter after Operation 270 and the final diameter. In the second regression model, the estimate
of the residual variation, s = 2.862, is close to the baseline standard deviation 3.32. If we
were to hold the diameter at Operation 200 fixed, we would see little change in the variabil-
ity after Operation 310. We can rule out all causes that act up to and including Operation
200 as possible dominant causes of final diameter variation.
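A sketch of this comparison on simulated data (the coefficients 64.3 and 0.884 and the baseline 3.32 are taken from the example; everything else, including the simulated measurements, is made up):

```python
# Sketch: using the residual standard deviation s from a simple regression
# to judge whether an intermediate dimension is a dominant cause of
# final-dimension variation. Data are simulated, not the book's.
import numpy as np

rng = np.random.default_rng(2)
baseline_stdev = 3.32                           # from the baseline investigation
x = rng.normal(594, 3.0, 96)                    # diameter after Operation 270
y = 64.3 + 0.884 * x + rng.normal(0, 1.2, 96)   # final diameter (simulated)

b1, b0 = np.polyfit(x, y, 1)                    # slope, intercept
resid = y - (b0 + b1 * x)
s = np.sqrt((resid ** 2).sum() / (len(y) - 2))  # residual stdev, as in MINITAB

print(round(b1, 3), round(s, 2))
# s near 1.2 is much smaller than the baseline 3.32, so causes acting
# up to Operation 270 form a dominant family for the final diameter.
```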
Comments
In the context of variation transmission investigations, we do not recommend fitting a
regression model to try to explain the final output as a function of two or more intermedi-
ate output measurements. Strong correlation between the inputs in the regression model
(that is, the intermediate diameter measurements) makes interpretation of the results difficult.
The residual variation (given by s in the MINITAB results) would estimate the residual
standard deviation if we could hold all the inputs fixed.
We use regression models in Chapter 12 and discuss them further in the supplement to
Chapter 12.
Comments
With the Bhote and Bhote (2000) procedure, the length of the search depends on engineering
judgment regarding which components are likely to be most important. Also, Amster and
Tsui (1993) provide some examples of when the Bhote and Bhote method would yield
incorrect conclusions.
As suggested by Tippett (1934), Taguchi (1987), and Parmet and Steinberg (2001), we
could also conduct a component-swapping experiment without using the idea of leveraging;
that is, by choosing components from regular production. This strategy requires considerably
more pairs of parts and a greater number of disassemblies and reassemblies.
The Bhote and Bhote (2000) component swap procedure matches the factorial compo-
nent swap experiment as described in the next section, where we do not conduct all swaps
once we believe the results are clear.
We believe iteratively forming two groups of components and using the method of
elimination is a more efficient strategy (see Section 11.3).
Treatment G1 G2
1 Low Low
2 Low High
3 High Low
4 High High
Treatments 1 and 4 correspond to the two original assemblies and treatments 2 and 3 cor-
respond to swapping the components in G1 (or G2) between the two assemblies.
With three groups of components, we disassemble and reassemble to get all eight pos-
sible combinations:

Treatment G1 G2 G3
1 Low Low Low
2 Low Low High
3 Low High Low
4 Low High High
5 High Low Low
6 High Low High
7 High High Low
8 High High High

The outputs for the two original assemblies (all components high, output 7; all components
low, output 1) fill two cells in a pair of 2 × 2 tables, one table for each level of G3, with
rows given by G1 and columns by G2:

G3 High:
           G2 High   G2 Low
G1 High       7         –
G1 Low        –         –

G3 Low:
           G2 High   G2 Low
G1 High       –         –
G1 Low        –         1
We fill the other six cells of the tables with the results from three swaps. We might get:

G3 High:
           G2 High   G2 Low
G1 High       7         1
G1 Low        7         2

G3 Low:
           G2 High   G2 Low
G1 High       7         1
G1 Low        6         1
Here, since the results in the two tables are similar, we can eliminate all the components in
G3 and interpret the result in either table as in Chapter 11. In this example, the dominant
cause acts in G2.
To help us make the correct interpretation, we can generate three pairs of tables similar
to the previous set using a different group to stratify the pair of tables. If none of the three
pairs of tables gives similar results across the two 2 × 2 tables, the dominant cause involves
a component (or more than one) in each of the three groups. This is unlikely.
Chapter 12 Supplement
Investigations Based on Single Causes
Figure S12.1 Matrix scatter plot for window leaks group comparison (leakers are denoted
by o and nonleakers by +). The inputs plotted are plastisol amount, glass flushness, slider
drain hole gap, urethane stop/start gap, and backglass gap.
In output/input investigations with large numbers of inputs, examining all the scatter
plots can be tedious, even when using a draftsman plot. In this context, another useful tool in
MINITAB is best subsets regression (see Appendix E). With best subsets regression, we can
ask MINITAB to fit all possible regression models involving a single input (or larger numbers
of inputs, but that is not needed in this context). The results rank the possible models by the
estimated residual standard deviation s. For illustration, results from using the best subsets
regression routine for the crossbar dimension example are:
Best Subsets Regression: dimension versus die temp, nozzle temp, ...
Response is dimension
Vars   R-Sq   R-Sq(adj)   C-p   S   [indicator columns showing which of die temp,
nozzle temp, barrel temp, hydraulic pressure, and cavity pressure appear in each model]
We see that barrel temperature is a dominant cause, because the residual standard deviation
for the model with barrel temperature is so small relative to the baseline value of 0.46.
In the crossbar dimension example, there are only five inputs. Using best subsets regression
is unnecessary, since we get the same information from Figure S12.2. One danger with relying
on the best subsets regression is that the models are ranked according to how well a linear
model fits the data. As a result, nonlinear relationships may be missed. We strongly recom-
mend that you look at all of the scatter plots of the output versus the selected inputs.
Figure S12.2 Plots of dimension versus die temp, nozzle temp, barrel temp, hydraulic
pressure, and cavity pressure.
There are some important differences in the two methods. First, in the paired com-
parison, a hypothesis test based on end-count is recommended to compare the two
groups of parts. As discussed in the supplement to Chapter 10, we do not believe that
formal hypothesis tests are necessary or appropriate when searching for a dominant
cause of variation. Significant differences do not (necessarily) correspond to dominant
causes.
A second difference is that paired comparison involves an arbitrary pairing of the parts,
one from each group. In the search for the dominant cause, pairing makes little sense. Since
the causes of variation are unknown, determining how to pair is problematic. In addition, the
goal of comparing extreme parts is to help identify a dominant cause of variation in the
output. If we pair in such a way that a dominant cause is held constant within the pairs, the
analysis will fail to identify the dominant cause. With artificial pairing, the analysis results
depend on the way pairs are produced.
We do not recommend paired comparison as an alternative to group comparison.
Chapter 13 Supplement
Verifying a Dominant Cause
Chapter 15 Supplement
Moving the Process Center
Treatment A B C
1 –1 –1 –1
2 –1 –1 +1
3 –1 +1 –1
4 –1 +1 +1
5 +1 –1 –1
6 +1 –1 +1
7 +1 +1 –1
8 +1 +1 +1
Treatment A B AB C AC BC ABC
1 –1 –1 +1 –1 +1 +1 –1
2 –1 –1 +1 +1 –1 –1 +1
3 –1 +1 –1 –1 +1 –1 +1
4 –1 +1 –1 +1 –1 +1 –1
5 +1 –1 –1 –1 –1 +1 +1
6 +1 –1 –1 +1 +1 –1 –1
7 +1 +1 +1 –1 –1 –1 –1
8 +1 +1 +1 +1 +1 +1 +1
Table S15.4 Contrast matrix for the 2³ design with output data.
Treatment A B AB C AC BC ABC Weight
1 –1 –1 +1 –1 +1 +1 –1 0.56
2 –1 –1 +1 +1 –1 –1 +1 0.17
3 –1 +1 –1 –1 +1 –1 +1 0.44
4 –1 +1 –1 +1 –1 +1 –1 0.08
5 +1 –1 –1 –1 –1 +1 +1 1.52
6 +1 –1 –1 +1 +1 –1 –1 0.37
7 +1 +1 +1 –1 –1 –1 –1 1.34
8 +1 +1 +1 +1 +1 +1 +1 0.03
Recall from Chapter 13 that the main effect of a particular input is the difference
between the average output over runs at the high level and the average output over runs at
the low level of the input. In the example, the main effect of A is

(1.52 + 0.37 + 1.34 + 0.03)/4 – (0.56 + 0.17 + 0.44 + 0.08)/4 = 0.815 – 0.3125 ≈ 0.50
Table S15.5 Contrast matrix for the 2³ design with estimated effects.
Treatment A B AB C AC BC ABC Weight
1 –1 –1 +1 –1 +1 +1 –1 0.56
2 –1 –1 +1 +1 –1 –1 +1 0.17
3 –1 +1 –1 –1 +1 –1 +1 0.44
4 –1 +1 –1 +1 –1 +1 –1 0.08
5 +1 –1 –1 –1 –1 +1 +1 1.52
6 +1 –1 –1 +1 +1 –1 –1 0.37
7 +1 +1 +1 –1 –1 –1 –1 1.34
8 +1 +1 +1 +1 +1 +1 +1 0.03
Effect 0.50 –0.18 –0.08 –0.80 –0.43 –0.03 –0.05
Again we have reordered the terms in the numerator of the second expression in Equa-
tion (S15.2) and you can see that to get this sum, we apply the signs from the AC column
of the contrast matrix to the data and add.
All main effects and interactions can be found by applying the signs from the appropri-
ate column of the contrast matrix to the data, adding, and then dividing by half the number of
runs. Every effect is the difference of two averages, where each average includes exactly half
the data. We show all of the effects in Table S15.5, where we add an extra row to Table S15.4.
In the analysis, we look for large (positive or negative) effects. For two-level factorial
and fractional factorial designs, MINITAB will calculate and rank the absolute value of the
effects in a Pareto diagram. We can use this diagram to isolate large effects. The effects in
the example are plotted in Figure S15.1. We see that the main effects of C and A are rela-
tively large, as is the interaction effect AC. Since there is evidence of interaction, we look
at the effects of A and C simultaneously.
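The effect calculations above are easy to verify directly. A short sketch using the weights from Table S15.4:

```python
# Sketch: computing all seven effects for the 2^3 example by applying the
# signs of each contrast column to the weights in Table S15.4 and dividing
# by half the number of runs.
import numpy as np

A = np.array([-1, -1, -1, -1, +1, +1, +1, +1])
B = np.array([-1, -1, +1, +1, -1, -1, +1, +1])
C = np.array([-1, +1, -1, +1, -1, +1, -1, +1])
weight = np.array([0.56, 0.17, 0.44, 0.08, 1.52, 0.37, 1.34, 0.03])

columns = {"A": A, "B": B, "AB": A * B, "C": C,
           "AC": A * C, "BC": B * C, "ABC": A * B * C}
effects = {name: (signs * weight).sum() / 4 for name, signs in columns.items()}
print(effects)
# The large effects are C, A, and the AC interaction, matching Figure S15.1.
```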
We now use the contrast matrix to explain fractional factorial designs. As an example,
we start with the contrast matrix for a 2⁴ design, that is, a design with four inputs each at
two levels. We give the contrast matrix in Table S15.6. Note the order of the columns.
Figure S15.1 Pareto plot of the effects.

Table S15.6 Contrast matrix for the 2⁴ design.
Treatment A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
1 –1 –1 +1 –1 +1 +1 –1 –1 +1 +1 –1 +1 –1 –1 +1
2 –1 –1 +1 –1 +1 +1 –1 +1 –1 –1 +1 –1 +1 +1 –1
3 –1 –1 +1 +1 –1 –1 +1 –1 +1 +1 –1 –1 +1 +1 –1
4 –1 –1 +1 +1 –1 –1 +1 +1 –1 –1 +1 +1 –1 –1 +1
5 –1 +1 –1 –1 +1 –1 +1 –1 +1 –1 +1 +1 –1 +1 –1
6 –1 +1 –1 –1 +1 –1 +1 +1 –1 +1 –1 –1 +1 –1 +1
7 –1 +1 –1 +1 –1 +1 –1 –1 +1 –1 +1 –1 +1 –1 +1
8 –1 +1 –1 +1 –1 +1 –1 +1 –1 +1 –1 +1 –1 +1 –1
9 +1 –1 –1 –1 –1 +1 +1 –1 –1 +1 +1 +1 +1 –1 –1
10 +1 –1 –1 –1 –1 +1 +1 +1 +1 –1 –1 –1 –1 +1 +1
11 +1 –1 –1 +1 +1 –1 –1 –1 –1 +1 +1 –1 –1 +1 +1
12 +1 –1 –1 +1 +1 –1 –1 +1 +1 –1 –1 +1 +1 –1 –1
13 +1 +1 +1 –1 –1 –1 –1 –1 –1 –1 –1 +1 +1 +1 +1
14 +1 +1 +1 –1 –1 –1 –1 +1 +1 +1 +1 –1 –1 –1 –1
15 +1 +1 +1 +1 +1 +1 +1 –1 –1 –1 –1 –1 –1 –1 –1
16 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1
Suppose we want to conduct an experiment and we can afford only 16 runs. With four
inputs, A, B, C, and D, we can run all treatments. The four columns labeled A, B, C, and D
give the levels of the four inputs for each treatment. If, however, we have a fifth input, E,
there are 32 possible treatments and we have the resources to conduct an experiment using
only half of the possible treatments. The question is, which half? The resulting design is
often referred to as a 2⁵⁻¹ design since we are using a half fraction (2⁻¹) of the design with
five inputs at two levels each.
One possibility is to assign E to the column denoted ABCD in the contrast matrix in
Table S15.6. Then, we can read the 16 treatments we will use in the experiment by looking
at the five columns labeled A, B, C, D, and E. For example, with treatment 7, we have the
levels A: –1, B: 1, C: 1, D: –1, E: 1. With this choice for E, exactly half the 16 runs have E
at the high level, that is, E = +1, and the other half have E at the low level, that is, E = –1.
The same is true for the other four inputs.
To see the consequences of this choice, suppose we carry out the experiment and meas-
ure the output for each of the selected 16 treatments. We can calculate the main effects of
each of the five inputs by applying the appropriate column of plus and minus ones to the
data, adding and dividing by eight. We carry out the same calculation for every column.
To calculate the effect of the four-input interaction ABCD and the main effect for E, we
use the same column of the contrast matrix and we say these two effects are confounded. If
the calculated effect is large, this may be due to the change in input E or the four-input
interaction involving inputs A, B, C, and D. We cannot separate these two effects using the
data. However, we usually assume that three- and four-input interaction effects are small
and so, in this case, we would attribute the large effect to the main effect of E.
You should be wondering by now where to find the interactions involving E. To find the
two-input interaction AE, we multiply the A and E columns as before. We found the column
of signs for E by multiplying the corresponding columns for A, B, C, and D. We use the
convenient notation E = ABCD. Hence we have
AE = A × ABCD = BCD
since, if we multiply column A by itself, we get a column of plus ones, which has no effect
on the overall product. In other words, the interaction effect AE is confounded with the
three-input interaction BCD. If this column produces a large effect, we cannot tell if this is
due to AE or BCD. Again we will attribute the effect to the lower-order interaction—in this
case, the two-input interaction AE.
You can quickly find all other two-input interactions involving E. Every column of the
contrast matrix corresponds to two effects (since we used a half fraction design). MINITAB
will produce a list of the confounded effects for the design. For the example, the list of
confounded effects (MINITAB calls these effects aliases) is:
Alias Structure
I + ABCDE
A + BCDE
B + ACDE
C + ABDE
D + ABCE
E + ABCD
AB + CDE
AC + BDE
AD + BCE
AE + BCD
BC + ADE
BD + ACE
BE + ACD
CD + ABE
CE + ABD
DE + ABC
Any effects in the same row, linked by a + or –, are confounded. See Appendix F for
instructions on how to create this list. You might also wonder what happens if we start by
assigning E to a different column, say E = ABC. Since the ABC column gives the level of
E for each run, we now have a different set of 16 treatments. We use MINITAB to produce
the list of confounded effects. This time, we ask that four- and five-input interactions be
suppressed. The confounded effects with this design are:
I
A + BCE
B + ACE
C + ABE
D
E + ABC
AB + CE
AC + BE
AD
AE + BC
BD
CD
DE
ABD + CDE
ACD + BDE
ADE + BCD
Table S15.7 Meaning of design resolution.
Resolution Meaning
III At least one main effect is confounded with a two-input interaction.
IV At least one pair of two-input interactions is confounded; main effects are
confounded only with three-input or higher-order interactions.
V Main effects and two-input interactions are confounded only with three-input
or higher-order interactions.
With this choice, we see that main effects are confounded with three-input interactions
and, more important, pairs of two-input interactions such as AB and CE are confounded. This
plan is less desirable than the design based on assigning E to ABCD.
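The alias lists above can be reproduced by the sign-column multiplication described earlier. A minimal sketch (the helper function is ours, not a MINITAB feature), using the defining relation I = ABCE that follows from choosing E = ABC:

```python
# Sketch: deriving confounded (aliased) effects for a half fraction by
# multiplying effect "words". Repeated letters cancel (A*A = I), so the
# alias of any effect is its product with the defining word ABCE.
# Unlike the book's list, this prints full aliases without suppressing
# four- and five-input interactions.

def multiply(word1: str, word2: str) -> str:
    """Multiply two effect words; letters appearing twice cancel."""
    letters = set(word1) ^ set(word2)   # symmetric difference
    return "".join(sorted(letters)) or "I"

defining = "ABCE"                        # from choosing E = ABC
for effect in ["A", "B", "C", "D", "E", "AB", "AC", "AD", "AE"]:
    print(effect, "+", multiply(effect, defining))
```

For example, the A row prints "A + BCE" and the AB row prints "AB + CE", matching the list above; D prints its full alias ABCDE, which the book's display suppresses as a high-order interaction.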
We say that the design with E = ABC has resolution IV, because at least one pair of
two-input interactions is confounded. If we assign E to column AB, then a main effect is
confounded with a two-input interaction and the design has resolution III. The first design
discussed earlier with E = ABCD has resolution V. The higher the resolution, the less likely
it is that important effects will be confounded. We summarize the meaning of resolution in
Table S15.7.
For a fixed number of runs, the greater the number of inputs, the lower the highest possible
resolution (see Figure S15.2).
In using MINITAB, we normally assign the letters A, B, C, and so on to the candidates
and +1 and –1 to the two levels for each. It does not matter which letter is assigned to each
candidate. Then we ask MINITAB to generate the design, a worksheet of the treatments
selected for the experiment. Sometimes we want to ensure that one particular treatment is
included in the experiment—for example, the current process levels of the candidates. In
this case, we generate the design first and then assign the letters and levels so that one of
the treatments in the design has the required levels.
We recommend an analysis based on a plot of the output by treatment number, the
Pareto plot of the effects, and main effect and interaction plots. MINITAB will produce the
table of confounded effects to help interpret the important effects.
Despite the confounding, factorial experiments are useful because of the scarcity of
effects principle, which states that when there are many inputs, there are likely only a few
large effects. These effects are commonly the main effects and low-order interactions.
[Figure: the four corner runs (–1,–1), (–1,1), (1,–1), (1,1) of a two-input design in
inputs A and B, with a center point at (0,0).]
Summary
Most issues in the planning of fractional factorial experiments are the same as for full
factorial experiments. We need to
• Choose the k inputs and their levels
• Define a run
• Use MINITAB to select the design (that is, choose the treatments, a fraction of
all of the possibilities), mindful of the tradeoff between resolution and number
of runs
• Randomize the run order
The choice of fraction is based on cost considerations and concerns about confounding.
In fractional factorial experiments, every effect will be confounded with one or more other
effects. We choose the design with the highest possible resolution to minimize problems
associated with confounded effects. That is, we choose a design so that main effects and
two-input interactions are confounded with high-order interactions as much as possible. If we
find large effects, we typically attribute the effect to the main effect or two-input interaction.
MINITAB can help plan and analyze fractional factorial experiments. The complete
confounding structure for any fractional factorial design is given by MINITAB. Your task
is to assign the inputs and levels to the generic letters A, B, C, and so on, and the codes –1
and +1 to produce the design.
Chapter 16 Supplement
Desensitizing a Process to Variation
in a Dominant Cause
The levels of other fixed inputs determine the coefficient b0. For a given level c of the
candidate, the coefficient b1 + b3c represents the effect of the cause X on the output. The
standard deviation of the output is
sd(Y) = √((b1 + b3c)² sd(X)² + sd(R)²)   (S16.2)
assuming that the effect of the other causes, R, varies independently of the dominant cause.
In Equation (S16.1), we have modeled the interaction between the cause X and the candidate
by the product term b3Xc. We see from Equation (S16.2) that the standard deviation of the
output sd(Y) is sensitive to the level of the candidate if b3 is not equal to zero. If we can set
c = –b1 / b3, then the effect of the dominant cause sd(X) is completely eliminated and sd(Y)
is reduced to sd(R).
In practice, such a simple model rarely describes the relationship amongst the inputs
and output exactly. As well, we must estimate the coefficients b1 and b3 without error before
we can achieve the optimal reduction of variation.
If the candidate is binary (for example, supplier 1 or 2), then c in Equation (S16.1) can
take only one of two possible values—for example, –1 or +1—and we choose the level of
the candidate with the smaller value of (b1 + b3c)².
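A numerical sketch of Equation (S16.2) with made-up coefficients:

```python
# Sketch of Equation (S16.2): how the output standard deviation depends on
# the level c of the candidate. The coefficients and standard deviations
# below are illustrative, not from any real fitted model.
import numpy as np

b1, b3 = 2.0, -0.8          # hypothetical model coefficients
sd_x, sd_r = 1.5, 0.6       # sd of the dominant cause and of the residual

def sd_y(c: float) -> float:
    """sd(Y) = sqrt((b1 + b3*c)^2 sd(X)^2 + sd(R)^2), Equation (S16.2)."""
    return np.sqrt((b1 + b3 * c) ** 2 * sd_x ** 2 + sd_r ** 2)

c_best = -b1 / b3            # the level that removes the effect of X
print(c_best, sd_y(c_best))  # at c_best, sd(Y) collapses to sd(R) = 0.6
print(sd_y(0.0))             # away from c_best, sd(Y) is much larger
```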
Treatment A B C D E
1 –1 +1 –1 –1 +1
2 –1 –1 +1 +1 –1
3 –1 +1 +1 –1 –1
4 +1 +1 +1 +1 +1
5 +1 +1 –1 +1 –1
6 +1 –1 –1 –1 –1
7 +1 –1 +1 –1 +1
8 –1 –1 –1 +1 +1
Treatment A B C D E y(X = –1) y(X = +1)
1 –1 +1 –1 –1 +1 * *
2 –1 –1 +1 +1 –1 * *
3 –1 +1 +1 –1 –1 * *
4 +1 +1 +1 +1 +1 * *
5 +1 +1 –1 +1 –1 * *
6 +1 –1 –1 –1 –1 * *
7 +1 –1 +1 –1 +1 * *
8 –1 –1 –1 +1 +1 * *
In MINITAB, we need each row to correspond to a single run. We create the design by:
• Pasting a copy of the eight treatments into the next eight rows of the
spreadsheet
• Adding a column for the dominant cause, denoted X, with eight –1s followed
by eight +1s
We get the columns as shown in Table S16.3.
A B C D E X
–1 +1 –1 –1 +1 –1
–1 –1 +1 +1 –1 –1
–1 +1 +1 –1 –1 –1
+1 +1 +1 +1 +1 –1
+1 +1 –1 +1 –1 –1
+1 –1 –1 –1 –1 –1
+1 –1 +1 –1 +1 –1
–1 –1 –1 +1 +1 –1
–1 +1 –1 –1 +1 +1
–1 –1 +1 +1 –1 +1
–1 +1 +1 –1 –1 +1
+1 +1 +1 +1 +1 +1
+1 +1 –1 +1 –1 +1
+1 –1 –1 –1 –1 +1
+1 –1 +1 –1 +1 +1
–1 –1 –1 +1 +1 +1
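The same construction can be sketched in code, mimicking the copy-and-paste steps in MINITAB (the column names are ours):

```python
# Sketch: building the crossed design programmatically. The eight candidate
# treatments are repeated once for each level of the dominant cause X,
# giving 8 x 2 = 16 runs.
import pandas as pd

treatments = pd.DataFrame(
    [[-1, +1, -1, -1, +1],
     [-1, -1, +1, +1, -1],
     [-1, +1, +1, -1, -1],
     [+1, +1, +1, +1, +1],
     [+1, +1, -1, +1, -1],
     [+1, -1, -1, -1, -1],
     [+1, -1, +1, -1, +1],
     [-1, -1, -1, +1, +1]],
    columns=["A", "B", "C", "D", "E"],
)
# Cross with the two levels of the dominant cause.
crossed = pd.concat(
    [treatments.assign(X=-1), treatments.assign(X=+1)], ignore_index=True
)
print(crossed)   # 16 rows: each candidate treatment at both levels of X
```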
Next we use MINITAB to create a custom factorial design (see Appendix F) with the six
inputs. We can get the confounding structure for this design before we collect the data by using
a column of dummy output. Any set of 16 numbers will do. The given design has the following
confounding structure, where we show only main effects and two- and three-input interactions:
I + ABD + ACE
A + BD + CE
B + AD + CDE
C + AE + BDE
D + AB + BCE
E + AC + BCD
X
AX + BDX + CEX
BC + DE + ABE + ACD
BE + CD + ABC + ADE
BX + ADX
CX + AEX
DX + ABX
EX + ACX
BCX + DEX
BEX + CDX
The crossed design has the advantage that the two-input interactions between the
dominant cause X and the candidates are not confounded with any other two-input
interactions. This is a powerful motive for using the crossed design, given that the goal of
the experiment is to examine these interactions.
We recommend using a crossed design for desensitization experiments. There are other
possible designs, perhaps using fewer runs. See Hamada and Wu (2000).
for each of the eight treatments defined by the candidates. In the model, the residual represents
the variation in the eddy current measurements not explained by Brinell hardness. A good
treatment would have a large estimated slope b and a small residual standard deviation
(given as s in the MINITAB regression results—see Appendix E). We use |b/s| as the
performance measure. To illustrate the analysis, for treatment 7 we get:
Analysis of Variance
Source DF SS MS F P
Regression 1 1.5394 1.5394 32.62 0.000
Residual Error 62 2.9261 0.0472
Total 63 4.4656
The performance measure for treatment 7 is | –2.5951/0.2172 | = 12.0. The data with the
calculated performance measures are given in Table S16.4.
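A sketch of the performance-measure calculation (the regression data here would come from each treatment's measurements; only the treatment 7 summary numbers are from the book):

```python
# Sketch: the per-treatment performance measure |b/s|, the estimated slope
# of eddy current on Brinell hardness divided by the residual standard
# deviation from a simple linear regression.
import numpy as np

def performance_measure(brinell, eddy):
    """|slope / residual stdev| from a simple linear regression."""
    b1, b0 = np.polyfit(brinell, eddy, 1)
    resid = np.asarray(eddy) - (b0 + b1 * np.asarray(brinell))
    s = np.sqrt((resid ** 2).sum() / (len(eddy) - 2))
    return abs(b1 / s)

# Check against the book's treatment 7 summary: |-2.5951 / 0.2172| is
# about 12, where 0.2172 = sqrt(0.0472), the root of the residual MS.
print(round(abs(-2.5951 / 0.2172)))
```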
Treatment 7 gives the best performance. The Pareto plot of the effects given in Figure S16.1
summarizes the results across all treatments. The temperature main effect and the tempera-
ture by frequency interaction are the largest effects, but nothing stands out.
Figure S16.1 Pareto plot of the effects based on the performance measure (A: frequency,
B: temperature, C: gain).
We can also look at promising treatments in more detail by plotting the eddy current by
Brinell hardness stratified by the four levels of the cause as defined by:
Cause combination Day Cleaning time
1 1 5
2 2 5
3 1 19
4 2 19
From Figure S16.2 for treatment 7, we see that the eddy current measurement system
gives inconsistent output across the different levels of the cause. In addition, it does not work
well for any cause combination, even well cleaned parts. We also see that the average Brinell
hardness of the parts differed by day. This suggests the chemistry was indeed different
across the two days.
Figure S16.2 Scatter plot of eddy current hardness versus Brinell hardness for treatment 7 (plotting symbols represent the four values of the cause).
Chapter 17 Supplement
Feedforward Control Based on a
Dominant Cause
MACRO
selectivefit2 part1 part2
let cpart2=1
let I2=1
let temp=1
let allvals=0
endif
if (I2=1) AND (I=1)
let allvals=distc
else
Stack allvals distc allvals. #store all the distance values
endif
let I=I+1
endwhile
let I2=I2+1
endwhile
To call the macro, open the MINITAB file steering wheel vibration feedforward and
copy the following command into the command line, replacing “filelocation” with the loca-
tion of the macro on your system:
Another way to simulate the effect of selective fitting is to first build models that describe
the centers of gravity for the two components. Then, in the simulation, rather than drawing
samples from the existing data, we draw samples from the models. We need to be careful that
the models are appropriate. For an example of the modeling option, see the Chapter 17 exer-
cises and solutions.
where the residual is the variation due to all other causes not explicitly included in the
model. This is called a multiple regression model. For further discussion of regression models,
see the Chapter 12 supplement and Appendix E.
We want to avoid adding nondominant causes to the model because:
• Measuring additional inputs can be expensive.
• The inclusion of these inputs in the model can lead to overadjustment and, thus, to
increased output variation.
Remember that there are measurement errors, prediction errors, and adjustment errors; it does
not make sense to complicate the model for a marginal improvement in prediction.
Smoothers
A good smoother is the LOcally WEighted Scatterplot Smoother (LOWESS) available in
MINITAB (Cleveland, 1979). A LOWESS smoother models the cause/output relationship
with a smooth curve. Predictions for the output can then be determined for any input value
within the usual range. This prediction does not assume a linear relationship between the
cause and the output.
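The idea behind LOWESS can be sketched in a few lines. The following Python code is our own simplified version, a local linear fit with tricube weights, without the robustness iterations of Cleveland (1979) that MINITAB's implementation includes; it is illustrative only.

```python
import numpy as np

def loess_predict(x, y, x0, frac=0.5):
    """Locally weighted linear regression (a simplified LOWESS, without
    robustness iterations): predict the output at input value x0."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    r = max(2, int(np.ceil(frac * len(x))))    # points in the local window
    d = np.abs(x - x0)
    h = np.sort(d)[r - 1]                      # bandwidth: r-th nearest distance
    w = np.clip(d / h, 0.0, 1.0)
    w = (1 - w ** 3) ** 3                      # tricube weights
    W = np.diag(w)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # weighted least squares
    return beta[0] + beta[1] * x0
```

Because the fit is local, the prediction does not assume a straight-line relationship over the whole range of the cause.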
Chapter 18 Supplement
Feedback Control
To implement one of many versions of the rules, measure the output on one part at
fixed time periods, and:
• If the measured output falls in the green zone, make no adjustment.
• If the measured output falls in the red zone, make a full adjustment and use a
check (five consecutive parts in the green zone, for example) to ensure the
process is properly centered.
• If the measured output falls in the yellow zone, measure another part. If the
second measured value falls in the yellow (on the same side of the target) or red
zones, make an adjustment. Otherwise, continue with no adjustment.
Satterthwaite (1954) first introduced Precontrol, which has received considerable attention
in the research literature. For example, see Traver (1985), Shainin and Shainin (1989), Mack-
ertich (1990), and Ledolter and Swersey (1997b). For Precontrol to be successful, the drift in
the process must be relatively slow compared to the frequency of sampling and small relative
to the specification range. Note that we need to add a rule to determine the size of the adjustment.
As another alternative, we can use a control chart to monitor the process and signal the
need for adjustment. In the simplest version, we periodically measure one or more output
values and plot the measured values on X̄ and R charts. We make an adjustment if there is
an out-of-control point on either chart. Again, we need a rule to decide on the size of the
adjustment. To be successful, the sampling must be frequent enough to quickly detect
changes in the process center.
Unlike Precontrol, we can also use control charts for process monitoring and for detecting
the action of a cause of variation. A good comparison of process monitoring and feedback
control is given in Box and Kramer (1992).
In the simulation, we keep track of the cumulative adjustment made to the series. Once
the feedback control scheme is in operation, we observe only the adjusted series. To fix the
notation we define
• yt as the unadjusted output at time t
• y*t as the adjusted (observed) output at time t when using the feedback
controller
• at as the adjustment made at time t (note the adjustment is not felt in the
process until time t + 1)
• ct as the cumulative adjustment up to and including time t
Then, we have:
• ct = a1 + a2 + … + at = ct–1 + at; that is, the cumulative adjustment is the sum of
all previous adjustments
• y*t = yt + ct–1; that is, the observed output is the original (unadjusted) output plus
the cumulative adjustment
As long as we keep track of the cumulative adjustments in the simulation, we can use these
equations to go back and forth between the original and adjusted series.
To simulate the effect of applying the feedback controller, we start at time 1 and calculate
adjustment a1 and the cumulative adjustment c1 = a1. Then, at time 2, we observe the output
y2* = y2 + c1. Next, we apply the feedback control rules to the observed series y1*, y2* to get the
adjustment at time 2, a2, and the cumulative adjustment c2 = a1 + a2 = c1 + a2. And so on.
The Matlab (see http://www.mathworks.com) code that follows illustrates the retrospec-
tive application of a feedback control scheme to the flow rate example from Chapter 18. The
simulation is also straightforward to implement in Microsoft Excel. We show both the sim-
ulation of a controller based on the EWMA forecast with parameter alpha (as suggested in
Chapter 18) and a simpler feedback controller that adjusts back to target if the observed flow
rate is outside the range [99, 107].
function []=feedbacksim(flowrate,alpha,dev)
%simulate the effect of using feedback on the fascia film build example
%original flow rate data given in vector "flowrate"
%example function call: feedbacksim(flowrate,0.2,4)
%try out the feedback controller based on the EWMA forecast of the unadjusted series,
%and based on partial adjustment of the deviation from target 103
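The Matlab function is shown only in outline above. As a self-contained illustration (our own sketch, not the book's code), the simpler partial-adjustment controller is easy to simulate in Python; the target 103 is the flow rate target mentioned in the Matlab comments.

```python
import numpy as np

def feedbacksim(flowrate, alpha, target=103.0):
    """Retrospectively apply a feedback controller to an unadjusted series.

    At each time step we observe the adjusted value y*_t = y_t + c_{t-1},
    then make the partial adjustment a_t = -alpha * (y*_t - target), which
    is felt one period later.
    """
    c = 0.0                    # cumulative adjustment, felt one period later
    adjusted = []
    for y in flowrate:
        y_obs = y + c          # the value we would actually observe
        adjusted.append(y_obs)
        c += -alpha * (y_obs - target)
    return np.array(adjusted)
```

For a process running persistently off target, the simulated adjusted series converges to the target, and we can compare the standard deviations of the original and adjusted series to assess the benefit.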
Note that when using an exponential smoother to predict the next output value (as with
the flow rate data shown in Figure 18.6) and making the full adjustment, MINITAB auto-
matically gives the standard deviation of the adjusted series as the square root of the MSD,
so there is no need to simulate.
Simulating the implementation of the proposed feedback controller using historical
data will most likely overestimate the potential benefit since:
• We use the same historical data both to fit the model and to develop the
prediction equation.
• The simulation assumes there is no adjustment error.
An alternative way to assess the potential benefit of feedback is to fit a time series model
to the historical data and simulate new output data. With a model for the output we need to
check that the simulated series without any adjustment seems reasonable when compared to any
historical data we have from the existing process. An advantage of having a model is that we
can repeatedly simulate the effect of the proposed feedback controller.
output yt+1, using all previous outputs yt, yt–1, yt–2, ..., where yt denotes the value of the out-
put measured at time t. For the moment, we assume there have been no adjustments. Denot-
ing the one-step-ahead forecast as ŷt+1, two possible predictors are:
• Moving average: the average of the last k outputs; that is,
ŷt+1 = (yt + yt–1 + … + yt–k+1)/k
Figure S18.2 Possible weights (left panel moving average with k = 3, right panel EWMA
with α = 0.6).
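The weights plotted in Figure S18.2 can be computed directly. This Python sketch (our illustration) assumes the EWMA forecast ŷt+1 = α(yt + (1 – α)yt–1 + (1 – α)²yt–2 + …).

```python
import numpy as np

def ma_weights(k, horizon=10):
    """Weights a moving-average forecast places on y_t, y_{t-1}, ...:
    1/k on each of the last k outputs, zero before that."""
    return np.array([1.0 / k if i < k else 0.0 for i in range(horizon)])

def ewma_weights(alpha, horizon=10):
    """Weights the EWMA forecast places on y_{t-i}: alpha * (1 - alpha)**i,
    a geometric decay over past observations."""
    return np.array([alpha * (1 - alpha) ** i for i in range(horizon)])
```

Here ma_weights(3) reproduces the left panel of Figure S18.2 and ewma_weights(0.6) the right panel: the moving average weights the last k outputs equally, while the EWMA weights decay geometrically.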
(that is, the adjusted series) at time t, and T is the target value for the process center. We use the
same notation as in the previous section to define the adjustment at time t, at, and the cumu-
lative adjustment at time t, ct.
Without loss of generality, we assume yt and y*t have been scaled so that the target value
is zero. Now suppose at each time period we make an adjustment to partially compensate
for the difference between the observed output y*t and the process target; that is, we set

at = –α(y*t – 0)

Then, the cumulative adjustment at time t is given by

ct = ct–1 + at
   = ct–1 – αy*t
   = ct–1 – α(yt + ct–1)
   = –αyt + (1 – α)ct–1
   = –αyt + (1 – α)(ct–2 + at–1)
   = –αyt + (1 – α)(ct–2 – α(yt–1 + ct–2))
   = –α(yt + (1 – α)yt–1) + (1 – α)²ct–2
   = –α(yt + (1 – α)yt–1) + (1 – α)²(ct–3 – α(yt–2 + ct–3))
   = –α(yt + (1 – α)yt–1 + (1 – α)²yt–2) + (1 – α)³ct–3
   …
   = –α(yt + (1 – α)yt–1 + (1 – α)²yt–2 + …)

so ct = –ŷt+1
Hence the cumulative adjustment is given by Equation (18.1); that is, we fully compensate
for the deviation between the EWMA forecast for yt+1 made at time t and the target. Since,
for the adjusted series, the predicted value at time t + 1 equals ŷ*t+1 = ŷt+1 + ct, the predicted
output at time t + 1 will be on target.
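The derivation can be checked numerically. In this Python sketch (ours), starting the EWMA recursion from zero matches the assumption c0 = 0; for any series, the cumulative adjustments equal minus the one-step-ahead EWMA forecasts term by term.

```python
import numpy as np

def cumulative_adjustment(y, alpha):
    """Cumulative adjustments c_t from the partial-adjustment rule
    a_t = -alpha * y*_t, with the target scaled to zero."""
    c, out = 0.0, []
    for yt in y:
        y_obs = yt + c           # adjusted (observed) output
        c += -alpha * y_obs      # partial adjustment
        out.append(c)
    return np.array(out)

def ewma_forecast(y, alpha):
    """One-step-ahead EWMA forecasts of the unadjusted series,
    yhat_{t+1} = alpha * (y_t + (1-alpha) y_{t-1} + ...), starting from 0."""
    f, out = 0.0, []
    for yt in y:
        f = alpha * yt + (1 - alpha) * f
        out.append(f)
    return np.array(out)
```

Comparing cumulative_adjustment(y, alpha) with –ewma_forecast(y, alpha) for any series confirms ct = –ŷt+1.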
Partial Adjustment
In discussing a feedback controller based on an EWMA forecast, we assume a full adjustment
is made after each observation. Due to cost or other considerations, we may decide to make
an adjustment only if the predicted deviation from target is large.
We can easily assess the effect of occasional adjustments by keeping track of the cumulative
adjustment made up to time t (that is, ct–1) so that we can recreate the series we would
have observed had no adjustments been made (yt = y*t – ct–1). Using this unadjusted (and
unobserved) series, we can determine an EWMA or any other forecast for the next time
period. Then, translating back to the observed series, we compare the predicted observed
output (that is, ŷt+1 + ct–1) to the process target to decide if further adjustment is needed.
Chapter 19 Supplement
Making a Process Robust
Notice that the objective of parameter design is very different from a pure scientific
study. The goal in parameter design is not to characterize the system but to achieve
robust function. Pure science strives to discover the causal relationships and to
understand the mechanics of how things happen. Engineering, however, strives to
achieve the results needed to satisfy the customer. Moreover, cost and time are very
important issues for engineers. Science is to explain nature while engineering is to
utilize nature. (p. 130)
Taguchi considers using experimental design in both the product development process
and to improve an existing process. This is a much broader view of improvement than is
taken in Statistical Engineering, where we focus on improving an existing process.
10 log(ȳ²/s²) or –10 log(s²)

where ȳ is the average output in a run and s is the standard deviation of the output values
measured within a run. Note that –10 log(s²) = –20 log(s), so that the Taguchi S/N ratio
–10 log(s²) is a rescaled version of the performance measure we recommend to analyze
robustness experiments where the goal is to reduce the within-run variation. We get identical
conclusions from the analysis with either performance measure.
In some problems, the goal is to lower or increase the process center. Taguchi calls these
smaller/larger-is-better problems and recommends the S/N ratios:
Larger is better: –10 log[(1/r)(1/y1² + … + 1/yr²)]
where y1,..., yr are the r measured values of the output characteristic on the repeats within
each run. The idea is to calculate the performance measure (S/N ratio) for each run and then
analyze these measures as the response in the experiment. In all cases, S/N ratios are
defined so that larger values are better in terms of the goal of the problem. See Box (1988)
for further discussion of signal-to-noise ratios. We recommend using two performance
measures, the within-run average and log standard deviation, rather than a single S/N ratio.
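These per-run summaries are straightforward to compute outside MINITAB. A Python sketch (ours, not MINITAB's Taguchi tools); note again that ranking runs by –10 log(s²) or by log(s) leads to the same conclusions, since one is a rescaled (and sign-reversed) version of the other.

```python
import numpy as np

def sn_nominal(y):
    """Taguchi nominal-is-best S/N ratio, 10 log10(ybar^2 / s^2)."""
    y = np.asarray(y, dtype=float)
    return 10 * np.log10(y.mean() ** 2 / y.var(ddof=1))

def sn_larger_is_better(y):
    """Larger-is-better S/N ratio, -10 log10((1/r) * sum(1 / y_i^2))."""
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(1.0 / y ** 2))

def recommended_measures(y):
    """The pair recommended in this book: the within-run average
    and the log of the within-run standard deviation."""
    y = np.asarray(y, dtype=float)
    return y.mean(), np.log(y.std(ddof=1))
```

Each function takes the r repeat measurements within a single run; the resulting measures are then analyzed as responses across runs.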
Many of Taguchi’s designs and methods of analysis are available in MINITAB. We do
not recommend them.
APPENDICES
Using MINITAB
In the appendices we show how to create the graphs and statistical analyses described in this
book using the statistical software package MINITAB. Throughout we assume the reader is
familiar with a Windows-based environment.
MINITAB is a general-purpose statistical software package (see http://www.minitab.com)
that is widely used in industry for quality improvement.
There are other suitable packages such as SAS, Splus, Statgraphics, JMP, and Systat. We do
not recommend Microsoft Excel because we find it cumbersome and inefficient for produc-
ing the required analyses. In this book we illustrate MINITAB release 13.30 for Windows.
MINITAB is easy to use since its interface is based on pull-down menus and dialog
boxes. The capabilities of MINITAB are well explained in the manuals (MINITAB User’s
Guide 1 and 2, 2000a and 2000b) and by the online help. In particular, in each dialog box
there is an option for context-sensitive help that usually includes an example. A good ref-
erence book for MINITAB is Ryan et al. (2000).
In these appendices, we show how to select a MINITAB command through its pull-
down menu structure using words and arrows. For example, a multivari chart is obtained in
MINITAB using the menu selection:
Stat → Quality Tools → Multi-Vari Chart

That is, from the Stat menu, select Quality Tools and then Multi-Vari Chart.
Appendix A
Data Storage and Manipulation
The first step is to get the data into the MINITAB worksheet. To illustrate, we use an inves-
tigation where the goal was to compare the performance of three different measurement
devices—a feeler gage, a height gage, and a scale gage—for measuring the distance
between a bottle lip and a label. The data are given in the file label height measurement.
To open a MINITAB worksheet (a file with the extension .mtw) we use:
File → Open Worksheet
To open an existing MINITAB project, which includes a worksheet with the data and
all numerical and graphical summaries of the data we previously produced, use
File → Open Project
(output or input), and each row represents a different observation. This format is not neces-
sarily intuitive for people who commonly use Excel. Also, in MINITAB every column (i.e.
characteristic) can be given a descriptive name.
To illustrate why the row/column format is recommended, see Figure A.2, which
shows how the data for the measurement investigation were initially recorded. The Excel
worksheet in Figure A.2 gives the results only for the feeler gage. The results for the height
and scale gages were stored in separate worksheets. In Figure A.2, the meaning of each data
value depends not only on its location in the worksheet, but also on the location of various
labels. For example, the value in cell B2 (0.062) gives the lowest value of trial 1 of the first
operator for bottle A using the feeler gage. This way of storing data is not convenient when
doing analysis, since determining the meaning of the various data values is difficult.
Figure A.1 shows how the same data for all three gages were stored in MINITAB. The
label above each column provides the name of the stored characteristic, and each row rep-
resents a different observation, or in this case, a measurement. The 0.062 value discussed
previously is now stored in the first row of column C3 (labeled feeler_low). We can iden-
tify the corresponding part (bottle 1) and operator (A) by examining the values of the other
characteristics in the same row.
The row/column format is more efficient because it makes the relationships among the
collected values explicit. We can also store data in Excel in the row/column format.
Translating from the original data format, shown in Figure A.2, to the row/column
format, shown in Figure A.1, is a tedious process.
Calc → Calculator
In the label height measurement example, the minimum and maximum heights (as the bottle
was rotated) were recorded. In Figure A.5, we use the calculator to record the difference
(max–min) for each feeler gage measurement in column C9.
To use the calculator, we enter a formula involving the existing characteristics in the
expression window. The expression can be typed in directly or identified using the
select button. The expression may include any of the functions given in the list. Many
standard statistical functions such as average and standard deviation are available. Note
that (most of) these functions act simultaneously on all rows across different columns.
Column summaries (for example, the mean or standard deviation of a characteristic) are
discussed in Appendix B.
When selecting the subset, we select the data observations to either include or exclude.
There are a variety of ways of specifying the subset of observations. We can give the row numbers,
select the rows beforehand using brushing, or give a condition that must be satisfied.
Figure A.7 Brake rotor balance verification experiment plan and data.
The corresponding MINITAB dialog box with the required information filled in is
given in Figure A.8.
Stacking the columns produces the first two columns shown in Figure A.9, where the
second column has been given a descriptive label. To produce Figure 13.4, we need an
additional column that gives the treatment corresponding to each weight. Figure A.9 shows
the results of using the command to make patterned data as described in Section A.2.
The data were originally stored as in Figure A.7 because, for subsequent analysis, we
calculate main and interaction effects based on the average balance weight.
MACRO
macrotest diameter adjusted
let I=1
let cumad=0 #cumulative adjustment needed based on rule
let nadj=0
while I<=200
let diameter2=diameter(I)+cumad #current value
let adjusted(I)=diameter2
#determine if further adjustment needed
if diameter2<2.7
let cumad=cumad+(6.7-diameter2)
let nadj=nadj+1
endif
if diameter2>10.7
let cumad=cumad-(diameter2-6.7)
let nadj=nadj+1
endif
let I=I+1
endwhile
ENDMACRO
We save the macro in a separate text file and call the macro using the command line editor.
The command line editor is available using:
Edit → Command Line Editor
The result of executing the macro is shown in the data file snapshot in Figure A.11. At
observation 39, the observed diameter was 11.7. In the example, this was the first time an
adjustment was required. At that time, the cumulative adjustment was set to –5. We can see
the effect of the adjustment on subsequent diameters. Later in the data file (not shown in
Figure A.11) further adjustments are necessary.
To write MINITAB macros, we can use the Show History dialog button that gives the
command line interpretation for any MINITAB commands run using the pull-down menus.
The History dialog button is the small yellow button with a prompt (>) at the far right at the
top of the MINITAB window, as shown in Figure A.12.
Another example macro to simulate the effect of selective fitting is given as part of the
solution to Exercise 17.2.
Appendix B
Numerical Summaries
We use a manifold sand scrap investigation to illustrate various numerical data summaries
in MINITAB. In the investigation, each manifold was classified as scrap or not and linked
(as well as possible) to a number of sand and pour characteristics. We have a total of 17
characteristics measured on 970 castings. The data are given in the file manifold sand scrap
comparison.
The summary measures are defined as follows: Mean is the sample average and StDev is
the sample standard deviation. If we order the values, the smallest is given by Minimum, the
largest by Maximum, and the middle value by Median. Q1 and Q3 define the first and
third quartiles—that is, the observation halfway in the ordered list between the minimum
and the median, and halfway between the median and the maximum, respectively. The
median defines the second quartile.
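These summaries are easy to reproduce outside MINITAB. A Python sketch (ours); note that quartile interpolation conventions differ slightly across packages, so Q1 and Q3 may not match MINITAB exactly in small samples.

```python
import numpy as np

def descriptive_statistics(y):
    """Summaries like MINITAB's Display Descriptive Statistics.
    (numpy's default linear interpolation for percentiles may differ
    slightly from MINITAB's quartile convention.)"""
    y = np.asarray(y, dtype=float)
    return {
        "Mean": y.mean(),
        "StDev": y.std(ddof=1),       # sample standard deviation
        "Minimum": y.min(),
        "Q1": np.percentile(y, 25),
        "Median": np.median(y),
        "Q3": np.percentile(y, 75),
        "Maximum": y.max(),
    }
```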
We can also produce numerical summaries for any characteristic stratified by a discrete
characteristic. The dialog box in Figure B.1 shows how to stratify the summary for pour
time by the mold hour.
The results are:
Sometimes we need to save numerical summaries in the worksheet for further analysis.
We request storage of some of the descriptive summaries using:
Stat → Basic Statistics → Store Descriptive Statistics
From the dialog box (see Figure B.2) selecting Statistics, we check off the data summaries
we want to save. This is useful in the analysis of multivari investigations that involve a part-
to-part family or other family that is expected to have a haphazard effect.
Note that numerical summaries are sometimes given using exponential notation.
Exponential notation is convenient for very small or very large numbers. The integer given
after the E tells us how many positions (to the right, if the integer is positive; to the left, if
the integer is negative) to shift the decimal point; for example, 4.49E–03 = 0.00449.
We see that there are only two possible values for scrap (where 0 represents a passed man-
ifold and 1 a manifold that was scrapped). The proportion of scrap is roughly 12%.
In such situations, we may also be interested in determining whether a relationship
exists between two discrete characteristics. We use the MINITAB menu selection:
Rows: mold hour   Columns: scrap

            0        1      All
 3        141       25      166
        84.94    15.06   100.00
 4        148       24      172
        86.05    13.95   100.00
 5        151       25      176
        85.80    14.20   100.00
 6        156       24      180
        86.67    13.33   100.00
 7        133        9      142
        93.66     6.34   100.00
 8        124       10      134
        92.54     7.46   100.00

Cell Contents --
            Count
            % of Row
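The counts and row percentages in such a table are simple to reproduce; this Python sketch (ours, not MINITAB output) computes them for any pair of discrete characteristics. For mold hour 7, for example, 133 of 142 manifolds passed, or 93.66%.

```python
from collections import Counter

def cross_tabulation(rows, cols):
    """Counts and row percentages for two discrete characteristics,
    like MINITAB's Cross Tabulation with Cell Contents: Count, % of Row."""
    counts = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    return {
        cell: (n, 100.0 * n / row_totals[cell[0]])
        for cell, n in sorted(counts.items())
    }
```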
Appendix C
Graphical Summaries
In all investigations, graphical summaries play an essential role in the analysis. Often we use
only a graphical summary to draw conclusions.
C.1 HISTOGRAM
A histogram summarizes the distribution of a continuous characteristic in the data set.
Histograms are available through the MINITAB menu selection:
Graph Æ Histogram
To make comparing histograms easier, we recommend using the option button and
choosing percent rather than the default frequency display. In the manifold sand scrap
example introduced in Appendix B, we enter the characteristic pour time as the graph vari-
able to get the histogram in Figure C.1. The data are given in the file manifold sand scrap
comparison.
Figure C.1 Histogram of pour time.
The histogram shows the distribution of pour times. In Figure C.1, we see that the aver-
age pour time is roughly 4.75 seconds and that there are a few large pour times above 7 seconds.
Note that the time order of the data is lost in a histogram.
If there is a large number of observations, it may be difficult to identify outliers (unusually
small or large values) in a histogram. This occurs, for instance, in the histogram of align-
ment pull given in Figure 1.2. Box plots, covered in Section C.3, are a good alternative
when the number of observations is large or when we wish to compare distributions.
The run chart of pour time in Figure C.2 shows the time order. Figure C.2 suggests the
variation in pour times is greater at the beginning of the investigation. At the end, say after
casting 700, the variation is noticeably smaller.
The run chart assumes the observations are equally spaced in time. If this is not the
case, we use a scatter plot (see Section C.4) where the horizontal axis is defined in terms of
time. If the data are collected in subgroups over time, say five units measured each hour, we
can also use a multivari plot (see Section C.5).
Figure C.2 Run chart of pour time.
In Figure C.4 we stratify pour time by whether or not the part is scrapped. The horizontal
line in the middle of each box gives the median value. The upper and lower edges of the rectan-
gle show the first and third quartiles, Q1 and Q3, defined in the descriptive statistics summary in
Appendix B. The so-called whiskers are the lines coming out of the central rectangle. The ends
of the whiskers identify the range of the data with the exception of unusual values plotted as sep-
arate stars. The plotting of unusual values with separate symbols is useful for finding outliers.
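The box plot summary can be computed directly. In this Python sketch (ours), we assume the common convention that whiskers reach the most extreme observations within 1.5 times the interquartile range of the box; points beyond are flagged as outliers.

```python
import numpy as np

def boxplot_stats(y):
    """Box plot summary: quartiles, whisker ends, and outliers.
    Assumes the usual 1.5 * IQR whisker convention."""
    y = np.asarray(y, dtype=float)
    q1, median, q3 = np.percentile(y, [25, 50, 75])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = y[(y >= lo) & (y <= hi)]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside.min(), "whisker_high": inside.max(),
        "outliers": sorted(y[(y < lo) | (y > hi)]),
    }
```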
Figure C.4 Box plots of pour time stratified by scrap (0 = passed, 1 = scrapped).
If the number of observations summarized in the box plot is small, we prefer displaying
the individual observations rather than the summary statistics. In MINITAB we change the
display items to Individual Symbol, rather than the default of IQRange Box and Outlier
Symbol. In MINITAB release 14.11, we can plot individual observations using:
For example, Figure C.5 shows measurement error stratified by part number for the
camshaft diameter relative bias investigation discussed in Chapter 7. The data come from
the file camshaft journal diameter measurement2. In Figure C.5, we see that the measurement
error is negative in all cases and that there is little difference in the distribution of the
errors among the parts.
Figure C.5 Box plot (showing individual observations) of measurement error by part number.
Scatter plots are useful for examining the relationship between two characteristics. In
the manifold sand scrap example we examine how pour time changes with ladle number
(see Figure C.6).
The resulting scatter plot, given in Figure C.7, suggests there is no association between
pour time and ladle number.
Figure C.7 Scatter plot of pour time versus ladle number.
Adding Jitter
In the crossbar dimension robustness investigation from Chapter 19, eight treatments were
used, and each run consisted of five consecutive parts for each treatment. The data are given
in the file crossbar dimension robustness. The output “burn” could take only four possible
values. Plotting burn by treatment, as shown in the left panel of Figure C.8, is helpful but
difficult to interpret, since we cannot see the output for all five parts in each run. Many
observations are plotted at the same position, called overplotting. We add jitter (small
random displacements) to separate the overlapping points, as in the right panel of Figure C.8.
Figure C.8 Scatter plot of burn by treatment—regular on left, with jitter in vertical
direction on right.
Labeling Points
The plotted points can be labeled using the values of any discrete characteristic. Consider
the cylinder head scrap example discussed in Chapter 12. The data are given in the file
cylinder head scrap multivari. The team wanted to see if the relationship between side shift
(the output) and time depended on pattern.
They used the dialog box, as shown in Figure C.10, to make the scatter plot given in
Figure C.11. Note the change to the data display in the Plot dialog box. There are four dif-
ferent plotting symbols, one for each pattern. Generally, pattern 1 has the largest side shift
and pattern 2 the smallest, but there is no dependency on time.
Figure C.11 Scatter plot of side shift by time with labels for different patterns.
Graph → Plot → Draftsman Plot [option each Y versus each X under Graph →
Matrix Plot in release 14.11]
The draftsman plot automatically produces all scatter plots that involve the output
characteristic (Y variable) and the list of input characteristics (X variables). Figure C.12
shows the dialog box.
In Figure C.13, we see no clear relationship between the sand scrap proportion and the
various sand characteristics. Upon closer inspection, however, a quadratic relationship
between temperature and sand scrap proportion becomes apparent. This example illustrates the point that
if too many characteristics are chosen, draftsman plots (and matrix scatter plots, as will be
discussed) will be hard to interpret, since each individual plot will be very small. In that case
we look at the scatter plots one at a time.
A matrix plot is another way of producing multiple scatter plots simultaneously. Unlike
with a draftsman plot, we now make no distinction between inputs and outputs. The win-
dow leaks problem discussed in Chapter 12 provides an example, where we also label the
plotted points to show leakers and nonleakers. The data are given in the file window leaks
comparison. A matrix scatter plot is available using:
Selecting all the inputs and using class as a group variable, as shown in Figure C.14,
we obtain the matrix scatter plot in Figure C.15.
Figure C.15 Matrix scatter plot of plastisol amount, glass flushness, slider drain hole gap, urethane stop/start gap, and backglass gap, labeled by class.
Leaking windows are shown with circles, and nonleakers are shown with plus signs.
There is a separation between leakers and nonleakers if we look simultaneously at the two
inputs: plastisol amount and primary seal fit. The plot in the lower left corner of Figure C.15
is reproduced in a larger format in the right panel of Figure 12.2.
Using the selections made in Figure C.16 results in the multivari chart given in Figure C.17.
Figure C.17 shows large systematic differences among the patterns (cavities). The vari-
ation within each pattern is about 60% of the overall variation.
The dialog box in Figure C.18 creates a multivari chart involving two inputs—time and
pattern. From the resulting chart in Figure C.19, we conclude that the dominant cause acts
only in the pattern-to-pattern family and does not involve the time-to-time family.
A multivari chart with three or four inputs can be difficult to interpret. Using a variety
of different charts and orders for the inputs may help.
Figure C.19 Multivari chart of side shift versus pattern and time.
To create the multivari charts for the group standard deviation, we also need to define
new input columns that give the values of the original inputs for each level of the charac-
teristic group. This is best accomplished using:
Manip → Code → Use Conversion Table
We define a new input column for cavity as shown in Figure C.22. We label the new
input column “scavity,” since it tells us the value of the cavity that corresponds to the group
standard deviation column. We similarly define “stime” and “sposition.”
With this preliminary work, we can now create the desired multivari charts for group stdev.
We see that the time-to-time variation is captured well by the exponential smoothing.
The smoothing constant (alpha) is 0.17, and the standard deviation of the smoothed
series is given by the square root of the mean squared deviation (MSD). In the example,
we have √3.83 = 1.96.
Single exponential smoothing plot for flow rate, showing actual and predicted values (MAPE: 1.53881, MAD: 1.56194, MSD: 3.82725).
We can also use MINITAB to determine how the adjusted series will look (assuming
no adjustment errors) by calculating the output minus the target value and fitting the single
exponential smoother to the translated output. Saving the fitted values (using the Storage
button from Figure C.23), we can determine the adjusted series by subtracting the fitted
values from the original film build values.
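The calculation just described can be sketched in a few lines of Python (our illustration): smooth the deviations from target, then subtract the fitted (predicted) deviations from the original values. The initialization of the smoother is a choice, and MINITAB's may differ from the one assumed here.

```python
import numpy as np

def ses_adjusted_series(y, alpha, target):
    """Subtract single-exponential-smoothing predictions of the deviation
    from target, mimicking the adjustment calculation described above.
    Assumption: the smoother's level starts at the first deviation."""
    y = np.asarray(y, dtype=float)
    dev = y - target
    fitted = np.empty_like(dev)
    level = dev[0]
    for t in range(len(dev)):
        fitted[t] = level                  # one-step-ahead predicted deviation
        level = alpha * dev[t] + (1 - alpha) * level
    adjusted = y - fitted                  # remove the predicted deviation
    msd = np.mean((dev - fitted) ** 2)     # sqrt(MSD) ~ sd of the adjusted series
    return adjusted, msd
```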
Appendix D
Analysis of Variance (ANOVA)
Analysis of variance (ANOVA) is the main formal numerical analysis tool associated with
the search for a dominant cause of variation. While in most investigations we draw conclu-
sions from graphical displays, we can use ANOVA to:
• Assess a measurement system (Chapter 7).
• Compare contributions of two families to output variation (Chapter 10).
• Supplement the analysis of multivari investigations (Chapter 11).
• Set a new goal for a reformulated problem (Chapter 14).
ANOVA is applicable when the output is continuous and we have one or more discrete
inputs. See Box et al. (1978) and Neter et al. (1996). Here we describe a few forms of
ANOVA that we may need when applying the Statistical Engineering algorithm.
ANOVA partitions the overall variation (as quantified by the total sum of squares) into
components attributable to various inputs or families. We recommend using ANOVA as a
supplement to graphical displays if the results are unclear.
Consider the cylinder head scrap example discussed in Chapter 11. The data are given
in the file cylinder head scrap multivari. Figure D.1 shows a plot of the side shift (the output)
versus pattern (an input). With ANOVA, we quantify how much of the variation in the side
shift can be explained by differences among the four mold patterns.
Figure D.1 Side shift by pattern.
Using the dialog box shown in Figure D.2, we get the ANOVA results:
Pooled StDev provides an estimate of the variation within each pattern pooled
across all patterns. That is, Pooled StDev provides an estimate of remaining output
variation if we could eliminate all pattern-to-pattern differences in the average output.
It estimates the variation attributable to all families other than the pattern-to-pattern
family.
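The partition of the total sum of squares and the Pooled StDev can be sketched directly. The following is a minimal pure-Python one-way ANOVA with hypothetical side shift values for four patterns (MINITAB's Stat → ANOVA → One-way computes the same quantities plus the F test):

```python
# One-way ANOVA sketch: partition the total sum of squares into
# between-group and within-group components (hypothetical data).

def one_way_anova(groups):
    all_values = [y for g in groups for y in g]
    n = len(all_values)
    grand_mean = sum(all_values) / n
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_within = sum(
        sum((y - sum(g) / len(g)) ** 2 for y in g) for g in groups
    )
    ss_between = ss_total - ss_within
    # Pooled StDev: square root of the within-group mean square
    pooled_stdev = (ss_within / (n - len(groups))) ** 0.5
    return ss_between, ss_within, ss_total, pooled_stdev

# Hypothetical side shift values, three castings from each of four patterns
patterns = [
    [0.010, 0.015, 0.008],
    [-0.020, -0.025, -0.018],
    [0.000, 0.004, -0.002],
    [-0.035, -0.030, -0.033],
]
ss_b, ss_w, ss_t, pooled = one_way_anova(patterns)
```

In this constructed example the between-pattern sum of squares dominates the within-pattern sum of squares, the situation in which the pattern-to-pattern family contains the dominant cause.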
We can also use a one-way ANOVA model to estimate the variation due to the meas-
urement system. In Chapter 7, we described a measurement investigation for the camshaft
journal diameter example. The data are given in the file camshaft journal diameter meas-
urement. A one-way ANOVA by part is:
Source DF SS MS F P
time 11 9008.82 818.98 31.81 0.000
cavity 5 16994.99 3399.00 132.00 0.000
position 3 16697.24 5565.75 216.15 0.000
time*cavity 55 1544.02 28.07 1.09 0.311
time*position 33 7580.19 229.70 8.92 0.000
cavity*position 15 5363.43 357.56 13.89 0.000
time*cavity*position 165 1634.06 9.90 0.38 1.000
Error 576 14832.00 25.75
Total 863 73654.75
To compare the relative sizes of the families, we look at the sum of squares (SS) column.
We see large, roughly equal-sized effects due to position, cavity, and error. The error sum of
squares includes the effect of the casting-to-casting family and all interactions between the
casting-to-casting family and the other families. The calculated sum of squares tells us
approximately how much we could expect to reduce the total sum of squares if we could elim-
inate all variation due to the given family. It is complicated to translate these sums of squares
into estimates for the standard deviation attributable to each family, but there is a rough cor-
respondence. See the supplement to Chapter 11 for a complete discussion of the casting
thickness example.
If some data are lost, the assumption of balance (equal number of observations at each
level of each input) is violated and the balanced ANOVA analysis is no longer appropriate.
For unbalanced data, we use a general linear model:
Stat → ANOVA → General Linear Model
Appendix E
Regression Models and Analysis
With regression, we model the relationship between an output characteristic and one or more
inputs. We find regression models useful for the following tasks:
• Investigate variation transmission (Chapters 10 and 11).
• Investigate the relationship between the output and inputs (Chapter 12).
• Set the goal for a reformulated problem (Chapter 14).
• Find a prediction equation for a feedforward controller (Chapter 17).
• Calibrate an adjuster (Chapter 18).
There are many good references. See Box et al. (1978), Ryan (1989), and Montgomery
et al. (2001).
The term b0 + b1 × input captures the effect of the input, and the residual describes the variation in
the output due to all other inputs. The regression analysis consists of estimating (also called
fitting) the unknown constants b0 and b1 and the residual variation.
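The least squares estimates of b0 and b1, and the residual standard deviation, can be sketched in a few lines. The data here are hypothetical; MINITAB's regression command computes the same quantities along with standard errors and the ANOVA table:

```python
# Simple linear regression by least squares (hypothetical data).

def fit_line(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    # slope: Sxy / Sxx
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # residual standard deviation, with n - 2 degrees of freedom
    s = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return b0, b1, s

b0, b1, s = fit_line([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8])
```

Here s plays the role of S in the MINITAB output: an estimate of the output variation left over after accounting for the input.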
In MINITAB, regression analysis is available using:
To illustrate, we use the truck alignment example introduced in Chapter 1 and discussed in
Chapter 12. The data are given in the file truck pull input-output. Figure E.1 is the MINITAB
Regression dialog box, used to fit a regression model with output right caster and input
U-reading.
The corresponding MINITAB regression results are:
Analysis of Variance
Source DF SS MS F P
Regression 1 0.38566 0.38566 10.39 0.003
Residual Error 28 1.03893 0.03710
Total 29 1.42459
Unusual Observations
Obs U readin right ca Fit SE Fit Residual St Resid
22 2.09 5.1663 4.9089 0.1121 0.2574 1.64 X
[Figure: fitted line plot of right caster (4.2 to 5.2) versus U-reading (–1 to 2).]
We illustrate using the crossbar dimension example covered in Chapter 12. The data
from the investigation are given in the file crossbar dimension input-output. The left
panel in Figure E.3 shows the resulting graph and numerical regression model summary
when fitting a model for dimension as a linear function of barrel temperature. We see
that the linear model fits quite well and (since the dimension variation in the regression
data matches the baseline) that barrel temperature is a dominant cause. However, we
also notice that the relationship between dimension and barrel temperature seems to be
nonlinear.
Figure E.3 Fitted line plots of dimension versus barrel temp (74 to 80).
Left panel (linear model):
dimension = –23.8980 + 0.322935 barrel temp
S = 0.254387, R-Sq = 78.9%, R-Sq(adj) = 78.3%
Right panel (quadratic model):
dimension = –325.949 + 8.16801 barrel temp – 0.0509189 barrel temp**2
S = 0.233475, R-Sq = 82.7%, R-Sq(adj) = 81.8%
Using the fitted line plot dialog box as shown in Figure E.4, where we request a quad-
ratic model, we get the plot given in the right panel of Figure E.3. In this way, we can
explore the relationship between the input and output. Using the Options button we can also
request a log transformation of the input, output, or both.
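The linear-versus-quadratic comparison above can be sketched numerically. The following Python fragment fits both polynomial models to hypothetical temperature and dimension values (stand-ins for the crossbar data) and compares R-Sq, computed as 1 minus the residual sum of squares over the total sum of squares:

```python
import numpy as np

# Compare linear and quadratic fits by R-Sq (hypothetical data standing in
# for barrel temperature and crossbar dimension).

def r_squared(x, y, degree):
    coeffs = np.polyfit(x, y, degree)        # least squares polynomial fit
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

temp = np.array([74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0])
dim = np.array([0.2, 0.8, 1.3, 1.6, 1.8, 1.9, 1.85])   # increase that levels off
r2_linear = r_squared(temp, dim, 1)
r2_quadratic = r_squared(temp, dim, 2)
```

Because the quadratic model contains the linear one, its R-Sq is never smaller; whether the improvement is worth the extra term is the judgment the fitted line plot helps with.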
We use the best subsets regression options, as shown in the right panel of Figure E.5,
to request a minimum and maximum of one free predictor (that is, input) and to display five
models of each size to print. To illustrate, we use the brake rotor balance example discussed
in one of the case studies. In a group comparison investigation, 26 foundry-related inputs
were measured on 30 balance rejects and 30 well-balanced brake rotors. The data are given
in the file brake rotor balance comparison. Previous analysis for this data was based on the
binary output: balanced or not balanced. However, since the balance weight required for
each rotor was also recorded, we can also analyze these data using regression models. Filling
in the Best Subsets Regression dialog box as in Figure E.5 gives the following (edited)
MINITAB results:
Response is balance
[Best subsets output: R-Sq, R-Sq(adj), C-p, and S for the five best single-input models, with the names of the 26 candidate inputs printed vertically as column headings.]
The results summarize the five best regression models that involve only a single input.
The models are ranked by largest R-Sq value (or equivalently, since all models have the
same number of parameters, by the smallest value for S). Overall thickness variation is the
single input that explains the most variation in the output. We would then look at the scatter
plots that correspond to the best inputs. To determine the regression equation for any partic-
ular input, we need to fit the individual regression model.
Using the best subsets regression routine, we can avoid examining all scatter plots (and
fitting all the corresponding regression models) involving the output and all possible single
inputs. There are 26 such plots (models) in this case. We do not recommend the use of the
best subsets regression routine unless the number of inputs is large. Looking at the individual
plots is preferred because the regression summary can miss patterns like nonlinear relation-
ships and the effect of outliers.
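What best subsets does for one-input models can be sketched directly: fit each single-input regression and rank the inputs by R-Sq. The two inputs and the balance output below are simulated stand-ins for the 26 foundry-related inputs, with one constructed to be dominant:

```python
import numpy as np

# Rank candidate inputs by the R-Sq of their single-input regressions
# (simulated data; a real analysis would use the 26 measured inputs).

def rank_single_inputs(X, y, names):
    ss_total = np.sum((y - y.mean()) ** 2)
    results = []
    for j, name in enumerate(names):
        coeffs = np.polyfit(X[:, j], y, 1)       # one-input linear model
        resid = y - np.polyval(coeffs, X[:, j])
        results.append((name, 1 - np.sum(resid ** 2) / ss_total))
    # largest R-Sq first (equivalently, smallest S, since model sizes match)
    return sorted(results, key=lambda item: item[1], reverse=True)

rng = np.random.default_rng(1)
thickness = rng.normal(size=60)                  # simulated dominant input
unrelated = rng.normal(size=60)                  # simulated input with no effect
balance = 2.0 * thickness + rng.normal(scale=0.5, size=60)
ranking = rank_single_inputs(np.column_stack([thickness, unrelated]),
                             balance, ["thickness var", "unrelated input"])
```

As the text cautions, this ranking should be followed by scatter plots of the best inputs, since a single R-Sq number can hide nonlinearity and outliers.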
To illustrate, consider the truck pull feedforward investigation discussed in Chapter 17. The
goal of the investigation was to find a prediction equation for caster that could be used to build
a feedforward controller. The data are given in the file truck pull feedforward. We fit a regres-
sion model to describe the relationship between the output left caster and the inputs given by
the truck frame geometry using the Regression dialog box, as shown in Figure E.6.
Analysis of Variance
Source DF SS MS F P
Regression 4 77.827 19.457 627.82 0.000
Residual Error 95 2.944 0.031
Total 99 80.771
In this example, the regression equation summarizes the relationship between the left
caster and the four truck frame geometry inputs. The value of S in the regression results gives
an estimate of the remaining variation in left caster (quantified in terms of standard deviation)
if we hold all of the four truck frame inputs fixed or if we could perfectly compensate for their
effects using a feedforward controller.
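The role of S here can be sketched with a small multiple regression. The four inputs, coefficients, and noise level below are simulated, not the actual truck frame geometry data; the point is that S, the residual standard deviation, estimates the caster variation that would remain under perfect feedforward compensation:

```python
import numpy as np

# Multiple regression sketch: fit by least squares and extract S,
# the residual standard deviation (simulated frame-geometry data).

def fit_regression(X, y):
    A = np.column_stack([np.ones(len(y)), X])    # intercept plus inputs
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    # S = sqrt(SS_error / error degrees of freedom)
    s = np.sqrt(np.sum(resid ** 2) / (len(y) - A.shape[1]))
    return coef, s

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))                    # four simulated geometry inputs
caster = 4.5 + X @ np.array([1.2, 0.5, -0.8, 0.3]) + rng.normal(scale=0.18,
                                                                size=100)
coef, s = fit_regression(X, caster)
```

Here the simulation adds noise with standard deviation 0.18, and the fitted s recovers a value close to that, illustrating how S quantifies the unexplained variation.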
For regression models with many inputs, we need to use caution when trying to interpret
the estimated regression parameters (given by the “Coef” column in the MINITAB regres-
sion results). We may interpret the parameter estimate corresponding to the input “left front”
(1.24) as the expected change in the left caster for a unit change in left front if we hold all
the other inputs fixed. If the inputs used in the regression analysis do not vary independently,
it may make no physical sense to think of changing one input while holding all the others
fixed. In the truck pull example, where we want a prediction equation, we do not interpret
the individual regression parameter estimates, so this is not a major concern.
Appendix F
Planning and Analysis of
Designed Experiments
To illustrate, we use the brake rotor balance verification experiment discussed in Chapter 13
and in one of the case studies. In the experiment, two levels for each of the three suspects—
tooling, position, and thickness variation—were chosen to capture their full range of vari-
ation. The team decided to make eight rotors for each of the eight treatments. That is, there
were eight runs with eight repeats. No treatment was replicated. Figure F.1 shows how the
experimental plan was entered into MINITAB. In MINITAB, the inputs that are changed in
an experiment are called factors.
For the choices shown in Figure F.1, MINITAB gives the summary:
Factorial Design
The plan and the collected data are given in the file brake rotor balance verification. The
worksheet with the data is shown in Figure F.2. Note that MINITAB has created four spe-
cial columns in the worksheet: StdOrder, RunOrder, CenterPt, and Blocks. These columns
are necessary to use any of the MINITAB subcommands under Stat → DOE → Factorial
once the design has been created.
In Figure F.2, the columns r1, r2, ..., r8 give the balance weights for the eight rotors
produced for each treatment. Much of the analysis in the brake rotor example uses the per-
formance measure average weight. The average weight for each run is calculated as
described in Appendix A. For experiments with a number of repeats for each run, such as
the brake rotor balance verification experiment, we need to store the data in a different way
to plot the output by treatment. Putting all the output values (that is, repeats) in a single col-
umn (use the stack columns command as described in Section A.5), we get the worksheet
in Figure A.9.
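The restructuring step just described can be sketched outside MINITAB as well: stack the repeat columns into one output column with a parallel treatment column. The weights below are hypothetical, with only two treatments and three repeats for brevity:

```python
# Stack repeat columns (r1, r2, ...) into a single output column with a
# treatment label column, as the MINITAB stack columns command does.
# Data are hypothetical: two treatments, three repeats each.

def stack_repeats(runs):
    """runs: list of (treatment, repeat_values); returns parallel columns."""
    treatment_col, output_col = [], []
    for treatment, repeats in runs:
        for value in repeats:
            treatment_col.append(treatment)
            output_col.append(value)
    return treatment_col, output_col

runs = [(1, [0.31, 0.42, 0.28]), (2, [0.55, 0.61, 0.49])]
treatment_col, weight_col = stack_repeats(runs)
```

With the output in one column and the treatment in another, a plot of output by treatment is a single command.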
Custom Designs
In some cases, the experiment we wish to conduct is not a standard two-level factorial
design, or we already have the experimental data stored. In such cases, we can set up the
experiment in MINITAB using:
For instance, in the oil pan scrap example discussed in Chapter 16, the experiment had four
inputs, with one at three levels and the other three at two levels each. We enter the design
as in Figure F.3. For two-level factorial designs, we need to also use the Low/High dialog
box to tell MINITAB what codes correspond to the high and low level of each input.
Entering the design as shown in Figure F.3 adds the columns StdOrder, RunOrder,
CenterPt, and Blocks to the existing data as shown in Figure F.4. The data are available in
the file oil pan scrap desensitization.
For a proper interpretation of this experiment, recall that this is a desensitization exper-
iment. Three of the inputs—namely, lubricant type, die temperature, and binder force—are
candidates, while lube amount is the dominant cause. As shown in Figure F.4, we number
the treatments based on combinations of the candidates.
Figure F.6 Design setup for the paint film build robustness experiment.
Figure F.7 Fractional factorial design and output for paint film build experiment.
The resulting dialog box is given in Figure F.9. We specify the output (response) to
analyze and choose the form of the model using the Terms button.
In this example, as shown in Figure F.9, we choose the performance measure “log(s)”
as the output (upper left panel of Figure F.9) and select a model with all possible main
effects and interactions (upper right panel of Figure F.9). Here we include interactions up
to fifth order, since there are five inputs in the experiment.
From the Graphs dialog box (lower left panel in Figure F.9) we select a Pareto effects
plot. In some cases, we may also want to have a numerical summary of some of the impor-
tant effects. We can use the Results dialog box (lower right panel of Figure F.9) to request
a display of the estimated average output for different levels of the input or inputs.
[Figure: film build (14 to 18) by treatment (1 to 16).]
For the paint film build example, the resulting Pareto effects plot, which ranks the
unsigned effects, is given in Figure F.10. In this plot any large effects (relative to the other
effects) should be clearly evident. We see that the effects due to zone X voltage, conductivity,
and the interaction between these two candidates are large.
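The calculation behind a Pareto effects plot can be sketched for a small case. The fragment below uses a hypothetical full 2³ design (not the book's fractional design) and a constructed response with a strong main effect of A and an AB interaction; each effect is the average output at the high level minus the average at the low level:

```python
import itertools
import numpy as np

# Compute and rank unsigned factorial effects for a two-level design
# (hypothetical 2^3 design and response).

def ranked_effects(design, y, names):
    """design: runs x factors matrix of -1/+1 codes; returns effects
    sorted by absolute size, as in a Pareto effects plot."""
    effects = {}
    for j, name in enumerate(names):
        # avg(high) - avg(low) equals 2 * mean(column * y) for +/-1 codes
        effects[name] = 2 * np.mean(design[:, j] * y)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            effects[names[i] + names[j]] = 2 * np.mean(
                design[:, i] * design[:, j] * y)
    return sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)

design = np.array(list(itertools.product([-1, 1], repeat=3)))  # full 2^3
y = 10 + 3 * design[:, 0] + 1 * design[:, 0] * design[:, 1]    # A and AB effects
ranking = ranked_effects(design, y.astype(float), ["A", "B", "C"])
```

The constructed effects (A = 6, AB = 2, all others 0) come out at the top of the ranking, just as the large effects stand out in Figure F.10.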
Factor labels: A = anode dimension, B = conductivity, C = temperature, D = zone X voltage, E = zone Z voltage. Ranked from largest to smallest unsigned effect: BD, D, B, BC, CE, AE, BE, AD, DE, AC, C, A, CD, AB, E.
Figure F.10 Pareto plot of the effects for paint film build robustness experiment.
To make further sense of these experimental results, we consider factorial plots obtained
using the MINITAB menu selection:
Stat → DOE → Factorial → Factorial Plots
We can request main effects, two-way interaction, and three-way interaction plots.
Figure F.11 shows how we request main effects plots based on the performance meas-
ure log(s).
In the paint film build example, choosing main effects and interaction plots with all five
candidates, we get the plots given in figures F.12 and F.13.
Figure F.12 Main effects plot (in terms of log(s)) for film build robustness experiment.
Figure F.13 Interactions plot (in terms of log(s)) for film build robustness experiment.
Note that in the paint film build robustness experiment, the design was resolution V;
thus, the main effects and interactions can be estimated separately. If the design is resolution
IV, some two-way interactions are aliased with other two-way interactions. In particular,
for the standard half fraction resolution IV designs, half the individual interaction plots
show the same information. Similarly, in resolution III design, the main effects are con-
founded with interactions. With resolution III designs, we need only consider the main
effects plots, since the interaction plots will not provide any new information.
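The confounding can be seen in a tiny illustration (not from the book's data). In the 2³⁻¹ half fraction generated by C = AB, the column of codes used to estimate the main effect of C is identical to the column for the AB interaction, so the two contrasts give the same number for any response:

```python
import itertools
import numpy as np

# Aliasing in a resolution III fraction: with generator C = AB, the
# contrast for the main effect of C equals the contrast for the AB
# interaction. The response values are arbitrary.

half = np.array(list(itertools.product([-1, 1], repeat=2)))  # full 2^2 in A, B
A, B = half[:, 0], half[:, 1]
C = A * B                                                    # generator: C = AB

y = np.array([3.1, 4.0, 2.7, 5.2])                           # any four responses
effect_C = np.mean(y[C == 1]) - np.mean(y[C == -1])          # apparent C effect
effect_AB = np.mean(y[A * B == 1]) - np.mean(y[A * B == -1]) # AB interaction
```

Since the C and AB columns coincide run by run, effect_C and effect_AB are equal no matter what the data are, which is why resolution III designs cannot separate main effects from the interactions they are confounded with.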
Figure F.14 Main effects and two-way interaction plots showing individual observations.
From figures F.12 and F.13, we conclude there are large effects due to conductivity, zone
X voltage, and the interactions between conductivity and zone X voltage and between conduc-
tivity and temperature. Recall that smaller log(s) is better. Since there are large interactions,
we draw conclusions based on the interaction plot. Figure F.13 suggests high zone X voltage,
high conductivity, and low temperature are best. We are fortunate that high conductivity is
best in both large interactions.
Note that the main effects and interaction effects plots produced by MINITAB show
only averages. We can use box plots (showing individual output values) and multivari plots
(see Appendix C) to create alternative displays of main effects and two-way interactions that
show individual observations, as illustrated in Figure F.14.