DoE Lecture
March, 2017
1
1. Introduction
2
Introduction: Why Experiment?
Research is a systematic investigation to establish facts. It is systematic because all the activities are planned so as to MINIMIZE unwanted effects from other sources of variation.
5
Advantage of Experiments
Experiments help us answer questions, but there are also non-experimental techniques (e.g. surveys).
Experiments allow us to set up a direct comparison between the treatments of interest, to minimize any bias in that comparison, and to ensure that the unexplained variation in the comparison is small.
7
Strategy of Experimentation
Factorial approach (invented in the 1920’s)
9
A Brief History of Experimental Design
Four Eras in the History of DOE
10
A Brief History of Experimental Design
The second industrial era, late 1970s – 1990
11
Some Typical Applications of Experimental Design
Experimental design is applicable in many disciplines.
Engineering
Agriculture
Clinical Trial
Boiling water
Basketball
14
Guidelines for Designing Experiments
i. Recognition of and statement of the problem
ii. Selection of the response variable and choice of factors, levels, and ranges
iii. Choice of experimental design
iv. Performing the experiment
v. Statistical analysis of the data
vi. Conclusions and recommendations
18
Basic Principles
19
Basic Principles
Replication allows for a more precise estimate of the sample mean value, i.e. the standard error of the mean decreases as the number of replicates increases.
20
Basic Principles
Randomization
Blocking
21
Terms and Concepts
Factors
Input variables that can be changed, e.g. the type or amount of fertilizer in agronomy.
Levels
The specific values or settings of a factor used in the experiment.
22
Terms and Concepts
Experimental units
The entities to which treatments are applied; these could be, for example, plots of land receiving fertilizer.
23
Terms and Concepts
24
Terms and Concepts
Experimental Error:
The variation among experimental units that receive the same treatment; it is often true that applying the same treatment over and over again to the same unit will result in different responses in different trials.
25
Terms and Concepts
26
2. Review of Simple Comparative Experiments
27
Introduction
29
Graphical View of the Data: Dot Diagram
31
Basic Statistical Concepts
Each of the observations in the Portland cement experiment described would be called a run. The run-to-run fluctuation in the results is called noise or experimental error.
Histogram
33
Basic Statistical Concepts
Probability Distribution
The probability structure of a random variable is described by
34
Basic Statistical Concepts
35
Basic Statistical Concepts
Mean
The mean, µ, of a probability distribution is a measure of its central tendency or location.
Variance
Measures the variability or dispersion of a probability distribution.
36
Basic Statistical Concepts
37
Basic Statistical Concepts
Sample mean: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
Sample variance: $s^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1}$
$s$: sample standard deviation, $s = \sqrt{s^2}$
38
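A quick numeric check of these two formulas, using a small set of illustrative measurements (the values are hypothetical, not from the lecture):

```python
# Sample mean and variance computed directly from the definitions above.
data = [16.85, 16.40, 17.21, 16.35, 16.52]  # hypothetical measurements

n = len(data)
mean = sum(data) / n                                 # ybar = (1/n) * sum(y_i)
var = sum((y - mean) ** 2 for y in data) / (n - 1)   # s^2 uses the n-1 denominator
std = var ** 0.5                                     # s, the sample standard deviation

print(mean, var, std)
```

The n − 1 denominator makes s² an unbiased estimator of σ², which is the property discussed on the next slide.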
Basic Statistical Concepts
b) Properties of sample mean and sample variance
i) Estimator
$\bar{y}$ is an unbiased estimator of the population mean µ, and among linear unbiased estimators it has minimum variance.
39
Basic Statistical Concepts
c) Sampling distributions
i. Important distributions: normal, χ², t, F
ii. Sampling distributions
40
Inference About the Difference in Means
Consider the Portland cement
example
41
Hypothesis Testing
42
Hypothesis Testing
normal populations
43
Hypothesis Testing
From the Portland cement experiment we may test the claim that the mean tension bond strengths of the two mortar formulations are equal:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
where µ1 is the mean tension bond strength of the modified mortar.
$\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
$\beta = P(\text{Type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
Remark: the value of α should be specified a priori; it is called the level of significance.
45
Two Sample T-Test
Assume the variances of tension bond strengths were
identical for both mortar formulations.
Where
46
Two Sample T-Test
Estimation of parameters
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ estimates the population mean $\mu$
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$ estimates the variance $\sigma^2$
48
Two Sample T-Test
This suggests a statistic:
Values of t0 that are near zero are consistent with the null
hypothesis
Values of t0 that are very different from zero are consistent with the
alternative hypothesis
49
Two Sample T-Test
A value of t0 between –2.101
and 2.101 is consistent with
equality of means, outside of
these supports non equality of
means
The P-value is the risk of wrongly rejecting the null hypothesis of equal means; here P ≈ 3.55 × 10⁻⁹, far below any usual α.
50
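The pooled two-sample t test can be run directly with SciPy. The strength values below are illustrative stand-ins for the two mortar formulations, not the lecture's exact data:

```python
from scipy import stats

# Hypothetical tension bond strengths for two mortar formulations
# (illustrative values, n1 = n2 = 10 as in the slides).
modified   = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

# Pooled two-sample t test (equal variances assumed, as in the slides).
t0, p = stats.ttest_ind(modified, unmodified, equal_var=True)
print(t0, p)  # |t0| > t_{0.025,18} = 2.101 would reject H0
```

With 18 error degrees of freedom the ±2.101 bounds quoted above apply; a |t0| far outside them gives a very small P-value.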
Confidence Intervals
To define an interval estimate of θ, we need to find two statistics L and U such that the probability statement $P(L \le \theta \le U) = 1 - \alpha$ is true. The interval $L \le \theta \le U$ is then called a 100(1 − α) percent confidence interval for θ.
51
Confidence Intervals
52
Confidence Intervals: Example
53
Summary on Mean Test
54
Summary on Mean Test
55
Inference About the variances of Normal Distribution
In some experiments it is the comparison of variability in the data that is important.
57
Inference About the variances of Normal Distribution
Consider testing the equality of variances of two normal
populations
Independent random samples of sizes n1 and n2 are taken from population
58
Inference About the variances of Normal Distribution
The upper and lower tails of F-distribution are related as follows
Summary
59
Inference About the variances of Normal Distribution
Example: A chemical engineer is investigating the inherent
variability of two types of test equipment that can be used to
monitor the output of a production process. He suspects that the
old equipment, type 1, has a larger variance than the new one. Random samples of n1 = 12 and n2 = 10 observations are taken, and the sample variances are s1² = 14.5 and s2² = 10.8.
Hypothesis
Test statistic
60
Inference About the variances of Normal Distribution
F0.05, 11, 9=3.10, so the null hypothesis can’t be rejected.
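The same conclusion can be checked numerically; SciPy's F distribution supplies the critical value so no table lookup is needed:

```python
from scipy import stats

s1_sq, n1 = 14.5, 12   # old equipment (type 1), from the example
s2_sq, n2 = 10.8, 10   # new equipment (type 2)

F0 = s1_sq / s2_sq                          # test statistic for H1: sigma1^2 > sigma2^2
crit = stats.f.ppf(0.95, n1 - 1, n2 - 1)    # upper 5% point, F_{0.05,11,9}
print(round(F0, 2), round(crit, 2))         # F0 < crit -> fail to reject H0
```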
61
Inference About the variances of Normal Distribution
Construct a 95% confidence interval for the ratio of variances for the chemical engineer monitoring the output of the production process.
Homework:
62
Statistical Inference
Exercise 1:
63
Statistical Inference
Exercise 2:
64
Inference About the variances of Normal Distribution
Exercise 3:
65
3. Completely Randomized Design: Single Factor Analysis
of Variance
66
Introduction
In section 2, we discussed methods for comparing two
conditions or treatments.
factor.
67
Introduction
Example
Weight percent of cotton used in the blend of materials for the fiber is the single factor of interest.
69
Introduction
Suppose the test sequence is as follows
70
Introduction
The engineer has performed the experiment and obtained the
following result
71
Introduction
From the figure, tensile strength increases as the cotton weight percent increases, up to 25%.
72
Analysis of Variance
Suppose there is a single factor with a treatments/levels
73
Analysis of Variance
Models for the Data
Let
An alternative model is
Effects
Model
74
Analysis of Variance
Models for the Data
Both means model and effects model are linear statistical models.
The means model (effects model) are also called one way or single
factor analysis of variance model
Objective:
Estimating means
75
Analysis of Variance
Model Assumptions
distributed
This implies
76
Analysis of Variance
Let yi. represent the total of the observations in the ith factor level/treatment and ȳi. the corresponding average:
$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a$
Similarly, let y.. be the grand total of all observations and ȳ.. the grand average:
$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N, \qquad N = an$
77
Analysis of Variance
The question is: are the a treatment means equal?
78
Analysis of Variance
The treatment or factor effects are deviations from the
overall mean
The hypothesis in terms of the treatment effects is
79
Analysis of Variance
Decomposition of the Total Sum of Squares
The total corrected sum of square is
80
Analysis of Variance
Decomposition of the Total Sum of Squares
Partitioned into:
a sum of squares of the differences between the treatment averages and the grand average, and
a sum of squares of the differences of the observations within each treatment from the treatment average.
The difference between the observed treatment average and the grand average is a measure of the difference between treatment means.
81
Analysis of Variance
Decomposition of the Total Sum of Squares
has a-1 df
82
Analysis of Variance
Error sum of square
a sample variances
83
Analysis of Variance
If there were no differences between the a treatment means, the variation of the treatment averages from the grand average would estimate σ².
84
Analysis of Variance
Statistical Analysis
85
Analysis of Variance
Statistical Analysis (Example)
Test whether the mean tensile strength is the same at the five different cotton weight percentages.
86
Analysis of Variance
Statistical Analysis (Example):The tensile strength experiment
87
Analysis of Variance
Statistical Analysis (Example):The tensile strength experiment
88
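A compact way to reproduce this single-factor ANOVA is `scipy.stats.f_oneway`. The observations below are the commonly used textbook values for this tensile strength experiment; treat them as illustrative rather than as the lecture's table:

```python
from scipy import stats

# Tensile strength at five cotton weight percentages (n = 5 per level);
# illustrative textbook-style data, not taken from the extracted slides.
pct15 = [7, 7, 15, 11, 9]
pct20 = [12, 17, 12, 18, 18]
pct25 = [14, 18, 18, 19, 19]
pct30 = [19, 25, 22, 19, 23]
pct35 = [7, 10, 11, 15, 11]

# One-way (single-factor) ANOVA: H0 says all five treatment means are equal.
F, p = stats.f_oneway(pct15, pct20, pct25, pct30, pct35)
print(F, p)  # a large F (small p) leads to rejecting H0
```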
Analysis of Variance
Estimation of the model parameters
observations, i.e.
89
Analysis of Variance
Estimation of the model parameters
If the errors are normally distributed, then $\bar{y}_{i.} \sim NID(\mu_i, \sigma^2/n)$
and
90
Analysis of Variance
Estimation of the model parameters (Example)
Consider the tensile strength data (at the beginning of section 3)
91
Analysis of Variance
Estimation of the model parameters (Example)
The estimate of the overall mean is
92
Analysis of Variance
93
Analysis of Variance
Unbalanced Data
Advantage of choosing balanced design
94
Analysis of Variance: Model Adequacy Checking
Model Adequacy Checking
Assumptions are
Independent
Normally distributed
Mean zero and constant variance σ2
96
Analysis of Variance: Model Adequacy Checking
Non-constant variance
97
Analysis of Variance: Model Adequacy Checking
Non-constant variance
If the computed test statistic is less than the critical value, the group variances are judged homogeneous.
Bartlett's test is the uniformly most powerful (UMP) test for homogeneity of variances when the normality assumption is met.
The test's reliability is sensitive (not robust) to non-normality: if the treatment populations are not approximately normal, the true significance level can be very different from the nominal significance level.
99
Analysis of Variance: Model Adequacy Checking
Non-constant variance
Levene’s test
100
Analysis of Variance: Model Adequacy Checking
Non-constant variance
Brown-Forsythe test
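All three variance-homogeneity tests named above are available in SciPy; in particular, the Brown-Forsythe test is Levene's test computed around group medians (`center='median'`). The three groups below are hypothetical:

```python
from scipy import stats

# Three hypothetical treatment groups (illustrative data only).
g1 = [9.2, 8.7, 9.8, 9.1, 8.9]
g2 = [10.4, 11.9, 9.7, 10.8, 11.2]
g3 = [8.1, 8.3, 8.0, 8.4, 8.2]

# Bartlett's test: powerful under normality, but not robust to non-normality.
b_stat, b_p = stats.bartlett(g1, g2, g3)

# Levene's test (deviations from group means) and the Brown-Forsythe
# variant (deviations from group medians, more robust to non-normality).
l_stat, l_p = stats.levene(g1, g2, g3, center='mean')
bf_stat, bf_p = stats.levene(g1, g2, g3, center='median')
print(b_p, l_p, bf_p)  # small p-values indicate unequal variances
```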
101
Analysis of Variance: Model Adequacy Checking
Non-constant variance
Easiest remedial measure is usually a transformation (can help both non-constant variance and non-normality):
If the variance is proportional to μi (sometimes occurs if Y is a count), try $\sqrt{Y}$
If the standard deviation is proportional to μi, try $\log Y$
If the standard deviation is proportional to μi², try $1/Y$
A plot of log si against log ȳi. can help choose the transformation.
102
Analysis of Variance:Model Adequacy Checking
The Normality Assumption
103
Analysis of Variance: Model Adequacy Checking
The Normality Assumption
105
Analysis of Variance:Model Adequacy Checking
The Independence Assumption
Durbin-Watson Test
Assumptions are:
That the errors are normally distributed with a mean of 0.
106
Analysis of Variance: Interpretation of Results
Comparison Among Treatment Means
107
Analysis of Variance: Interpretation of Results
Contrasts
108
Analysis of Variance: Interpretation of Results
Contrasts
109
Analysis of Variance: Interpretation of Results
Contrasts
or equivalently
110
Analysis of Variance: Interpretation of Results
Contrasts
111
Analysis of Variance: Interpretation of Results
Contrasts
112
Analysis of Variance: Interpretation of Results
Contrasts
The variance of C is
113
Analysis of Variance: Interpretation of Results
Contrasts (equal sample size)
Under H0, the contrast in the treatment totals $C = \sum_{i=1}^{a} c_i y_{i.}$ satisfies
$\frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{n\sigma^2\sum_{i=1}^{a} c_i^2}} \sim N(0,1)$
Replacing σ² by its estimate MSE gives the test statistic
$t_0 = \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{n\,MS_E\sum_{i=1}^{a} c_i^2}}$, and H0 is rejected if $|t_0| > t_{\alpha/2,\,N-a}$
114
Analysis of Variance: Interpretation of Results
Contrasts (equal sample size)
$F_0 = t_0^2 = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\,MS_E\sum_{i=1}^{a} c_i^2}$, and H0 is rejected if $F_0 > F_{\alpha,\,1,\,N-a}$
Alternatively,
$F_0 = \frac{MS_C}{MS_E} = \frac{SS_C/1}{SS_E/(N-a)}$
Where $SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\sum_{i=1}^{a} c_i^2}$
115
Analysis of Variance: Interpretation of Results
Contrasts (unequal sample size)
The t statistic is
$t_0 = \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{MS_E\sum_{i=1}^{a} n_i c_i^2}}$
Alternatively,
$F_0 = \frac{MS_C}{MS_E} = \frac{SS_C/1}{SS_E/(N-a)}$
Where $SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{\sum_{i=1}^{a} n_i c_i^2}$
116
Analysis of Variance: Interpretation of Results
Orthogonal Contrasts
Two contrasts with coefficients {ci} and {di} are orthogonal if the pairwise products of the coefficients sum to zero: $\sum_{i=1}^{a} c_i d_i = 0$ (or $\sum_{i=1}^{a} n_i c_i d_i = 0$ for unbalanced data).
118
Analysis of Variance: Interpretation of Results
Orthogonal Contrasts (Example)
The contrasts from the hypothesis (planned)
The contrast sums of squares are computed as
$SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\sum_{i=1}^{a} c_i^2}$
119
Analysis of Variance: Interpretation of Results
Orthogonal Contrasts (Example)
120
Analysis of Variance: Interpretation of Results
Orthogonal Contrasts (Class Activity)
Results (mg shoot dry weight) of an experiment (CRD) to determine
the effect of seed treatment by acids on the early growth of rice
seedlings.
Treatment    Replications                      Total    Mean
Control      4.23  4.38  4.10  3.99  4.25     20.95    4.19
HCl          3.85  3.78  3.91  3.94  3.86     19.34    3.87
Propionic    3.75  3.65  3.82  3.69  3.73     18.64    3.73
Butyric      3.66  3.67  3.62  3.54  3.71     18.20    3.64
The treatment structure of this experiment suggests that the investigator had several specific questions in mind from the beginning:
1) Do acid treatments affect seedling growth? (Hint: HCl, propionic, and butyric are all acids.)
2) Is the effect of organic acids different from that of inorganic acids? (Hint: HCl is inorganic; butyric and propionic are organic.)
3) Is there a difference in the effects of the two organic acids?
Conduct an orthogonal contrast test.
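A sketch of the orthogonal contrast computation for this class activity, using the treatment totals from the table (n = 5 replications) and the contrast sum-of-squares formula $SS_C = C^2/(n\sum c_i^2)$:

```python
# Orthogonal contrasts for the rice seedling data (treatment totals above).
totals = {"Control": 20.95, "HCl": 19.34, "Propionic": 18.64, "Butyric": 18.20}
n = 5

contrasts = {
    "control vs acids":     [3, -1, -1, -1],   # question 1
    "inorganic vs organic": [0,  2, -1, -1],   # question 2
    "propionic vs butyric": [0,  0,  1, -1],   # question 3
}

y = list(totals.values())
results = {}
for name, c in contrasts.items():
    C = sum(ci * yi for ci, yi in zip(c, y))    # contrast in treatment totals
    ss = C**2 / (n * sum(ci**2 for ci in c))    # SS_C = C^2 / (n * sum(c_i^2))
    results[name] = (round(C, 2), round(ss, 4))
    print(name, results[name])
```

Each SSC has 1 degree of freedom and would be tested against MSE from the overall ANOVA, which the extracted slides do not show.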
121
Analysis of Variance: Interpretation of Results
Scheffe’s Method for Comparing all Contrasts
In many exploratory experiments, the comparisons of interest are discovered only after preliminary examination of the data.
122
Analysis of Variance: Interpretation of Results
Scheffe’s Method for Comparing all Contrasts
The standard error of this contrast is
$S_{C_u} = \sqrt{MS_E\sum_{i=1}^{a} c_{iu}^2/n_i}$
123
Analysis of Variance: Interpretation of Results
Scheffe’s Method for Comparing all Contrasts
The critical value against which $C_u$ is compared is $S_{C_u}\sqrt{(a-1)F_{\alpha,\,a-1,\,N-a}}$. For example, consider the contrasts
L1=µ1 + µ3 - µ4 - µ5
and
L2=µ1 - µ4
124
Analysis of Variance: Interpretation of Results
Scheffe’s Method for Comparing all Contrasts (example)
Contrast estimated value
125
Analysis of Variance: Interpretation of Results
Scheffe’s Method for Comparing all Contrasts (example)
The critical values are
Where $\bar{y}_{max}$ is the maximum sample mean and $\bar{y}_{min}$ is the minimum sample mean.
127
Analysis of Variance: Comparing Pairs of Treatment Means
Tukey Test
Test statistic for equal sample size is
Reject H0 when
$\bar{y}_{max} - \bar{y}_{min} > T_\alpha = q_\alpha(a, f)\sqrt{MS_E/n}$
128
Analysis of Variance: Comparing Pairs of Treatment Means
Tukey Test (Example-cotton weight percentage experiment)
Test statistic value is (5 treatments and 25 observations)
129
Analysis of Variance: Comparing Pairs of Treatment Means
Tukey Test (Example-cotton weight percentage experiment)
The differences in pairs of averages
A pair of means is declared significantly different if $|\bar{y}_{i.} - \bar{y}_{j.}| > LSD$.
131
Analysis of Variance: Comparing Pairs of Treatment Means
LSD(Example-cotton weight percentage experiment)
Test statistic value is (5 treatments and 25 observations)
132
Analysis of Variance: Comparing Pairs of Treatment Means
LSD(Example-cotton weight percentage experiment)
The differences in pairs of averages
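The LSD threshold itself is easy to compute. MSE = 8.06 and n = 5 per treatment are assumed here for the cotton experiment (the extracted slides do not show MSE), with N − a = 20 error degrees of freedom:

```python
from scipy import stats
import math

# LSD for the cotton-percentage experiment; MS_E = 8.06 is an assumed value.
MSE, n, err_df = 8.06, 5, 20

t_crit = stats.t.ppf(1 - 0.025, err_df)     # t_{0.025, 20}
LSD = t_crit * math.sqrt(2 * MSE / n)       # LSD = t * sqrt(2*MSE/n)
print(round(LSD, 2))  # pairs of means differing by more than this are significant
```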
134
Analysis of Variance: Comparing Pairs of Treatment Means
Duncan's Multiple Range Test
From Duncan's table of significant ranges, obtain $r_\alpha(p, f)$ for p = 2, 3, ..., a.
Continue until all means have been compared with largest mean
Recall that
136
Analysis of Variance: Comparing Pairs of Treatment Means
Duncan’s Multiple Range Test (Example-cotton weight
percentage experiment)
Significant ranges with α=0.05, f=20 (error degrees of freedom)
137
Analysis of Variance: Comparing Pairs of Treatment Means
Duncan’s Multiple Range Test (Example-cotton weight
percentage experiment)
The comparisons are
138
Analysis of Variance: Comparing Treatment Means with
Control
Dunnett's Test
139
Analysis of Variance: Comparing Pairs of Treatment Means
Dunnett’s Test(Example-cotton weight percentage experiment)
Consider the 5th treatment as the control; a = 5, f = 20, and ni = 5.
Any treatment mean that differs from the control by more than 4.76 would be declared significantly different.
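The 4.76 criterion can be reproduced as d·sqrt(2·MSE/n). The Dunnett table value d₀.₀₅(4, 20) = 2.65 and MSE = 8.06 are assumed here, since neither appears in the extracted text:

```python
import math

# Dunnett criterion for comparing 4 treatments with a control:
# d = 2.65 is the assumed table value d_{0.05}(4, 20); MS_E = 8.06 is assumed.
d, MSE, n = 2.65, 8.06, 5

criterion = d * math.sqrt(2 * MSE / n)
print(round(criterion, 2))  # matches the 4.76 threshold quoted above
```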
140
Analysis of Variance: Comparing Pairs of Treatment Means
Dunnett’s Test(remark)
When comparing treatments with a control, use more observations for the control ($n_a$) than for each treatment ($n$), choosing the ratio $n_a/n = \sqrt{a}$.
141
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Operating Characteristic curves
Choice of sample size is closely related to the probability of type II
error .
Hypotheses: $H_0: \mu_1 = \mu_2 = \cdots = \mu_a$ versus $H_1: \mu_i \neq \mu_j$ for at least one pair
Power = 1 − β = P(Reject H0 | H0 is false)
The OC curves are indexed by the parameter $\Phi^2 = \frac{n\sum_{i=1}^{a}\tau_i^2}{a\sigma^2}$
For the tensile strength example: a = 5, N = an = 5n, a − 1 = 4, N − a = 5(n − 1)
145
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Operating Characteristic curves
146
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Operating Characteristic curves
The objective is to find β and check whether the power requirement is satisfied.
147
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Operating Characteristic curves
It is often difficult to select a set of treatment means for choosing
the sample size
Typically we work in terms of the ratio D/σ and try values of n until the desired power is achieved.
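Instead of reading the OC curves, the same power calculation can be done with the noncentral F distribution. The scenario below (a = 5 treatments, Στᵢ²/σ² = 1) is an assumed illustration:

```python
from scipy import stats

# Power of the one-way ANOVA F test via the noncentral F distribution,
# as an alternative to the OC curves. Assumed scenario: a = 5 treatments,
# sum(tau_i^2)/sigma^2 = 1, several candidate values of n.
a, ratio, alpha = 5, 1.0, 0.05

powers = []
for n in (5, 10, 15, 20):
    df1, df2 = a - 1, a * (n - 1)
    nc = n * ratio                            # noncentrality = n * sum(tau^2)/sigma^2
    fcrit = stats.f.ppf(1 - alpha, df1, df2)  # rejection threshold under H0
    power = 1 - stats.ncf.cdf(fcrit, df1, df2, nc)
    powers.append(power)
    print(n, round(power, 3))
```

As expected, the power increases with n; one keeps increasing n until the required power (say 0.90) is reached.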
148
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Specifying a Standard Deviation Increase
If the treatment means do not differ, the standard deviation of a randomly chosen observation is σ. As the differences between the means increase, this standard deviation increases to
$\sqrt{\sigma^2 + \sum_{i=1}^{a}\tau_i^2/a}$
149
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Specifying a Standard Deviation Increase
$\frac{\sqrt{\sigma^2 + \sum_{i=1}^{a}\tau_i^2/a}}{\sigma} = 1 + P/100$, where P is the specified percent increase in standard deviation.
Rearranging,
$\frac{\sqrt{\sum_{i=1}^{a}\tau_i^2/a}}{\sigma} = \sqrt{(1 + P/100)^2 - 1}$
150
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Specifying a Standard Deviation Increase
$\Phi = \frac{\sqrt{\sum_{i=1}^{a}\tau_i^2/a}}{\sigma/\sqrt{n}} = \sqrt{(1 + P/100)^2 - 1}\,\sqrt{n}$
We can use the operating characteristic curves to determine n.
For the confidence-interval approach, choose n so that the half-width $t_{\alpha/2,\,N-a}\sqrt{2MS_E/n}$ of a confidence interval on the difference of two means is acceptably small.
152
Analysis of Variance: More About Single Factor Experiments
Choice of sample size: Confidence Interval
n=3
n=4
n=5
154
Analysis of Variance: More About Single Factor Experiments
Least square estimation of model parameters
Setting the derivatives of the least squares criterion to zero yields the normal equations; simplifying and imposing the constraint $\sum_{i=1}^{a}\hat{\tau}_i = 0$ gives $\hat{\mu} = \bar{y}_{..}$ and $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$.
155
Analysis of Variance: More About Single Factor Experiments
Repeated measures
156
Analysis of Variance: More About Single Factor Experiments
Repeated measures
158
Analysis of Variance: More About Single Factor Experiments
Repeated measures (layout)
Condition/Treatment
Subject   Cond 1   Cond 2   …   Cond k   Total   Mean
1         Y11      Y12      …   Y1k      Y1.     Y1./k
2         Y21      Y22      …   Y2k      Y2.     Y2./k
…         …        …        …   …        …       …
n         Yn1      Yn2      …   Ynk      Yn.     Yn./k
Total     Y.1      Y.2      …   Y.k      Y..
Mean      Y.1/n    Y.2/n    …   Y.k/n            Y../N
159
Analysis of Variance: More About Single Factor Experiments
Repeated measures (Sum of square decomposition)
160
Analysis of Variance: More About Single Factor Experiments
Repeated measures (Sum of square decomposition)
$SS_T = \sum_{i=1}^{n}\sum_{j=1}^{k}(Y_{ij} - \bar{Y}_{..})^2$
161
Analysis of Variance: More About Single Factor Experiments
Repeated measures (Example)
Five subjects, all are tested for reaction time after taking each of
the four drugs, over a period of four days. The following result is
obtained
Person       Drug 1   Drug 2   Drug 3   Drug 4   Total (Ci)   Mean
1            30       28       16       34       108          27
2            14       18       10       22       64           16
3            24       20       18       30       92           23
4            38       34       20       44       136          34
5            26       28       14       30       98           24.5
Total (Rj)   132      128      78       160      498
Mean         26.4     25.6     15.6     32.0                  24.9
162
Analysis of Variance: More About Single Factor Experiments
Repeated measures (Example)
Within Subjects
163
Analysis of Variance: More About Single Factor Experiments
Repeated measures (Example)
Source SS df MS F p
SSPeople/subject 680.80 4 170.20
SSDrug/Condition 698.20 3 232.73 24.759 0.000020
SSError 112.80 12 9.40
SST 1491.80 19 78.52
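The ANOVA table above can be verified directly from the data table. This sketch recomputes each sum of squares from the subject (row) and drug (column) totals:

```python
# Verify the repeated-measures ANOVA table for the drug example above.
data = [
    [30, 28, 16, 34],
    [14, 18, 10, 22],
    [24, 20, 18, 30],
    [38, 34, 20, 44],
    [26, 28, 14, 30],
]
n, k = len(data), len(data[0])            # 5 subjects, 4 drugs/conditions
N = n * k
grand = sum(sum(row) for row in data)     # grand total, 498
CM = grand**2 / N                         # correction for the mean

SST = sum(y**2 for row in data for y in row) - CM
SSsubj = sum(sum(row)**2 for row in data) / k - CM
SSdrug = sum(sum(col)**2 for col in zip(*data)) / n - CM
SSerr = SST - SSsubj - SSdrug

# F for the within-subjects factor: MS_drug / MS_error.
F = (SSdrug / (k - 1)) / (SSerr / ((n - 1) * (k - 1)))
print(SST, SSsubj, SSdrug, SSerr, round(F, 3))
```

The results reproduce the table: SST = 1491.8, SSPeople = 680.8, SSDrug = 698.2, SSError = 112.8, and F = 24.759.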
164
4. Blocking
165
Introduction
Blocking is used to remove the effect of known nuisance factors and thereby increase precision, so that more of the variation in the data is explained. Blocks are formed from homogeneous experimental units before running the experiment.
167
Randomized Block Design
If you have a nuisance factor that is known but uncontrollable,
Time
planting, harvesting
Physical characteristics
height, maturity
Natural groupings
branches (experimental units) on a tree (block)
170
Randomized Block Design
Demonstration (The Hardness Testing Example)
We wish to determine whether 4 different tips produce different (mean) hardness readings.
171
Randomized Block Design
Demonstration (The Hardness Testing Example)
To conduct this experiment as a RCBD, assign all 4 tips to each
coupon
172
Randomized Block Design
Demonstration (The Hardness Testing Example)
A complete replicate of the basic experiment is conducted in each
block
173
Randomized Block Design
Demonstration (The Hardness Testing Example)
Suppose that we use b = 4 blocks; the analysis proceeds as before, but now we also have to remove the variability associated with the nuisance factor (the coupons).
174
Randomized Complete Block Design
Let yij be the response for the ith treatment in the jth block.
yij = µ + αi + βj + εij
This standard model says that treatments and blocks are additive,
treatments have the same effect in every block and blocks only serve to
175
Randomized Complete Block Design
The quantities are
176
Randomized Complete Block Design (Sum of Squares)
ANOVA Partitioning of Total Sum of Squares
$\sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij}-\bar{y}_{..})^2 = b\sum_{i=1}^{a}(\bar{y}_{i.}-\bar{y}_{..})^2 + a\sum_{j=1}^{b}(\bar{y}_{.j}-\bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..})^2$
i.e. $SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$
177
Randomized Complete Block Design (Sum of Squares)
ANOVA Partitioning of Total Sum of Squares
178
Randomized Complete Block Design (Sum of Squares)
ANOVA Table
179
Randomized Complete Block Design (Example)
A hardness testing machine operates by pressing a tip into a metal test
“coupon.” The hardness of the coupon can be determined from the depth
of the resulting depression. Four tip types are being tested to see if they
produce significantly different readings. However, the coupons might
differ slightly in their hardness (for example, if they are taken from
ingots produced in different heats).
180
Randomized Complete Block Design (Example)
[Table: hardness readings for each of the 4 tip types on coupons (blocks) 1-4, with treatment totals yi.]
181
Randomized Complete Block Design (Example)
Source SS df MS F P-Value
Treatment
0.385 3 0.1283 14.4375 0.0009
(Tip)
Coupon
0.825 3 0.2750 30.9375
(block)
Error 0.08 9 0.0089
Total 1.29 15
The hypothesis is H0: the four tip means are equal.
Decision: Reject H0; the mean hardness measurements from the four tips are not the same.
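The F ratios in the table follow directly from the sums of squares and degrees of freedom shown; this sketch recomputes them:

```python
# Recompute the F ratios in the hardness RCBD ANOVA table from its SS and df.
SS_tip, df_tip = 0.385, 3
SS_block, df_block = 0.825, 3
SS_err, df_err = 0.08, 9

MS_err = SS_err / df_err                   # 0.0089
F_tip = (SS_tip / df_tip) / MS_err         # 0.1283 / 0.0089
F_block = (SS_block / df_block) / MS_err   # 0.2750 / 0.0089
print(round(F_tip, 4), round(F_block, 4))
```

Both values match the table: F = 14.4375 for tips and 30.9375 for coupons.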
182
Randomized Complete Block Design (Example)
Hardness as Completely randomized design
Source SS df MS F
Treatment (Tip) 0.385 3 0.1283 1.70
Error 0.905 12 0.0754
Total 1.29 15
183
RCBD or CRD?
Which is better, a RCBD or a CRD?
184
RCBD or CRD?
If the blocking was not helpful, then the relative efficiency
equals 1.
The larger the relative efficiency is, the more efficient the
blocking was at reducing the error variance.
The value can be interpreted as the ratio n/b, where n is the number of experimental units per treatment that a CRD would need in order to achieve the same precision as the RCBD with b blocks.
185
RCBD or CRD?
Example (Hardness test)
$RE(RCBD, CRD) = \frac{MS_E(CRD)}{MS_E(RCBD)} = \frac{0.0754}{0.0089} = 8.47$
This implies that it would have taken more than 8.47 times as many
experimental units/treatment to get the same MSE as we got using the
coupon as blocks.
186
Multiple Comparison RCBD
When a significant result is found, determine where the differences lie. With a minor modification (the number of blocks b plays the role of the sample size n), the multiple comparison methods for the CRD carry over to the RCBD.
187
Latin Square Design
Latin square designs are used to simultaneously control (or
eliminate) two sources of nuisance variability.
189
Latin Square Design (ANOVA Table)
190
Latin Square Design (Example)
Five different formulations of a rocket propellant
191
Latin Square Design (Example)
Coding (by subtracting 25 from each observation)
192
Latin Square Design (Example)
The sums of squares for the total, the batches (rows), and the operators (columns) are computed as follows
193
Latin Square Design (Example)
194
Latin Square Design (Example)
195
Graeco-Latin Square Design
There is a single factor of primary interest, typically called the
treatment factor, and several nuisance factors.
For Latin square designs there are 2 nuisance factors, for Graeco-
Latin square designs there are 3 nuisance factors
196
Graeco-Latin Square Design
A 4x4 Graeco-Latin square design
197
Graeco-Latin Square Design
ANOVA table for Graeco-Latin square design
198
Graeco-Latin Square Design
A 4x4 Graeco-Latin square design
199
Graeco-Latin Square Design (Example)
Five different formulations of a rocket propellant
200
Graeco-Latin Square Design (Example)
The sums of squares for the total, the batches (rows), and the operators (columns) are computed as follows
201
Graeco-Latin Square Design (Example)
202
Graeco-Latin Square Design (Example)
203
Graeco-Latin Square Design (Example)
The ANOVA table
204
Balanced Incomplete Block Design (BIBD)
A BIBD is a design in which every block contains k (< a) experimental units, each treatment appears in r blocks, and each pair of treatments occurs together in the same block the same number of times, λ.
205
Balanced Incomplete Block Design (BIBD)
N = ar = bk, the total number of observations
λ = r(k − 1)/(a − 1) must be an integer
206
Balanced Incomplete Block Design (BIBD)
Consider the following example (a = 4 treatments, b = 4 blocks):
Treatment   Block 1   Block 2   Block 3   Block 4
1           1         0         1         0
2           0         1         0         1
3           1         0         1         0
4           0         1         0         1
Does each pair of treatments occur together λ times? Are there k experimental units in each block? Does each treatment occur r times? Is it a BIBD?
207
Balanced Incomplete Block Design (BIBD)
208
Balanced Incomplete Block Design (BIBD)
209
Balanced Incomplete Block Design (BIBD)
The adjusted treatment sum of square is
210
Balanced Incomplete Block Design (BIBD)
The adjusted treatment sum of square is
211
Balanced Incomplete Block Design (Example)
Suppose a chemical engineer thinks that the time of reaction for a
chemical process is a function of the type of catalyst employed. Four
catalysts are being investigated. Variation in the batches of raw
material may affect the performance of the catalysts, the engineer
decides the use batches of raw material as a block. However each
batch is only large enough to permit three catalyst to be run.
212
Balanced Incomplete Block Design (Example)
There are
a=4, b=4, k=3, r=3, λ=2, and N=12
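A quick sketch verifying that these parameters satisfy the BIBD relations N = ar = bk and λ = r(k − 1)/(a − 1):

```python
# Check the BIBD parameter relations for the catalyst example:
# a = 4 treatments, b = 4 blocks of size k = 3, each treatment run r = 3 times.
a, b, k, r = 4, 4, 3, 3

N = a * r                      # total number of observations
lam = r * (k - 1) / (a - 1)    # times each pair of treatments co-occurs
print(N, lam)                  # lambda must be an integer for a BIBD

assert N == b * k              # the two expressions for N must agree
assert lam.is_integer()        # otherwise no BIBD with these parameters exists
```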
213
Balanced Incomplete Block Design (Example)
The adjusted treatment totals are
214
Balanced Incomplete Block Design (Example)
The Analysis of variance table
215
5. Factorial Design
216
Basic Definition and Principles
Two factors A and B are said to be crossed if every level of A
occurs with every level of factor B, and vice versa.
Let factor A have two levels(L,H) and factor B has two levels
(L, H)
When the factors crossed the treatment combinations would be
as follows:
Treatment 1 2 3 4
LL LH HL HH
217
Basic Definition and Principles
A factorial experiment allows investigation into the effect of two
or more factors on the mean value of a response.
220
Two factor design with interaction
Consider the two factor experiment shown below
AB
28 30
29
2
222
Advantages of Factorial Designs
More efficient than one factor at a time experiments
223
The Two Factor Factorial Design
The simplest type of factorial design involves only two factors or treatments.
There are a levels of factor A and b levels of factor B, and these are
arranged in a factorial design, that is, each replicate of the
experiment contains all ab treatment combinations.
224
The Two Factor Factorial Design
The observations in a factorial experiment can be described by a
model.
225
The Two Factor Factorial Design
Hypothesis to be tested
226
The Two Factor Factorial Design
Statistical Analysis
227
The Two Factor Factorial Design
Sum of squares
228
The Two Factor Factorial Design (ANOVA Table)
229
The Two Factor Factorial Design
Expected values for mean of squares
230
The Two Factor Factorial Design (Example))
The Battery Design Experiment: An engineer is designing a battery for
use in a device that will be subjected to some extreme variations in
temperature. The only design parameter that he can select at this point is the
plate material for the battery, and he has three possible choices. When the
device is manufactured and is shipped to the field, the engineer has no
control over the temperature extremes that the device will encounter, and he
knows from experience that temperature will probably affect the effective
battery life. However, temperature can be controlled in the product
development laboratory for the purpose of the test. All three plate materials
tested at three temperature levels (15, 70, 125 oF), four batteries are tested at
each combinations plate material and temperature, and all 36 tests are
performed in random order.
231
The Two Factor Factorial Design (Example)
The life (in hours) data is as follows
Two questions:
What effects do material type and temperature have on the life of the battery?
Is there a choice of material that would give uniformly long life regardless of
temperature?
232
The Two Factor Factorial Design (Example)
233
The Two Factor Factorial Design (Example)
234
The Two Factor Factorial Design (Example)
235
The Two Factor Factorial Design (Example)
Material Type
Temperature
Changing from low to intermediate temperature, battery life with material type 3 actually increases, whereas it decreases for types 1 and 2.
From intermediate to high temperature, battery life decreases for material types 2 and 3.
Material type 3 seems to give the best results if the engineer wants less loss of effective life as the temperature changes.
237
The Two Factor Factorial Design (Example)
Checking interaction for single replication
238
The Two Factor Factorial Design (Example)
Impurity present in a chemical product is affected by two factors-
pressure and temperature. The data from a single replicate of a
factorial experiment are
239
The Two Factor Factorial Design (Example)
240
The Two Factor Factorial Design (Example)
241
The Two Factor Factorial Design (Example)
242
Three factor factorial design
243
Three factor factorial design
244
Three factor factorial design
ANOVA Table
245
Three factor factorial design (Example)
Soft Drink Bottling Problem
A soft drink bottler is interested in obtaining more uniform fill heights
in the bottles produced by his manufacturing process. An
experiment is conducted to study three factors of the process: the percent carbonation (A): 10, 12, 14 percent; the operating pressure (B): 25, 30 psi; and the line speed (C): 200, 250 bpm.
The response is the deviation from the target fill height. Each
combination of the three factors has two replicates and all 24 runs
are performed in a random order. The experiment and data are
shown below
246
Three factor factorial design (Example)
Soft Drink Bottling Problem
247
Three factor factorial design (Example)
Soft Drink Bottling Problem
248
Three factor factorial design (Example)
Soft Drink Bottling Problem
249
Three factor factorial design (Example)
Soft Drink Bottling Problem
250
Three factor factorial design (Example)
Soft Drink Bottling Problem
251
Three factor factorial design (Example)
ANOVA Table: Soft Drink Bottling Problem
252
Three factor factorial design (Example)
253
Blocking in factorial design
So far we have discussed factorial experiments run as completely randomized designs.
254
Blocking in factorial design
Now suppose to run this experiment a particular raw material is
needed
255
Blocking in factorial design
256
Blocking in factorial design(Example)
257
Blocking in factorial design(Example)
ANOVA table
258
The 2k factorial design
The 2k designs are a major set of building blocks for many
experimental designs.
The 2k refers to designs with k factors where each factor has just
two levels.
with each factor having the minimal number of levels, just two.
By screening we are referring to the process of screening a large
number of factors that might be important in your experiment, with the
goal of selecting those important for the response that you're measuring.
259
The 2k factorial design
The 22 factorial design
The simplest case is the 2k design with k = 2: since there are two levels of each of two factors, there are 2² = 4 treatment combinations.
260
The 22 factorial design
Therefore, there are four treatment combinations. In Yates notation we use:
"(1)" for when both A and B are at their low levels,
"a" for when A is at its high level and B is at its low level,
"b" for when B is at its high level and A is at its low level, and
"ab" for when both A and B are at their high levels.
This Yates notation indicates the high level of any factor by the corresponding lowercase letter.
262
The 22 factorial design
This notation actually is used for two purposes.
One is to denote the total sum of the observations at that
level.
In the case below b = 60 is the sum of the three observations
at the level b.
263
The 22 factorial design
264
The 22 factorial design
Practical interpretation?
Increasing reactant concentration increases yield
Catalyst effect is negative
Interaction effect is relatively smaller
265
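The interpretation above can be sketched numerically. Only b = 60 is quoted in the text; the other treatment totals (80, 100, 90) and n = 3 replicates are assumed illustrative values chosen to match the signs described:

```python
# Effect estimates for a 2^2 design in Yates notation.
# b = 60 is the total quoted above; (1), a, ab, and n = 3 are assumed.
one, a_, b_, ab = 80, 100, 60, 90
n = 3

A  = (a_ + ab - b_ - one) / (2 * n)    # main effect of A (reactant concentration)
B  = (b_ + ab - a_ - one) / (2 * n)    # main effect of B (catalyst)
AB = (ab + one - a_ - b_) / (2 * n)    # interaction effect
print(round(A, 2), round(B, 2), round(AB, 2))
```

With these totals, A is positive, B is negative, and AB is comparatively small, consistent with the practical interpretation above.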
The 22 factorial design (Sum of squares)
Consider the sum of square for A, B, and AB.
266
The 22 factorial design (Sum of squares)
The sum of squares for any contrast can be computed as in chapter 3: the contrast sum of squares equals the contrast squared, divided by the number of observations in each total times the sum of squares of the contrast coefficients:
$SS_C = \frac{(\text{Contrast})^2}{n\sum_i c_i^2}$
267
The 22 factorial design (Sum of squares)
The sum of squares are
268
The 22 factorial design
269
The 23 factorial design
Here is an example in three dimensions, with factors A, B and C.
270
The 23 factorial design
271
The 23 factorial design
Consider estimating main effects
272
The 23 factorial design
The average effect of AB interaction
273
The 23 factorial design
Sum of squares for effects are
274
The 2^3 factorial design
The sums of squares for the effects are
275
The 2^3 factorial design
276
The 2^3 factorial design
277
The 2^3 factorial design
The effect estimates, sums of squares, and percent contributions are:
279
6. Nested and Split Plot Design
280
Nested Design
In a nested design, the levels of one factor (B) are similar to, but not identical across, the different levels of another factor (A).
281
Nested Design (Two-Stage)
ANOVA table
283
Nested Design-Two stage (Example)
Example
284
Nested Design
The sum of squares are
285
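A sketch of the two-stage nested sums of squares, using hypothetical purity data (a = 2 suppliers, b = 2 batches nested in each supplier, n = 3 observations per batch; the slides' own numbers are not reproduced here):

```python
# data[supplier][batch] = purity observations (hypothetical values)
data = {
    1: {1: [94, 92, 93], 2: [91, 90, 89]},
    2: {1: [96, 95, 97], 2: [93, 94, 92]},
}
a, b, n = 2, 2, 3

batch_totals    = {(s, t): sum(obs) for s, bats in data.items()
                                    for t, obs in bats.items()}
supplier_totals = {s: sum(sum(obs) for obs in bats.values())
                   for s, bats in data.items()}
grand = sum(supplier_totals.values())
CF = grand ** 2 / (a * b * n)                 # correction factor

SS_A  = sum(t ** 2 for t in supplier_totals.values()) / (b * n) - CF
SS_BA = sum(t ** 2 for t in batch_totals.values()) / n - (SS_A + CF)
SS_E  = (sum(y * y for bats in data.values() for obs in bats.values() for y in obs)
         - sum(t ** 2 for t in batch_totals.values()) / n)
```

SS_A is the supplier (A) sum of squares, SS_BA the batches-within-suppliers B(A) sum of squares, and SS_E the residual.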
Nested Design
The ANOVA table
The purity of batches of raw material from the same supplier does
differ significantly.
286
Nested Design – Three Stage
288
Nested Design – Three Stage
289
Split Plot Design
Split-plot designs are needed when the levels of some treatment factors are more difficult to change during the experiment than those of others, or when the nature of the experiment requires the use of large experimental units for some factors and smaller experimental units for others.
The designs have a nested blocking structure: split plots are nested
within whole plots, which may be nested within blocks.
290
Split Plot Design
Split-plot designs have three main characteristics:
i. The levels of all the factors are not randomly determined and reset for
each experimental run. Did you hold a factor at a particular setting and then
run all the combinations of the other factors?
ii. The size of the experimental unit is not the same for all experimental
factors. Did you apply one factor to a larger unit or group of units involving
combinations of the other factors?
291
Split Plot Design
Example
Designate farm fields for the experiment (a blocking factor with s levels).
292
Split Plot Design
i. The blocks are divided into three (equal sized) large experimental units
called whole plots.
ii. The three levels of factor A (oats) are randomly assigned to these whole
plots (each plot is assigned a variety of oat according to a randomized
block design).
iii. Each whole plot is subdivided into two smaller experimental units called subplots (split plots), and the two levels of manure are randomly assigned to the two subplots.
293
Split Plot Design
              Block 1        |      Block 2
Whole plots:  A2   A1   A3   |   A3   A1   A2
Subplots:     B2   B1   B1   |   B2   B2   B1
              B1   B2   B2   |   B1   B1   B2
294
Split Plot Design (Example)
The general model for a two-factor split-plot experiment run in an RCBD with r random blocks is
y_ijk = μ + ρ_i + α_j + (ρα)_ij + β_k + (αβ)_jk + ε_ijk,
where ρ_i is the block effect, α_j the whole-plot factor (A) effect, (ρα)_ij the whole-plot error, β_k the subplot factor (B) effect, and (αβ)_jk the AB interaction.
295
Split Plot Design (ANOVA Table)
Source                  df            Sum of Squares
Blocks (R)              r-1           SS_Block
A                       a-1           SS_A
Whole-plot error (RA)   (r-1)(a-1)    SS_RA
B                       b-1           SS_B
AB                      (a-1)(b-1)    SS_AB
Subplot error           a(b-1)(r-1)   SS_E
Total                   rab-1         SS_T
296
Split Plot Design (ANOVA Table)
CF = y...^2 / (rab)
SS_T = Σ_i Σ_j Σ_k y_ijk^2 - CF
SS_Block = (1/ab) Σ_i y_i..^2 - CF
SS_A = (1/rb) Σ_j y_.j.^2 - CF
SS_B = (1/ra) Σ_k y_..k^2 - CF
SS_RA = (1/b) Σ_i Σ_j y_ij.^2 - CF - SS_Block - SS_A   (whole-plot error)
298
Split Plot Design (Example)
The result is as follows
Farm   Variety   Spray 1   Spray 2   Spray 3
  1       1         71        64        84
  1       2         66        56        82
  2       1         83        77        97
  2       2         79        73        88
299
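The sums of squares in the ANOVA table that follows can be reproduced directly from these data (a sketch, with Farm as the block with r = 2, Spray the whole-plot factor with a = 3, and Variety the subplot factor with b = 2):

```python
# data[farm][variety] = yields under sprays 1..3 (values from the slide)
data = {
    1: {1: [71, 64, 84], 2: [66, 56, 82]},
    2: {1: [83, 77, 97], 2: [79, 73, 88]},
}
r, a, b = 2, 3, 2            # farms (blocks), sprays, varieties
obs = [(f, s, v, y)
       for f, vs in data.items()
       for v, row in vs.items()
       for s, y in enumerate(row, start=1)]

grand = sum(y for *_, y in obs)
CF = grand ** 2 / (r * a * b)          # correction factor

def ss(group):
    """Sum of squares for a grouping of the observations."""
    totals = {}
    for f, s, v, y in obs:
        key = group(f, s, v)
        totals[key] = totals.get(key, 0) + y
    size = (r * a * b) // len(totals)  # observations per group total
    return sum(t * t for t in totals.values()) / size - CF

SS_farm  = ss(lambda f, s, v: f)                              # 456.33
SS_spray = ss(lambda f, s, v: s)                              # 842.17
SS_fs    = ss(lambda f, s, v: (f, s)) - SS_farm - SS_spray    # 15.17
SS_var   = ss(lambda f, s, v: v)                              # 85.33
SS_sv    = ss(lambda f, s, v: (s, v)) - SS_spray - SS_var     # 1.17
SS_total = sum(y * y for *_, y in obs) - CF                   # 1416.67
SS_error = SS_total - (SS_farm + SS_spray + SS_fs + SS_var + SS_sv)  # 16.5
```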
Split Plot Design (Example)
The ANOVA table
Source          SS        Df   MS       F       P-value
Farm            456.33    1    456.33   82.97   0.0028
Spray           842.17    2    421.08   76.56   0.0027
Farm*Spray      15.17     2    7.58     1.38    0.3761
Variety         85.33     1    85.33    15.52   0.0292
Spray*Variety   1.17      2    0.58     0.11    0.9026
Error           16.5      3    5.50
Total           1416.67   11
The mean square for Spray is 421.08. In the split-plot analysis, Spray (the whole-plot factor) is tested against the Farm*Spray mean square: F = 421.08/7.58 = 55.55 with (2, 2) df, giving p = 0.0177 (compare with the ANOVA table).
302
Introduction
A ‘classic’ ANOVA tests for differences in mean response across the levels of a categorical factor (treatment).
303
Introduction
These sources of extraneous variability historically have been
referred to as ‘nuisance’ or ‘concomitant’ variables.
305
ANCOVA for CRD
If there is no treatment effect, the model reduces to y_ij = μ + β(x_ij - x̄..) + ε_ij, with the ε_ij independently normally distributed.
306
ANCOVA for CRD
The adjusted mean estimates are ȳ_i. - β̂(x̄_i. - x̄..); each adjusted mean is normally distributed with variance σ²[1/n + (x̄_i. - x̄..)²/E_xx].
307
ANCOVA for CRD
Cross products are computed as follows
308
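The slides' numerical values for the cross products are not recoverable here, so the sketch below uses hypothetical (x, y) data for a CRD with one covariate. E denotes within-treatment (error) quantities, S total quantities, and T = S - E the treatment components:

```python
# data[treatment] = [(x, y), ...] covariate and response (hypothetical)
data = {
    'T1': [(20, 36), (25, 41), (24, 39)],
    'T2': [(22, 40), (28, 48), (27, 45)],
}
pts_all = [p for pts in data.values() for p in pts]
N = len(pts_all)
xbar = sum(x for x, _ in pts_all) / N
ybar = sum(y for _, y in pts_all) / N

# Total (corrected) sums of squares and cross products:
Sxx = sum((x - xbar) ** 2 for x, _ in pts_all)
Sxy = sum((x - xbar) * (y - ybar) for x, y in pts_all)

# Within-treatment (error) sums of squares and cross products:
Exx = Exy = 0.0
for pts in data.values():
    xb = sum(x for x, _ in pts) / len(pts)
    yb = sum(y for _, y in pts) / len(pts)
    Exx += sum((x - xb) ** 2 for x, _ in pts)
    Exy += sum((x - xb) * (y - yb) for x, y in pts)

Txx, Txy = Sxx - Exx, Sxy - Exy       # treatment components

beta = Exy / Exx                      # pooled within-treatment slope
adj_means = {t: sum(y for _, y in pts) / len(pts)
                - beta * (sum(x for x, _ in pts) / len(pts) - xbar)
             for t, pts in data.items()}
```

The adjusted means correct each treatment's raw mean response for where its covariate values happen to fall relative to the overall covariate mean.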
ANCOVA for CRD
Example:
309
ANCOVA for CRD
Cross products
310
ANCOVA for CRD
Cross products
311
ANCOVA for CRD
Cross products
312
ANCOVA
Test Statistic
313
ANCOVA
Adjusted estimated treatment means are
314
ANCOVA for CRD
ANOVA Table
315