
Introduction to Probability and Statistics
Twelfth Edition

Robert J. Beaver • Barbara M. Beaver • William Mendenhall

Presentation designed and written by:
Barbara M. Beaver

Copyright ©2006 Brooks/Cole
A division of Thomson Learning, Inc.

Chapter 11
The Analysis of Variance

Some graphic screen captures from Seeing Statistics®
Some images © 2001-(current year) www.arttoday.com
Experimental Design
• The sampling plan or experimental design
determines the way that a sample is selected.
• In an observational study, the experimenter
observes data that already exist. The sampling
plan is a plan for collecting this data.
• In a designed experiment, the experimenter
imposes one or more experimental conditions
on the experimental units and records the
response.
Definitions
• An experimental unit is the object on which a
measurement (or measurements) is taken.
• A factor is an independent variable whose
values are controlled and varied by the
experimenter.
• A level is the intensity setting of a factor.
• A treatment is a specific combination of factor
levels.
• The response is the variable being measured by
the experimenter.
Example
• A group of people is randomly divided into
an experimental and a control group. The
control group is given an aptitude test after
having eaten a full breakfast. The
experimental group is given the same test
without having eaten any breakfast.
Experimental unit = person
Response = score on test
Factor = meal
Levels = breakfast or no breakfast
Treatments: breakfast or no breakfast
Example
• The experimenter in the previous example
also records the person’s gender. Describe
the factors, levels and treatments.
Experimental unit = person
Response = score
Factor #1 = meal; Levels = breakfast or no breakfast
Factor #2 = gender; Levels = male or female
Treatments:
male and breakfast, female and breakfast, male
and no breakfast, female and no breakfast
The Analysis of Variance
(ANOVA)
• All measurements exhibit variability.
• The total variation in the response
measurements is broken into portions that
can be attributed to various factors.
• These portions are used to judge the effect
of the various factors on the experimental
response.

The Analysis of Variance
• If an experiment has been properly designed, the
total variation can be partitioned into portions due
to Factor 1, Factor 2, ..., and random variation.
• We compare the variation due to any one factor to
the typical random variation in the experiment.
• A factor matters when the variation between the
sample means is larger than the typical variation
within the samples; it does not when the variation
between the sample means is about the same as
the typical variation within the samples.
Assumptions
• Similar to the assumptions required in
Chapter 10.
1. The observations within each population are
normally distributed with a common variance σ².
2. Assumptions regarding the sampling
procedures are specified for each design.

• Analysis of variance procedures are fairly robust
when sample sizes are equal and when the data are
fairly mound-shaped.
Three Designs
• Completely randomized design: an extension of the two independent sample
t-test.
• Randomized block design: an extension of the paired difference test.
• a × b Factorial experiment: we study two experimental factors and their effect
on the response.

The Completely
Randomized Design
• A one-way classification in which one
factor is set at k different levels.
• The k levels correspond to k different normal
populations, which are the treatments.
• Are the k population means the same, or is at
least one mean different from the others?

Example
Is the attention span of children
affected by whether or not they had a good
breakfast? Twelve children were randomly
divided into three groups, each assigned to a
different meal plan. The response was attention
span in minutes during the morning reading time.

No Breakfast   Light Breakfast   Full Breakfast
      8              14                10
      7              16                12
      9              12                16
     13              17                15

k = 3 treatments. Are the average attention spans different?
The Completely
Randomized Design
• Random samples of size n1, n2, …, nk are
drawn from k populations with means μ1, μ2,
…, μk and with common variance σ².
• Let xij be the j-th measurement in the i-th
sample.
• The total variation in the experiment is
measured by the total sum of squares:

Total SS = Σ(xij − x̄)²
The Analysis of Variance
The Total SS is divided into two parts:
• SST (sum of squares for treatments):
measures the variation among the k sample
means.
• SSE (sum of squares for error):
measures the variation within the k samples.
in such a way that:

Total SS = SST + SSE
Computing Formulas

CM = G²/n  where G = Σxij
Total SS = Σxij² − CM
SST = Σ(Ti²/ni) − CM  where Ti = total for treatment i
SSE = Total SS − SST
The Breakfast Problem
No Breakfast   Light Breakfast   Full Breakfast
      8              14                10
      7              16                12
      9              12                16
     13              17                15
   T1 = 37        T2 = 59           T3 = 53      G = 149

CM = 149²/12 = 1850.0833
Total SS = 8² + 7² + ... + 15² − CM = 1973 − 1850.0833 = 122.9167
SST = 37²/4 + 59²/4 + 53²/4 − CM = 1914.75 − CM = 64.6667
SSE = Total SS − SST = 122.9167 − 64.6667 = 58.25
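These computing formulas can be sketched in Python. A minimal hand computation for the breakfast data (NumPy assumed available; variable names are illustrative, not from the text):

```python
import numpy as np

# Attention spans (minutes) for the three meal plans, from the slide.
groups = {
    "none":  [8, 7, 9, 13],
    "light": [14, 16, 12, 17],
    "full":  [10, 12, 16, 15],
}

data = np.concatenate([np.array(g) for g in groups.values()])
n, G = data.size, data.sum()

CM = G**2 / n                       # correction for the mean: 149²/12
total_ss = (data**2).sum() - CM     # Total SS
SST = sum(sum(g)**2 / len(g) for g in groups.values()) - CM  # treatments
SSE = total_ss - SST                # error (within-sample variation)

print(round(CM, 4), round(total_ss, 4), round(SST, 4), round(SSE, 4))
# → 1850.0833 122.9167 64.6667 58.25, matching the slide.
```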
Degrees of Freedom and
Mean Squares
• These sums of squares behave like the
numerator of a sample variance. When
divided by the appropriate degrees of
freedom, each provides a mean square,
an estimate of variation in the experiment.
• Degrees of freedom are additive, just like
the sums of squares.
Total df = Trt df + Error df
The ANOVA Table
Total df = n1 + n2 + … + nk − 1 = n − 1
Treatment df = k − 1
Error df = (n − 1) − (k − 1) = n − k

Mean Squares: MST = SST/(k − 1), MSE = SSE/(n − k)

Source       df      SS         MS            F
Treatments   k − 1   SST        SST/(k − 1)   MST/MSE
Error        n − k   SSE        SSE/(n − k)
Total        n − 1   Total SS

The Breakfast Problem
CM = 149²/12 = 1850.0833
Total SS = 1973 − 1850.0833 = 122.9167
SST = 1914.75 − CM = 64.6667
SSE = Total SS − SST = 58.25

Source       df   SS         MS        F
Treatments    2   64.6667    32.3333   5.00
Error         9   58.25       6.4722
Total        11   122.9167

Testing the Treatment Means
H0: μ1 = μ2 = μ3 = … = μk  versus
Ha: at least one mean is different

Remember that σ² is the common variance for all k
populations. The quantity MSE = SSE/(n − k) is a
pooled estimate of σ², a weighted average of all k
sample variances, whether or not H0 is true.
• If H0 is true, then the variation in the
sample means, measured by MST = SST/(k − 1),
also provides an unbiased estimate of σ².
• However, if H0 is false and the population
means are different, then MST, which
measures the variance in the sample means,
is unusually large. The test statistic
F = MST/MSE tends to be larger than usual.
The F Test
• Hence, you can reject H0 for large values of
F, using a right-tailed statistical test.
• When H0 is true, this test statistic has an F
distribution with df1 = k − 1 and df2 = n − k
degrees of freedom, and right-tailed critical
values of the F distribution can be used.

To test H0: μ1 = μ2 = … = μk
Test statistic: F = MST/MSE
Reject H0 if F > Fα with k − 1 and n − k df.
The Breakfast Problem
Source df SS MS F
Treatments 2 64.6667 32.3333 5.00
Error 9 58.25 6.4722
Total 11 122.9167

H0: μ1 = μ2 = μ3  versus
Ha: at least one mean is different

F = MST/MSE = 32.3333/6.4722 = 5.00
Rejection region: F > F.05 = 4.26.

We reject H0 and conclude that there is a
difference in average attention spans.
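The hand computation can be cross-checked with SciPy's one-way ANOVA routine, a sketch assuming SciPy is installed; `scipy.stats.f` supplies the right-tailed critical value in place of an F table:

```python
from scipy import stats

# Breakfast data from the slides.
no_bkfst    = [8, 7, 9, 13]
light_bkfst = [14, 16, 12, 17]
full_bkfst  = [10, 12, 16, 15]

F, p = stats.f_oneway(no_bkfst, light_bkfst, full_bkfst)
crit = stats.f.ppf(0.95, dfn=2, dfd=9)   # right-tailed F.05 with 2 and 9 df

print(f"F = {F:.2f}, p = {p:.4f}, F.05(2, 9) = {crit:.2f}")
# F ≈ 5.00 exceeds the critical value ≈ 4.26, so H0 is rejected,
# agreeing with the slide's conclusion.
```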
Confidence Intervals
•If a difference exists between the treatment
means, we can explore it with confidence
intervals.
A single mean μi:  x̄i ± t(α/2) · s/√ni

Difference μi − μj:  (x̄i − x̄j) ± t(α/2) · s · √(1/ni + 1/nj)

where s = √MSE and t is based on error df.
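These interval formulas can be sketched for the breakfast data (SciPy assumed available; the means and MSE are taken from the earlier slides, and the choice of which means to compare is illustrative):

```python
from math import sqrt
from scipy import stats

MSE, err_df, n_i = 6.4722, 9, 4           # from the breakfast ANOVA table
means = {"none": 9.25, "light": 14.75, "full": 13.25}

s = sqrt(MSE)
t = stats.t.ppf(0.975, err_df)            # t(.025) with 9 df ≈ 2.262

# 95% CI for a single mean, e.g. the "light breakfast" mean.
half = t * s / sqrt(n_i)
print(f"mu_light: {means['light']:.2f} ± {half:.2f}")

# 95% CI for a difference, e.g. light minus none (equal sample sizes).
half_diff = t * s * sqrt(1/n_i + 1/n_i)
print(f"mu_light - mu_none: {means['light'] - means['none']:.2f} ± {half_diff:.2f}")
```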
Tukey’s Method for
Paired Comparisons
• Designed to test all pairs of population means
simultaneously, with an overall error rate of α.
• Based on the studentized range, the
difference between the largest and smallest of
the k sample means.
• Assume that the sample sizes are equal and
calculate a "ruler" that measures the distance
required between any pair of means to declare
a significant difference.
Tukey’s Method
Calculate:  ω = qα(k, df) · s/√ni

where k = number of treatment means
s = √MSE
df = error df
ni = common sample size
qα(k, df) = value from Table 11.

If any pair of means differ by more than ω,
they are declared different.
The Breakfast Problem
Use Tukey’s method to determine which of the
three population means differ from the others.
             No Breakfast   Light Breakfast   Full Breakfast
Totals:         T1 = 37        T2 = 59           T3 = 53
Means:        37/4 = 9.25    59/4 = 14.75      53/4 = 13.25

ω = q.05(3, 9) · s/√ni = 3.95 · √(6.4722/4) = 5.02
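The Table 11 value q.05(3, 9) can be obtained from SciPy's studentized range distribution (available in SciPy 1.7+, an assumption); a sketch of the Tukey "ruler" for the breakfast problem:

```python
from math import sqrt
from scipy.stats import studentized_range

k, err_df, n_i, MSE = 3, 9, 4, 6.4722     # from the breakfast ANOVA

q = studentized_range.ppf(0.95, k, err_df)  # q.05(3, 9) ≈ 3.95
omega = q * sqrt(MSE / n_i)                 # the ruler ω ≈ 5.02

print(f"q = {q:.2f}, omega = {omega:.2f}")
# Any pair of sample means farther apart than ω is declared different.
```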
The Breakfast Problem
List the sample means from smallest to
largest.

x̄1 = 9.25   x̄3 = 13.25   x̄2 = 14.75     (ω = 5.02)

Since the difference between 9.25 and 13.25 is
less than ω = 5.02, there is no significant
difference between those means, and there is no
difference between 13.25 and 14.75. The difference
between 9.25 and 14.75 exceeds ω, however, so we
declare a significant difference between population
means μ1 and μ2: average attention spans differ
between "no breakfast" and "light breakfast", but
not between the other pairs.
The Randomized
Block Design
• A direct extension of the paired difference
or matched pairs design.
• A two-way classification in which k
treatment means are compared.
• The design uses blocks of k experimental
units that are relatively similar or
homogeneous, with one unit within each
block randomly assigned to each
treatment.
The Randomized
Block Design
• If the design involves k treatments within
each of b blocks, then the total number of
observations is n = bk.
• The purpose of blocking is to remove or isolate
the block-to-block variability that might hide
the effect of the treatments.
• There are two factors, treatments and
blocks, only one of which is of interest to the
experimenter.
Example
We want to investigate the effect of
3 methods of soil preparation on the growth
of seedlings. Each method is applied to
seedlings growing at each of 4 locations and
the average first-year growth is recorded.

                  Location
Soil Prep    1    2    3    4
   A        11   13   16   10
   B        15   17   20   12
   C        10   15   13   10

Treatment = soil preparation (k = 3)
Block = location (b = 4)
Is the average growth different for the 3 soil preps?
The Randomized
Block Design
• Let xij be the response for the i-th
treatment applied to the j-th block.
  – i = 1, 2, …, k;  j = 1, 2, …, b
• The total variation in the experiment is
measured by the total sum of squares:

Total SS = Σ(xij − x̄)²
The Analysis of Variance
The Total SS is divided into 3 parts:
• SST (sum of squares for treatments): measures
the variation among the k treatment means
• SSB (sum of squares for blocks): measures the
variation among the b block means
• SSE (sum of squares for error): measures the
random variation or experimental error
in such a way that:

Total SS = SST + SSB + SSE
Computing Formulas
CM = G²/n  where G = Σxij
Total SS = Σxij² − CM
SST = ΣTi²/b − CM  where Ti = total for treatment i
SSB = ΣBj²/k − CM  where Bj = total for block j
SSE = Total SS − SST − SSB

The Seedling Problem
                  Locations
Soil Prep    1    2    3    4    Ti
   A        11   13   16   10    50
   B        15   17   20   12    64
   C        10   15   13   10    48
   Bj       36   45   49   32   G = 162

CM = 162²/12 = 2187
Total SS = 11² + 15² + ... + 10² − 2187 = 2298 − 2187 = 111
SST = (50² + 64² + 48²)/4 − 2187 = 2225 − 2187 = 38
SSB = (36² + 45² + 49² + 32²)/3 − 2187 = 2248.6667 − 2187 = 61.6667
SSE = 111 − 38 − 61.6667 = 11.3333
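The randomized block computations above can be sketched in Python (NumPy assumed), with rows as treatments (soil preps) and columns as blocks (locations):

```python
import numpy as np

# Seedling growth, rows = soil preps A, B, C; columns = locations 1-4.
x = np.array([[11, 13, 16, 10],
              [15, 17, 20, 12],
              [10, 15, 13, 10]])
k, b = x.shape
n, G = x.size, x.sum()

CM = G**2 / n                               # 162²/12 = 2187
total_ss = (x**2).sum() - CM
SST = (x.sum(axis=1)**2).sum() / b - CM     # treatment totals T_i
SSB = (x.sum(axis=0)**2).sum() / k - CM     # block totals B_j
SSE = total_ss - SST - SSB

print(round(total_ss, 4), round(SST, 4), round(SSB, 4), round(SSE, 4))
# → 111.0 38.0 61.6667 11.3333, matching the slide.
```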
The ANOVA Table
Total df = bk − 1 = n − 1
Treatment df = k − 1              MST = SST/(k − 1)
Block df = b − 1                  MSB = SSB/(b − 1)
Error df = (bk − 1) − (k − 1) − (b − 1) = (k − 1)(b − 1)
                                  MSE = SSE/((k − 1)(b − 1))

Source       df              SS        MS                    F
Treatments   k − 1           SST       SST/(k − 1)           MST/MSE
Blocks       b − 1           SSB       SSB/(b − 1)           MSB/MSE
Error        (b − 1)(k − 1)  SSE       SSE/((b − 1)(k − 1))
Total        n − 1           Total SS
The Seedling Problem
CM = 162²/12 = 2187
Total SS = 111
SST = 38
SSB = 61.6667
SSE = 111 − 38 − 61.6667 = 11.3333

Source       df   SS        MS        F
Treatments    2   38        19        10.06
Blocks        3   61.6667   20.5556   10.88
Error         6   11.3333    1.8889
Total        11   111
Testing the Treatment
and Block Means
For either treatment or block means, we can
test:
H0: μ1 = μ2 = μ3 = …  versus
Ha: at least one mean is different

Remember that σ² is the common variance for all bk
treatment/block combinations. MSE is the best
estimate of σ², whether or not H0 is true.
• If H0 is false and the population means are
different, then MST or MSB, whichever you
are testing, will be unusually large. The
test statistic F = MST/MSE (or F = MSB/MSE)
tends to be larger than usual.
• We use a right-tailed F test with the
appropriate degrees of freedom.

To test H0: treatment (or block) means are equal
Test statistic: F = MST/MSE (or F = MSB/MSE)
Reject H0 if F > Fα with k − 1 (or b − 1) and (b − 1)(k − 1) df.
The Seedling Problem
Source             df   SS        MS        F
Soil Prep (Trts)    2   38        19        10.06
Location (Blocks)   3   61.6667   20.5556   10.88
Error               6   11.3333    1.8889
Total              11   111

To test for a difference due to soil preparation:
H0: μ1 = μ2 = μ3  versus
Ha: at least one mean is different
F = MST/MSE = 10.06
Rejection region: F > F.05 = 5.14.
We reject H0 and conclude that there is a
difference due to soil preparation.

Although not of primary importance, notice that the
blocks (locations) were also significantly different
(F = 10.88).
Confidence Intervals
• If a difference exists between the treatment
means or block means, we can explore it with
confidence intervals or using Tukey's method.

Difference in treatment means:  (T̄i − T̄j) ± t(α/2) · s · √(2/b)
Difference in block means:      (B̄i − B̄j) ± t(α/2) · s · √(2/k)

where T̄i = Ti/b and B̄i = Bi/k are the
relevant treatment or block means,
s = √MSE, and t is based on error df.
Tukey’s Method
For comparing treatment means:  ω = qα(k, df) · s/√b
For comparing block means:      ω = qα(b, df) · s/√k

s = √MSE,  df = error df
qα(k, df) = value from Table 11.

If any pair of means differ by more than ω,
they are declared different.
The Seedling Problem
Use Tukey’s method to determine which of the
three soil preparations differ from the others.
          A (no prep)   B (fertilization)   C (burning)
Totals:     T1 = 50         T2 = 64           T3 = 48
Means:    50/4 = 12.5      64/4 = 16         48/4 = 12

ω = q.05(3, 6) · s/√b = 4.34 · √(1.8889/4) = 2.98
The Seedling Problem
List the sample means from smallest to
largest.

T̄C = 12   T̄A = 12.5   T̄B = 16.0     (ω = 2.98)

Since the difference between 12 and 12.5 is
less than ω = 2.98, there is no significant
difference between C and A. There is a significant
difference between A and B and between C and B,
however: a significant difference in average growth
occurs only when the soil has been fertilized.
Cautions about Blocking
• A randomized block design should not be used
when treatments and blocks both correspond to
experimental factors of interest to the researcher.
• Remember that blocking may not always be
beneficial.
• Remember that you cannot construct
confidence intervals for individual treatment
means unless it is reasonable to assume that the b
blocks have been randomly selected from a
population of blocks.
An a x b Factorial
Experiment
• A two-way classification that involves
two factors, both of which are of
interest to the experimenter.
• There are a levels of factor A and b levels
of factor B; the experiment is replicated
r times at each factor-level combination.
• The replications allow the experimenter
to investigate the interaction between
factors A and B.
Interaction
• The interaction between two factors A and B is the
tendency for one factor to behave differently, depending
on the particular level setting of the other factor.
• Interaction describes the effect of one factor on the
behavior of the other. If there is no interaction, the two
factors behave independently.

Example
• A drug manufacturer has two
supervisors who work at each of three different
shift times. Do the outputs of the supervisors
behave differently, depending on the particular
shift they are working?
  – No interaction: Supervisor 1 always does better
than 2, regardless of the shift.
  – Interaction: Supervisor 1 does better earlier in
the day, while supervisor 2 does better at night.

The a x b Factorial
Experiment
• Let xijk be the k-th replication at the i-th
level of A and the j-th level of B.
  – i = 1, 2, …, a;  j = 1, 2, …, b;  k = 1, 2, …, r
• The total variation in the experiment is
measured by the total sum of squares:

Total SS = Σ(xijk − x̄)²
The Analysis of Variance
The Total SS is divided into 4 parts:
• SSA (sum of squares for factor A): measures the
variation among the means for factor A
• SSB (sum of squares for factor B): measures the
variation among the means for factor B
• SS(AB) (sum of squares for interaction): measures
the variation among the ab combinations of factor
levels
• SSE (sum of squares for error): measures
experimental error

in such a way that:

Total SS = SSA + SSB + SS(AB) + SSE
Computing Formulas
CM = G²/n  where G = Σxijk
Total SS = Σxijk² − CM
SSA = ΣAi²/(br) − CM  where Ai = total for level i of A
SSB = ΣBj²/(ar) − CM  where Bj = total for level j of B
SS(AB) = ΣABij²/r − CM − SSA − SSB
  where ABij = total for level i of A and level j of B
SSE = Total SS − SSA − SSB − SS(AB)
The Drug Manufacturer
• Each supervisor works at each of
three different shift times and the shift's
output is measured on three randomly
selected days.
Supervisor Day Swing Night Ai
1 571 480 470 4650
610 474 430
625 540 450
2 480 625 630 5238
516 600 680
465 581 661
Bj 3267 3300 3321 9888
The ANOVA Table
Total df = n − 1 = abr − 1
Factor A df = a − 1              MSA = SSA/(a − 1)
Factor B df = b − 1              MSB = SSB/(b − 1)
Interaction df = (a − 1)(b − 1)  MS(AB) = SS(AB)/((a − 1)(b − 1))
Error df = by subtraction = ab(r − 1)   MSE = SSE/(ab(r − 1))

Source       df              SS        MS                        F
A            a − 1           SSA       SSA/(a − 1)               MSA/MSE
B            b − 1           SSB       SSB/(b − 1)               MSB/MSE
Interaction  (a − 1)(b − 1)  SS(AB)    SS(AB)/((a − 1)(b − 1))   MS(AB)/MSE
Error        ab(r − 1)       SSE       SSE/(ab(r − 1))
Total        abr − 1         Total SS
The Drug Manufacturer
• We generate the ANOVA table using
Minitab (Stat → ANOVA → Two-way).
Two-way ANOVA: Output versus Supervisor, Shift

Source DF SS MS F P
Supervisor 1 19208 19208.0 26.68 0.000
Shift 2 247 123.5 0.17 0.844
Interaction 2 81127 40563.5 56.34 0.000
Error 12 8640 720.0
Total 17 109222

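The Minitab sums of squares can be reproduced by hand with the factorial computing formulas; a sketch (NumPy assumed) where x[i, j, :] holds the r = 3 replications for supervisor i on shift j:

```python
import numpy as np

# Output data: supervisors × shifts (Day, Swing, Night) × 3 replications.
x = np.array([[[571, 610, 625], [480, 474, 540], [470, 430, 450]],
              [[480, 516, 465], [625, 600, 581], [630, 680, 661]]])
a, b, r = x.shape
n, G = x.size, x.sum()                     # n = 18, G = 9888

CM = G**2 / n
total_ss = (x**2).sum() - CM
SSA = (x.sum(axis=(1, 2))**2).sum() / (b*r) - CM        # supervisor totals A_i
SSB = (x.sum(axis=(0, 2))**2).sum() / (a*r) - CM        # shift totals B_j
SSAB = (x.sum(axis=2)**2).sum() / r - CM - SSA - SSB    # cell totals AB_ij
SSE = total_ss - SSA - SSB - SSAB

print(total_ss, SSA, SSB, SSAB, SSE)
# → 109222, 19208, 247, 81127, 8640, matching the Minitab output.
```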
Tests for a Factorial
Experiment
• We can test for the significance of both
factors and the interaction using F-tests from
the ANOVA table.
• Remember that σ² is the common variance
for all ab factor-level combinations. MSE is
the best estimate of σ², whether or not H0
is true.
• Other factor means will be judged to be
significantly different if their mean square is
large in comparison to MSE.
Tests for a Factorial
Experiment
• The interaction is tested first using F =
MS(AB)/MSE.
• If the interaction is not significant, the main
effects A and B can be individually tested
using F = MSA/MSE and F = MSB/MSE,
respectively.
• If the interaction is significant, the main
effects are NOT tested, and we focus on the
differences in the ab factor-level means.
The Drug Manufacturer
Two-way ANOVA: Output versus Supervisor, Shift

Source DF SS MS F P
Supervisor 1 19208 19208.0 26.68 0.000
Shift 2 247 123.5 0.17 0.844
Interaction 2 81127 40563.5 56.34 0.000
Error 12 8640 720.0
Total 17 109222

The test statistic for the interaction is F = 56.34 with


p-value = .000. The interaction is highly significant,
and the main effects are not tested. We look at the
interaction plot to see where the differences lie.

The Drug Manufacturer
[Interaction plot (data means) for Output: Mean vs. Shift,
one line per Supervisor]

Supervisor 1 does better earlier in the day,
while supervisor 2 does better at night.
Revisiting the
ANOVA Assumptions
1. The observations within each population are
normally distributed with a common variance σ².
2. Assumptions regarding the sampling
procedures are specified for each design.

• Remember that ANOVA procedures are fairly
robust when sample sizes are equal and when the
data are fairly mound-shaped.
Diagnostic Tools
•Many computer programs have graphics
options that allow you to check the
normality assumption and the assumption of
equal variances.
1. Normal probability plot of residuals
2. Plot of residuals versus fit or
residuals versus variables
Residuals
•The analysis of variance procedure takes
the total variation in the experiment and
partitions out amounts for several important
factors.
• The "leftover" variation in each data point
is called the residual or experimental error.
• If all assumptions have been met, these
residuals should be normal, with mean 0
and variance σ².
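For a completely randomized design, the residual is each observation minus its own treatment mean, and the residual sum of squares equals SSE. A small sketch (NumPy assumed) using the breakfast data from earlier in the chapter:

```python
import numpy as np

# Breakfast attention-span data; one array per treatment group.
groups = [np.array([8, 7, 9, 13]),     # no breakfast
          np.array([14, 16, 12, 17]),  # light breakfast
          np.array([10, 12, 16, 15])]  # full breakfast

# Residual = observation minus its treatment mean.
residuals = np.concatenate([g - g.mean() for g in groups])

print(round(residuals.mean(), 10))      # mean 0, as the assumptions require
print(round((residuals**2).sum(), 4))   # → 58.25, the SSE from the ANOVA
```

These residuals are what the normal probability plot and residuals-versus-fits plot below are drawn from.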
Normal Probability Plot

If the normality assumption is valid, the
plot should resemble a straight line,
sloping upward to the right.
If not, you will often see the pattern fail
in the tails of the graph.

[Normal probability plot of the residuals (response is Growth):
Percent vs. Residual]
Residuals versus Fits

IfIf the
the equal
equal variance
variance assumption
assumption isis valid,
valid,
the
the plot
plot should
should appear
appear asas aa random
random
scatter
scatter around
around the
the zero
zero center
center line.
line.
Residuals Versus the Fitted Values
(response is Growth)


IfIf not,
not, you
you will
will see
see aa pattern
1.5

1.0 pattern in in the


the
residuals.
residuals. 0.5
Residual

0.0

-0.5

-1.0

-1.5

-2.0
10 12 14 16 18 20
Fitted Value

Copyright ©2006 Brooks/Cole


A division of Thomson Learning, Inc.
Some Notes
•Be careful to watch for responses that are
binomial percentages or Poisson counts. As
the mean changes, so does the variance.
Binomial p̂:  Mean = p;  Variance = pq/n
Poisson x:   Mean = μ;  Variance = μ
•Residual plots will show a pattern that
mimics this change.
Some Notes
•Watch for missing data or a lack of
randomization in the design of the
experiment.
•Randomized block designs with missing
values and factorial experiments with
unequal replications cannot be analyzed
using the ANOVA formulas given in this
chapter.
• Use multiple regression analysis (Chapter
13) instead.
Key Concepts
I. Experimental Designs
1. Experimental units, factors, levels, treatments, response
variables.
2. Assumptions: Observations within each treatment group
must be normally distributed with a common variance σ².
3. One-way classification—completely randomized design:
Independent random samples are selected from each of
k populations.
4. Two-way classification—randomized block design:
k treatments are compared within b blocks.
5. Two-way classification—a × b factorial experiment:
Two factors, A and B, are compared at several levels.
Each factor-level combination is replicated r times
to allow for the investigation of an interaction between the
two factors.
Key Concepts
II. Analysis of Variance
1. The total variation in the experiment is divided into
variation (sums of squares) explained by the various
experimental factors and variation due to experimental
error (unexplained).
2. If there is an effect due to a particular factor, its
mean square (MS = SS/df) is usually large and F =
MS(factor)/MSE is large.
3. Test statistics for the various experimental factors are
based on F statistics, with appropriate degrees of freedom
(df2 = error degrees of freedom).

Key Concepts
III. Interpreting an Analysis of Variance
1. For the completely randomized and randomized block
design, each factor is tested for significance.
2. For the factorial experiment, first test for a significant
interaction. If the interaction is significant, main effects
need not be tested. The nature of the difference in the
factor-level combinations should be further examined.
3. If a significant difference in the population means is found,
Tukey’s method of pairwise comparisons or a similar method
can be used to further identify the nature of the difference.
4. If you have a special interest in one population mean or the
difference between two population means, you can use a
confidence interval estimate. (For randomized block
design, confidence intervals do not provide estimates for
single population means).
Key Concepts
IV. Checking the Analysis of Variance Assumptions
1. To check for normality, use the normal probability plot for
the residuals. The residuals should exhibit a straight-line
pattern, sloping upward to the right.
2. To check for equality of variance, use the residuals versus
fit plot. The plot should exhibit a random scatter, with the
same vertical spread around the horizontal “zero error
line.”

