Chapt 12 ANOVA BBA 2k23
Characteristics of F-Distribution
The F distribution is continuous, i.e., it can assume an infinite number of values between zero and positive infinity.
It is positively skewed. The long tail of the distribution is to the right-hand side.
Analysis of Variance (ANOVA)
Another use of the F distribution is the analysis of variance (ANOVA) technique, in which
we compare three or more population means to determine whether they are equal.
That is, ANOVA is used to determine whether there are any
statistically significant differences between the means of three or more independent
(unrelated) groups.
To use ANOVA, we assume the following:
1. The populations follow the normal distribution.
2. The populations have equal standard deviations (σ).
3. The populations are independent.
Why ANOVA?
Suppose we have four different methods (A, B, C, and D) of training new recruits to be
firefighters. We randomly assign each of the 40 recruits to one of the four methods. At the
end of the training program, we administer to the four groups a common test to measure
understanding of firefighting techniques. The question is: Is there a difference in the mean
test scores among the four groups? An answer to this question will allow us to compare the
four training methods.
1. Using the t distribution to compare the four population means, we would have to
conduct six different t tests, one for each pair of methods:
A versus B (AB), AC, AD, BC, BD, and C versus D (CD).
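The count of six follows from choosing 2 methods out of 4. A minimal sketch of that enumeration, assuming only the method labels given above (the variable names are illustrative, not part of the original slides):

```python
from itertools import combinations
from math import comb

# The four training methods named in the example.
methods = ["A", "B", "C", "D"]

# Every unordered pair of methods needs its own t test.
pairs = ["".join(pair) for pair in combinations(methods, 2)]

# Number of pairwise tests = C(4, 2).
n_tests = comb(len(methods), 2)

print(pairs)    # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
print(n_tests)  # 6
```

The number of pairwise tests grows quickly with the number of groups: five methods would already require C(5, 2) = 10 tests.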
2. Increase in error
If we set the significance level at .05, the probability of a correct statistical decision on a
single test (not rejecting a true null hypothesis) is 1 - .05 = .95.
Because we conduct six separate (independent) tests, the probability that we make no
incorrect decision due to sampling in any of the six tests is:
P(All correct) = (.95)(.95)(.95)(.95)(.95)(.95) = .735
To find the probability of at least one error due to sampling, we subtract this result from 1. Thus, the
probability of at least one incorrect decision due to sampling is
1 - .735 = .265.
Thus by conducting six independent tests using the t distribution, the likelihood of rejecting a true null
hypothesis because of sampling error is increased from .05 to an unsatisfactory level of .265.
ANOVA will allow us to compare the treatment means simultaneously, without running multiple tests,
and avoid the buildup of the Type I error.
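The arithmetic above can be checked with a short sketch. It assumes, as the text does, that the six tests are independent and each uses a significance level of .05. The function name `familywise_error` is a hypothetical helper introduced here for illustration.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n_tests
    independent tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests

# Probability that all six tests reach a correct decision.
p_all_correct = (1 - 0.05) ** 6

# Probability of at least one incorrect decision due to sampling.
p_at_least_one = familywise_error(0.05, 6)

print(round(p_all_correct, 3))   # 0.735
print(round(p_at_least_one, 3))  # 0.265
```

With a single test, `familywise_error(0.05, 1)` returns the original .05 level, which shows that the increase comes entirely from running several tests.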