Unit 4 Stats
Hypothesis Testing:
A hypothesis test is a formal statistical test we use to reject or fail to reject a statistical hypothesis.
The null hypothesis, denoted as H0, is the hypothesis that the sample data occurs purely from
chance.
The alternative hypothesis, denoted as H1 or Ha, is the hypothesis that the sample data is influenced
by some non-random cause.
1. State the null and alternative hypotheses. These two hypotheses need to be mutually exclusive, so if
one is true then the other must be false.
2. Decide on a significance level to use for the test. Common choices are .01, .05, and .1.
3. Find the test statistic and the corresponding p-value. Often we are analyzing a population mean or
proportion and the general formula to find the test statistic is: (sample statistic – population
parameter) / (standard deviation of statistic)
4. Using the test statistic or the p-value, determine if you can reject or fail to reject the null hypothesis
based on the significance level.
The p-value tells us the strength of evidence against the null hypothesis: the smaller the p-value,
the stronger the evidence. If the p-value is less than the significance level, we reject the null hypothesis.
5. Interpret the results of the hypothesis test in the context of the question being asked.
There are two types of decision errors that one can make when doing a hypothesis test:
Type I error: You reject the null hypothesis when it is actually true. The probability of committing a
Type I error is equal to the significance level, often called alpha, and denoted as α.
Type II error: You fail to reject the null hypothesis when it is actually false. The probability of
committing a Type II error is called beta, denoted as β. The power of the test is 1 – β, the
probability of correctly rejecting a false null hypothesis.
A one-tailed hypothesis involves making a “greater than” or “less than” statement. For example,
suppose we assume the mean height of a male in the U.S. is greater than or equal to 70 inches. The
null hypothesis would be H0: µ ≥ 70 inches and the alternative hypothesis would be Ha: µ < 70 inches.
A two-tailed hypothesis involves making an “equal to” or “not equal to” statement.
For example, suppose we assume the mean height of a male in the U.S. is equal to 70 inches. The
null hypothesis would be H0: µ = 70 inches and the alternative hypothesis would be Ha: µ ≠ 70
inches.
Note: The “equal” sign is always included in the null hypothesis, whether it is =, ≥, or ≤.
Suppose we want to know whether or not the mean weight of a certain species of turtle in Florida is
equal to 310 pounds. Since there are thousands of turtles in Florida, it would be extremely
time-consuming and costly to go around and weigh each individual turtle.
Instead, we might take a simple random sample of 40 turtles and use the mean weight of the turtles
in this sample to estimate the true population mean.
However, it’s virtually guaranteed that the mean weight of turtles in our sample will differ from 310
pounds. The question is whether or not this difference is statistically significant. Fortunately, a one
sample t-test allows us to answer this question.
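H0: μ = μ0 (population mean is equal to some hypothesized value μ0)
H1 (two-tailed): μ ≠ μ0 (population mean is not equal to some hypothesized value μ0)
H1 (left-tailed): μ < μ0 (population mean is less than some hypothesized value μ0)
H1 (right-tailed): μ > μ0 (population mean is greater than some hypothesized value μ0)
t = (x̄ – μ0) / (s/√n)
where:
x̄: sample mean
μ0: hypothesized population mean
s: sample standard deviation
n: sample size
df = n – 1: degrees of freedom. Degrees of freedom describe the number of scores in a sample that are
independent and free to vary. Because the sample mean places a restriction on the value of one
score in the sample, there are n – 1 degrees of freedom for a sample with n scores.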
If the p-value that corresponds to the test statistic t with (n-1) degrees of freedom is less than your
chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null
hypothesis.
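The one-sample t statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not a full test: the function name one_sample_t and the weights are made up for the example, and the p-value would still need to be looked up for the resulting t and df.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)            # sample SD (n - 1 in the denominator)
    t = (mean - mu0) / (s / math.sqrt(n))   # t = (x-bar - mu0) / (s / sqrt(n))
    return t, n - 1


# Hypothetical turtle weights in pounds, invented for illustration only.
weights = [300, 315, 320, 311, 314, 309, 300, 308, 305, 303, 305, 301]
t_stat, df = one_sample_t(weights, mu0=310)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative t means the sample mean is below the hypothesized value; the size of |t| relative to the t distribution with n – 1 degrees of freedom determines the p-value.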
For the results of a one sample t-test to be valid, the following assumptions should be met:
EXAMPLE:
Suppose we want to know whether or not the mean weight of a certain species of turtle is equal to
310 pounds. To test this, we will perform a one-sample t-test at significance level α = 0.05 using the
following steps:
Sample size n = 40
According to the T Score to P Value Calculator, the p-value associated with t = -3.4817 and degrees of
freedom = n-1 = 40-1 = 39 is 0.00149.
Since this p-value is less than our significance level α = 0.05, we reject the null hypothesis. We have
sufficient evidence to say that the mean weight of this turtle species is not equal to 310 pounds.
REPORTING RESULTS:
For example: United fans reported higher levels of stress (M = 83, SD = 5) than found in the
population as a whole, t(48) = 2.3, p = .026.
The first table displays summary statistics for the variable height:
Std. Deviation: The standard deviation of the height of plants in the sample.
Std. Error Mean: The standard error of the mean, calculated as s/√n
The second table displays the results of the one sample t-test:
Sig. (2-tailed): The two-sided p-value that corresponds to a t value of -1.685 with df=11
Mean Difference: The difference between the sample mean and the hypothesized mean
95% C.I. of the Difference: The 95% confidence interval for the true difference between the sample
mean and the hypothesized mean
Since the p-value of the test (.120) is not less than 0.05, we fail to reject the null hypothesis. We do
not have sufficient evidence to say that the true mean height of this species of plant is different
from 15 inches.
1. A measurement is taken on a subject before and after some treatment – e.g. the max vertical jump
of college basketball players is measured before and after participating in a training program.
2. A measurement is taken under two different conditions – e.g. the response time of a patient is
measured on two different drugs.
In both cases we are interested in comparing the mean measurement between two groups in which
each observation in one sample can be paired with an observation in the other sample.
t = x̄diff / (sdiff/√n)
where:
x̄diff: sample mean of the differences
sdiff: sample standard deviation of the differences
n: sample size (i.e. number of pairs)
If the p-value that corresponds to the test statistic t with (n-1) degrees of freedom is less than your
chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null
hypothesis.
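As a rough sketch of the paired formula, the code below computes t from the per-subject differences. The function name paired_t and the jump heights are hypothetical, and the sign convention (before minus after) is an assumption chosen so that an improvement gives a negative t.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired samples t-test on matched observations."""
    if len(before) != len(after):
        raise ValueError("each 'before' score needs a matching 'after' score")
    # Assumed convention: difference = before - after.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical max vertical jumps (inches), invented for illustration only.
jump_before = [22, 24, 20, 19, 25, 21]
jump_after = [23, 25, 20, 21, 26, 24]
t_stat, df = paired_t(jump_before, jump_after)
print(f"t = {t_stat:.3f}, df = {df}")
```

Only the differences enter the calculation, which is why the paired test has n – 1 degrees of freedom, where n is the number of pairs.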
Suppose we want to know whether or not a certain training program is able to increase the max
vertical jump (in inches) of college basketball players.
To test this, we may recruit a simple random sample of 20 college basketball players and measure
each of their max vertical jumps. Then, we may have each player use the training program for one
month and then measure their max vertical jump again at the end of the month.
To determine whether or not the training program actually had an effect on max vertical jump, we
will perform a paired samples t-test at significance level α = 0.05 using the following steps:
We will perform the paired samples t-test with the following hypotheses:
According to the T Score to P Value Calculator, the p-value associated with t = -3.226 and degrees of
freedom = n-1 = 20-1 = 19 is 0.00445.
Since this p-value is less than our significance level α = 0.05, we reject the null hypothesis. We have
sufficient evidence to say that the mean max vertical jump of players is different before and after
participating in the training program.
Here is an example of how to report paired sample t-test results in APA format:
"A paired samples t-test was conducted to examine the effect of the intervention on the dependent
variable. The mean score on the dependent variable before the intervention was M = 4.2 (SD = 1.2),
and the mean score after the intervention was M = 5.3 (SD = 1.1), t(20) = -3.52, p < .05, Cohen's d =
1.12. These results indicate a significant increase in the dependent variable scores after the
intervention. Therefore, the null hypothesis was rejected, and
the intervention had a significant effect on the dependent variable."
The Wilcoxon Signed Rank Test is the non-parametric version of the paired t-test. It is used to test
whether or not there is a significant difference between two paired samples, based on the median of
the paired differences rather than the mean.
Use the Wilcoxon Signed Rank test when you would like to use the paired t-test but the distribution
of the differences between the pairs is severely non-normally distributed.
The easiest way to determine if the differences are non-normally distributed is to create a histogram
of the differences and see if they follow a somewhat normal, “bell-shaped” distribution.
Keep in mind that the paired t-test is fairly robust to departures from normality, so the deviation
from a normal distribution needs to be pretty severe to justify the use of the Wilcoxon Signed Rank
test.
Hypotheses-
HA: The median difference is negative. (e.g. the players make fewer free throws before participating in
the training program)
Assumptions –
1. The data are paired and come from the same population.
2. The differences between the paired observations are independent of each other.
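To make the idea of a signed-rank test concrete, here is a minimal sketch of how the rank sums are usually formed: rank the absolute differences, then add up the ranks separately for positive and negative differences. The function name and data are hypothetical, zero differences are dropped, tied values share an average rank, and the difference is taken as before minus after; all of these are assumptions of this sketch, and no p-value is computed.

```python
def signed_rank_sums(before, after):
    """Return (w_plus, w_minus): rank sums of positive and negative differences."""
    # Assumed convention: difference = before - after; zero differences are dropped.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value) + 1       # 1-based position of first occurrence
        count = abs_sorted.count(value)
        rank_of[value] = first + (count - 1) / 2  # average rank for tied values
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus


# Hypothetical free throws made before and after training (illustration only).
made_before = [10, 12, 14, 16]
made_after = [11, 14, 13, 20]
print(signed_rank_sums(made_before, made_after))
```

If most of the rank total sits on the negative side, the "before" scores tend to be lower than the "after" scores, which matches the alternative hypothesis of a negative median difference.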
Reporting results-
Reporting the results of the Wilcoxon signed-rank test should include the following information:
4. The p-value.
For example:
"A Wilcoxon signed-rank test was conducted to determine if there was a significant difference in
scores on a pretest and posttest for the same sample of participants. The null hypothesis that there is
no significant difference in scores was rejected in favor of the alternative hypothesis (Z = -2.50, p
= .013), indicating a significant difference in scores. The median difference in scores was 3.5 (Mdn =
3.5, IQR = 2), indicating that the posttest scores were significantly higher than the pretest scores.
Therefore, we conclude that there is evidence to support the hypothesis that the intervention had a
positive effect on scores."
Assumptions
Continuous
Normally Distributed
Random Sample
Enough Data
t = (M1 - M2) / SE
where M1 is the mean of group 1, M2 is the mean of group 2, and SE is the standard error of the
difference between the two means.
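The formula above leaves SE unspecified. One common choice, assumed here, is the pooled standard error, which treats the two groups as having equal variances. The sketch below uses that assumption and also returns Cohen's d based on the pooled standard deviation; the function name and scores are hypothetical.

```python
import math
import statistics


def independent_t_pooled(group1, group2):
    """Return (t, df, cohens_d) using a pooled (equal-variance) standard error."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference in means
    t = (m1 - m2) / se
    d = (m1 - m2) / math.sqrt(pooled_var)           # Cohen's d with pooled SD
    return t, df, d


# Hypothetical scores for two independent groups (illustration only).
scores_1 = [3.1, 2.8, 4.0, 3.5, 2.9]
scores_2 = [5.2, 4.8, 5.5, 4.9, 5.1]
t_stat, df, d = independent_t_pooled(scores_1, scores_2)
print(f"t({df}) = {t_stat:.2f}, d = {d:.2f}")
```

If the equal-variance assumption is doubtful, a different SE would be needed and the degrees of freedom would change, so this sketch should not be applied blindly.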
When reporting the results of an independent samples t-test in APA format, you should include the
following information:
Descriptive statistics: Report the means and standard deviations for each group.
Test statistic: Report the t-value, degrees of freedom, and p-value for the independent samples t-test.
Effect size: Report the effect size, such as Cohen's d, to indicate the magnitude of the difference
between the means.
Conclusion: Report whether the null hypothesis was rejected or not, and interpret the results in
terms of the research question.
REPORTING RESULTS:
"An independent samples t-test was conducted to compare the mean scores on the dependent
variable between Group 1 (M = 3.2, SD = 1.1) and Group 2 (M = 5.1, SD = 0.9). The results indicated a
significant difference between the groups, t(28) = -4.23, p < .001, Cohen's d = 1.05. These results
suggest that Group 1 had significantly lower scores on the dependent variable
than Group 2. Therefore, the null
hypothesis was rejected, and the difference between the two groups was statistically significant."
                     Group 1   Group 2   t-value   df   p-value
Dependent variable
The Mann-Whitney U Test is a statistical test used to determine if 2 groups are significantly different
from each other on your variable of interest.
Continuous
Random Sample
Enough Data
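One way to see what the U statistic measures is to count, over every pair of observations drawn one from each group, how often the first group's value is larger (ties count as half). The sketch below does exactly that; the function name and reaction times are hypothetical, and reporting the smaller of the two counts as U is one common convention, assumed here.

```python
def mann_whitney_u(group_a, group_b):
    """Return (u, u_a, u_b), where u is the smaller of the two pair counts."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0      # group A value is larger in this pair
            elif a == b:
                u_a += 0.5      # ties count as half
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b), u_a, u_b


# Hypothetical reaction times in ms (illustration only).
times_a = [370, 380, 365, 390]
times_b = [420, 430, 385, 440]
print(mann_whitney_u(times_a, times_b))
```

A U close to 0 means the two groups barely overlap; a U near half the number of pairs means they are thoroughly mixed.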
When reporting the results of a Mann-Whitney U test in APA style, you should include the following
information:
Test used: State that the Mann-Whitney U test was used to compare the medians of two
independent groups
Hypothesis: State the null hypothesis and alternative hypothesis that were tested.
Sample characteristics: Provide descriptive statistics about the two groups, including the sample size,
mean, standard deviation, and any other relevant characteristics.
Test statistics: Report the Mann-Whitney U statistic, as well as the significance level and direction of
the test.
Effect size: Report an effect size measure, such as Cohen's d or the odds ratio.
Confidence intervals: If relevant, report any confidence intervals associated with the test.
Assumptions: Report any assumptions made for the Mann-Whitney U test and whether they were
met.
Here is an example of how to report the results of a Mann-Whitney U test in APA style:
"A Mann-Whitney U test was used to compare the median reaction times of two groups, Group A
(Mdn = 375 ms) and Group B (Mdn = 425 ms). The null hypothesis that there is no difference in the
median reaction times between the two groups was rejected in favor of the alternative hypothesis (U
= 120, p = .015, r = .43). The effect size was medium, with a Cohen's d of .68. The assumptions for the
Mann-Whitney U test were met. These results suggest that Group A had significantly faster reaction
times compared to Group B."
ANOVA
Analysis of variance (ANOVA) is a hypothesis-testing procedure that is used to evaluate mean
differences between two or more treatments (or populations). As with all inferential procedures,
ANOVA uses sample data as the basis for drawing general conclusions about populations. It may
appear that ANOVA and t tests are simply two different ways of doing exactly the same job: testing
for mean differences. In some respects, this is true: both tests use sample data to test hypotheses
about population means. However, ANOVA has a tremendous advantage over t tests. Specifically, t
tests are limited to situations in which there are only two treatments to compare. The major
advantage of ANOVA is that it can be used to compare two or more treatments. Thus, ANOVA
provides researchers with much greater flexibility in designing experiments and interpreting results.
1. There really are no differences between the populations (or treatments). The observed differences
between the sample means are caused by random, unsystematic factors (sampling error) that
differentiate one sample from another. (Null hypothesis)
2. The populations (or treatments) really do have different means, and these population mean
differences are responsible for causing systematic differences between the sample means.
(Alternative hypothesis)
The structure of this design is shown in Figure 12.2. Notice that the study uses two factors, one
independent-measures factor and one repeated-measures factor: 1. Factor 1: Therapy technique. A
separate group is used for each technique (independent measures). 2. Factor 2: Time. Each group is
tested at three different times (repeated measures). In this case, the ANOVA would evaluate mean
differences between the two therapies as well as mean differences between the scores obtained at
different times. A study that combines two factors, like the one in Figure 12.2, is called a two-factor
design or a factorial design. The ability to combine different factors and to mix different designs
within one study provides researchers with the flexibility to develop studies that address scientific
questions that could not be answered by a single design using a single factor.
Suppose we want to know whether or not three different exam prep programs lead to different
mean scores on a college entrance exam. Since there are millions of high school students around the
country, it would be too time-consuming and costly to go around to each student and let them use
one of the exam prep programs.
Instead, we might select three random samples of 100 students from the population and allow each
sample to use one of the three test prep programs to prepare for the exam. Then, we could record
the scores for each student once they take the exam.
However, it’s virtually guaranteed that the mean exam score between the three samples will be at
least a little different. The question is whether or not this difference is statistically significant.
Fortunately, a one-way ANOVA allows us to answer this question.
Assumptions:
1. Normality – Each sample was drawn from an approximately normally distributed population.
2. Equal Variances – The variances of the populations that the samples come from are equal. You can
use Bartlett’s Test to verify this assumption.
3. Independence – The observations in each group are independent of each other and the
observations within groups were obtained by a random sample.
H0 (null hypothesis): μ1 = μ2 = μ3 = … = μk (all the population means are equal)
H1 (alternative hypothesis): at least one population mean is different from the rest
Example-
Suppose we want to know whether or not three different exam prep programs lead to different
mean scores on a certain exam. To test this, we recruit 30 students to participate in a study and split
them into three groups. The students in each group are randomly assigned to use one of the three
exam prep programs for the next three weeks to prepare for an exam. At the end of the three weeks,
all of the students take the same exam.
From the output table we see that the F test statistic is 2.358 and the corresponding p-value
is 0.11385. Since this p-value is not less than 0.05, we fail to reject the null hypothesis. This
means we don’t have sufficient evidence to say that there is a statistically significant difference
between the mean exam scores of the three groups.
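The F statistic reported in an ANOVA output table is the ratio of between-group to within-group variability. The following sketch shows that calculation under the usual one-way layout; the function name and the exam scores are hypothetical (the scores from the example above are not reproduced here), and the p-value lookup is left out.

```python
import statistics


def one_way_anova_f(groups):
    """Return (f, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical exam scores for three prep programs (illustration only).
program_scores = [
    [85, 86, 88, 75, 78],
    [91, 92, 93, 85, 87],
    [79, 78, 88, 94, 92],
]
f_stat, df_b, df_w = one_way_anova_f(program_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F indicates that the group means differ by more than would be expected from the spread of scores inside each group.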
When reporting the results of a one-way ANOVA in APA format, you should include the following
information:
1. Test used: State that a one-way ANOVA was used to test for differences in means between
three or more groups.
2. Hypothesis: State the null hypothesis and alternative hypothesis that were tested.
3. Sample characteristics: Provide descriptive statistics about the groups, including the sample
size, mean, standard deviation, and any other relevant characteristics.
4. Test statistics: Report the F-value, degrees of freedom, and significance level of the ANOVA.
5. Post-hoc tests: If significant differences were found, report the results of post-hoc tests, such
as Tukey's HSD or Bonferroni tests, to determine which specific group means differ from each
other. For example:
• A post-hoc test for multiple comparisons found that the mean value of [dependent variable] was
significantly different between [group name] and [group name] (p = [p-value]).
• There was no statistically significant difference between [group name] and [group name]
(p = [p-value]).
6. Effect size: Report an effect size measure, such as eta-squared or partial eta-squared. Partial
eta-squared indicates the proportion of the variance in the Dependent Variable (DV) attributable to a
particular Independent Variable (IV). Eta-squared ranges from 0 to 1: η² = 0.01 indicates a small
effect; η² = 0.06 indicates a medium effect; η² = 0.14 indicates a large effect.
7. Assumptions: Report any assumptions made for the ANOVA and whether they were met.
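The effect size guideline in item 6 can be expressed as a small helper. In a one-way design, eta-squared is the between-group sum of squares divided by the total sum of squares. The function names are hypothetical, and the "negligible" label for values below 0.01 is an assumption added for completeness; the other cutoffs are the ones listed above.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability attributable to the grouping factor."""
    return ss_between / ss_total


def effect_size_label(eta2):
    """Map eta-squared onto the small / medium / large guideline."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"  # assumed label for values below the 'small' cutoff


# Hypothetical sums of squares (illustration only).
eta2 = eta_squared(ss_between=6.0, ss_total=12.0)
print(eta2, effect_size_label(eta2))
```

These cutoffs are rules of thumb, so the label should be read alongside the raw value rather than in place of it.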
Here is an example of how to report the results of a one-way ANOVA in APA style:
"A one-way ANOVA was conducted to compare the mean test scores of three groups, Group A (M =
75, SD = 5), Group B (M = 80, SD = 6), and Group C (M = 85, SD = 5). The null hypothesis that there is
no difference in mean test scores between the groups was rejected in favor of the alternative
hypothesis (F(2, 87) = 10.52, p < .001, η² = .20). Post-hoc tests revealed that Group A had
significantly lower mean test scores compared to Group B (p = .03) and Group C (p < .001), while
there was no significant difference between Group B and Group C (p = .15). The assumptions for the
ANOVA were met. These results suggest that Group A performed significantly worse on the test
compared to Group B and Group C, while there was no significant difference between Group B and
Group C."
The Kruskal-Wallis One-Way ANOVA is a statistical test used to determine if 3 or more groups are
significantly different from each other on your variable of interest.
1. Continuous
2. Random Sample
3. Enough Data
6. Skewed
When reporting the results of a Kruskal-Wallis test in APA format, you should include the following
information:
Here is an example of how to report the results of a Kruskal-Wallis test in APA style:
"A Kruskal-Wallis test was conducted to determine if there were significant differences in anxiety
levels between the three treatment groups (H(2) = 6.12, p = .047). Post hoc tests using Dunn's
method revealed that Group A had significantly lower anxiety levels (M = 23.4, SD = 4.2) than Group
B (M = 28.6, SD = 5.1, p = .031), but there were no significant differences between Group A and
Group C (M = 24.8, SD = 3.6, p = .317), or between Group B and Group C (p > .05)."
TWO-WAY ANOVA
A two-way ANOVA (“analysis of variance”) is used to determine whether or not there is a statistically
significant difference between the means of three or more independent groups that have been split
on two variables (sometimes called “factors”), i.e. the effect of two IVs on a single DV.
You should use a two-way ANOVA when you’d like to know how two factors affect a response
variable and whether or not there is an interaction effect between the two factors on the response
variable.
For example, suppose a botanist wants to explore how sunlight exposure and watering frequency
affect plant growth. She plants 40 seeds and lets them grow for two months under different
conditions for sunlight exposure and watering frequency. After two months, she records the height of
each plant.
Is there an interaction effect between sunlight exposure and watering frequency? (e.g. the
effect that sunlight exposure has on the plants is dependent on watering frequency)
We would use a two-way ANOVA for this analysis because we have two factors. If instead we wanted
to know how only watering frequency affected plant growth, we would use a one-way ANOVA since
we would only be working with one factor.
Assumptions
1. Normality – The response variable is approximately normally distributed for each group.
2. Equal Variances – The variances for each group should be roughly equal.
3. Independence – The observations in each group are independent of each other and the
observations within groups were obtained by a random sample.
Example-
A botanist wants to know whether or not plant growth is influenced by sunlight exposure and
watering frequency. She plants 30 seeds and lets them grow for two months under different
conditions for sunlight exposure and watering frequency. After two months, she records the height of
each plant, in inches.
We can see the following p-values for each of the factors in the table:
From the table we can see the p-values for the following comparisons:
This tells us that there is a statistically significant difference between high and low sunlight exposure,
along with high and medium sunlight exposure, but there is no significant difference between low
and medium sunlight exposure.
REPORTING RESULTS
When reporting the results of a two-way ANOVA in APA format, you should include the following
information:
The main effects of the two factors being tested (Factor A and Factor B).
The interaction between Factor A and Factor B.
The F-values and degrees of freedom for each effect.
The p-values for each effect.
A description of the sample size and other relevant details.
Here is an example of how to report the results of a two-way ANOVA in APA style:
"A two-way ANOVA was conducted to examine the effects of Factor A and Factor B on the dependent
variable (DV). There was a significant main effect of Factor A (F(2, 56) = 4.12, p = .022), indicating
that Factor A had a significant effect on the DV. There was also a significant main effect of Factor B
(F(3, 56) = 7.61, p = .001), indicating that Factor B had a significant effect on the DV. Additionally,
there was a significant interaction between Factor A and Factor B (F(6, 56) = 5.78, p = .005),
indicating that the effects of Factor A on the DV depended on the levels of Factor B, and vice versa.
Post-hoc tests using Tukey's HSD revealed that Group 1 had significantly higher scores on the DV (M =
36.2, SD = 3.2) than Group 2 (M = 28.7, SD = 2.5, p = .001), and Group 3 (M = 31.5, SD = 4.1, p = .022).
No other significant differences were found between the groups (p > .05). The sample consisted of 60
participants, with an equal number of participants in each group."
The repeated measures analysis of variance (ANOVA) is an extension of the paired-samples t-test and
is used to determine whether there are any statistically significant differences between the
population means of three or more related groups. The groups are related as they contain the same
cases (e.g., participants) in each group and each group represents a repeated measurement on the
same dependent variable. This test is also referred to as a within-subjects ANOVA or ANOVA with
repeated measures.
In order to run a repeated measures ANOVA you require the following: One independent variable
that is categorical with three or more related groups (e.g., time: pre-, 1-month, post-intervention).
A repeated measures ANOVA is most often used for the following types of study design:
If you have a study design where you are measuring how a particular variable changes over time in
the same participants and you want to compare three or more time points, a repeated measures
ANOVA might be appropriate. It does not matter what occurs between the time points, so you could
have initiated an intervention, such as a training program, or alternatively, simply measured the
passage of time, as long as you are measuring the same variable at all time points.
If you have a study design where the same participants are being measured on the same variable,
but under three or more different conditions, a repeated measures ANOVA might be appropriate. In
other words, participants are performing a cross-over design by receiving all conditions. These can
either be short-term conditions, such as reaction times in a 10-second period under three different
lighting conditions (e.g., blue vs. red vs. green light), or longer-term conditions, such as a six week
control, exercise-training or dietary program with cholesterol concentration measured at the end of
each trial.
If you have a study design where the same participants have performed three or more different
interventions (e.g., control/intervention 1/intervention 2), the same continuous dependent variable
is measured at the beginning and end of each intervention in all groups, and a change score
calculated (i.e., post-values minus pre-values), a repeated measures ANOVA might be appropriate.
If you have a study design where the same participants are being measured on a different variable,
but using the same measurement scale, a repeated measures ANOVA might be appropriate.
ASSUMPTIONS
Assumption #1: You have one dependent variable that is measured at the continuous (i.e.,
ratio or interval) level.
Assumption #2: Your independent variable is categorical with three or more separate
measurements of the same participants.
Assumption #3: There should be no significant outliers in any of the measurements of the
participants, meaning each measurement should be assessed separately.
Assumption #4: Your dependent variable should be approximately normally distributed for
each measurement of the independent variable.
Assumption #5: The variances of the differences between related groups are equal (the
assumption of sphericity). This assumption is similar to the homogeneity of variances for
separate groups that you tested for in the between subjects ANOVA. However, this
assumption investigates if the variances of the difference scores between pairs of levels are
the same. Therefore, if you had three measurements (levels), the variance of the difference
between measurement 1 and measurement 2 should be the same as the variance of the
difference between measurement 1 and 3 and measurement 2 and 3.
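The quantities that the sphericity assumption compares can be computed directly. The sketch below forms the difference scores for every pair of measurements and returns their sample variances; it is only a descriptive check, not a formal test of sphericity. The function name, the level names and the scores are hypothetical.

```python
from itertools import combinations
import statistics


def difference_variances(measurements):
    """Variance of the difference scores for every pair of repeated measurements.

    `measurements` maps a level name to a list of scores, with the same
    participants in the same order in every list.
    """
    result = {}
    for (name_a, a), (name_b, b) in combinations(measurements.items(), 2):
        diffs = [x - y for x, y in zip(a, b)]
        result[(name_a, name_b)] = statistics.variance(diffs)
    return result


# Hypothetical scores for five participants at three levels (illustration only).
scores = {
    "level_1": [4.0, 5.5, 3.8, 6.1, 5.0],
    "level_2": [3.6, 5.0, 3.9, 5.2, 4.4],
    "level_3": [3.1, 4.2, 3.0, 4.9, 3.8],
}
for pair, var in difference_variances(scores).items():
    print(pair, round(var, 3))
```

If the three variances are of similar size, the data are at least roughly in line with the assumption; widely different values are a warning sign.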
Hypotheses
H0: all related group means are equal (i.e. µ1 = µ2 = µ3 = ... = µk). There are no differences
between TIME1/CONDITION1, TIME2/CONDITION2, and TIME3/CONDITION3 on the dependent
variable.
HA: at least one related group mean is different (i.e. they are not all the same). There are
differences between TIME1/CONDITION1, TIME2/CONDITION2, and TIME3/CONDITION3 on the
dependent variable.
Example-
A researcher wishes to understand how exercise might reduce heart disease. They wish to
concentrate on a protein called C-Reactive Protein (CRP) that is a marker of chronic inflammation
in the body and associated with heart disease: the greater the concentration of CRP, the greater
the risk of heart disease. Regular exercise reduces the risk of heart disease. The researcher
would, therefore, like to know whether exercise has an effect on CRP concentration because this
might indicate that exercise has an anti-inflammatory effect. To test this theory out, the
researcher recruits 10 subjects to undergo a 6-month exercise-training program and CRP
concentration is measured pre-, mid-way (3-months) and immediately post-intervention. The
CRP concentrations pre-intervention were recorded in the crp_pre variable, the CRP
concentrations mid-way in the crp_mid variable, and the post-intervention CRP concentrations in
the crp_post variable. The researcher would like to know whether there are changes in CRP
concentration over time. In variable terms, the researcher would like to know if there are
differences between the three variables, crp_pre, crp_mid and crp_post.
When reporting the results of a repeated measures ANOVA in APA format, you should include
the following information:
Here is an example of how to report the results of a repeated measures ANOVA in APA style:
"A repeated measures ANOVA was conducted to examine the effect of the within-subjects factor
on the dependent variable (DV). There was a significant main effect of the within-subjects factor
(F(2, 56) = 12.87, p = .001), indicating that the within-subjects factor had a significant effect
on the DV. There was also a significant interaction between the within-subjects factor and the
between-subjects factor (F(1, 56) = 4.63, p = .035), indicating that the effect of the
within-subjects factor on the DV depended on the level of the between-subjects factor. Post-hoc tests
using Bonferroni correction revealed that the difference between the mean scores of Group 1 (M
= 26.4, SD = 2.1) and Group 2 (M = 21.6, SD = 1.9) was significant (p = .001), but no other
significant differences were found (p > .05). The sample consisted of 60 participants, with an
equal number of participants in each group."
Source                    SS   df   MS   F    p
Within-Subjects Factor    xx   yy   zz   aa   bb
Between-Subjects Factor   xx   yy   zz   aa   bb
Interaction               xx   yy   zz   aa   bb
Total                     xx   yy   zz   -    -
The Friedman Test is a non-parametric alternative to the Repeated Measures ANOVA. It is used to
determine whether or not there is a statistically significant difference between the means of three or
more groups in which the same subjects show up in each group.
Continuous
Random Sample
Enough Data
Skewed distribution
1. Measuring the mean scores of subjects during three or more time points.
For example, you might want to measure the resting heart rate of subjects one month before they
start a training program, one month after starting the program, and two months after using the
program. You can perform the Friedman Test to see if there is a significant difference in the mean
resting heart rate of patients across these three time points.
For example, you might have subjects watch three different movies and rate each one based on how
much they enjoyed it. Since each subject shows up in each sample, you can perform a Friedman Test
to see if there is a significant difference in the mean rating of the three movies.
Example-
Suppose we want to know if the mean reaction time of subjects is different on three different drugs.
To test this, we recruit 10 patients and measure each of their reaction times (in seconds) on the
three different drugs.
The null hypothesis (H0): µ1 = µ2 = µ3 (the mean reaction times across the populations are all equal)
The alternative hypothesis: (Ha): at least one population mean is different from the rest
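The Friedman statistic is built from ranks rather than raw scores: each subject's scores are ranked across the conditions, the ranks are summed per condition, and the sums are compared. The sketch below uses the standard formula χ² = 12 / (n·k·(k+1)) · ΣRj² – 3·n·(k+1) without a correction for ties, which is a simplifying assumption. The function names and reaction times are hypothetical, not the data from the example above.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Return (chi2, df). `rows` holds one list of k scores per subject."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    return chi2, k - 1


# Hypothetical reaction times (seconds) on three drugs, one row per patient.
reaction_times = [[4.0, 5.1, 6.2], [3.8, 4.9, 5.0], [4.4, 4.1, 5.9], [3.9, 5.3, 5.2]]
chi2, df = friedman_statistic(reaction_times)
print(f"chi2({df}) = {chi2:.2f}")
```

Because only the within-subject ordering matters, large differences between subjects do not affect the statistic.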
When reporting the results of Friedman's test, it is important to include the following information in
APA format:
1. Descriptive statistics: report the median and interquartile range (IQR) for each group.
2. Friedman's test statistic: report the value of the Friedman's test statistic (χ²) and degrees of
freedom.
3. p-value: report the p-value associated with the Friedman's test statistic.
4. Post-hoc tests: if the Friedman's test is statistically significant, follow-up post-hoc tests may
be conducted to determine which groups differ significantly from each other. In this case,
report the results of the post-hoc tests along with appropriate adjustments for multiple
comparisons.
"A Friedman's test was conducted to compare the median scores of three groups on the variable X.
The median scores and interquartile ranges (IQRs) for each group were as follows: group 1, Mdn =
10.0, IQR = 3.0; group 2, Mdn = 12.0, IQR = 2.0; group 3, Mdn = 7.0, IQR = 4.0. The Friedman's test
statistic was significant, χ²(2) = 6.0, p = .05. Post-hoc tests using Dunn's test with Bonferroni
adjustment revealed that there was a significant difference between group 2 and group 3 (p < .05).
No other significant differences were found."
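Two of the reported quantities are easy to compute by hand: the median and IQR of each group, and the Bonferroni adjustment of post-hoc p-values. The sketch below uses only the Python standard library. The helper names and the sample scores are our own assumptions, and the "inclusive" quartile method is one of several conventions, so other software may report a slightly different IQR.

```python
import statistics

def median_and_iqr(values):
    """Return (median, IQR) using the 'inclusive' quartile convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical scores for one group.
group_scores = [8, 9, 10, 10, 11, 12, 13]
mdn, iqr = median_and_iqr(group_scores)
print(f"Mdn = {mdn}, IQR = {iqr}")

# Hypothetical unadjusted p-values for the three pairwise comparisons.
print(bonferroni([0.01, 0.04, 0.50]))
```

A comparison is reported as significant only if its adjusted p-value is below the significance level.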
MULTIVARIATE ANOVA
ANCOVA-
The One-Way ANCOVA is a statistical test used to determine if 3 or more groups are significantly
different from each other on your variable of interest while accounting for the effect of another
variable (called a covariate).
Continuous
Normally Distributed
Random Sample
Enough Data
EXAMPLE-
We split up a class of 90 students into three groups of 30. Each group uses a different studying
technique for one month to prepare for an exam. At the end of the month, all of the students take
the same exam.
• Question: Is there a significant effect of studying technique on exam results after controlling
for the covariate?
• We want to know whether or not the studying technique has an impact on exam scores.
• Here we are also accounting for the grade that the student already has in the class, so we use
their current grade as a covariate.
• Use ANCOVA to determine if there is a statistically significant difference between the mean
scores of the three groups after controlling for the effect of the covariate (current grade).
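To make the idea of "adjusting for a covariate" concrete, here is a minimal sketch of the classic one-way ANCOVA F computation with a single covariate. It assumes equal regression slopes in all groups. The function name and the small data set (current grade, exam score) are hypothetical, and the p-value is left out because it needs the F distribution from a statistics package.

```python
def one_way_ancova(groups):
    """Return (F, df_effect, df_error) for a one-way ANCOVA with one covariate.

    groups is a list of groups; each group is a list of (covariate, outcome) pairs.
    """
    def sums_about_mean(pairs):
        n = len(pairs)
        mx = sum(x for x, _ in pairs) / n
        my = sum(y for _, y in pairs) / n
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        return sxx, syy, sxy

    all_pairs = [pair for g in groups for pair in g]
    n_total, k = len(all_pairs), len(groups)

    # Sums of squares and products about the grand means.
    sxx_t, syy_t, sxy_t = sums_about_mean(all_pairs)

    # Pooled within-group sums of squares and products.
    sxx_w = syy_w = sxy_w = 0.0
    for g in groups:
        sxx, syy, sxy = sums_about_mean(g)
        sxx_w, syy_w, sxy_w = sxx_w + sxx, syy_w + syy, sxy_w + sxy

    # Outcome variation left over after removing the covariate's linear effect.
    adj_total = syy_t - sxy_t ** 2 / sxx_t
    adj_within = syy_w - sxy_w ** 2 / sxx_w
    adj_between = adj_total - adj_within

    df_effect = k - 1
    df_error = n_total - k - 1   # one extra df is spent on the covariate slope
    f_value = (adj_between / df_effect) / (adj_within / df_error)
    return f_value, df_effect, df_error

# Hypothetical (current grade, exam score) pairs for three studying techniques.
technique_a = [(70, 72), (75, 78), (80, 81), (85, 88), (90, 91)]
technique_b = [(68, 75), (74, 80), (79, 86), (86, 90), (91, 96)]
technique_c = [(71, 70), (76, 74), (81, 80), (84, 83), (89, 87)]

f_value, df1, df2 = one_way_ancova([technique_a, technique_b, technique_c])
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The error degrees of freedom are N - k - 1 rather than N - k, which is the main bookkeeping difference from a plain one-way ANOVA.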
REPORTING RESULTS
When reporting the results of an analysis of covariance (ANCOVA), it is important to include the
following information in APA format:
1. Descriptive statistics: report the means and standard deviations for each group on the
dependent variable and the covariate.
2. ANCOVA table: include a table that shows the sources of variation, degrees of freedom,
mean squares, F-values, and p-values.
4. Post-hoc tests: if significant main effects or interactions are found, follow-up post-hoc tests
may be conducted to determine which groups differ significantly from each other. In this
case, report the results of the post-hoc tests along with appropriate adjustments for multiple
comparisons.
"A one-way analysis of covariance (ANCOVA) was conducted to examine the effect of treatment
(independent variable) on anxiety scores (dependent variable), controlling for age (covariate).
Descriptive statistics indicated that the mean anxiety scores for the three groups were as follows:
Group 1 M = 4.2, SD = 0.9; Group 2 M = 4.8, SD = 0.7; Group 3 M = 5.1, SD = 0.6. The mean ages for
the three groups were as follows: Group 1 M = 35.5, SD = 3.1; Group 2 M = 36.1, SD = 2.9; Group 3 M
= 37.0, SD = 2.7.
The ANCOVA table showed that there was a significant main effect of treatment, F(2, 57) = 4.23, p
< .05, partial η² = 0.13. Post-hoc tests using Tukey's HSD revealed that Group 3 had significantly
higher anxiety scores than Group 1 (p < .05). The effect size for the main effect of treatment was
medium (partial η² = 0.13). There was no significant effect of age, F(1, 57) = 1.02, p = .32, partial η² =
0.02, and no significant interaction between treatment and age, F(2, 57) = 0.84, p = .44, partial η² =
0.03.
Assumptions of ANCOVA were checked and met, including normality of residuals, homogeneity of
variance, and linearity of the covariate."
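The partial η² values in a write-up like this can be recovered from the F statistic and its degrees of freedom, since partial η² = SS_effect / (SS_effect + SS_error) = (F × df_effect) / (F × df_effect + df_error). The short sketch below (the function name is our own) applies that identity to the three F tests quoted in the example report.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F statistics taken from the example ANCOVA report.
print(round(partial_eta_squared(4.23, 2, 57), 2))  # treatment:   0.13
print(round(partial_eta_squared(1.02, 1, 57), 2))  # age:         0.02
print(round(partial_eta_squared(0.84, 2, 57), 2))  # interaction: 0.03
```

This is a quick consistency check when reading or writing a results section: the reported effect size should agree with the reported F and degrees of freedom.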
MANOVA –
Reporting results-
When reporting the results of a MANOVA, it is important to include the following information in APA
format:
3. A summary of the results for the main effects of the independent variables and any
interaction effects.
4. The F and p-values for the main effects and interaction effects.
6. Any effect sizes that were calculated (such as partial eta-squared or Cohen's d).
"A MANOVA was conducted to examine the effect of treatment type (between-subjects factor) and
time (within-subjects factor) on anxiety, depression, and stress levels. The sample consisted of 100
participants (60% female, mean age = 35.2 years) who were randomly assigned to one of three
treatment conditions: cognitive-behavioral therapy, mindfulness-based stress reduction, or a waitlist
control group.
The main effect of treatment type was significant, F(2, 97) = 10.26, p < .001, partial eta-squared
= .18, indicating that the three treatment groups differed in their overall levels of anxiety, depression,
and stress. Post-hoc tests revealed that both cognitive-behavioral therapy and mindfulness-based
stress reduction resulted in significantly lower levels of anxiety, depression, and stress compared to
the waitlist control group (p < .001).
The main effect of time was also significant, F(1, 97) = 5.23, p = .02, partial eta-squared = .05,
indicating that overall anxiety, depression, and stress levels decreased over time. There was no
significant interaction effect between treatment type and time, F(2, 97) = 1.86, p = .16, partial
eta-squared = .04.
Assumptions of normality and homogeneity of variance were tested and met. The results of this
study suggest that both cognitive-behavioral therapy and mindfulness-based stress reduction are
effective treatments for reducing anxiety, depression, and stress levels."
Reporting the results of a MANOVA (Multivariate Analysis of Variance) typically involves reporting
the outcomes of several statistical tests, including Wilks' Lambda, Pillai's Trace, Hotelling's Trace, and
Roy's Largest Root.
The reporting format in APA style for the results of a MANOVA can be as follows:
"We conducted a multivariate analysis of variance (MANOVA) to examine the effects of (independent
variable(s)) on (dependent variable(s)). The MANOVA revealed a significant main effect of
(independent variable(s)) on the dependent variables, F(df numerator, df denominator) = F-value, p <
.05, Wilks' Lambda = value, Pillai's Trace = value, Hotelling's Trace = value, and Roy's Largest Root =
value.
Post-hoc analyses were conducted using (appropriate post-hoc tests, such as Bonferroni, Tukey's
HSD, etc.), revealing that (specific significant differences between groups or conditions, if
applicable)."
The table for reporting the results of a MANOVA should include the values for each of the statistical
tests (Wilks' Lambda, Pillai's Trace, Hotelling's Trace, and Roy's Largest Root) as well as the associated
degrees of freedom, F-value, and p-value.
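All four MANOVA statistics are functions of the same eigenvalues, namely those of E⁻¹H, where H is the hypothesis (between-groups) matrix and E is the error (within-groups) matrix. The sketch below assumes the eigenvalues have already been obtained from software; the function name and the eigenvalues are made up for illustration. Roy's Largest Root is taken here as the largest eigenvalue itself, although some programs report it as λ / (1 + λ).

```python
import math

def manova_statistics(eigenvalues):
    """Compute the four common MANOVA test statistics from the eigenvalues of E^-1 H."""
    return {
        "Wilks' Lambda": math.prod(1.0 / (1.0 + ev) for ev in eigenvalues),
        "Pillai's Trace": sum(ev / (1.0 + ev) for ev in eigenvalues),
        "Hotelling's Trace": sum(eigenvalues),
        "Roy's Largest Root": max(eigenvalues),
    }

# Hypothetical eigenvalues for an effect with two non-zero roots.
for name, value in manova_statistics([1.0, 0.25]).items():
    print(f"{name} = {value:.3f}")
```

Smaller values of Wilks' Lambda indicate a stronger effect, while larger values of the other three statistics do. Each statistic is then converted to an approximate F-value with its own degrees of freedom, which is what the results table reports.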