Notes for Chapter 4

Marketing Analytics

Analysis of Variance – ANOVA


Topics of the day
1. One-Factor ANOVA
2. Multifactor ANOVA

Definition
Analysis of Variance (ANOVA): Definition and Purpose

Analysis of Variance (ANOVA) is a statistical method used to determine whether there are statistically significant differences between the means of three or more independent groups. Unlike the t-test, which is limited to comparing the means of two groups, ANOVA allows for simultaneous comparison across multiple groups using a single test, reducing the risk of false positives that can occur when conducting multiple t-tests.

How ANOVA Works

ANOVA works by partitioning the total variance observed in the data into two components:

• Between-group variance: Variability due to differences between the group means.

• Within-group variance: Variability within each group, due to random error or individual differences.

The core idea is to compare the between-group variance to the within-group variance.
If the between-group variance is significantly larger than the within-group variance,
this suggests that the group means are not all equal and that at least one group
differs from the others.

The statistical test used in ANOVA is the F-test, which produces an F-statistic. This
statistic is then compared to a critical value from the F-distribution to determine
statistical significance.
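
In formula form (a standard textbook presentation, not taken from the slides): with k groups and N observations in total,

```latex
% F-statistic for one-way ANOVA: between-group mean square over within-group mean square
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
```

The larger F is relative to the critical value, the less plausible it is that all group means are equal.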

Types of ANOVA

• One-way ANOVA: Tests the effect of a single independent variable (factor) on a dependent variable. It assesses whether there are differences among the means of three or more unrelated groups.

• Two-way ANOVA: Examines the effect of two independent variables and can also test for interactions between them.

• Other variations: Include factorial ANOVA, repeated measures ANOVA, and others, depending on the experimental design.

Example Applications

• Medical research: Comparing the effectiveness of different treatments across patient groups.

• Business: Testing whether different training programs lead to different levels of employee performance.

• Psychology: Assessing whether different teaching methods impact test scores across classrooms.

Interpretation

If the ANOVA test indicates a statistically significant difference, not all group means
are equal. However, ANOVA does not specify which groups differ from each other;
further post-hoc tests (such as Tukey's HSD) are needed to identify specific group
differences.

Historical Context

ANOVA was developed by Ronald Fisher in 1918 and became widely used after being
featured in his 1925 book, "Statistical Methods for Research Workers". It is considered
an extension of the t-test to more than two groups.

Summary Table

Feature          ANOVA                                Purpose / Typical Use Case
Compares         Means of 3 or more groups            Medical, business, social science
Test Statistic   F-statistic                          Significance of group differences
Types            One-way, Two-way, Factorial, etc.    Varies by number of factors
Developed by     Ronald Fisher                        1918

ANOVA is a foundational tool in statistics for comparing multiple group means and
understanding whether observed differences are likely due to chance or reflect real
effects.
Explanation of the Slide

This slide illustrates how to statistically compare the effectiveness of different advertising themes (Sport, Family, and Environment) in generating preference for the new VW Roadster.

Key Points:

 Research Question:
"Which advertising theme results in the highest preference for the new VW
Roadster?"
This asks which of the three themes leads to the highest average preference
score among respondents.

 Treatments:
The three advertising themes (Sport, Family, Environment) are considered
different "treatments" or experimental conditions.

 Preference Measurement:
Participants rate their preference for the car after seeing each ad theme on a
10-point scale. The bar chart shows the average preference score for each
theme, with error bars indicating variability (e.g., standard error).

 Statistical Comparison:
The red text ("How to measure differences across treatments? Several t-Tests?")
highlights a common statistical challenge. While you could compare each pair of
themes using multiple t-tests, this increases the risk of Type I error (false
positives).

 Why Use ANOVA:


Instead of multiple t-tests, Analysis of Variance (ANOVA) is the correct approach.
ANOVA tests whether there are any statistically significant differences among
the means of the three groups in a single analysis.

 Interpretation of Results:
The chart shows that the "Sport" theme has the highest average preference,
followed by "Environment," with "Family" lowest. The notation "p < .05"
indicates that at least one of the differences between group means is
statistically significant (the probability that the observed differences are due to
chance is less than 5%).

Summary:
This slide uses the VW Roadster ad example to illustrate why ANOVA is used when
comparing more than two groups. It avoids the pitfalls of multiple t-tests and provides
a robust answer to which advertising theme is most effective.

Alpha-error
• Statistical tests (e.g., t-test) are calculated using a certain significance level (α level)

• Having 3 groups, 3 t-tests are required to compare the groups with each other (1-2, 1-3, and 2-3)

• If each test uses a 5% significance level, the following applies to each test: the probability of incorrectly rejecting the null hypothesis (type 1 error) is 5%

• Thus, the probability of no type 1 error (counter probability) is 95%. If the 3 tests are independent of each other, the probabilities can be multiplied

• Overall probability of no type 1 error in any of the 3 tests: (.95)³ = .857

• Probability of at least one type 1 error: 1 - .857 = .143

>> The probability of a type 1 error increases from 5% to 14.3%
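
The same reasoning generalizes to any number of comparisons. As a formula, the familywise error rate for k independent tests, each run at significance level α, is (standard result, stated here for completeness):

```latex
% Probability of at least one type 1 error across k independent tests at level alpha
\alpha_{\text{familywise}} = 1 - (1 - \alpha)^{k}
\qquad \text{e.g. } k = 3,\ \alpha = 0.05:\quad 1 - 0.95^{3} \approx 0.143
```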

The p-value from your statistical test is compared to alpha. If p < α, you reject the null hypothesis, but there is still a probability (alpha) that this is a false positive.

What Is This Slide About?

This slide shows a simple example of how to use ANOVA (Analysis of Variance) to
compare people’s preferences for the new VW Roadster after seeing three different
types of advertisements: Sport, Family, and Environment.

What Do the Numbers Mean?

 Respondent Preference: Each column lists how much each person liked the
car after seeing one of the ad types. The numbers are their scores (out of 10).

o For example, after seeing the Sport ad, R06 gave a score of 10, R11 gave
a 9, etc.

 Sum: This is the total score for each group.

o Sport: 45, Family: 10, Environment: 20

 Group Mean: This is the average score for each group.

o Sport: 45 divided by 5 people = 9

o Family: 10 divided by 5 people = 2

o Environment: 20 divided by 5 people = 4

 Mean Y_bar: This is the overall average score, combining all groups.

o Add all scores (45+10+20 = 75), divide by total number of people (15), so
the mean is 5.
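
A minimal sketch of these calculations in Python, using only the group sums and group size reported on the slide:

```python
# Group sums and group size taken from the slide (5 respondents per ad theme)
group_sums = {"Sport": 45, "Family": 10, "Environment": 20}
n_per_group = 5

# Group mean = group sum / number of respondents in that group
group_means = {theme: total / n_per_group for theme, total in group_sums.items()}
print(group_means)  # {'Sport': 9.0, 'Family': 2.0, 'Environment': 4.0}

# Grand mean (Y_bar) = all scores combined / total number of respondents
grand_mean = sum(group_sums.values()) / (n_per_group * len(group_sums))
print(grand_mean)  # 75 / 15 = 5.0
```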

Why Is This Useful?

 The Sport ad group has the highest average preference (9 out of 10), meaning
people liked the car most after seeing the Sport ad.
 The Family ad group has the lowest average preference (2 out of 10).

 The Environment ad group is in the middle (4 out of 10).

What Would Happen Next?

 These averages are compared using ANOVA to see if the differences are
statistically significant (real differences, not just due to random chance).

 If the ANOVA test says the difference is significant, we can confidently say that
the type of ad does affect how much people like the car.
Interpretation

Possible statement:

The ANOVA revealed a significant effect of advertisement type on preference (F(2, 12) = 38.92; p < .001; η² = .87).
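
The η² reported above is an effect size: the share of the total variance in preference explained by the advertisement type. As a formula (standard definition, not given on the slide):

```latex
% Eta squared: proportion of total variability attributable to the factor
\eta^{2} = \frac{SS_{\text{between}}}{SS_{\text{total}}}
```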
However:

The significance test only states that at least one experimental group differs significantly from at least one other.

>> X₁ = X₂ = X₃ (all group means being equal) is not true. It makes no statement about which groups differ in pairs.

Solution

Calculation of contrasts or post-hoc tests (variants of paired t-tests; e.g., Scheffé or LSD)

ANOVA - Example

Viagra is a stimulant (for the treatment of impotence)

• It conquered the black market in the belief that it makes men better lovers

Experiment to test this hypothesis

• 3 groups: Placebo (blue sugar pill), low dosage, high dosage

• DV: "objective measure of libido which was measured over the course of a week

 Grand Mean: The average across all scores from all groups (3.467).
 Grand SD (Standard Deviation): The overall spread of all scores (1.767).
 Grand Variance: The overall variance of all scores (3.124).

What Does This Mean?

 High Dose group has the highest average score (5.00), suggesting the
strongest effect.

 Placebo group has the lowest average (2.20), as expected if the drug works.

 The Low Dose group is in between (3.20).


 The spread (standard deviation and variance) is fairly similar across groups, but
a bit higher in the High Dose group.

 The ANOVA test found a statistically significant difference between the groups (p = 0.025).
 This means it is very likely that at least one group is different from the others,
and the results are not just due to random chance.

 The data shows that higher doses lead to higher average scores.
 The ANOVA test confirms that these differences are real and not just
luck.
 Conclusion: The dose of Viagra has a significant effect on the measured
outcome (e.g., libido).
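
A minimal sketch of this one-way ANOVA in Python. The individual libido scores below are illustrative values chosen to be consistent with the group means and grand statistics reported above (Placebo 2.20, Low Dose 3.20, High Dose 5.00, grand mean 3.467); they are not taken from this document:

```python
from scipy import stats

# Illustrative libido scores, chosen to match the reported group means
placebo = [3, 2, 1, 1, 4]     # mean 2.20
low_dose = [5, 2, 4, 2, 3]    # mean 3.20
high_dose = [7, 4, 5, 3, 6]   # mean 5.00

# One-way ANOVA: does mean libido differ across the three dose groups?
f_stat, p_value = stats.f_oneway(placebo, low_dose, high_dose)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # roughly F = 5.12, p = 0.025
```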
Post-Hoc-Tests
What Are Post-Hoc Tests?

Post-hoc tests are additional statistical tests you perform after an ANOVA shows that
there are significant differences between group means. The word "post-hoc" means
"after the event" in Latin.

Why Are Post-Hoc Tests Needed?

 ANOVA tells you that at least one group is different from the others, but it does
not tell you which groups are different.

 Post-hoc tests help you find out exactly which pairs of groups (e.g., Placebo
vs. Low Dose, Low Dose vs. High Dose, Placebo vs. High Dose) are significantly
different from each other.

How Do Post-Hoc Tests Work?

 They compare all possible pairs of group means.

 They adjust for the fact that you are making multiple comparisons, which helps
control the risk of making a Type I error (false positive).

Common Post-Hoc Tests

• Tukey's HSD (Honestly Significant Difference): Very popular, compares all pairs of means.

• Bonferroni correction: Adjusts the significance level to be more strict when making many comparisons.

• Scheffé's test: Very conservative, used when you want to be extra careful about false positives.

• Dunnett's test: Compares each group to a control group (like Placebo).

Example Using Your Data

From the ANOVA table, you know there is a significant difference between the Placebo,
Low Dose, and High Dose groups (p = 0.025).
A post-hoc test would tell you, for example:

 Is the difference between Placebo and High Dose significant?

 Is the difference between Placebo and Low Dose significant?

 Is the difference between Low Dose and High Dose significant?

Simple Summary

 ANOVA: “There is a difference somewhere.”

 Post-hoc tests: “Here’s exactly where the differences are.”


In your case:
After finding a significant ANOVA result, you would use post-hoc tests to see which
specific groups (Placebo, Low Dose, High Dose) differ from each other in terms of
libido scores.
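
A sketch of how these pairwise comparisons could be run in Python with statsmodels' Tukey HSD, continuing with the illustrative libido scores from the earlier sketch:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Illustrative libido scores (same values as in the one-way ANOVA sketch above)
scores = np.array([3, 2, 1, 1, 4,    # Placebo
                   5, 2, 4, 2, 3,    # Low Dose
                   7, 4, 5, 3, 6])   # High Dose
groups = np.repeat(["Placebo", "Low Dose", "High Dose"], 5)

# Tukey's HSD compares every pair of group means, adjusting for multiple comparisons
result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result.summary())
# For data like this, only Placebo vs. High Dose comes out significant
```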

This table shows the results of post-hoc tests (specifically, Tukey's HSD), which compare each pair of groups to find out exactly where the differences are.

• Mean Difference (I-J): The difference in average scores between each pair.

• Sig.: The p-value for each comparison.

• Highlighted Results:

  o Placebo vs. High Dose: p = 0.021 (significant)

  o High Dose vs. Placebo: p = 0.021 (the same comparison, reversed)

Interpretation:

 The only significant difference is between the Placebo and High Dose groups
(p = 0.021), meaning High Dose significantly increases libido compared to
Placebo.

 All other comparisons (Placebo vs. Low Dose, Low Dose vs. High Dose)
are not statistically significant (p > 0.05).
 Alcohol does not increase attractiveness ratings for males; in fact, after 4 pints,
males rate attractiveness much lower.

 Females’ ratings are less affected by alcohol, with only a slight decrease after 4
pints.

 Males’ ratings become much more inconsistent (higher variance) after drinking.

 The so-called "beer-goggles effect" (the idea that people seem more attractive
after drinking) is not supported by this data for males; it may even be the
opposite.
 Drinking 4 pints of alcohol leads to a much lower attractiveness
rating compared to drinking none or just 2 pints.

 There is no difference between drinking none and 2 pints.

 The "beer-goggles effect" (the idea that people seem more attractive after
drinking) is not supported; in fact, heavy drinking (4 pints) makes people rate
others as less attractive.
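
The beer-goggles findings above involve two factors at once (gender and amount of alcohol), i.e., a multifactor (two-way) ANOVA with a possible interaction. A sketch of how such an analysis might look in Python with statsmodels; the data frame, its values, and the column names (gender, alcohol, attractiveness) are illustrative assumptions, not the actual study data:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical long-format data: one row per participant, three per gender x dose cell
df = pd.DataFrame({
    "gender": ["male"] * 9 + ["female"] * 9,
    "alcohol": (["none"] * 3 + ["2 pints"] * 3 + ["4 pints"] * 3) * 2,
    "attractiveness": [65, 70, 60, 62, 66, 58, 33, 38, 30,   # males: sharp drop after 4 pints
                       60, 65, 58, 62, 64, 60, 55, 58, 54],  # females: only a slight decrease
})

# Two-way ANOVA: main effects of gender and alcohol plus their interaction
model = ols("attractiveness ~ C(gender) * C(alcohol)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```

A significant gender × alcohol interaction term is what would formally capture the pattern described above (males strongly affected by 4 pints, females barely affected).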
Task 2: Please work with “Teach.sav”

Data set contains an experiment on different teaching styles

• Punishment with a cane (punish)

• Discussion climate with sweets (reward)

• Indifference (indifferent)

DV: Exam grade (percentage)

Answer the following questions:

• Does the teaching style have a significant effect on the grade?

• How big is the effect size?

• Is reward the significantly best teaching method? (>> operant conditioning)

• Are punishment and indifference significantly different?
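
Before turning to the answers, here is a sketch of how these questions might be approached in Python rather than SPSS. Reading the .sav file relies on pandas' read_spss (which requires the pyreadstat package), and the column names style and exam are assumptions about the data set, not confirmed by the document:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Load the SPSS data set (column names "style" and "exam" are assumed)
df = pd.read_spss("Teach.sav")

# Q1: does teaching style have a significant effect on the exam grade?
model = ols("exam ~ C(style)", data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)

# Q2: effect size omega squared, computed from the ANOVA table
ss_between = anova_table.loc["C(style)", "sum_sq"]
df_between = anova_table.loc["C(style)", "df"]
ss_within = anova_table.loc["Residual", "sum_sq"]
ms_within = ss_within / anova_table.loc["Residual", "df"]
omega_sq = (ss_between - df_between * ms_within) / (ss_between + ss_within + ms_within)
print(f"omega squared = {omega_sq:.3f}")

# Q3 and Q4: pairwise post-hoc comparisons (reward vs. the others, punish vs. indifferent)
print(pairwise_tukeyhsd(df["exam"], df["style"]).summary())
```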


Answer:

a. Yes, there is a significant relationship between type of teaching method and exam marks (p = 0.01, which is less than 0.05).
b. A one-way ANOVA revealed a large effect of group on Exam Marks,
ω² (random-effects) = .400, 95% CI [0.149, 0.541], indicating that
approximately 40% of the variance in exam scores is attributable to
group differences.
c. Reward is in its own subset (3) with the highest mean (65.4),
and not overlapping with the others → it's statistically
significantly higher. Therefore, the Reward method leads to
significantly higher exam marks compared to both Indifferent and
Punish, based on Tukey B post hoc results.
d. Since the p-value is less than the conventional significance level
of 0.050, there is a statistically significant difference between the
"Punish" and "Indifferent" groups in exam marks. This means that
the mean exam marks for these two groups are significantly
different.
