Lecture (Chapter 11) : Hypothesis Testing IV: Chi Square: Ernesto F. L. Amaral
Source: Healey, Joseph F. 2015. "Statistics: A Tool for Social Research." Stamford: Cengage
Learning. 10th edition. Chapter 11 (pp. 276–306).
Chapter learning objectives
• Identify and cite examples of situations in which the chi
square test is appropriate
• Explain the structure of a bivariate table and the concept
of independence as applied to expected and observed
frequencies in a bivariate table
• Explain the logic of hypothesis testing in terms of chi
square
• Perform the chi square test using the five-step model
and correctly interpret the results
• Explain the limitations of the chi square test and,
especially, the difference between statistical significance
and substantive significance (importance, magnitude)
The bivariate table
• Bivariate tables display the scores of cases on
two different variables at the same time
Independent, dependent variables
• Columns are scores on the independent variable
– There will be as many columns as there are scores
on the independent variable
• Rows are scores on the dependent variable
– There will be as many rows as there are scores on
the dependent variable
• Each cell reports the number of times each
combination of scores occurred
– The number of cells equals the number of rows multiplied
by the number of columns
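As a minimal sketch of how such a table is assembled, the Python snippet below counts each combination of scores for a handful of made-up cases. The data, labels, and the function name bivariate_table are hypothetical, chosen only to illustrate the layout (columns for the independent variable, rows for the dependent variable).

```python
from collections import Counter

# Hypothetical cases: (independent variable score, dependent variable score)
cases = [
    ("accredited", "working"),
    ("accredited", "working"),
    ("accredited", "not working"),
    ("not accredited", "working"),
    ("not accredited", "not working"),
    ("not accredited", "not working"),
]

def bivariate_table(pairs):
    """Return (row_labels, col_labels, counts) for (column score, row score) pairs."""
    cell_counts = Counter(pairs)
    col_labels = sorted({col for col, _ in pairs})   # independent variable
    row_labels = sorted({row for _, row in pairs})   # dependent variable
    # One cell per combination of a row score and a column score
    counts = [[cell_counts[(col, row)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, counts

rows, cols, table = bivariate_table(cases)
for label, counts in zip(rows, table):
    print(label, counts)
```

Here the table has 2 rows and 2 columns, so it has 2 × 2 = 4 cells.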
6
Test for independence
• Chi Square as a test of statistical significance is
a test for independence
– Two variables are independent if the classification of
a case into a particular category of one variable has
no effect on the probability that the case will fall into
any particular category of the second variable
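The link between independence and expected frequencies can be written out as a short derivation. The notation here is introduced only for illustration and is not from the slides: R_i is the row marginal, C_j is the column marginal, and N is the total number of cases.

```latex
P(\text{row } i \text{ and column } j)
  = P(\text{row } i) \times P(\text{column } j)
  = \frac{R_i}{N} \times \frac{C_j}{N}
\quad\Longrightarrow\quad
f_e = N \times \frac{R_i}{N} \times \frac{C_j}{N} = \frac{R_i \times C_j}{N}
```

If the variables are independent, the count expected in each cell depends only on the marginals.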
Computation of chi square
$$f_e = \frac{\text{Row marginal} \times \text{Column marginal}}{N}$$

$$\chi^2(\text{obtained}) = \sum \frac{(f_o - f_e)^2}{f_e}$$
where fo = cell frequencies observed in the
bivariate table
fe = cell frequencies that would be expected
if the variables were independent
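A minimal Python sketch of these two formulas follows. The function names and the 2×2 counts are assumptions made for illustration; the counts are hypothetical and are not taken from the slides.

```python
def expected_frequencies(observed):
    """fe for each cell = row marginal * column marginal / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Chi square (obtained) = sum over all cells of (fo - fe)^2 / fe."""
    expected = expected_frequencies(observed)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

# Hypothetical observed counts (N = 100); rows = dependent, columns = independent
observed = [[30, 10],
            [25, 35]]

expected = expected_frequencies(observed)
chi2_obtained = chi_square(observed)
print(expected)
print(round(chi2_obtained, 2))
```

With these assumed counts, the expected frequencies are 22, 18, 33, and 27, and the statistic is about 10.77.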
Example
• Random sample of 100 social work majors
– We know whether the Council on Social Work Education has
accredited their undergraduate programs
– And whether they were hired in social work positions within three
months of graduation
• Is there a significant relationship between employment
status and accreditation status?
Step 2: Null hypothesis
• Null hypothesis, H0: fo = fe
– The variables are independent
– The observed frequencies are similar to the expected
frequencies
Step 3: Distribution, critical region
• Sampling distribution
– Chi square distribution (χ2)
• Significance level (α) = 0.05
– If the null hypothesis is true, the probability of rejecting
it in error (Type I error) is 0.05
• Degrees of freedom (df) = (r–1)(c–1)
– r = number of rows; c = number of columns
– df = (r–1)(c–1) = (2–1)(2–1)= 1
• χ2(critical) = 3.841
– If the probability (p-value) is less than 0.05,
χ2(obtained) will be beyond χ2(critical)
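The degrees of freedom and the critical value for this 2×2 case can be checked with a short Python sketch. It relies on the fact that a chi square variable with df = 1 is the square of a standard normal variable, so the shortcut below holds only for df = 1. The function names are hypothetical.

```python
from statistics import NormalDist

def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1)(c - 1) for a bivariate table."""
    return (n_rows - 1) * (n_cols - 1)

def chi2_critical_df1(alpha):
    """Critical chi square for df = 1 only: the squared two-tailed critical z."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2

df = degrees_of_freedom(2, 2)
critical = chi2_critical_df1(0.05)
print(df, round(critical, 3))
```

For larger tables, the critical value would come from a chi square table or a statistics package.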
Step 4: Test statistic
Expected frequencies
• χ2(obtained) = 10.78
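The comparison between the obtained and critical values can be sketched in Python as follows. The two values come from the slides; the p-value formula applies only to df = 1, and the function name is hypothetical.

```python
import math

def chi2_p_value_df1(chi2):
    """Upper-tail probability of a chi square value with df = 1."""
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

chi2_obtained = 10.78   # test statistic reported in Step 4
chi2_critical = 3.841   # critical value from Step 3 (alpha = 0.05, df = 1)

p_value = chi2_p_value_df1(chi2_obtained)
reject_null = chi2_obtained > chi2_critical
print(reject_null, p_value < 0.05)
```

Because the obtained value falls beyond the critical value, the null hypothesis of independence would be rejected at the 0.05 level.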
Interpreting chi square
• The chi square test tells us only if the variables
are independent or not
• It does not tell us the pattern or nature of the
relationship
• To investigate the pattern, compute percentages
within each column and compare across the
columns
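A minimal Python sketch of the column-percentage step, using hypothetical counts that are not taken from the slides (the function name is also an assumption):

```python
def column_percentages(observed):
    """Convert each cell to a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*observed)]
    return [
        [100 * cell / total for cell, total in zip(row, col_totals)]
        for row in observed
    ]

# Hypothetical observed counts; columns are categories of the independent variable
observed = [[30, 10],
            [25, 35]]

percentages = column_percentages(observed)
for row in percentages:
    print([round(p, 1) for p in row])
```

Reading across a row compares the categories of the independent variable: in this made-up table the first row is about 54.5% in the first column and 22.2% in the second.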
GSS example
. tab letin1 sex if year==2016, chi col
Edited table
Table 1. Opinion of the U.S. adult population about what the number
of immigrants to the country should be nowadays, by sex, 2004, 2010, and 2016
Opinion About             Male  Female   Total  Chi Square  p-value
Number of Immigrants       (%)     (%)     (%)    (df = 4)
2004                                                2.3397   0.6740
  Increase a lot          3.17    4.30    3.78
  Increase a little       6.89    6.27    6.56
  Remain the same        35.01   34.05   34.49
  Reduce a little        27.68   28.72   28.24
  Reduce a lot           27.24   26.66   26.93
  Total                 100.00  100.00  100.00
  (sample size)          (914) (1,069) (1,983)
2010                                                7.0998   0.1310
  Increase a lot          5.21    3.88    4.45
  Increase a little       7.90   11.40    9.91
  Remain the same        35.29   34.96   35.10
  Reduce a little        24.03   25.31   24.77
  Reduce a lot           27.56   24.44   25.77
  Total                 100.00  100.00  100.00
  (sample size)          (595)   (798) (1,393)
2016                                                1.3515   0.8530
  Increase a lot          5.98    5.75    5.85
  Increase a little      12.70   11.11   11.82
  Remain the same        40.17   40.25   40.22
  Reduce a little        22.10   23.20   22.71
  Reduce a lot           19.05   19.69   19.40
  Total                 100.00  100.00  100.00
  (sample size)          (819) (1,026) (1,845)
Source: 2004, 2010, 2016 General Social Surveys.
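The p-values in Table 1 can be checked from the reported chi square values with a short Python sketch. It uses the closed-form upper-tail probability that holds for df = 4 only. The function name is hypothetical; the chi square values are copied from the table.

```python
import math

def chi2_p_value_df4(chi2):
    """Upper-tail probability of a chi square value with df = 4."""
    # For df = 4: P(X > x) = exp(-x/2) * (1 + x/2)
    half = chi2 / 2
    return math.exp(-half) * (1 + half)

# Chi square values reported in Table 1 (df = 4 in each year)
reported = {2004: 2.3397, 2010: 7.0998, 2016: 1.3515}

p_values = {year: chi2_p_value_df4(x) for year, x in reported.items()}
for year, p in p_values.items():
    print(year, round(p, 3), "significant" if p < 0.05 else "not significant")
```

The computed values agree with the table's p-values to about three decimal places. All three exceed 0.05, so the null hypothesis of independence between sex and opinion is not rejected in any of the three years.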
Limitations of chi square
• Difficult to interpret
– When variables have many categories
– Best when variables have four or fewer categories