Contingency Tables


Lecture Slides

Elementary Statistics
Twelfth Edition

and the Triola Statistics Series

by Mario F. Triola

Copyright © 2014, 2012, 2010 Pearson Education, Inc. Section 11.3-1


Chapter 11
Goodness-of-Fit and
Contingency Tables

11-1 Review and Preview
11-2 Goodness-of-Fit
11-3 Contingency Tables



Key Concept
In this section we consider contingency tables (or two-way
frequency tables), which include frequency counts for
categorical data arranged in a table with at least two rows and
at least two columns.
In Part 1, we present a method for testing the claim that the
row and column variables are independent of each other.
In Part 2, we will consider three variations of the basic
method presented in Part 1: (1) test of homogeneity, (2)
Fisher exact test, and (3) McNemar’s test for matched pairs.



Part 1: Basic Concepts of Testing
for Independence



Definition
A contingency table (or two-way frequency table) is a table
in which frequencies correspond to two variables.

(One variable is used to categorize rows, and a second
variable is used to categorize columns.)

Contingency tables have at least two rows and at least two
columns.



Example
Below is a contingency table summarizing the results of foot
procedures as a success or failure based on different treatments.
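
The table itself is not reproduced in this copy, but its counts can be recovered from figures quoted later in this section: a first row total of 66, a first column total of 182, a grand total of 253, observed counts of 54 and 5, and success rates of 81.8%, 44.6%, 95.9%, and 77.3%. The Python sketch below is one such reconstruction and should be read as an assumption rather than the original slide. The labels "treatment 2" and "treatment 4" are placeholders, because the text names only surgery and the non-weight-bearing cast for 6 weeks.

```python
# Counts reconstructed from totals and success rates quoted in the text.
# Labels for rows 2 and 4 are placeholders (the text does not name them).
table = {
    "surgery": (54, 12),
    "treatment 2": (41, 51),
    "non-weight-bearing cast, 6 weeks": (70, 3),
    "treatment 4": (17, 5),
}

row_totals = {name: s + f for name, (s, f) in table.items()}
col_totals = (
    sum(s for s, _ in table.values()),  # successes
    sum(f for _, f in table.values()),  # failures
)
grand_total = sum(col_totals)

print(f"{'treatment':<34}{'success':>8}{'failure':>8}{'total':>7}")
for name, (s, f) in table.items():
    print(f"{name:<34}{s:>8}{f:>8}{row_totals[name]:>7}")
print(f"{'total':<34}{col_totals[0]:>8}{col_totals[1]:>8}{grand_total:>7}")
```

The printed totals (66 for the first row, 182 for the first column, 253 overall) agree with the values used in the expected-frequency calculation later in the section.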



Definition

Test of Independence

A test of independence tests the null hypothesis that in a
contingency table, the row and column variables are
independent.



Notation

O represents the observed frequency in a cell of a
contingency table.

E represents the expected frequency in a cell, found by
assuming that the row and column variables are
independent.

r represents the number of rows in a contingency table (not
including labels).

c represents the number of columns in a contingency table
(not including labels).



Requirements
1. The sample data are randomly selected.

2. The sample data are represented as frequency counts in
a two-way table.

3. For every cell in the contingency table, the expected
frequency E is at least 5. (There is no requirement that
every observed frequency must be at least 5. Also, there
is no requirement that the population must have a normal
distribution or any other specific distribution.)



Hypotheses and Test Statistic
H0: The row and column variables are independent.
H1: The row and column variables are dependent.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$$

O is the observed frequency in a cell and E is the expected frequency in a cell.
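
As a rough illustration of the two formulas for the test statistic and the expected frequency, the following Python sketch computes both for any table of counts. The function names and the 2 x 2 example counts are illustrative assumptions, not part of the original slides.

```python
def expected_frequencies(observed):
    """E = (row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 counts, chosen only to exercise the functions.
example = [[10, 20], [30, 40]]
print(expected_frequencies(example))            # [[12.0, 18.0], [28.0, 42.0]]
print(round(chi_square_statistic(example), 3))  # 0.794
```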



P-Values and Critical Values

P-Values
P-values are typically provided by technology, or a range
of P-values can be found from Table A-4.

Critical Values
1. Found in Table A-4 using
degrees of freedom = (r – 1)(c – 1)
r is the number of rows and c is the number of
columns
2. Tests of Independence are always right-tailed.
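
When Table A-4 is not at hand, the right-tail area can be computed directly. The sketch below uses only the Python standard library and relies on the closed-form expressions that exist for integer degrees of freedom. The function names are assumptions made for illustration.

```python
import math


def degrees_of_freedom(r, c):
    """Degrees of freedom for an r x c contingency table."""
    return (r - 1) * (c - 1)


def chi2_right_tail(x, df):
    """Area to the right of x under a chi-square curve with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k < df/2} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: erfc term plus a finite sum over half-integer powers of y.
    total = math.erfc(math.sqrt(y))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (k - 0.5) / math.gamma(k + 0.5)
    return total


print(degrees_of_freedom(4, 2))             # 3
print(round(chi2_right_tail(7.815, 3), 3))  # 0.05
print(round(chi2_right_tail(3.841, 1), 3))  # 0.05
```

Because the test is right-tailed, the P-value is this right-tail area evaluated at the test statistic.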



Expected Frequencies
Referring back to the contingency table of foot procedures, the
observed frequency in the first cell is 54 successful surgeries.

The expected frequency is calculated using the first row
total of 66, the first column total of 182, and the grand
total of 253.

$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})} = \frac{(66)(182)}{253} = 47.478$$



Example
Does it appear that the choice of treatment affects the
success of the treatment for the foot procedures?

Use a 0.05 level of significance to test the claim that
success is independent of treatment group.



Example - Continued
Requirement Check:

1. Based on the study, we will treat the subjects as being randomly
selected and randomly assigned to the different treatment groups.
2. The results are expressed in frequency counts.
3. The expected frequencies are all over 5.

The requirements are all satisfied.



Example - Continued
The hypotheses are:

H0: Success is independent of the treatment.
H1: Success and the treatment are dependent.

The significance level is α = 0.05.



Example - Continued
We use a χ2 distribution with this test statistic:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(54 - 47.478)^2}{47.478} + \cdots + \frac{(5 - 6.174)^2}{6.174} = 58.393$$



Example - Continued
P-Value: If using technology, the P-value is less than 0.0001. Since this value is
less than the significance level of 0.05, reject the null hypothesis.

Critical Value: The critical value of χ2 = 7.815 is found in Table A-4 with α = 0.05
and degrees of freedom of (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3.

Because the test statistic does fall in the critical region, we reject the null
hypothesis.
A graphic of the chi-square distribution is on the next slide.
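
Both decision rules can be checked numerically. The sketch below takes the test statistic and critical value as given in the example and uses the closed-form right-tail area that holds for 3 degrees of freedom; the variable names are illustrative.

```python
import math

chi_square = 58.393      # test statistic from the example
critical_value = 7.815   # Table A-4, alpha = 0.05, df = 3
alpha = 0.05
r, c = 4, 2
df = (r - 1) * (c - 1)

# For df = 3 the right-tail area has a closed form:
#   P(X > x) = erfc(sqrt(x / 2)) + sqrt(2x / pi) * exp(-x / 2)
p_value = (
    math.erfc(math.sqrt(chi_square / 2))
    + math.sqrt(2 * chi_square / math.pi) * math.exp(-chi_square / 2)
)

reject_by_p_value = p_value < alpha
reject_by_critical_value = chi_square > critical_value

print(df)                        # 3
print(p_value < 0.0001)          # True
print(reject_by_p_value)         # True
print(reject_by_critical_value)  # True
```

The two rules agree, as they always do for the same significance level.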



Example - Continued
[Figure: graph of the chi-square distribution for this test.]


Example - Continued
Interpretation:

It appears that success is dependent on the treatment.

Although this test does not tell us which treatment is best,
we can see that the success rates of 81.8%, 44.6%,
95.9%, and 77.3% suggest that the best treatment is to
use a non-weight-bearing cast for 6 weeks.
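
The quoted success rates are simply successes divided by the row total. The sketch below reproduces them from counts reconstructed out of the totals given in this section; the counts and the placeholder labels for the second and fourth treatments are assumptions.

```python
# (success, failure) counts; labels for rows 2 and 4 are placeholders.
observed = {
    "surgery": (54, 12),
    "treatment 2": (41, 51),
    "non-weight-bearing cast, 6 weeks": (70, 3),
    "treatment 4": (17, 5),
}

rates = {name: 100 * s / (s + f) for name, (s, f) in observed.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")

best = max(rates, key=rates.get)
print(best)  # non-weight-bearing cast, 6 weeks
```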



Relationships Among Key Components
in Test of Independence



Part 2: Test of Homogeneity and
the Fisher Exact Test



Definition

Test of Homogeneity

In a test of homogeneity, we test the claim that different
populations have the same proportions of some
characteristics.



How to Distinguish Between
a Test of Homogeneity
and a Test for Independence

In a typical test of independence, sample subjects are randomly selected from
one population and values of two different variables are observed.

In a test of homogeneity, subjects are randomly selected from the different
populations separately.



Example
We previously tested for independence between foot
treatment and success.

If we want to use the same data in a test of the null
hypothesis that the four populations corresponding to the
four different treatment groups have the same proportion
of success, we could use the chi-square test of
homogeneity.

The test statistic, critical value, and P-value are the same
as those found before, and we should reject the null
hypothesis that the four treatments have the same success
rate.



Fisher Exact Test

The procedures for testing hypotheses with contingency
tables with two rows and two columns (2 × 2) have the
requirement that every cell must have an expected
frequency of at least 5.

This requirement is necessary for the χ2 distribution to be
a suitable approximation to the exact distribution of the
test statistic.



Fisher Exact Test
The Fisher exact test is often used for a 2 × 2
contingency table with one or more expected frequencies
that are below 5.

The Fisher exact test provides an exact P-value and does
not require an approximation technique.

Because the calculations are quite complex, it’s a good
idea to use computer software when using the Fisher
exact test.

STATDISK, Minitab, XLSTAT, and StatCrunch all have
the ability to perform the Fisher exact test.
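
To give a sense of what such software computes, here is a minimal Python sketch of a Fisher exact P-value based on the hypergeometric distribution with all margins held fixed. It uses one common two-sided convention (summing the probabilities of every table no more likely than the observed one); the packages named above may use a different convention. The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, not taken from the text.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

For this small table every expected frequency is 2, so the chi-square approximation would not be appropriate, which is the situation the exact test is meant for.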



McNemar’s Test for Matched Pairs

The methods in Part 1 of this chapter are based on
independent data.
For 2 × 2 tables consisting of frequency counts that result
from matched pairs, we do not have independence, and for
such cases, we can use McNemar’s test for matched pairs.
In this section we present the method of using McNemar’s
test for testing the null hypothesis that the frequencies from
the discordant (different) categories occur in the same
proportion.



McNemar’s Test for Matched Pairs

McNemar’s test requires two frequency counts from
discordant (different) pairs.

P-values are typically provided by software, and critical
values can be found in Table A-4 with 1 degree of freedom.

McNemar’s test is a test of the null hypothesis that the
frequencies from the discordant categories occur in the same
proportion.



Example
A randomized controlled trial was designed to test the
effectiveness of hip protectors in preventing hip fractures in
the elderly.
Nursing home residents each wore protection on one hip,
but not the other.
McNemar’s test can be used to test the null hypothesis that
the following two proportions are the same:

• The proportion of subjects with no hip fracture on the
protected hip and a hip fracture on the unprotected hip.
• The proportion of subjects with a hip fracture on the
protected hip and no hip fracture on the unprotected hip.



Example
The test statistic can be calculated from the data table below:

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c} = \frac{(|10 - 15| - 1)^2}{10 + 15} = 0.640$$

(Data table labels: side without protector; side with protector.)
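
The statistic depends only on the two discordant counts b and c. A minimal Python sketch, with a function name chosen for illustration:

```python
def mcnemar_statistic(b, c):
    """Continuity-corrected McNemar statistic from the two discordant counts."""
    return (abs(b - c) - 1) ** 2 / (b + c)


chi_square = mcnemar_statistic(10, 15)
print(f"{chi_square:.3f}")  # 0.640
```

Swapping b and c leaves the result unchanged because of the absolute value.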



Example - Continued
With a 0.05 level of significance and one degree of freedom,
the critical value for the right-tailed test is χ2 = 3.841.

The test statistic does not exceed the critical value, so we fail
to reject the null hypothesis.
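
The same conclusion follows from the P-value. The sketch below uses the closed-form right-tail area for 1 degree of freedom; the variable names are illustrative.

```python
import math

chi_square = 0.640       # McNemar test statistic from the example
critical_value = 3.841   # Table A-4, alpha = 0.05, df = 1
alpha = 0.05

# For df = 1 the right-tail area is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(p_value, 3))            # 0.424
print(chi_square > critical_value)  # False
print(p_value < alpha)              # False
```

A P-value of about 0.424 is well above 0.05, so the null hypothesis is not rejected.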

The proportion of hip fractures with the protectors worn is not
significantly different from the proportion of hip fractures
without the protectors worn.

The hip protectors do not appear to be effective in preventing
hip fractures.
