離散資料分析 Categorical Data Analysis: 陳俞成 Email:[email protected]

This document outlines topics in Chapter 3 of Categorical Data Analysis, including three-way contingency tables, partial association, and Cochran-Mantel-Haenszel methods. It provides an example analyzing the effects of racial characteristics on death penalty verdicts, using a three-way table with defendant's race, victim's race, and verdict. The chapter will discuss controlling for covariates, conditional associations in partial tables, and exact inference methods for small samples.

Categorical Data Analysis (離散資料分析)
Chapter 3 Three-Way Contingency Tables

陳俞成
Email:[email protected]

2005.10.17


Outline

Chapter 3 Three-Way Contingency Tables
- Partial Association
- Cochran-Mantel-Haenszel Methods
- Exact Inference About Conditional Associations


Three-Way Contingency Tables

- The choice of control variables in research studies: in studying the effect of an explanatory variable X on a response variable Y, one should "control" for covariates that can influence that relationship.
  - Hold the covariates constant while studying the effect of X on Y.
  - Otherwise, an observed effect may simply reflect the effects of those covariates on both X and Y.
  - This is particularly true for observational studies.
  - Randomly assigning subjects to different treatments works well, but it is often a luxury.
- Example: a study considers the effects of passive smoking, which is associated with lung cancer.
- A cross-sectional study might compare lung cancer rates between nonsmokers whose spouses smoke and nonsmokers whose spouses do not smoke.
- The study should control for age, socioeconomic status, or other factors that might relate both to whether one's spouse smokes and to whether one has lung cancer.


- The main topic is analyzing the association between two categorical variables X and Y while controlling for the effects of a possibly confounding variable Z.
- This is done by studying the X-Y relationship at fixed, constant levels of Z.
- Later chapters use models to perform statistical control.


- §3.1 shows Simpson's paradox.
- §3.2 presents large-sample inferential methods for association.
- §3.3 discusses small-sample exact inference for association.


Partial Tables

- Partial tables: two-way cross-sectional slices of the three-way table that cross-classify X and Y at separate levels of the control variable Z.
  - They display the X-Y relationship at fixed levels of Z.
  - They show the effect of X on Y while controlling for Z.
  - They remove the effect of Z by holding its value constant.


- The X-Y marginal table: the two-way contingency table obtained by combining the partial tables.
  - Each cell count in the marginal table is the sum of the counts from the same cell location in the partial tables.
  - The marginal table contains no information about Z; it ignores Z.


- The associations in partial tables are called conditional associations.
- Conditional associations refer to the effect of X on Y conditional on fixing Z at some level.
- Conditional associations in partial tables can be quite different from associations in marginal tables.
- It can be misleading to analyze only a marginal table of a multi-way contingency table.

Death Penalty Example

Death Penalty Verdict by Defendant's Race and Victims' Race

Victims'   Defendant's   Death Penalty   Percentage
Race       Race          Yes      No     Yes
White      White          53     414     11.3
           Black          11      37     22.9
Black      White           0      16      0.0
           Black           4     139      2.8
Total      White          53     430     11.0
           Black          15     176      7.9

Source: Florida Law Rev. 43: 1-34 (1991).
- The study examines the effects of racial characteristics on whether individuals convicted of homicide receive the death penalty.
- Y = "death penalty verdict," with categories (yes, no); X = "race of defendant" and Z = "race of victims," each with categories (white, black).


- When the victims were white, the death penalty was imposed 22.9% − 11.3% = 11.6 percentage points more often for black defendants than for white defendants.
- When the victims were black, the death penalty was imposed 2.8% − 0.0% = 2.8 percentage points more often for black defendants than for white defendants.
- Controlling for victims' race by keeping it fixed, the percentage of "yes" death penalty verdicts was higher for black defendants than for white defendants.
- The marginal table for defendant's race and the death penalty verdict shows that 11.0% of white defendants and 7.9% of black defendants received the death penalty.
- Ignoring victims' race, the percentage of "yes" death penalty verdicts was lower for black defendants than for white defendants.
- The association reverses direction compared to the partial tables.
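The comparison above can be reproduced directly from the counts in the death penalty table. The following is a minimal Python sketch; the names `counts`, `pct_yes`, `conditional`, and `marginal` are illustrative choices, not part of the original slides.

```python
# Counts from the death penalty table:
# (victims' race, defendant's race) -> (death penalty yes, death penalty no)
counts = {
    ("white", "white"): (53, 414),
    ("white", "black"): (11, 37),
    ("black", "white"): (0, 16),
    ("black", "black"): (4, 139),
}


def pct_yes(yes, no):
    """Percentage of 'yes' verdicts among all cases in a row."""
    return 100.0 * yes / (yes + no)


# Conditional percentages: one per row of each partial table (victims' race fixed).
conditional = {key: pct_yes(*cell) for key, cell in counts.items()}

# Marginal percentages: add the partial tables over victims' race.
marginal = {}
for defendant in ("white", "black"):
    yes = sum(cell[0] for (_, d), cell in counts.items() if d == defendant)
    no = sum(cell[1] for (_, d), cell in counts.items() if d == defendant)
    marginal[defendant] = pct_yes(yes, no)

for (victims, defendant), pct in conditional.items():
    print(f"victims {victims:5s} | defendant {defendant:5s}: {pct:4.1f}% yes")
for defendant, pct in marginal.items():
    print(f"all victims   | defendant {defendant:5s}: {pct:4.1f}% yes")
```

Within each victims' race the percentage is higher for black defendants, while the marginal percentage is higher for white defendants.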
- The association between victims' race and defendant's race is extremely strong.
- Regardless of defendant's race, the death penalty was considerably more likely when the victims were white than when the victims were black.
- Whites tend to kill whites, and killing whites is more likely to result in the death penalty.
- Figure 3.2.
- The result that a marginal association can have a different direction from the conditional associations is called Simpson's paradox.
- This result applies to quantitative as well as categorical variables.


Conditional and Marginal Odds Ratios

- We illustrate for 2 × 2 × K tables, where K denotes the number of levels of a control variable Z.
- Let $\{n_{ijk}\}$ denote the observed frequencies and let $\{\mu_{ijk}\}$ denote their expected frequencies.
- Within a fixed level k of Z, $\theta_{XY(k)} = \frac{\mu_{11k}\,\mu_{22k}}{\mu_{12k}\,\mu_{21k}}$ describes the conditional X-Y association.
- The X-Y conditional odds ratios are $\{\theta_{XY(1)}, \theta_{XY(2)}, \ldots, \theta_{XY(K)}\}$ for the K partial tables.
- The X-Y marginal table has observed frequencies $\{n_{ij+} = \sum_k n_{ijk}\}$ and expected frequencies $\{\mu_{ij+} = \sum_k \mu_{ijk}\}$.
- The X-Y marginal odds ratio is $\theta_{XY} = \frac{\mu_{11+}\,\mu_{22+}}{\mu_{12+}\,\mu_{21+}}$.


- The sample estimate of $\theta_{XY(k)}$ is $\hat\theta_{XY(k)} = \frac{n_{11k}\,n_{22k}}{n_{12k}\,n_{21k}}$.
- The sample estimate of $\theta_{XY}$ is $\hat\theta_{XY} = \frac{n_{11+}\,n_{22+}}{n_{12+}\,n_{21+}}$.


Death Penalty Example for Z = White

         Y
X        Yes    No   Total
White     53   414     467
Black     11    37      48
Total     64   451     515

$\hat\theta_{XY(1)} = \frac{53 \times 37}{414 \times 11} = 0.43 < 1$


Death Penalty Example for Z = Black

         Y
X        Yes    No   Total
White      0    16      16
Black      4   139     143
Total      4   155     159

$\hat\theta_{XY(2)} = \frac{0 \times 139}{16 \times 4} = 0.0 < 1$, or, after adding 0.5 to each cell count, $\hat\theta_{XY(2)} = \frac{0.5 \times 139.5}{16.5 \times 4.5} = 0.94 < 1$


Death Penalty Example for the X-Y marginal table (summed over Z)

         Y
X        Yes    No   Total
White     53   430     483
Black     15   176     191
Total     68   606     674

$\hat\theta_{XY} = \frac{53 \times 176}{430 \times 15} = 1.45 > 1$


Death Penalty Example for the X-Z marginal table (summed over Y)

         X
Z        White   Black   Total
White      467      48     515
Black       16     143     159
Total      483     191     674

$\hat\theta_{XZ} = \frac{467 \times 143}{48 \times 16} = 87 > 1$
(Whites tend to kill whites, and blacks tend to kill blacks.)
Death Penalty Example for the Y-Z marginal table (summed over X)

         Y
Z        Yes    No   Total
White     64   451     515
Black      4   155     159
Total     68   606     674

$\hat\theta_{YZ} = \frac{64 \times 155}{451 \times 4} = 5.5 > 1$
(Killing whites is more likely to result in the death penalty.)
- The reversal relates to the nature of the association between the control variable Z and each of the other variables, X and Y. Z is a confounding factor.
- It can be misleading to analyze only a marginal table of a multi-way contingency table.
- When we add results across the levels of the control variable Z to get a summary result for the marginal effect of the explanatory variable X on the response variable Y, the cells with larger counts, which contain more cases, have greater influence.
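The conditional and marginal odds ratios for the death penalty data can be checked with a few lines of Python. This is a minimal sketch; the helper `odds_ratio` and its `add` argument (a constant added to every cell, used here for the 0.5 adjustment) are illustrative names rather than anything defined in the slides.

```python
def odds_ratio(n11, n12, n21, n22, add=0.0):
    """Sample odds ratio (n11 * n22) / (n12 * n21).

    `add` is an optional constant added to every cell before computing,
    e.g. 0.5 when a cell count is zero.
    """
    a, b, c, d = (x + add for x in (n11, n12, n21, n22))
    return (a * d) / (b * c)


# Partial tables: rows = defendant (white, black), columns = verdict (yes, no).
white_victims = (53, 414, 11, 37)   # Z = white
black_victims = (0, 16, 4, 139)     # Z = black

# The marginal table is the cell-by-cell sum of the partial tables.
marginal = tuple(w + b for w, b in zip(white_victims, black_victims))

theta_1 = odds_ratio(*white_victims)
theta_2 = odds_ratio(*black_victims)
theta_2_adj = odds_ratio(*black_victims, add=0.5)
theta_marginal = odds_ratio(*marginal)

print(f"conditional, Z = white:        {theta_1:.2f}")
print(f"conditional, Z = black:        {theta_2:.2f}")
print(f"conditional, Z = black (+0.5): {theta_2_adj:.2f}")
print(f"marginal:                      {theta_marginal:.2f}")
```

Both conditional odds ratios fall below 1 while the marginal odds ratio exceeds 1, which is the reversal described as Simpson's paradox.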

Marginal versus Conditional Independence

- If X and Y are independent in each partial table, then X and Y are said to be conditionally independent, given Z.
  - All conditional odds ratios between X and Y then equal 1.
- Conditional independence of X and Y, given Z, does not imply marginal independence of X and Y.
  - When the odds ratios between X and Y equal 1 at each level of Z, the marginal odds ratio may differ from 1.


Conditional Independence Does Not Imply Marginal Independence

                      Response
Clinic   Treatment   Success   Failure
1        A              18        12
         B              12         8
2        A               2         8
         B               8        32
Total    A              20        20
         B              20        40


- Y = response (success, failure), X = drug treatment (A, B), and Z = clinic (1, 2).
- $\hat\theta_{XY(1)} = \frac{18 \times 8}{12 \times 12} = 1$ and $\hat\theta_{XY(2)} = \frac{2 \times 32}{8 \times 8} = 1$
- $\hat\theta_{XY} = \frac{20 \times 40}{20 \times 20} = 2 > 1$


X-Z marginal table (summed over Y)

         X
Z        A     B   Total
1       30    20      50
2       10    40      50
Total   40    60     100

$\hat\theta_{XZ} = \frac{30 \times 40}{20 \times 10} = 6 > 1$
(Clinic 1 tends to use treatment A more often.)
Y-Z marginal table (summed over X)

         Y
Z        Success   Failure   Total
1           30        20        50
2           10        40        50
Total       40        60       100

$\hat\theta_{YZ} = \frac{30 \times 40}{20 \times 10} = 6 > 1$
(Clinic 1 also tends to have more successes.)
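The clinic example can be verified numerically. The sketch below, with illustrative names (`clinics`, `pooled`, `odds_ratio`), computes the conditional odds ratio within each clinic and the marginal odds ratio after pooling the clinics.

```python
# clinic -> treatment -> (successes, failures), from the clinic table
clinics = {
    1: {"A": (18, 12), "B": (12, 8)},
    2: {"A": (2, 8), "B": (8, 32)},
}


def odds_ratio(table):
    """Odds of success for treatment A relative to treatment B."""
    (a, b), (c, d) = table["A"], table["B"]
    return (a * d) / (b * c)


# Conditional odds ratios: one per clinic (level of Z).
conditional = {clinic: odds_ratio(table) for clinic, table in clinics.items()}

# Marginal table: add successes and failures over clinics for each treatment.
pooled = {
    trt: tuple(sum(clinics[c][trt][j] for c in clinics) for j in (0, 1))
    for trt in ("A", "B")
}
marginal = odds_ratio(pooled)

print("conditional odds ratios:", conditional)
print("pooled table:", pooled)
print("marginal odds ratio:", marginal)
```

Treatment and response are independent within each clinic, yet the pooled table suggests an association, because clinic is related both to the treatment used and to the success rate.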

Homogeneous Association

- There is homogeneous X-Y association in a 2 × 2 × K table when $\theta_{XY(1)} = \theta_{XY(2)} = \cdots = \theta_{XY(K)}$.
- The effect of X on Y is then the same at each level of Z, and a single number describes the X-Y conditional associations.
- Conditional independence of X and Y is the special case in which each conditional odds ratio equals 1.0.
- Homogeneous X-Y association in an I × J × K table means that any conditional odds ratio formed using two levels of X and two levels of Y is the same at each level of Z.
- When the X-Y conditional odds ratios are identical at each level of Z, the same property holds for the other associations. For instance, the conditional odds ratio between two levels of X and two levels of Z is identical at each level of Y.
- Homogeneous association is a symmetric property, applying to any pair of the variables viewed across the levels of the third.
- When it occurs, there is said to be no interaction between two variables in their effects on the third variable.


- When homogeneous association does not exist, the conditional odds ratio for any pair of variables changes across levels of the third variable.
- For X = smoking (yes, no), Y = lung cancer (yes, no), and Z = age (< 45, 45-65, > 65), suppose $\theta_{XY(1)} = 1.2$, $\theta_{XY(2)} = 2.8$, $\theta_{XY(3)} = 6.2$. Then smoking has a weak effect on lung cancer for young people, but the effect strengthens considerably with age.

Cochran-Mantel-Haenszel Methods

- This section presents a test of conditional independence and a test of homogeneous association for the K conditional odds ratios in 2 × 2 × K tables.
- It also shows how to combine the sample odds ratios from the K partial tables into a single summary measure of partial association.
- Analyses of conditional association are relevant in most applications having multivariate data.


The Cochran-Mantel-Haenszel Test

- For 2 × 2 × K tables:
  - $H_0$: X and Y are conditionally independent, given Z.
  - Equivalently, $H_0$: $\theta_{XY(k)} = 1$, $k = 1, 2, \ldots, K$.


- The standard sampling models treat the cell counts as
  (1) independent Poisson variates, or
  (2) multinomial counts with fixed overall sample size, or
  (3) multinomial counts with fixed sample size for each partial table, with counts in different partial tables being independent, or
  (4) independent binomial samples within each partial table, with row totals fixed.


- In partial table k, the row totals are $\{n_{1+k}, n_{2+k}\}$ and the column totals are $\{n_{+1k}, n_{+2k}\}$.
- Given both sets of totals, all these sampling schemes yield a hypergeometric distribution for the count $n_{11k}$ in the cell in the first row and first column.


- Under the null hypothesis,

  $\mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\,n_{+1k}}{n_{++k}}$

  $\mathrm{Var}(n_{11k}) = \frac{n_{1+k}\,n_{2+k}\,n_{+1k}\,n_{+2k}}{n_{++k}^2\,(n_{++k} - 1)}$
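As a quick check of the two formulas above, the following sketch computes the null mean and variance of the first cell for a single 2 × 2 table. The function name `null_mean_var` is hypothetical, and the counts used are the Beijing row of the Chinese smoking table that appears later in these slides.

```python
def null_mean_var(n11, n12, n21, n22):
    """Hypergeometric mean and variance of n11 under conditional independence,
    given the row and column totals of one partial table."""
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    n = row1 + row2
    mean = row1 * col1 / n
    var = row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    return mean, var


# Beijing: smokers (126 cases, 100 controls), nonsmokers (35 cases, 61 controls)
mean, var = null_mean_var(126, 100, 35, 61)
print(f"mu_11k = {mean:.1f}, Var(n_11k) = {var:.1f}")
```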


- $H_0$: $\theta_{XY(1)} = \theta_{XY(2)} = \cdots = \theta_{XY(K)} = 1$ vs. $H_a$: $\theta_{XY(k)} \neq 1$ for at least one partial table k.
- The test statistic is called the Cochran-Mantel-Haenszel (CMH) statistic:

  $\mathrm{CMH} = \frac{\left[\sum_k (n_{11k} - \mu_{11k})\right]^2}{\sum_k \mathrm{Var}(n_{11k})} \overset{\cdot}{\sim} \chi^2_1$ for large n.

- Reject $H_0$ at significance level $\alpha$ if $\mathrm{CMH} > \chi^2_{1,\alpha}$.


- The CMH test is inappropriate when the association varies dramatically among the partial tables.
- The CMH test works best when the X-Y association is similar in each partial table.
- When the true association is similar in each table, the CMH test is more powerful than separate tests within each table.
- Simpson's paradox revealed the dangers of collapsing three-way tables.

Lung Cancer

Chinese Smoking and Lung Cancer Study

                          Lung Cancer
City        Smoking       Yes    No    Odds Ratio   μ11k    Var(n11k)
Beijing     Smokers       126   100       2.20      113.0      16.9
            Nonsmokers     35    61
Shanghai    Smokers       908   688       2.14      773.2     179.3
            Nonsmokers    497   807
Shenyang    Smokers       913   747       2.18      799.3     149.3
            Nonsmokers    336   598
Nanjing     Smokers       235   172       2.85      203.5      31.1
            Nonsmokers     58   121
Harbin      Smokers       402   308       2.32      355.0      57.1
            Nonsmokers    121   215
Zhengzhou   Smokers       182   156       1.59      169.0      28.3
            Nonsmokers     72    98
Taiyuan     Smokers        60    99       2.37       53.0       9.0
            Nonsmokers     11    43
Nanchang    Smokers       104    89       2.00       96.5      11.0
            Nonsmokers     21    36

Source: Intern. J. Epidemiol., 21: 197-201 (1992)

Lung Cancer Meta Analysis Example

- Each study matched cases of lung cancer with controls not having lung cancer and then recorded whether each subject had ever been a smoker.
- In each partial table, we treat the counts in each column as a binomial sample, with column total fixed.
- We test the hypothesis of conditional independence between smoking and lung cancer.
- $H_0$: $\theta_{XY(1)} = \theta_{XY(2)} = \cdots = \theta_{XY(8)} = 1.0$
- The sample odds ratios all show a moderate positive association, so it makes sense to combine results through the CMH statistic.
- $\sum_k n_{11k} = 2930$, $\sum_k \mu_{11k} = 2562.5$, and $\sum_k \mathrm{Var}(n_{11k}) = 482.1$
- Test statistic: $\mathrm{CMH} = (2930.0 - 2562.5)^2 / 482.1 = 280.1 \overset{\cdot}{\sim} \chi^2_1$
- There is extremely strong evidence against conditional independence (P < .0001).
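The CMH computation above can be reproduced from the eight city tables. This is a minimal standard-library sketch; `tables` and `cmh_statistic` are illustrative names, and the P-value uses the identity that the upper tail of a chi-squared distribution with df = 1 at x equals erfc(sqrt(x/2)).

```python
import math

# (n11, n12, n21, n22) per city: smokers (cases, controls), nonsmokers (cases, controls)
tables = [
    (126, 100, 35, 61),    # Beijing
    (908, 688, 497, 807),  # Shanghai
    (913, 747, 336, 598),  # Shenyang
    (235, 172, 58, 121),   # Nanjing
    (402, 308, 121, 215),  # Harbin
    (182, 156, 72, 98),    # Zhengzhou
    (60, 99, 11, 43),      # Taiyuan
    (104, 89, 21, 36),     # Nanchang
]


def cmh_statistic(tables):
    """Return (CMH, sum of n11k, sum of mu11k, sum of Var(n11k))."""
    sum_n11 = sum_mu = sum_var = 0.0
    for n11, n12, n21, n22 in tables:
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        n = row1 + row2
        sum_n11 += n11
        sum_mu += row1 * col1 / n
        sum_var += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    cmh = (sum_n11 - sum_mu) ** 2 / sum_var
    return cmh, sum_n11, sum_mu, sum_var


cmh, sum_n11, sum_mu, sum_var = cmh_statistic(tables)
p_value = math.erfc(math.sqrt(cmh / 2))  # chi-squared upper tail, df = 1

print(f"sum n11k = {sum_n11:.0f}, sum mu11k = {sum_mu:.1f}, sum Var = {sum_var:.1f}")
print(f"CMH = {cmh:.1f}, P-value = {p_value:.3g}")
```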
- A statistical analysis that combines information from several studies is called a meta-analysis.
- The meta-analysis provides stronger evidence of an association than any single partial table gives by itself.


Estimation of Common Odds Ratio

- In a 2 × 2 × K table, suppose that $\theta_{XY(1)} = \cdots = \theta_{XY(K)}$.
- The Mantel-Haenszel estimator of that common odds ratio is

  $\hat\theta_{MH} = \frac{\sum_k (n_{11k}\,n_{22k} / n_{++k})}{\sum_k (n_{12k}\,n_{21k} / n_{++k})}$.

- The standard error for $\log(\hat\theta_{MH})$ has a complex formula (Agresti (1990), p. 236, or Agresti (2002), p. 234).
- PROC FREQ in SAS computes a standard error.
- §5.4.4 uses logit models to obtain an estimate and standard error.
- For the Chinese smoking studies, the Mantel-Haenszel odds ratio estimate equals

  $\hat\theta_{MH} = \frac{(126)(61)/(322) + \cdots + (104)(36)/(250)}{(35)(100)/(322) + \cdots + (21)(89)/(250)} = 2.17$.

- $\log(\hat\theta_{MH}) = 0.777$ (computed from the unrounded estimate; $\log(2.17)$ itself is 0.775)
- $\hat\sigma[\log \hat\theta_{MH}] = 0.046$
- An approximate 95% confidence interval for the common log odds ratio is 0.777 ± 1.96 × 0.046, or (0.686, 0.868).
- An approximate 95% confidence interval for the common odds ratio is (exp(0.686), exp(0.868)) = (1.98, 2.38).
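The Mantel-Haenszel estimate and the confidence interval above can be sketched as follows. The function name `mantel_haenszel` is illustrative. The standard error 0.046 is taken from the slide rather than computed, since the slides do not give its formula.

```python
import math

# (n11, n12, n21, n22) per city: smokers (cases, controls), nonsmokers (cases, controls)
tables = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def mantel_haenszel(tables):
    """Mantel-Haenszel estimate of a common odds ratio across 2 x 2 tables."""
    num = sum(n11 * n22 / (n11 + n12 + n21 + n22) for n11, n12, n21, n22 in tables)
    den = sum(n12 * n21 / (n11 + n12 + n21 + n22) for n11, n12, n21, n22 in tables)
    return num / den


theta_mh = mantel_haenszel(tables)
log_theta = math.log(theta_mh)

se_log = 0.046  # standard error of log(theta_MH) as reported on the slide (assumed, not derived)
z = 1.96        # normal quantile for an approximate 95% interval
ci_log = (log_theta - z * se_log, log_theta + z * se_log)
ci = (math.exp(ci_log[0]), math.exp(ci_log[1]))

print(f"theta_MH = {theta_mh:.2f}, log = {log_theta:.3f}")
print(f"95% CI for the log odds ratio: ({ci_log[0]:.3f}, {ci_log[1]:.3f})")
print(f"95% CI for the odds ratio:     ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the sketch carries full precision throughout, the last digit of the interval endpoints may differ slightly from the rounded values shown on the slide.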
- If the true odds ratios are not identical but do not vary drastically, $\hat\theta_{MH}$ still provides a useful summary of the K conditional associations.
- The CMH test is a powerful summary of evidence against the hypothesis of conditional independence, as long as the K conditional associations fall primarily in a single direction.


Testing Homogeneity of Odds Ratios

- A test of homogeneous association for 2 × 2 × K tables tests the hypothesis that the odds ratio between X and Y is the same at each level of Z.
- $H_0$: $\theta_{XY(1)} = \cdots = \theta_{XY(K)}$


- Let $\{\hat\mu_{11k}, \hat\mu_{12k}, \hat\mu_{21k}, \hat\mu_{22k}\}$ denote estimated expected frequencies in the kth partial table that have the same marginal totals as the observed data, yet have odds ratio equal to the Mantel-Haenszel estimate $\hat\theta_{MH}$ of a common odds ratio.
- The test statistic, $\sum_{i,j,k} \frac{(n_{ijk} - \hat\mu_{ijk})^2}{\hat\mu_{ijk}} \overset{\cdot}{\sim} \chi^2_{K-1}$, is called the Breslow-Day statistic.


- The closer the cell counts fall to the values having a common odds ratio, the smaller the statistic and the less the evidence against $H_0$.
- Calculation of the $\{\hat\mu_{ijk}\}$ satisfying a common odds ratio is complex.
- PROC FREQ in SAS reports this statistic.
- The sample size should be relatively large in each partial table, with $\hat\mu_{ijk} \geq 5$ in at least about 80% of the cells.
- For the Chinese smoking studies, software reports a Breslow-Day statistic equal to 5.2, based on df = 7, for which P = .64.
- This evidence does not contradict the hypothesis of equal odds ratios.
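For readers without SAS, the Breslow-Day statistic can be sketched in plain Python. This is one possible implementation under stated assumptions, not the software used in the slides: within each table, the fitted count for the first cell is found by solving the quadratic equation that gives a table with the observed margins and odds ratio equal to the Mantel-Haenszel estimate. The names `mh_estimate`, `fitted_n11`, and `breslow_day` are hypothetical, and Tarone's adjustment is not applied.

```python
import math

# (n11, n12, n21, n22) per city: smokers (cases, controls), nonsmokers (cases, controls)
tables = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def mh_estimate(tables):
    """Mantel-Haenszel estimate of the common odds ratio."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def fitted_n11(row1, col1, n, theta):
    """Fitted first-cell count with margins (row1, col1, n) and odds ratio theta.

    Solves a * (n - row1 - col1 + a) = theta * (row1 - a) * (col1 - a) for a.
    """
    if theta == 1.0:
        return row1 * col1 / n
    qa = 1.0 - theta
    qb = n - row1 - col1 + theta * (row1 + col1)
    qc = -theta * row1 * col1
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    roots = ((-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa))
    # Exactly one root gives four positive fitted counts.
    return next(a for a in roots if lo < a < hi)


def breslow_day(tables):
    """Return (Breslow-Day statistic, degrees of freedom K - 1)."""
    theta = mh_estimate(tables)
    stat = 0.0
    for n11, n12, n21, n22 in tables:
        row1, col1, n = n11 + n12, n11 + n21, n11 + n12 + n21 + n22
        a = fitted_n11(row1, col1, n, theta)
        fitted = (a, row1 - a, col1 - a, n - row1 - col1 + a)
        observed = (n11, n12, n21, n22)
        stat += sum((o - e) ** 2 / e for o, e in zip(observed, fitted))
    return stat, len(tables) - 1


bd, df = breslow_day(tables)
print(f"Breslow-Day statistic = {bd:.1f} on df = {df}")
```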


Some Caveats

- R. Tarone showed (Biometrika, 72: 91-95 (1985)) that the Breslow-Day statistic must be adjusted by subtracting

  $\frac{\left[\sum_k (n_{11k} - \hat\mu_{11k})\right]^2}{\sum_k \left[\frac{1}{\hat\mu_{11k}} + \frac{1}{\hat\mu_{12k}} + \frac{1}{\hat\mu_{21k}} + \frac{1}{\hat\mu_{22k}}\right]^{-1}}$

  to ensure that its distribution converges to chi-squared as the sample size increases.
- This adjustment is usually minor.
- If for each case-control pair we had information about whether each subject had been a smoker, we could form a 2 × 2 table relating whether the control had ever been a smoker (yes, no) to whether the case had ever been a smoker (yes, no).
- Chapter 9 discusses some analyses of such tables.
- §6.5.1 discusses an alternative test of homogeneity of odds ratios, based on models.
- §7.3 presents generalizations of the CMH test for I × J × K tables.

Exact Inference About Conditional Associations

- The chi-squared tests presented in the previous section are large-sample tests.
- It is difficult to provide general guidelines about how large n must be.
- The tests' adequacy depends more on the two-way marginal totals than on counts in the separate partial tables.
- In practice, small sample sizes are not problematic, since one can conduct exact inference about conditional associations.

Exact Test of Conditional Independence for 2 × 2 × K Tables

- For 2 × 2 × K tables, hypergeometric distributions in each partial table determine probabilities for $\{n_{11k}, k = 1, \ldots, K\}$.
- These determine the distribution of their sum, $\sum_k n_{11k}$.


- $H_0$: all conditional odds ratios $\{\theta_{XY(k)}\}$ equal 1, vs.
- $H_a$: at least one conditional odds ratio $\theta_{XY(k)} > 1$
- P-value $= P(\sum_k n_{11k} \geq \text{observed} \mid \text{fixed marginal totals})$


- $H_0$: all conditional odds ratios $\{\theta_{XY(k)}\}$ equal 1, vs.
- $H_a$: at least one conditional odds ratio $\theta_{XY(k)} < 1$
- P-value $= P(\sum_k n_{11k} \leq \text{observed} \mid \text{fixed marginal totals})$


- $H_0$: all conditional odds ratios $\{\theta_{XY(k)}\}$ equal 1, vs.
- $H_a$: at least one conditional odds ratio $\theta_{XY(k)} \neq 1$
- P-value $= \sum P^*$, where the sum runs over the probabilities $P^*$ of the possible values of $\sum_k n_{11k}$ that satisfy $P^* \leq P(\sum_k n_{11k} = \text{observed} \mid \text{fixed marginal totals})$
- Exact tests of conditional independence are computationally highly intensive.
- We used StatXact in the following example.


Partial Association
大綱
Cochran-Mantel-Haenszel Methods
Chapter 3 Three-Way Contingency Tables
Exact Inference About Conditional Associations

Promotion Discrimination Example


Promotion Decisions by Race and by Month

         July         August       September
         Promotions   Promotions   Promotions
Race     Yes   No     Yes   No     Yes   No
Black      0    7       0    7       0    8
White      4   16       4   13       2   13

Source: Statistical Reasoning in Law and Public Policy (San Diego: Academic Press, 1988), p. 266.

- We test conditional independence of promotion decision and race.
- The table contains several small counts.
- The overall sample size is not small ($n = 74$), but one marginal count (collapsing over month of decision) equals zero, so we might be wary of using the CMH test.



- $X$ = Race (Black, White), $Y$ = Promotion (Yes, No), and $Z$ = Month (July, August, September).
- $n_{111}$ can range between 0 and 4, $n_{112}$ between 0 and 4, and $n_{113}$ between 0 and 2.
- The total $\sum_k n_{11k}$ can take values between 0 and 10.
- The sample data represent the most extreme possible result in each of the three cases.

- $H_0$: $\theta_{XY(1)} = \theta_{XY(2)} = \theta_{XY(3)} = 1.0$, vs.
- $H_a$: an odds ratio less than 1.0 (the one-sided alternative).
- P-value $= P\left(\sum_k n_{11k} = 0 \mid \text{fixed marginal totals}\right) = .026$.


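- $H_0$: $\theta_{XY(1)} = \theta_{XY(2)} = \theta_{XY(3)} = 1.0$, vs.
- $H_a$: an odds ratio not equal to 1.0 (the two-sided alternative).
- P-value $= \sum p^* = .056$, summing over the outcomes whose probabilities satisfy $p^* \le P\left(\sum_k n_{11k} = 0 \mid \text{fixed marginal totals}\right)$.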
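
The one-sided and two-sided P-values for the promotion data can be reproduced with a short script. The following is a minimal sketch in Python, not the StatXact computation used in the slides; the names `TABLES`, `hypergeom_pmf`, and `sum_pmf` are hypothetical helpers introduced for illustration. It builds the hypergeometric distribution of $n_{11k}$ in each month, convolves the three, and sums the relevant probabilities using exact fractions.

```python
from fractions import Fraction
from math import comb

# Partial tables from the slide, one per month:
# (Black promoted, Black not promoted, White promoted, White not promoted)
TABLES = [(0, 7, 4, 16), (0, 7, 4, 13), (0, 8, 2, 13)]

def hypergeom_pmf(n11, n12, n21, n22):
    """Null distribution of n11 given one table's margins, as {value: probability}."""
    row1, col1, total = n11 + n12, n11 + n21, n11 + n12 + n21 + n22
    lo, hi = max(0, row1 + col1 - total), min(row1, col1)
    return {t: Fraction(comb(row1, t) * comb(total - row1, col1 - t), comb(total, col1))
            for t in range(lo, hi + 1)}

def sum_pmf(tables):
    """Distribution of sum_k n11k: convolution of the per-table distributions."""
    dist = {0: Fraction(1)}
    for table in tables:
        nxt = {}
        for s, p in dist.items():
            for t, q in hypergeom_pmf(*table).items():
                nxt[s + t] = nxt.get(s + t, Fraction(0)) + p * q
        dist = nxt
    return dist

dist = sum_pmf(TABLES)
observed = sum(t[0] for t in TABLES)  # observed sum of n11k, here 0

# One-sided (lower-tail) P-value: outcomes at or below the observed sum.
p_lower = sum(p for s, p in dist.items() if s <= observed)
# Two-sided P-value: outcomes no more likely than the observed one.
p_two_sided = sum(p for p in dist.values() if p <= dist[observed])

print(f"support: {min(dist)}..{max(dist)}")
print(f"one-sided P = {float(p_lower):.3f}, two-sided P = {float(p_two_sided):.3f}")
```

Under these assumptions the script should agree with the values reported above (.026 and .056), with the sum ranging over 0 to 10.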

Exact Confidence Interval for Common Odds Ratio

- For small samples, discreteness implies that exact tests are conservative (as discussed in §2.6.3).
- When $H_0$ is true, the P-value may fall below .05 less than 5% of the time.
- One can alleviate conservativeness by using the mid-P-value.
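
For reference, the mid-P-value counts only half the probability of the observed outcome. In the sketch below, $T$ denotes the test statistic and $t_{\text{obs}}$ its observed value; this notation is assumed here and does not appear in the slides. The second line applies the definition to the one-sided promotion test, where no outcome is more extreme than $\sum_k n_{11k} = 0$.

```latex
\text{mid-}P = \tfrac{1}{2}\,P(T = t_{\text{obs}}) + P(T \text{ more extreme than } t_{\text{obs}})

% One-sided promotion example: nothing lies below \sum_k n_{11k} = 0
\text{mid-}P = \tfrac{1}{2}(.026) + 0 = .013
```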

- One can construct “exact” confidence intervals for an assumed common value $\theta$ of $\{\theta_{XY(k)}\}$.
- Because of discreteness, these are also conservative.
- For a 95% “exact” confidence interval, the true confidence level is at least as large as .95, but is unknown.

- A more useful 95% confidence interval is the one containing the values $\theta_0$ whose mid-P-values exceed .05 in tests of $H_0$: $\theta = \theta_0$.

- For the U.S. government promotion decisions problem:
  - The Mantel-Haenszel estimate is $\hat{\theta}_{MH} = 0.0$.
  - StatXact reports an “exact” 95% confidence interval for a common odds ratio of (0, 1.01).
  - A 95% confidence interval based on correspondence with tests using the mid-P-value is (0, 0.78).
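
The two upper limits can be approximated by inverting the conditional test numerically. The sketch below is an illustration under stated assumptions, not StatXact's algorithm, which the slides do not describe: it assumes an equal-tail interval (probability .025 in each tail) and uses the conditional distribution $P(S = s;\theta) \propto c_s \theta^s$ for $S = \sum_k n_{11k}$, where $c_s$ counts the table configurations with sum $s$. The helper names (`table_coeffs`, `sum_coeffs`, `lower_tail`, `upper_limit`) are hypothetical. Because the observed sum is the smallest possible value, the lower limit is 0 and only the upper limit needs solving.

```python
from math import comb

# Partial tables from the promotion example, one per month:
# (Black promoted, Black not promoted, White promoted, White not promoted)
TABLES = [(0, 7, 4, 16), (0, 7, 4, 13), (0, 8, 2, 13)]

def table_coeffs(n11, n12, n21, n22):
    """Unnormalized weights C(row1, t) * C(row2, col1 - t) for each feasible n11 = t."""
    row1, row2, col1 = n11 + n12, n21 + n22, n11 + n21
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return {t: comb(row1, t) * comb(row2, col1 - t) for t in range(lo, hi + 1)}

def sum_coeffs(tables):
    """Coefficients c_s of the sum S, by convolving the per-table weights."""
    coeffs = {0: 1}
    for table in tables:
        nxt = {}
        for s, c in coeffs.items():
            for t, d in table_coeffs(*table).items():
                nxt[s + t] = nxt.get(s + t, 0) + c * d
        coeffs = nxt
    return coeffs

def lower_tail(theta, coeffs, s_obs, mid=False):
    """P(S <= s_obs; theta), or its mid-P version when mid=True."""
    weights = {s: c * theta ** s for s, c in coeffs.items()}
    total = sum(weights.values())
    below = sum(w for s, w in weights.items() if s < s_obs)
    return (below + (0.5 if mid else 1.0) * weights[s_obs]) / total

def upper_limit(coeffs, s_obs, mid=False, alpha=0.05):
    """Bisection for the theta at which the lower tail equals alpha / 2."""
    lo, hi = 0.0, 1.0
    while lower_tail(hi, coeffs, s_obs, mid) > alpha / 2:
        hi *= 2  # expand until the tail probability drops below alpha / 2
    for _ in range(100):
        m = (lo + hi) / 2
        if lower_tail(m, coeffs, s_obs, mid) > alpha / 2:
            lo = m
        else:
            hi = m
    return (lo + hi) / 2

coeffs = sum_coeffs(TABLES)
s_obs = sum(t[0] for t in TABLES)
upper_exact = upper_limit(coeffs, s_obs)
upper_mid = upper_limit(coeffs, s_obs, mid=True)
print(f"exact upper limit: {upper_exact:.2f}, mid-P upper limit: {upper_mid:.2f}")
```

If the equal-tail assumption matches what StatXact does, the two limits should land close to the reported 1.01 and 0.78.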

Exact Test of Homogeneity of Odds Ratios


- The Breslow-Day test of homogeneity of odds ratios (§3.2.4) is a large-sample test.
- When the total sample size is small, or when the total sample size is large but the number K of partial tables is large and individual tables have small sample sizes, this test is not valid.
- An exact test of homogeneity of odds ratios, sometimes called Zelen’s exact test, handles such cases.

- The exact distribution is calculated using the set of all 2 × 2 × K tables that have the same two-way marginal totals as the observed table.
- The P-value is the sum of probabilities of all 2 × 2 × K tables that are no more likely than the observed table.


- For the U.S. government promotion decisions problem:
  - The values $\{\hat{\mu}_{ijk}\}$ that yield the Mantel-Haenszel estimate $\hat{\theta}_{MH} = 0.0$ of the common odds ratio are identical to the observed counts in each partial table.
  - The Breslow-Day statistic contains terms of the form 0/0, so it is undefined.

- No table other than the observed table has all the two-way marginal totals of that table.
- Zelen’s exact test is therefore degenerate, giving P = 1.0.
- A disadvantage of exact inference is that the small-sample conditional distribution is often highly discrete.
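
The degeneracy can be checked by brute force. In the sketch below (helper names `n11_range` and `candidates` are hypothetical), fixing the race-by-month and promotion-by-month margins fixes the row and column totals of each partial table, so each table is determined by its $n_{11k}$; fixing the race-by-promotion margin then fixes $\sum_k n_{11k}$. Enumerating the combinations that satisfy both constraints gives the reference set for Zelen's test.

```python
from itertools import product

# Partial tables from the promotion example, one per month:
# (Black promoted, Black not promoted, White promoted, White not promoted)
TABLES = [(0, 7, 4, 16), (0, 7, 4, 13), (0, 8, 2, 13)]

def n11_range(n11, n12, n21, n22):
    """Feasible values of n11 given the row and column totals of one partial table."""
    row1, row2, col1 = n11 + n12, n21 + n22, n11 + n21
    return range(max(0, col1 - row2), min(row1, col1) + 1)

# The race-by-promotion margin fixes the sum of n11k at its observed value.
observed_sum = sum(t[0] for t in TABLES)

# Keep every (n111, n112, n113) that matches all two-way margins.
candidates = [cells for cells in product(*(n11_range(*t) for t in TABLES))
              if sum(cells) == observed_sum]

print(candidates)  # [(0, 0, 0)]
```

Only the observed configuration survives, so its conditional probability is 1, which is why the test returns P = 1.0 here.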


Summary

- Partial association
- Large-sample significance tests for conditional independence and homogeneity
- Small-sample significance tests for conditional independence and homogeneity
