SAHADEB - Categorical - Data - Lecture3

The document discusses log-linear models for analyzing contingency tables. It introduces the general form of a contingency table and describes independence and saturated log-linear models. It explains how log-linear models link Poisson and multinomial distributions. The document also interprets parameters in independence models and discusses saturated models for 2x2 tables.

Categorical Data Analysis

APDS
IIM Calcutta

• Slides Adapted from Prof Ayanendranath Basu’s Class-notes


• R Programs and Data Sets in Textbook (Tan, He & Tu):
http://accda.sph.tulane.edu/r.html

1
Log-Linear Models for Contingency Tables
(Read: Agresti, pp. 314-326)
• We will develop models where the log of the cell
counts can be additively expressed as a function of
several parameters.
• We have already looked at Poisson Log-Linear
models.
• In cross-classified tables, the loglinear model
does not differentiate between the response and
explanatory variables. It treats both jointly as
responses, modeling log(μ_ij) for each combination of
levels (i, j).

2
General Form of the r × c Table
(Observations)

              Col 1   Col 2   ...   Col c   Total
Row Level 1   n11     n12     ...   n1c     R1
Row Level 2   n21     n22     ...   n2c     R2
...           ...     ...     ...   ...     ...
Row Level r   nr1     nr2     ...   nrc     Rr
Total         C1      C2      ...   Cc      T = n

Independence Model Expected Value: E_ij = R_i C_j / T


3
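The expected counts E_ij = R_i C_j / T can be computed directly from the margins; a minimal Python sketch (the 2 × 2 table below is hypothetical):

```python
# Expected cell counts under the independence model: E_ij = R_i * C_j / T.

def independence_expected(table):
    """Return the r x c matrix of expected counts R_i * C_j / T."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

counts = [[10, 20], [30, 40]]          # hypothetical 2 x 2 table of counts
expected = independence_expected(counts)
# Row and column totals of the expected table match the observed ones.
```

Note that the expected table preserves the observed row and column totals, which is exactly the fitting constraint of the independence model.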
General Form of the r × c Table
(Probabilities)

              Col 1   Col 2   ...   Col c   Total
Row Level 1   π11     π12     ...   π1c     π1+
Row Level 2   π21     π22     ...   π2c     π2+
...           ...     ...     ...   ...     ...
Row Level r   πr1     πr2     ...   πrc     πr+
Total         π+1     π+2     ...   π+c     1

4
Linking Poisson with Multinomial Dist.
(pp. 202-203, Text)
Result: Let Y1, ..., Yk be independent Poi(μ_i) r.v.'s. Then,
given (Y1 + ... + Yk) = n, the conditional joint distribution of
(Y1, ..., Yk) is multinomial with parameters:

n trials, and π_i = μ_i / Σ_{j=1}^k μ_j, i = 1, ..., k.

We may assume the observed multinomial cell counts {n_ij}
are generated by independent Poisson(μ_ij)
distributions and use the Poisson log-linear regression
model for the cell counts, taking π_ij = μ_ij / Σ_{k,l} μ_kl.

5
Linking Poisson with Multinomial Dist.
(pp. 202-203, Text)
Log(μ_i) = λ + λ_i, λ_I = 0;  Let π_i = μ_i / Σ_{k=1}^I μ_k = exp(λ_i) / Σ_{k=1}^I exp(λ_k)

If our interest lies in estimating the λ_i's (or π_i's), maximum
likelihood based on the Poisson approach produces the same
inference as that based on the multinomial approach. The Poisson
approach has an extra parameter to estimate, μ = Σ_{k=1}^I μ_k =
Σ_{k=1}^I exp(λ + λ_k), the MLE of which is n.

If n = number of subjects in the sample is treated as


random, then log-linear model on (Poisson) cell counts
applies. Similarly, in product binomial (multinomial)
sampling, if each group size is treated as random, then also
log-linear model applies.
6
Linking Poisson with Multinomial Dist.
(pp. 202-203, Text)
Log(μ_i) = λ + λ_i, λ_I = 0;
Let π_i = μ_i / Σ_{k=1}^I μ_k = exp(λ_i) / Σ_{k=1}^I exp(λ_k)

λ_i = log(μ_i / μ_I) = log(π_i / π_I) = log of odds of the i-th category
happening relative to category I.
7
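The Poisson-to-multinomial result can be checked numerically for k = 2, where the multinomial reduces to a binomial; a sketch with hypothetical Poisson means:

```python
# Numerical check of the conditioning result: if Y1 ~ Poi(mu1) and
# Y2 ~ Poi(mu2) independently, then given Y1 + Y2 = n, Y1 is
# Binomial(n, mu1 / (mu1 + mu2)).  Means mu1, mu2 are hypothetical.
from math import exp, factorial, comb

def poisson_pmf(mu, y):
    return exp(-mu) * mu**y / factorial(y)

mu1, mu2, n = 2.0, 3.0, 5        # hypothetical Poisson means and fixed total
pi = mu1 / (mu1 + mu2)           # the binomial cell probability

for y in range(n + 1):
    # conditional probability P(Y1 = y | Y1 + Y2 = n); the sum of
    # independent Poissons is Poisson(mu1 + mu2)
    cond = poisson_pmf(mu1, y) * poisson_pmf(mu2, n - y) / poisson_pmf(mu1 + mu2, n)
    binom = comb(n, y) * pi**y * (1 - pi)**(n - y)
    assert abs(cond - binom) < 1e-12
```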
Log-Linear Models:
Independence Model for 2-Way Table
π_ij = π_i+ π_+j

μ_ij = T π_ij = T π_i+ π_+j, T = sample size

Thus, μ_ij is of the form: μ_ij = α_i β_j

In the log scale, it has the additive form:

Log(μ_ij) = λ + λ_i^X + λ_j^Y

8
Log-Linear Models: Independence Model

log μ_ij = λ + λ_i^X + λ_j^Y

This is called the loglinear model of independence.

For identifiability, we impose constraints such as

λ_r^X = 0 and λ_c^Y = 0

9
Interpretation of Parameters:
Log-Linear Independence Model for r×2 tables
Let X and Y represent the explanatory and response
variable respectively. Suppose variable Y has 2 levels.

logit P(Y=1 | X=i) = log [ P(Y=1 | X=i) / P(Y=2 | X=i) ]
                   = log ( μ_i1 / μ_i2 ) = log μ_i1 − log μ_i2
                   = (λ + λ_i^X + λ_1^Y) − (λ + λ_i^X + λ_2^Y)
                   = λ_1^Y − λ_2^Y, independent of i
10
Interpretation of Parameters:
Log-Linear Models for r×2 tables

The final term does not depend on i:

logit(P(Y=1|X=i)) is identical at each level of i.

Thus, the loglinear model of independence implies:

logit(P(Y=1|X=i)) = α.

In each row i, the odds of response in column 1 (for Y=1)
equals:
exp(α) = exp(λ_1^Y − λ_2^Y)
11
Intentionally Kept Blank

12
Saturated Log-Linear Models
Statistically dependent variables satisfy a more complex
loglinear model:

log μ_ij = λ + λ_i^X + λ_j^Y + λ_ij^XY

The λ_ij^XY are interaction (or association) terms that
quantify the deviation from independence.

13
Saturated Model: 2 × 2 Table

log( (π11/π12) / (π21/π22) ) = log( π11 π22 / (π12 π21) )
  = log π11 + log π22 − log π12 − log π21
  = λ_11^XY + λ_22^XY − λ_12^XY − λ_21^XY
  = λ_11^XY   (if λ_22^XY = λ_12^XY = λ_21^XY = 0)

Thus, direct relations exist between the log odds ratio
and the interaction terms.

14
General Form of the r × c Table
(Expected Counts)

              Col 1   Col 2   ...   Col c   Total
Row Level 1   μ11     μ12     ...   μ1c     R1
Row Level 2   μ21     μ22     ...   μ2c     R2
...           ...     ...     ...   ...     ...
Row Level r   μr1     μr2     ...   μrc     Rr
Total         C1      C2      ...   Cc      T (= n)

15
General Log-Linear Models, r×c Tables
The number of parameters in the model

log μ_ij = λ + λ_i^X + λ_j^Y + λ_ij^XY

equals 1 + (r−1) + (c−1) + (r−1)×(c−1) = rc, the number of
cells. Thus this model perfectly describes any cell
frequency in the r×c contingency table.

Thus, this is the most general model for the r×c


contingency table, and is called the saturated model.

16
General Log-Linear Models, r×c Tables

log μ_ij = λ + λ_i^X + λ_j^Y + λ_ij^XY

• For identifiability, we impose the following natural constraints
on the interaction (association) terms:
λ_ij^XY = 0 whenever j = c, or i = r.

• Exactly [rc − (1 + r−1 + c−1)] = (r−1)×(c−1) of the interaction
terms are non-redundant in the saturated model.

• Tests of independence check whether all these (r−1)×(c−1)
interaction terms are equal to zero, so that the residual degrees
of freedom are (r−1)×(c−1) for LRT = −2 ln(max_{Ho} L / max L).
17
General Log-Linear Models
• In its general form, the log-linear model looks exactly like an
ANOVA model.
• However, unlike an ANOVA model, this is based on a table
of frequency-counts for distinct categories defined by
discrete variable(s), and not based on a table of
measurements on a continuous variable.
• Unlike in the ANOVA model, here for each category, the
individual "yes=1" observations are summed up and
presented as frequency-counts. It would be similar to an
ANOVA model situation where for the i-th "treatment",
instead of giving the ni individual observations {yij}, data are
presented as the sum of the ni observations, i.e., Σ_{j=1}^{ni} y_ij.

18
Independence Model: 2 × 2 Contingency
Table

log μ_ij = λ + λ_i^X + λ_j^Y

The independence model implies the log of the odds ratio = 0:

log( μ11 μ22 / (μ12 μ21) )
  = log μ11 + log μ22 − log μ12 − log μ21
  = 0
19
Interaction Terms in Log-Linear Models
• In 2 × 2 tables, or more generally in r×c tables for two
variables, unsaturated models cannot include
interaction terms.

• For tables with at least three variables, one can
include association terms in unsaturated models.

• Association terms can keep increasing in order, whereas
odds ratios are essentially two-factor terms.

20
Hierarchical Models
Consider the model:
log μ_ij = λ + λ_i^X + λ_j^Y + λ_ij^XY

The model is hierarchical, i.e., the model contains all lower
order terms composed from variables that are
contained in a higher order model term. If a lower
order term is absent from the model, or turns out to be
insignificant, the interpretation of the higher order
terms becomes questionable.

Non-hierarchical models are rarely sensible in practice.

Normally we restrict our attention to the highest
ordered terms in the model.
21
Alternative Parameter Constraints
The parameter constraints for the saturated model are
arbitrary. Different software use different conventions.

(i) We can set: λ_ic^XY = λ_rj^XY = 0, for all i and j;

or
(ii) we can set:
Σ λ_ij^XY = 0
when summed over either i or j.

22
Alternative Parameter Constraints
(p.317, Agresti, 2nd Ed)

Contrasts such as

λ_11^XY + λ_22^XY − λ_12^XY − λ_21^XY

which determine the odds ratios (or log odds ratios) are
unique. For instance, suppose that the log odds ratio
equals 2 in a 2 × 2 table.

The first type of constraints leads to:

λ_11^XY = 2 and λ_21^XY = λ_12^XY = λ_22^XY = 0.

The second type of constraints [λ_11^XY = λ_22^XY = −λ_12^XY = −λ_21^XY]
leads to: λ_11^XY = 0.5, λ_21^XY = −0.5, λ_12^XY = −0.5, λ_22^XY = 0.5
23
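The invariance of the log odds ratio under both conventions can be verified directly; a sketch starting from a hypothetical 2 × 2 table of cell means whose log odds ratio is exactly 2:

```python
# The two constraint conventions give different lambda values but the same
# log odds ratio.  Cell means below are hypothetical, chosen so that
# log(mu11*mu22/(mu12*mu21)) = 2.
from math import exp, log

mu = [[exp(2), 1.0], [1.0, 1.0]]
L = [[log(mu[i][j]) for j in range(2)] for i in range(2)]

# Convention (i): corner-point, lambda_ij^XY = 0 when i = 2 or j = 2;
# then lambda_11^XY equals the full interaction contrast.
lam11_corner = L[0][0] - L[0][1] - L[1][0] + L[1][1]

# Convention (ii): sum-to-zero over each index; lambda_ij^XY is the
# two-way ANOVA residual L_ij - Lbar_i. - Lbar_.j + Lbar_..
rbar = [sum(L[i]) / 2 for i in range(2)]
cbar = [sum(L[i][j] for i in range(2)) / 2 for j in range(2)]
gbar = sum(rbar) / 2
lam_sz = [[L[i][j] - rbar[i] - cbar[j] + gbar for j in range(2)]
          for i in range(2)]

log_or = lam11_corner                                  # = 2 under (i)
log_or_sz = lam_sz[0][0] + lam_sz[1][1] - lam_sz[0][1] - lam_sz[1][0]
# Under (ii), lambda_11^XY = 0.5, yet the contrast is still 2.
```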
Intentionally Kept Blank

24
Log-Linear Models: 3-Way I×J×K Tables
Change of notation to I, J and K.

log(μ_ijk) = λ + ...

where Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K π_ijk = 1.

Conditional on the sum n of the cell counts, Poisson loglinear
models for cell means {μ_ijk} become multinomial
models for cell probabilities {π_ijk = μ_ijk / Σ_{a,b,c} μ_abc}

25
3-Way Tables (Agresti, 2nd ed, p.318, Ch 8)
Source: Wright State University, Dayton, Ohio, USA
Subjects: Students in the Final Year of High School

[Agresti]
26
Mutual Independence
X, Y, Z are said to be mutually independent if
for all i, j and k,
πijk = πi++ π+j+ π++k

( μijk = nπijk = nπi++ π+j+ π++k ).

Thus, for expected cell frequencies μijk , this means

log(μijk)=λ+ λiX + λjY + λkZ


27
Joint Independence
(“Independence of One Factor”, p.215, Text)
Variable Y is said to be jointly independent of X and
Z if, for all j, and all (i,k),
πijk = πi+k π+j+

For the expected frequencies this means


log(μijk)= λ+ λiX+ λjY + λkZ + λikXZ

Note: While Y is jointly independent of (X, Z)


combinations, X and Z may be dependent. Mutual
Independence implies Joint Independence.
28
Conditional Independence
Variables X and Y are said to be conditionally independent, given
Z, if independence holds for each partial table for which the Z-
category is fixed.

Thus, π_ij|k = π_i+|k π_+j|k, for all i, j, k,

i.e., π_ijk = π_i+k π_+jk / π_++k, for all i, j, k

For the expected frequencies this means
log(μ_ijk) = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ik^XZ + λ_jk^YZ
This is weaker than joint independence.

29
Conditionally Independent Models
0. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_jk^YZ + λ_ik^XZ  (Homogeneous asso.)
1. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_jk^YZ  (Cond. Indep. of Z with X)
2. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY  (Joint Indep. of Z with (X & Y))
3. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z  (Mutual Indep.)

These models have, respectively, 0, 1, 2 and 3
conditionally independent pairs:
None;
(X,Z);
(X,Z), (Y,Z);
(X,Z), (Y,Z), (X,Y). 30
Homogeneous Association Model
• This model has all 2-factor interactions, but not the
three-factor interaction. That is, conditional on the
third variable, each pair has homogeneous association.
• For this model, conditional odds ratios between any
chosen pair of variables are identical at each category
or level of the third variable.
• However, note that the common conditional odds ratio
(i.e., OR^(k)_{ii',jj'}) does not depend on k, but may
vary with the various combinations (i, i', j, j') of levels of the
chosen pair of variables.
31
Homogeneous Association Model

log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_ik^XZ + λ_jk^YZ

This model does not have any conditionally
independent pairs.

32
Cochran-Mantel-Haenszel Test for no row by
column association in any of the 2×2 Tables
(Conditional Independence) (pp. 94-101)

33
List of 3-Way Models
(*). log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_ik^XZ + λ_jk^YZ + λ_ijk^XYZ;  (XYZ)
0. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_ik^XZ + λ_jk^YZ;  (XY, XZ, YZ)
1. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_jk^YZ;  (XY, YZ)
   log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_ik^XZ;  (XY, XZ)
   log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ik^XZ + λ_jk^YZ;  (XZ, YZ)
2. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY;  (XY, Z)
   log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_ik^XZ;  (XZ, Y)
   log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z + λ_jk^YZ;  (YZ, X)
3. log μ_ijk = λ + λ_i^X + λ_j^Y + λ_k^Z;  (X, Y, Z)
34
Probabilistic Forms of Conditionally
Independent Models

35
Calculation of Fitted Values

Mutual Independence (X, Y, Z): π_ijk = π_i++ π_+j+ π_++k

Joint Independence (XZ, Y): π_ijk = π_i+k π_+j+

Cond. Independence (XZ, YZ): π_ijk = π_i+k π_+jk / π_++k
36
Calculation of Fitted Values
Mutual Independence (X, Y, Z): π_ijk = π_i++ π_+j+ π_++k
Joint Independence (XZ, Y): π_ijk = π_i+k π_+j+
Cond. Independence (XZ, YZ): π_ijk = π_i+k π_+jk / π_++k
Homogeneous Association: π_ijk = ψ_ij φ_jk ω_ik

Mutual Independence (X, Y, Z): μ_ijk = n_i++ n_+j+ n_++k / n²

Joint Independence (XZ, Y): μ_ijk = n_i+k n_+j+ / n

Cond. Independence (XZ, YZ): μ_ijk = n_i+k n_+jk / n_++k

Homogeneous Association: Iterative Methods 37
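The "iterative methods" for the homogeneous association model are typically iterative proportional fitting (cycling through the three two-way margins); a minimal sketch on a hypothetical 2 × 2 × 2 table:

```python
# Iterative proportional fitting (IPF) sketch for the homogeneous-association
# model (XY, XZ, YZ), which has no closed-form fitted values.  Starting from
# mu = 1 and rescaling multiplicatively keeps mu in the form
# psi_ij * phi_ik * omega_jk throughout, so the fitted conditional odds
# ratios are automatically homogeneous.  The counts n are hypothetical.

def ipf(n, iters=100):
    I, J, K = len(n), len(n[0]), len(n[0][0])
    mu = [[[1.0] * K for _ in range(J)] for _ in range(I)]
    for _ in range(iters):
        for i in range(I):                 # match the XY margins n_ij+
            for j in range(J):
                scale = sum(n[i][j]) / sum(mu[i][j])
                for k in range(K):
                    mu[i][j][k] *= scale
        for i in range(I):                 # match the XZ margins n_i+k
            for k in range(K):
                scale = (sum(n[i][j][k] for j in range(J))
                         / sum(mu[i][j][k] for j in range(J)))
                for j in range(J):
                    mu[i][j][k] *= scale
        for j in range(J):                 # match the YZ margins n_+jk
            for k in range(K):
                scale = (sum(n[i][j][k] for i in range(I))
                         / sum(mu[i][j][k] for i in range(I)))
                for i in range(I):
                    mu[i][j][k] *= scale
    return mu

n = [[[10, 4], [3, 8]], [[5, 12], [6, 2]]]   # hypothetical 2 x 2 x 2 counts
mu = ipf(n)
# At convergence the fitted XY, XZ and YZ margins equal the observed ones,
# and the XY conditional odds ratio is the same at both levels of Z.
```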
(XY, XZ, YZ) Model: Interpreting Model Parameters
(Agresti, 2nd ed, p. 321)

θ_ij(k) = (π_ijk π_{i+1,j+1,k}) / (π_{i,j+1,k} π_{i+1,j,k}),  1 ≤ i ≤ I−1, 1 ≤ j ≤ J−1

(I−1)(J−1) odds ratios {θ_ij(k)} describe XY conditional association.
(J−1)(K−1) odds ratios {θ_(i)jk} describe YZ conditional association.
(I−1)(K−1) odds ratios {θ_i(j)k} describe XZ conditional association.

The two-factor parameters relate directly to conditional odds
ratios. For example, for model (XY, XZ, YZ), where the 3-factor
interaction is absent,

log(θ_ij(k)) = log( μ_ijk μ_{i+1,j+1,k} / (μ_{i,j+1,k} μ_{i+1,j,k}) )
             = λ_ij^XY + λ_{i+1,j+1}^XY − λ_{i,j+1}^XY − λ_{i+1,j}^XY
             = constant, for all values of k
38
Homogenous Association Model: Interpreting
Model Parameters
• Of the C(I,2) × C(J,2) possible odds ratios, there are (I − 1)(J − 1)
non-redundant odds ratios describing the association
between variables X and Y, at each of the K levels of a third
conditioning variable.
• Conditional independence of X and Y (given Z) means
that all of these conditional odds ratios are equal to 1.0.
• The logs of these odds ratios are functions of the parameters
of the homogeneous association model (see equation
(8.14) in Agresti), and the functions do not depend on
the level of the conditioning variable.

39
Intentionally Kept Blank

40
Alcohol, Cigarette, Marijuana (ACM) use
(Agresti, p. 323)

41
Calculation of Fitted Values
Mutual Independence (X, Y, Z): π_ijk = π_i++ π_+j+ π_++k
Joint Independence (XZ, Y): π_ijk = π_i+k π_+j+
Cond. Independence (XZ, YZ): π_ijk = π_i+k π_+jk / π_++k
Homogeneous Association: π_ijk = ψ_ij φ_jk ω_ik

Mutual Independence (X, Y, Z): μ_ijk = n_i++ n_+j+ n_++k / n²

Joint Independence (XZ, Y): μ_ijk = n_i+k n_+j+ / n

Cond. Independence (XZ, YZ): μ_ijk = n_i+k n_+jk / n_++k

Homogeneous Association: Iterative Methods 42
Model Fits (Expected Cell Counts) for ACM Data
(Agresti, p. 323)

43
ACM Data: Estimated Odds Ratios
Measuring Association (Agresti, 2nd ed, p. 323)

• Estimated odds ratios depend on the model fitted.
This highlights the importance of good model selection.
• For 3-way tables, XY marginal and conditional odds
ratios are identical if either Z and X, or Z and Y, are
conditionally independent [Agresti, 2nd ed, p. 358] 44
Calculation of Measures of Association
• The entry of 1.0 for the AC conditional association for
the model (AM, CM) of AC conditional independence is
the common value of the AC fitted odds ratio at the two
levels of M. Thus,
1.0 = (909.24 × 0.24) / (45.76 × 4.76) = (438.54 × 179.84) / (555.16 × 142.16)

45
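The slide's arithmetic can be checked directly from the fitted values it quotes:

```python
# Check of the slide's AC conditional odds ratios for model (AM, CM):
# conditional independence forces the fitted AC odds ratio to be 1.0
# at each level of M.  Fitted counts are as printed on the slide.
or_m_level1 = (909.24 * 0.24) / (45.76 * 4.76)
or_m_level2 = (438.54 * 179.84) / (555.16 * 142.16)
# Both equal 1.0 up to the rounding of the fitted values.
```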
Best Model for A-C-M Data

46
Log-Likelihood-Ratio Statistic (G2)

47
Output from Best Model (AC,AM,CM)

48
Output from Best Model (AC,AM,CM)

49
Intentionally Kept Blank

50
Calculation of Fitted Values (Agresti, p. 333)
For simplicity, derivations use the Poisson sampling model,
which does not require a constraint on the parameters as the
multinomial does. For three-way tables, the joint Poisson
probability of the cell counts {Y_ijk} is

For the general loglinear model (XYZ), the likelihood simplifies to

51
Calculation of Fitted Values (Agresti, p. 333)
For general loglinear model (XYZ), likelihood simplifies to

52
Chi-Squared Goodness of Fit Tests
(Agresti, p. 337-)

 Model goodness-of-fit statistics compare fitted cell


counts to sample counts. For Poisson GLMs, for
models with an intercept term, the deviance equals
the G2 statistic.
 With a fixed number of cells, G2 and X2 have
approximate chi-squared null distributions when
expected frequencies are large.
 The df equal the difference in dimension between
the alternative and null hypotheses. This equals the
difference between the number of parameters in the
general case and when the model holds.
53
Chi-Squared Goodness of Fit Tests: DF
(Agresti, p. 337-)

Example: For model (X, Y, Z),


df = IJK – [1+(I–1) + (J–1) + (K-1)] = IJK – I –J – K + 2.
(df = number of restrictions under Ho)

54
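The df formula above is simple enough to wrap as a one-liner; for instance, a hypothetical 2 × 2 × 3 table under model (X, Y, Z) leaves 12 − 2 − 2 − 3 + 2 = 7 degrees of freedom:

```python
# Residual degrees of freedom for the mutual independence model (X, Y, Z):
# df = IJK - [1 + (I-1) + (J-1) + (K-1)] = IJK - I - J - K + 2.

def df_mutual_independence(I, J, K):
    return I * J * K - (1 + (I - 1) + (J - 1) + (K - 1))

# e.g. df_mutual_independence(2, 2, 3) gives 7
```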
Intentionally Kept Blank

55
Loglinear-Logit Model Connection (Agresti, p. 330)

• Loglinear models treat categorical response variables
symmetrically, focusing on associations and interactions
in their joint distribution.
• Logit models describe how a single categorical
response depends on explanatory variables. The model types
seem distinct, but connections exist between them.
• For a loglinear model, forming logits on one response
helps to interpret the model. Moreover, logit models
with categorical explanatory variables have equivalent
loglinear models.

56
Loglinear-Logit Model Connection(Agresti, p. 330)
To understand implications of a loglinear model formula, form a
logit on one variable; e.g., consider the loglinear model (XY, XZ,
YZ). When Y is binary, its logit is

This logit has the additive form:

Summarizing logit models by their predictors, we denote it by


(X+Z). 57
Loglinear-Logit Model Connection (Agresti, p. 330)

• The loglinear model (XY, XZ, YZ) is equivalent to the logit model
(X+Z).
• The logit model does not assume anything about relationships
among the explanatory variables, so it allows an arbitrary interaction
pattern for them.
58
Loglinear-Logit Model Connection (Agresti, p. 330)

• The saturated loglinear model (XYZ) contains the three-factor
interaction term.
 When Y is a binary response, this model is equivalent to a logit
model with an interaction between the predictors X and Z. The
effect of X on Y depends on Z, implying the XY odds ratio varies
across categories of Z. This logit model is also saturated. 59
Loglinear-Logit Model Connection(Agresti, p. 330)
 When Y has several categories, similar correspondences
hold using baseline-category logit models.

 An advantage of the loglinear approach is its generality.


Loglinear models are most natural when at least two
variables are response variables. [The alcohol-cigarette-
marijuana example used loglinear models to study
association patterns among three response variables.]
• When only one variable is a response, it is more sensible to use
logit models directly.

60
Intentionally Kept Blank

61
The Likelihood Ratio Chi-Square

G² = 2 Σ_i Σ_j n_ij log( n_ij / μ_ij )

• This statistic has the same asymptotic null distribution
as Pearson's chi-square.

• This is also a very popular statistic for testing goodness
of fit.

• Most software routinely produce the value of G².
62
Intentionally Kept Blank
(beyond syllabus)

63
Political Affiliation Example

                          Political Affiliation
                       Rep    Dem    Indep   Total
College  Letters        34     61      16     111
         Engineering    31     19      17      67
         Agriculture    19     23      16      58
         Education      23     39      12      74
         Total         107    142      61     310

It is a 4 × 3 table.

We want to know if there is any evidence of the presence
of the interaction term. 64
Use of interaction plot
• Interaction plots help in looking for the presence of
interaction: the line plots along the levels of one of the
variables, drawn over the levels of the other variable, should be
parallel if the variables are independent.

65
Political Affiliation Example
• In this example, the interaction plot suggests
lack of independence.
• However, it also suggests that it is the
Engineering school which is primarily the
deviant group.
• Removal of the Engineering school may lead
to insignificance among the other three
schools.

66
Political Affiliation Example (All 4 Schools)

There is fairly strong evidence against independence.

Verify the following:

χ² = 16.16 (p-value = 0.0129)
G² = 16.39 (p-value = 0.0118)
Degrees of Freedom = (3 × 4) – [1 + (3 − 1) + (4 − 1)] = 6

67
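The figures above can be verified directly; a Python sketch computing Pearson's χ² and the likelihood-ratio G² for the 4 × 3 table:

```python
# Verification of the slide's test statistics for the political affiliation
# table (rows: Letters, Engineering, Agriculture, Education;
# columns: Rep, Dem, Indep).
from math import log

def independence_tests(table):
    """Pearson X^2, likelihood-ratio G^2 and df for an r x c table."""
    R = [sum(row) for row in table]
    C = [sum(col) for col in zip(*table)]
    T = sum(R)
    x2 = g2 = 0.0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            e = R[i] * C[j] / T          # expected count under independence
            x2 += (n - e) ** 2 / e
            g2 += 2 * n * log(n / e)
    df = (len(R) - 1) * (len(C) - 1)
    return x2, g2, df

table = [[34, 61, 16], [31, 19, 17], [19, 23, 16], [23, 39, 12]]
x2, g2, df = independence_tests(table)   # X^2 = 16.16, G^2 = 16.39, df = 6
```

Dropping the Engineering row from `table` reproduces the next slide's values (χ² ≈ 5.77, G² ≈ 5.54, df = 4).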
Political Affiliation Example
(with Engineering School Removed)

Verify the following:

χ² = 5.770 (p-value = 0.2170)
G² = 5.536 (p-value = 0.2366)
df = (3 × 3) – [1 + (3 − 1) + (3 − 1)] = 4

This shows that there is no evidence against
independence. This confirms that the main source of
interaction is the Engineering school.

68
Intentionally Kept Blank

69
School Adversity Example

                          Adversity of School Condition (k)
                           Low         Medium      High
Risk (j):                 N    R      N    R      N    R    Total
Classroom   Non-deviant  16    7     15   34      5    3      80
behavior (i) Deviant      1    1      3    8      1    3      17
Total                    17    8     18   42      6    6      97

This is an example of a 3 × 2 × 2 table. Can you
calculate the expected frequencies for the different
models here? [Christensen, p. 61]
70
Adversity Example: Partial Totals

                          Adversity of School Condition (k)
                           Low         Medium      High
Risk (j):                 N    R      N    R      N    R    Total
Classroom   Non-deviant  16    7     15   34      5    3      80
behavior (i) Deviant      1    1      3    8      1    3      17
Total                    17    8     18   42      6    6      97

n_1++ = 80    n_+12 = 18
n_2++ = 17    n_+22 = 42
n_+11 = 17    n_+13 = 6
n_+21 = 8     n_+23 = 6
n_+++ = 97 71
Different Presentation of Data

72
Higher Dimensional Tables
• As the dimension of the table becomes larger, the
analysis becomes complicated.
• If we can collapse over some of the variables without
losing the information on significant interaction
terms, it can make the analysis much easier.
• Inappropriate collapsing can lead to incorrect
inference.
• The best known example of inappropriate collapsing
is demonstrated by Simpson’s paradox.

73
ACM Data: Estimated Odds Ratios
Measuring Association (Agresti, p. 323)

• Estimated odds ratios depend on the model fitted.
This highlights the importance of good model selection.
• For 3-way tables, XY marginal and conditional odds
ratios are identical if either Z and X, or Z and Y, are
conditionally independent [Agresti, p. 358] 74
Intentionally Kept Blank
(Slides beyond this Optional)

75
Four-Way Contingency Tables (Agresti, p. 326)

Observations on 68,694 passengers in autos and light trucks


involved in accidents in the state of Maine in 1991, classified by
gender (G), location of accident (L), seat-belt use (S), and injury (I).
For each GL combination, the proportion of injuries was about
half as large for passengers wearing seat belts.
76
Four-Way Contingency Tables (Agresti, p. 326)
We illustrate models for higher dimensions using a four-way table with
variables W, X, Y, and Z. Interpretations are simplest when the
model has no three-factor interaction terms. Such models are
special cases of the model denoted by (WX, WY, WZ, XY, XZ, YZ):

Each pair of variables is conditionally dependent, with the same


odds ratios at each combination of categories of the other two
variables. An absence of a two-factor term implies conditional
independence, given the other two variables.
77
Four-Way Contingency Tables (Agresti, p. 326)
 A variety of models exhibit three-factor interaction. A
model could contain any of WXY, WXZ, WYZ, or XYZ
terms.
 For model (WXY,WZ, XZ, YZ) each pair of variables is
conditionally dependent, but at each level of Z the WX
association, the WY association, and the XY association
may vary across categories of the remaining variable.
The conditional association between Z and another
variable is homogeneous.
 The saturated model contains all the three-factor
terms plus a four-factor interaction term
78
Four-Way Contingency Tables (Agresti, p. 327)

 Model (G, I, L, S) of mutual independence fits very poorly. Model (GI, GL, GS, IL,
IS, LS) fits much better but still has a lack of fit (P < 0.001).
 Model (GIL, GIS, GLS, ILS) fits well (G2 = 1.3, df=1) but is complex and difficult to
interpret. This suggests studying models more complex than (GI, GL, GS, IL, IS, LS)
but simpler than (GIL, GIS, GLS, ILS).
79
Four-Way Contingency Tables (Agresti, p. 328)

• For model (GLS, GI, IL, IS), each pair of variables is conditionally dependent,
and at each category of I, the association between any two of the others varies
across categories of the remaining variable.
• For this model, it is inappropriate
to interpret the GL, GS, and LS two-factor terms on their own.
• Since I does
not occur in a three-factor interaction, the conditional odds ratio between I and
each variable (see the top portion of Table 8.10) is the same at each combination
of categories of the other two variables. 80
Four-Way Contingency Tables (Agresti, p. 328)

 When a model has a three-factor interaction term but no higher order term,
one can study the interaction by calculating fitted odds ratios between two
variables at each level of the third. One can do this at any levels of remaining
variables not involved in the interaction.
• The bottom portion of Table 8.10 illustrates this for model (GLS, GI, IL, IS). For
instance, the fitted GS odds ratio of 0.66 for (L = urban) refers to the fitted
values for urban accidents, both the four with (injury = no) and the four with
(injury = yes); for example, 0.66 = (7273.2 × 10,959.2)/(11,632.6 × 10,358.9).
81
82
Contrasts
Let q_ij, i = 1, ..., I, j = 1, ..., J, be any set of numbers
with the property that q_i+ = q_+j = 0. Then a contrast
of interactions may be expressed as

Σ_{i=1}^I Σ_{j=1}^J q_ij λ_ij^XY.

(This is also a contrast in the log of cell counts.)

83
Contrasts

log μ_ij − log μ_i'j − log μ_ij' + log μ_i'j'
  = λ_ij^XY − λ_i'j^XY − λ_ij'^XY + λ_i'j'^XY

This is a contrast in the log of the cell counts,
as also in the interaction terms.

What are the q values in the example below?

log μ_ij − log μ_i'j − log μ_ij' + log μ_i'j'
  = λ_ij^XY − λ_i'j^XY − λ_ij'^XY + λ_i'j'^XY
84
