
Lecture 22: Introduction to Log-linear Models

Dipankar Bandyopadhyay, Ph.D.

BMTRY 711: Analysis of Categorical Data Spring 2011


Division of Biostatistics and Epidemiology
Medical University of South Carolina

Lecture 22: Introduction to Log-linear Models – p. 1/59


Log-linear Models

• Log-linear models are a type of generalized linear model (GLM)

• A common use of a log-linear model is to model the cell counts of a contingency table
• The systematic component of the model describes how the expected cell counts vary as
a result of the explanatory variables
• Since the response of a log-linear model is the cell count, no measured variables are
considered the response

Lecture 22: Introduction to Log-linear Models – p. 2/59


Recap from Previous Lectures

• Let's suppose that we have an I × J × Z contingency table.

• That is, there are I rows, J columns, and Z layers.

(picture of cube)

Lecture 22: Introduction to Log-linear Models – p. 3/59


Conditional Independence

We want to explore the concepts of independence using a log-linear model.

But first, let's review some probability theory.

Recall, two variables A and B are independent if and only if

P (AB) = P (A) × P (B)

Also recall that Bayes' law states, for any two random variables,

P(A|B) = P(AB) / P(B)

and thus, when A and B are independent,

P(A|B) = P(A)P(B) / P(B) = P(A)

Lecture 22: Introduction to Log-linear Models – p. 4/59


Conditional Independence

Definitions:

In layer k, where k ∈ {1, 2, . . . , Z}, X and Y are conditionally independent at level k of Z
when

P(Y = j | X = i, Z = k) = P(Y = j | Z = k), ∀ i, j

If X and Y are conditionally independent at ALL levels of Z, then X and Y are
CONDITIONALLY INDEPENDENT given Z.

Lecture 22: Introduction to Log-linear Models – p. 5/59


Application of the Multinomial

Suppose that a single multinomial applies to the entire three-way table with cell probabilities
equal to
πijk = P (X = i, Y = j, Z = k)

Let

π·jk = Σ_i P(X = i, Y = j, Z = k)
     = P(Y = j, Z = k)

Then,
πijk = P (X = i, Z = k)P (Y = j|X = i, Z = k)

by application of Bayes law. (The event (Y = j) = A and (X = i, Z = k) = B).

Lecture 22: Introduction to Log-linear Models – p. 6/59


Then, if X and Y are conditionally independent at each level k of Z,

πijk = P (X = i, Z = k)P (Y = j|X = i, Z = k)


= πi·k P (Y = j|Z = k)
= πi·k P (Y = j, Z = k)/P (Z = k)
= πi·k π·jk /π··k

for all i, j, and k.
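
If each cell count has expected value µijk = n πijk (e.g., under a single multinomial with table total n, an assumption for this illustration), a worked consequence of this factorization is the additive decomposition

  log(µijk) = log(n) + log(πi·k) + log(π·jk) − log(π··k),

i.e., a sum of terms involving (X, Z) and (Y, Z) but no term involving X and Y jointly. This additive structure is exactly what the log-linear models below exploit.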

Lecture 22: Introduction to Log-linear Models – p. 7/59


(2 × 2) table

• Let's suppose we are interested in a (2 × 2) table for the moment


• Let X describe the row effect and Y describe the column effect
• If X and Y are independent, then

πij = πi· π·j

• Then the expected cell count for the ij-th cell would be

  nπij = µij = nπi· π·j

  Or, taking logs,

  log µij = λ + λ_i^X + λ_j^Y

• This model is called the log-linear model of independence
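
To see where this additive form comes from, note that under independence µij = n πi· π·j, so

  log(µij) = log(n) + log(πi·) + log(π·j),

which is a constant plus a term depending only on the row plus a term depending only on the column; relabeling these pieces as λ, λ_i^X, and λ_j^Y gives the model above (the particular values of the λ's depend on the identifiability constraints imposed later).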

Lecture 22: Introduction to Log-linear Models – p. 8/59


Interaction term

• In terms of a regression model, a significant interaction term indicates that the


response varies as a function of the combination of X and Y
• That is, changes in the response as a function of X require the specification of Y to
explain the change
• This implies that X and Y are NOT INDEPENDENT
• Let λ_ij^XY denote the interaction term

• Testing λ_ij^XY = 0 is a test of independence

Lecture 22: Introduction to Log-linear Models – p. 9/59


Log-linear Models for (2 × 2) tables

• Unifies all probability models discussed.


• We will use log-linear models to describe designs in which
1. Nothing is fixed (Poisson)
2. The total is fixed (multinomial sampling or double dichotomy)
3. One margin is fixed (prospective or case-control)
• Represents expected cell counts as functions of row and column effects and
interactions
• Makes no distinction between response and explanatory variables.
• Can be generalized to larger dimensions (R × C, 2 × 2 × 2, 2 × 2 × K, etc.)

Lecture 22: Introduction to Log-linear Models – p. 10/59


As before, for random counts, double dichotomy, prospective, and case-control designs:

                          Variable (Y)
                           1        2
Variable (X)    1         Y11      Y12      Y1+
                2         Y21      Y22      Y2+
                          Y+1      Y+2      Y++

Lecture 22: Introduction to Log-linear Models – p. 11/59


The expected counts are µjk = E(Yjk )
                          Variable (Y)
                           1        2
Variable (X)    1         µ11      µ12      µ1+
                2         µ21      µ22      µ2+
                          µ+1      µ+2      µ++

Lecture 22: Introduction to Log-linear Models – p. 12/59


Example

An example of such a (2 × 2) table is

Cold incidence among French skiers (Pauling, Proceedings of the National Academy of
Sciences, 1971).

OUTCOME

NO
|COLD | COLD | Total
T ---------+--------+--------+
R VITAMIN | | |
E C | 17 | 122 | 139
A | | |
T ---------+--------+--------+
M NO | | |
E VITAMIN | 31 | 109 | 140
N C | | |
T ---------+--------+--------+
Total 48 231 279

Regardless of how these data were actually collected, we have shown that the estimate of
the odds ratio is the same for all designs, as is the likelihood ratio test and Pearson’s
chi-square for independence.
Lecture 22: Introduction to Log-linear Models – p. 13/59
Using SAS Proc Freq

data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;

proc freq;
table vitc*cold / chisq measures;
weight count;
run;

Lecture 22: Introduction to Log-linear Models – p. 14/59


/* SELECTED OUTPUT */

Statistics for Table of vitc by cold

Statistic DF Value Prob


------------------------------------------------------
Chi-Square 1 4.8114 0.0283 (Pearson's)
Likelihood Ratio Chi-Square 1 4.8717 0.0273

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits


-----------------------------------------------------------------
Case-Control (Odds Ratio) 0.4900 0.2569 0.9343
Instead of just doing this analysis for a (2 × 2) table, we will now discuss a ‘log-linear’ model
for a (2 × 2) table

Lecture 22: Introduction to Log-linear Models – p. 15/59


Expected Counts

Expected cell counts µjk = E(Yjk ) for different designs

                                      Y = 1              Y = 2
X = 1   Poisson                       µ11                µ12                       (Poisson means)
        Double Dichotomy              n p11              n p12                     (table probs sum to 1)
        Prospective                   n1 p1              n1 (1 − p1)               (row probs sum to 1)
        Case-Control                  n1 π1              n2 π2                     (col probs sum to 1)
X = 2   Poisson                       µ21                µ22
        Double Dichotomy              n p21              n (1 − p11 − p12 − p21)
        Prospective                   n2 p2              n2 (1 − p2)
        Case-Control                  n1 (1 − π1)        n2 (1 − π2)

Lecture 22: Introduction to Log-linear Models – p. 16/59


Log-linear models

• Often, when you are not really sure how you want to model the data (conditional on the
total, conditional on the rows or conditional on the columns), you can treat the data as
if they are Poisson (the most general model) and use log-linear models to explore
relationships between the row and column variables.
• The most general model for a (2 × 2) table is a Poisson model (4 non-redundant
expected cell counts).
• Since the expected cell counts are always positive, we model µjk as an exponential
function of row and column effects:

µjk = exp(µ + λ_j^X + λ_k^Y + λ_jk^XY)

where

λ_j^X = j-th row effect

λ_k^Y = k-th column effect

λ_jk^XY = interaction effect in the j-th row and k-th column

Lecture 22: Introduction to Log-linear Models – p. 17/59


• Equivalently, we can write the model as a log-linear model:

  log(µjk) = µ + λ_j^X + λ_k^Y + λ_jk^XY

• Treating the 4 expected cell counts as non-redundant, we can write the model for µjk
  as a function of at most 4 parameters. However, in this model, there are 9 parameters,

  µ, λ_1^X, λ_2^X, λ_1^Y, λ_2^Y, λ_11^XY, λ_12^XY, λ_21^XY, λ_22^XY,

  but only four expected cell counts µ11, µ12, µ21, µ22.

Lecture 22: Introduction to Log-linear Models – p. 18/59


• Thus, we need to put constraints on the λ’s, so that only four are non-redundant.
• We will use the ‘reference cell’ constraints, in which we set any parameter with a ‘2’ in
the subscript to 0, i.e.,

  λ_2^X = λ_2^Y = λ_12^XY = λ_21^XY = λ_22^XY = 0,

leaving us with 4 unconstrained parameters

  µ, λ_1^X, λ_1^Y, λ_11^XY

as well as 4 expected cell counts:

  [µ11, µ12, µ21, µ22]

Lecture 22: Introduction to Log-linear Models – p. 19/59


Expected Cell Counts for the Model

• Again, the model for the expected cell count is written as

  µjk = exp(µ + λ_j^X + λ_k^Y + λ_jk^XY)

• In particular, given the constraints, we have:

  µ11 = exp(µ + λ_1^X + λ_1^Y + λ_11^XY)

  µ12 = exp(µ + λ_1^X)

  µ21 = exp(µ + λ_1^Y)

  µ22 = exp(µ)

Lecture 22: Introduction to Log-linear Models – p. 20/59


Regression Framework

• In terms of a regression framework, you write the model as

  [ log(µ11) ]   [ µ + λ_1^X + λ_1^Y + λ_11^XY ]   [ 1 1 1 1 ] [ µ        ]
  [ log(µ12) ] = [ µ + λ_1^X                   ] = [ 1 1 0 0 ] [ λ_1^X    ]
  [ log(µ21) ]   [ µ + λ_1^Y                   ]   [ 1 0 1 0 ] [ λ_1^Y    ]
  [ log(µ22) ]   [ µ                           ]   [ 1 0 0 0 ] [ λ_11^XY  ]

• i.e., you create dummy or indicator variables for the different categories:

  log(µjk) = µ + I(j = 1) λ_1^X + I(k = 1) λ_1^Y + I(j = 1, k = 1) λ_11^XY

  where

  I(A) = 1 if A is true, and 0 if A is not true.
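
As a minimal sketch of this regression formulation (not one of the original SAS examples), the indicator variables can be created by hand in a data step and passed to Proc Genmod without the CLASS statement. The data set "one" is the vitamin C data set used later in this lecture, and the variable names x1, y1, and xy11 are made up for illustration:

data two;
   set one;
   x1   = (vitc = 1);     /* I(j = 1): row indicator          */
   y1   = (cold = 1);     /* I(k = 1): column indicator       */
   xy11 = x1*y1;          /* I(j = 1, k = 1): interaction     */
run;

proc genmod data=two;
   model count = x1 y1 xy11 / dist=poisson link=log;  /* saturated (2 x 2) model */
run;

With the Pauling data, this hand-coded parameterization should reproduce the reference-cell estimates shown on the Proc Genmod output later in the lecture.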

Lecture 22: Introduction to Log-linear Models – p. 21/59


• For example,

  log(µ21) = µ + I(2 = 1) λ_1^X + I(1 = 1) λ_1^Y + I(2 = 1, 1 = 1) λ_11^XY

           = µ + 0 · λ_1^X + 1 · λ_1^Y + 0 · λ_11^XY

           = µ + λ_1^Y

Lecture 22: Introduction to Log-linear Models – p. 22/59


Interpretation of the λ’s

• We can solve for the λ’s in terms of the µjk ’s.

  log(µ22) = µ

  log(µ12) − log(µ22) = (µ + λ_1^X) − µ
                      = λ_1^X

  log(µ21) − log(µ22) = (µ + λ_1^Y) − µ
                      = λ_1^Y

Lecture 22: Introduction to Log-linear Models – p. 23/59


Odds Ratio

  log(OR) = log[ (µ11 µ22) / (µ21 µ12) ]

          = log(µ11) + log(µ22) − log(µ21) − log(µ12)

          = (µ + λ_1^X + λ_1^Y + λ_11^XY) + µ − (µ + λ_1^Y) − (µ + λ_1^X)

          = λ_11^XY

Important: the main parameter of interest is the log odds ratio, which equals λ_11^XY in this
model.

Lecture 22: Introduction to Log-linear Models – p. 24/59


• The model with the 4 parameters

  µ, λ_1^X, λ_1^Y, λ_11^XY

  is called the ‘saturated model’ since it has as many free parameters as possible for a
  (2 × 2) table, which has the four expected cell counts µ11, µ12, µ21, µ22.

Lecture 22: Introduction to Log-linear Models – p. 25/59


• Also, you will note that Agresti uses different constraints for the log-linear model,
  namely

  Σ_{j=1}^{2} λ_j^X = 0,

  Σ_{k=1}^{2} λ_k^Y = 0,

  and

  Σ_{j=1}^{2} λ_jk^XY = 0 for k = 1, 2

  and

  Σ_{k=1}^{2} λ_jk^XY = 0 for j = 1, 2

Lecture 22: Introduction to Log-linear Models – p. 26/59


• Agresti’s model is just a different parameterization for the ‘saturated model’. I think the
one we are using (Reference Category) is a little easier to work with.
• The log-linear model, as we have written it, makes no distinction between which margins
are fixed by design and which margins are random.
• Again, when you are not really sure how you want to model the data (conditional on
the total, conditional on the rows or conditional on the columns) or which model is
appropriate, you can use log-linear models to explore the data.
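
For comparison, under Agresti's sum-to-zero constraints the interaction terms of a (2 × 2) table satisfy λ_11^XY = λ_22^XY = −λ_12^XY = −λ_21^XY, so the same log odds ratio works out to

  log(OR) = λ_11^XY + λ_22^XY − λ_12^XY − λ_21^XY = 4 λ_11^XY,

rather than simply λ_11^XY as under the reference-cell constraints.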

Lecture 22: Introduction to Log-linear Models – p. 27/59


Parameters of interest for different designs and the MLE’s

• For all sampling plans, we are interested in testing independence:

  H0: OR = 1.

• As shown earlier for the log-linear model, the null is

  H0: λ_11^XY = log(OR) = 0.

• Depending on the design, some of the parameters of the log-linear model are actually
fixed by the design.
• However, for all designs, we can estimate the parameters (that are not fixed by the
design) with a Poisson likelihood, and get the MLE's of the parameters for all designs.
• This is because the kernel of the log-likelihood for any of these designs is the same.

Lecture 22: Introduction to Log-linear Models – p. 28/59


The different designs

Random Counts

• To derive the likelihood, note that

  P(Yjk = njk | Poisson) = exp(−µjk) µjk^njk / njk!

• Thus, the full likelihood is

  L = Π_j Π_k exp(−µjk) µjk^njk / njk!

• Or,

  l = − Σ_j Σ_k µjk + Σ_j Σ_k njk log(µjk) + K

• Or, in terms of the kernel, the Poisson log-likelihood is

  l* = −µ++ + y11 log(µ11) + y12 log(µ12) + y21 log(µ21) + y22 log(µ22)

Lecture 22: Introduction to Log-linear Models – p. 29/59


• Consider the log-linear model

  µjk = exp[µ + λ_j^X + λ_k^Y + λ_jk^XY]

• Then, substituting this into the log-likelihood, we get

  log[L(µ, λ_1^X, λ_1^Y, λ_11^XY)]
    = −µ++ + Σ_{j=1}^{2} Σ_{k=1}^{2} yjk [µ + λ_j^X + λ_k^Y + λ_jk^XY]
    = −µ++ + µ y++ + Σ_{j=1}^{2} λ_j^X yj+ + Σ_{k=1}^{2} λ_k^Y y+k + Σ_{j=1}^{2} Σ_{k=1}^{2} yjk λ_jk^XY
    = −µ++ + µ y++ + λ_1^X y1+ + λ_1^Y y+1 + λ_11^XY y11

  since we constrained all λ terms with a subscript of 2 to be 0.

Lecture 22: Introduction to Log-linear Models – p. 30/59


• Note, here, that the likelihood is a function of the parameters

  (µ, λ_1^X, λ_1^Y, λ_11^XY)

  and the random variables

  (y++, y1+, y+1, y11)

• The random variables

  (y++, y1+, y+1, y11)

  are called sufficient statistics, i.e., all the information from the data in the likelihood is
  contained in the sufficient statistics
• In particular, when taking derivatives of the log-likelihood to find the MLE, we will be
  solving for the estimate of (µ, λ_1^X, λ_1^Y, λ_11^XY) as a function of the sufficient statistics
  (y++, y1+, y+1, y11)
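
As a sketch of the result (the algebra is not shown on the slides): for this saturated model, setting the derivatives to zero gives fitted means equal to the observed counts, µ̂jk = yjk, so the MLE's have the closed form

  µ̂ = log(y22),   λ̂_1^X = log(y12 / y22),   λ̂_1^Y = log(y21 / y22),   λ̂_11^XY = log[ y11 y22 / (y12 y21) ],

each a function of the sufficient statistics alone.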

Lecture 22: Introduction to Log-linear Models – p. 31/59


Example

Cold incidence among French skiers (Pauling, Proceedings of the National Academy of
Sciences, 1971).

OUTCOME

NO
|COLD | COLD | Total
T ---------+--------+--------+
R VITAMIN | | |
E C | 17 | 122 | 139
A | | |
T ---------+--------+--------+
M NO | | |
E VITAMIN | 31 | 109 | 140
N C | | |
T ---------+--------+--------+
Total 48 231 279

Lecture 22: Introduction to Log-linear Models – p. 32/59


Poisson Log-linear Model

• For the Poisson likelihood, we write the log-linear model for the expected cell counts
as:

  [ log(µ11) ]   [ µ + λ_1^X + λ_1^Y + λ_11^XY ]   [ 1 1 1 1 ] [ µ        ]
  [ log(µ12) ] = [ µ + λ_1^X                   ] = [ 1 1 0 0 ] [ λ_1^X    ]
  [ log(µ21) ]   [ µ + λ_1^Y                   ]   [ 1 0 1 0 ] [ λ_1^Y    ]
  [ log(µ22) ]   [ µ                           ]   [ 1 0 0 0 ] [ λ_11^XY  ]

• We will use this in SAS Proc Genmod to obtain the estimates

Lecture 22: Introduction to Log-linear Models – p. 33/59


SAS PROC GENMOD

data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;

proc genmod data=one;


class vitc cold;                     /* CLASS automatically creates   */
                                     /* constraints, i.e., dummy      */
                                     /* variables                     */

model count = vitc cold vitc*cold /  /* can put interaction terms in  */
      link=log dist=poi;             /* directly                      */
run;

Lecture 22: Introduction to Log-linear Models – p. 34/59


/* SELECTED OUTPUT */

The GENMOD Procedure

Analysis Of Parameter Estimates

Parameter DF Estimate Std Err ChiSquare Pr>Chi

INTERCEPT 1 4.6913 0.0958 2398.9532 0.0000


VITC 1 1 0.1127 0.1318 0.7308 0.3926
VITC 2 0 0.0000 0.0000 . .
COLD 1 1 -1.2574 0.2035 38.1575 0.0000
COLD 2 0 0.0000 0.0000 . .
VITC*COLD 1 1 1 -0.7134 0.3293 4.6934 0.0303
VITC*COLD 1 2 0 0.0000 0.0000 . .
VITC*COLD 2 1 0 0.0000 0.0000 . .
VITC*COLD 2 2 0 0.0000 0.0000 . .
SCALE 0 1.0000 0.0000 . .

Lecture 22: Introduction to Log-linear Models – p. 35/59


Estimates

• From the SAS output, the estimates are:

  µ̂ = 4.6913

  λ̂_1^VITC = 0.1127

  λ̂_1^COLD = −1.2574

  λ̂_11^VITC,COLD = log(OR) = −0.7134

• The OR computed the “regular” way is

  log(OR) = log[ (17 · 109) / (31 · 122) ] = log(0.490) = −0.7134
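
These estimates simply re-express the observed counts, as expected for a saturated model:

  exp(4.6913) ≈ 109 = y22,   exp(4.6913 − 1.2574) ≈ 31 = y21,
  exp(4.6913 + 0.1127) ≈ 122 = y12,   exp(4.6913 + 0.1127 − 1.2574 − 0.7134) ≈ 17 = y11.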

Lecture 22: Introduction to Log-linear Models – p. 36/59


Double Dichotomy

• For the double dichotomy in which the data follow a multinomial, we first rewrite the
  log-likelihood

  l* = −µ++ + y11 log(µ11) + y12 log(µ12) + y21 log(µ21) + y22 log(µ22)

  in terms of the expected cell counts and the λ's:

  µ11 = n p11                    = exp(µ + λ_1^X + λ_1^Y + λ_11^XY)
  µ12 = n p12                    = exp(µ + λ_1^X)
  µ21 = n p21                    = exp(µ + λ_1^Y)
  µ22 = n (1 − p11 − p12 − p21)  = exp(µ)

Lecture 22: Introduction to Log-linear Models – p. 37/59


• Recall, the multinomial is a function of 3 probabilities

(p11 , p12 , p21 )

since p22 = 1 − p11 − p12 − p21 .


• Adding up the µjk's in terms of the n pjk's, it is pretty easy to see that

  µ++ = Σ_{j=1}^{2} Σ_{k=1}^{2} µjk = n

  (fixed by design), so that the first term in the log-likelihood, −µ++ = −n, is not a
  function of the unknown parameters for the multinomial.

Lecture 22: Introduction to Log-linear Models – p. 38/59


• Then, the multinomial probabilities can be written as

  pjk = µjk / n = µjk / µ++

• We can also write µ++ in terms of the λ's,

  µ++ = Σ_{j=1}^{2} Σ_{k=1}^{2} µjk
      = Σ_{j=1}^{2} Σ_{k=1}^{2} exp[µ + λ_j^X + λ_k^Y + λ_jk^XY]
      = exp[µ] Σ_{j=1}^{2} Σ_{k=1}^{2} exp[λ_j^X + λ_k^Y + λ_jk^XY]

Lecture 22: Introduction to Log-linear Models – p. 39/59


• Then, we can rewrite the multinomial probabilities as

  pjk = µjk / µ++

      = exp[µ + λ_j^X + λ_k^Y + λ_jk^XY] / Σ_{j=1}^{2} Σ_{k=1}^{2} exp[µ + λ_j^X + λ_k^Y + λ_jk^XY]

      = exp[µ] exp[λ_j^X + λ_k^Y + λ_jk^XY] / ( exp[µ] Σ_{j=1}^{2} Σ_{k=1}^{2} exp[λ_j^X + λ_k^Y + λ_jk^XY] )

      = exp[λ_j^X + λ_k^Y + λ_jk^XY] / Σ_{j=1}^{2} Σ_{k=1}^{2} exp[λ_j^X + λ_k^Y + λ_jk^XY],

  which is not a function of µ.

Lecture 22: Introduction to Log-linear Models – p. 40/59


The Multinomial

• We see that these probabilities do not depend on the parameter µ.

• In particular, for the multinomial, there are only three free probabilities

  (p11, p12, p21)

  and three parameters,

  (λ_1^X, λ_1^Y, λ_11^XY).

Lecture 22: Introduction to Log-linear Models – p. 41/59


• These probabilities could also have been determined by noting that, conditioning on
the table total n = Y++ , the Poisson random variables follow a conditional multinomial,

  (Y11, Y12, Y21 | Y++ = y++) ∼ Mult(y++, p11, p12, p21)

  with

  pjk = µjk / µ++

  which we showed above equals

  pjk = exp[λ_j^X + λ_k^Y + λ_jk^XY] / Σ_{j=1}^{2} Σ_{k=1}^{2} exp[λ_j^X + λ_k^Y + λ_jk^XY],

  and is not a function of µ.

Lecture 22: Introduction to Log-linear Models – p. 42/59


Obtaining MLE’s

Thus, to obtain the MLE's for (λ_1^X, λ_1^Y, λ_11^XY), we have 2 choices:

1. We can maximize the Poisson likelihood.
2. We can maximize the conditional multinomial likelihood.

• If the data are from a double dichotomy, the multinomial likelihood is not a function of
µ. Thus, if you use a Poisson likelihood to estimate the log-linear model when the data
are multinomial, the estimate of µ really is not of interest.
• We will use this in SAS Proc Catmod to obtain the estimates using the multinomial
likelihood.

Lecture 22: Introduction to Log-linear Models – p. 43/59


Multinomial Log-linear Model

• For the Multinomial likelihood in SAS Proc Catmod, we write the log-linear model for
the three probabilities (p11 , p12 , p21 ) as:

  p11 = exp(λ_1^X + λ_1^Y + λ_11^XY) / [ exp(λ_1^X + λ_1^Y + λ_11^XY) + exp(λ_1^X) + exp(λ_1^Y) + 1 ]

  p12 = exp(λ_1^X) / [ exp(λ_1^X + λ_1^Y + λ_11^XY) + exp(λ_1^X) + exp(λ_1^Y) + 1 ]

  p21 = exp(λ_1^Y) / [ exp(λ_1^X + λ_1^Y + λ_11^XY) + exp(λ_1^X) + exp(λ_1^Y) + 1 ]

Lecture 22: Introduction to Log-linear Models – p. 44/59


• Note that the denominator in each probability is

  Σ_{j=1}^{2} Σ_{k=1}^{2} exp[λ_j^X + λ_k^Y + λ_jk^XY]

  For j = k = 2 in this sum, we have the constraint that λ_2^X = λ_2^Y = λ_22^XY = 0, so that

  exp[λ_2^X + λ_2^Y + λ_22^XY] = e^0 = 1

• Using SAS Proc Catmod, we make the design matrix equal to the combinations of
  (λ_1^X, λ_1^Y, λ_11^XY) found in the exponential function in the numerators:

  [ λ_1^X + λ_1^Y + λ_11^XY ]   [ 1 1 1 ] [ λ_1^X    ]
  [ λ_1^X                   ] = [ 1 0 0 ] [ λ_1^Y    ]
  [ λ_1^Y                   ]   [ 0 1 0 ] [ λ_11^XY  ]

Lecture 22: Introduction to Log-linear Models – p. 45/59


SAS PROC CATMOD

data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;

proc catmod data=one;


model vitc*cold = ( 1 1 1,   /* 1st col = lambda^V  */
                    1 0 0,   /* 2nd col = lambda^C  */
                    0 1 0 ); /* 3rd col = lambda^VC */
weight count;
run;

Lecture 22: Introduction to Log-linear Models – p. 46/59


/* SELECTED OUTPUT */

Response Profiles

Response vitc cold


------------------------
1 1 1
2 1 2
3 2 1
4 2 2

Analysis of Maximum Likelihood Estimates

Standard Chi-
Effect Parameter Estimate Error Square Pr > ChiSq
---------------------------------------------------------------------
Model 1 0.1127 0.1318 0.73 0.3926
2 -1.2574 0.2035 38.16 <.0001
3 -0.7134 0.3293 4.69 0.0303

Lecture 22: Introduction to Log-linear Models – p. 47/59


Estimates

• From the SAS output, the estimates are:

  λ̂_1^VITC = 0.1127

  λ̂_1^COLD = −1.2574

  λ̂_11^VITC,COLD = log(OR) = −0.7134

  which are the same as for the Poisson log-linear model, and

  exp(−0.7134) = 0.49,

  which is the estimate obtained from PROC FREQ:


Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits


-----------------------------------------------------------------
Case-Control (Odds Ratio) 0.4900 0.2569 0.9343

Lecture 22: Introduction to Log-linear Models – p. 48/59


Prospective Study

• Now, suppose the data are from a prospective study or, equivalently, we condition on
  the row totals of the (2 × 2) table. Conditional on the row totals, n1 = Y1+ and
  n2 = Y2+ are fixed, and the total sample size is n++ = n1 + n2.
• Further, we are left with a likelihood that is a product of two independent row binomials:

  (Y11 | Y1+ = y1+) ∼ Bin(y1+, p1)

  where

  p1 = P[Y = 1 | X = 1] = µ11 / µ1+ = µ11 / (µ11 + µ12);

  and

  (Y21 | Y2+ = y2+) ∼ Bin(y2+, p2)

  where

  p2 = P[Y = 1 | X = 2] = µ21 / µ2+ = µ21 / (µ21 + µ22)

• And the conditional binomials are independent.

Lecture 22: Introduction to Log-linear Models – p. 49/59


• Conditioning on the rows, the log-likelihood kernel is

  l* = −(n1 + n2) + y11 log(n1 p1) + y12 log(n1 (1 − p1)) + y21 log(n2 p2) + y22 log(n2 (1 − p2))

• What are p1 and p2 in terms of the λ's?

• The probability of success for row 1 is

  p1 = µ11 / (µ11 + µ12)

     = exp(µ + λ_1^X + λ_1^Y + λ_11^XY) / [ exp(µ + λ_1^X + λ_1^Y + λ_11^XY) + exp(µ + λ_1^X) ]

     = exp(µ + λ_1^X) exp(λ_1^Y + λ_11^XY) / ( exp(µ + λ_1^X) [ exp(λ_1^Y + λ_11^XY) + 1 ] )

     = exp(λ_1^Y + λ_11^XY) / [ 1 + exp(λ_1^Y + λ_11^XY) ]

Lecture 22: Introduction to Log-linear Models – p. 50/59


• The probability of success for row 2 is

  p2 = µ21 / (µ21 + µ22)

     = exp(µ + λ_1^Y) / [ exp(µ + λ_1^Y) + exp(µ) ]

     = exp(µ) exp(λ_1^Y) / ( exp(µ) [ exp(λ_1^Y) + 1 ] )

     = exp(λ_1^Y) / [ 1 + exp(λ_1^Y) ]

• Now, conditional on the row totals (as in a prospective study), we are left with two free
  probabilities (p1, p2), and the conditional likelihood is a function of two free parameters
  (λ_1^Y, λ_11^XY).
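
As a numerical check with the vitamin C data (plugging in the estimates from the Poisson fit earlier), p̂1 = exp(−1.2574 − 0.7134) / [1 + exp(−1.2574 − 0.7134)] ≈ 0.122 = 17/139 and p̂2 = exp(−1.2574) / [1 + exp(−1.2574)] ≈ 0.221 = 31/140, the observed row proportions.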

Lecture 22: Introduction to Log-linear Models – p. 51/59


Logistic Regression

• Looking at the previous pages, the conditional probabilities of Y given X from the
log-linear model follow a logistic regression model:

  px = P[Y = 1 | X* = x*]

     = exp(λ_1^Y + λ_11^XY x*) / [ 1 + exp(λ_1^Y + λ_11^XY x*) ]

     = exp(β0 + β1 x*) / [ 1 + exp(β0 + β1 x*) ]

  where

  x* = 1 if x = 1, and 0 if x = 2,

  and

  β0 = λ_1^Y   and   β1 = λ_11^XY

Lecture 22: Introduction to Log-linear Models – p. 52/59


• From the log-linear model, we had that λ_11^XY is the log odds ratio, which, from the
  logistic regression, is β1.
• Note, the intercept in a logistic regression with Y as the response is the main effect of
  Y in the log-linear model:

  β0 = λ_1^Y

• The conditional probability px is not a function of µ or λ_1^X.

Lecture 22: Introduction to Log-linear Models – p. 53/59


Obtaining MLE’s

• Thus, to obtain the MLE's for (λ_1^Y, λ_11^XY), we have 3 choices:
  1. We can maximize the Poisson likelihood.
  2. We can maximize the conditional multinomial likelihood.
  3. We can maximize the row product binomial likelihood using a logistic regression
     package.
• If the data are from a prospective study, the product binomial likelihood is not a
  function of µ or λ_1^X.
• Thus, if you use a Poisson likelihood to estimate the log-linear model when the data
  are from a prospective study, the estimates of µ and λ_1^X are really not of interest.

Lecture 22: Introduction to Log-linear Models – p. 54/59


Revisiting the Cold Vitamin C Example

• We will let the ‘covariate’ be X = TREATMENT and the ‘outcome’ be Y = COLD.

• We will use SAS Proc Logistic to get the MLE's of the intercept

  β0 = λ_1^Y = λ_1^COLD

  and the log odds ratio

  β1 = λ_11^XY = λ_11^VITC,COLD

Lecture 22: Introduction to Log-linear Models – p. 55/59


SAS PROC LOGISTIC

data one;
input vitc cold count;
if vitc=2 then vitc=0;
if cold=2 then cold=0;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;

proc logistic data=one descending;  /* descending: model Pr(Y=1)              */

model cold = vitc / rl ;            /* rl gives 95% CI for OR                  */
freq count;                         /* tells SAS how many subjects             */
                                    /* each record in the dataset represents   */
run;

Lecture 22: Introduction to Log-linear Models – p. 56/59


/* SELECTED OUTPUT */

Analysis of Maximum Likelihood Estimates

Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq

Intercept 1 -1.2574 0.2035 38.1575 <.0001


vitc 1 -0.7134 0.3293 4.6934 0.0303

Wald Confidence Interval for Adjusted Odds Ratios

Effect Unit Estimate 95% Confidence Limits

vitc 1.0000 0.490 0.257 0.934

Estimates

• From the SAS output, the estimates are:

  β̂0 = λ̂_1^COLD = −1.2574

  β̂1 = λ̂_11^VITC,COLD = log(OR) = −0.7134

• These are the same as for the Poisson and multinomial log-linear models.
Lecture 22: Introduction to Log-linear Models – p. 57/59
Recap

• Except for combinatorial terms that are not functions of any unknown parameters, using
  µjk from the previous table, the kernel of the log-likelihood for any of these designs can
  be written as

  l* = −µ++ + y11 log(µ11) + y12 log(µ12) + y21 log(µ21) + y22 log(µ22)

• In this likelihood, the table total µ++ is actually known for all designs,

  Double Dichotomy     n
  Prospective          n1 + n2
  Case-Control         n1 + n2

  except for the Poisson, in which

  µ++ = E(Y++)

  is a parameter that must be estimated (i.e., the expected value of the sum of JK
  independent Poisson random variables).

Lecture 22: Introduction to Log-linear Models – p. 58/59


Recap

Key Points:
• We have introduced Log-linear models
• We have defined a parameter in the model to represent the OR
• We do not have an “outcome” per se
• If you can designate an outcome, you minimize the number of parameters estimated
• You should feel comfortable writing likelihoods. If not, you have 3 weeks to gain that
comfort
• Expect the final exam to have at least one likelihood problem

Lecture 22: Introduction to Log-linear Models – p. 59/59
