Lecture 22: Introduction To Log-Linear Models: Dipankar Bandyopadhyay, PH.D
[Figure: a cube, depicting the cells of a three-way contingency table]
Also recall that Bayes' Law states that, for any two events A and B,
$$P(A \mid B) = \frac{P(AB)}{P(B)}$$
and that, when A and B are independent,
$$P(A \mid B) = \frac{P(A)P(B)}{P(B)} = P(A)$$
Definitions:
Suppose that a single multinomial applies to the entire three-way table, with cell probabilities
equal to
$$\pi_{ijk} = P(X = i, Y = j, Z = k)$$
Let
$$\pi_{\cdot jk} = \sum_i P(X = i, Y = j, Z = k) = P(Y = j, Z = k)$$
Then,
$$\pi_{ijk} = P(X = i, Z = k)\,P(Y = j \mid X = i, Z = k)$$
Or,
$$\log \mu_{ij} = \lambda + \lambda^X_i + \lambda^Y_j$$
• Testing $\lambda^{XY}_{ij} = 0$ is a test of independence
Cold incidence among French skiers (Pauling, Proceedings of the National Academy of
Sciences, 1971).
OUTCOME
NO
|COLD | COLD | Total
T ---------+--------+--------+
R VITAMIN | | |
E C | 17 | 122 | 139
A | | |
T ---------+--------+--------+
M NO | | |
E VITAMIN | 31 | 109 | 140
N C | | |
T ---------+--------+--------+
Total 48 231 279
Regardless of how these data were actually collected, we have shown that the estimate of
the odds ratio is the same for all designs, as is the likelihood ratio test and Pearson’s
chi-square for independence.
Lecture 22: Introduction to Log-linear Models – p. 13/59
Using SAS Proc Freq
data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
proc freq;
table vitc*cold / chisq measures;
weight count;
run;
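As a cross-check on the Proc Freq output, here is a minimal Python sketch (an illustration, not part of the original SAS analysis) that computes the sample odds ratio, Pearson's chi-square, and the likelihood-ratio statistic for this table by hand:

```python
import math

# Observed 2x2 table: rows = vitamin C (yes/no), columns = cold (yes/no)
y = [[17, 122],
     [31, 109]]

n = sum(sum(r) for r in y)
row = [sum(r) for r in y]                    # row totals: 139, 140
col = [y[0][j] + y[1][j] for j in range(2)]  # column totals: 48, 231

# Expected counts under independence: (row total * column total) / n
e = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

# Pearson chi-square and likelihood-ratio (G^2) statistics, each on 1 df
X2 = sum((y[i][j] - e[i][j]) ** 2 / e[i][j] for i in range(2) for j in range(2))
G2 = 2 * sum(y[i][j] * math.log(y[i][j] / e[i][j]) for i in range(2) for j in range(2))

# Sample odds ratio
OR = (y[0][0] * y[1][1]) / (y[0][1] * y[1][0])

print(round(OR, 2), round(X2, 2), round(G2, 2))
```

Both statistics are referred to a chi-square distribution with 1 degree of freedom.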
                               Y = 1              Y = 2
X = 1  Poisson                 µ11                µ12                      (Poisson mean)
       Double Dichotomy        n·p11              n·p12                    (table probs sum to 1)
       Prospective             n1·p1              n1(1 − p1)               (row probs sum to 1)
       Case-Control            n1·π1              n2·π2                    (col probs sum to 1)
X = 2  Poisson                 µ21                µ22
       Double Dichotomy        n·p21              n(1 − p11 − p12 − p21)
       Prospective             n2·p2              n2(1 − p2)
       Case-Control            n1(1 − π1)         n2(1 − π2)
• Often, when you are not really sure how you want to model the data (conditional on the
total, conditional on the rows or conditional on the columns), you can treat the data as
if they are Poisson (the most general model) and use log-linear models to explore
relationships between the row and column variables.
• The most general model for a (2 × 2) table is a Poisson model (4 non-redundant
expected cell counts).
• Since the expected cell counts are always positive, we model µjk as an exponential
function of row and column effects:
$$\mu_{jk} = \exp(\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk})$$
where
$$\lambda^X_j = j\text{th row effect}$$
$$\lambda^Y_k = k\text{th column effect}$$
$$\lambda^{XY}_{jk} = \text{interaction effect in the } j\text{th row, } k\text{th column}$$
so that
$$\log(\mu_{jk}) = \mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}$$
• Treating the 4 expected cell counts as non-redundant, we can write the model for µjk
as a function of at most 4 parameters. However, in this model, there are 9 parameters,
$$\mu,\ \lambda^X_1, \lambda^X_2,\ \lambda^Y_1, \lambda^Y_2,\ \lambda^{XY}_{11}, \lambda^{XY}_{12}, \lambda^{XY}_{21}, \lambda^{XY}_{22},$$
but only four expected cell counts µ11, µ12, µ21, µ22.
To identify the model, we impose the reference-cell constraints
$$\lambda^X_2 = \lambda^Y_2 = \lambda^{XY}_{12} = \lambda^{XY}_{21} = \lambda^{XY}_{22} = 0,$$
leaving the four free parameters
$$\mu,\ \lambda^X_1,\ \lambda^Y_1,\ \lambda^{XY}_{11}.$$
Then, from
$$\mu_{jk} = \exp(\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}),$$
we obtain
$$\mu_{11} = \exp(\mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11})$$
$$\mu_{12} = \exp(\mu + \lambda^X_1)$$
$$\mu_{21} = \exp(\mu + \lambda^Y_1)$$
$$\mu_{22} = \exp(\mu)$$
$$\begin{bmatrix} \log(\mu_{11}) \\ \log(\mu_{12}) \\ \log(\mu_{21}) \\ \log(\mu_{22}) \end{bmatrix}
= \begin{bmatrix} \mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11} \\ \mu + \lambda^X_1 \\ \mu + \lambda^Y_1 \\ \mu \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \mu \\ \lambda^X_1 \\ \lambda^Y_1 \\ \lambda^{XY}_{11} \end{bmatrix}$$
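Because the design matrix above is triangular in structure, the parameters can be recovered from the log fitted means by simple back-substitution. A small Python sketch (illustrative, not the course's SAS code), using the fact that the saturated-model fitted means equal the observed counts from the vitamin C table:

```python
import math

# Saturated model: fitted means equal the observed counts from the vitamin C table
y11, y12, y21, y22 = 17.0, 122.0, 31.0, 109.0

# Back-solve the system implied by the 0/1 design matrix above
mu      = math.log(y22)                       # log(mu22) = mu
lamX1   = math.log(y12) - mu                  # log(mu12) = mu + lambdaX_1
lamY1   = math.log(y21) - mu                  # log(mu21) = mu + lambdaY_1
lamXY11 = math.log(y11) - mu - lamX1 - lamY1  # log(mu11) = mu + all three lambdas

print(round(lamX1, 4), round(lamY1, 4), round(lamXY11, 4))
# prints: 0.1127 -1.2574 -0.7134
```

Note that these match the SAS estimates reported later in the lecture.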
• i.e., you create dummy or indicator variables for the different categories.
where
$$I(A) = \begin{cases} 1 & \text{if } A \text{ is true} \\ 0 & \text{if } A \text{ is not true} \end{cases}$$
$$\log(\mu_{21}) = \mu + 0\cdot\lambda^X_1 + 1\cdot\lambda^Y_1 + 0\cdot\lambda^{XY}_{11} = \mu + \lambda^Y_1$$
$$\log(\mu_{22}) = \mu$$
$$\log(\mu_{12}) - \log(\mu_{22}) = (\mu + \lambda^X_1) - \mu = \lambda^X_1$$
$$\log(\mu_{21}) - \log(\mu_{22}) = (\mu + \lambda^Y_1) - \mu = \lambda^Y_1$$
$$\log(OR) = \log\left(\frac{\mu_{11}\mu_{22}}{\mu_{21}\mu_{12}}\right)
= \log(\mu_{11}) + \log(\mu_{22}) - \log(\mu_{21}) - \log(\mu_{12})$$
$$= (\mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11}) + \mu - (\mu + \lambda^Y_1) - (\mu + \lambda^X_1)
= \lambda^{XY}_{11}$$
Important: the main parameter of interest is the log odds ratio, which equals $\lambda^{XY}_{11}$ in this
model.

The model with parameters
$$\mu,\ \lambda^X_1,\ \lambda^Y_1,\ \lambda^{XY}_{11}$$
is called the 'saturated model', since it has as many free parameters as possible for a
(2 × 2) table, which has the four expected cell counts µ11, µ12, µ21, µ22.
Alternatively, one can impose the sum-to-zero constraints
$$\sum_{k=1}^{2} \lambda^X_k = 0,$$
and
$$\sum_{j=1}^{2} \lambda^{XY}_{jk} = 0 \text{ for } k = 1, 2$$
and
$$\sum_{k=1}^{2} \lambda^{XY}_{jk} = 0 \text{ for } j = 1, 2$$
$$H_0: OR = 1 \quad\text{is equivalent to}\quad H_0: \lambda^{XY}_{11} = \log(OR) = 0.$$
• Depending on the design, some of the parameters of the log-linear model are actually
fixed by the design.
• However, for all designs, we can estimate the parameters (that are not fixed by the
design) with a Poisson likelihood, and get the MLE’s of the parameters for all designs.
• This is because the kernel of the log-likelihood for any of these designs is the same.
Random Counts
$$P(Y_{jk} = n_{jk} \mid \text{Poisson}) = \frac{e^{-\mu_{jk}} \mu_{jk}^{n_{jk}}}{n_{jk}!}$$
• Or,
$$\ell = \sum_j \sum_k -\mu_{jk} + \sum_j \sum_k n_{jk} \log \mu_{jk} + K$$
Substituting
$$\mu_{jk} = \exp[\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}],$$
we obtain
$$\log[L(\mu, \lambda^X_1, \lambda^Y_1, \lambda^{XY}_{11})]
= -\mu_{++} + \sum_{j=1}^{2}\sum_{k=1}^{2} y_{jk}\,[\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]$$
$$= -\mu_{++} + \mu y_{++} + \sum_{j=1}^{2} \lambda^X_j y_{j+} + \sum_{k=1}^{2} \lambda^Y_k y_{+k} + \sum_{j=1}^{2}\sum_{k=1}^{2} y_{jk}\lambda^{XY}_{jk}$$
$$= -\mu_{++} + \mu y_{++} + \lambda^X_1 y_{1+} + \lambda^Y_1 y_{+1} + \lambda^{XY}_{11} y_{11}$$
• The statistics multiplying the parameters $(\mu, \lambda^X_1, \lambda^Y_1, \lambda^{XY}_{11})$ in this log-likelihood,
namely $(y_{++}, y_{1+}, y_{+1}, y_{11})$, are called sufficient statistics, i.e., all the information from
the data in the likelihood is contained in the sufficient statistics.
• In particular, when taking derivatives of the log-likelihood to find the MLE, we will be
solving for the estimate of $(\mu, \lambda^X_1, \lambda^Y_1, \lambda^{XY}_{11})$ as a function of the sufficient statistics
$(y_{++}, y_{1+}, y_{+1}, y_{11})$.
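The reduction of the log-likelihood kernel to sufficient statistics can be verified numerically. The Python sketch below (illustrative, with arbitrary parameter values) evaluates the kernel cell by cell and via $(y_{++}, y_{1+}, y_{+1}, y_{11})$, and confirms the two forms agree:

```python
import math

# Cell counts y_jk from the vitamin C table
y = {(1, 1): 17, (1, 2): 122, (2, 1): 31, (2, 2): 109}

# Arbitrary illustrative parameter values (any values satisfy the identity below)
mu, lamX1, lamY1, lamXY11 = 4.0, 0.5, -1.0, 0.3

def lin(j, k):
    """Linear predictor mu + lambdaX_j + lambdaY_k + lambdaXY_jk under reference-cell constraints."""
    return (mu + (lamX1 if j == 1 else 0.0) + (lamY1 if k == 1 else 0.0)
            + (lamXY11 if (j, k) == (1, 1) else 0.0))

mu_pp = sum(math.exp(lin(j, k)) for (j, k) in y)  # mu_++

# Kernel written cell by cell ...
l_full = -mu_pp + sum(y[jk] * lin(*jk) for jk in y)

# ... and via the sufficient statistics (y_++, y_1+, y_+1, y_11)
y_pp = sum(y.values())
y_1p = y[(1, 1)] + y[(1, 2)]
y_p1 = y[(1, 1)] + y[(2, 1)]
l_suff = -mu_pp + mu * y_pp + lamX1 * y_1p + lamY1 * y_p1 + lamXY11 * y[(1, 1)]

print(abs(l_full - l_suff) < 1e-9)
# prints: True
```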
Cold incidence among French skiers (Pauling, Proceedings of the National Academy of
Sciences, 1971).
OUTCOME
NO
|COLD | COLD | Total
T ---------+--------+--------+
R VITAMIN | | |
E C | 17 | 122 | 139
A | | |
T ---------+--------+--------+
M NO | | |
E VITAMIN | 31 | 109 | 140
N C | | |
T ---------+--------+--------+
Total 48 231 279
• For the Poisson likelihood, we write the log-linear model for the expected cell counts
as:
$$\begin{bmatrix} \log(\mu_{11}) \\ \log(\mu_{12}) \\ \log(\mu_{21}) \\ \log(\mu_{22}) \end{bmatrix}
= \begin{bmatrix} \mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11} \\ \mu + \lambda^X_1 \\ \mu + \lambda^Y_1 \\ \mu \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \mu \\ \lambda^X_1 \\ \lambda^Y_1 \\ \lambda^{XY}_{11} \end{bmatrix}$$
data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;
$$\hat\lambda^{VITC}_1 = 0.1127$$
$$\hat\lambda^{COLD}_1 = -1.2574$$
$$\hat\lambda^{VITC,COLD}_{11} = \log(OR) = -0.7134$$
• The OR computed the "regular" way is
$$\log(OR) = \log\left(\frac{17 \cdot 109}{31 \cdot 122}\right) = \log(0.49) = -0.7134$$
• For the double dichotomy in which the data follow a multinomial, we first rewrite the
log-likelihood
$$\mu_{++} = \sum_{j=1}^{2}\sum_{k=1}^{2} \mu_{jk} = n$$
(fixed by design), so that the first term in the log-likelihood, −µ++ = −n is not a
function of the unknown parameters for the multinomial.
$$p_{jk} = \frac{\mu_{jk}}{n} = \frac{\mu_{jk}}{\mu_{++}}$$
where
$$\mu_{++} = \sum_{j=1}^{2}\sum_{k=1}^{2} \mu_{jk}
= \sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]
= \exp[\mu] \sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]$$
Then
$$p_{jk} = \frac{\mu_{jk}}{\mu_{++}}
= \frac{\exp[\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]}{\sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\mu + \lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]}
= \frac{\exp[\mu]\exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]}{\exp[\mu]\sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]},$$
so that
$$p_{jk} = \frac{\exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]}{\sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}]},$$
• If the data are from a double dichotomy, the multinomial likelihood is not a function of
µ. Thus, if you use a Poisson likelihood to estimate the log-linear model when the data
are multinomial, the estimate of µ really is not of interest.
• We will use this in SAS Proc Catmod to obtain the estimates using the multinomial
likelihood.
• For the Multinomial likelihood in SAS Proc Catmod, we write the log-linear model for
the three probabilities (p11 , p12 , p21 ) as:
$$p_{11} = \frac{\exp(\lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11})}{\exp(\lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11}) + \exp(\lambda^X_1) + \exp(\lambda^Y_1) + 1}$$
$$p_{12} = \frac{\exp(\lambda^X_1)}{\exp(\lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11}) + \exp(\lambda^X_1) + \exp(\lambda^Y_1) + 1}$$
$$p_{21} = \frac{\exp(\lambda^Y_1)}{\exp(\lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11}) + \exp(\lambda^X_1) + \exp(\lambda^Y_1) + 1}$$
since, in the denominator
$$\sum_{j=1}^{2}\sum_{k=1}^{2} \exp[\lambda^X_j + \lambda^Y_k + \lambda^{XY}_{jk}],$$
the (2, 2) term is
$$\exp[\lambda^X_2 + \lambda^Y_2 + \lambda^{XY}_{22}] = e^0 = 1.$$
• Using SAS Proc Catmod, we make the design matrix equal to the combinations of
$(\lambda^X_1, \lambda^Y_1, \lambda^{XY}_{11})$ found in the exponential function in the numerators:
$$\begin{bmatrix} \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11} \\ \lambda^X_1 \\ \lambda^Y_1 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} \lambda^X_1 \\ \lambda^Y_1 \\ \lambda^{XY}_{11} \end{bmatrix}$$
data one;
input vitc cold count;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;
Response Profiles
Standard Chi-
Effect Parameter Estimate Error Square Pr > ChiSq
---------------------------------------------------------------------
Model 1 0.1127 0.1318 0.73 0.3926
2 -1.2574 0.2035 38.16 <.0001
3 -0.7134 0.3293 4.69 0.0303
$$\hat\lambda^{COLD}_1 = -1.2574$$
$$\hat\lambda^{VITC,COLD}_{11} = \log(OR) = -0.7134$$
$$e^{-0.7134} = 0.49$$
• Now, suppose the data are from a prospective study, or, equivalently, we condition on
the row totals of the (2 × 2) table. We know that, conditional on the row totals
n1 = Y1+ and n2 = Y2+ are fixed, and the total sample size is n++ = n1 + n2 .
• Further, we are left with a likelihood that is a product of two independent row binomials.
$$(Y_{11} \mid Y_{1+} = y_{1+}) \sim \text{Bin}(y_{1+}, p_1)$$
where
$$p_1 = P[Y = 1 \mid X = 1] = \frac{\mu_{11}}{\mu_{1+}} = \frac{\mu_{11}}{\mu_{11} + \mu_{12}};$$
and
$$(Y_{21} \mid Y_{2+} = y_{2+}) \sim \text{Bin}(y_{2+}, p_2)$$
where
$$p_2 = P[Y = 1 \mid X = 2] = \frac{\mu_{21}}{\mu_{2+}} = \frac{\mu_{21}}{\mu_{21} + \mu_{22}}$$
$$\ell^* = -(n_1 + n_2) + y_{11}\log(n_1 p_1) + y_{12}\log(n_1(1 - p_1)) + y_{21}\log(n_2 p_2) + y_{22}\log(n_2(1 - p_2))$$
$$p_1 = \frac{\mu_{11}}{\mu_{11} + \mu_{12}}
= \frac{\exp(\mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11})}{\exp(\mu + \lambda^X_1 + \lambda^Y_1 + \lambda^{XY}_{11}) + \exp(\mu + \lambda^X_1)}$$
$$= \frac{\exp(\mu + \lambda^X_1)\exp(\lambda^Y_1 + \lambda^{XY}_{11})}{\exp(\mu + \lambda^X_1)[\exp(\lambda^Y_1 + \lambda^{XY}_{11}) + 1]}
= \frac{\exp(\lambda^Y_1 + \lambda^{XY}_{11})}{1 + \exp(\lambda^Y_1 + \lambda^{XY}_{11})}$$
$$p_2 = \frac{\mu_{21}}{\mu_{21} + \mu_{22}}
= \frac{\exp(\mu + \lambda^Y_1)}{\exp(\mu + \lambda^Y_1) + \exp(\mu)}$$
$$= \frac{\exp(\mu)\exp(\lambda^Y_1)}{\exp(\mu)[\exp(\lambda^Y_1) + 1]}
= \frac{\exp(\lambda^Y_1)}{1 + \exp(\lambda^Y_1)}$$
• Now, conditional on the row totals (as in a prospective study), we are left with two free
probabilities $(p_1, p_2)$, and the conditional likelihood is a function of two free parameters
$(\lambda^Y_1, \lambda^{XY}_{11})$.
• Looking at the previous pages, the conditional probabilities of Y given X from the
log-linear model follow a logistic regression model:
$$p_x = P[Y = 1 \mid X^* = x^*]
= \frac{e^{\lambda^Y_1 + \lambda^{XY}_{11} x^*}}{1 + e^{\lambda^Y_1 + \lambda^{XY}_{11} x^*}}
= \frac{e^{\beta_0 + \beta_1 x^*}}{1 + e^{\beta_0 + \beta_1 x^*}}$$
where
$$x^* = \begin{cases} 1 & \text{if } x = 1 \\ 0 & \text{if } x = 2 \end{cases}$$
with
$$\beta_0 = \lambda^Y_1 = \lambda^{COLD}_1$$
and
$$\beta_1 = \lambda^{XY}_{11} = \lambda^{VITC,COLD}_{11}$$
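This mapping can be checked directly from the table margins. The Python sketch below (an illustration, not the lecture's SAS code) computes $\beta_0$ and $\beta_1$ as logits of the observed row proportions and recovers the estimates reported earlier:

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

# Row proportions of catching a cold: x* = 1 (vitamin C), x* = 0 (no vitamin C)
p1_hat = 17 / 139   # vitamin C group
p2_hat = 31 / 140   # no-vitamin-C group

beta0 = logit(p2_hat)            # = lambdaY_1 (the COLD effect)
beta1 = logit(p1_hat) - beta0    # = lambdaXY_11 = log(OR)

print(round(beta0, 4), round(beta1, 4))
# prints: -1.2574 -0.7134
```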
data one;
input vitc cold count;
if vitc=2 then vitc=0;
if cold=2 then cold=0;
cards;
1 1 17
1 2 122
2 1 31
2 2 109
;
run;
Estimates

                                    Standard        Wald
Parameter    DF     Estimate        Error       Chi-Square    Pr > ChiSq
$$\hat\beta_1 = \hat\lambda^{VITC,COLD}_{11} = \log(OR) = -0.7134$$
• These are the same as for the Poisson and multinomial log-linear models.
Recap
• Except for combinatorial terms that are not functions of any unknown parameters, using
µjk from the previous table, the kernel of the log-likelihood for any of these designs can
be written as
$$\ell = \sum_j \sum_k -\mu_{jk} + \sum_j \sum_k n_{jk} \log \mu_{jk}$$
• In this likelihood, the table total $\mu_{++} = E(Y_{++})$ is actually known for all designs:

Design               µ++
Double Dichotomy     n
Prospective          n1 + n2
Case Control         n1 + n2
Key Points:
• We have introduced Log-linear models
• We have defined a parameter in the model to represent the OR
• We do not have an “outcome” per se
• If you can designate an outcome, you minimize the number of parameters estimated
• You should feel comfortable writing likelihoods. If not, you have 3 weeks to gain the
comfort
• Expect the final exam to have at least one likelihood problem