Log-Linear Models

Jeroen K. Vermunt
Tilburg University
1 Introduction
Log-linear analysis has become a widely used method for the analysis of
multivariate frequency tables obtained by cross-classifying sets of nominal,
ordinal, or discrete interval level variables. Examples of textbooks discussing
categorical data analysis by means of log-linear models are [4], [2], [14], [15],
[16], and [27].
We start by introducing the standard hierarchical log-linear modelling
framework. Then, attention is paid to more advanced types of log-linear
models that make it possible to impose interesting restrictions on the model
parameters, for example, restrictions for ordinal variables. Subsequently, we
present “regression-analytic”, “path-analytic”, and “factor-analytic” variants
of log-linear analysis. The last section discusses parameter estimation by
maximum likelihood, testing, and software for log-linear analysis.
The consequence of specifying a linear model for the log of m_abc is that a
multiplicative model is obtained for m_abc, i.e.,

m_abc = exp(λ + λ^A_a + λ^B_b + λ^C_c + λ^{AB}_{ab} + λ^{AC}_{ac} + λ^{BC}_{bc} + λ^{ABC}_{abc}).  (2)

From Equations 1 and 2, it can be seen that the saturated model contains
all interaction terms among A, B, and C. That is, no a priori restrictions
are imposed on the data. However, Equations 1 and 2 contain too many
parameters to be identifiable. Given the values for the expected frequencies
m_abc, there is not a unique solution for the λ and τ parameters. Therefore,
constraints must be imposed on the log-linear parameters to make them
identifiable. One option is to use ANOVA-like constraints, namely,

Σ_a λ^A_a = Σ_b λ^B_b = Σ_c λ^C_c = 0,
rather different. When effect coding is used, the parameters must be interpreted
in terms of deviations from the mean, while under dummy coding,
they must be interpreted in terms of deviations from the reference category.

An example of a restricted hierarchical model is the model in which A and B
are assumed to be conditionally independent given C, which is obtained by
omitting the three-variable term and the two-variable term between A and B:

log m_abc = λ + λ^A_a + λ^B_b + λ^C_c + λ^{AC}_{ac} + λ^{BC}_{bc}.  (3)

Deleting the remaining two-variable terms as well yields the model of complete
independence among A, B, and C:

log m_abc = λ + λ^A_a + λ^B_b + λ^C_c.
Hierarchical log-linear models are the most popular log-linear models be-
cause, in most applications, it is not meaningful to include higher-order
interaction terms without including the lower-order interaction terms con-
cerned. Another reason is that it is relatively easy to estimate the parameters
of hierarchical log-linear models because of the existence of simple minimal
sufficient statistics (see maximum likelihood estimation).
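As an illustration of how such a model can be fitted in practice, the following sketch (not part of the original article) treats the cell counts of a hypothetical 2 × 2 × 2 table as Poisson observations and fits the model of Equation 3 with the Python package statsmodels; the variable names, data frame, and counts are made up.

```python
# Hypothetical sketch: fitting the hierarchical log-linear model {AC, BC}
# (Equation 3) as a Poisson GLM on the cell counts of an A x B x C table.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# one row per cell: the categories of A, B, C and the observed frequency n_abc
cells = pd.DataFrame(
    [(a, b, c) for a in (1, 2) for b in (1, 2) for c in (1, 2)],
    columns=["A", "B", "C"],
)
cells["n"] = [25, 14, 10, 22, 18, 8, 12, 30]   # made-up counts

# C(.) treats the variables as categorical (dummy coding by default);
# '*' adds both main effects and the interaction concerned
model = smf.glm("n ~ C(A)*C(C) + C(B)*C(C)", data=cells,
                family=sm.families.Poisson())
fit = model.fit()
print(fit.params)        # estimated λ parameters (under dummy coding)
print(fit.fittedvalues)  # estimated expected frequencies m̂_abc
```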
3 Other types of log-linear models
3.1 General log-linear model
So far, attention has been paid to only one special type of log-linear models,
the hierarchical log-linear models. As demonstrated, hierarchical log-linear
models are based on one particular type of restriction on the log-linear param-
eters. But, when the goal is to construct models which are as parsimonious as
possible, the use of hierarchical log-linear models is not always appropriate.
To be able to impose other kinds of linear restrictions on the parameters, it
is necessary to use more general kinds of log-linear models.
As shown by McCullagh and Nelder [23], log-linear models can also be
defined in a much more general way by viewing them as a special case of
the generalized linear modelling (GLM) family. In its most general form, a
log-linear model can be defined as
log m_i = Σ_j λ_j x_{ij},  (4)
two-variable interaction terms can be obtained by multiplying the columns
for the one-variable terms for the variables concerned (see [10] and [14]).
The design matrix can also be used to specify all kinds of non-hierarchi-
cal and non-standard models. Actually, by means of the design matrix, three
kinds of linear restrictions can be imposed on the log-linear parameters: a
parameter can be fixed to zero, specified to be equal to another parameter,
and specified to be in a fixed ratio to another parameter.
The first kind of restriction, fixing to zero, is accomplished by deleting the
column of the design matrix referring to the effect concerned. Note that, in
contrast to hierarchical log-linear models, parameters can be fixed to be equal
to zero without the necessity of deleting the higher-order effects containing
the same indices as a subset.
Equating parameters is likewise very simple. Equality restrictions are
imposed by adding up the columns of the design matrix which belong to the
effects which are assumed to be equal. Suppose, for instance, that we want
to specify a model with a symmetric association between the variables A and
B,¹ each having three categories. This implies that

λ^{AB}_{ab} = λ^{AB}_{ba}.
The design matrix for the unrestricted effect λ^{AB}_{ab} contains four columns,
one for each of the parameters λ^{AB}_{11}, λ^{AB}_{12}, λ^{AB}_{21}, and λ^{AB}_{22}. In terms of these four
parameters, the symmetric association between A and B implies that λ^{AB}_{12}
is assumed to be equal to λ^{AB}_{21}. This can be accomplished by summing the
columns of the design matrix referring to these two effects.
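The following sketch (hypothetical data and code, not from the original text) shows how such an equality restriction can be imposed in practice: the effect-coded interaction columns for λ^{AB}_{12} and λ^{AB}_{21} are added together before the design matrix is passed to a Poisson GLM.

```python
# Hypothetical numpy/statsmodels sketch: imposing the symmetry restriction
# λ^{AB}_{12} = λ^{AB}_{21} by summing the corresponding design-matrix columns.
import numpy as np
import statsmodels.api as sm

counts = np.array([20, 12, 5, 11, 25, 9, 6, 8, 30], dtype=float)  # made-up 3x3 table, row-major

def effect_code(levels, n_levels):
    """Effect coding: n_levels - 1 columns, last category scored -1 on all."""
    x = np.zeros((len(levels), n_levels - 1))
    for i, l in enumerate(levels):
        if l < n_levels - 1:
            x[i, l] = 1.0
        else:
            x[i, :] = -1.0
    return x

a = np.repeat([0, 1, 2], 3)          # row variable A, one entry per cell
b = np.tile([0, 1, 2], 3)            # column variable B
XA, XB = effect_code(a, 3), effect_code(b, 3)
# four unrestricted interaction columns, in the order (1,1), (1,2), (2,1), (2,2)
XAB = np.column_stack([XA[:, i] * XB[:, j] for i in range(2) for j in range(2)])
# symmetry: add the columns belonging to λ^{AB}_{12} and λ^{AB}_{21}
XAB_sym = np.column_stack([XAB[:, 0], XAB[:, 1] + XAB[:, 2], XAB[:, 3]])

X = np.column_stack([np.ones(9), XA, XB, XAB_sym])
fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
print(fit.params)   # λ, λ^A, λ^B, and three symmetric association parameters
```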
As already mentioned above, parameters can also be restricted to be
in a fixed ratio to each other. This is especially useful when the variables
concerned can be assumed to be measured on an ordinal or interval level
scale, with known scores for the different categories. Suppose, for instance,
that we wish to restrict the one-variable effect of variable A to be linear.
Assume that the category scores of A, denoted by a, are equidistant, that
is, that they take on the values 1, 2, and 3. Retaining the effect coding
scheme, a linear effect of A is obtained by

λ^A_a = (a − ā)λ^A.
¹ Log-linear models with symmetric interaction terms may be used for various purposes.
In longitudinal research, they may be applied to test the assumption of marginal
homogeneity (see [3] and [16]). Other applications of log-linear models with symmetric
association parameters are Rasch models for dichotomous (see [24] and [17]) and polytomous
items (see [4]).
Here, ā denotes the mean of the category scores of A, which in this case is
2. Moreover, λ^A denotes the single parameter describing the one-variable
term for A. It can be seen that the distance between the λ^A_a parameters
of adjacent categories of A is λ^A. In terms of the design matrix, such a
specification implies that instead of including A* − 1 columns for the one-
variable term for A, one column with scores (a − ā) has to be included.
These kinds of linear constraints can also be imposed on the bivariate
association parameters of a log-linear model. The best known examples are
linear-by-linear interaction terms and row- or column-effect models (see [5],
[7], [13], and [15]). When specifying a linear-by-linear interaction term, it
is assumed that the scores of the categories of both variables are known.
Assuming equidistant scores for the categories of the variables A and B and
retaining the effect coding scheme, the linear-by-linear interaction between
A and B is given by
λ^{AB}_{ab} = (a − ā)(b − b̄)λ^{AB}.  (5)

Using this specification, which is sometimes also called uniform association,
the (partial) association between A and B is described by a single parameter
instead of using (A* − 1)(B* − 1) independent λ^{AB}_{ab} parameters. As a result,
the design matrix contains only one column for the interaction between A
and B consisting of the scores (a − ā)(b − b̄).
A row association structure is obtained by assuming the column variable
to be linear. When A is the row variable, it is defined as
λ^{AB}_{ab} = (b − b̄)λ^{AB}_a.

Note that for every value of A, there is a λ^{AB}_a parameter. Actually, there
are (A* − 1) independent row parameters. Therefore, the design matrix will
contain (A* − 1) columns which are based on the scores (b − b̄). The column
association model is, in fact, identical to the row association model, only the
roles of the column and row variable change.
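A small numpy sketch, using made-up centred scores for a 3 × 3 table, of how the corresponding design-matrix columns could be constructed (illustrative only, not code from the article):

```python
# Hypothetical sketch: design-matrix columns for a uniform (linear-by-linear)
# association (Equation 5) and for a row-association model in a 3x3 table.
import numpy as np

a = np.repeat([1.0, 2.0, 3.0], 3)      # row scores, one entry per cell
b = np.tile([1.0, 2.0, 3.0], 3)        # column scores
a_c, b_c = a - a.mean(), b - b.mean()  # (a - ā) and (b - b̄)

# uniform association: a single column with scores (a - ā)(b - b̄)
x_uniform = (a_c * b_c)[:, None]

# row association: (A* - 1) = 2 effect-coded row columns, each multiplied by (b - b̄)
XA = np.column_stack([
    (a == 1).astype(float) - (a == 3).astype(float),
    (a == 2).astype(float) - (a == 3).astype(float),
])
X_row = XA * b_c[:, None]

print(x_uniform.ravel())
print(X_row)
```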
which can also be written as

log m_i = log z_i + Σ_j λ_j x_{ij},

or

m_i = z_i exp(Σ_j λ_j x_{ij}),

where the z_i are the fixed a priori cell weights. Sometimes the vector with
elements log z_i is also called the offset.
The specification of a z_i for every cell of the contingency table has several
applications. One of its possible uses is in the specification of Poisson regression
models that take into account the population size or the length of the
observation period. This leads to what is called a log-rate model, a model for
rates instead of frequency counts ([14] and [6]). A rate is a number of events
divided by the size of the population exposed to the risk of having the event.

The weight vector can also be used for taking into account sampling or
nonresponse weights, in which case the z_i are equated to the inverse of the
sampling weights ([3] and [6]). Another use is the inclusion of fixed effects in
a log-linear model. This can be accomplished by adding the values of the λ
parameters which attain fixed values to the corresponding log z_i's. The last
application I will mention is in the analysis of tables with structural zeros,
sometimes also called incomplete tables ([15]). This simply involves setting
z_i = 0 for the structurally zero cells.
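As a hedged illustration, a log-rate model of this kind could be fitted in statsmodels by entering log z_i as an offset in a Poisson GLM; the variables, counts, and exposures below are invented for the example.

```python
# Hypothetical sketch of a log-rate model: a Poisson GLM with log(exposure)
# entered as the offset (the log z_i above).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "age":    ["young", "young", "old", "old"],
    "sex":    ["m", "f", "m", "f"],
    "events": [12, 7, 30, 21],                   # observed event counts
    "expo":   [1000.0, 1200.0, 800.0, 950.0],    # person-years at risk (the z_i)
})
fit = smf.glm("events ~ age + sex", data=df,
              family=sm.families.Poisson(),
              offset=np.log(df["expo"])).fit()
print(fit.params)   # effects on the log rate
```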
C. This gives the following log-multiplicative model:

log m_abc = λ + λ^A_a + λ^B_b + λ^C_c + φ^{AB} μ^{AB}_a μ^{AB}_b + φ^{AC} μ^{AC}_a μ^{AC}_c + φ^{BC} μ^{BC}_b μ^{BC}_c.  (6)

The φ parameters describe the strength of the association between the variables
concerned. The μ's are the unknown scores for the categories of the
variables concerned. As in standard log-linear models, identifying restrictions
have to be imposed on the parameters μ. One possible set of identifying
restrictions on the log-multiplicative parameters, which was also used by
Goodman [13], is

Σ_a μ^{AB}_a = Σ_b μ^{AB}_b = 0,   Σ_a (μ^{AB}_a)² = Σ_b (μ^{AB}_b)² = 1.

This gives row and column scores with a mean of zero and a sum of squares
of one.
On the basis of the model described in Equation 6, both more restricted
models and less restricted models can be obtained. One possible restriction is
to assume the row and column scores within a particular partial association
to be equal, for instance, μ^{AB}_a equal to μ^{AB}_b for all a equal to b. Of course,
this presupposes that the number of rows equals the number of columns.
Such a restriction is often used in the analysis of mobility tables ([22]). It is
also possible to assume that the scores for a particular variable are equal for
different partial associations ([5]), for example, μ^{AB}_b = μ^{BC}_b. Less restricted
models may allow for different μ and/or φ parameters within the levels of
some other variable ([5]), for example, different values of μ^{AB}_a, μ^{AB}_b, or φ^{AB}.
4 Regression-, path-, and factor-analytic models
4.1 Log-linear regression analysis: the logit model
In the log-linear models discussed so far, the relationships between the cat-
egorical variables are modelled without making a priori assumptions about
their ‘causal’ ordering: no distinction is made between dependent and inde-
pendent variables. However, one is often interested in predicting the value
of a categorical response variable by means of explanatory variables. The
logit model is such a ‘regression analytic’ model for a categorical dependent
variable.
Suppose we have a response variable denoted by C and two categorical
explanatory variables denoted by A and B. Moreover, assume that both A
and B influence C, but that their effect is equal within levels of the other
variable. In other words, it is assumed that there is no interaction between
A and B with respect to their effect on C. This gives the following logistic
model for the conditional probability of C given A and B, π_{c|ab}:

π_{c|ab} = exp(λ^C_c + λ^{AC}_{ac} + λ^{BC}_{bc}) / Σ_c exp(λ^C_c + λ^{AC}_{ac} + λ^{BC}_{bc}).  (7)
When the response variable C is dichotomous, the logit can also be written
as
log(π_{1|ab} / (1 − π_{1|ab})) = log(π_{1|ab} / π_{2|ab})
                              = (λ^C_1 − λ^C_2) + (λ^{AC}_{a1} − λ^{AC}_{a2}) + (λ^{BC}_{b1} − λ^{BC}_{b2})
                              = β + β^A_a + β^B_b.
It should be noted that the logistic form of the model guarantees that the
probabilities remain in the admissible interval between 0 and 1.
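For concreteness, the following numpy fragment (with arbitrary illustrative λ values, not taken from the article) evaluates Equation 7 as a softmax over the categories of C:

```python
# Small numeric illustration of Equation 7: turning λ parameters into the
# conditional probabilities π_{c|ab}; the λ values are arbitrary.
import numpy as np

lam_C  = np.array([0.2, 0.0, -0.2])            # λ^C_c
lam_AC = np.array([[0.5, 0.0, -0.5],           # λ^{AC}_{ac}, rows a, columns c
                   [-0.5, 0.0, 0.5]])
lam_BC = np.array([[0.3, -0.3, 0.0],           # λ^{BC}_{bc}, rows b, columns c
                   [-0.3, 0.3, 0.0]])

# linear predictor η_{abc} = λ^C_c + λ^{AC}_{ac} + λ^{BC}_{bc}
eta = lam_C[None, None, :] + lam_AC[:, None, :] + lam_BC[None, :, :]
pi_c_given_ab = np.exp(eta) / np.exp(eta).sum(axis=2, keepdims=True)
print(pi_c_given_ab.sum(axis=2))   # each conditional distribution sums to 1
```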
It has been shown that a logit model is equivalent to a log-linear model
which not only includes the same λ terms, but also the effects corresponding
to the marginal distribution of the independent variables ([3], [11], [14]).
For example, the logit model described in Equation 7 is equivalent to the
following log-linear model
log m_abc = α^{AB}_{ab} + λ^C_c + λ^{AC}_{ac} + λ^{BC}_{bc},  (8)

where

α^{AB}_{ab} = α + λ^A_a + λ^B_b + λ^{AB}_{ab}.

In other words, it equals log-linear model {AB, AC, BC} for the frequency
table with expected counts m_abc. With polytomous response variables, the
log-linear or logit model of the form given in Equation 8 is sometimes referred
to as a multinomial response model. As shown by Haberman [15], in its most
general form, the multinomial response model may be written as

log m_ik = α_k + Σ_j λ_j x_{ijk},  (9)

where k is used as the index for the joint distribution of the independent
variables and i as an index for the response variable.

[Figure 1: Path diagram relating the exogenous variables A, B, and C to the endogenous variables D, E, and F.]
pointed arrow indicates that variables are directly related to each other, and
a ‘knot’ that there is a higher order interaction. The variables A, B, and C
are exogenous variables. This means that neither their mutual causal order
nor their mutual relationships are specified. The other variables are endoge-
nous variables, where E is assumed to be posterior to D, and F is assumed
to be posterior to E. From Figure 1, it can be seen that D is assumed to
depend on A and on the interaction of B and C. Moreover, E is assumed to
depend on A, B, and D, and F on B, C, D, and E.
Let π_{def|abc} denote the probability that D = d, E = e, and F = f, given
A = a, B = b, and C = c. The information on the causal ordering of the
endogenous variables is used to decompose this probability into a product of
marginal conditional probabilities ([12] and [30]). In this case, π_{def|abc} can
also be written as

π_{def|abc} = π_{d|abc} π_{e|abcd} π_{f|abcde}.  (10)
This is a straightforward way to indicate that the value on a particular vari-
able can only depend on the preceding variables and not on the posterior
ones. For instance, E is assumed to depend only on the preceding variables
A, B, C, and D, but not on the posterior variable F . Therefore, the proba-
bility that E = e depends only on the values of A, B, C, and D, and not on
the value of F .
Decomposing the joint probability π_{def|abc} into a set of marginal condi-
tional probabilities is only the first step in describing the causal relationships
between the variables under study. In fact, the model given in Equation 10
is still a saturated model in which it is assumed that a particular dependent
variable depends on all its posterior variables, including all the higher-order
interaction terms. A more parsimonious specification is obtained by using a
log-linear or logit parameterization for the conditional probabilities appear-
ing in Equation 10 ([12]). While only simple hierarchical log-linear models
will be used here, the results presented apply to other kinds of log-linear
models as well, including the log-multiplicative models discussed in section
3.3.
A system of logit models consistent with the path model depicted in Fig-
ure 1 leads to the following parameterization of the conditional probabilities
appearing in Equation 10:
π_{d|abc} = exp(λ^D_d + λ^{AD}_{ad} + λ^{BD}_{bd} + λ^{CD}_{cd} + λ^{BCD}_{bcd}) / Σ_d exp(λ^D_d + λ^{AD}_{ad} + λ^{BD}_{bd} + λ^{CD}_{cd} + λ^{BCD}_{bcd}),

π_{e|abcd} = exp(λ^E_e + λ^{AE}_{ae} + λ^{BE}_{be} + λ^{DE}_{de}) / Σ_e exp(λ^E_e + λ^{AE}_{ae} + λ^{BE}_{be} + λ^{DE}_{de}),

π_{f|abcde} = exp(λ^F_f + λ^{BF}_{bf} + λ^{CF}_{cf} + λ^{DF}_{df} + λ^{EF}_{ef}) / Σ_f exp(λ^F_f + λ^{BF}_{bf} + λ^{CF}_{cf} + λ^{DF}_{df} + λ^{EF}_{ef}).
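The decomposition in Equation 10 can be illustrated with a small numpy sketch (not from the article) in which the three conditional probability tables are filled with arbitrary normalized random numbers and then multiplied together:

```python
# Illustrative sketch of Equation 10: the joint conditional probability
# π_{def|abc} as the product of the three conditional probabilities.
import numpy as np

rng = np.random.default_rng(1)
nA = nB = nC = nD = nE = nF = 2   # dichotomous variables, for simplicity

def random_conditional(shape, axis):
    """A conditional probability table normalized over the last index."""
    p = rng.random(shape)
    return p / p.sum(axis=axis, keepdims=True)

P_d = random_conditional((nA, nB, nC, nD), axis=3)            # π_{d|abc}
P_e = random_conditional((nA, nB, nC, nD, nE), axis=4)        # π_{e|abcd}
P_f = random_conditional((nA, nB, nC, nD, nE, nF), axis=5)    # π_{f|abcde}

# π_{def|abc}: broadcast and multiply, result indexed by (a, b, c, d, e, f)
joint = P_d[..., None, None] * P_e[..., None] * P_f
print(joint.sum(axis=(3, 4, 5)))   # sums to 1 for every combination (a, b, c)
```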
[Figure 2: Latent class model, with the latent variable W pointing to each of the manifest variables A, B, C, and D.]
In addition to the overall mean and the one-variable terms, it contains only
the two-variable associations between the latent variable W and the manifest
variables. As none of the interactions between the manifest variables are in-
cluded, it can be seen that they are assumed to be conditionally independent
of each other given W .
In its classical parameterization proposed by Lazarsfeld [21], the latent
class model is defined as

π_abcd = Σ_w π_w π_{a|w} π_{b|w} π_{c|w} π_{d|w}.  (12)

It can be seen that again the observed variables A, B, C, and D are pos-
tulated to be mutually independent given a particular score on the latent
variable W. Note that this is in fact a log-linear path model in which one
variable is unobserved. The relation between the conditional probabilities ap-
pearing in Equation 12 and the log-linear parameters appearing in Equation
11 is

π_{a|w} = exp(λ^A_a + λ^{WA}_{wa}) / Σ_a exp(λ^A_a + λ^{WA}_{wa}).  (13)
scheme, but the same estimates are obtained with a multinomial or product-
multinomial sampling scheme. Denoting an observed frequency in a three-
way table by nabc , the relevant part of the Poisson log-likelihood function
is
log L = Σ_abc (n_abc log m_abc − m_abc),  (14)
where the expected frequencies mabc are a function of the unknown λ param-
eters.
Suppose we want to find ML estimates for the parameters of the hier-
archical log-linear model described in Equation 3. Substituting Equation 3
into Equation 14 and collapsing the cells containing the same λ parameter,
yields the following log-likelihood function:
values are needed for the log-linear parameters that are in the model. In most
computer programs based on the IPF algorithm, the iterations are started
with all the λ parameters equal to zero, in other words, with all estimated
expected frequencies m̂^{(0)}_{abc} equal to 1. For the model in Equation 3, every
IPF iteration consists of the following two steps:

m̂^{(ν)'}_{abc} = m̂^{(ν−1)}_{abc} n_{ab+} / m̂^{(ν−1)}_{ab+},

m̂^{(ν)}_{abc} = m̂^{(ν)'}_{abc} n_{+bc} / m̂^{(ν)'}_{+bc},

where the m̂^{(ν)'}_{abc} and m̂^{(ν)}_{abc} denote the improved estimated expected frequencies
after imposing the ML related restrictions. The log-linear parameters are
easily computed from the estimated expected frequencies.
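A minimal numpy sketch of these two IPF steps (with a made-up 2 × 2 × 2 table) might look as follows; it is an illustration of the updating scheme above, not the implementation used in any particular package.

```python
# Minimal IPF sketch: start from m̂_abc = 1 and alternately match the observed
# AB and BC margins, as in the two update steps above. Data are made up.
import numpy as np

def ipf_two_margins(n, n_iter=100):
    m = np.ones_like(n, dtype=float)                # m̂^{(0)}_{abc} = 1
    n_ab = n.sum(axis=2)                            # observed margin n_{ab+}
    n_bc = n.sum(axis=0)                            # observed margin n_{+bc}
    for _ in range(n_iter):
        m *= (n_ab / m.sum(axis=2))[:, :, None]     # first step: match the AB margin
        m *= (n_bc / m.sum(axis=0))[None, :, :]     # second step: match the BC margin
    return m

n = np.array([[[10, 4], [3, 8]],
              [[6, 9], [12, 5]]], dtype=float)      # a 2x2x2 example table
m_hat = ipf_two_margins(n)
print(m_hat.sum(axis=2) - n.sum(axis=2))   # fitted AB margin equals the observed one
print(m_hat.sum(axis=0) - n.sum(axis=0))   # fitted BC margin equals the observed one
```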
Finding ML estimates for the parameters of other types of log-linear
models is a bit more complicated than for the hierarchical log-linear model
because the sufficient statistics are no longer equal to particular observed
marginals. Most program solve this problem using a Newton-Raphson al-
gorithm. An alternative to the Newton-Raphson algorithm is the uni-di-
mensional Newton algorithm. It differs from the multi-dimensional Newton
algorithm in that it adjusts only one parameter at a time instead of adjust-
ing them all simultaneously. In that sense, it resembles IPF. Goodman [13]
proposed using the uni-dimensional Newton algorithm for the estimation of
log-multiplicative models.
For ML estimation of latent class models, one can make use of an IPF-like
algorithm called the Expectation-Maximization (EM) algorithm, a Newton-
Raphson algorithm, or a combination of these.
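Purely as an illustration of the EM idea, and not the algorithm as implemented in LEM or Latent GOLD, a bare-bones EM routine for an unrestricted latent class model could be sketched as follows (the data and all names are hypothetical):

```python
# Toy EM sketch for an unrestricted latent class model with categorical items.
import numpy as np

rng = np.random.default_rng(0)

def em_latent_class(data, n_classes, n_iter=200):
    """data: (N, J) array of 0-based item responses; returns class proportions,
    conditional response probabilities, and posterior class memberships."""
    N, J = data.shape
    n_cats = data.max(axis=0) + 1
    pw = np.full(n_classes, 1.0 / n_classes)                        # P(W = t)
    cond = [rng.dirichlet(np.ones(k), size=n_classes) for k in n_cats]  # P(item j | W)
    for _ in range(n_iter):
        # E step: posterior class-membership probabilities for every case
        log_post = np.tile(np.log(pw), (N, 1))
        for j in range(J):
            log_post += np.log(cond[j][:, data[:, j]]).T
        post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        # M step: update class proportions and conditional response probabilities
        pw = post.mean(axis=0)
        for j in range(J):
            for k in range(n_cats[j]):
                cond[j][:, k] = post[data[:, j] == k].sum(axis=0)
            cond[j] = np.clip(cond[j], 1e-12, None)
            cond[j] /= cond[j].sum(axis=1, keepdims=True)
    return pw, cond, post

# toy usage: 4 trichotomous items, purely to show the call (no real structure)
data = rng.integers(0, 3, size=(237, 4))
class_sizes, item_probs, posteriors = em_latent_class(data, n_classes=3)
print(class_sizes)
```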
The Pearson chi-square statistic is defined as

X² = Σ_abc (n_abc − m̂_abc)² / m̂_abc,

and the likelihood-ratio chi-square statistic is

L² = 2 Σ_abc n_abc log(n_abc / m̂_abc).  (16)
L²(r|u) = 2 Σ_abc m̂_abc(u) log(m̂_abc(u) / m̂_abc(r)),

where the subscript (u) refers to the unrestricted model and the subscript
(r) to the restricted model. Note that in Equation 16, a particular model²
is tested against the completely unrestricted model, the saturated model.
Therefore, in Equation 16, the estimated expected frequency in the numerator
is the observed frequency n_abc. The L²(r|u) statistic has a large-sample
chi-square distribution if the restricted model is approximately true. The
approximation of the chi-square distribution may be good for conditional L²
tests between non-saturated models even if the test against the saturated
model is problematic, as in sparse tables. The number of degrees of freedom
in conditional tests equals the number of parameters which are fixed in the
restricted model compared to the unrestricted model. The L²(r|u) statistic can
also be computed from the unconditional L² values of two nested models, with

L²(r|u) = L²(r) − L²(u).

² An alternative approach is based on estimating the sampling distributions of the statistics
concerned rather than using their asymptotic distributions. This can be done by bootstrap
methods (Langeheine, Pannekoek, and Van de Pol, 1996). These computationally intensive
methods are becoming more and more applicable as computers become faster.
Model selection can also be based on information criteria such as AIC [1] and
BIC [26], which in this context are usually defined as

AIC = L² − 2 df,  (17)

BIC = L² − (log N) df,  (18)

where df denotes the model's degrees of freedom and N the total sample size.
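A small helper along the following lines (hypothetical, not from the article) collects the fit measures discussed in this section, given an observed table, fitted frequencies, and the number of independent parameters:

```python
# Hypothetical helper computing X², L², df, AIC, and BIC from an observed
# table n and fitted frequencies m̂ (e.g. obtained from IPF or a Poisson GLM).
import numpy as np

def fit_statistics(n, m_hat, n_parameters):
    n, m_hat = np.asarray(n, float).ravel(), np.asarray(m_hat, float).ravel()
    mask = n > 0                                    # 0 * log(0) taken as 0
    l2 = 2 * np.sum(n[mask] * np.log(n[mask] / m_hat[mask]))    # Equation 16
    x2 = np.sum((n - m_hat) ** 2 / m_hat)                       # Pearson X²
    df = n.size - n_parameters                      # test against the saturated model
    aic = l2 - 2 * df                                           # Equation 17
    bic = l2 - np.log(n.sum()) * df                             # Equation 18
    return {"X2": x2, "L2": l2, "df": df, "AIC": aic, "BIC": bic}

# conditional test of a restricted against an unrestricted model:
# L2(r|u) = L2_r - L2_u, with df_r - df_u degrees of freedom
```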
5.3 Software
Software for log-linear analysis is readily available. Major statistical pack-
ages such as SAS and SPSS have modules for log-linear analysis that can
be used for estimating hierarchical and general log-linear models, log-rate
models, and logit models. Special software is required for estimating log-
multiplicative models, log-linear path models, and latent class models. The
command-language-based LEM program developed by Vermunt ([27], [28]) can
deal with any of the models discussed in this article, as well as combinations
of these. Vermunt and Magidson’s [29] Windows-based Latent GOLD can
deal with certain types of log-linear models, logit models, and latent class
models, as well as combinations of logit and latent class models.
6 An application
Consider the four-way cross-tabulation presented in Table 1 containing data
taken from four annual waves (1977-1980) of the National Youth Survey
[9]. The table reports information on marijuana use of 237 respondents who
were age 14 in 1977. The variable of interest is an ordinal variable measuring
marijuana use in the past year. It has the three levels “never” (1), “no more
than once a month” (2), and “more than once a month” (3). We will denote
these four time-specific measures by A, B, C, and D, respectively.
Several types of log-linear models are of interest for this data set. First,
we might wish to investigate the overall dependence structure of these re-
peated responses, for example, whether it is possible to describe the data
by a hierarchical log-linear model containing only the two-way associations
between consecutive time points; that is, by a first-order Markov structure.
Second, we might want to investigate whether it is possible to simplify the
model by making use of the ordinal nature of the variables using uniform
or RC association structures. Third, latent class analysis could be used to
determine whether it is possible to explain the associations by assuming that
there is a small number of groups of children with similar developments in
marijuana use.
Table 2 reports the L2 values for the estimated models. Because the
asymptotic p values are unreliable when analyzing sparse frequency tables
such as the one we have here, we estimated the p values by means of 1000
parametric bootstrapping replications. The analysis was performed with the
Latent GOLD program.
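The bootstrap p values reported in Table 2 were obtained with Latent GOLD; purely to illustrate the general idea, a parametric bootstrap for L² could be sketched as follows, where fit_model is a placeholder for whatever routine (IPF, a Poisson GLM, and so on) returns fitted frequencies for the model of interest.

```python
# Hypothetical sketch of a parametric bootstrap p value for L²: simulate tables
# from the fitted model, refit, and compare the simulated L² values with the
# observed one. `fit_model` is a placeholder returning fitted frequencies m̂.
import numpy as np

def l2(n, m_hat):
    n, m_hat = np.asarray(n, float), np.asarray(m_hat, float)
    mask = n > 0
    return 2 * np.sum(n[mask] * np.log(n[mask] / m_hat[mask]))

def bootstrap_p_value(n_obs, fit_model, n_replications=1000, seed=0):
    rng = np.random.default_rng(seed)
    m_hat = fit_model(n_obs)
    l2_obs = l2(n_obs, m_hat)
    total = int(np.asarray(n_obs).sum())
    probs = (m_hat / m_hat.sum()).ravel()
    exceed = 0
    for _ in range(n_replications):
        n_sim = rng.multinomial(total, probs).reshape(np.shape(n_obs))
        exceed += l2(n_sim, fit_model(n_sim)) >= l2_obs
    return exceed / n_replications
```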
turns out that we need to include one additional term; that is, the associa-
tion between the second and fourth time point, yielding {AB, BC, BD, CD}
(Model 4).
Model 5 has the same structure as Model 4, with the only difference that
the two-variable terms are assumed to be uniform associations (see Equation
5). This means that each two-way association contains only one instead of
four independent parameters. These “ordinal” constraints seem to be too
restrictive for this data set.
Models 6 and 7 are latent class models or, equivalently, log-linear models
of the form {XA, XB, XC, XD}, where X is a latent variable with either two
or three categories. The fit measures indicate that the associations between
the time points can be explained by the existence of three types of trajectories
of marijuana use.
Based on the comparison of the goodness-of-fit measures for the various
models, as well as their AIC values that also take into account parsimony,
one can conclude that Model 4 is the preferred one. The three-class model, however,
yields a somewhat simpler explanation for the associations between the time-
specific responses.
References
[1] Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-332.
[2] Bishop, Y.M.M., Fienberg, S.E., and Holland, P.W. (1975). Discrete multivariate
analysis: theory and practice. Cambridge, Mass.: MIT Press.
[3] Agresti, A. (1990). Categorical data analysis. New York: Wiley, second
edition 2002.
[5] Clogg, C.C. (1982). Some models for the analysis of association in multi-
way cross-classifications having ordered categories. Journal of the Amer-
ican Statistical Association, 77, 803-815.
[6] Clogg, C.C., and Eliason, S.R. (1987). Some common problems in log-
linear analysis. Sociological Methods and Research, 16, 8-14.
[7] Clogg, C.C., and Shihadeh, E.S. (1994). Statistical models for ordinal
data. Thousand Oaks, CA: Sage Publications.
[8] Darroch, J.N., and Ratcliff, D. (1972). Generalized iterative scaling for
log-linear models. The Annals of Mathematical Statistics, 43, 1470-1480.
[9] Elliot, D.S., Huizinga, D., and Menard, S. (1989). Multiple problem
youth: delinquency, substance use, and mental health problems. New
York: Springer-Verlag.
[10] Evers, M., and Namboodiri, N.K. (1978). On the design matrix strategy
in the analysis of categorical data. K.F. Schuessler (ed.), Sociological
Methodology 1979, 86-111. San Francisco: Jossey-Bass.
[11] Goodman, L.A. (1972). A modified multiple regression approach for the
analysis of dichotomous variables. American Sociological Review, 37, 28-
46.
[13] Goodman, L.A. (1979). Simple models for the analysis of association in
cross-classifications having ordered categories. Journal of the American
Statistical Association, 74, 537-552.
[15] Haberman, S.J. (1979). Analysis of qualitative data, Vol 2, New devel-
opments. New York: Academic Press.
[19] Laird, N., and Oliver, D. (1981). Covariance analysis of censored sur-
vival data using log-linear analysis techniques. Journal of the American
Statistical Association, 76, 231-240.
[20] Langeheine, R., Pannekoek, J., and Van de Pol, F. (1996). Bootstrap-
ping goodness-of-fit measures in categorical data analysis. Sociological
Methods and Research, 24, 492-516.
[21] Lazarsfeld, P.F. (1950). The logical and mathematical foundation of
latent structure analysis. S.A. Stouffer et al. (eds.), Measurement and
Prediction, 362-412. Princeton, NJ: Princeton University Press.
[22] Luijkx, R. (1994). Comparative loglinear analyses of social mobility and
heterogamy. Tilburg: Tilburg University Press.
[23] McCullagh, P., and Nelder, J.A. (1983). Generalized linear models. Lon-
don: Chapman & Hall, second edition 1989.
[24] Mellenbergh, G.J., and Vijn, P. (1981). The Rasch model as a log-linear model.
Applied Psychological Measurement, 5, 369-376.
[25] Rindskopf, D. (1990). Nonstandard loglinear models. Psychological Bul-
letin, 108, 150-162.
[26] Schwarz, G. (1978). Estimating the dimension of a model. Annals of
Statistics, 6, 461-464.
[27] Vermunt, J.K. (1997). Log-linear models for event histories. Ad-
vanced Quantitative Techniques in the Social Sciences Series, Volume 8.
Thousand Oaks: Sage Publications.
[28] Vermunt, J.K. (1997) LEM: A general program for the analysis of cate-
gorical data. User’s manual. Tilburg University, The Netherlands.
[29] Vermunt, J.K., Magidson, J. (2000). Latent GOLD 2.0 User’s Guide.
Belmont, MA: Statistical Innovations Inc.
[30] Wermuth, N., and Lauritzen, S.L. (1983). Graphical and recursive mod-
els for contingency tables. Biometrika, 70, 537-552.
[31] Xie, Yu (1992). The log-multiplicative layer effects model for comparing
mobility tables. American Sociological Review, 57, 380-395.
Table 1: Data on marijuana use in the past year taken from four yearly waves
of the National Youth Survey (1977-1980)

                               1979 (C)
                     1             2             3
1977  1978      1980 (D)      1980 (D)      1980 (D)
 (A)   (B)     1    2   3    1    2   3    1    2   3
  1     1    115   18   7    6    6   1    2    1   5
  1     2      2    2   1    5   10   2    0    0   6
  1     3      0    1   0    0    1   0    0    0   4
  2     1      1    3   0    1    0   0    0    0   0
  2     2      2    1   1    2    1   0    0    0   3
  2     3      0    1   0    0    1   1    0    2   7
  3     1      0    0   0    0    0   0    0    0   1
  3     2      1    0   0    0    1   0    0    0   1
  3     3      0    0   0    0    2   1    1    1   6
Table 2: Goodness-of-fit statistics for the estimated models for the data in
Table 1

Model                                      L²     df   p (bootstrap)    AIC
1. Independence                           403.3   72        .00        259.3
2. All two-variable terms                  36.9   48        .12        -59.1
3. First-order Markov                      58.7   60        .05        -61.3
4. Model 3 + λ^{BD}_{bd}                   41.6   56        .30        -70.4
5. Model 4 with uniform associations       83.6   68        .00        -52.4
6. Two-class latent class model           126.6   63        .00        -51.0
7. Three-class latent class model          57.0   54        .12        -56.2