Wayne State University

Kinesiology, Health and Sport Studies College of Education

11-22-2019

Finding Latent Groups in Observed Data: A Primer on Latent Profile Analysis in Mplus for Applied Researchers
Sarah L. Ferguson
Rowan University

E. Whitney G. Moore
Wayne State University

Darrell M. Hull
University of North Texas

Follow this and additional works at: https://digitalcommons.wayne.edu/coe_khs

Part of the Education Commons, Kinesiology Commons, and the Sports Sciences Commons

Recommended Citation
Ferguson, S. L., Moore, E. W. G., & Hull, D. M. (2020). Finding latent groups in observed data: A primer on
latent profile analysis in Mplus for applied researchers. International Journal of Behavioral Development,
44(5), 458-468. DOI: 10.1177/0165025419881721

This Article is brought to you for free and open access by the College of Education at
DigitalCommons@WayneState. It has been accepted for inclusion in Kinesiology, Health and Sport Studies by an
authorized administrator of DigitalCommons@WayneState.
Journal: International Journal of Behavioral Development

Manuscript ID: JBD-2019-05-3637.R1

Manuscript Type: Methods & Measures

Keywords: latent profile analysis, latent variable modeling, teaching paper

Abstract: This paper provides a practical guide to conducting latent profile analysis (LPA) in the Mplus software system. The guide is intended for researchers familiar with some latent variable modeling but not LPA specifically. A general procedure for conducting LPA is provided in six steps: (a) data inspection, (b) iterative evaluation of models, (c) model fit and interpretability, (d) investigation of patterns of profiles in a retained model, (e) covariate analysis, and (f) presentation of results. A worked example is provided with syntax and results to exemplify the steps.

Finding Latent Groups in Observed Data: A Primer on Latent Profile Analysis in Mplus for Applied Researchers

The purpose of the present paper is to provide a practical guide to the use of latent profile analysis (LPA) in applied research. Use of LPA has increased in recent years and is becoming more common in applied research. However, many applied researchers do not take formal coursework in advanced statistical analyses such as LPA (Henson, Hull, & Williams, 2010), and may turn to the internet and the research literature to teach themselves new statistical techniques. Online resources and supports for LPA do exist (e.g., the Mplus manual and discussion boards, online lectures, and class notes), but a unified guide to understanding and using LPA, with clear steps and examples targeted specifically at applied research, does not currently appear in the literature. This paper is therefore intended for researchers already familiar with latent variable modeling, though not with mixture modeling or LPA specifically. A general discussion of LPA is presented, followed by a worked example using data collected in a prior research study (Author, 2018). Syntax for conducting LPA in Mplus is also provided, with a discussion of common options and additions for this program. The paper concludes with a discussion of LPA reporting practices, highlighting the primary information needed to report an LPA in applied research.

This work is intended predominantly for applied researchers who are interested in exploring the applicability of LPA to their own research. Students still developing their methods and analysis expertise may also benefit from this introduction to LPA, as may reviewers of research who need a brief introduction to the decision points and reporting practices of LPA. It must be noted that this is a primer on LPA and the foundational functions and processes of this analysis. As with any general introduction to a complex analysis, this guide seeks a balance between clarity about LPA use in applied research contexts and the complexity of potential issues with LPA related to continuing debates in the field on key decisions (e.g., statistical power, covariate inclusion). Therefore, the more complex technical discussions and ongoing debates in the LPA and mixture modeling literature are outside the scope of this work. This guide focuses on the practical decision points in LPA and suggests common approaches to these decisions supported by current methods research, while pointing the reader to additional sources that cover these issues in more depth.

Latent Profile Analysis

Latent profile analysis (LPA) and latent class analysis (LCA) are techniques for recovering hidden groups in data by obtaining the probability that individuals belong to different groups. This occurs through examination of the distributions of groups in the data and determining whether those distributions are meaningful. It might be helpful to think of these groups, whether they are classes of people or profiles of people, as unobserved latent mixture components. Indeed, both LCA and LPA are often referred to using the broad term mixture models. The distinction between LCA and LPA lies in the way groups are defined on the basis of the observed variables. In LCA, observed variables are discrete, analogous to a binomial model (see Masyn, 2013, and Nylund-Gibson & Choi, 2018, for more information). In LPA, observed variables are continuous, analogous to a Gaussian model (Oberski, 2016). Additionally, there is latent transition analysis (LTA), which is any model that includes two or more latent class or profile constructs; these constructs can be informed by different indicators or by the same indicators measured at different timepoints as a longitudinal extension (for a worked example, see Nylund-Gibson, Grimm, Quirk, & Furlong, 2014).

LPA, and the closely associated latent class analysis, are person-oriented approaches to latent variable analysis in the same family of methods as cluster analysis and mixture modeling (Bergman & Magnusson, 1997; Bergman, Magnusson, & El-Khouri, 2003; Collins & Lanza, 2010; Gibson, 1959; Masyn, 2013; Sterba, 2013). Researchers new to LPA may benefit from a comparison to factor analysis methods, such as confirmatory factor analysis (CFA), as LPA also uses covariance matrices to explore relationships between observed data and latent variables (Bauer & Curran, 2004; Bergman & Magnusson, 1997; Bergman, et al., 2003; Marsh, Lüdtke, Trautwein, & Morin, 2009; Masyn, 2013). However, where CFA uses a covariance matrix of items to uncover latent constructs, LPA uses a matrix of individuals to uncover latent groups of people. The main difference is that “the common factor model decomposes the covariances to highlight relationships among variables, whereas the latent profile model decomposes the covariances to highlight relationships among individuals” (Bauer & Curran, 2004, p. 6).

The person-oriented approach in LPA is grounded in three arguments. First, individual differences are present and important within an effect or phenomenon. Second, these differences occur in a logical way, which can be examined through patterns. Third, a small number of patterns (profiles in LPA) are meaningful and occur across individuals (Bergman & Magnusson, 1997; Bergman, et al., 2003; Sterba, 2013). LPA is particularly useful for researchers in the social sciences, as patterns of shared behavior between and within samples may be missed when researchers conduct inter-individual, variable-centered analyses. For instance, if the three profiles resulting from an LPA are relatively evenly spread across the sample, it is easy to miss the possibly meaningful differences between the profiles. LPA provides the opportunity to examine these profiles and what predicts or is predicted by membership within the different profiles. Variable-centered analyses assume the individuals within the sample all belong to a single profile or population with no differentiation between latent subgroups.

LPA is undertaken in multiple steps, similar to structural equation modeling (SEM) analyses. In SEM, for instance, researchers will typically fit a conceptual model, check the measurement model to support their data, and then fit a series of structural models to identify the best fit of their data to their theoretical models (Bauer & Curran, 2004; Kline, 2011). For LPA, the process is similar in that the researcher works through an iterative modeling process to identify the number of profiles to retain and then fits a covariate model to explore the impact of these profiles on other variables in the study or to predict profile membership (Masyn, 2013; Sterba, 2013). If the researcher also has a categorical grouping variable across which they want to compare the profile structures, they can conduct a multi-group LPA, including tests of measurement invariance (Morin, Meyer, Creusier, & Bietry, 2016). Researchers can also test for measurement invariance across covariate predictors by extending Masyn’s (2017) description of measurement invariance testing for differential item functioning by covariates in LCA models.

The overall goal of LPA is to uncover latent profiles or groups (k) of individuals (i) who share a meaningful and interpretable pattern of responses on the measures of interest (j) (Bergman, et al., 2003; Marsh, et al., 2009; Masyn, 2013; Sterba, 2013). This is done using joint and marginal probabilities in within-class and between-class models. Two equations define the within-class model:

$$y_{ij} = \mu_j^{(k)} + \varepsilon_{ij} \tag{1}$$

$$\varepsilon_{ij} \sim N\left(0, \sigma_j^{2(k)}\right) \tag{2}$$

where $\mu_j^{(k)}$ is the model implied mean and $\sigma_j^{2(k)}$ is the model implied variance, which will vary across j = 1…J outcomes and k = 1…K classes or profiles. The general assumptions of LPA include that outcome variables are normally distributed within each class and these within-class outcomes are locally independent (Sterba, 2013). The between-class model represents the probability of membership in a given class k:

$$p(c_i = k) = \frac{\exp\left(\omega^{(k)}\right)}{\sum_{k=1}^{K} \exp\left(\omega^{(k)}\right)} \tag{3}$$

where $\omega^{(k)}$ is a multinomial intercept (fixed at 0 for the final class) and $c_i$ is the latent classification variable for the individual. The within-class and between-class models can therefore be combined into a single model using the law of total probability, resulting in:

$$f(\mathbf{y}_i) = \sum_{k=1}^{K} p(c_i = k)\, f(\mathbf{y}_i \mid c_i = k) \tag{4}$$

which is the marginal probability density function for an individual (i) after summing across the joint within-class density probabilities for the J outcome variables, weighted by the probability of class or profile membership from equation 3. Finally, the LPA analysis results in a posterior probability for each individual defined as:

$$t_{ik} = p(c_i = k \mid \mathbf{y}_i) = \frac{p(c_i = k)\, f(\mathbf{y}_i \mid c_i = k)}{f(\mathbf{y}_i)} \tag{5}$$

representing the probability of an individual (i) being assigned membership ($c_i$) in a specific class or profile (k) given their scores on the outcome variables in the $\mathbf{y}_i$ vector. A posterior probability (t) is calculated for each individual in each profile, with values closer to 1.0 indicating a higher probability of membership in a specific profile. The more distinction between the posterior probabilities for an individual, the more certainty there is around their membership assignment (Sterba, 2013).

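
To make equations 1 through 5 concrete, the following is a minimal Python sketch of how posterior probabilities could be computed for one individual once the model parameters are known. It is not Mplus output or Mplus syntax; the function names and all parameter values are hypothetical and chosen only for illustration, and local independence is applied by multiplying the univariate normal densities across the J outcomes.

```python
import math


def normal_pdf(y, mean, variance):
    """Univariate normal density: the within-class model of equations 1 and 2."""
    return math.exp(-((y - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def class_probabilities(omegas):
    """Equation 3: turn multinomial intercepts into class membership probabilities."""
    exps = [math.exp(w) for w in omegas]
    total = sum(exps)
    return [e / total for e in exps]


def posterior_probabilities(y, omegas, means, variances):
    """Equations 4 and 5 for one individual.

    y         : list of J observed scores
    omegas    : K multinomial intercepts (the last one fixed at 0)
    means     : K lists of J class-specific means
    variances : K lists of J class-specific variances
    """
    priors = class_probabilities(omegas)
    # Local independence: the joint within-class density is a product over the J outcomes.
    joint = [
        prior * math.prod(normal_pdf(y[j], means[k][j], variances[k][j]) for j in range(len(y)))
        for k, prior in enumerate(priors)
    ]
    marginal = sum(joint)  # equation 4
    return [value / marginal for value in joint]  # equation 5


# Hypothetical two-profile, two-indicator model (all numbers invented for illustration).
omegas = [0.4, 0.0]
means = [[1.0, 2.0], [4.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
posteriors = posterior_probabilities([1.2, 2.1], omegas, means, variances)
print(posteriors)
```

Because the hypothetical scores sit close to the first profile's means, nearly all of the posterior probability falls on that profile, which is the situation the text describes as high certainty of assignment.
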
Model Retention Decisions

As LPA is a model testing process, multiple models are fit with varying numbers of classes or profiles. The number of models to test depends on the research topic; published LPA studies have often found the best fitting model, theoretically and statistically, after comparing five to six models (Masyn, 2013; Tein, Coxe, & Cham, 2013). Each model is then compared against the previous model or models to make a decision regarding the number of latent profiles in the data (Christie & Masyn, 2008; Marsh, et al., 2009; Masyn, 2013). Commonly, decisions regarding model retention in LPA use the Bayesian Information Criterion (BIC), the Sample-Adjusted BIC (SABIC), and Akaike’s Information Criterion (AIC) (Celeux & Soromenho, 1996; Marsh, et al., 2009; Masyn, 2013; Tein, et al., 2013). BIC is used for model selection decisions, with a lower BIC value representing the preferred model:

$$\mathrm{BIC}(K) = -2L(K) + v(K) \ln n \tag{6}$$

with L(K) representing the log likelihood of the model with K profiles, v(K) the number of parameters to be estimated in the model, and n the sample size. The BIC can be conservative, but it prefers parsimony in a model and has been shown to outperform other indices with more continuous indicators (Morgan, 2014; Nylund, Asparouhov, & Muthén, 2007). An alternative is the SABIC, which adjusts the formula to account for n and is less punitive on the number of parameters in the model (Tein, et al., 2013). SABIC has been supported as the most accurate information criterion in simulation studies, particularly with smaller samples and low class separation (Kim, 2014; Morgan, 2014):

$$\mathrm{SABIC}(K) = -2L(K) + v(K) \ln n^{*}, \qquad n^{*} = (n + 2)/24 \tag{7}$$

Finally, AIC is an inconsistent model fit measure as it does not have strong parsimony constraints, leading to more variation in the AIC values between models. Like BIC, lower values of AIC indicate better model fit. AIC is calculated as:

$$\mathrm{AIC}(K) = -2L(K) + 2v(K) \tag{8}$$

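
As a rough illustration of equations 6 through 8, the sketch below computes the three information criteria from a model's log likelihood, number of estimated parameters, and sample size. The function name and the input values are hypothetical, and the SABIC line assumes the usual sample-size adjustment n* = (n + 2)/24 inside the logarithm.

```python
import math


def information_criteria(log_likelihood, n_parameters, n):
    """Return BIC, SABIC, and AIC for one fitted model (lower values are preferred)."""
    deviance = -2.0 * log_likelihood
    return {
        "BIC": deviance + n_parameters * math.log(n),
        # Assumed adjustment: replace n with n* = (n + 2) / 24.
        "SABIC": deviance + n_parameters * math.log((n + 2) / 24.0),
        "AIC": deviance + 2.0 * n_parameters,
    }


# Hypothetical fit results for a single model (values invented for illustration).
ic = information_criteria(log_likelihood=-8900.0, n_parameters=10, n=377)
for name, value in ic.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on each candidate model gives the values that are compared across models; with a sample of this size, the SABIC penalty is smaller than the BIC penalty, consistent with SABIC being less punitive on the number of parameters.
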
With BIC, SABIC, and AIC, it should be noted that while lower values indicate better fit, lower is relative (Masyn, 2013). Therefore, attention should be given to the magnitude of the difference. Consider two very different models, each examined for goodness of fit by evaluating the relative improvement produced by moving from a 2-class to a 3-class structure. If one observes a reduction in BIC values from 18,000 to 17,950 for Example A, and a reduction from 800 to 750 for Example B, the absolute difference of 50 points is not equivalent for the two models. The relative difference from 18,000 to 17,950 (about 0.3%) is smaller than the relative difference from 800 to 750 (6.25%), and therefore the change for Example A might suggest equivalency, whereas the change for Example B might suggest a meaningful difference. Researchers need to take the context into consideration when evaluating change between models, as there is no rule on what level of change in fit values is considered “meaningful” across the board.

Entropy, a measure of classification uncertainty, is less common as a model retention index due to lack of support in simulation studies (Masyn, 2013; Tein, et al., 2013). However, entropy as a statistical measure of uncertainty can still be useful in supporting LPA model retention, as it indicates how much classification uncertainty a model carries. Entropy is calculated as:

$$E(K) = -\sum_{k=1}^{K} \sum_{i=1}^{n} t_{ik} \ln t_{ik} \tag{9}$$

where $t_{ik}$ represents the posterior probabilities as shown in equation 5. Entropy is, therefore, a measure of how well each LPA model partitions the data into profiles (Celeux & Soromenho, 1996). The entropy statistic reported for a model is scaled to range from 0 to 1, with higher values representing better fit of the profiles to the data (Tein, et al., 2013). Note that this interpretation is somewhat counter-intuitive, as lower values on the entropy statistic actually represent more uncertainty or chaos in the model (i.e., more classification uncertainty). Values of .80 or greater provide supporting evidence that profile classification of individuals in the model occurs with minimal uncertainty (Celeux & Soromenho, 1996; Tein, et al., 2013).

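
The relationship between raw entropy and a 0-to-1 entropy statistic can be sketched as follows. The raw entropy is computed here as the non-negative quantity −Σ t ln t over all individuals and profiles, and the rescaling 1 − E(K)/(n ln K) is an assumption about how the 0-to-1 statistic described in the text is obtained. The function names and the posterior probabilities are hypothetical.

```python
import math


def raw_entropy(posteriors):
    """Sum of -t * ln(t) over all individuals and profiles (terms with t = 0 contribute 0)."""
    return -sum(t * math.log(t) for row in posteriors for t in row if t > 0.0)


def relative_entropy(posteriors):
    """Rescale raw entropy to 0-1, where 1 means no classification uncertainty."""
    n = len(posteriors)
    k = len(posteriors[0])
    return 1.0 - raw_entropy(posteriors) / (n * math.log(k))


# Hypothetical posterior probabilities for three individuals and two profiles.
sharp = [[0.98, 0.02], [0.03, 0.97], [0.99, 0.01]]  # clear assignments
fuzzy = [[0.60, 0.40], [0.50, 0.50], [0.45, 0.55]]  # ambiguous assignments

print(round(relative_entropy(sharp), 3))
print(round(relative_entropy(fuzzy), 3))
```

The clearly separated posteriors yield a value above the .80 guideline, while the ambiguous posteriors yield a value near 0, illustrating why a low value on the statistic signals more classification uncertainty.
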
Additionally, the Lo, Mendell, and Rubin (LMR) test is sometimes used to compare models, in a similar fashion to the χ² difference test in other model testing analyses (Lo, Mendell, & Rubin, 2001; Marsh, et al., 2009; Masyn, 2013; Tein, et al., 2013). LMR tests the likelihood ratio of one model as compared to another with an adjusted asymptotic distribution instead of a χ² distribution (Lo, et al., 2001). As with the chi-square difference test, the LMR test is assessed for significance across the difference in degrees of freedom. It is interpreted with a significance test in which statistical significance indicates that the more parsimonious model (fewer profiles) represents the relationships present in the data significantly worse than the less parsimonious model (more profiles). In other words, the LMR test assists in determining when additional profiles are not improving fit or discrimination of the model. Thus, a non-significant LMR test suggests that the more parsimonious model is the better fitting and more representative model.

Alternatively, the bootstrap likelihood ratio test (BLRT) can be used to evaluate the fit of one model compared to a model with one fewer profile (k - 1). BLRT uses parameter estimation methods to create multiple bootstrap samples to represent the sampling distribution (Masyn, 2013; McLachlan, 1987). A statistically significant BLRT indicates the current model is a better fit than a model with k - 1 profiles. This approach has shown favorable results in simulation studies over the LMR test (Nylund, et al., 2007). However, both the LMR and the BLRT can suffer from never reaching a non-significant value, as the addition of parameters can represent more of the information contained within the data. In such a situation, it is recommended that the log likelihood values be plotted and examined for a bend or “elbow” to determine where the model improvement gain starts to diminish relative to the additional parameters estimated (Masyn, 2013).

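
The elbow inspection can be supported numerically by tabulating how much the log likelihood improves with each added profile. The sketch below is only an aid to the visual judgment described above, not a decision rule; the function name and the log likelihood values are hypothetical.

```python
def loglik_gains(logliks):
    """Return (K, gain) pairs: improvement in log likelihood over the (K - 1)-profile model."""
    ks = sorted(logliks)
    return [(k, logliks[k] - logliks[prev]) for prev, k in zip(ks, ks[1:])]


# Hypothetical log likelihood values for models with 1 to 5 profiles.
logliks = {1: -9400.0, 2: -9050.0, 3: -8900.0, 4: -8870.0, 5: -8860.0}
gains = loglik_gains(logliks)
for k, gain in gains:
    print(f"{k} profiles: gain of {gain:.1f} over {k - 1} profiles")
```

In these invented numbers the gain shrinks sharply after three profiles, which is the kind of bend a plot of the log likelihood values is meant to reveal.
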
Finally, as in any model testing analysis, theoretical support should exist for the final model retained, and patterns and profiles uncovered should be interpretable (Marsh, Hau, & Wen, 2004; Marsh, et al., 2009; Masyn, 2013). Reliance upon theory and prior work to evaluate the reasonableness of a model is essential to LPA to ensure the final model and underlying profiles represent interpretable and meaningful groupings of individuals within the context of prior research. While it may be possible to produce models with more profiles that produce better fit, if this results in reduced distinction between profiles, reduced ability to define and interpret those profiles, or an increased likelihood of capitalizing on chance or common error variance (e.g., method variance) rather than true classification distinctions, then the model with more profiles is not beneficial to theory, science, or practical application. Researchers should examine the underlying profiles produced in the retained model, including examination of trend or pattern lines of profiles, to determine when profiles may be so near to one another in pattern that their distinctness is not meaningful. Profiles or classes containing less than 5% of the sample may be spurious, and the relevance of such profiles should be carefully considered and examined for interpretability and substantiveness (Marsh, et al., 2009; Masyn, 2013). Empirically, the lack of support for a small proportion profile may come from examining the results of the k + 1 model to see if that profile “collapsed” or no longer appears in the results (Masyn, 2013). Lastly, the number of individuals represented by a small proportion profile should be taken into account, as a small percentage from a large sample may include sufficient individuals (n = 30-60) to support generalizability (Vincent & Weir, 2012).

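
A simple check of profile sizes against the 5% guideline might look like the following sketch. The function name, the threshold argument, and the assignment data are hypothetical; the flag only marks profiles for closer inspection and does not by itself decide whether a profile is spurious.

```python
from collections import Counter


def profile_sizes(assignments, min_proportion=0.05):
    """Count individuals per profile and flag profiles below the minimum proportion."""
    n = len(assignments)
    counts = Counter(assignments)
    return {
        profile: {
            "n": count,
            "proportion": count / n,
            "flag_small": count / n < min_proportion,
        }
        for profile, count in sorted(counts.items())
    }


# Hypothetical most-likely-profile assignments for a sample of 1,000 individuals.
assignments = [1] * 520 + [2] * 440 + [3] * 40
sizes = profile_sizes(assignments)
for profile, info in sizes.items():
    print(profile, info)
```

In this invented sample, the third profile falls below 5% and is flagged, yet it still contains 40 individuals, which is the situation in which the absolute number of individuals should be weighed alongside the percentage.
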
It is worth noting that good practice includes reporting the different close-fitting models and providing a detailed explanation of the decision-making process in selecting the retained model. This detail helps those attempting to replicate the existence of established profiles, since the rationale and decision-making process should be re-applied when possible. For example, if prior work suggested five profiles, but the fifth was removed as a spurious minor grouping, subsequent examinations that reveal a fifth profile might lead to the conclusion that it was not spurious after all. Alternatively, good fit for only four profiles would help to confirm the prior judgment about the removal of the spurious and non-theory-fitting fifth profile. Additionally, class separation can be used to assist researchers in understanding the differences between the retained profiles. A profile plot of class probabilities is one way to evaluate profile separation, and odds ratios can be used to further evaluate the differences between profiles (Masyn, 2013).

Power and sample size requirements should also be considered for planned LPA studies. Power analysis in LPA is a developing area in the field with promising advances (Gudicha, 2015; Park & Yu, 2017; Tein, et al., 2013). However, there is currently no simple formula or calculator to estimate required sample size in LPA. The required sample size is dependent on the number of profiles and the distance between the profiles, which are unknown in advance and can only be estimated based on prior research (Tein, et al., 2013). Across 38 studies, the median sample size was found to be n = 377, while some simulation studies have suggested that samples of 300 to 500 would qualify as a minimum (Finch & Bronk, 2011; Nylund, et al., 2007; Peugh & Fan, 2014; Tein, et al., 2013). Readers are referred to Preacher and Coffman (2006) and MacCallum, Browne, and Sugawara (1996) for further information on power analysis in covariance modeling.

Covariate Analyses

Following the profile retention decision in LPA is the examination of covariates to discover relationships and differences between latent groups (Clark & Muthén, 2009; Marsh, et al., 2009; Nylund-Gibson & Masyn, 2016). Exploring relationships with covariates provides additional information on the latent profiles and how the covariates may have differing effects on these profiles. One key point here, as highlighted in Marsh, et al. (2009), is that the “covariates are assumed to be strictly antecedent variables” (p. 195) and have no effect on the formation of the profiles themselves. Historically, covariates were considered antecedents and not outcomes, because if the covariates were included in the model as outcomes, they would alter the definition of the latent classes (Nylund-Gibson & Masyn, 2016). Thus, latent classes were regressed on the covariates of interest, not the other way around, which allowed for some freedom in the method of covariate inclusion, as these covariates should not influence the LPA profile solution. When to include covariates has been debated within the LPA/LCA literature; simulation studies have supported the preference for including covariates after the original model retention decision is made (Nylund-Gibson & Masyn, 2016).

There are three methods of covariate inclusion in the current LPA literature. Method one is to examine the profiles based on each individual’s most likely membership, and then explore the relationship of the profiles with the covariates in a separate post hoc analysis. Using this method, researchers first assign each individual to the profile for which they have the highest posterior probability. Then, the relationship between profile groups, as defined by individuals placed into their most likely profile, and the study covariates is evaluated. This turns the latent profiles into a categorical grouping variable, which avoids the computational burden of the other approaches. However, this option is only recommended if entropy is above .80, that is, if classification uncertainty in the model is low (Clark & Muthén, 2009). This threshold is recommended because profile classifications are treated as fixed, categorical variables, rather than latent profiles with flexibility (e.g., the probability associated with them). If the entropy value is less than .80 (i.e., classification uncertainty in the model is increased), this approach may falsely force individuals into profiles without clear justification. Marsh, et al. (2009) and Christie and Masyn (2010) provide examples of this post hoc approach to covariate analysis.

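
The assignment step of method one can be sketched in a few lines. The function name and the posterior probabilities are hypothetical; the point of the example is that modal assignment discards the uncertainty carried by the posterior probabilities.

```python
def modal_assignment(posteriors):
    """Return the 1-based index of the most likely profile for each individual."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in posteriors]


# Hypothetical posterior probabilities for three individuals and three profiles.
posteriors = [
    [0.95, 0.03, 0.02],  # clearly profile 1
    [0.40, 0.35, 0.25],  # assigned to profile 1, but with considerable uncertainty
    [0.10, 0.85, 0.05],  # clearly profile 2
]
assignments = modal_assignment(posteriors)
print(assignments)
```

The first two individuals receive the same label even though one has a posterior probability of .95 and the other only .40, which is why this approach is cautioned against when classification uncertainty is high.
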
Method two is a more advanced approach to LPA covariate inclusion called the ML three-step approach, which is also sometimes referred to as the VAM approach (Asparouhov & Muthén, 2014; Vermunt, 2010). This approach follows the same first step as method one in that the latent profiles are evaluated first. In step two, the individuals are assigned to their most likely profile using the posterior probabilities provided from the initial latent profile analysis, similar to method one. However, the ML three-step approach differs in the estimation of the model with the covariates included, because the profile assignment uncertainty is added into the model syntax by using the estimated average classification errors for each profile from step two. In this way, the third step accounts for the average uncertainty in profile assignment while modeling the effects of covariates on or with the profiles (Asparouhov & Muthén, 2014; Vermunt, 2010). For further reading on this method, readers are pointed to Asparouhov and Muthén (2018) and Nylund-Gibson, Grimm, and Masyn (2018).

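
The idea of average classification uncertainty per profile can be illustrated by averaging the posterior probabilities of the individuals assigned to each profile. The sketch below shows only this summary table; it is not the Mplus estimation, it omits any transformation that specific software applies to these values, and the function name and data are hypothetical.

```python
def average_classification_table(posteriors):
    """Rows: most likely profile; columns: average posterior probability for each profile."""
    k = len(posteriors[0])
    sums = [[0.0] * k for _ in range(k)]
    counts = [0] * k
    for row in posteriors:
        modal = max(range(k), key=row.__getitem__)  # most likely profile (0-based)
        counts[modal] += 1
        for c in range(k):
            sums[modal][c] += row[c]
    # Profiles with no assigned individuals are reported as NaN rows.
    return [
        [s / counts[r] if counts[r] else float("nan") for s in sums[r]]
        for r in range(k)
    ]


# Hypothetical posterior probabilities for four individuals and two profiles.
posteriors = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]
table = average_classification_table(posteriors)
for row in table:
    print([round(value, 2) for value in row])
```

The diagonal entries summarize how confidently individuals are placed in their assigned profile on average, and the off-diagonal entries summarize the average classification error that a three-step analysis carries forward.
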

Finally, method three is the BCH approach, also a three-step approach (Asparouhov & Muthén, 2018; Masyn, 2013; McLarnon & O’Neill, 2018). The first step of the BCH approach is again determining the number of latent profiles without including the covariates in the model (Clark & Muthén, 2009; Marsh et al., 2009). Similar to the VAM approach described above, in the second step the participants’ individual class probabilities are used to specify their probability of membership in each latent profile. Therefore, this method includes individual rather than average uncertainty in profile classification. Although this makes the BCH approach computationally complex, simulation studies have found it to be relatively robust compared with method one (Clark & Muthén, 2009; Nylund-Gibson & Masyn, 2016). Default commands are now available within Mplus for the most basic implementation of the BCH approach (see the syntax for an example of how to implement these steps; for further details regarding the limitations of the default command options in Mplus, see Asparouhov & Muthén, 2018). Using the BCH approach means that indicators for the profiles are present in the model with the covariates during analysis, as shown in Figure 1. Note in the figure how the latent profiles predict responses on the indicators, while the antecedent covariates predict the profiles. The example provided in the current article uses this approach.

The second and third methods described above are currently recommended over the post hoc approach because both allow the uncertainty of profile assignment to remain in the model while also usually maintaining the integrity of the profiles when the covariates are added. It is important to confirm that the profiles have not changed once the covariates are added to the model, as such fluctuations may indicate the presence of differential item functioning for one or more of the profile indicators in relation to the covariates (Masyn, 2017; Nylund-Gibson, Grimm, & Masyn, 2019; Nylund-Gibson & Masyn, 2016; Osterlind & Everson, 2009). Recent work addressing this issue has shown that it is possible to implement measurement invariance steps with predictor covariates that yield differential item functioning information regarding the indicators, profiles, and covariates. To learn more about this approach, see Masyn (2017) and Osterlind and Everson (2009). Syntax examples for each of these approaches are provided in the supplemental files for this article.

INSERT FIGURE 1 ABOUT HERE

Worked Example

A worked example of the use of LPA with observed data is provided to walk the reader through the LPA process. Data for this example come from a larger empirical study published elsewhere (see Author, 2018). The data are used here only as an example to demonstrate LPA, not to address the theoretical implications of the results. Data were collected from 295 high school students in grades 11 and 12 in one public school district in the southern United States.

LPA was used as the primary inferential analysis method for the study. All latent profile analyses were conducted using Mplus 8 (Muthén & Muthén, 1998-2017) with maximum likelihood estimation. LPA was used in two applications: first in exploring latent profiles of science occupational preference, and second to examine group differences in the resulting latent profiles based on study covariates. The research questions used to inform the study were:

1. Based on measures of science motivation, science attitude, science interest, and science achievement, how are latent profiles defined to represent occupational preferences for science career choice?
2. What is the magnitude of group differences in occupational preferences for science by sex, socio-economic status, personality, and cognitive ability?

Steps of LPA

There are six general steps in the LPA process we will detail in this example (see Figure 3). In step one, as with all analyses, the data should be cleaned and checked for standard statistical assumptions (e.g., normality of continuous variables, independence of observations; see Osborne, 2012). In this example, item-level missing values were estimated prior to the analysis process using maximum likelihood estimation in Mplus. Cases where all items were missing for one composite variable (i.e., missingness at the composite level) were estimated in the LPA process in Mplus utilizing full-information maximum likelihood (FIML). Participants were removed from the analysis if values on all variables in the study were missing (n = 4). As in other latent variable analyses, missing data can be handled by FIML or multiple imputation, depending on what is best for the scenario. Multiple imputation is recommended when a large dataset is being used to answer different research questions, when planned missing data designs were implemented, or when the computational burden for model convergence will be lessened by using imputed datasets rather than maximum likelihood estimation to handle the missingness while estimating the model (Baraldi & Enders, 2010). In this example, we will highlight how to utilize FIML with Mplus, which can often address individuals’ needs with smaller datasets and the use of informative auxiliary variables (Baraldi & Enders, 2010; Howard, Rhemtulla, & Little, 2015). Readers are referred to Osborne (2012) and Chapter 11 of the Mplus user’s manual (Muthén & Muthén, 1998-2017) for further introduction to missing data issues. Also of note in the present study is the use of composite variables, as opposed to item-level data, as indicators in the model for simplicity (i.e., to reduce complexity and improve model convergence).
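
The case-removal rule described above (drop a participant only when every study variable is missing) can be sketched outside Mplus as follows. This is a hypothetical Python helper, not part of the original analysis; the case identifiers, the values, and the use of `None` as the missing-data marker are assumptions for illustration.

```python
def split_all_missing(cases, missing=None):
    """Separate cases with at least one observed value from cases
    that are missing on every variable."""
    kept = {
        case_id: values
        for case_id, values in cases.items()
        if any(v != missing for v in values)
    }
    dropped = [case_id for case_id in cases if case_id not in kept]
    return kept, dropped

# Hypothetical composite scores for three participants.
cases = {
    "p01": [3.2, None, 4.1],    # partially missing: kept, handled by FIML
    "p02": [None, None, None],  # missing on everything: removed
    "p03": [2.5, 3.0, 3.8],
}
kept, dropped = split_all_missing(cases)
```

Partially missing cases stay in the data because FIML uses whatever information they provide.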

Step two involves iteratively evaluating a series of hypothetically plausible LPA models, starting with a model with one profile and typically ending with a model estimating five or six profiles (Masyn, 2013; Tein et al., 2013). Mplus syntax examples are provided in examples 7.9 and 7.12 of the Mplus Version 8 user’s manual (Muthén & Muthén, 1998-2017). Samples of the basic LPA syntax can be found in Appendix A, and the syntax from this article’s example may be found in Appendix B. In Mplus syntax, LPA is defined in two ways. First, the CLASSES command is added after the USEVARIABLES command to specify the number of classes or profiles being estimated in the model. So for the first one-profile model, CLASSES is set to c (1) as shown in the example. For a two-profile model this would be CLASSES = c (2), and so on. Second, the analysis TYPE is entered as MIXTURE, as LPA is a special type of mixture model. Additionally, Mplus output options can be requested to support the LPA analysis process (Muthén & Muthén, 1998-2017). TECH1 is used for multiple analyses in Mplus, as it provides parameter specification and starting values for all estimated parameters in the model. This output can be helpful in identifying estimation errors. TECH8 provides the optimization history for RANDOM, MIXTURE, and TWOLEVEL analyses in Mplus. TECH11 requests the LMR test, which compares the current model against the model estimated with k-1 classes or profiles. Finally, TECH14 requests the bootstrap likelihood ratio test (BLRT). Note that TECH11 and TECH14 cannot be provided if there is only one class or profile in the model, so these options must be removed from the syntax for the first model, where only one profile is estimated (they are provided in the one-profile example here for reference only). Additionally, in the SAVE line, CPROBABILITIES is added to request the class or profile probability estimates. This option provides the posterior probabilities for each individual in each latent class or profile as well as the most likely class membership for each individual in the sample. Also, note that it is necessary to change the SAVEDATA file name for each model run to avoid overwriting existing SAVEDATA output files.
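
To make the saved posterior probabilities more concrete, the following Python sketch shows two quantities that can be derived from them: the most likely (modal) profile for each person and the relative entropy commonly reported for mixture models, computed as 1 minus the summed individual entropies divided by n times ln(K). The function names and example values are hypothetical, and the sketch assumes the probabilities have already been read from the saved file into a list of rows.

```python
import math

def modal_assignment(posteriors):
    """Most likely profile (numbered from 1) for each person."""
    return [max(range(len(row)), key=lambda c: row[c]) + 1 for row in posteriors]

def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 indicate clear separation.
    Requires at least two profiles."""
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy is not defined for a one-profile model")
    # Terms with p == 0 contribute nothing (limit of -p * ln p is 0).
    uncertainty = sum(-p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - uncertainty / (n * math.log(k))

# Hypothetical posterior probabilities for three people and two profiles.
posteriors = [[0.95, 0.05], [0.10, 0.90], [0.60, 0.40]]
assigned = modal_assignment(posteriors)
entropy = relative_entropy(posteriors)
```

The third person above is assigned to profile 1 with only 60% certainty, which is exactly the kind of assignment uncertainty that lowers entropy.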

There are two other commands that are often used when conducting LPA that can assist in obtaining a solution in Mplus. The first is the STARTS subcommand under ANALYSIS, which can be used to adjust upward the number of initial stage starts in the maximization step, the number of final stage optimizations in the likelihood step of the ML estimation, or both (Muthén & Muthén, 1998-2017). This helps the iteration process of the ML or MLR estimators reach convergence by extending the number of attempts (Muthén & Muthén, 1998-2017). It is recommended that the second value, for the final stage optimizations, be no more than a quarter of the initial stage starts value (Muthén, 2010; Muthén & Muthén, 1998-2017).
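
The quarter rule above is simple arithmetic, sketched here as a small Python helper that builds the corresponding STARTS line. The helper name is hypothetical, and using exactly one quarter is only one choice that satisfies the "no more than a quarter" recommendation.

```python
def starts_option(initial_starts):
    """Build a STARTS line whose second value is a quarter of the first."""
    if initial_starts < 4:
        raise ValueError("need at least 4 initial stage starts")
    final_optimizations = initial_starts // 4  # never exceeds one quarter
    return f"STARTS = {initial_starts} {final_optimizations};"

line = starts_option(500)
```

Here `line` holds the text that would be placed under the ANALYSIS command.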

Another approach that can assist in obtaining a solution for a model is providing start values. There are two ways to specify start values. One is to manually specify start values, which is done in the syntax by adding an asterisk (*) after a parameter in the model, followed by the start value. For example, Y ON X*.50 instructs Mplus to use .50 as the start value for the regression coefficient Y ON X (i.e., X predicting Y). Another option, particularly if you have attempted to model a solution that did not converge, is to use the values the model estimation did reach for the parameter estimates as start values. Mplus provides these values in the TECH1 output, in the SVALUES output, and, when the model does not converge, under MODEL RESULTS (Muthén & Muthén, 1998-2017). The TECH1 start values are a good section to examine if a particular parameter seems to be nearing an out-of-bounds estimate that results in non-convergence (e.g., a correlation too close to 1). The SVALUES section provides the model syntax, including starting values based upon the model-estimated parameter values. This syntax can be copied and pasted into the input syntax file under the MODEL command to inform estimation in Mplus. As with TECH1, SVALUES can be included in the OUTPUT command line of any model. When a model does not converge, the MODEL RESULTS output is formatted as Mplus syntax so that it can be copied and pasted under the MODEL command in the syntax file. To test whether the model resulted from a local maximum in the estimation process, different random start values can be used to run the model again (see Hipp & Bauer, 2006, for further information). Providing start values essentially helps the ML estimation process by informing the Mplus estimator to pick up where it left off and, ideally, reach model convergence.
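
One common way to judge whether a solution reflects a local maximum is to check whether the best loglikelihood was reached from more than one set of random starts. The Python sketch below illustrates that check; the function name, the tolerance, and the loglikelihood values are hypothetical and are not taken from the worked example.

```python
def best_loglikelihood_replicated(loglikelihoods, tolerance=1e-3, min_count=2):
    """True when the highest loglikelihood was reached by at least
    min_count of the final stage optimizations."""
    best = max(loglikelihoods)
    hits = sum(1 for ll in loglikelihoods if abs(ll - best) <= tolerance)
    return hits >= min_count

# Hypothetical final stage loglikelihoods from two separate runs.
stable_run = [-2104.318, -2104.318, -2104.318, -2110.772]
suspect_run = [-2104.318, -2107.951, -2110.772, -2113.004]

stable = best_loglikelihood_replicated(stable_run)
suspect = best_loglikelihood_replicated(suspect_run)
```

When the best value is not replicated, as in the second run, increasing the number of starts and rerunning the model is a reasonable next step.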

In the example, a latent profile analysis was used for Research Question 1 to identify the presence of latent profiles on measures of science interest, motivation, attitude, and academic experiences. Model 1 was estimated with only one profile, Model 2 with two profiles, and so on to Model 5 with five profiles. Model fit statistics are provided in Table 1.

INSERT TABLE 1 ABOUT HERE

Step three involves evaluating models to identify model fit and interpretability. In the example presented, Model 4 was retained as the best model to fit the data based on its low loglikelihood, AIC, BIC, and SABIC values; its high entropy value; the non-significant LMR test for Model 5; a smallest class containing more than 5% of the sample; and profiles that were supported by theory. The loglikelihood value revealed relatively large decreases until the difference between Models 4 and 5. This is also true of the AIC and SABIC trends across models. BIC was marginally lower for Model 4 in comparison to Model 5. All four of these statistics support Models 4 and 5 as the better models. Entropy for both Models 4 and 5 is above .80 and nearly the same. The LMR and BLRT tests are significant for Model 4, which means the four-profile model is a better representation than the three-profile model. However, LMR is not significant for Model 5, which supports the more parsimonious Model 4 as a better fit than the less parsimonious Model 5. The smallest profile in Model 4 comprises 16% (n = 44) of sample participants, whereas the smallest profile for Model 5 comprises 2% (n = 6) of sample participants. When a small number of participants from the sample are represented in a profile, as in Model 5, it is difficult to be confident the profile represents a distinct grouping that might be generalizable to other samples. Finally, profiles from Model 5 did not align with theory as well as Model 4 profiles, which makes Model 5 difficult to justify and interpret (see Author, 2018).
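
For readers who want to see how the information criteria relate to the loglikelihood, the Python sketch below computes AIC, BIC, and SABIC from a loglikelihood, the number of free parameters, and the sample size, using the standard formulas (SABIC replaces n with (n + 2) / 24 in the penalty). The parameter count assumes the specification described in this article, with profile-specific means and variances held equal across profiles. The function names and the input values are hypothetical and are not the values in Table 1.

```python
import math

def lpa_free_parameters(n_profiles, n_indicators):
    """Free parameters with profile-specific means, variances held equal
    across profiles, and no covariances among indicators."""
    means = n_profiles * n_indicators
    variances = n_indicators
    proportions = n_profiles - 1
    return means + variances + proportions

def information_criteria(loglikelihood, n_parameters, n_observations):
    """AIC, BIC, and sample-size adjusted BIC; lower values are preferred."""
    deviance = -2 * loglikelihood
    return {
        "AIC": deviance + 2 * n_parameters,
        "BIC": deviance + n_parameters * math.log(n_observations),
        "SABIC": deviance + n_parameters * math.log((n_observations + 2) / 24),
    }

# Hypothetical four-profile model with four indicators and 300 cases.
n_params = lpa_free_parameters(n_profiles=4, n_indicators=4)
criteria = information_criteria(-2100.0, n_params, 300)
```

Because the BIC penalty grows with ln(n) while the AIC penalty is fixed at 2 per parameter, BIC tends to favor more parsimonious models as the sample size increases.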

In step four of the LPA, the retained model is interpreted by examining patterns of the profiles and weights of included variables in each profile. The four-profile model for this example is detailed in Table 2. The means and standard deviations of variables used to create the profiles are presented for each profile, and all were found to be statistically significant in the model. Note that standard deviations are the same across profiles, as they are constrained to be equal in Mplus by default. The differences between the four latent groups are largely due to differences in interest, motivation, and attitude towards science, which aligns with the theoretical approach used in the study (see Figure 2).
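
The relationship between posterior probabilities, profile means, and a single shared variance per indicator can be illustrated with the weighted formulas used in maximum likelihood estimation of mixture models. This Python sketch is an illustration under stated assumptions, not a reproduction of Mplus internals; the function names and the data are hypothetical.

```python
def profile_means(data, posteriors):
    """Posterior-weighted mean of each indicator within each profile."""
    n_profiles = len(posteriors[0])
    n_indicators = len(data[0])
    means = []
    for c in range(n_profiles):
        weight = sum(p[c] for p in posteriors)
        means.append([
            sum(p[c] * row[j] for row, p in zip(data, posteriors)) / weight
            for j in range(n_indicators)
        ])
    return means

def shared_variances(data, posteriors, means):
    """One variance per indicator, pooled across profiles."""
    n = len(data)
    return [
        sum(
            p[c] * (row[j] - means[c][j]) ** 2
            for row, p in zip(data, posteriors)
            for c in range(len(means))
        ) / n
        for j in range(len(data[0]))
    ]

# Hypothetical scores on one indicator and certain (0/1) memberships.
data = [[1.0], [3.0], [10.0], [12.0]]
posteriors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
means = profile_means(data, posteriors)
variances = shared_variances(data, posteriors, means)
```

With a shared variance, the profiles differ only in their means, which is why the standard deviations in a table of profile results are identical across profiles.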

It may be helpful in reporting LPA to provide names or labels for the profiles based on the observed differences in included variables. However, researchers should be cautious in providing names to avoid a naming fallacy, suggesting the label assigned to a profile is correct and clearly understood, or a reification error, suggesting the label represents a real construct (Kline, 2011; Masyn, 2013). In the example, Profile 1 contains, on average, students with the lowest level of interest in science, a more negative attitude and low motivation towards science, and a low GPA, so it was referred to as “Low Science Interest and GPA.” Profile 2 contains those students with the highest interest in science, high attitude and motivation scores, and the highest average GPA, so it was called “High Science Interest and GPA.” Profile 3 students are low in terms of science interest, but more towards the middle in science attitude and motivation, with a GPA almost as high on average as seen in Profile 2. This profile was called “Low Science Interest, High GPA.” Finally, Profile 4 contains students who are somewhat interested in science, have mid-level attitudes and motivations towards science, and a GPA lower than Profile 3 but higher than Profile 1. This profile was referred to as “Medium Science Interest and GPA.” These profile names illustrate the selection of names that are clear in their description without overstating the profile or being cumbersome. When directional words are used in naming (e.g., low, high, negative, positive), it is important that they accurately reflect not only the relative relationship between the profiles in the sample but also the absolute magnitude of the variables or profiles they describe. Care with naming ensures accuracy and clarity when interpreting results and when generalizing results beyond the study sample.

INSERT TABLE 2 ABOUT HERE

INSERT FIGURE 2 ABOUT HERE

Next, in step five, conduct a covariate analysis. This step should be included when: a) the LPA analysis indicates there are profiles worth interpreting further, and b) there is a theoretical reason to evaluate the impact of the covariates on the profiles. To add the covariates to the model for concurrent analysis in Mplus, start with the syntax from step two. However, this time add the variable information for covariates to the NAMES and USEVARIABLES lines. Additionally, in the MODEL section of the syntax, specify the model relationships. For the particular example included here, the two-step approach to covariate inclusion was used in the original study (see Supplemental Files for Mplus syntax examples of all approaches). The syntax subcommand “%OVERALL%” informs Mplus that the following lines describe the overall model, as opposed to a class label (e.g., %c#1%), which can be used to indicate an adjustment to class-specific portions of the model, such as constraining profiles to be equal on indicators or estimating the variances of indicators across profiles (Muthén & Muthén, 1998-2017). Covariates are entered into the model as predictors of c in the line “c ON X1 X2 X3”.
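
Regressing c on covariates is a multinomial logistic regression in which one profile serves as the reference and has its logit fixed at zero. The Python sketch below shows how intercepts and slopes of that kind translate into profile membership probabilities and odds ratios. It is a minimal sketch: the function names and coefficient values are hypothetical, and it places the reference profile last.

```python
import math

def profile_probabilities(intercepts, slopes, covariates):
    """Membership probabilities for one person; the last profile is the
    reference, so its logit is fixed at 0."""
    logits = [
        a + sum(b * x for b, x in zip(bs, covariates))
        for a, bs in zip(intercepts, slopes)
    ] + [0.0]
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def odds_ratios(slopes):
    """Exponentiated slopes: change in odds of membership versus the
    reference profile per one-unit increase in a covariate."""
    return [[math.exp(b) for b in bs] for bs in slopes]

# Hypothetical two-profile model with one covariate.
intercepts = [0.0]
slopes = [[math.log(2)]]
probabilities = profile_probabilities(intercepts, slopes, covariates=[1.0])
ratios = odds_ratios(slopes)
```

A negative slope therefore means that higher covariate values make membership in that profile less likely relative to the reference profile, which is how negative coefficients in a covariate table are read.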

For the second research question in the example, profiles identified in Research Question 1 are further analyzed to evaluate effects of theoretically identified covariates. Covariates of interest are: personality; cognitive ability, both verbal and spatial; sex; and socio-economic status, as defined by parent education level and occupation. Covariates are added by regressing the latent profile construct on the covariates (e.g., c ON X1 X2 X3). As the research is focused on adolescents who would like to pursue science occupations, Profile 2 (Highest Science Interest & GPA) was used as the reference group. Comparisons of all profiles against a specific profile as a reference group are provided in the Mplus output by default, and the decision on which profile is used as reference can be arbitrary in an exploratory analysis, or strategic to address a specific research question as shown here. For this example, group difference values and statistical significance tests presented in the output are tests between each profile and Profile 2, based upon the research question of the study. The results of this covariate analysis are presented in Table 3. However, individuals could replicate this approach using more than one profile as the reference profile, depending on their research question.

INSERT TABLE 3 ABOUT HERE

Based on this analysis, some covariates produce significant differences across profiles. Results suggest sex is statistically significantly different between Profile 2 (Highest Science Interest & GPA) and Profile 4 (Medium Science Interest & GPA), with more females represented in the high science interest profile. This difference represents a difference in membership by sex, not necessarily in measurement. If measurement across genders were of research interest or concern, then a two-group (i.e., male and female) model could have been run to test for measurement invariance (i.e., differences in parameter values by sex) of the latent profile construct. Vocabulary scores are also statistically significantly different between Profile 2 (Highest Science Interest & GPA) and both Profile 1 (Lowest Science Interest & GPA) and Profile 4 (Medium Science Interest & GPA). Negative coefficients indicate the high science interest group tends to have higher vocabulary scores than the other two profiles. Finally, personality is shown to be statistically significantly different for Profile 1 (Lowest Science Interest & GPA) as compared to Profile 2 (Highest Science Interest & GPA). A cross-tabulation demonstrates that significantly more of the students considered “Well-Adjusted” fall into the high science interest profile (see Author, 2018).

Step six, the final step in the analysis, is preparing the results for dissemination. Presentation of an LPA should generally follow the same sequence as the analysis. First, detail the data cleaning conducted, what assumptions were checked and the results, and how missing data were handled. Include information about the software used for analysis (e.g., Mplus 8) and estimation decisions made (e.g., changes to random starts, estimation method). Second, report the steps taken to estimate LPA models. How many models were included in the analysis, and why? Did any of the models not converge? In addition, detail all decisions and problems that occurred in this step and how any problems were addressed.

Third, report in a table all models that were estimated and the range of fit indices that informed the overall decision process. We recommend including the loglikelihood value, AIC, BIC, SABIC, LMR test, and BLRT. Preference in model retention should be given to SABIC, BIC, and BLRT, based on the simulation studies discussed previously, but model retention decisions can be strengthened with agreement between multiple pieces of information (Kim, 2014; Masyn, 2013; Morgan, 2015; Nylund, Asparouhov, & Muthén, 2007). Entropy may also be reported, particularly to support the accuracy of assigning individuals to profiles for further analysis. Reporting the percentage of the sample that is found in the smallest profile can also be a useful metric to support model retention decisions. There are other fit indices and metrics that have been recommended, and these may be included as the researcher feels is appropriate and/or as is common in a particular field’s literature (Marsh et al., 2009; Masyn, 2013; Nylund et al., 2007; Tein et al., 2013; Vermunt & Magidson, 2002). Overall, a holistic approach to reporting metrics should be used to support model retention decisions. Model retention decisions should be justified by the majority of the metrics and indices included, even if some do not suggest the same model (Kline, 2011; Marsh et al., 2004; Masyn, 2013; Nylund et al., 2007; Schreiber, Nora, Stage, Barlow, & King, 2006).

Fourth, describe the profiles from the retained model. Pay attention to indicator variables that appear to best differentiate between profiles and highlight the significant differences for the reader. Tables and graphs can both be used to help the reader understand the relationships between the profiles and the indicators. Naming the profiles is not necessary, but can be a helpful way for readers to differentiate between groups and understand what indicators appear meaningful. As the investigator, guide the reader in interpreting the profiles and implications for differences observed.

Fifth, if applicable, detail the covariate analysis used. If covariates were added to the LPA model and examined concurrently or in a three-step approach, report results of this analysis. If individuals were assigned to profiles for further analysis independent of the LPA, justify this process and clearly detail steps taken. How were individuals assigned to a profile? Were individuals not clearly assigned to one profile, and if so how was this handled?

INSERT FIGURE 3 ABOUT HERE

Conclusion

This article is intended to serve as a primer on the use of latent profile analysis (LPA) in Mplus. LPA can be a very useful approach in research focused on personal behaviors and characteristics, as it uses person-level indicators to sort individuals into latent profiles based on shared response patterns. LPA can be used in any social science research context where individual differences and/or underlying patterns of shared behavior may be of interest. The example presented here is one such application, but many other possibilities exist for applications of LPA. LPA models of family functioning (Rybak et al., 2017), employees’ commitment mindsets (Morin, Meyer, Creusier, & Bietry, 2016), adolescents’ coping strategies (Aldridge & Roesch, 2008), and school climate (Eck, Johnson, Bettencourt, & Johnson, 2017) are examples of other psychological and behavioral phenomena that have been examined.

However, researchers should also be thoughtful of the assumptions and limitations of LPA before using this method in their own work. LPA assumes the indicators of the latent profiles are continuous, as opposed to the categorical indicators used in latent class analysis. Additionally, variables selected for inputs and covariates should be both theoretically supported and appropriate for LPA. Using advanced statistical methods for their own sake does not improve research practice, but using a technique like LPA when appropriate and theoretically justified can increase our understanding of phenomena and uncover previously unknown differences or latent groupings in data that may be meaningful to the field. This is not intended as an exhaustive discussion of all current issues in LPA literature, and researchers should continue to evaluate new approaches and improvements in LPA practice when undertaking a new analysis. However, this resource is intended to be a starting place with a clear guide and discussion on the use of LPA in applied research for novice researchers and reviewers new to this approach.

References

Aldridge, A. A., & Roesch, S. C. (2008). Developing coping typologies of minority adolescents: A latent profile analysis. Journal of Adolescence, 31, 499-517. doi: 10.1016/j.adolescence.2007.08.005

Asparouhov, T., & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21(3), 329-341.

Author (2018)

Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48, 5-37. doi: 10.1016/j.jsp.2009.10.001

Bauer, D. J., & Curran, P. J. (2004). The integration of continuous and discrete latent variable models: Potential problems and promising opportunities. Psychological Methods, 9, 3-29. doi: 10.1037/1082-989X.9.1.3

Bergman, L. R., & Magnusson, D. (1997). A person-oriented approach in research on developmental psychopathology. Development and Psychopathology, 9, 291-319. doi: 10.1017/S095457949700206X

Bergman, L. R., Magnusson, D., & El Khouri, B. M. (2003). Studying individual development in an interindividual context: A person-oriented approach. Mahwah, NJ: Psychology Press.

Celeux, G., & Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification, 13, 195-212.

Clark, S. L., & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. Available online: http://hbanaszak.mjr.uw.edu.pl/TempTxt/relatinglca.pdf

Collins, L. M., & Lanza, S. T. (2013). Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. Hoboken, NJ: John Wiley & Sons.

Christie, C. A., & Masyn, K. E. (2008). Latent profiles of evaluators’ self-reported practices. The Canadian Journal of Program Evaluation, 23(2), 225-254.

Eck, K. V., Johnson, S. R., Bettencourt, A., & Johnson, S. L. (2017). How school climate relates to chronic absence: A multi-level latent profile analysis. Journal of School Psychology, 61, 89-102. doi: 10.1016/j.jsp.2016.10.001

Gibson, W. A. (1959). Three multivariate models: Factor analysis, latent structure analysis, and latent profile analysis. Psychometrika, 24, 229-252.

Gudicha, D. (2015). Power analysis methods for tests in latent class and latent Markov models. Ridderkerk, The Netherlands: Ridderprint BV.

Henson, R. K., Hull, D. M., & Williams, C. S. (2010). Methodology in our education research culture: Toward a stronger collective quantitative proficiency. Educational Researcher, 39(3), 229-240.

Howard, W. J., Rhemtulla, M., & Little, T. D. (2015). Using principal components as auxiliary variables in missing data estimation. Multivariate Behavioral Research, 50(3), 285-299.

Kim, S. Y. (2014). Determining the number of latent classes in single- and multiphase growth mixture models. Structural Equation Modeling, 21(2), 263-279. doi: 10.1080/10705511.2014.882690

Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.

Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767-778.

MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130-149.

Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural Equation Modeling, 11, 320-341. doi: 10.1207/s15328007sem1103_2

Marsh, H. W., Lüdtke, O., Trautwein, U., & Morin, A. J. (2009). Classical latent profile analysis of academic self-concept dimensions: Synergy of person- and variable-centered approaches to theoretical models of self-concept. Structural Equation Modeling, 16, 191-225. doi: 10.1080/10705510902751010

Masyn, K. E. (2013). Latent class analysis and finite mixture modeling. In T. Little (Ed.), The Oxford Handbook of Quantitative Methods (pp. 551-611). New York, NY: Oxford University Press.

Masyn, K. E. (2017). Measurement invariance and differential item functioning in latent class analysis with stepwise multiple indicator multiple cause modeling. Structural Equation Modeling: A Multidisciplinary Journal, 24(2), 180-197.

McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Applied Statistics, 36, 318-324. doi: 10.2307/2347790

McLarnon, M. J. W., & O’Neill, T. A. (2018). Extensions of auxiliary variable approaches for the investigation of mediation, moderation, and conditional effects in mixture models. Organizational Research Methods, 21(4), 955-982. doi: 10.1177/1094428118770731
53
54 Morgan, G. B. (2015). Mixed mode latent class analysis: An examination of fit index
55
56
57
58
59
60 http://mc.manuscriptcentral.com/ijbd
Page 29 of 54 International Journal of Behavioral Development

FINDING LATENT GROUPS 29


1
2
3 performance for classification. Structural Equation Modeling, 22(1), 76-86. doi:
4
5
6 10.1080/10705511.2014.935751
7
8 Morin, A. J. S., Meyer, J. P., Creusier, J., & Bietry, F. (2016). Multiple-group analysis of
9
10 similarity in latent profile solutions. Organizational Research Methods, 19(2), 231-254.
11
12
13
doi: 10.1177/1094428115621148
14
15 Muthén, L. (2010, November 29). Re: Latent profile analysis [Online discussion group].
16
17 Retrieved from
18
19
http://www.statmodel.com/discussion/messages/13/115.html?1507757139
Fo
20
21
22 Muthén, L.K. and Muthén, B.O. (1998-2017). Mplus user’s guide. (8th Ed.). Los Angeles, CA:
23
rP

24 Muthén & Muthén


Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535-569. doi: 10.1080/10705510701575396

Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440-461. doi: 10.1037/tps0000176

Nylund-Gibson, K., & Masyn, K. E. (2016). Covariates and mixture modeling: Results of a simulation study exploring the impact of misspecified effects on class enumeration. Structural Equation Modeling, 23, 782-797. doi: 10.1080/10705511.2016.1221313

Nylund-Gibson, K., Grimm, R., Quirk, M., & Furlong, M. (2014). A latent transition mixture model using the three-step specification. Structural Equation Modeling: A Multidisciplinary Journal, 21, 1-16. doi: 10.1080/10705511.2014.915375

Oberski, D. (2016). Mixture models: Latent profile and latent class analysis. In J. Robertson & M. Kaptein (Eds.), Modern Statistical Methods for HCI. Cham, Switzerland: Springer International Publishing.

Osborne, J. W. (2012). Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Thousand Oaks, CA: Sage.

Osterlind, S. J., & Everson, H. T. (2009). Differential item functioning (Vol. 161). Sage Publications.

Park, J., & Yu, H. T. (2017). Recommendations on the sample sizes for multilevel latent class models. Educational and Psychological Measurement. Available online: http://journals.sagepub.com/doi/abs/10.1177/0013164417719111. doi: 10.1177/0013164417719111

Peugh, J., & Fan, X. (2015). Enumeration index performance in generalized growth mixture models: A Monte Carlo test of Muthén’s (2003) hypothesis. Structural Equation Modeling, 22(1), 115-131. doi: 10.1080/10705511.2014.919823

Preacher, K. J., & Coffman, D. L. (2006, May). Computing power and minimum sample size for RMSEA [Computer software]. Available from http://quantpsy.org/

Rybak, T. M., Ali, J. S., Berlin, K. S., Klages, K. L., Banks, G. G., Kamody, R. C., Ferry, R. J., Alemzadeh, R., & Diaz-Thomas, A. M. (2017). Patterns of family functioning and diabetes-specific conflict in relation to glycemic control and health-related quality of life among youth with Type I Diabetes. Journal of Pediatric Psychology, 42(1), 40-51. doi: 10.1093/jpepsy/jsw071

Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. Journal of Educational Research, 99, 323-338.

Sterba, S. K. (2013). Understanding linkages among mixture models. Multivariate Behavioral Research, 48, 775-815. doi: 10.1080/00273171.2013.827564

Tein, J. Y., Coxe, S., & Cham, H. (2013). Statistical power to detect the correct number of classes in latent profile analysis. Structural Equation Modeling, 20, 640-657. doi: 10.1080/10705511.2013.824781

Vermunt, J. K. (2010). Latent class modeling with covariates: Two improved three-step approaches. Political Analysis, 18(4), 450-469.

Vermunt, J. K., & Magidson, J. (2002). Latent class cluster analysis. In J. Hagenaars & A. McCutcheon (Eds.), Applied latent class analysis (pp. 89-106). Cambridge: Cambridge University Press.

Vincent, W. J., & Weir, J. P. (2012). Statistics in Kinesiology (4th ed.). Champaign, IL: Human Kinetics.

Table 1

LPA Model Fit Summary for Research Question 1

Model  Log likelihood  AIC       BIC       SABIC     Entropy  Smallest class %  LMR p-value  LMR meaning  BLRT p-value  BLRT meaning
1      -8909.94        17867.88  17955.96  17879.85  -        -                 -            -            -             -
2      -8507.38        17088.75  17224.54  17107.21  0.89     45%               <0.001       2>1          <0.001        2>1
3      -8370.30        16840.60  17024.10  16865.54  0.90     30%               0.002        3>2          <0.001        3>2
4      -8290.15        16706.31  16937.51  16737.73  0.89     16%               0.010        4>3          <0.001        4>3
5      -8253.86        16659.72  16938.63  16697.62  0.90     2%                0.614        5<4          <0.001        5>4

Note. n = 295. The Lo-Mendell-Rubin (LMR) test and the bootstrap likelihood ratio test (BLRT) compare the current model to a model with k-1 profiles.
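
As a worked check on Table 1, the information criteria can be recovered from the log likelihood. The sketch below assumes the standard definitions, where p (the number of free parameters) and n (the analysis sample size) are symbols introduced here rather than taken from the table:

```latex
\begin{align*}
\mathrm{AIC}   &= -2\log L + 2p \\
\mathrm{BIC}   &= -2\log L + p\,\ln n \\
\mathrm{SABIC} &= -2\log L + p\,\ln\!\left(\frac{n+2}{24}\right) \\[4pt]
\text{One-profile model: } p &= \frac{\mathrm{AIC} + 2\log L}{2}
  = \frac{17867.88 - 17819.88}{2} = 24
\end{align*}
```

The value p = 24 is consistent with 12 indicator means plus 12 indicator variances. Lower values indicate better relative fit; in Table 1 the BIC is lowest for the four-profile model (16937.51, against 16938.63 for five profiles).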

Table 2

Four-Profile Model Results

                              Profile 1           Profile 2           Profile 3           Profile 4
                              Low Science         High Science        Low Science         Medium Science
                              Interest & GPA      Interest & GPA      Interest, High GPA  Interest & GPA
Variable                      (n = 42)            (n = 78)            (n = 89)            (n = 77)
Number of Science Classes     3.74 (0.88)         4.31 (0.88)         3.78 (0.88)         3.71 (0.88)
GPA                           2.95 (0.63)         3.77 (0.63)         3.59 (0.63)         3.11 (0.63)
Science Interest              3.65 (0.52)         1.32 (0.52)         3.49 (0.52)         2.00 (0.52)
Science Motivation
  Intrinsic Motivation        10.31 (3.05)        21.03 (3.05)        15.58 (3.05)        16.91 (3.05)
  Career Motivation           8.18 (2.84)         23.06 (2.84)        11.27 (2.84)        16.27 (2.84)
  Self-Determination          9.19 (3.00)         19.37 (3.00)        14.50 (3.00)        14.43 (3.00)
  Self-Efficacy               12.63 (3.22)        21.75 (3.22)        18.23 (3.22)        17.26 (3.22)
  Grade Motivation            12.74 (3.70)        21.80 (3.70)        18.37 (3.70)        17.62 (3.70)
Attitude Towards Science
  Instrumental Value          41.36 (6.75)        59.05 (6.75)        48.80 (6.75)        51.67 (6.75)
  Academic                    19.20 (5.39)        23.89 (5.39)        21.73 (5.39)        23.24 (5.39)
  Difficulties & Complexities 20.15 (4.52)        21.25 (4.52)        20.88 (4.52)        22.75 (4.52)
  General School              11.23 (2.71)        9.433 (2.71)        9.54 (2.71)         10.16 (2.71)

Note. Values representing highest positive response in bold (Science Interest and ATS-General School are reverse coded). Means and standard deviations for variables across all profiles: Number of Science Classes M=3.78 (SD=0.90), GPA M=3.41 (SD=0.70), Science Interest M=2.00 (SD=0.50), Intrinsic Motivation M=16.58 (SD=4.60), Career Motivation M=15.32 (SD=6.11), Self-Determination M=14.95 (SD=4.42), Self-Efficacy M=18.02 (SD=4.35), Grade Motivation M=18.20 (SD=4.69), Instrumental Value M=51.18 (SD=8.90), Academic M=22.34 (SD=5.63), Difficulties & Complexities M=21.38 (SD=4.62), General School M=9.94 (SD=2.78).

Table 3

Covariate Analysis Results for the 4-Profile Model

                          Profile 1             Profile 3             Profile 4
                          Low Science           Low Science           Medium Science
                          Interest & GPA        Interest, High GPA    Interest & GPA
Variable                  (n = 42)              (n = 89)              (n = 77)
Sex                       -0.96                 -0.32                 -1.373*
Shipley-2 Vocabulary      -0.19*                -0.10                 -0.23*
Shipley-2 Block Patterns  -0.07                 -0.002                -0.05
Mother’s SES              0.02                  -0.02                 0.003
Father’s SES              -0.02                 0.01                  -0.002
Personality               -1.08*                -0.12                 -0.556

Note. * = p < .05. Profile 2 (High Science Interest & GPA, n = 78) served as the reference group.
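
To read the coefficients in Table 3, it may help to convert them to odds ratios. The sketch below assumes the entries are the multinomial logistic regression coefficients that a C ON statement produces, with Profile 2 as the reference; the symbols a_k, b_k, and x are introduced here for illustration:

```latex
\begin{align*}
\ln\frac{P(c = k \mid x)}{P(c = 2 \mid x)} &= a_k + b_k x,
\qquad \mathrm{OR}_k = e^{b_k} \\[4pt]
\text{Shipley-2 Vocabulary, Profile 1: } e^{-0.19} &\approx 0.83 \\
\text{Shipley-2 Vocabulary, Profile 4: } e^{-0.23} &\approx 0.79
\end{align*}
```

Under that assumption, each one-point increase in Shipley-2 Vocabulary multiplies the odds of membership in Profile 1 rather than Profile 2 by about 0.83, holding the other covariates constant.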

[Figure 1: path diagram. Indicators: Interest (Interest in Science Items), Motivation (Science Motivation Questionnaire), Attitude (Attitudes Towards Science), and Academic Experience (Science Courses Taken and GPA). Latent profile variable: c. Covariates (paths g1-g4): Sex, Personality Profile, Cognitive Ability, SES.]

Figure 1. Example of LPA model with covariates

[Figure 2: line graph. The vertical axis runs from -3 to 3 in z-score units; the horizontal axis lists the indicator variables (Number of Science Classes, GPA, Science Interest, the five Science Motivation subscales, and the four Attitude Towards Science subscales); one line is plotted for each of Profiles 1 to 4.]

Figure 2. Line graph comparing profiles on indicator variables in z-score format

Note. Profile 1 n = 42; Profile 2 n = 78; Profile 3 n = 89; Profile 4 n = 77.
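
The z-score format in Figure 2 can be illustrated with the values in Table 2. As a sketch, assuming each profile mean is standardized against the overall sample mean and standard deviation reported in the Table 2 note (the source does not state the exact procedure):

```latex
\begin{align*}
z_{jk} &= \frac{M_{jk} - M_j}{SD_j} \\[4pt]
\text{Career Motivation, Profile 2: } z &= \frac{23.06 - 15.32}{6.11} \approx 1.27 \\
\text{Career Motivation, Profile 1: } z &= \frac{8.18 - 15.32}{6.11} \approx -1.17
\end{align*}
```

Here M_{jk} is the mean of indicator j in profile k, and M_j and SD_j are the overall mean and standard deviation of indicator j.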

Step One: Data Inspection
Step Two: Iterative Evaluation of Models
Step Three: Model Fit and Interpretability
Step Four: Investigation of Patterns in Profiles
Step Five: Covariate Analysis
Step Six: Presentation of Results

Figure 3. Six foundational steps of latent profile analysis

Appendix A: Mplus Syntax Simple Example

Two-Profile LPA Model

TITLE: LPA generic 2 profile syntax example

DATA:
FILE IS data.dat;
! Specifies file location for data file. Make sure data is in format appropriate for Mplus analysis
! per Mplus manual. This data file is in individual format (one row of data per participant)

VARIABLE:
NAMES ARE Y1-Y5 X1-X3;
! All variables included in data file should be named here.
! X1-X3 is shorthand for X1 through X3.
USEVARIABLES ARE Y1-Y5;
! Only variables intended for use in the analysis should be listed here

CLASSES = c (2);
! This is where you instruct Mplus on how many classes/profiles are being estimated. The initial
! model contains only one class/profile, thus it would be CLASSES = c (1). Above specifies two
! profiles, and for each further iterative model the number in parentheses increases by one, so
! three profiles/classes would be c (3), and so on.

MISSING ARE .;
! Used to communicate how missing data is coded in data file. Here shown with a “.” which is
! all that is included in each cell with missing data in the data file

ANALYSIS:
TYPE = MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way

ESTIMATOR = MLR;
! FIML robust to non-normal data

STARTS = 1000 250;
STITERATIONS = 500;
! Number of random starts for the two stages of the ML estimation. The first STARTS value
! specifies the number of unique sets of start values to begin with; the 250 represents the 250
! best sets of start values carried forward to completion. STITERATIONS specifies the maximum
! number of ML iterations run for each set of start values in the initial stage; if a run
! converges in fewer than 500 iterations it will stop before reaching 500 iterations.
! These values can be increased … see “Four-Profile Final Model with Covariate Analysis
! Syntax” for an example.

LRTSTARTS = 2 1 50 10;
LRTBOOTSTRAP = 250;
! The above start values are for the LRT statistic being run to compare the current
! model fit with the model fit of a model with one less class (k-1). The LRTBOOTSTRAP statement
! specifies the number of bootstrap draws to inform Mplus’ bootstrapped LRT results.

MODEL:
! For a default Mplus model the LPA model does not need to be specified. However, it can be.
! The model can also be modified from the Mplus default of estimating the indicator means
! (uniquely across profiles) and variances (constrained across profiles), as well as the latent
! profile mean.

%OVERALL%

[Y1-Y5]; ! estimates the 5 indicator means for each profile. Without a label after the brackets,
! the means are freely estimated in each profile, not constrained.

Y1-Y5 (Var1-Var5); ! Label Var1-Var5 constrains the estimates of the variances across the
! profiles to be equal.

OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LMR LRT results
! TECH14 provides bootstrapped LRT test

SAVEDATA:
FILE IS LPA2.dat;
! Tells Mplus where to save the output files from the analysis

SAVE = CPROBABILITIES;
! The above command lines are to save the most likely profile membership for each participant
! and the posterior probabilities for their membership in each latent profile.
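
The MODEL comments above note that the default specification (indicator variances held equal across profiles) can be modified. As a minimal sketch, assuming the same generic indicators Y1-Y5 and the two-profile model, the variances could be freed by mentioning them in class-specific sections. This illustrates the general Mplus convention and is not syntax taken from the example study:

```mplus
MODEL:
%OVERALL%
! no overall statements are needed for this sketch

%c#1%
[Y1-Y5];   ! indicator means for profile 1 (freely estimated by default)
Y1-Y5;     ! mentioning the variances here frees them in profile 1

%c#2%
[Y1-Y5];   ! indicator means for profile 2
Y1-Y5;     ! profile 2 variances, no longer held equal to profile 1
```

Freeing the variances adds five parameters per additional profile in this sketch, so such models may need more random starts to reach a replicated solution.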

LPA Model with Covariate Analysis

TITLE: LPA generic syntax example with covariates

DATA:
FILE IS data.dat;
! Specifies file location for data file. Make sure data is in format appropriate for Mplus analysis
! per Mplus manual. This data file is in individual format (one row of data per participant)

VARIABLE:
NAMES ARE Y1-Y5 X1-X3;
! All variables included in data file should be named here
USEVARIABLES ARE Y1-Y5 X1-X3;
! All variables intended for use in the analysis should be listed here. The covariate variables are
! included for this particular version of syntax though they were not included during the class
! enumeration steps.

CLASSES = c (4);
! This is where you instruct Mplus on how many classes/profiles are being estimated.
! For this version of covariate inclusion you would retain the best fitting model from prior
! analyses without covariates.
! In this example that is a model with 4 profiles/classes, so CLASSES = c (4)

MISSING ARE .;
! Used to communicate how missing data is coded in data file, here shown with a “.” and this is
! included in each cell with missing data in the data file

ANALYSIS:
TYPE = MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way
! See above model ANALYSIS syntax for additional options that can be added to improve model
! estimation and convergence.

MODEL:
%OVERALL%
c ON X1 X2 X3;
! New inclusion here instructs Mplus to run the model as before and then add the covariates X1
! X2 and X3 as predictors of “c” classes/profiles

OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LMR test comparing this model to the previous model.
! (cannot be calculated with one profile model)
! TECH14 provides BLRT comparing this model to the previous model.
! (cannot be calculated with one profile model)

SAVEDATA:
FILE IS LPA4Cov.dat;
! Tells Mplus where to save the output files from the analysis

SAVE = CPROBABILITIES;
! The above command lines are to save the most likely profile membership for each participant
! and the posterior probabilities for their membership in each latent profile

Appendix B: Mplus Syntax from Example Study

Enumeration Step

Two-Profile LPA Syntax 4 Classes
TITLE: Latent Profile Analysis Model 1, 2 Profile, Students

DATA:
FILE IS Mplus_LPA_RQ1_Final.csv;
! Specifies file location for data file. Make sure data is in format appropriate for Mplus analysis
! per Mplus manual. This data file is in individual format (one row of data per participant)

VARIABLE:
NAMES ARE ID Sex MomEd MomOcc DadEd
DadOcc ClassNum GPA Plans IPIP_N IPIP_E IPIP_O IPIP_A
IPIP_C SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS Ship_V Ship_B MomSES
DadSES;
! All variables included in data file should be named here.

USEVARIABLES ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS;
! Selecting variables to use in the model (all profile indicators, no covariates at this stage)

CLASSES=c(4);
! This is where you instruct Mplus on how many classes/profiles are being estimated. The initial
! model contains only one class/profile, thus it would be CLASSES = c (1). Above specifies four
! profiles, and for each further iterative model the number in parentheses increases by one.

MISSING ARE .;
! Used to communicate how missing data is coded in data file. Here shown with a “.” which is
! all that is included in each cell with missing data in the data file

ANALYSIS:
TYPE=MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way

ESTIMATOR = MLR;
! FIML robust to non-normal data

STARTS = 1000 250;
STITERATIONS = 500;
! Number of random starts for the two stages of the ML estimation. The first STARTS value
! specifies the number of unique sets of start values to begin with; the 250 represents the 250
! best sets of start values carried forward to completion. STITERATIONS specifies the maximum
! number of ML iterations run for each set of start values in the initial stage; if a run
! converges in fewer than 500 iterations it will stop before reaching 500 iterations.
! These values can be increased … see “Four-Profile Final Model with Covariate Analysis
! Syntax” for an example.

LRTSTARTS = 2 1 50 10;
LRTBOOTSTRAP = 250;
! The above start values are for the LRT statistic being run to compare the current
! model fit with the model fit of a model with one less class (k-1). The LRTBOOTSTRAP statement
! specifies the number of bootstrap draws to inform Mplus’ bootstrapped LRT results.

MODEL:
! For a default Mplus model the LPA model does not need to be specified. However, it can be.
! The model can also be modified from the Mplus default of estimating the indicator means
! (uniquely across profiles) and variances (constrained across profiles), as well as the latent
! profile mean. The syntax below specifies the Mplus defaults.

%OVERALL%

[ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS]; ! estimates the indicator means for each profile. Without a label
! after the brackets, the means are freely estimated in each profile, not constrained.

ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS (Var1-Var12); ! Label Var1-Var12 constrains the estimates of the
! variances across the profiles to be equal.

OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LMR LRT results
! TECH14 provides bootstrapped LRT test

SAVEDATA:
FILE IS LPA1_2_FINAL.dat;
! Tells Mplus where to save the output files from the analysis

SAVE = CPROBABILITIES;
! The above command lines are to save the most likely profile membership for each participant
! and the posterior probabilities for their membership in each latent profile.

Appendix C: Mplus Syntax from Example Study

Four-Profile Final Model with BCH Covariate Analysis Syntax

Step 2, after the enumeration phase is completed, is to create a file that contains both the BCH weights and the covariates.

TITLE: Latent Profile Analysis Model 1, 4 Profile, Students

DATA:
FILE IS Mplus_LPA_RQ1_Final.csv;

VARIABLE:
NAMES ARE ID Sex MomEd MomOcc DadEd
DadOcc ClassNum GPA Plans IPIP_N IPIP_E IPIP_O IPIP_A
IPIP_C SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS Ship_V Ship_B MomSES
DadSES;

USEVARIABLES ClassNum GPA Plans SM_IM SM_CM SM_SD
SM_SE SM_GM ATS_IV ATS_AP ATS_DC ATS_GS;

CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

AUXILIARY = Sex Ship_V Ship_B MomSES DadSES PersProf;
! Now, covariates are included in the AUXILIARY statement so that they will be included in the
! datafile outputted by the SAVEDATA command at the bottom of the syntax file.

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=2000 500;
! Added increased number of starts for each step of the ML estimation in response to message
! about possible convergence issue noted in output

MODEL:
! Want to run the model that was decided upon through the enumeration phase.

! For a default Mplus model the LPA model does not need to be specified. However, it can be.
! The model can also be modified from the Mplus default of estimating the indicator means
! (uniquely across profiles) and variances (constrained across profiles), as well as the latent
! profile mean. The syntax below specifies the Mplus defaults.

%OVERALL%

[ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS]; ! estimates the indicator means for each profile. Without a label
! after the brackets, the means are freely estimated in each profile, not constrained.

ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS (Var1-Var12); ! Label Var1-Var12 constrains the estimates of the
! variances across the profiles to be equal.

OUTPUT:
TECH1 TECH8 TECH11 TECH14;

SAVEDATA:
FILE IS LPA1_4_FINAL_Cov.dat;
SAVE = bchweights;
! This statement makes sure that the BCH weights for each of the profiles are saved. The BCH
! weights are based upon the “Classification Probabilities for the Most Likely Latent Class
! Membership (Column) by Latent Class (Row)”. These are used in the next modeling step to
! specify the profiles so that they are not affected by the inclusion of the covariates in the model.

Mplus Syntax from Example Study
Four-Profile Final Model with BCH Covariate Analysis Syntax

! Step 3: Estimating covariate and profile relationships

TITLE: Latent Profile Analysis Model 1, 4 Profile, Students

DATA:
FILE IS LPA1_4_FINAL_Cov.dat;
! Note: the file name changed to the file outputted from the model run above in Step 2.

VARIABLE:
NAMES ARE ID Sex MomEd MomOcc DadEd
DadOcc ClassNum GPA Plans IPIP_N IPIP_E IPIP_O IPIP_A
IPIP_C SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS Ship_V Ship_B MomSES
DadSES W1-W4 MLC;
! The BCH weights (W1-W4) are included at the end of the datafile.

USEVARIABLES Sex Ship_V Ship_B MomSES DadSES PersProf
W1-W4;
! Now, covariates are included in the USEVARIABLES statement. The BCH weights (W1-W4)
! are also included. Because the BCH weights are included, the original indicators do not need to
! be included in the USEVARIABLES statement. The BCH weights are unique to the individual,
! therefore retaining the classification uncertainty present in the enumeration step in this model.

CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

TRAINING = W1-W4 (bch);

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=0;
! Note. Starts now 0 because the BCH weights are specifying the classes based upon the prior
! model run.

MODEL:

%OVERALL%

C ON Sex Ship_V Ship_B MomSES DadSES PersProf;
! Latent profiles regressed on the covariate variables. These are uniquely estimated for each class.
! If there are additional relationships (e.g., PersProf ON Sex) that should be estimated uniquely in
! one or more of the classes (i.e., not constrained across classes), then the class-specific syntax
! can be added below.

OUTPUT:
TECH1 TECH8 TECH11 TECH14;

1
2
3 Appendix D: Mplus Syntax from Example Study
4
5 Four-Profile Final Model with VAM (three-step) Covariate Analysis Syntax
6
7 Step 2, after enumeration phase completed is to create a file with the class posterior
8 probabilities along with the covariates by including them as auxiliary variables.
9
10 TITLE: Latent Profile Analysis Model 1, 4 Profile, Students
11
12 DATA:
13
FILE IS Mplus_LPA_RQ1_Final.csv;
14
15
16 VARIABLE:
17 NAMES ARE ID Sex MomEd MomOcc DadEd
18 DadOcc ClassNum GPA Plans IPIP_N IPIP_E IPIP_O IPIP_A
19 IPIP_C SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
Fo
20 ATS_AP ATS_DC ATS_GS Ship_V Ship_B MomSES
21
DadSES;
22
23
rP

24 USEVARIABLES ClassNum GPA Plans SM_IM SM_CM SM_SD


25 SM_SE SM_GM ATS_IV ATS_AP ATS_DC ATS_GS;
26
ee

27 CLASSES=c(4);
28 ! Four latent profiles specified, based on results of original iterative modeling process
29
rR

30
31 AUXILLARY = Sex Ship_V Ship_B MomSES DadSES PersProf;
32 ! Now, covariates are included in the AUXILIARY statement so that they will be included in the
!datafile outputted by the SAVEDATA command at the bottom of the syntax file.
ev

33
34
35
36
iew

MISSING ARE .;
37
38
39 ANALYSIS:
40 TYPE=MIXTURE;
41 STARTS=2000 500;
42 ! Added increased number of starts for each step of the ML estimation in response to message
43 ! about possible convergence issue noted in output
44
45
MODEL:
  !Want to run the model that was decided upon through the enumeration phase.

  !For a default Mplus model the LPA model does not need to be specified. However, it can be.
  !The model can also be modified from the Mplus default of estimating the indicator means
  !(uniquely across profiles) and variances (constrained across profiles), as well as the
  !latent profile mean. The syntax below specifies the Mplus defaults.

  %OVERALL%

  [ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
   ATS_AP ATS_DC ATS_GS];
  !Estimates the indicator means for each profile. Without a label
  !after the brackets, the means are freely estimated in each profile, not constrained.

  ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
   ATS_AP ATS_DC ATS_GS (Var1-Var12);
  !Label Var1-Var12 constrains the estimates of the
  !variances across the profiles to be equal.

OUTPUT:
  TECH1 TECH8 TECH11 TECH14;

SAVEDATA:
  FILE IS LPA1_4_FINAL_Cov_VAM.dat;
  SAVE = CPROB;
  !This statement makes sure that the class posterior probabilities and most likely
  !classification (MODAL column) are saved along with the variables in the
  !USEVARIABLES and AUXILIARY statements.
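
The class-specific [MODAL#k@ ] statements in the Step 3 syntax are shown with their fixed values left blank. In a manual three-step analysis these are typically fixed to the logits of the classification probabilities for most likely class membership (column) by latent class (row), which Mplus reports in the Step 2 output. As a rough illustration of where such values come from, the Python sketch below recomputes them from the saved CPROB and MODAL columns. The function name and the toy data are hypothetical and not part of the original study; in practice the values printed by Mplus should be used.

```python
import math

def classification_logits(cprobs, modal):
    """Logits of P(most likely class = s | latent class = c), last class as reference.

    cprobs: list of rows, each holding the K posterior probabilities (CPROB1..CPROBK)
    modal:  list of 1-based most likely class assignments (MODAL column)
    Returns a K x K list: row c is what would be fixed in the %C#c% block.
    Assumption: every cell has nonzero mass, otherwise the log is undefined.
    """
    k = len(cprobs[0])
    # mass[c][s]: posterior mass of latent class c among cases assigned to class s
    mass = [[0.0] * k for _ in range(k)]
    for row, m in zip(cprobs, modal):
        for c in range(k):
            mass[c][m - 1] += row[c]
    logits = []
    for c in range(k):
        total = sum(mass[c])
        probs = [x / total for x in mass[c]]
        logits.append([math.log(p / probs[-1]) for p in probs])
    return logits

# Toy two-class example (made-up values, not from the study data)
toy_cprob = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
toy_modal = [1, 1, 2]
toy_logits = classification_logits(toy_cprob, toy_modal)
for c, row in enumerate(toy_logits, start=1):
    print(f"%C#{c}%:", [round(v, 3) for v in row[:-1]])
```

Only the first K-1 entries of each row are needed, because the last most-likely-class category serves as the reference and its logit is zero.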

Mplus Syntax from Example Study
Four-Profile Final Model with VAM (three-step) Covariate Analysis Syntax

!Step 3: Estimating covariate and profile relationships

TITLE: Latent Profile Analysis Model 1, 4 Profile, Students

DATA:
  FILE IS LPA1_4_FINAL_Cov_VAM.dat;
  !Note: the file name changed to the file output from the model run above in Step 2.

VARIABLE:
  NAMES ARE ID Sex MomEd MomOcc DadEd
    DadOcc ClassNum GPA Plans IPIP_N IPIP_E IPIP_O IPIP_A
    IPIP_C SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
    ATS_AP ATS_DC ATS_GS Ship_V Ship_B MomSES
    DadSES CPROB1 CPROB2 CPROB3 CPROB4 MODAL;
  !There is a posterior probability for each enumerated class (in this case 4)
  !included at the end of the data file. The values in these CPROB columns are unique
  !to the individual, so including them increases the

  USEVARIABLES Sex Ship_V Ship_B MomSES DadSES PersProf
    MODAL;
  !Now, covariates are included in the USEVARIABLES statement. MODAL is also
  !included because it provides the classification for each individual in the
  !data set (i.e., assignment to class 1, 2, 3, or 4 in this example).

  NOMINAL ARE MODAL; !Necessary, because the MODAL variable is nominal.

  CLASSES=c(4);
  ! Four latent profiles specified, based on results of original iterative modeling process

  MISSING ARE .;

ANALYSIS:
  TYPE=MIXTURE;
  STARTS=0;
  ! Note. Starts now 0 because the BCH weights are specifying the classes based upon
  ! the prior model run.

MODEL:
  %OVERALL%

  C ON Sex Ship_V Ship_B MomSES DadSES PersProf;
  !General statement regressing the latent profile variable on the covariates.

  !Below is how the same regressions can be specified if the researcher wants to label
  !the unique regressions by class to use the labels later in either MODEL CONSTRAINT
  !or MODEL TEST.
  !C#1 ON Sex (reg11);
  !C#1 ON Ship_V (reg12);
  !C#1 ON Ship_B (reg13);
  !C#1 ON MomSES (reg14);
  !C#1 ON DadSES (reg15);
  !C#1 ON PersProf (reg16);

  !C#2 ON Sex (reg21);
  !C#2 ON Ship_V (reg22);
  !C#2 ON Ship_B (reg23);
  !C#2 ON MomSES (reg24);
  !C#2 ON DadSES (reg25);
  !C#2 ON PersProf (reg26);

  !C#3 ON Sex (reg31);
  !C#3 ON Ship_V (reg32);
  !C#3 ON Ship_B (reg33);
  !C#3 ON MomSES (reg34);
  !C#3 ON DadSES (reg35);
  !C#3 ON PersProf (reg36);

  !Latent profiles regressed on the covariate variables. These are uniquely estimated
  !for each class. These statements are included for k-1 classes in the syntax, as the
  !last class is the reference class.
  !If there are additional relationships (e.g., PersProf ON Sex) that should be
  !estimated uniquely in one or more of the classes (i.e., not constrained across
  !classes), include them with the rest of the class-specific syntax below.

  %C#1%
  [MODAL#1@ ];
  [MODAL#2@ ];
  [MODAL#3@ ];

  [Sex] (mean11);
  [Ship_V] (mean12);
  [Ship_B] (mean13);
  [MomSES] (mean14);
  [DadSES] (mean15);
  [PersProf] (mean16);

  Sex;
  Ship_V;
  Ship_B;
  MomSES;
  DadSES;
  PersProf;

  %C#2%
  [MODAL#1@ ];
  [MODAL#2@ ];
  [MODAL#3@ ];

  [Sex] (mean21);
  [Ship_V] (mean22);
  [Ship_B] (mean23);
  [MomSES] (mean24);
  [DadSES] (mean25);
  [PersProf] (mean26);

  Sex;
  Ship_V;
  Ship_B;
  MomSES;
  DadSES;
  PersProf;

  %C#3%
  [MODAL#1@ ];
  [MODAL#2@ ];
  [MODAL#3@ ];

  [Sex] (mean31);
  [Ship_V] (mean32);
  [Ship_B] (mean33);
  [MomSES] (mean34);
  [DadSES] (mean35);
  [PersProf] (mean36);

  Sex;
  Ship_V;
  Ship_B;
  MomSES;
  DadSES;
  PersProf;

  %C#4%
  [MODAL#1@ ];
  [MODAL#2@ ];
  [MODAL#3@ ];

  [Sex] (mean41);
  [Ship_V] (mean42);
  [Ship_B] (mean43);
  [MomSES] (mean44);
  [DadSES] (mean45);
  [PersProf] (mean46);

  Sex;
  Ship_V;
  Ship_B;
  MomSES;
  DadSES;
  PersProf;

  !The means and variances are uniquely estimated in each class for each variable of
  !interest. If there were specific relationships to model between any of these
  !variables, the regressions would be included in the class-specific syntax. Unique
  !labels can be used in MODEL CONSTRAINT and/or MODEL TEST commands as is done with
  !other model types to test class differences and indirect effects.

MODEL TEST:
  mean11 = mean21;
  mean11 = mean31;
  mean11 = mean41;
  mean21 = mean31;
  mean21 = mean41;
  mean31 = mean41;

  mean12 = mean22;
  mean12 = mean32;
  mean12 = mean42;
  mean22 = mean32;
  mean22 = mean42;
  mean32 = mean42;

  mean13 = mean23;
  mean13 = mean33;
  mean13 = mean43;
  mean23 = mean33;
  mean23 = mean43;
  mean33 = mean43;

  mean14 = mean24;
  mean14 = mean34;
  mean14 = mean44;
  mean24 = mean34;
  mean24 = mean44;
  mean34 = mean44;

  mean15 = mean25;
  mean15 = mean35;
  mean15 = mean45;
  mean25 = mean35;
  mean25 = mean45;
  mean35 = mean45;

  mean16 = mean26;
  mean16 = mean36;
  mean16 = mean46;
  mean26 = mean36;
  mean26 = mean46;
  mean36 = mean46;

  !The above syntax directly tests whether the means of the variables differ between
  !the classes.

OUTPUT:
  TECH1 TECH8 TECH11 TECH14;
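
MODEL TEST evaluates the listed equality restrictions with a Wald chi-square test. As a minimal illustration of the arithmetic behind a single restriction (for example, mean11 = mean21), the Python sketch below computes the Wald statistic and its p value from two class-specific mean estimates and their standard errors. The numeric values are made up, the function name is hypothetical, and the covariance between the two estimates is assumed to be zero unless supplied. Mplus works from the full estimated parameter covariance matrix and tests all listed restrictions jointly, so its output will generally differ from this single-restriction sketch.

```python
import math

def wald_test_equal_means(m1, se1, m2, se2, cov12=0.0):
    """Wald chi-square test (1 df) of H0: m1 = m2.

    m1, m2:   estimated means of one covariate in two classes
    se1, se2: their standard errors
    cov12:    covariance between the two estimates (assumed 0 if unknown)
    Returns (wald_statistic, p_value).
    """
    diff = m1 - m2
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    wald = diff ** 2 / var_diff
    # Upper-tail probability of a chi-square with 1 df: P(Z^2 > wald)
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# Hypothetical estimates for one covariate in classes 1 and 2
wald, p_value = wald_test_equal_means(m1=3.0, se1=0.3, m2=2.0, se2=0.4)
print(f"Wald chi-square(1) = {wald:.3f}, p = {p_value:.4f}")
```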

