Finding Latent Groups in Observed Data
11-22-2019
E. Whitney G. Moore
Wayne State University, [email protected]
Darrell M. Hull
University of North Texas
Part of the Education Commons, Kinesiology Commons, and the Sports Sciences Commons
Recommended Citation
Ferguson, S. L., Moore, E. W. G., & Hull, D. M. (2020). Finding latent groups in observed data: A primer on
latent profile analysis in Mplus for applied researchers. International Journal of Behavioral Development,
44(5), 458-468. DOI: 10.1177/0165025419881721
This Article is brought to you for free and open access by the College of Education at
DigitalCommons@WayneState. It has been accepted for inclusion in Kinesiology, Health and Sport Studies by an
authorized administrator of DigitalCommons@WayneState.
International Journal of Behavioral Development
Manuscript ID JBD-2019-05-3637.R1
Abstract: researchers familiar with some latent variable modeling but not LPA specifically. A general procedure for conducting LPA is provided in six steps: (a) data inspection, (b) iterative evaluation of models, (c) model fit and interpretability, (d) investigation of patterns of profiles in a
http://mc.manuscriptcentral.com/ijbd
Page 1 of 54 International Journal of Behavioral Development
but a unified guide for understanding and utilizing LPA with clear steps and examples specifically targeted for applied research does not currently appear in the literature. This paper is therefore intended for researchers already familiar with latent variable modeling, though not mixture modeling or LPA specifically. A general discussion of LPA is presented, followed by a worked example using data collected from a prior research study (Author, 2018). Syntax is also provided to conduct LPA in Mplus with a discussion of common options and additions for this program. The paper concludes with a discussion of LPA reporting practices, highlighting the primary information needed to report an LPA in applied research.

This work is intended predominantly for applied researchers who are interested in exploring the applicability of LPA to their own research. Students still developing their methods and analysis expertise may also benefit from this introduction to LPA, as well as reviewers of research needing a brief introduction to the decision points and reporting practices of LPA. It must be noted that this is a primer on LPA and the foundational functions and processes of this analysis. As with any general introduction to a complex analysis, a balance is sought in this
recovering hidden groups in data by obtaining the probability that individuals belong to different groups. This occurs through examination of the distributions of groups in the data and determining if those distributions are meaningful. It might be helpful to think of these groups, whether they are classes of people or profiles of people, as unobserved latent mixture components. Indeed, both LCA and LPA are often referred to using the broad term mixture models. The distinction between LCA and LPA is the way that groups are defined on the basis of the observed variables. In LCA, observed variables are discrete, analogous to a binomial model (see Masyn, 2013 and Nylund-Gibson & Choi, 2018 for more information). In LPA, observed variables are continuous, analogous to a Gaussian model (Oberski, 2016). Additionally, there is latent transition analysis (LTA), which is any model that includes two or more latent class or profile constructs; these constructs can be informed by different indicators or the same indicators measured at different timepoints as a longitudinal extension (for a worked example, see Nylund-Gibson, Grimm, Quirk, & Furlong, 2014).

LPA, and the closely associated latent class analysis, are person-oriented approaches to
highlight relationships among variables, whereas the latent profile model decomposes the covariances to highlight relationships among individuals” (Bauer & Curran, 2004, p. 6).

The person-oriented approach in LPA is grounded in three arguments. First, individual differences are present and important within an effect or phenomenon. Second, these differences occur in a logical way, which can be examined through patterns. Third, a small number of patterns (profiles in LPA) are meaningful and occur across individuals (Bergman & Magnusson, 1997; Bergman, et al., 2003; Sterba, 2013). LPA is particularly useful for researchers in social sciences as patterns of shared behavior between and within samples may be missed when researchers conduct inter-individual, variable-centered analyses. For instance, if an LPA yields three profiles that are relatively evenly spread across the sample, a variable-centered analysis can easily miss the possibly meaningful differences between those profiles. LPA provides the opportunity to examine these profiles and what predicts or is predicted by membership within the different profiles. Variable-centered analyses assume the individuals within the sample all belong to a single profile or population with no differentiation between latent subgroups.
(Morin, Meyer, Creusier, & Bietry, 2016). Researchers can also test for measurement invariance by covariate predictors by extending upon Masyn’s (2017) description of measurement invariance testing for differential item functioning by covariates when conducting LCA models.

The overall goal of LPA is to uncover latent profiles or groups (k) of individuals (i) who share a meaningful and interpretable pattern of responses on the measures of interest (j) (Bergman, et al., 2003; Marsh, et al., 2009; Masyn, 2013; Sterba, 2013). This is done using joint and marginal probabilities in within-class and between-class models. Two equations define the within-class model:

$y_{ij} = \mu_j^{(k)} + \varepsilon_{ij}$ (1)

$\varepsilon_{ij} \sim N(0, \sigma_j^{2(k)})$ (2)

where $\mu_j^{(k)}$ is the model implied mean and $\sigma_j^{2(k)}$ is the model implied variance, which will vary across j = 1…J outcomes and k = 1…K classes or profiles. The general assumptions of LPA include that outcome variables are normally distributed within each class and these within-class outcomes are locally independent (Sterba, 2013). The between-class model represents the
probability for each individual defined as:

$t_{ik} = p(c_i = k \mid \mathbf{y}_i) = \dfrac{p(c_i = k)\, f(\mathbf{y}_i \mid c_i = k)}{f(\mathbf{y}_i)}$, (5)

representing the probability of an individual (i) being assigned membership ($c_i$) in a specific class or profile (k) given their scores on the outcome variables in the $\mathbf{y}_i$ vector. A posterior probability (t) is calculated for each individual in each profile, with values closer to 1.0 indicating a higher probability of membership in a specific profile. The greater the distinction among an individual's posterior probabilities, the more certainty there is around their membership assignment (Sterba, 2013).
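As a minimal sketch of Equation 5 (not the authors' Mplus code), the posterior probabilities can be computed directly once the profile proportions, means, and variances are known. The function names and all parameter values below are hypothetical, and the within-profile density assumes normality and local independence, in line with the assumptions stated for the within-class model.

```python
import math

def normal_pdf(y, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_probabilities(y, proportions, means, variances):
    """Return the posterior probabilities t_ik for one individual's indicator vector y.

    proportions[k] is p(c_i = k); means[k][j] and variances[k][j] are the
    profile-specific mean and variance of indicator j.
    """
    joint = []
    for p_k, mu_k, var_k in zip(proportions, means, variances):
        # Local independence: f(y_i | c_i = k) is a product over indicators.
        density = 1.0
        for y_j, mu_jk, var_jk in zip(y, mu_k, var_k):
            density *= normal_pdf(y_j, mu_jk, var_jk)
        joint.append(p_k * density)
    marginal = sum(joint)  # f(y_i), the denominator of Equation 5
    return [value / marginal for value in joint]

# Hypothetical two-profile model with two indicators (illustrative values only).
proportions = [0.6, 0.4]
means = [[0.0, 0.0], [2.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

t_i = posterior_probabilities([1.8, 2.1], proportions, means, variances)
print([round(t, 3) for t in t_i])
```

In this illustration the individual's scores sit close to the second profile's means, so the second posterior probability dominates; scores midway between the two sets of means would give less distinct probabilities and therefore a less certain assignment.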
Model Retention Decisions

As LPA is a model testing process, multiple models are fit with varying numbers of classes or profiles. The number of models to test depends on the research topic; often published LPA studies have found the best fitting model theoretically and statistically after comparing five to six models (Masyn, 2013; Tein, Coxe, & Cham, 2013). Each model is then compared against the previous model or models to make a decision regarding the number of latent profiles in the data
An alternative is the SABIC, which adjusts the formula to account for n and is less punitive on the number of parameters in the model (Tein, et al., 2013). SABIC has been supported as the most accurate information criteria index in simulation studies, particularly with smaller samples and low class separation (Kim, 2014; Morgan, 2014):

constraints, leading to more variation in the AIC values between models. Like BIC, lower values of AIC indicate better model fit. AIC is calculated as:

$\mathrm{AIC}(K) = -2L(K) + 2v(K)$ (8)
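The three indices can be illustrated with a short sketch. It assumes that L(K) is the log-likelihood of the K-profile model and v(K) its number of free parameters, and it uses the standard textbook forms BIC = −2L + v ln(n) and SABIC = −2L + v ln((n + 2)/24), since the BIC and SABIC equations are not reproduced in this part of the text. The sample size, log-likelihoods, and parameter counts are made up for illustration.

```python
import math

def information_criteria(loglik, n_params, n):
    """Return AIC, BIC, and SABIC for one fitted model.

    loglik is the model log-likelihood, n_params the number of free
    parameters, and n the sample size.
    """
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n)
    sabic = -2.0 * loglik + n_params * math.log((n + 2.0) / 24.0)
    return {"AIC": aic, "BIC": bic, "SABIC": sabic}

# Hypothetical sample size, log-likelihoods, and parameter counts
# for models with one to three profiles.
n = 300
fits = {1: (-2450.0, 8), 2: (-2380.0, 13), 3: (-2365.0, 18)}
for k, (loglik, n_params) in fits.items():
    ic = information_criteria(loglik, n_params, n)
    print(k, {name: round(value, 1) for name, value in ic.items()})
```

Because the SABIC penalty uses ln((n + 2)/24) rather than ln(n), it is always smaller than the BIC penalty for the same model, which is the sense in which SABIC is less punitive on the number of parameters.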
With BIC, SABIC, and AIC, it should be noted that while lower values indicate better fit, lower is relative (Masyn, 2013). Therefore, attention should be given to the magnitude of the difference. Consider two very different models, both of which are being examined for goodness of fit by evaluating the relative improvement produced by moving from a 2-class structure to a 3-class structure. If one observes a reduction in BIC values from 18,000 to 17,950 for Example A, and a
retention as high entropy may indicate more classification uncertainty. Entropy is calculated as:

measure of how well each LPA model partitions the data into profiles (Celeux & Soromenho, 1996). Entropy can range from 0 to 1 with higher values representing better fit of the profiles to the data (Tein, et al., 2013). Note that this interpretation is somewhat counter-intuitive, as lower entropy values actually represent more uncertainty or chaos in the model, which is akin to saying lower values on the entropy statistic indicate more entropy (i.e., more classification uncertainty). Values of .80 or greater provide supporting evidence that profile classification of individuals in the model occurs with minimal uncertainty (Celeux & Soromenho, 1996; Tein, et al., 2013).
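The entropy statistic can be sketched from a matrix of posterior probabilities. The sketch assumes the relative entropy form commonly reported by mixture modeling software, E = 1 − Σ_i Σ_k (−t_ik ln t_ik) / (n ln K); this form is an assumption here, as the equation itself is not legible in this part of the text, and the function name is hypothetical.

```python
import math

def relative_entropy(posteriors):
    """Relative entropy (0 to 1) from an n x K matrix of posterior probabilities.

    Values near 1 indicate that individuals are classified with little
    uncertainty; values near 0 indicate near-uniform posteriors.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("Entropy is only defined for models with two or more profiles.")
    total = 0.0
    for row in posteriors:
        for t in row:
            if t > 0.0:  # the limit of -t * ln(t) as t approaches 0 is 0
                total += -t * math.log(t)
    return 1.0 - total / (n * math.log(k))

# Illustrative posterior probabilities for three individuals and two profiles.
clear = [[0.99, 0.01], [0.02, 0.98], [0.97, 0.03]]
fuzzy = [[0.60, 0.40], [0.45, 0.55], [0.50, 0.50]]
print(round(relative_entropy(clear), 3), round(relative_entropy(fuzzy), 3))
```

The first set of posteriors is sharply separated and yields a value close to 1, whereas the second set is nearly uniform and yields a value close to 0, which matches the interpretation that higher values of the statistic mean less classification uncertainty.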
Additionally, the Lo, Mendell, and Rubin (LMR) test is sometimes used to compare models, in a similar fashion to the χ² difference test in other model testing analyses (Lo, Mendell, & Rubin, 2001; Marsh, et al., 2009; Masyn, 2013; Tein, et al., 2013). LMR tests the likelihood ratio of one model as compared to another with an adjusted asymptotic distribution instead of a
methods to create multiple bootstrap samples to represent the sampling distribution (Masyn, 2013; McLachlan, 1987). A statistically significant BLRT indicates the current model is a better fit than a model with k - 1 profiles. This approach has shown favorable results in simulation studies over the LMR test (Nylund, et al., 2007). However, both the LMR and the BLRT can suffer from never reaching a non-significant value, as the addition of parameters can represent more of the information contained within the data. In such a situation, it is recommended that the log likelihood values be plotted and examined for a bend or “elbow” to determine where the model improvement gain starts to diminish relative to the additional parameters estimated (Masyn, 2013).
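The quantity inspected in such a plot is the gain in log likelihood from each model to the next. A small sketch with made-up log likelihood values (the function name is hypothetical) shows the kind of pattern one would look for before plotting:

```python
def loglik_gains(logliks):
    """Improvement in log likelihood from each model to the next larger one."""
    return [later - earlier for earlier, later in zip(logliks, logliks[1:])]

# Hypothetical log likelihoods for models with one to five profiles.
logliks = [-2450.0, -2380.0, -2350.0, -2344.0, -2341.0]
for k, gain in enumerate(loglik_gains(logliks), start=2):
    print(f"{k - 1} -> {k} profiles: gain = {gain:.1f}")
```

In these illustrative values the gains shrink sharply after the three-profile model, which is where a plotted curve would bend. Locating the elbow remains a visual judgment, so the sketch only tabulates the values that feed the plot.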
Finally, as in any model testing analysis, theoretical support should exist for the final model retained, and patterns and profiles uncovered should be interpretable (Marsh, Hau, & Wen, 2004; Marsh, et al., 2009; Masyn, 2013). Reliance upon theory and prior work to evaluate the reasonableness of a model is essential to LPA to ensure the final model and underlying profiles represent interpretable and meaningful groupings of individuals within the context of
interpretability and substantiveness (Marsh, et al., 2009; Masyn, 2013). Empirically, the lack of support for a small-proportion profile may come from examining the results of the k + 1 model to see if that profile “collapsed” or no longer appears in the results (Masyn, 2013). Lastly, the number of individuals represented by a small-proportion profile should be taken into account, as a small percentage from a large sample may include sufficient individuals (n = 30-60) to support generalizability (Vincent & Weir, 2012).

It is worth noting that good practice includes reporting different close-fitting models and providing a detailed explanation of the decision-making process in selecting the retained model. This detail helps those attempting to replicate the existence of established profiles, since the rationale and decision-making process should be re-applied when possible. For example, if prior work suggested five profiles, but the fifth was removed as a spurious minor grouping, perhaps subsequent examinations that reveal a fifth profile would lead to the conclusion that it was not spurious after all. Alternatively, good fit for only four profiles would help to confirm prior judgment about the removal of the spurious and non-theory-fitting fifth profile. Additionally,
was found to be n = 377, while some simulation studies have suggested samples of 300 to 500 would qualify as a minimum sample (Finch & Bronk, 2011; Nylund, et al., 2007; Peugh & Fan, 2014; Tein, et al., 2013). Readers are referred to Preacher and Coffman (2006) and MacCallum, Browne, and Sugawara (1996) for further information on power analysis in covariance modeling.

Covariate Analyses

Following the profile retention decision in LPA is the examination of covariates to discover relationships and differences between latent groups (Clark & Muthén, 2009; Marsh, et al., 2009; Nylund-Gibson & Masyn, 2016). Exploring relationships with covariates provides additional information on the latent profiles and how the covariate variables may have differing effects on these profiles. One key point here, as highlighted in Marsh, et al. (2009), is that “covariates are assumed to be strictly antecedent variables” (p. 195) and have no effect on the formation of the profiles themselves. Historically, covariates were considered antecedents and not outcomes, because if the covariates were included in the model as outcomes, then they would alter the definition of the latent classes when entered into the model (Nylund-Gibson & Masyn,
posterior probability. Then, the relationship between profile groups, as defined by individuals placed into their most likely profile, and the study covariates is evaluated. This turns the latent profiles into a categorical grouping variable, which reduces the computational burden relative to other approaches. However, this option is only recommended if entropy, or the level of classification uncertainty in the model, is above 0.80 (Clark & Muthén, 2009). This is recommended because profile classifications are treated as fixed, categorical variables, rather than latent profiles with flexibility (e.g., the probability associated with them). If the entropy value is less than .80 (i.e., classification uncertainty in the model is increased), this approach may falsely force individuals into profiles without clear justification. Marsh, et al. (2009) and Christie and Masyn (2010) provide examples of this post hoc approach to covariate analysis.
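The assignment step of this post hoc approach amounts to taking, for each individual, the profile with the largest posterior probability. The sketch below uses hypothetical posterior probabilities and a hypothetical function name; it is intended only to show how modal assignment discards classification uncertainty.

```python
def modal_assignment(posteriors):
    """Assign each individual to the profile with the highest posterior probability."""
    assignments = []
    for row in posteriors:
        best = max(range(len(row)), key=lambda k: row[k])
        assignments.append(best + 1)  # profiles are numbered from 1
    return assignments

# Hypothetical posterior probabilities for three individuals and three profiles.
posteriors = [
    [0.97, 0.02, 0.01],
    [0.05, 0.90, 0.05],
    [0.40, 0.38, 0.22],
]
for row, profile in zip(posteriors, modal_assignment(posteriors)):
    print(f"assigned to profile {profile} with posterior probability {max(row):.2f}")
```

The third individual is placed in Profile 1 even though that profile is only marginally more likely than Profile 2. Once the assignment is treated as a fixed grouping variable, that uncertainty is no longer visible, which is why the approach is recommended only when entropy is high.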
Method two is a more advanced approach to LPA covariate inclusion called the ML three-step approach, which is also sometimes referred to as the VAM approach (Asparouhov & Muthén, 2014; Vermunt, 2010). This approach follows the same first step as method one in that the latent profiles are evaluated first. In step two, the individuals are assigned to their most likely
Muthén, 2018; Masyn, 2013; McLarnon & O’Neill, 2018). The first step of the BCH approach is again determining the number of latent profiles without including the covariates in the model (Clark & Muthén, 2009; Marsh, et al., 2009). Similar to the VAM approach described above, in the second step the participants’ individual class probabilities are used to specify their probability of membership in each latent profile. Therefore, this method includes individual rather than average uncertainty in profile classification. Although this makes the BCH approach computationally complex, simulation studies have found this approach to be relatively more robust than method one (Clark & Muthén, 2009; Nylund-Gibson & Masyn, 2016). There are now default commands available within Mplus for the most basic implementation of the BCH approach (see syntax for an example of how to implement these steps; for further details regarding the limitations of the default command options in Mplus, see Asparouhov & Muthén, 2018). Using the BCH approach means that indicators for the profiles are present in the model with the covariates during analysis, as shown in Figure 1. Note in the figure how the latent profiles predict responses on the indicators, while the antecedent covariates are predicting the profiles. The
implement measurement invariance steps with predictor covariates that result in differential item functioning information regarding the indicators, profiles, and covariates. To learn more about this approach see Masyn (2017) and Osterlind and Everson (2009). Syntax examples for each of these approaches are provided in the supplemental files for this article.

A worked example of the use of LPA with observed data is provided to walk the reader through the LPA process. Data for this example come from a larger empirical study published elsewhere (see Author, 2018). The data are used here only as an example to demonstrate LPA, as opposed to the theoretical implications of the results. Data were collected from 295 high school students from grades 11 and 12 in one public school district in the southern United States.

LPA was used as the primary inferential analysis method for the study. All latent profile analyses were conducted using Mplus 8 (Muthén & Muthén, 1998-2017) with maximum likelihood estimation. LPA was used in two applications: first in exploring latent profiles of
3). In step one, as with all analyses, the data should be cleaned for analysis and checked for standard statistical assumptions (e.g., normality of continuous variables, independence of observations, etc.; see Osborne, 2012). In this example, item-level missing values were estimated prior to the analysis process using maximum likelihood estimation in Mplus. Cases where all items were missing for one composite variable (i.e., missingness at the composite level) were estimated in the LPA process in Mplus utilizing full-information maximum likelihood (FIML). Participants were removed from the analysis if values on all variables in the study were missing (n = 4). As in other latent variable analyses, missing data can be handled by FIML or multiple imputation, depending on what is best for the scenario. Multiple imputation is recommended when a large dataset is being utilized to answer different research questions, planned missing data designs were implemented, or the computational burden for model convergence will be lessened by utilizing imputed datasets rather than maximum likelihood estimation to handle the missingness while estimating the model (Baraldi & Enders, 2010). In this example, we will highlight how to utilize FIML with Mplus, which can often address individuals’ needs with
Version 8 user’s manual examples 7.9 and 7.12 (Muthén & Muthén, 1998-2017). Samples of the basic LPA syntax can be found in Appendix A, and the syntax from this article’s example may be found in Appendix B. In Mplus syntax, LPA is defined in two ways. First, the CLASSES command is added after the USEVARIABLES command to specify the number of classes or profiles being estimated in the model. So for the first one-profile model, CLASSES is set to c (1) as shown in the example. For a two-profile model this would be CLASSES = c (2), and so on. Second, the analysis TYPE is entered as MIXTURE, as LPA is a special type of mixture model.
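Because the same input is re-run with only the CLASSES value changed, the model series can be generated programmatically. The Python sketch below writes a bare-bones Mplus input for a given number of profiles. It is an assumed minimal layout rather than the syntax from the article's appendices, and the data file name, variable names, and helper function are hypothetical.

```python
def build_lpa_input(data_file, indicators, n_profiles):
    """Return the text of a minimal Mplus input for an LPA with n_profiles profiles."""
    names = " ".join(indicators)
    lines = [
        f"TITLE: LPA with {n_profiles} profiles",
        f"DATA: FILE = {data_file};",
        "VARIABLE:",
        f"  NAMES = {names};",
        f"  USEVARIABLES = {names};",
        f"  CLASSES = c({n_profiles});",  # number of latent profiles to estimate
        "ANALYSIS:",
        "  TYPE = MIXTURE;",  # LPA is estimated as a mixture model
    ]
    return "\n".join(lines)

# Hypothetical data file and indicator names; one input per candidate model.
indicators = ["y1", "y2", "y3", "y4"]
for k in range(1, 6):
    text = build_lpa_input("example.dat", indicators, k)
    print(text, end="\n\n")
```

Generating the inputs this way keeps every specification identical across models except for the number of profiles, so differences in fit can be attributed to that change alone.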
Additionally, Mplus output options can be requested to support the LPA analysis process (Muthén & Muthén, 1998-2017). TECH 1 is used for multiple analyses in Mplus as it provides parameter specification and starting values for all estimated parameters in the model. This output can be helpful in identifying estimation errors. TECH 8 provides the optimization history for RANDOM, MIXTURE, and TWOLEVEL analyses in Mplus. TECH 11 is used to call for the LMR test to compare the current model against the prior model estimated with k - 1 classes or profiles. Finally, TECH 14 calls for the bootstrap likelihood ratio test. Note that TECH 11 and
in obtaining a solution in Mplus. The first is using the STARTS subcommand under ANALYSIS to increase the number of initial stage starts in the maximization step, the number of final stage optimizations in the likelihood step of the ML estimation, or both (Muthén & Muthén, 1998-2017). This helps the iteration process of the ML or MLR estimators reach convergence by extending the number of attempts (Muthén & Muthén, 1998-2017). It is recommended that the second value for the final stage optimizations be no more than a quarter of the initial stage starts value (Muthén, 2010; Muthén & Muthén, 1998-2017).
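The one-quarter guideline is easy to apply mechanically. The small helper below, a hypothetical convenience function rather than part of Mplus, formats a STARTS line and rejects a final stage value larger than a quarter of the initial stage starts.

```python
def starts_option(initial_starts, final_optimizations=None):
    """Format a STARTS line for the Mplus ANALYSIS command.

    If final_optimizations is omitted, it is set to one quarter of
    initial_starts, the upper bound recommended in the text.
    """
    limit = initial_starts // 4
    if final_optimizations is None:
        final_optimizations = limit
    if final_optimizations > limit:
        raise ValueError(
            "Final stage optimizations should be no more than a quarter "
            "of the initial stage starts."
        )
    return f"STARTS = {initial_starts} {final_optimizations};"

print(starts_option(400))
print(starts_option(1000, 200))
```

The returned line would be placed under ANALYSIS alongside TYPE = MIXTURE; the specific values shown are illustrative and should be chosen to suit the model and data at hand.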
Another approach that can assist in obtaining a solution for a model is providing start values. There are two ways to specify start values. One is to manually specify start values, which is done in the syntax by adding an asterisk (*) after a parameter in the model followed by the start value. For example, Y ON X*.50 instructs Mplus to use .50 as the start value for the regression coefficient Y ON X (i.e., X predicting Y). Another option, particularly if you have attempted to model a solution that did not converge, is to use the values the model estimation did reach for the parameter estimates as start values. Mplus always provides these start values in
file. To test if the model resulted from a local maximum in the estimation process, different, random start values can be used to run the model again (see Hipp & Bauer, 2006 for further information). Providing start values essentially helps the ML estimation process by informing the Mplus estimator to pick up where it left off, and hopefully reach model convergence.

In the example, a latent profile analysis was used for Research Question 1 to identify the presence of latent profiles on measures of science interest, motivation, attitude, and academic experiences. Model 1 was estimated with only one profile, Model 2 with two profiles, and so on to Model 5 with five profiles. Model fit statistics are provided in Table 1.

INSERT TABLE 1 ABOUT HERE

Step three involves evaluating models to identify model fit and interpretability. In the example presented, Model 4 was retained as the best model to fit the data based on the low loglikelihood value, AIC, BIC, and SABIC values; the high entropy value; the non-significant LMR test; the smallest class containing more than 5% of the sample; and the profiles being supported by theory. The loglikelihood value revealed relatively large decreases until the difference between
confident the profile represents a distinct grouping that might be generalizable to other samples. Finally, profiles from Model 5 did not align with theory as well as Model 4 profiles, which makes Model 5 difficult to justify and interpret (see Author, 2018).

In step four of the LPA, the retained model is interpreted by examining patterns of the profiles and weights of included variables in each profile. The four-profile model for this example is detailed in Table 2. The means and standard deviations of variables used to create the profiles are presented for each profile, and all were found to be statistically significant in the model. Note that the standard deviations are the same across profiles because they are constrained to be equal in Mplus by default. The differences between the four latent groups are largely due to differences in interest, motivation, and attitude towards science, which aligns with the theoretical approach used in the study (see Figure 2).

It may be helpful in reporting LPA to provide names or labels for the profiles based on the observed differences in included variables. However, researchers should be cautious in providing names to avoid a naming fallacy, suggesting the label assigned to a profile is correct
have mid-level attitudes and motivations towards science, and a GPA lower than Profile 3 but higher than Profile 1. This profile was referred to as “Medium Science Interest and GPA.” These profile names illustrate how to select names that are clear in their description without overstating the profile or being cumbersome. When directional words are used in naming (e.g., low, high, negative, positive), it is important that they accurately reflect not only the relative relationship between the profiles in the sample but also the absolute magnitude of the variables/profiles that they are describing. Care with naming ensures accuracy and clarity when interpreting results and when generalizing results beyond the study sample.

INSERT TABLE 2 ABOUT HERE

INSERT FIGURE 2 ABOUT HERE

Next, in step five, conduct a covariate analysis. This step should be included when: a) the LPA indicates there are profiles worth interpreting further, and b) there is a theoretical reason to evaluate the impact of the covariates on the profiles. To add the covariates into the model for concurrent analysis in Mplus, start with the syntax from step two. However, this time add the variable information for covariates to the NAMES and USEVARIABLES lines.
status, as defined by parent education level and occupation. Covariates are added by regressing the latent profile construct on the covariates (e.g., c ON X1 X2 X3). As the research is focused on adolescents who would like to pursue science occupations, Profile 2 (Highest Science Interest & GPA) was used as the reference group. Comparisons of all profiles against a specific profile as a reference group are provided by Mplus by default, and the decision on which profile is used as reference can be arbitrary in an exploratory analysis, or strategic to address a specific research question as shown here. For this example, group difference values and statistical significance tests presented in the output are tests between each profile and Profile 2 based upon the research question of the study. The results of this covariate analysis are presented in Table 3. However, individuals could replicate this approach using more than one profile as the reference profile depending on their research question.
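Regressing the latent profile variable on covariates is a multinomial logistic regression in which the reference profile's coefficients are fixed at zero. The sketch below shows how such coefficients translate into membership probabilities. All coefficient values are invented for illustration and are not the estimates reported in Table 3; the function name is likewise hypothetical.

```python
import math

def membership_probabilities(x, intercepts, slopes, reference):
    """Profile membership probabilities from multinomial logit coefficients.

    intercepts[p] and slopes[p] hold the coefficients for each non-reference
    profile p; the reference profile has all coefficients fixed at zero.
    """
    logits = {reference: 0.0}
    for profile, intercept in intercepts.items():
        logits[profile] = intercept + sum(b * x_j for b, x_j in zip(slopes[profile], x))
    denominator = sum(math.exp(value) for value in logits.values())
    return {profile: math.exp(value) / denominator for profile, value in logits.items()}

# Hypothetical coefficients for one covariate, with Profile 2 as the reference.
intercepts = {1: 0.2, 3: -0.1, 4: 0.4}
slopes = {1: [-0.8], 3: [-0.3], 4: [-0.5]}

for x in (-1.0, 0.0, 1.0):
    probs = membership_probabilities([x], intercepts, slopes, reference=2)
    print(x, {p: round(v, 3) for p, v in sorted(probs.items())})
```

With negative slopes for every non-reference profile, higher covariate values shift probability toward the reference profile, which is how a negative coefficient in this kind of output is read.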
INSERT TABLE 3 ABOUT HERE
Profile 4 (Medium Science Interest & GPA). Negative coefficients indicate the high science interest group tends to have higher vocabulary scores than the other two profiles. Finally, personality is shown to be statistically significantly different for Profile 1 (Lowest Science Interest & GPA) as compared to Profile 2 (Highest Science Interest & GPA). A cross-tabulation demonstrates the difference is that significantly more of the students considered “Well-Adjusted” fall into the high science interest profile (see Author, 2018).

Step six is preparing the results for dissemination, the final step in the analysis. Presentation of an LPA should generally follow the same sequence as the analysis. First, detail the data cleaning conducted, what assumptions were checked and the results, and how missing data were handled. Include information about the software used for analysis (e.g., Mplus 8) and estimation decisions made (e.g., changes to random starts, estimation method). Second, report steps taken to estimate LPA models. How many models were included in the analysis, and why? Did any of the models not converge? Finally, detail all decisions and problems that occurred in this step and how any problems were addressed.
Page 23 of 54 International Journal of Behavioral Development
metrics that have been recommended, and these may be included as the researcher feels is appropriate and/or as is common in a particular field’s literature (Marsh et al., 2009; Masyn, 2013; Nylund et al., 2007; Tein et al., 2013; Vermunt & Magidson, 2002). Overall, a holistic approach to reporting metrics should be used to support model retention decisions. Retention decisions should be justified by the majority of the metrics and indices included, even if some point to a different model (Kline, 2011; Marsh et al., 2004; Masyn, 2013; Nylund et al., 2007; Schreiber, Nora, Stage, Barlow, & King, 2006).
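One way to make the "majority of the metrics" reasoning transparent in a report is to tabulate which model each index favors. The Python sketch below is a minimal illustration under stated assumptions: the fit values are hypothetical, only indices with a simple "lower is better" or "higher is better" reading are included, and likelihood ratio tests (which compare adjacent models) and substantive interpretability would still need to be weighed separately.

```python
from collections import Counter

# Hypothetical fit statistics for illustration only; keys are numbers of profiles.
fit = {
    2: {"AIC": 10450.2, "BIC": 10580.9, "SABIC": 10490.1, "Entropy": 0.81},
    3: {"AIC": 10310.5, "BIC": 10472.3, "SABIC": 10360.8, "Entropy": 0.84},
    4: {"AIC": 10255.7, "BIC": 10448.6, "SABIC": 10316.4, "Entropy": 0.88},
    5: {"AIC": 10240.1, "BIC": 10464.2, "SABIC": 10320.0, "Entropy": 0.83},
}

# Information criteria favor the smallest value; entropy favors the largest.
LOWER_IS_BETTER = {"AIC", "BIC", "SABIC"}


def preferred_model(fit, index):
    """Return the number of profiles of the model favored by one index."""
    pick = min if index in LOWER_IS_BETTER else max
    return pick(fit, key=lambda k: fit[k][index])


votes = {index: preferred_model(fit, index)
         for index in ("AIC", "BIC", "SABIC", "Entropy")}
tally = Counter(votes.values())

print(votes)                # which model each index favors
print(tally.most_common())  # models ordered by number of supporting indices
```

Reporting the full table of votes, rather than only the winning model, lets readers see where the indices disagree.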
Fourth, describe the profiles from the retained model. Pay attention to the indicator variables that appear to best differentiate between profiles, and highlight the significant differences for the reader. Tables and graphs can both help the reader understand the relationships between the profiles and the indicators. Naming the profiles is not necessary, but it can help readers differentiate between groups and understand which indicators appear meaningful. As the investigator, guide the reader in interpreting the profiles and the implications of the differences observed.
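Expressing each profile mean in standard deviation units of the full sample is one simple way to see which indicators differentiate the profiles most. The Python sketch below uses the three indicator rows and the overall means and standard deviations reported in the descriptive table and its note in this manuscript. It assumes the four columns of that table correspond to Profiles 1 through 4 in order, and the "spread" ranking is an illustrative heuristic, not a statistic used in the original analysis.

```python
# Overall sample (mean, SD) per indicator, taken from the table note.
overall = {
    "Career Motivation": (15.32, 6.11),
    "Self-Determination": (14.95, 4.42),
    "General School": (9.94, 2.78),
}

# Profile means as printed in the table (column order assumed: Profiles 1-4).
profile_means = {
    "Career Motivation": [8.18, 23.06, 11.27, 16.27],
    "Self-Determination": [9.19, 19.37, 14.50, 14.43],
    "General School": [11.23, 9.433, 9.54, 10.16],
}


def standardized_means(means, overall_mean, overall_sd):
    """Convert profile means to z-scores relative to the full sample."""
    return [(m - overall_mean) / overall_sd for m in means]


z = {name: standardized_means(profile_means[name], *overall[name])
     for name in profile_means}

# Distance between the highest and lowest profile, in SD units.
spread = {name: max(values) - min(values) for name, values in z.items()}
ranking = sorted(spread, key=spread.get, reverse=True)

for name in ranking:
    print(f"{name:20s}", [round(v, 2) for v in z[name]],
          "spread:", round(spread[name], 2))
```

Indicators with a large spread are natural candidates to emphasize in tables, graphs, and profile names.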
characteristics as it uses person-level indicators to sort individuals into latent profiles based on shared response patterns. LPA can be used in any social science research context where individual differences and/or underlying patterns of shared behavior may be of interest. The example presented here is one such application, but many other applications of LPA are possible. LPA models of family functioning (Rybak et al., 2017), employees’ commitment mindsets (Morin, Meyer, Creusier, & Bietry, 2016), adolescents’ coping strategies (Aldridge & Roesch, 2008), and school climate (Eck, Johnson, Bettencourt, & Johnson, 2017) are examples of other psychological and behavioral phenomena that have been examined.
However, researchers should also be mindful of the assumptions and limitations of LPA before using this method in their own work. LPA assumes that the indicators of the latent profiles are continuous, as opposed to the categorical indicators used in latent class analysis. Additionally, variables selected as inputs and covariates should be both theoretically supported and appropriate for LPA. Using advanced statistical methods for their own sake does not improve research practice, but using a technique like LPA when appropriate and theoretically justified can
models: Potential problems and promising opportunities. Psychological Methods, 9, 3-29. doi: 10.1037/1082-989X.9.1.3
10.1017/S095457949700206X
Bergman, L. R., Magnusson, D., & El Khouri, B. M. (2003). Studying individual development in an interindividual context: A person-oriented approach. Mahwah, NJ: Psychology Press.
Celeux, G., & Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification, 13, 195-212.
Clark, S. L., & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. Available online: http://hbanaszak.mjr.uw.edu.pl/TempTxt/relatinglca.pdf
Collins, L. M., & Lanza, S. T. (2013). Latent class and latent transition analysis: With
Gudicha, D. (2015). Power analysis methods for tests in latent class and latent Markov models. Ridderkerk, The Netherlands: Ridderprint BV.
Henson, R. K., Hull, D. M., & Williams, C. S. (2010). Methodology in our education research culture: Toward a stronger collective quantitative proficiency. Educational
variables in missing data estimation. Multivariate Behavioral Research, 50(3), 285-299.
Kim, S. Y. (2014). Determining the number of latent classes in single- and multiphase growth mixture models. Structural Equation Modeling, 21(2), 263-279. doi: 10.1080/10705511.2014.882690
Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.
Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767-778.
Masyn, K. E. (2013). Latent class analysis and finite mixture modeling. In T. Little (Ed.), The Oxford Handbook of Quantitative Methods (pp. 551-611). New York, NY: Oxford University Press.
Masyn, K. E. (2017). Measurement invariance and differential item functioning in latent class analysis with stepwise multiple indicator multiple cause modeling. Structural Equation Modeling: A Multidisciplinary Journal, 24(2), 180-197.
McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Applied Statistics, 36, 318-324. doi: 10.2307/2347790
McLarnon, M. J. W., & O’Neill, T. A. (2018). Extensions of auxiliary variable approaches for the investigation of mediation, moderation, and conditional effects in mixture models. Organizational Research Methods, 21(4), 955-982. doi: 10.1177/1094428118770731
Morgan, G. B. (2015). Mixed mode latent class analysis: An examination of fit index
latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535-569. doi: 10.1080/10705510701575396
Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440-461. doi: 10.1037/tps0000176
Nylund-Gibson, K., & Masyn, K. E. (2016). Covariates and mixture modeling: Results of a simulation study exploring the impact of misspecified effects on class enumeration. Structural Equation Modeling, 23, 782-797. doi: 10.1080/10705511.2016.1221313
Nylund-Gibson, K., Grimm, R., Quirk, M., & Furlong, M. (2014). A latent transition mixture model using the three-step specification. Structural Equation Modeling: A Multidisciplinary Journal, 21, 1-16. doi: 10.1080/10705511.2014.915375
Oberski, D. (2016). Mixture models: Latent profile and latent class analysis. In J. Robertson and
10.1177/0013164417719111.
Peugh, J., & Fan, X. (2015). Enumeration index performance in generalized growth mixture models: A Monte Carlo test of Muthén’s (2003) hypothesis. Structural Equation Modeling, 22(1), 115-131. doi: 10.1080/10705511.2014.919823
Preacher, K. J., & Coffman, D. L. (2006, May). Computing power and minimum sample size for RMSEA [Computer software]. Available from http://quantpsy.org/.
Rybak, T. M., Ali, J. S., Berlin, K. S., Klages, K. L., Banks, G. G., Kamody, R. C., Ferry, R. J., Alemzadeh, R., & Diaz-Thomas, A. M. (2017). Patterns of family functioning and diabetes-specific conflict in relation to glycemic control and health-related quality of life among youth with Type I Diabetes. Journal of Pediatric Psychology, 42(1), 40-51. doi: 10.1093/jpepsy/jsw071
Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. Journal of Educational Research, 99, 323-338.
University Press.
Vincent, W. J., & Weir, J. P. (2012). Statistics in Kinesiology (4th ed.). Champaign, IL: Human Kinetics.
Career Motivation     8.18 (2.84)   23.06 (2.84)   11.27 (2.84)   16.27 (2.84)
Self-Determination    9.19 (3.00)   19.37 (3.00)   14.50 (3.00)   14.43 (3.00)
General School       11.23 (2.71)   9.433 (2.71)    9.54 (2.71)   10.16 (2.71)
Note. Values representing the highest positive response are in bold (Science Interest and ATS-General School are reverse coded). Means and standard deviations for variables across all profiles: Number of Science Classes M=3.78 (SD=0.90), GPA M=3.41 (SD=0.70), Science Interest M=2.00 (SD=0.50), Intrinsic Motivation M=16.58 (SD=4.60), Career Motivation M=15.32 (SD=6.11), Self-Determination M=14.95 (SD=4.42), Self-Efficacy M=18.02 (SD=4.35), Grade Motivation M=18.20 (SD=4.69), Instrumental Value M=51.18 (SD=8.90), Academic M=22.34 (SD=5.63), Difficulties & Complexities M=21.38 (SD=4.62), General School M=9.94 (SD=2.78).
Note. * = p < .05. Profile 2 (High Science Interest & GPA, n = 78) served as the reference group.
[Path diagram: latent profile variable c; path labels g1, g2, g3, g4; covariates Sex, Personality Profile, Cognitive Ability, SES]
Figure 1. Example of LPA model with covariates
Figure 3. Six foundational steps of latent profile analysis
! This is where you instruct Mplus on how many classes/profiles are being estimated. The initial
! model contains only one class/profile, thus it would be CLASSES = c (1). Above specifies two
! profiles, and for each further iterative model the number in parentheses increases by one, so

! Used to communicate how missing data is coded in the data file. Here shown with a “.”, which is
! all that is included in each cell with missing data in the data file.

ANALYSIS:
TYPE = MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way.

ESTIMATOR = MLR;
! FIML robust to non-normal data

STARTS = 1000 250;
STITERATIONS = 500;
! Random starts for the ML estimation. The first STARTS value specifies the number of random
! sets of start values generated in the initial stage; the second (250) is the number of best
! initial-stage solutions carried forward to final-stage optimization. STITERATIONS specifies
! the maximum number of iterations allowed for each start in the initial stage; if a start
! converges in fewer than 500 iterations, it stops before reaching 500 iterations.
! These values can be increased … see “Four-Profile Final Model with Covariate Analysis
! Syntax” for an example.

LRTSTARTS = 2 1 50 10;
LRTBOOTSTRAP = 250;
OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LRT results

FILE IS LPA2.dat;
! Tells Mplus where to save the output files from the analysis

SAVE = CPROBABILITIES;
! The above command lines save the most likely profile membership for each participant
! and the posterior probabilities for their membership in each latent profile.
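
The saved posterior probabilities can be used outside Mplus to check how cleanly participants are classified. The Python sketch below is an illustration under stated assumptions: the three rows of probabilities are made up rather than read from LPA2.dat, each row is one participant's posterior probabilities for four profiles, and the entropy formula is the relative entropy commonly reported for mixture models, 1 - [sum over i and k of (-p_ik ln p_ik)] / (n ln K).

```python
import math


def modal_class(probs):
    """Return the 1-based index of the most likely profile for one participant."""
    return max(range(len(probs)), key=probs.__getitem__) + 1


def relative_entropy(rows):
    """Relative entropy of classification; values near 1 mean clear separation."""
    n, k = len(rows), len(rows[0])
    # Terms with p = 0 contribute nothing and are skipped to avoid log(0).
    total = sum(-p * math.log(p) for row in rows for p in row if p > 0)
    return 1 - total / (n * math.log(k))


# Hypothetical posterior probabilities for three participants, four profiles.
rows = [
    [0.97, 0.01, 0.01, 0.01],   # clearly assigned to profile 1
    [0.05, 0.90, 0.03, 0.02],   # clearly assigned to profile 2
    [0.25, 0.25, 0.25, 0.25],   # maximally uncertain
]

print([modal_class(row) for row in rows])
print(round(relative_entropy(rows), 3))
```

A participant whose largest posterior probability is low contributes most to lowering the entropy, so inspecting those rows shows where the profiles overlap.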
CLASSES = c (4);
! This is where you instruct Mplus on how many classes/profiles are being estimated.
! For this version of covariate inclusion you would retain the best-fitting model from prior
! analyses without covariates.
! In this example, a model with 4 profiles/classes, so CLASSES = c (4)

MISSING ARE .;
! Used to communicate how missing data is coded in the data file, here shown with a “.” and this is

TYPE = MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way.
! See above model ANALYSIS syntax for additional options that can be added to improve model
! estimation and convergence.

MODEL:
%OVERALL%
c ON X1 X2 X3;
! New inclusion here instructs Mplus to run the model as before and then add the covariates X1,
! X2, and X3 as predictors of “c” classes/profiles

OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LMR test comparing this model to the previous model
! (cannot be calculated with a one-profile model)
USEVARIABLES ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS;
! Selecting variables to use in the model (all profile indicators, no covariates at this stage)

CLASSES=c(4);
! This is where you instruct Mplus on how many classes/profiles are being estimated. The initial
! model contains only one class/profile, thus it would be CLASSES = c (1). Above specifies four
! profiles, and for each further iterative model the number in parentheses increases by one.

MISSING ARE .;
! Used to communicate how missing data is coded in the data file. Here shown with a “.”, which is
! all that is included in each cell with missing data in the data file.

ANALYSIS:
TYPE=MIXTURE;
! LPA is a version of mixture modeling, and this instructs Mplus to analyze in this way.

ESTIMATOR = MLR;
! FIML robust to non-normal data

STARTS = 1000 250;
STITERATIONS = 500;
! Random starts for the ML estimation. The first STARTS value specifies the number of random
! sets of start values generated in the initial stage; the second (250) is the number of best
! initial-stage solutions carried forward to final-stage optimization. STITERATIONS specifies
! the maximum number of iterations allowed for each start in the initial stage.
%OVERALL%

ClassNum GPA Plans SM_IM SM_CM SM_SD SM_SE SM_GM ATS_IV
ATS_AP ATS_DC ATS_GS (Var1-Var12); ! Label Var1-Var12 constrains the estimates of the
! variances across the profiles to be equal.

OUTPUT:
TECH1 TECH8 TECH11 TECH14;
! TECH1 provides parameter specifications and starting values for the analysis
! TECH8 provides optimization history for this analysis type
! TECH11 provides LRT results
! TECH14 provides bootstrapped LRT test

SAVEDATA:
FILE IS LPA1_2_FINAL.dat;
! Tells Mplus where to save the output files from the analysis

SAVE = CPROBABILITIES;
! The above command lines save the most likely profile membership for each participant
! and the posterior probabilities for their membership in each latent profile.
CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

AUXILIARY = Sex Ship_V Ship_B MomSES DadSES PersProf;
! Now, covariates are included in the AUXILIARY statement so that they will be included in the
! data file output by the SAVEDATA command at the bottom of the syntax file.

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=2000 500;
! Increased number of starts for each step of the ML estimation in response to a message
! about a possible convergence issue noted in the output

MODEL:
! Want to run the model that was decided upon through the enumeration phase.

! For a default Mplus model the LPA model does not need to be specified. However, it can be.
! The model can also be modified from the Mplus default of estimating the indicator means
! (uniquely across profiles) and variances (constrained across profiles), as well as the latent
! profile mean. The syntax below specifies the Mplus defaults.

%OVERALL%
! specify the profiles so that they are not affected by the inclusion of the covariates in the model.
! The BCH weights (W1-W4) are included at the end of the data file.

USEVARIABLES Sex Ship_V Ship_B MomSES DadSES PersProf
W1-W4;
! Now, covariates are included in the USEVARIABLES statement. The BCH weights (W1-W4)
! are also included. Because the BCH weights are included, the original indicators do not need
! to be included in the USEVARIABLES statement. The BCH weights are unique to the individual,
! therefore retaining the classification uncertainty present in the enumeration step in this model.

CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

TRAINING = W1-W4 (bch);

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=0;
! Note. Starts now 0 because the BCH weights are specifying the classes based upon the prior
! model run.

MODEL:
%OVERALL%
C ON Sex Ship_V Ship_B MomSES DadSES PersProf;
! Latent profiles regressed on the covariate variables. These are uniquely estimated for each class.
CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

AUXILIARY = Sex Ship_V Ship_B MomSES DadSES PersProf;
! Now, covariates are included in the AUXILIARY statement so that they will be included in the
! data file output by the SAVEDATA command at the bottom of the syntax file.

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=2000 500;
! Increased number of starts for each step of the ML estimation in response to a message
! about a possible convergence issue noted in the output

MODEL:
! Want to run the model that was decided upon through the enumeration phase.

! For a default Mplus model the LPA model does not need to be specified. However, it can be.
! The model can also be modified from the Mplus default of estimating the indicator means
! (uniquely across profiles) and variances (constrained across profiles), as well as the latent
! profile mean. The syntax below specifies the Mplus defaults.

%OVERALL%
! There is a posterior probability for each enumerated class (in this case 4) included at the end
! of the data file. The values in these CPROB columns are unique to the individual, so including
! them increases the

USEVARIABLES Sex Ship_V Ship_B MomSES DadSES PersProf
MODAL;
! Now, covariates are included in the USEVARIABLES statement. MODAL is also
! included, because it provides the classification for each individual in the dataset (i.e.,
! assignment to class 1, 2, 3, or 4 in this example).

NOMINAL ARE MODAL; ! Necessary, because the MODAL variable is nominal.

CLASSES=c(4);
! Four latent profiles specified, based on results of original iterative modeling process

MISSING ARE .;

ANALYSIS:
TYPE=MIXTURE;
STARTS=0;
! Note. Starts now 0 because the classes are specified based upon the prior model run.

MODEL:
%OVERALL%
C ON Sex Ship_V Ship_B MomSES DadSES PersProf;
! General statement regressing the latent profile variable on the covariates.
!C#3 ON DadSES (reg35);
!C#3 ON PersProf (reg36);

! Latent profiles regressed on the covariate variables. These are uniquely estimated for each class.
! These statements are included for k-1 classes in the syntax, as the last class is the reference class.
! If there are additional relationships (e.g., PersProf ON Sex) that should be estimated uniquely in
! one or more of the classes (i.e., not constrained across classes), include them with the rest of
! the class-specific syntax below.

%C#1%
[MODAL#1@ ];
[MODAL#2@ ];
[MODAL#3@ ];

[Sex] (mean11);
[Ship_V] (mean12);
[Ship_B] (mean13);
[MomSES] (mean14);
[DadSES] (mean15);
[PersProf] (mean16);

Sex;
Ship_V;
Ship_B;
MomSES;
Ship_B;
MomSES;
DadSES;
PersProf;

%C#3%
[MODAL#1@ ];
[MODAL#2@ ];
[MODAL#3@ ];

[Sex] (mean31);
[Ship_V] (mean32);
[Ship_B] (mean33);
[MomSES] (mean34);
[DadSES] (mean35);
[PersProf] (mean36);

Sex;
Ship_V;
Ship_B;
MomSES;
DadSES;
PersProf;

%C#4%
[MODAL#1@ ];
! there were specific relationships to model between any of these variables, the regressions would
! be included in the class-specific syntax. Unique labels can be used in MODEL CONSTRAINT
! and/or MODEL TEST commands, as is done with other model types, to test class differences and
! indirect effects.

MODEL TEST:
mean11 = mean21;
mean11 = mean31;
mean11 = mean41;
mean21 = mean31;
mean21 = mean41;
mean31 = mean41;

mean12 = mean22;
mean12 = mean32;
mean12 = mean42;
mean22 = mean32;
mean22 = mean42;
mean32 = mean42;

mean13 = mean23;
mean13 = mean33;
mean13 = mean43;
mean23 = mean33;
mean23 = mean43;
mean33 = mean43;

mean14 = mean24;
mean36 = mean46;

! The above syntax is directly testing the means of the variables in the different classes

OUTPUT: