Introduction To SEM
What is SEM?
Structural equation modeling (SEM) refers to a family of procedures (Kline, 2005) that are primarily used to test
theoretical models involving proposed causal associations among a set of variables (Schumacker & Lomax, 2016). In
this regard, SEM can be thought of as largely a “confirmatory approach” to analyzing structural associations among
a set of variables (Byrne, 2010). Even so, SEM is flexible enough to incorporate exploratory analyses of data.
SEM assumes that proposed relations among variables can be represented in a set of structural regression
equations and that these relations can be represented pictorially (Byrne, 2010). There are particular drawing
conventions that are utilized in the representation of theoretical models (Kline, 2005).
Other terms for SEM: covariance structure analysis, covariance structure modeling, analysis of covariance
structures (Kline, 2005). (The term “causal modeling” has been used in the past, but is somewhat “dated”; see
Kline, 2005.)
These terms reflect the notion that the researcher typically begins his/her analysis with a causal theory in
mind. This theory implies a particular covariance structure among variables, and the researcher compares this
implied covariance structure against the pattern of covariances found in his/her data. The researcher is able to
assess the degree of model fit by determining how similar the covariances implied by the model are to those
found in the dataset (see Byrne, 2010).
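The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chain model X -> Y -> Z, the sample covariance matrix S, and the parameter values are all hypothetical and hand-picked (they are not estimates from any dataset), and the function names are made up for this example. The maximum likelihood discrepancy function shown is one common way of summarizing how far the implied matrix is from the observed one.

```python
import numpy as np

def implied_cov(a, b, var_x, var_ey, var_ez):
    """Covariance matrix implied by the chain X -> Y -> Z (order: X, Y, Z)."""
    var_y = a**2 * var_x + var_ey
    var_z = b**2 * var_y + var_ez
    cov_xy = a * var_x
    cov_yz = b * var_y
    cov_xz = a * b * var_x  # the model has no direct X -> Z path
    return np.array([[var_x, cov_xy, cov_xz],
                     [cov_xy, var_y, cov_yz],
                     [cov_xz, cov_yz, var_z]])

def ml_discrepancy(S, Sigma):
    """ML discrepancy: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

# Hypothetical observed covariance matrix (order: X, Y, Z).
S = np.array([[1.00, 0.50, 0.40],
              [0.50, 1.00, 0.50],
              [0.40, 0.50, 1.00]])

# Hand-picked (assumed) parameter values for the chain model.
Sigma = implied_cov(a=0.5, b=0.5, var_x=1.0, var_ey=0.75, var_ez=0.75)

residuals = S - Sigma          # where the model misses the data
fit = ml_discrepancy(S, Sigma) # 0 would mean the two matrices are identical

print(np.round(residuals, 3))
print(fit)
```

In this sketch the only nonzero residual is the X-Z covariance, which is the one element the chain model constrains; the discrepancy value is zero only when the implied and observed matrices match exactly.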
Why SEM?
1. Many traditional statistical techniques propose relationships between one or more independent variables and
a single dependent variable (as in the case of ANOVA or regression). In those cases where a researcher is
interested in including two or more dependent variables, the researcher is often limited to procedures such as
multivariate regression, MANOVA, and canonical correlation that provide tests of relationships between
multiple independent and multiple dependent variables. Nevertheless, these techniques do not really allow for
an analysis of the relationships among variables on the “dependent” variable side of the model. SEM provides
a mechanism for testing more complex multivariate relationships among variables, and allows for testing
predictive relationships among the dependent variables (called endogenous variables) themselves.
2. Because of (1) above, SEM provides a more flexible way of testing for mediated (i.e., indirect) effects of an
independent variable on a dependent variable.
3. Conventional statistical techniques (e.g., ANOVA and regression) assume that the variables included in those
analyses are measured without error (an untenable assumption). Unfortunately, measurement error
associated with one’s variables can attenuate the relationships observed in one’s data and/or lead to biased
parameter estimates. SEM provides a flexible system that allows a researcher to “build in” measurement error
explicitly, so that parameter estimates are adjusted to reduce attenuation and possible bias.
4. SEM provides a mechanism (i.e., confirmatory factor analysis) for testing proposed factor structures.
5. SEM is flexible enough to provide mechanisms for comparing models across groups, modeling growth curves,
etc.
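The attenuation described in point 3 above can be illustrated with a small simulation. This is a sketch under assumed values: the true slope (0.5), the error variances, and the variable names are all hypothetical. The last step uses the classical correction for attenuation with a known reliability, which stands in for what SEM accomplishes by modeling a latent variable with multiple indicators; it is not an SEM estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical error-free predictor and an outcome with true slope 0.5.
true_x = rng.normal(size=n)
y = 0.5 * true_x + rng.normal(scale=0.5, size=n)

# Observed predictor = true score + measurement error.
# Error variance 1.0 on top of true variance 1.0 gives reliability 0.5.
observed_x = true_x + rng.normal(scale=1.0, size=n)
reliability = 0.5  # assumed known here; in practice it must be estimated

def slope(x, outcome):
    """Simple regression slope: cov(x, outcome) / var(x)."""
    return np.cov(x, outcome)[0, 1] / np.var(x, ddof=1)

slope_true = slope(true_x, y)          # close to 0.5
slope_observed = slope(observed_x, y)  # attenuated toward 0.25
slope_corrected = slope_observed / reliability

print(slope_true, slope_observed, slope_corrected)
```

The slope based on the error-laden predictor is roughly the true slope multiplied by the reliability, which is the attenuation that a measurement model is meant to undo.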
Conventional drawing notation in SEM
Observed (measured, or manifest) variable is one that is directly measured and for which data has been
acquired in a study.
Latent variable is one that is not directly measured in a study. (These variables are indirectly measured by
observed variables.)
Exogenous variable is a variable that has no proposed cause in a model. Rather, it is postulated as a cause
of variation in other variables in the model. (Analogous to the notion of an independent variable.)
Unanalyzed association is a proposed relationship between two variables without any assumption of a
causal relation between them.
Some terminology
Direct effect refers to the proposed direct effect of X on Y. Again, using conventional notation, this is
indicated by a single-headed arrow: X → Y.
Mediator (or mediating variable) is a variable that is proposed to explain the relationship between two
other variables. Variation in the mediator (Y) is proposed to be a function of variation in X, and is also
proposed to produce variation in Z. (see diagram below)
      a         b
X ---------> Y ---------> Z
Indirect effect refers to the product of paths (a) and (b) above, which represents the indirect effect of X on
Z, via the mediator Y.
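The product-of-paths idea can be sketched with simulated data. The path values (a = 0.6, b = 0.4), the sample size, and the helper function ols are assumptions made for this illustration, and ordinary least squares is used here in place of a full SEM estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical data generated from the chain X -> Y -> Z.
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)  # path a = 0.6
z = 0.4 * y + rng.normal(size=n)  # path b = 0.4, no direct X -> Z effect

def ols(predictors, outcome):
    """Least-squares slopes (intercept dropped) for a list of predictors."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs[1:]

(a,) = ols([x], y)          # path a: X -> Y
b, direct = ols([y, x], z)  # path b: Y -> Z, controlling for X
indirect = a * b            # indirect effect of X on Z via Y

print(a, b, indirect, direct)
```

With these assumed values the indirect effect should land near 0.6 x 0.4 = 0.24, and the estimated direct effect of X on Z should be near zero because none was built into the simulated data.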
Goodness of fit refers to the judgement of the degree to which a proposed theoretical model fits one’s
data. In the context of SEM it captures the degree to which the covariance structure implied by one’s
theoretical model fits with the observed covariance structure in one’s data.
Goodness of fit statistics are used to evaluate the overall fit of a theoretical model (i.e., the proposed
structural relations among variables). They summarize and/or test the fit of a theoretical model.
Regression analysis can be thought of as a fairly simple form of structural equation modeling. The researcher
proposes a theoretical model of causation among a set of variables and evaluates its fit to the data. Goodness of fit
of the model is evaluated in terms of tests of R-square and the individual regression parameters.
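As a rough illustration of regression viewed this way, the sketch below fits a one-predictor regression from a covariance matrix and computes R-square. The simulated data and variable names are hypothetical. It also shows a property of this particular simple model: because it uses as many parameters as there are distinct variances and covariances, the covariance matrix implied by the estimates reproduces the observed one exactly, which is one reason fit is judged through R-square and the individual coefficients.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000

# Hypothetical data: one predictor X and one outcome Y.
x = rng.normal(size=n)
y = 0.7 * x + rng.normal(size=n)

S = np.cov(x, y)  # observed 2 x 2 covariance matrix (order: X, Y)

# Regression estimates expressed in covariance terms.
slope = S[0, 1] / S[0, 0]
resid_var = S[1, 1] - slope**2 * S[0, 0]
r_square = 1 - resid_var / S[1, 1]

# Covariance matrix implied by the fitted regression model.
implied = np.array([[S[0, 0], slope * S[0, 0]],
                    [slope * S[0, 0], slope**2 * S[0, 0] + resid_var]])

print(slope, r_square)
print(np.allclose(implied, S))
```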
[Figure: growth curve model with measurement occasions t1, t2, and t3. Time 1 starting scores serve as each
person’s intercept associated with the growth curve.]
[Figure: the previous growth curve model with a predictor variable (gender) added.]