Introduction To SEM

1) Structural equation modeling (SEM) is a statistical technique used to test theoretical models involving causal relationships between observed and latent variables. 2) SEM allows researchers to test complex multivariate relationships and indirect effects, while accounting for measurement error. 3) A key aspect of SEM is that it compares the covariance structure implied by a theoretical model to the actual covariance structure of the observed data to assess how well the model fits.

Uploaded by

Geleta

Introduction to structural equation modeling (SEM)

What is SEM?

SEM refers to a family of procedures (Kline, 2005) that is primarily used to test theoretical models involving
proposed causal associations among a set of variables (Schumacker & Lomax, 2016). In this regard, SEM can be
thought of as largely a “confirmatory approach” to analyzing structural associations among a set of variables
(Byrne, 2010). Even so, SEM is flexible enough to incorporate exploratory analyses of data.

SEM assumes that proposed relations among variables can be represented in a set of structural regression
equations and that these relations can be represented pictorially (Byrne, 2010). There are particular drawing
conventions that are utilized in the representation of theoretical models (Kline, 2005).

Other terms for SEM: covariance structure analysis, covariance structure modeling, analysis of covariance
structures (Kline, 2005). (The term “causal modeling” has been used in the past, but is somewhat “dated”; see
Kline, 2005.)

These terms reflect the notion that the researcher typically begins his/her analysis with a causal theory in
mind. This theory implies a particular covariance structure among variables, and the researcher compares this
implied covariance structure against the pattern of covariances found in his/her data. The researcher is able to
assess the degree of model fit by determining how similar the covariances implied by the model are to those
found in the dataset (see Byrne, 2010).
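
As a rough illustration of comparing implied and observed covariances, the sketch below uses a hypothetical
three-variable chain model (X affects Y, Y affects Z, with no direct X to Z path). The path values, variances,
and the "observed" matrix are invented for illustration and are not taken from any dataset. Real SEM software
estimates the parameters by minimizing a discrepancy function rather than taking them as given.

```python
# Hypothetical chain model X -> Y -> Z with no direct X -> Z path.
# Every number here is made up for illustration only.

def implied_covariance(a, b, var_x, resid_y, resid_z):
    """Covariance matrix of (X, Y, Z) implied by the chain model."""
    var_y = a ** 2 * var_x + resid_y
    var_z = b ** 2 * var_y + resid_z
    cov_xy = a * var_x
    cov_yz = b * var_y
    cov_xz = a * b * var_x  # X and Z covary only through Y
    return [
        [var_x, cov_xy, cov_xz],
        [cov_xy, var_y, cov_yz],
        [cov_xz, cov_yz, var_z],
    ]


def residual_matrix(observed, implied):
    """Element-wise difference: observed minus model-implied covariances."""
    return [[o - m for o, m in zip(obs_row, imp_row)]
            for obs_row, imp_row in zip(observed, implied)]


# Assumed "observed" covariance matrix (standardized variables).
observed = [
    [1.00, 0.50, 0.35],
    [0.50, 1.00, 0.40],
    [0.35, 0.40, 1.00],
]

implied = implied_covariance(a=0.5, b=0.4, var_x=1.0, resid_y=0.75, resid_z=0.84)
residuals = residual_matrix(observed, implied)
largest_misfit = max(abs(value) for row in residuals for value in row)

for row in residuals:
    print([round(value, 3) for value in row])
print("largest absolute residual:", round(largest_misfit, 3))
```

In this made-up case the only sizable residual is for the X, Z covariance, which is the kind of pattern that
would suggest the chain model is missing a direct X to Z path.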
Why SEM?

1. Many traditional statistical techniques propose relationships between one or more independent variables and
a single dependent variable (as in the case of ANOVA or regression). In those cases where a researcher is
interested in including two or more dependent variables, the researcher is often limited to procedures such as
multivariate regression, MANOVA, and canonical correlation that provide tests of relationships between
multiple independent and multiple dependent variables. Nevertheless, these techniques do not really allow for
an analysis of the relationships among variables on the “dependent” variable side of the model. SEM provides
a mechanism for testing more complex multivariate relationships among variables, and allows for testing
predictive relationships among the dependent variables (called endogenous variables) themselves.
2. Because of (1) above, SEM provides a more flexible way of testing for mediated (i.e., indirect) effects of an
independent variable on a dependent variable.
3. Conventional statistical techniques (e.g., ANOVA and regression) assume that the variables included in those
analyses are measured without error (an untenable assumption). Unfortunately, measurement error
associated with one’s variables can attenuate the relationships observed in one’s data and/or lead to biased
parameter estimates. SEM provides a flexible system that allows a researcher to “build in” measurement error
explicitly, so that parameter estimates within a model are adjusted to reduce attenuation and other possible biases.
4. SEM provides a mechanism (i.e., confirmatory factor analysis) for testing proposed factor structures.
5. SEM is flexible enough to provide mechanisms for comparing models across groups, modeling growth curves,
etc.
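
Point 3 above can be made concrete with the classical test theory attenuation formula, under which the
correlation between two error-laden measures equals the true correlation multiplied by the square root of the
product of their reliabilities. The sketch below only illustrates why measurement error pulls estimates toward
zero. The true correlation and reliabilities are hypothetical, and SEM addresses the problem through latent
variables rather than through this simple correction.

```python
import math


def attenuated_correlation(true_r, rel_x, rel_y):
    """Correlation expected between two measures with the given reliabilities."""
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_correlation(observed_r, rel_x, rel_y):
    """Classical correction for attenuation (inverse of the function above)."""
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical values chosen for illustration.
true_r = 0.60
rel_x, rel_y = 0.80, 0.70

observed_r = attenuated_correlation(true_r, rel_x, rel_y)
recovered_r = disattenuated_correlation(observed_r, rel_x, rel_y)

print("observed (attenuated) r:", round(observed_r, 3))
print("corrected r:", round(recovered_r, 3))
```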
Conventional drawing notation in SEM

Rectangle or square denotes a measured, observed, or manifest variable (i.e., the variable is directly
measured).

Oval or circle denotes a latent, implicit, or unobserved variable (i.e., not directly measured).

Single-headed arrow represents a proposed direct effect of one variable on another in a model.

Double-headed arrow represents a proposed correlation between two variables in the model. It is also
referred to as an “unanalyzed association” (Kline, 2005) between variables, as no causal relationship
between variables is postulated.

Some terminology

Observed (measured, or manifest) variable is one that is directly measured and for which data has been
acquired in a study.

Latent variable is one that is not directly measured in a study. (These variables are indirectly measured by
observed variables.)

Exogenous variable is a variable that has no proposed cause in a model. Rather, it is postulated as a cause
of variation in other variables in the model. (Analogous to the notion of an independent variable.)

Endogenous variable is a variable whose variation is proposed to be an outcome of other variables in a
model. This variable may or may not be treated as a proposed cause of variation in other variables. For
this reason, these variables can serve as both independent and dependent variables within a model.

Unanalyzed association is a proposed relationship between two variables without any assumption of a
causal relation between them.

Direct effect refers to the proposed direct effect of X on Y. Again, using conventional notation, this is
indicated by a single-headed arrow: X → Y.

Mediator (or mediating variable) is a variable that is proposed to explain the relationship between two
other variables. Variation in the mediator (Y) is proposed to be a function of variation in X, and is also
proposed to produce variation in Z. (see diagram below)

X --(a)--> Y --(b)--> Z

Indirect effect refers to the product of paths (a) and (b) above, which represents the indirect effect of X on
Z, via the mediator Y.
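
A minimal sketch of estimating the indirect effect from data is given below. It simulates a sample from the
X, Y, Z chain with assumed true paths (a = 0.5, b = 0.4), estimates each path with an ordinary least squares
slope, and multiplies the two estimates. The sample size, seed, and path values are hypothetical. Estimating b
from a simple regression of Z on Y is appropriate here only because the simulated model has no direct X to Z
path.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_slope(predictor, outcome):
    """Slope from a simple least squares regression of outcome on predictor."""
    mp, mo = mean(predictor), mean(outcome)
    cross = sum((p - mp) * (o - mo) for p, o in zip(predictor, outcome))
    spread = sum((p - mp) ** 2 for p in predictor)
    return cross / spread


# Assumed population values, used only to generate illustrative data.
true_a, true_b = 0.5, 0.4
n = 5000
rng = random.Random(42)

x = [rng.gauss(0, 1) for _ in range(n)]
y = [true_a * xi + rng.gauss(0, 1) for xi in x]  # path a: X -> Y
z = [true_b * yi + rng.gauss(0, 1) for yi in y]  # path b: Y -> Z

a_hat = ols_slope(x, y)
b_hat = ols_slope(y, z)
indirect_hat = a_hat * b_hat  # estimated indirect effect of X on Z via Y

print("a:", round(a_hat, 3), "b:", round(b_hat, 3), "indirect:", round(indirect_hat, 3))
```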

Goodness of fit refers to the judgement of the degree to which a proposed theoretical model fits one’s
data. In the context of SEM it captures the degree to which the covariance structure implied by one’s
theoretical model fits with the observed covariance structure in one’s data.

Goodness of fit statistics are used to evaluate the overall fit of a theoretical model (i.e., the proposed
structural relations among variables). They summarize and/or test the fit of a theoretical model.

Regression analysis can be thought of as a fairly simple form of structural equation modeling. The researcher
proposes a theoretical model of causation among a set of variables and evaluates its fit to the data. Goodness of fit
of the model is evaluated in terms of tests of R-square and the individual regression parameters.

Standard multiple regression model, where achievement (the dependent, or criterion, variable) is regressed
onto self-efficacy, mastery goals, and performance goals (the independent, or predictor, variables). All
variables in the model are observed, or measured, variables. The “r” is the residual variance associated with
the dependent variable. This is the variation in achievement not explained by the predictors. (In SEM
notation, residuals appear as circles or ovals.)

The double-headed arrows represent unanalyzed associations among the predictors (exogenous variables).
Achievement is an endogenous variable.

Example of the AMOS interface used to draw the regression model.

Path analysis involves modeling proposed direct and/or indirect relationships among measured or latent
variables in a model. Here, we have an example using only measured variables, where self-efficacy,
mastery goals, and performance goals (exogenous variables) predict engagement, and engagement
(endogenous variable) predicts achievement (another endogenous variable). The effect of the exogenous
variables on achievement is proposed to be fully mediated via engagement.

The double-headed arrows, again, are unanalyzed associations among the exogenous variables.

Partial mediation model, where the effects of the exogenous predictors on achievement run through
engagement. The effect of performance goals, however, is only partially mediated: part of the effect runs
through engagement (the mediator) and the remaining portion of the effect is direct.
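
The difference between full and partial mediation can be written as a decomposition of effects in a linear
path model. The symbols are assumptions introduced here for illustration: a is the path from performance goals
to engagement, b is the path from engagement to achievement, c' is the direct path from performance goals to
achievement, and c is the total effect.

```latex
\[
\underbrace{c}_{\text{total effect}}
  \;=\;
\underbrace{c'}_{\text{direct effect}}
  \;+\;
\underbrace{a \, b}_{\text{indirect effect}}
\]
```

Under full mediation the direct path is fixed to zero (c' = 0), so the total effect equals ab. Under partial
mediation both c' and ab are nonzero.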
Confirmatory factor analysis (CFA) is utilized to test a theoretical factor structure against one’s data.

Here is an example in which each item from a multifactorial scale serves solely as an indicator of one
specific factor. Factors are allowed to correlate.

The “e’s” represent measurement error (called uniqueness), which is a combination of random and specific
variance.

Example of CFA model, specified with 3 factors and correlated errors/uniquenesses.

This strategy might be employed if there is something common measurement-wise with items 2, 9, and 7 (for
example, the items are negatively worded).
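
The covariance structure implied by a CFA model is commonly written as loadings times factor covariances times
transposed loadings, plus the uniqueness matrix. The sketch below works this out for a hypothetical model with
four items and two correlated factors (smaller than the 3-factor example above), with one correlated
uniqueness between items 1 and 2. All loadings, factor correlations, and uniquenesses are invented for
illustration.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(left, right):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(l * r for l, r in zip(row, column)) for column in zip(*right)]
            for row in left]


def matadd(left, right):
    return [[l + r for l, r in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]


# Hypothetical loadings: items 1-2 load only on factor A, items 3-4 only on factor B.
loadings = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.6],
    [0.0, 0.9],
]

# Factors are standardized and allowed to correlate (assumed r = 0.3).
factor_cov = [
    [1.0, 0.3],
    [0.3, 1.0],
]

# Uniquenesses on the diagonal; the 0.10 off-diagonal is a correlated
# uniqueness between items 1 and 2 (e.g., shared wording).
uniqueness = [
    [0.36, 0.10, 0.00, 0.00],
    [0.10, 0.51, 0.00, 0.00],
    [0.00, 0.00, 0.64, 0.00],
    [0.00, 0.00, 0.00, 0.19],
]

common_part = matmul(matmul(loadings, factor_cov), transpose(loadings))
implied = matadd(common_part, uniqueness)

for row in implied:
    print([round(value, 3) for value in row])
```

Items on different factors covary only through the factor correlation, whereas the covariance between items 1
and 2 combines their shared factor and the correlated uniqueness.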
Path analysis with latent variables incorporates a measurement model (CFA) and proposed structural
relations (paths) among latent variables (i.e., the theoretical constructs). Paths are adjusted for
measurement error in the indicators of the latent constructs.

This is another example of a model that mimics the behavior of a one-way ANCOVA. It is essentially a
regression model that incorporates a treatment variable (assuming two levels) and latent variables as
covariate and dependent variable, respectively.

Through the use of latent variables, the model is able to test for group differences at posttest (the latent
dependent variable) as a function of the grouping variable (treatment) and the covariate, pretest (the latent
covariate). This is simply an ANCOVA that adjusts parameters for measurement error.
Latent growth curve analysis can be used to model intra-individual (i.e., within-person) change on a variable
over time, along with inter-individual (i.e., between-person) differences in that change.

In this example, we are modeling change in math knowledge (a measured variable at time 1, time 2, and time
3) over time.

The intercept models inter-individual differences in time 1 math knowledge, whereas the slope models
inter-individual differences in rate of change in math knowledge over time. The double-headed arrow is used
to test whether a person’s beginning math score is correlated with his/her rate of change in knowledge over
time.

(It is assumed that the time interval between measurements is equal.)
[Figure: individual growth curves in math knowledge over time; the vertical axis is math knowledge and the
horizontal axis is t1, t2, t3.] The slope of each line reflects a particular change in knowledge per unit of
elapsed time. Time 1 starting scores serve as each person’s intercept associated with the growth curve.

Here, we’ve added a predictor variable (gender) into the previous model.

Now, we can test (a) whether there is a between-(gender) group difference in math knowledge at time 1 and (b)
whether there is a difference in the intra-individual rate of change in math knowledge between gender groups.
