
Structural Equation Modeling

(SEM) Essentials

The purpose of this module is to provide a
very brief presentation of the things one
needs to know about SEM before
learning how to apply SEM.

by
Jim Grace

Where You can Learn More about SEM


Grace (2006) Structural Equation Modeling and Natural Systems. Cambridge Univ. Press.
Shipley (2000) Cause and Correlation in Biology. Cambridge Univ. Press.
Kline (2005) Principles and Practice of Structural Equation Modeling. 2nd Edition. Guilford Press.
Bollen (1989) Structural Equations with Latent Variables. John Wiley and Sons.
Lee (2007) Structural Equation Modeling: A Bayesian Approach. John Wiley and Sons.

Outline
I. Essential Points about SEM
II. Structural Equation Models: Form and Function

I. SEM Essentials:
1. SEM is a form of graphical modeling, and therefore a system
in which relationships can be represented in either graphical
or equational form.

equational form:  y1 = γ11x1 + ζ1

graphical form:   x1 --(γ11)--> y1, with error ζ1

2. An equation is said to be structural if there exists sufficient
evidence from all available sources to support the
interpretation that x1 has a causal effect on y1.

3. Structural equation modeling can be defined as the use of two
or more structural equations to represent complex hypotheses.

Complex hypothesis (e.g.): x1 affects y1, y2, and y3; y1 affects
y2; and y2 affects y3.

Corresponding equations:

y1 = γ11x1 + ζ1
y2 = β21y1 + γ21x1 + ζ2
y3 = β32y2 + γ31x1 + ζ3
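These structural equations can be explored numerically. The following sketch (my own illustrative code with hypothetical coefficient values, not part of the module) simulates the three equations and recovers the coefficients by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulate the three structural equations with illustrative
# (hypothetical) coefficient values.
x1 = rng.normal(size=n)
y1 = 0.5 * x1 + rng.normal(scale=0.5, size=n)             # y1 = γ11·x1 + ζ1
y2 = 0.4 * y1 + 0.3 * x1 + rng.normal(scale=0.5, size=n)  # y2 = β21·y1 + γ21·x1 + ζ2
y3 = 0.6 * y2 + 0.2 * x1 + rng.normal(scale=0.5, size=n)  # y3 = β32·y2 + γ31·x1 + ζ3

# Each structural equation can be estimated separately by least squares.
def ols(y, *predictors):
    X = np.column_stack(predictors)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

g11 = ols(y1, x1)            # ~[0.5]
b21_g21 = ols(y2, y1, x1)    # ~[0.4, 0.3]
b32_g31 = ols(y3, y2, x1)    # ~[0.6, 0.2]
print(np.round(g11, 2), np.round(b21_g21, 2), np.round(b32_g31, 2))
```

Equation-by-equation least squares is only a sketch; full SEM estimation fits all equations jointly, as discussed later in the module.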

4. Some practical criteria for supporting an assumption of
causal relationships in structural equations:
a. manipulations of x can repeatably be demonstrated to be
followed by responses in y, and/or
b. we can assume that the values of x that we have can serve as
indicators for the values of x that existed when effects on y
were being generated, and/or
c. it can be assumed that a manipulation of x would result in
a subsequent change in the values of y.

Relevant References:
Pearl (2000) Causality. Cambridge University Press.
Shipley (2000) Cause and Correlation in Biology. Cambridge Univ. Press.

5. A Grossly Oversimplified History of SEM

[Diagram: three historical threads converging on contemporary SEM.
Factor analysis (Pearson 1890s; Spearman 1904) and path analysis
(Wright 1918) lead to SEM (Joreskog 1973). Conventional statistics
(Fisher 1922; Neyman & E. Pearson 1934) contribute likelihood
estimation and the testing of alternative models. Bayesian analysis
(Bayes & LaPlace 1773/1774; MCMC 1948-; Raftery 1993) leads to
Bayesian SEM (Lee 2007).]

Note that SEM is a framework and incorporates new statistical
techniques as they become available (if appropriate to its purpose).

6. SEM is a framework for building and evaluating multivariate
hypotheses about multiple processes. It is not dependent on a
particular estimation method.

7. When it comes to statistical methodology, it is important to
distinguish between the priorities of the methodology versus
those of the scientific enterprise. In SEM, we use statistics and
other methodological tools, procedures, and principles for the
purposes of the scientific enterprise.

The Methodological Side of SEM:
The Relationship of SEM to the Scientific Enterprise

[Diagram, modified from Starfield and Bleloch (1991): data
exploration, methodology, and theory development proceed from
univariate descriptive statistics and univariate data modeling,
through multivariate descriptive statistics and structural
equation modeling, toward an understanding of processes;
simplistic models give way to realistic predictive models and
detailed process models.]

8. SEM seeks to progress knowledge through cumulative
learning. Current work is striving to increase the capacity
for model memory and model generality.

[Diagram: structural equation modeling spans a continuum from
exploratory/model-building applications to confirmatory/
hypothesis-testing applications; linking the two is one aim of SEM.]

9. It is not widely understood that the univariate model, and
especially ANOVA, is not well suited for studying systems, but
rather is designed for studying individual processes, net
effects, or for identifying predictors.

10. The dominance of the univariate statistical model in the
natural sciences has, in my personal view, retarded the
progress of science.

11. An interest in systems under multivariate control
motivates us to explicitly consider the relative importances
of multiple processes and how they interact. We seek to
consider simultaneously the main factors that determine how
system responses behave.

12. SEM is one of the few applications of statistical
inference where the result of estimation is frequently
"you have the wrong model!" This feedback comes from the
unique feature that in SEM we compare patterns in the data
to those implied by the model. This is an extremely
important form of learning about systems.

13. Illustrations of fixed-structure protocol models:

[Diagrams: a univariate model in which predictors x1-x5 all
point to a single response y1, and a multivariate model in
which x1-x5 each point to each of the responses y1-y5.]

Do these model structures match the causal forces that influenced
the data? If not, what can they tell you about the processes
operating?

14. Structural equation modeling and its associated scientific
goals represent an ambitious undertaking. We should be both
humbled by the limits of our successes and inspired by the
learning that takes place during the journey.

II. Structural Equation Models: Form and Function


A. Anatomy of Observed Variable Models


Some Terminology

[Path diagram: x1, an exogenous variable, affects y1 and y2
(endogenous variables) through path coefficients γ11 and γ21;
y1 affects y2 through β21; ζ1 and ζ2 are errors.]

The direct effect of x1 on y2 is γ21; the indirect effect of
x1 on y2 is γ11 times β21.

Model B, which has paths between all variables, is saturated
(vs. A, which is unsaturated).

[Diagrams of four models, A-D, each relating x variables to y1
and y2. B is saturated; A omits a path and is unsaturated. Of
models C and D, one is nonrecursive (it contains reciprocal
paths between y1 and y2) and the other is recursive.]

recursive: the term refers to the mathematical property that
each item in a series is directly determined by the preceding item.

First Rule of Path Coefficients: the path coefficients for
unanalyzed relationships (curved arrows) between exogenous
variables are simply the correlations (standardized form) or
covariances (unstandardized form).

[Diagram: exogenous variables x1 and x2, joined by a curved
two-headed arrow labeled .40, each affect y1.]

       x1     x2     y1
x1    1.0
x2    0.40   1.0
y1    0.50   0.60   1.0

Second Rule of Path Coefficients: when variables are connected
by a single causal path, the path coefficient is simply the
standardized or unstandardized regression coefficient (note
that with a single predictor, a standardized regression
coefficient equals a simple correlation).

x1 --(γ11 = .50)--> y1 --(β21 = .60)--> y2

       x1     y1     y2
x1    1.0
y1    0.50   1.0
y2    0.30   0.60   1.0

γ (gamma) is used to represent the effect of an exogenous
variable on an endogenous variable; β (beta) represents the
effect of an endogenous variable on an endogenous variable.

Third Rule of Path Coefficients: the strength of a compound
path is the product of the coefficients along the path.

x1 --(.50)--> y1 --(.60)--> y2

Thus, in this example the effect of x1 on y2 = 0.5 x 0.6 = 0.30.

Since the strength of the indirect path from x1 to y2 equals
the correlation between x1 and y2, we say x1 and y2 are
conditionally independent.
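The third rule can be checked by simulation. This is my own illustrative sketch (the .50 and .60 values are from the example above): in a large simulated sample from the chain, the correlation between x1 and y2 matches the product of the path coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Standardized causal chain x1 -> y1 -> y2 with path coefficients
# .50 and .60; error variances chosen so each variable has unit variance.
x1 = rng.normal(size=n)
y1 = 0.5 * x1 + rng.normal(scale=np.sqrt(1 - 0.5**2), size=n)
y2 = 0.6 * y1 + rng.normal(scale=np.sqrt(1 - 0.6**2), size=n)

# Third rule: the compound path x1 -> y1 -> y2 has strength
# .5 * .6 = .30, which for this chain equals the correlation of x1, y2.
r_x1y2 = np.corrcoef(x1, y2)[0, 1]
print(round(r_x1y2, 2))  # ~0.30
```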

What does it mean when two separated variables are not
conditionally independent?

       x1     y1     y2
x1    1.0
y1    0.55   1.0
y2    0.50   0.60   1.0

x1 --(r = .55)--> y1 --(r = .60)--> y2

0.55 x 0.60 = 0.33, which is not equal to 0.50

The inequality implies that the true model contains an
additional process: a direct path from x1 to y2 in addition to
the indirect path through y1.

Fourth Rule of Path Coefficients: when variables are connected
by more than one causal pathway, the path coefficients are
"partial" regression coefficients.

Which pairs of variables are connected by two causal paths?
Answer: x1 and y2 (the obvious one), but also y1 and y2, which
are connected by the joint influence of x1 on both of them.

And for another case:

[Diagram: x1 and x2, joined by a curved two-headed arrow, both
affect y1.]

A case of shared causal influence: the unanalyzed relation
between x1 and x2 represents the effects of an unspecified
joint causal process. Therefore, x1 and y1 are connected by two
causal paths; x2 and y1 likewise.

How to Interpret Partial Path Coefficients:
- The Concept of Statistical Control

[Diagram: x1 affects y1 (.40) and y2 (.31); y1 affects y2 (.48).]

The effect of y1 on y2 is controlled for the joint effects of x1.

I have an article on this subject that is brief and to the point:
Grace, J.B. and K.A. Bollen 2005. Interpreting the results from
multiple regression and structural equation models. Bull.
Ecological Soc. Amer. 86:283-295.
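Statistical control can be demonstrated directly. In this sketch (my own simulation code, using the diagrammed coefficients), regressing y2 on both y1 and x1 recovers the partial coefficients, while the bivariate correlation of y1 with y2 also absorbs the joint effect of x1:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Simulate the diagrammed model with standardized variables:
# x1 -> y1 (.40); x1 -> y2 (.31); y1 -> y2 (.48).
x1 = rng.normal(size=n)
y1 = 0.40 * x1 + rng.normal(scale=np.sqrt(1 - 0.40**2), size=n)
explained = 0.31**2 + 0.48**2 + 2 * 0.31 * 0.48 * 0.40  # var of .31·x1 + .48·y1
y2 = 0.31 * x1 + 0.48 * y1 + rng.normal(scale=np.sqrt(1 - explained), size=n)

# Multiple regression of y2 on x1 and y1 recovers the partial paths...
b = np.linalg.lstsq(np.column_stack([x1, y1]), y2, rcond=None)[0]
print(np.round(b, 2))  # ~[0.31, 0.48]

# ...while the bivariate correlation of y1 with y2 also absorbs
# x1's joint influence: .48 + .40 * .31 ≈ .60.
print(round(np.corrcoef(y1, y2)[0, 1], 2))  # ~0.60
```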

Interpretation of Partial Coefficients

An analogy to an electronic equalizer (image from
Sourceforge.net): with all other variables in the model held to
their means, how much does a response variable change when a
predictor is varied?

Fifth Rule of Path Coefficients: paths from error variables are
correlations or covariances.

equation for the path from an error variable: sqrt(1 - R²)

[Diagram: x1 affects y1 (.40) and y2 (.31); y1 affects y2 (.48).
R² = 0.16 for y1 and R² = 0.44 for y2; the paths from the error
variables ζ1 and ζ2 are .92 and .73. An alternative is to show
values for the zetas, which equal 1 - R² (.84 and .56).]
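The error-path formula can be verified numerically. In this illustrative sketch (my own code, using R² = 0.16 as in the diagram), the correlation between the error variable and its response equals sqrt(1 - R²):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# y = 0.4*x + error, standardized so that R² for y is 0.16.
x = rng.normal(size=n)
e = rng.normal(scale=np.sqrt(1 - 0.4**2), size=n)
y = 0.4 * x + e

# Fifth rule: the path from the error variable is its correlation
# with the response, which equals sqrt(1 - R²) = sqrt(0.84) ≈ .92.
r2 = np.corrcoef(x, y)[0, 1] ** 2
path_from_error = np.corrcoef(e, y)[0, 1]
print(round(r2, 2), round(path_from_error, 2))  # ~0.16 ~0.92
```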

Now, imagine y1 and y2 are joint responses:

[Diagram: x1 affects y1 (.40, R² = 0.16) and y2 (.50, R² = 0.25).]

       x1     y1     y2
x1    1.0
y1    0.40   1.0
y2    0.50   0.60   1.0

Sixth Rule of Path Coefficients: unanalyzed residual
correlations between endogenous variables are partial
correlations or covariances.

[Diagram: x1 affects y1 (.40, R² = 0.16) and y2 (.50, R² = 0.25),
with a correlated error term of .40 between ζ1 and ζ2.]

The partial correlation between y1 and y2 is typically
represented as a correlated error term. This implies that some
other factor is influencing y1 and y2.

Seventh Rule of Path Coefficients: the total effect one variable
has on another equals the sum of its direct and indirect effects.

Eighth Rule of Path Coefficients: the sum of all pathways between
two variables (causal and noncausal) equals the
correlation/covariance.

[Diagram: x1 and x2 are correlated exogenous variables (r = .80);
x1 affects y1 (.64) and y2 (.15); x2 affects y1 (-.11); y1
affects y2 (.27).]

Total Effects:
       x1     x2     y1
y1    0.64  -0.11   --
y2    0.32  -0.03   0.27

note: the correlation between x1 and y1 = 0.55, which equals
0.64 - 0.80*0.11
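The seventh and eighth rules are simple arithmetic on the path coefficients. A worked check using the slide's values (the variable names are my own):

```python
# Path coefficients from the slide's model:
# x1 <-> x2 correlation .80; x1 -> y1 = .64; x2 -> y1 = -.11;
# x1 -> y2 = .15; y1 -> y2 = .27.
r_x1x2, p_x1y1, p_x2y1, p_x1y2, p_y1y2 = 0.80, 0.64, -0.11, 0.15, 0.27

# Seventh rule: total effect = direct effect + indirect effects.
total_x1_y2 = p_x1y2 + p_x1y1 * p_y1y2   # 0.15 + 0.64*0.27 ≈ 0.32
total_x2_y2 = p_x2y1 * p_y1y2            # -0.11*0.27 ≈ -0.03

# Eighth rule: the sum of ALL pathways (causal and noncausal)
# reproduces the correlation, e.g. for x1 and y1:
r_x1y1 = p_x1y1 + r_x1x2 * p_x2y1        # 0.64 + 0.80*(-0.11) ≈ 0.55

print(round(total_x1_y2, 2), round(total_x2_y2, 2), round(r_x1y1, 2))
```

The three printed values reproduce the total-effects table and the noted correlation of 0.55.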

Suppression Effect: when the presence of another variable causes
a path coefficient to strongly differ from the bivariate
correlation.

[Diagram (same model as the preceding slide): x1 and x2
correlated .80; x1 affects y1 (.64) and y2 (.15); x2 affects
y1 (-.11); y1 affects y2 (.27).]

       x1     x2     y1     y2
x1    1.0
x2    0.80   1.0
y1    0.55   0.40   1.0
y2    0.30   0.23   0.35   1.0

The path coefficient for x2 to y1 (-.11) is very different from
the correlation (0.40); this results from the overwhelming
influence of x1.

II. Structural Equation Models: Form and Function


B. Anatomy of Latent Variable Models


Latent Variables

Latent variables are those whose presence we suspect or
theorize, but for which we have no direct measures.

[Diagram: a latent variable (Intelligence) with a fixed loading*
of 1.0 on an observed indicator (IQ score), which has an
associated error variable.]

*note that we must specify some parameter: either the error, the
loading, or the variance of the latent variable.

Latent Variables (cont.)

Purposes Served by Latent Variables:
(1) Specification of the difference between observed data and
processes of interest.
(2) Allow us to estimate and correct for measurement error.
(3) Represent certain kinds of hypotheses.

Range of Examples

[Measurement model diagrams:
- single-indicator: Elevation, indicated by an estimate from a map
- multi-method: Soil Organic, indicated by soil C and by loss on
  ignition
- repeated measures: Territory Size, indicated by singing range at
  times t1, t2, and t3
- repeatability: Caribou Counts, indicated by observer 1 and
  observer 2]

The Concept of Measurement Error
(the argument for universal use of latent variables)

1. Observed variable models, path or other, assume all
independent variables are measured without error.
2. Reliability: the degree to which a measurement is repeatable
(i.e., a measure of precision).

[Illustration: x predicts y with a coefficient of 0.60 and
R² = 0.30; error in measuring x is ascribed to error in
predicting/explaining y.]

Example

Imagine that some of the observed variance in x is due to error
of measurement.

Calibration data set based on repeated measurement trials:

plot   x-trial1   x-trial2   x-trial3
1      1.272      1.206      1.281
2      1.604      1.577      1.671
3      2.177      2.192      2.104
4      1.983      2.080      1.999
...    ...        ...        ...
n      2.460      2.266      2.418

average correlation between trials = 0.90
therefore, average R-square = 0.81
reliability = square root of R² = 0.90
measurement error variance = (1 - R²) times VARx

Imagine in this case VARx = 3.14, so the error variance
= 0.19 x 3.14 = 0.60.

[Diagram: latent variable LV1, with loading .90 on x and error
variance .60, affects LV2 (path coefficient .65), which has
indicator y with loading 1.0 and R² = .42.]
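The reliability arithmetic above can be written out directly (a worked-example sketch; all numeric values are from the slide):

```python
import math

# Worked arithmetic from the calibration example (all values from
# the slide).
avg_r = 0.90                  # average correlation between trials
r2 = avg_r ** 2               # average R-square = 0.81
reliability = math.sqrt(r2)   # square root of R² = 0.90
var_x = 3.14                  # observed variance of x
error_var = (1 - r2) * var_x  # 0.19 x 3.14 ≈ 0.60
print(round(r2, 2), round(reliability, 2), round(error_var, 2))
```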

II. Structural Equation Models: Form and Function


C. Estimation and Evaluation


1. The Multiequational Framework

(a) the observed variable model

We can model the interdependences among a set of predictors and
responses using an extension of the general linear model that
accommodates the dependences of response variables on other
response variables.

y = α + By + Γx + ζ

where:
y = p x 1 vector of responses
x = q x 1 vector of exogenous predictors
α = p x 1 vector of intercepts
ζ = p x 1 vector of errors for the elements of y
B = p x p coefficient matrix of ys on ys
Γ = p x q coefficient matrix of ys on xs
Φ = cov(x) = q x q matrix of covariances among xs
Ψ = cov(ζ) = p x p matrix of covariances among errors
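Given these matrices, the model-implied covariance of y follows from the reduced form y = (I - B)⁻¹(Γx + ζ). A minimal numpy sketch with illustrative values of my own for a two-response, one-predictor model:

```python
import numpy as np

# Illustrative (hypothetical) matrices for a model with p = 2
# responses and q = 1 predictor: x1 -> y1 -> y2, plus x1 -> y2.
B = np.array([[0.0, 0.0],
              [0.4, 0.0]])    # effects of ys on ys (y1 -> y2)
Gamma = np.array([[0.5],
                  [0.3]])     # effects of x on ys
Phi = np.array([[1.0]])       # cov(x)
Psi = np.diag([0.75, 0.5])    # cov(ζ), errors uncorrelated here

# Reduced form: y = (I - B)^-1 (Γx + ζ), so the model-implied
# covariance of y is (I - B)^-1 (Γ Φ Γ' + Ψ) (I - B)^-T.
inv = np.linalg.inv(np.eye(2) - B)
cov_y = inv @ (Gamma @ Phi @ Gamma.T + Psi) @ inv.T
print(np.round(cov_y, 3))  # var(y1)=1.0, var(y2)=0.87, cov=0.55
```

Comparing such an implied covariance matrix with the observed one is the basis of the estimation and evaluation methods described below.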

(b) the latent variable model

η = α + Bη + Γξ + ζ

(the LISREL equations; Jöreskog 1973)

where:
η is a vector of latent responses,
ξ is a vector of latent predictors,
B and Γ are matrices of coefficients,
ζ is a vector of errors for η, and
α is a vector of intercepts for η

(c) the measurement model

x = Λx·ξ + δ
y = Λy·η + ε

where:
Λx is a matrix of loadings that link observed x variables to
latent predictors,
Λy is a matrix of loadings that link observed y variables to
latent responses, and
δ and ε are vectors of errors

2. Estimation Methods
(a) decomposition of correlations (original path analysis)
(b) least-squares procedures (historic or in special cases)
(c) maximum likelihood (standard method)
(d) Markov chain Monte Carlo (MCMC) methods
(including Bayesian applications)


Bayesian References:
Bayesian SEM:
Lee, S.Y. (2007) Structural Equation Modeling: A Bayesian
Approach. Wiley & Sons.
Bayesian Networks:
Neapolitan, R.E. (2004). Learning Bayesian Networks. Upper
Saddle River, NJ: Prentice Hall Publs.

SEM is Based on the Analysis of Covariances!

Why? Analysis of correlations represents a loss of information.

[Illustration: two regressions having the same slope and
intercept but different scatter, one with r = 0.86 and one with
r = 0.50.]

Analysis of covariances allows for estimation of both
standardized and unstandardized parameters.

2. Estimation (cont.): analysis of covariance structure

The most commonly used method of estimation over the past three
decades has been through the analysis of covariance structure
(think: analysis of patterns of correlations among variables).

compare the observed correlations*

1.0
.24   1.0
.01   .70   1.0

with the model-implied correlations

σ11
σ12   σ22
σ13   σ23   σ33

* typically the unstandardized correlations, or covariances

3. Evaluation

[Diagram: the hypothesized model (relating x1, y1, and y2) and
the observed covariance matrix S yield parameter estimates
(e.g., via maximum likelihood estimation), which are used for
model fit evaluations.]

S =
1.3
.24   .41
.01   9.7   12.3

is compared with the implied covariance matrix

Σ̂ =
σ11
σ12   σ22
σ13   σ23   σ33

Model Identification - Summary


1. For the model parameters to be estimated with unique
values, they must be identified. As in linear algebra, we
have a requirement that we need as many known pieces of
information as we do unknown parameters.
2. Several factors can prevent identification, including:
a. too many paths specified in model
b. certain kinds of model specifications can make
parameters unidentified
c. multicollinearity
d. combination of a complex model and a small sample
3. Good news is that most software checks for identification
(in something called the information matrix) and lets you
know which parameters are not identified.

Fitting Functions

The most commonly used fitting function in maximum likelihood
estimation of structural equation models is based on the log
likelihood ratio, which compares the likelihood for a given
model to the likelihood of a model with perfect fit.

F_ML = log|Σ| + tr(SΣ⁻¹) - log|S| - (p + q)

Note that when the sample matrix S and the implied matrix Σ are
equal, terms 1 and 3 cancel, and tr(SΣ⁻¹) equals p + q so that
terms 2 and 4 cancel as well. Thus, perfect model fit yields an
F_ML value of 0.
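F_ML can be computed directly from the two matrices. A minimal numpy sketch (my own code; the two example correlation matrices differ only in the correlation between the first and third variables), showing that F_ML is 0 at perfect fit and positive under misfit:

```python
import numpy as np

def f_ml(S, Sigma):
    """F_ML = log|Σ| + tr(SΣ⁻¹) - log|S| - k, with k observed variables."""
    k = S.shape[0]
    return (np.log(np.linalg.det(Sigma)) + np.trace(S @ np.linalg.inv(Sigma))
            - np.log(np.linalg.det(S)) - k)

# Observed correlations among (x, y1, y2) and the correlations implied
# by a model with no x -> y2 path (so r(x, y2) = .4 * .5 = .2).
S = np.array([[1.0, 0.4, 0.35],
              [0.4, 1.0, 0.5],
              [0.35, 0.5, 1.0]])
Sigma = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.0, 0.5],
                  [0.2, 0.5, 1.0]])

print(round(f_ml(S, S), 6))      # ~0: perfect fit
print(round(f_ml(S, Sigma), 4))  # ≈ 0.0364: positive under misfit
```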

Fitting Functions (cont.)

Maximum likelihood estimators, such as F_ML, possess several
important properties: (1) asymptotically unbiased, (2) scale
invariant, and (3) best estimators.

Assumptions:
(1) the Σ and S matrices are positive definite (i.e., they do
not have a singular determinant, such as might arise from a
negative variance estimate, an implied correlation greater than
1.0, or from one row of a matrix being a linear function of
another), and
(2) the data follow a multinormal distribution.

Assessment of Fit between the Sample Covariance and
Model-Implied Covariance Matrix

The χ² Test

One of the most commonly used approaches to performing such
tests (the model χ² test) utilizes the fact that the maximum
likelihood fitting function F_ML follows a χ² (chi-square)
distribution:

χ² = (n - 1) F_ML

Here, n refers to the sample size; thus χ² is a direct function
of sample size.

Illustration of the use of χ²

Issue: should there be a path from x to y2?

[Model: x affects y1 (0.40); y1 affects y2 (0.50); no direct
path from x to y2.]

correlation matrix:

       x      y1     y2
x     1.0
y1    0.4    1.0
y2    0.35   0.5    1.0

r(x, y2) is expected to be 0.2 (0.40 x 0.50)

χ² = 1.82 with 1 df and 50 samples, P = 0.18
χ² = 3.64 with 1 df and 100 samples, P = 0.056
χ² = 7.27 with 1 df and 200 samples, P = 0.007

Essentially, our ability to detect significant differences from
our base model depends, as usual, on sample size.

Additional Points about Model Fit Indices:

1. The chi-square test appears to be reasonably effective at
sample sizes less than 200.
2. There is no perfect answer to the model selection problem.
3. No topic in SEM has had more attention than the development
of indices that can be used as guides for model selection.
4. A lot of attention is being paid to Bayesian model selection
methods at the present time.
5. In SEM practice, much of the weight of evidence falls on the
investigator to show that the results are repeatable
(predictive of the next sample).

Alternatives when data are extremely nonnormal

Robust Methods:
Satorra, A., & Bentler, P. M. (1988). Scaling corrections for
chi-square statistics in covariance structure analysis. 1988
Proceedings of the Business and Economics Statistics Section of
the American Statistical Association, 308-313.

Bootstrap Methods:
Bollen, K. A., & Stine, R. A. (1993). Bootstrapping
goodness-of-fit measures in structural equation models. In K. A.
Bollen and J. S. Long (Eds.) Testing structural equation models.
Newbury Park, CA: Sage Publications.

Alternative Distribution Specification: Bayesian and other
approaches.

Diagnosing Causes of Lack of Fit (misspecification)

Modification Indices: predicted effects of model modification on
the model chi-square.

Residuals: most fit indices represent an average of the
residuals between observed and predicted covariances. Therefore,
individual residuals should be inspected.

Correlation Matrix to be Analyzed

       y1     y2     x
y1    1.00
y2    0.50   1.00
x     0.40   0.35   1.00

Fitted Correlation Matrix

       y1     y2     x
y1    1.00
y2    0.50   1.00
x     0.40   0.20   1.00

residual (x, y2) = 0.35 - 0.20 = 0.15

The topic of model selection, which focuses on how you choose
among competing models, is very important. Please refer to
additional tutorials for considerations of this topic.

While we have glossed over as many details as we could, these
fundamentals will hopefully help you get started with SEM.

Another gentle introduction to SEM oriented to the community
ecologist is Chapter 30 in McCune, B. and J.B. Grace 2004.
Analysis of Ecological Communities. MJM. (sold at cost with no
profit)
