
3. SEM Basics

General steps in Structural Equation Modeling (SEM) are:
(1) Specification
(2) Identification
(3) Estimation
(4) Testing
(5) Modification
3.1 Model Specification

Model specification involves using all available relevant theory, research, and information to construct the theoretical model.
In structural equation modeling this results in specifying the relationships between the relevant variables describing the phenomenon of interest.
Example 3.1: For a study in an industrial sales force, the researcher wanted to answer questions such as: Is the relationship between performance and job satisfaction a myth or reality? Does performance influence satisfaction, or does satisfaction influence performance?

[Diagram: job satisfaction and performance connected by arrows marked with question marks, indicating the undetermined direction of influence.]
Remark 3.1: In diagrams the (latent) theoretical variables are presented as circles or ovals and the observed variables in rectangles.

Bagozzi, R.P. (1980). Performance and satisfaction in an industrial sales force: An examination of their antecedents and simultaneity. Journal of Marketing, 44, 65-77.
The researcher further believes that background factors, like achievement motivation, task-specific self-esteem, and verbal intelligence, possibly have an impact on job satisfaction as well as on performance.
The impact of these background factors should be eliminated. This can be done by including them in the model.
Thus, the preliminary sketch of the model is the following:

[Diagram: achievement motivation, self-esteem, and verbal intelligence shown as mutually correlated background factors with one-headed arrows pointing to both job satisfaction and performance; the relation between job satisfaction and performance is marked with question marks.]
Remark 3.2: In SEM graphs, like the one above, one-headed arrows indicate causal relations and two-headed arched arrows indicate mutual dependencies (correlations).

Remark 3.3: Inclusion of unimportant factors or exclusion of important factors will produce implied models that are misspecified.

Remark 3.4: Exclusion of important factors may result in biased parameter estimates.

Remark 3.5: Inclusion of unimportant factors may result in loss of estimation accuracy.

Remark 3.6: A misspecified model cannot adequately reproduce the observed covariances, and hence will not fit the data.
Measurement Models

If some of the variables in the model are latent (not directly observable), a measurement model must be specified.
This results in constructing observable variables that are supposed to measure the latent variables and in defining the relationships between these observed variables and the latent variables.
Example 3.2: In the above example, achievement motivation, for instance, was measured with two indices, each of which was constructed as a sum of several other variables.
The measurement model is constructed separately for the explanatory (exogenous) latent variables and the explained (endogenous) latent variables.
Thus there are two measurement models, one for the exogenous variables and one for the endogenous variables.
Measurement model for the exogenous factors:

[Diagram: the latent factors achv mot, self est, and verb int each point to two observed indicators (achm1, achm2; sestm1, sestm2; intlm1, intlm2).]

The measurement model for the endogenous variables is constructed in the same fashion.
3.2 Model Identification

In SEM the estimated parameters are functions of the sample covariances.
That is, the system is a set of equations where the parameters are the unknowns.
If the unknowns can be solved uniquely, the model is said to be identified.
Otherwise it is under-identified.
Example 3.3: In principle the question is analogous to the situation where we have variables x and y which satisfy the equation x + y = 10.
With this information alone we cannot uniquely solve for the values of x and y.
There are infinitely many solutions. If we know that, say, x = 2, then there is a unique solution y = 8.
Traditionally there are three levels of identification:
(1) Under-identified: One or more parameters cannot be solved uniquely on the basis of the covariance matrix S (more unknown parameters than equations).
(2) Just-identified: There is just enough information in the covariance matrix S to solve for the parameters of the model (equal number of equations and unknown parameters).
(3) Over-identified: There is more than one way to estimate the unknown parameters (more equations than unknowns).
The model is said to be identified if it is either just-identified or over-identified.
Remark 3.7: In the over-identified case one can statistically test whether the data support the extra restrictions in the model.

Remark 3.8: Checking whether a model is identified is unfortunately not a straightforward task.
Usually, however, the computer programs can indicate which parameters are not identified (though this is not waterproof).
A necessary condition is that there are at most p(p + 1)/2 parameters to be estimated, where p is the number of observed variables (the dimension of the covariance matrix).
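The counting condition of Remark 3.8 can be checked mechanically. The following Python sketch (an illustration for these notes, not part of any SEM package) compares the number of free parameters with the number of distinct elements of S:

    def order_condition(p, t):
        """p = number of observed variables, t = number of free parameters."""
        moments = p * (p + 1) // 2      # distinct variances and covariances in S
        if t > moments:
            return "under-identified (necessary condition violated)"
        if t == moments:
            return "possibly just-identified"
        return "possibly over-identified"

    # Example 3.4 below has p = 4 observed variables and 10 free parameters
    # (excluding the intercept), and 10 = 4*5/2, so the bound is met exactly.
    print(order_condition(4, 10))

Note that the condition is only necessary: a model may satisfy it and still be under-identified.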
3.3 Model Estimation

The observed covariance matrix is S and the model-implied (theoretical) covariance matrix is Σ, which is a function of the model parameters.
The goal in the estimation is to find such parameter values that the theoretical covariance matrix Σ is as close as possible to the empirical covariance matrix S.
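To illustrate the idea (with an unweighted least-squares discrepancy, not LISREL's maximum likelihood fit function), the following Python sketch fits a small one-factor model by minimizing the distance between S and the implied covariance matrix; the covariance matrix and starting values are invented for the example:

    import numpy as np
    from scipy.optimize import minimize

    S = np.array([[1.00, 0.60, 0.54],
                  [0.60, 1.00, 0.63],
                  [0.54, 0.63, 1.00]])               # hypothetical sample covariance matrix

    def implied_cov(theta):
        """One-factor model: Sigma(theta) = lambda lambda' + diag(psi)."""
        lam = theta[:3].reshape(-1, 1)               # factor loadings
        psi = np.diag(theta[3:])                     # unique (error) variances
        return lam @ lam.T + psi

    def discrepancy(theta):
        diff = S - implied_cov(theta)
        return 0.5 * np.sum(diff**2)                 # unweighted least-squares fit function

    theta0 = np.full(6, 0.5)                         # crude starting values
    result = minimize(discrepancy, theta0)
    print(result.x)                                  # estimated loadings and unique variances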
3.4 Model Testing

After the model is estimated, the next task is to assess how well the model fits the data, i.e., how well the theoretical model is supported by the data.
(a) Global omnibus tests of the goodness of fit of the entire model (in SEM, one of the most popular is the χ² goodness-of-fit test, the chi-square test).
(b) Individual parameter tests: (i) statistical significance of individual parameter estimates (the t-ratios, see Example 2.1), (ii) restrictions, e.g., equality of some parameters, etc.
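For the omnibus test in (a), assuming maximum likelihood estimation, the usual test statistic is T = (n − 1) F_min, where F_min is the minimized fit function value; under the hypothesis that the model is correct, T is approximately χ²-distributed with df = p(p + 1)/2 − t degrees of freedom, t being the number of free parameters. A short Python sketch with purely illustrative numbers:

    from scipy.stats import chi2

    def chi_square_fit_test(F_min, n, p, t):
        T = (n - 1) * F_min              # test statistic from the minimized ML fit function
        df = p * (p + 1) // 2 - t        # distinct moments minus free parameters
        return T, df, chi2.sf(T, df)     # a large p-value: no evidence against the model

    # purely illustrative numbers: F_min = 0.08, n = 200, p = 6, t = 13
    print(chi_square_fit_test(0.08, 200, 6, 13))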
3.5 Model Modification

If the fit of the model is not good enough (which is the usual case), the model must be modified and the fit of the modified model subsequently evaluated.
In modern SEM packages, like LISREL, there are powerful tools for evaluating how to modify the model (modification indices).
These modification indices suggest which of the imposed restrictions most strongly disagree with the empirical data.
Example 3.4: Simple regression for patient satisfaction (n = 23).

Variables:
age = patient's age in years,
sev = severity of illness,
anx = level of anxiety, and
sat = satisfaction level.

Specification:
Based on relevant theory and prior research, the researcher is interested in specifying a regression model to predict the satisfaction level of patients based on the patient's age, severity of illness, and level of anxiety.
Thus, (s)he specifies the model

    sat = β0 + β1 age + β2 sev + β3 anx + u,    (1)

where β0, β1, β2, and β3 are the parameters of the model to be estimated and u is an (unobservable) random error term.
Schematically the model is:

[Figure 3.1: Path diagram of the patient satisfaction regression model. The observed variables age, sev, and anx have one-headed arrows pointing to sat, which also receives the error term.]
Identification:
Identification refers to deciding whether a set of unique parameter estimates can be computed for the regression equation.
The parameters of major interest in the regression are (β0), β1, β2, and β3.
From the structural equation point of view there are also the variance and covariance parameters of the explanatory variables, which number 6 altogether.
In addition there is the variance of the error term.
Thus, in all there are 10 parameters plus the intercept term β0.
There are altogether 10 covariances and variances in the covariance matrix of the variables age, sev, anx, and sat.
Thus, the model is just-identified.
The 11th parameter β0 is estimated as

    β̂0 = mean(sat) − β̂1 mean(age) − β̂2 mean(sev) − β̂3 mean(anx),

where the hat (ˆ) denotes the estimate of the corresponding parameter and mean(·) denotes the sample mean of the corresponding variable.
In the SEM computer output the just-identified case is indicated by the note that the model is saturated and χ² = 0, df = 0, and p-val = 0.
Estimation:
Model estimation involves estimating the parameters in the regression model (1).
This can be done with the usual statistical packages (SAS, SPSS, etc.), and even in Excel.
Here we demonstrate PRELIS and LISREL.
In PRELIS the data look as follows [screenshot omitted].
The menu Statistics → Regressions opens the regression dialog [screenshot omitted].
Making the selections shown [screenshot omitted] and clicking Run gives the output [screenshot omitted].
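The same regression can of course also be estimated outside PRELIS. A minimal Python sketch of the ordinary least-squares step (the data below are synthetic placeholders, not the values in ex34.psf):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 23
    age = rng.uniform(22, 55, n)                     # placeholder predictors
    sev = rng.uniform(40, 62, n)
    anx = rng.uniform(1.8, 2.9, n)
    sat = 150 - 1.0*age - 0.5*sev - 10*anx + rng.normal(0, 5, n)  # illustrative only

    X = np.column_stack([np.ones(n), age, sev, anx])  # design matrix with intercept
    beta_hat, *_ = np.linalg.lstsq(X, sat, rcond=None)
    print(beta_hat)                                   # estimates of beta0, ..., beta3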
Model Testing:
In regression the global fit is most often measured with the R-square (coefficient of determination), which indicates the fraction of the total variation in the dependent variable that the model can explain:

    R² = Σᵢ (ŷᵢ − ȳ)² / Σᵢ (yᵢ − ȳ)²,    (2)

where the sums run over i = 1, ..., n, ŷᵢ is the predicted (fitted) value of yᵢ from the estimated regression model, and ȳ is the sample mean of y.
In the example R² = 0.685, which indicates that the variables age, sev, and anx explain 68.5 percent of the variation in sat; the remaining 31.5% is from unknown sources.
The Error Variance in the output is the variance estimate of the error term u in the regression (1).
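Formula (2) is easy to verify with a few lines of Python; the tiny data set below is illustrative and unrelated to the patient data:

    import numpy as np

    def r_square(y, y_hat):
        """R^2 as in (2): explained variation over total variation of y."""
        return np.sum((y_hat - y.mean())**2) / np.sum((y - y.mean())**2)

    x = np.array([1., 2., 3., 4., 5., 6.])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)        # OLS fit of a simple line
    print(round(r_square(y, X @ b), 3))              # close to 1 for this nearly linear data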
A global test of whether any of the explanatory variables have an effect on the dependent variable is the F-test.
The null hypothesis of the test is that all the slope parameters are simultaneously zero, i.e.,

    H0: β1 = β2 = β3 = 0.    (3)

Unlike standard regression programs, PRELIS does not print this global test.
However, if the null hypothesis is true, it implies that the R-square, R² = 0, in the population regression.
Thus, hypothesis (3) is equivalent to the null hypothesis that the population R² is zero (remember that statistical testing is always about testing population parameters).
Consequently the F-test can be formulated in terms of R² as

    F = (R² / (1 − R²)) × ((n − p − 1) / p),    (4)

where p is the number of explanatory variables, here p = 3, and n is the number of observations.
Thus,

    F = (0.685 / (1 − 0.685)) × ((24 − 3 − 1) / 3) ≈ 14.5.

The degrees of freedom of the numerator is p = 3 and of the denominator n − p − 1 = 20.
The implied p-value is 0.00003.
Thus, the null hypothesis (3) is clearly rejected, and the conclusion is that at least one of the explanatory variables affects the level of satisfaction.
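The arithmetic above and the reported p-value can be reproduced with scipy, using the same numbers as in the text:

    from scipy.stats import f

    R2, p, df2 = 0.685, 3, 20                        # values used in the text
    F = (R2 / (1 - R2)) * (df2 / p)                  # formula (4)
    print(round(F, 1))                               # about 14.5
    print(f.sf(F, p, df2))                           # upper-tail p-value, about 3e-5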
Individual parameters are tested with the t-test.
More general parameter restrictions are again tested with a suitable F-test.
Inspection of the individual coefficients reveals that the coefficient estimate of sev is not statistically significant (p-value 0.626).
Thus, there is no empirical evidence that the severity of illness affects patient satisfaction.
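Any regression routine reports these t-ratios and p-values. A sketch using statsmodels, with illustrative placeholder data (so the p-value 0.626 will not be reproduced):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.DataFrame({                              # illustrative values only
        "age": [50, 36, 40, 41, 28, 49, 42, 45],
        "sev": [51, 46, 48, 44, 43, 54, 50, 48],
        "anx": [2.3, 2.3, 2.2, 1.8, 1.8, 2.9, 2.2, 2.4],
        "sat": [48, 57, 66, 70, 89, 36, 46, 54],
    })
    X = sm.add_constant(df[["age", "sev", "anx"]])
    fit = sm.OLS(df["sat"], X).fit()
    print(fit.tvalues)                               # t-ratios for intercept and slopes
    print(fit.pvalues)                               # two-sided p-values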
Model Modification:
The variable sev proved not to be statistically significant, thus it can be dropped from the model.
Doing this gives the reduced-model output [screenshot omitted].
The goodness of fit in terms of R² = 0.682 shows that the reduced model is virtually as good as the original one.
If the explanatory power of about 68 percent is not good enough, additional variables should be included if more recent research indicates that other variables are important in explaining patient satisfaction.
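The modification step is simply a refit of the reduced model. With statsmodels the comparison of the two R² values looks as follows (again with placeholder data, so the values 0.685 and 0.682 are not reproduced):

    import pandas as pd
    import statsmodels.api as sm

    df = pd.DataFrame({                              # same placeholder values as above
        "age": [50, 36, 40, 41, 28, 49, 42, 45],
        "sev": [51, 46, 48, 44, 43, 54, 50, 48],
        "anx": [2.3, 2.3, 2.2, 1.8, 1.8, 2.9, 2.2, 2.4],
        "sat": [48, 57, 66, 70, 89, 36, 46, 54],
    })
    full    = sm.OLS(df["sat"], sm.add_constant(df[["age", "sev", "anx"]])).fit()
    reduced = sm.OLS(df["sat"], sm.add_constant(df[["age", "anx"]])).fit()
    print(full.rsquared, reduced.rsquared)           # a small drop means sev adds little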
Example 3.5: LISREL demonstration of estimating the model. LISREL is not really intended for simple regression applications, but running the example in LISREL serves to demonstrate the use of graphical modeling in the forthcoming, more complicated cases.
In LISREL one can estimate the model by writing the syntax or by starting with the path window to generate the path model (Figure 3.1).
The raw data are in the file ex34.psf (a PRELIS system file).
In LISREL select File → New, which opens a dialog [screenshot omitted].
Scroll down to Path Diagram, select it, and click OK to open the save dialog [screenshot omitted].
Type in the name under which you want to save your path diagram (here ex35; the extension will be .pth) and click Save.
Next a new window opens [screenshot omitted].
From the menu select Setup → Title and Comments... to open the corresponding dialog [screenshot omitted].
Type in the title and click Next to open the Group Names dialog.
We have only one group, so click Next to open the next dialog [screenshot omitted].
Type in the variable names and click Next.
Select Raw Data under Statistics from: and in File name: locate the file in which the raw data are saved as a PRELIS system file (here, ex34.psf).
After locating the file, click OK.
Next a variable-selection dialog opens [screenshot omitted].
Check sat as a Y-variable, as indicated in the (omitted) screenshot.
Next formulate the path diagram by dragging and dropping the variables (you can use the Image menu to adjust the diagram) and by using the tool window to define the relationships [screenshots omitted].
Click Run LISREL to generate the SIMPLIS code.
Clicking the Run LISREL button again estimates the model [output omitted].
The estimates are also output to the path diagram [figure omitted].
In equation form the result is the same as before [output omitted].
By activating the .PTH window you can output different path statistics to the diagram from the View → Estimations sub-menu.
For example, the standardized coefficients can be shown [figure omitted].
The bivariate relations between the explanatory variables (x-variables) are correlations.