Model Specification
in Regression Analysis
One of the most important but least understood issues in all of
regression analysis concerns model specification. Model specification
refers to the determination of which independent variables should be
included in or excluded from a regression equation. In general, the
specification of a regression model should be based primarily on
theoretical considerations rather than empirical or methodological
ones. A multiple regression model is, in fact, a theoretical statement
about the causal relationship between one or more independent
variables and a dependent variable. Indeed, it can be observed that
regression analysis involves three distinct stages: the specification of a
model, the estimation of the parameters of this model, and the
interpretation of these parameters. Specification is the first and most
critical of these stages. Our estimates of the parameters of a model and
our interpretation of them depend on the correct specification of the
model. Consequently, problems can arise whenever we misspecify a
model. There are two basic types of specification errors. In the first,
we misspecify a model by including in the regression equation an
independent variable that is theoretically irrelevant. In the second, we
misspecify the model by excluding from the regression equation an
independent variable that is theoretically relevant. Both types of
specification errors can lead to problems of estimation and
interpretation.
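The two types of specification error can be illustrated with a small
simulation. The sketch below is not taken from the text: the variable
names (x1, x2, x3), the true coefficients (1.0 and 2.0), the strength of
the correlations among the independent variables, and the sample size are
all hypothetical values chosen for illustration. Under these assumptions,
excluding the relevant variable x2 should push the estimated coefficient
for x1 toward roughly 1.0 + 2.0(0.6) = 2.2, whereas including the
irrelevant variable x3 should leave the coefficient for x1 near 1.0 and
yield a coefficient for x3 near zero.

```python
import random


def center(values):
    """Subtract the mean so that the intercept drops out of the fit."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ols_one(y, x):
    """Least-squares slope of y on a single independent variable."""
    yc, xc = center(y), center(x)
    return dot(xc, yc) / dot(xc, xc)


def ols_two(y, xa, xb):
    """Least-squares slopes of y on two independent variables,
    obtained by solving the 2x2 normal equations directly."""
    yc, a, b = center(y), center(xa), center(xb)
    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    say, sby = dot(a, yc), dot(b, yc)
    det = saa * sbb - sab * sab
    return (sbb * say - sab * sby) / det, (saa * sby - sab * say) / det


def run_demo(n=20000, seed=0):
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical: x2 is relevant and correlated with x1.
    x2 = [0.6 * v + rng.gauss(0, 0.8) for v in x1]
    # Hypothetical: x3 is correlated with x1 but has no effect on y.
    x3 = [0.5 * v + rng.gauss(0, 1) for v in x1]

    # Case 1: the true model contains x1 and x2.
    y = [1.0 * a + 2.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    full_b1, full_b2 = ols_two(y, x1, x2)   # correctly specified
    omitted_b1 = ols_one(y, x1)             # x2 wrongly excluded

    # Case 2: the true model contains x1 only.
    y2 = [1.0 * a + rng.gauss(0, 1) for a in x1]
    extra_b1, extra_b3 = ols_two(y2, x1, x3)  # x3 wrongly included

    return {
        "full_b1": full_b1,
        "full_b2": full_b2,
        "omitted_b1": omitted_b1,
        "extra_b1": extra_b1,
        "extra_b3": extra_b3,
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the excluded variable distorts the remaining coefficient
only because x1 and x2 are correlated; setting the 0.6 in the definition
of x2 to zero would make the two fits for x1 agree apart from sampling
error.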
It must be noted, at the outset, that specification errors are not
especially problematic whenever the independent variables in a
multiple regression model are orthogonal or uncorrelated with one
$$z_1 = z_2(0.341) + z_3(0.157) + e$$