
CHAPTER 35

Model Specification
in Regression Analysis

One of the most important but least understood issues in all of regres-
sion analysis concerns model specification. Model specification refers
to the determination of which independent variables should be
included in or excluded from a regression equation. In general, the
specification of a regression model should be based primarily on
theoretical considerations rather than empirical or methodological
ones. A multiple regression model is, in fact, a theoretical statement
about the causal relationship between one or more independent vari-
ables and a dependent variable. Indeed, it can be observed that
regression analysis involves three distinct stages: the specification of a
model, the estimation of the parameters of this model, and the inter-
pretation of these parameters. Specification is the first and most criti-
cal of these stages. Our estimates of the parameters of a model and
our interpretation of them depend on the correct specification of the
model. Consequently, problems can arise whenever we misspecify a
model. There are two basic types of specification errors. In the first,
we misspecify a model by including in the regression equation an
independent variable that is theoretically irrelevant. In the second, we
misspecify the model by excluding from the regression equation an
independent variable that is theoretically relevant. Both types of
specification errors can lead to problems of estimation and interpreta-
tion.
It must be noted, at the outset, that specification errors are not
especially problematic whenever the independent variables in a
multiple regression model are orthogonal or uncorrelated with one
another. It will be recalled that the partial regression coefficients for a
set of orthogonal independent variables in a multiple regression equa-
tion are equal to their respective simple regression coefficients.
Consequently, the addition or deletion of an orthogonal independent
variable does not have any effect on the partial regression coefficients
of the other independent variables in the regression equation.
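This invariance can be checked numerically. The sketch below uses hypothetical synthetic data (NumPy assumed available): two predictors are centered and made exactly orthogonal, and the partial slopes from the multiple regression are compared with the slopes from the two simple regressions.

```python
import numpy as np

# Hypothetical data: when two predictors are exactly orthogonal, each
# partial regression coefficient equals the corresponding simple slope.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x1 -= x1.mean()                          # center both predictors
x2 -= x2.mean()
x2 -= x1 * (x1 @ x2) / (x1 @ x1)         # orthogonalize x2 against x1
y = 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Multiple regression of y on both predictors (with intercept).
X = np.column_stack([np.ones(n), x1, x2])
b_multiple, *_ = np.linalg.lstsq(X, y, rcond=None)

# Simple regression of y on each predictor separately.
b1_simple = np.polyfit(x1, y, 1)[0]
b2_simple = np.polyfit(x2, y, 1)[0]

# With orthogonal predictors the partial and simple slopes coincide.
print(b_multiple[1], b1_simple)
print(b_multiple[2], b2_simple)
```

Because the design columns are mutually orthogonal, the least-squares coefficients decouple, so dropping or adding either predictor leaves the other's coefficient untouched.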
However, the addition or deletion of an orthogonal independent vari-
able will affect the standard errors of the partial regression coefficients
of other independent variables in the equation. Specifically, the addi-
tion of an independent variable to a regression equation will increase
the standard errors of the partial regression coefficients by decreasing
their degrees of freedom. At the same time, however, the addition of
an independent variable to a regression equation may decrease these
same standard errors by decreasing the error variance in the depen-
dent variable. Conversely, the deletion of an independent variable
from a regression equation will decrease the standard errors of the
partial regression coefficients by increasing their degrees of freedom.
However, once again, the deletion of an independent variable may
increase the standard errors of the partial regression coefficients by
decreasing the error variance of the dependent variable.
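The two opposing effects on the standard errors can also be illustrated with a small hypothetical example. In the sketch below, a second orthogonal predictor with a sizable effect is added to the model: the coefficient of the first predictor is unchanged, while its standard error shifts because both the degrees of freedom and the estimated error variance change.

```python
import numpy as np

# Hypothetical illustration of the standard-error trade-off: adding an
# orthogonal predictor costs a degree of freedom but can lower the
# estimated error variance, and deletion works in reverse.
rng = np.random.default_rng(1)
n = 30                                    # small n makes the df effect visible
x1 = rng.normal(size=n); x1 -= x1.mean()
x2 = rng.normal(size=n); x2 -= x2.mean()
x2 -= x1 * (x1 @ x2) / (x1 @ x1)          # make x2 orthogonal to x1
y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

def ols(X, y):
    """Return OLS coefficients and their standard errors."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    df = len(y) - X.shape[1]              # error degrees of freedom
    s2 = resid @ resid / df               # estimated error variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, se

b_small, se_small = ols(np.column_stack([np.ones(n), x1]), y)
b_full,  se_full  = ols(np.column_stack([np.ones(n), x1, x2]), y)

# Orthogonality keeps the coefficient of x1 unchanged, but its standard
# error differs because both df and the error variance change.
print(b_small[1], b_full[1])
print(se_small[1], se_full[1])
```

In this particular construction the added predictor explains a large share of the variance in y, so the drop in error variance outweighs the lost degree of freedom and the standard error shrinks; with a weaker predictor the net effect could go the other way.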
Of course, orthogonal independent variables are a rarity. In prac-
tice, the independent variables in a multiple regression equation are
often correlated with one another to some extent. Unfortunately,
specification errors are more problematic in models that contain
correlated independent variables because the partial regression coef-
ficient of each independent variable is likely to be affected by the
inclusion or exclusion of other independent variables. We can
demonstrate the problems associated with specification errors using a
familiar example. This example involves the regression of income on
education and occupation. Specifically, the multiple regression of
income on these two independent variables, in the case of standardized
variables, is given by:

z1 = 0.341 z2 + 0.157 z3 + e

where z1 is income, z2 is occupation, and z3 is education. In this
particular case, we have employed standardized
variables because they allow us to compare directly the partial regres-
sion coefficients of the different independent variables. We shall
assume that this model is correctly specified inasmuch as it contains
the theoretically relevant predictors of income.
