Regression
This procedure performs multiple linear regression with five methods for the entry and removal of variables. It also provides extensive analysis of residuals and influential cases. A caseweight (CASEWEIGHT) and a regression weight (REGWGT) can be specified in the model fitting.
Notation
The following notation is used throughout this chapter unless otherwise stated:
$l$	Number of cases

$c_i$	Caseweight for case $i$

$g_i$	Regression weight for case $i$

$w_i$	$c_i g_i$, the weight for case $i$

$W$	$\sum_{i=1}^{l} w_i$, the sum of weights

$C$	$\sum_{i=1}^{l} c_i$, the sum of caseweights

$\bar{X}_k$	Sample mean for the $k$th independent variable: $\bar{X}_k = \sum_{i=1}^{l} w_i x_{ki} / W$

$\bar{Y}$	Sample mean for the dependent variable: $\bar{Y} = \sum_{i=1}^{l} w_i y_i / W$

$h_i$	Leverage for case $i$

$\tilde{h}_i$	$h_i + g_i / W$

$S_{kj}$	Sample covariance for $X_k$ and $X_j$

$S_{yy}$	Sample variance for $Y$

$S_{ky}$	Sample covariance for $X_k$ and $Y$

$p$	Number of independent variables

$p^*$	$p$ if the intercept is not included; otherwise $p^* = p + 1$

$R$	The sample correlation matrix for $X_1, \ldots, X_p$ and $Y$
Descriptive Statistics
$$
R = \begin{pmatrix}
r_{11} & \cdots & r_{1p} & r_{1y} \\
r_{21} & \cdots & r_{2p} & r_{2y} \\
\vdots &        & \vdots & \vdots \\
r_{y1} & \cdots & r_{yp} & r_{yy}
\end{pmatrix}
$$

where

$$r_{kj} = \frac{S_{kj}}{\sqrt{S_{kk}\, S_{jj}}}$$

and

$$r_{yk} = r_{ky} = \frac{S_{ky}}{\sqrt{S_{kk}\, S_{yy}}}$$
The sample means $\bar{X}_i$ and covariances $S_{ij}$ are computed by a provisional means algorithm. Define

$$W_k = \sum_{i=1}^{k} w_i = \text{cumulative weight up to case } k$$

then, for $k > 1$,

$$\bar{X}_i(k) = \bar{X}_i(k-1) + \left(x_{ik} - \bar{X}_i(k-1)\right)\frac{w_k}{W_k}$$

and

$$C_{ij}(k) = C_{ij}(k-1) + \left(x_{ik} - \bar{X}_i(k-1)\right)\left(x_{jk} - \bar{X}_j(k-1)\right)\left(w_k - \frac{w_k^2}{W_k}\right)$$

Otherwise, for the first case,

$$\bar{X}_i(1) = x_{i1}$$

and

$$C_{ij}(1) = 0$$

The sample covariance $S_{ij}$ is obtained from the final $C_{ij}$ divided by $C - 1$.
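As an illustration, the updating formulas above can be sketched in Python. The function name and the assumption of unit regression weights (so that $w_i = c_i$ and $C = \sum_i c_i$) are illustrative rather than part of the procedure definition.

import numpy as np

def provisional_means(X, w):
    # One-pass (provisional means) computation of the weighted means and
    # of the corrected cross-products C_ij, following the updates above.
    n, m = X.shape
    mean = np.zeros(m)      # running weighted means, X_bar(k)
    C = np.zeros((m, m))    # running corrected cross-products, C(k)
    W = 0.0                 # cumulative weight W_k
    for k in range(n):
        wk = w[k]
        W += wk
        if W > 0:
            d = X[k] - mean                        # uses the means from case k-1
            C += np.outer(d, d) * (wk - wk**2 / W)
            mean += d * wk / W
    cov = C / (w.sum() - 1)   # sample covariances S_ij (here C = sum of caseweights)
    return mean, cov

With all weights equal to one, the result agrees with the usual two-pass sample mean and covariance (for example, np.cov(X, rowvar=False)).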
For a regression model of the form

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + e_i$$

sweep operations are used to compute the least squares estimates $b$ of $\beta$ and the associated regression statistics. The sweeping starts with the correlation matrix $R$.
Let $\tilde{R}$ be the new matrix produced by sweeping on the $k$th row and column of $R$. The elements of $\tilde{R}$ are

$$\tilde{r}_{kk} = \frac{1}{r_{kk}}$$

$$\tilde{r}_{ik} = \frac{r_{ik}}{r_{kk}}, \qquad i \neq k$$

$$\tilde{r}_{kj} = -\frac{r_{kj}}{r_{kk}}, \qquad j \neq k$$

and

$$\tilde{r}_{ij} = r_{ij} - \frac{r_{ik}\, r_{kj}}{r_{kk}}, \qquad i \neq k,\ j \neq k$$

If the sweep operations are applied to the rows and columns of $R_{11}$ in the partitioned matrix

$$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}$$

where $R_{11}$ contains the independent variables in the equation at the current step, the result is

$$\tilde{R} = \begin{pmatrix} R_{11}^{-1} & -R_{11}^{-1} R_{12} \\ R_{21} R_{11}^{-1} & R_{22} - R_{21} R_{11}^{-1} R_{12} \end{pmatrix}$$
The last row of

$$\begin{pmatrix} R_{21} R_{11}^{-1} & R_{22} - R_{21} R_{11}^{-1} R_{12} \end{pmatrix}$$
can be used to obtain the partial correlations for the variables not in the equation,
controlling for the variables already in the equation. Note that this routine is its own
inverse; that is, exactly the same operations are performed to remove a variable as
to enter a variable.
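The sweep step itself is only a few lines. The sketch below (the function name is illustrative) follows the element formulas and sign convention given above, so applying it twice to the same row and column restores the original matrix.

def sweep(R, k):
    # Sweep the augmented correlation matrix R (a NumPy array) on row and
    # column k, using the sign convention given above.
    S = R.copy()
    n = R.shape[0]
    pivot = R[k, k]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                S[i, j] = 1.0 / pivot
            elif i == k:                      # row k: changes sign
                S[i, j] = -R[i, j] / pivot
            elif j == k:                      # column k
                S[i, j] = R[i, j] / pivot
            else:
                S[i, j] = R[i, j] - R[i, k] * R[k, j] / pivot
    return S

After the variables in the equation have been swept, $r_{yy} = 1 - R^2$ and the elements $r_{yk}$ give the standardized coefficients, which is how the statistics described later are read off the matrix.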
A variable $X_k$ is considered for entry only if, for every variable $X_j$ already in the equation,

$$\left(r_{jj} - \frac{r_{jk}\, r_{kj}}{r_{kk}}\right) t \le 1$$

where $t$ is the minimum acceptable tolerance.
The above condition is imposed so that entry of the variable does not reduce the
tolerance of variables already in the model to unacceptable levels.
The F-to-enter value for $X_k$ is computed as

$$F\text{-to-enter}_k = \frac{\left(C - p^* - 1\right) V_k}{r_{yy} - V_k}$$

where

$$V_k = \frac{r_{yk}\, r_{ky}}{r_{kk}}$$

The F-to-remove value for $X_k$ is computed as

$$F\text{-to-remove}_k = \frac{\left(C - p^*\right) V_k}{r_{yy}}$$
Stepwise
If there are independent variables currently entered in the model, choose $X_k$ such that $F\text{-to-remove}_k$ is minimum. $X_k$ is removed if $F\text{-to-remove}_k < F_{out}$ (default = 2.71) or, if probability criteria are used, $P\left(F\text{-to-remove}_k\right) > P_{out}$ (default = 0.1). If the inequality does not hold, no variable is removed from the model.

If there are no independent variables currently entered in the model, or if no entered variable is to be removed, choose $X_k$ such that $F\text{-to-enter}_k$ is maximum. $X_k$ is entered if $F\text{-to-enter}_k > F_{in}$ (default = 3.84) or, if probability criteria are used, $P\left(F\text{-to-enter}_k\right) < P_{in}$ (default = 0.05). If the inequality does not hold, no variable is entered.
At each step, all eligible variables are considered for removal and entry.
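Combining the sweep operation with these criteria gives the skeleton below. It is a simplified sketch, not the full procedure: the names are illustrative, an intercept is assumed (so $p^*$ is the number of entered variables plus one), the probability criteria and the tolerance check are omitted, and the F thresholds are the defaults quoted above.

import numpy as np

def _sweep(R, k):
    # Same operation as the sweep sketch given earlier.
    S = R.copy()
    n, pivot = R.shape[0], R[k, k]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                S[i, j] = 1.0 / pivot
            elif i == k:
                S[i, j] = -R[i, j] / pivot
            elif j == k:
                S[i, j] = R[i, j] / pivot
            else:
                S[i, j] = R[i, j] - R[i, k] * R[k, j] / pivot
    return S

def stepwise(R, C, F_in=3.84, F_out=2.71, max_steps=100):
    # Stepwise entry/removal on the augmented correlation matrix R
    # (independent variables first, dependent variable last); C is the
    # sum of caseweights.  An intercept is assumed, so p* = |model| + 1.
    y = R.shape[0] - 1
    in_model = set()
    R = np.array(R, dtype=float)
    for _ in range(max_steps):
        p_star = len(in_model) + 1
        changed = False
        if in_model:
            # F-to-remove = (C - p*) V_k / r_yy; the absolute value guards
            # against the sign of the swept off-diagonal elements.
            f_rem = {k: (C - p_star) * abs(R[y, k] * R[k, y] / R[k, k]) / R[y, y]
                     for k in in_model}
            k = min(f_rem, key=f_rem.get)
            if f_rem[k] < F_out:
                R, changed = _sweep(R, k), True
                in_model.discard(k)
        if not changed:
            f_ent = {}
            for k in range(y):
                if k not in in_model:
                    V = R[y, k] * R[k, y] / R[k, k]
                    f_ent[k] = (C - p_star - 1) * V / (R[y, y] - V)
            if f_ent:
                k = max(f_ent, key=f_ent.get)
                if f_ent[k] > F_in:
                    R, changed = _sweep(R, k), True
                    in_model.add(k)
        if not changed:
            break
    return in_model, R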
Forward
This procedure is the entry phase of the stepwise procedure.
Backward
This procedure is the removal phase of the stepwise procedure and can be used only
after at least one independent variable has been entered in the model.
Statistics
Summary
For the summary statistics, assume p independent variables are currently entered in
the equation, of which a block of q variables have been entered or removed in the
current step.
Multiple R
$$R = \sqrt{1 - r_{yy}}$$
R Square
$$R^2 = 1 - r_{yy}$$
Adjusted R Square
$$R^2_{adj} = R^2 - \frac{\left(1 - R^2\right) p}{C - p^*}$$
R Square Change

$$\Delta R^2 = R^2_{current} - R^2_{previous}$$
F Change

$$\Delta F = \begin{cases}
\dfrac{\Delta R^2 \left(C - p^*\right)}{q\left(1 - R^2_{current}\right)} & \text{for the addition of } q \text{ independent variables} \\[2ex]
\dfrac{\Delta R^2 \left(C - p^* - q\right)}{q\left(R^2_{previous} - 1\right)} & \text{for the removal of } q \text{ independent variables}
\end{cases}$$

The degrees of freedom for the addition are $q$ and $C - p^*$, while the degrees of freedom for the removal are $q$ and $C - p^* - q$.
The residual and regression sums of squares are

$$SS_e = r_{yy}\left(C - 1\right) S_{yy}$$

and

$$SS_R = R^2\left(C - 1\right) S_{yy}$$
ANOVA Table

Source       df        Sum of Squares   Mean Square
Regression   p         SS_R             SS_R / p
Residual     C − p*    SS_e             SS_e / (C − p*)
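The entries above follow directly from $r_{yy}$; a brief sketch (names are illustrative, an intercept is assumed, and the final F ratio is the usual mean-square ratio from the table):

import numpy as np

def summary_statistics(r_yy, C, p, S_yy):
    # Summary and ANOVA quantities from the swept element r_yy.
    # An intercept is assumed, so p* = p + 1.
    p_star = p + 1
    R2 = 1.0 - r_yy
    SS_R = R2 * (C - 1) * S_yy
    SS_e = r_yy * (C - 1) * S_yy
    return {
        "R": np.sqrt(R2),
        "R squared": R2,
        "adjusted R squared": R2 - (1.0 - R2) * p / (C - p_star),
        "SS regression": SS_R,
        "SS residual": SS_e,
        "F": (SS_R / p) / (SS_e / (C - p_star)),
    }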
Variance-Covariance Matrix of Regression Coefficient Estimates

A square matrix of size $p$ with the diagonal elements equal to the variances, the elements below the diagonal equal to the covariances, and the elements above the diagonal equal to the correlations:

$$\mathrm{var}\left(b_k\right) = \frac{r_{kk}\, r_{yy}\, S_{yy}}{S_{kk}\left(C - p^*\right)}$$

$$\mathrm{cov}\left(b_k, b_j\right) = \frac{r_{kj}\, r_{yy}\, S_{yy}}{\sqrt{S_{kk}\, S_{jj}}\left(C - p^*\right)}$$

$$\mathrm{cor}\left(b_k, b_j\right) = \frac{r_{kj}}{\sqrt{r_{kk}\, r_{jj}}}$$
Selection Criteria
Akaike Information Criterion (AIC)

$$AIC = C \ln\!\left(\frac{SS_e}{C}\right) + 2 p^*$$
Amemiya's Prediction Criterion (PC)

$$PC = \frac{\left(1 - R^2\right)\left(C + p^*\right)}{C - p^*}$$
Mallow’s Cp (CP)
$$CP = \frac{SS_e}{\hat{\sigma}^2} + 2 p^* - C$$

where $\hat{\sigma}^2$ is the mean square error from fitting the model that includes all the variables in the variable list.
Schwarz Bayesian Criterion (SBC)

$$SBC = C \ln\!\left(\frac{SS_e}{C}\right) + p^* \ln\left(C\right)$$
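These criteria can all be evaluated from $SS_e$; the helper below is an illustrative sketch, with sigma2_full standing for the mean square error $\hat{\sigma}^2$ of the model containing all candidate variables (needed only for CP).

import numpy as np

def selection_criteria(SS_e, C, p_star, R2, sigma2_full):
    # AIC, Amemiya's prediction criterion, Mallow's Cp and SBC as above.
    return {
        "AIC": C * np.log(SS_e / C) + 2 * p_star,
        "PC":  (1.0 - R2) * (C + p_star) / (C - p_star),
        "CP":  SS_e / sigma2_full + 2 * p_star - C,
        "SBC": C * np.log(SS_e / C) + p_star * np.log(C),
    }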
Collinearity
Variance Inflation Factors

$$VIF_i = \frac{1}{r_{ii}}$$

Tolerance

$$\mathrm{Tolerance}_i = r_{ii}$$
Eigenvalues, $\lambda_k$

The eigenvalues of the scaled and uncentered cross-product matrix for the independent variables in the equation are computed by the QL method (Wilkinson and Reinsch, 1971).
Condition Indices
$$\eta_k = \sqrt{\frac{\max_j \lambda_j}{\lambda_k}}$$
Variance-Decomposition Proportions
Let

$$\mathbf{v}_i = \left(v_{i1}, \ldots, v_{ip}\right)$$

be the eigenvector associated with eigenvalue $\lambda_i$. Also, let

$$\Phi_{ij} = \frac{v_{ij}^2}{\lambda_i} \quad\text{and}\quad \Phi_j = \sum_{i=1}^{p} \Phi_{ij}$$

Then

$$\pi_{ij} = \frac{\Phi_{ij}}{\Phi_j}$$
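A sketch of these diagnostics in Python: np.linalg.eigh stands in for the QL routine, the design matrix is assumed to already contain a constant column when an intercept is fitted, and the weights default to one.

import numpy as np

def collinearity_diagnostics(X, w=None):
    # Condition indices and variance-decomposition proportions for the
    # columns of the design matrix X.
    w = np.ones(len(X)) if w is None else np.asarray(w, dtype=float)
    Z = X * np.sqrt(w)[:, None]
    Z = Z / np.linalg.norm(Z, axis=0)        # scaled, uncentered columns
    lam, V = np.linalg.eigh(Z.T @ Z)         # eigenvalues and eigenvectors
    eta = np.sqrt(lam.max() / lam)           # condition indices
    Phi = (V.T ** 2) / lam[:, None]          # Phi_ij = v_ij**2 / lambda_i
    pi = Phi / Phi.sum(axis=0)               # pi_ij = Phi_ij / Phi_j
    return lam, eta, pi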
Regression Coefficient bk
$$b_k = \frac{r_{yk}\sqrt{S_{yy}}}{\sqrt{S_{kk}}} \qquad \text{for } k = 1, \ldots, p$$
The standard error of $b_k$ is

$$\hat{\sigma}_{b_k} = \sqrt{\frac{r_{kk}\, r_{yy}\, S_{yy}}{S_{kk}\left(C - p^*\right)}}$$

and the 95% confidence interval for the coefficient is

$$b_k \pm \hat{\sigma}_{b_k}\, t_{0.025,\, C - p^*}$$
If the model includes an intercept,

$$b_0 = \bar{Y} - \sum_{k=1}^{p} b_k \bar{X}_k$$

with estimated variance

$$\hat{\sigma}^2_{b_0} = \frac{\left(C - 1\right) r_{yy}\, S_{yy}}{C\left(C - p^*\right)} + \sum_{k=1}^{p} \bar{X}_k^2\, \hat{\sigma}^2_{b_k} + 2 \sum_{j=1}^{p-1} \sum_{k=j+1}^{p} \bar{X}_k \bar{X}_j\ \text{est. cov}\left(b_k, b_j\right)$$
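A sketch that assembles the unstandardized coefficients, standard errors and confidence limits from these quantities. The argument names are illustrative: beta holds the standardized coefficients $r_{yk}$, r_kk the swept diagonal elements, S the variances $S_{kk}$, and an intercept is assumed.

import numpy as np
from scipy import stats

def coefficients(beta, r_kk, r_yy, S, S_yy, x_bar, y_bar, C):
    p = len(beta)
    p_star = p + 1
    b = beta * np.sqrt(S_yy / S)                              # b_k
    se = np.sqrt(r_kk * r_yy * S_yy / (S * (C - p_star)))     # sigma_hat(b_k)
    t = stats.t.ppf(0.975, C - p_star)
    ci = np.column_stack([b - t * se, b + t * se])            # 95% interval
    b0 = y_bar - b @ x_bar                                    # intercept
    return b, se, ci, b0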
Beta Coefficients
$$\mathrm{Beta}_k = r_{yk}$$

with standard error

$$\hat{\sigma}_{\mathrm{Beta}_k} = \sqrt{\frac{r_{yy}\, r_{kk}}{C - p^*}}$$
The F test for $\mathrm{Beta}_k$ is

$$F = \frac{\mathrm{Beta}_k^2}{\hat{\sigma}^2_{\mathrm{Beta}_k}}$$

and the partial correlation of $X_k$, controlling for the other variables in the equation, is

$$\mathrm{Partial\text{-}Corr}\left(X_k\right) = \frac{r_{yk}}{\sqrt{r_{kk}\, r_{yy} - r_{yk}\, r_{ky}}}$$
Standardized regression coefficient $\mathrm{Beta}_k^*$ if $X_k$ enters the equation at the next step

$$\mathrm{Beta}_k^* = \frac{r_{yk}}{r_{kk}}$$
$$F = \frac{\left(C - p^* - 1\right) r_{yk}^2}{r_{kk}\, r_{yy} - r_{yk}^2}$$
$$\mathrm{Partial}\left(X_k\right) = \frac{r_{yk}}{\sqrt{r_{yy}\, r_{kk}}}$$
Tolerance of Xk
$$\mathrm{Tolerance}_k = r_{kk}$$
Minimum tolerance among variables already in the equation if Xk enters at the next step is
$$\min\left(\min_{1 \le j \le p} \frac{1}{r_{jj} - r_{kj}\, r_{jk} / r_{kk}},\ r_{kk}\right)$$
The leverage for case $i$ is

$$h_i = \begin{cases}
\dfrac{g_i}{C - 1} \displaystyle\sum_{j=1}^{p} \sum_{k=1}^{p} \frac{\left(X_{ji} - \bar{X}_j\right)\left(X_{ki} - \bar{X}_k\right)}{\sqrt{S_{jj}\, S_{kk}}}\, r_{jk} & \text{if intercept is included} \\[3ex]
\dfrac{g_i}{C - 1} \displaystyle\sum_{j=1}^{p} \sum_{k=1}^{p} \frac{X_{ji}\, X_{ki}}{\sqrt{S_{jj}\, S_{kk}}}\, r_{jk} & \text{otherwise}
\end{cases}$$
For selected cases, leverage is $h_i$; for unselected case $i$ with positive caseweight, leverage is

$$h_i' = \begin{cases}
\dfrac{g_i}{W} + \left(1 + h_i\right)\left[1 + \dfrac{1 + h_i}{W} - \dfrac{1}{W + 1}\right] & \text{if intercept is included} \\[2ex]
h_i\left(1 + h_i\, g_i\right) & \text{otherwise}
\end{cases}$$
Predicted Values

$$\hat{Y}_i = \begin{cases}
\displaystyle\sum_{k=1}^{p} b_k X_{ki} & \text{if no intercept} \\[2ex]
b_0 + \displaystyle\sum_{k=1}^{p} b_k X_{ki} & \text{otherwise}
\end{cases}$$
Unstandardized Residuals
$$e_i = Y_i - \hat{Y}_i$$
Standardized Residuals
$$ZRESID_i = \begin{cases}
\dfrac{e_i}{s} & \text{if no regression weight is specified} \\[1ex]
\text{SYSMIS} & \text{otherwise}
\end{cases}$$
Standardized Predicted Values

$$ZPRED_i = \begin{cases}
\dfrac{\hat{Y}_i - \bar{Y}}{sd} & \text{if no regression weight is specified} \\[1ex]
\text{SYSMIS} & \text{otherwise}
\end{cases}$$

where $sd$ is computed as

$$sd = \sqrt{\sum_{i=1}^{l} \frac{c_i\left(\hat{Y}_i - \bar{Y}\right)^2}{C - 1}}$$
Studentized Residuals
$$SRES_i = \begin{cases}
\dfrac{e_i}{s\sqrt{\left(1 - \tilde{h}_i\right)/g_i}} & \text{for selected cases with } c_i > 0 \\[2ex]
\dfrac{e_i}{s\sqrt{\left(1 + \tilde{h}_i\right)/g_i}} & \text{otherwise}
\end{cases}$$
Deleted Residuals
For selected cases with $c_i > 0$, the deleted residual is

$$DRESID_i = \frac{e_i}{1 - \tilde{h}_i}$$

Studentized Deleted Residuals

$$SDRESID_i = \begin{cases}
\dfrac{DRESID_i}{s(i)} & \text{for selected cases with } c_i > 0 \\[2ex]
\dfrac{e_i}{s\sqrt{\left(1 + \tilde{h}_i\right)/g_i}} & \text{otherwise}
\end{cases}$$
where $s(i)$ is computed as

$$s(i) = \sqrt{\frac{\left(C - p^*\right)s^2 - \left(1 - \tilde{h}_i\right)DRESID_i^2}{C - p^* - 1}}$$
Adjusted Predicted Values

$$ADJPRED_i = Y_i - DRESID_i$$
DfBeta
$$DFBETA_i = b - b(i) = \frac{g_i\, e_i \left(X'WX\right)^{-1} X_i^t}{1 - \tilde{h}_i}$$

where

$$X_i^t = \left(1, x_{1i}, \ldots, x_{pi}\right)$$

and $W = \mathrm{diag}\left(w_1, \ldots, w_l\right)$.
Standardized DfBeta

$$SDBETA_{ij} = \frac{b_j - b_j(i)}{s(i)\sqrt{\left[\left(X'WX\right)^{-1}\right]_{jj}}}$$

where $b_j - b_j(i)$ is the $j$th component of $b - b(i)$.
DfFit
$$DFFIT_i = X_i\left(b - b(i)\right) = \frac{\tilde{h}_i\, e_i}{1 - \tilde{h}_i}$$
Standardized DfFit
$$SDFIT_i = \frac{DFFIT_i}{s(i)\sqrt{\tilde{h}_i}}$$
Covratio
$$COVRATIO_i = \left(\frac{s(i)}{s}\right)^{2p^*} \times \frac{1}{1 - \tilde{h}_i}$$
Mahalanobis Distance
$$MAHAL_i = \begin{cases}
\left(C - 1\right) h_i & \text{if intercept is included} \\
C\, h_i & \text{otherwise}
\end{cases}$$
Cook's Distance (Cook, 1977)

$$COOK_i = \begin{cases}
\dfrac{DRESID_i^2\, \tilde{h}_i\, g_i}{s^2\left(p + 1\right)} & \text{if intercept is included} \\[2ex]
\dfrac{DRESID_i^2\, h_i\, g_i}{s^2\, p} & \text{otherwise}
\end{cases}$$
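For the common unweighted case ($c_i = g_i = 1$) the case diagnostics above reduce to the familiar hat-matrix expressions; the sketch below covers only that case, and the helper name is illustrative.

import numpy as np

def case_diagnostics(X, y):
    # Leverage and influence measures for an unweighted fit with intercept.
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    p_star = p + 1
    XtX_inv = np.linalg.inv(Z.T @ Z)
    b = XtX_inv @ Z.T @ y
    e = y - Z @ b                                       # unstandardized residuals
    h = np.einsum('ij,jk,ik->i', Z, XtX_inv, Z)         # uncentered leverage h~_i
    s2 = e @ e / (n - p_star)
    dresid = e / (1.0 - h)                              # deleted residuals
    s_i = np.sqrt(((n - p_star) * s2 - e**2 / (1.0 - h)) / (n - p_star - 1))
    return {
        "h_tilde": h,
        "SRES": e / np.sqrt(s2 * (1.0 - h)),            # studentized residuals
        "DRESID": dresid,
        "SDRESID": dresid / s_i,
        "DFFIT": h * e / (1.0 - h),
        "COOK": dresid**2 * h / (s2 * p_star),          # Cook's distance
        "COVRATIO": (s_i**2 / s2) ** p_star / (1.0 - h),
    }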
The standard error of the mean predicted value is

$$SEPRED_i = \begin{cases}
s\sqrt{\tilde{h}_i / g_i} & \text{if intercept is included} \\[1ex]
s\sqrt{h_i / g_i} & \text{otherwise}
\end{cases}$$
The lower and upper bounds of the 95% prediction interval for an individual observation are

$$LICIN_i = \begin{cases}
\hat{Y}_i - t_{0.025,\, C - p^*}\, s\sqrt{\left(\tilde{h}_i + 1\right)/g_i} & \text{if intercept is included} \\[1ex]
\hat{Y}_i - t_{0.025,\, C - p}\, s\sqrt{\left(h_i + 1\right)/g_i} & \text{otherwise}
\end{cases}$$

$$UICIN_i = \begin{cases}
\hat{Y}_i + t_{0.025,\, C - p^*}\, s\sqrt{\left(\tilde{h}_i + 1\right)/g_i} & \text{if intercept is included} \\[1ex]
\hat{Y}_i + t_{0.025,\, C - p}\, s\sqrt{\left(h_i + 1\right)/g_i} & \text{otherwise}
\end{cases}$$
Durbin-Watson Statistic
$$DW = \frac{\displaystyle\sum_{i=2}^{l}\left(\tilde{e}_i - \tilde{e}_{i-1}\right)^2}{\displaystyle\sum_{i=1}^{l} c_i\, \tilde{e}_i^2}$$

where $\tilde{e}_i = e_i \sqrt{g_i}$.
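A direct transcription of the statistic, with both weight vectors defaulting to one (in which case $\tilde{e}_i = e_i$); the names are illustrative.

import numpy as np

def durbin_watson(e, c=None, g=None):
    # Durbin-Watson statistic as defined above.
    c = np.ones_like(e) if c is None else np.asarray(c, dtype=float)
    g = np.ones_like(e) if g is None else np.asarray(g, dtype=float)
    et = e * np.sqrt(g)
    return np.sum(np.diff(et) ** 2) / np.sum(c * et ** 2)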
Missing Values
By default, a case that has a missing value for any variable is deleted from the computation of the correlation matrix on which all subsequent computations are based. Users are allowed to change the treatment of cases with missing values.
References
Cook, R. D. 1977. Detection of influential observations in linear regression,
Technometrics, 19: 15–18.
Wilkinson, J. H., and Reinsch, C. 1971. Linear algebra. In: Handbook for
Automatic Computation, Volume II, J. H. Wilkinson and C. Reinsch, eds. New
York: Springer-Verlag.