OLS Derivation
1.1
The Ordinary Least Squares (OLS) technique involves finding parameter estimates by minimizing the sum of squared errors, or, what is the same thing, minimizing the sum of squared residuals (SSR), $\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ is the fitted value of $Y_i$ corresponding to a particular observation $X_i$. We minimize the SSR by taking the partial derivatives with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$, setting each equal to 0, and solving the resulting pair of simultaneous equations.
$$\frac{\partial}{\partial \hat{\beta}_0}\sum_{i=1}^{n}(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = -2\sum_{i=1}^{n}(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) \tag{1}$$

$$\frac{\partial}{\partial \hat{\beta}_1}\sum_{i=1}^{n}(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = -2\sum_{i=1}^{n}X_i(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) \tag{2}$$

Setting each partial derivative equal to zero gives

$$\sum_{i=1}^{n}(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \tag{3}$$

$$\sum_{i=1}^{n}X_i(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \tag{4}$$
Finally, rewriting eqns. 3 and 4, we obtain a pair of simultaneous equations (known as the normal equations):
$$\sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i \tag{5}$$

$$\sum_{i=1}^{n} X_i Y_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \tag{6}$$

Multiplying eqn. 5 by $\sum_{i=1}^{n} X_i$ and eqn. 6 by $n$ gives

$$\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = n\hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \Big(\sum_{i=1}^{n} X_i\Big)^2 \tag{7}$$

$$n\sum_{i=1}^{n} X_i Y_i = n\hat{\beta}_0 \sum_{i=1}^{n} X_i + n\hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \tag{8}$$

Subtracting eqn. 7 from eqn. 8 eliminates $\hat{\beta}_0$:

$$n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = \hat{\beta}_1 \Big[\, n\sum_{i=1}^{n} X_i^2 - \Big(\sum_{i=1}^{n} X_i\Big)^2 \Big] \tag{9}$$
Dividing eqn. 9 through by $n^2$ gives the OLS estimator of $\hat{\beta}_1$ corresponding to the text, i.e.

$$\hat{\beta}_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2} \tag{10}$$
Solving eqn. 5 for $\hat{\beta}_0$ then gives

$$\hat{\beta}_0 = \frac{1}{n}\sum_{i=1}^{n} Y_i - \hat{\beta}_1 \frac{1}{n}\sum_{i=1}^{n} X_i \tag{11}$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \tag{12}$$
Hence, the above two equations give us the OLS estimates of $\hat{\beta}_1$ and $\hat{\beta}_0$, respectively (note: when doing the calculations, find $\hat{\beta}_1$ first).
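As a quick numerical check of eqns. 10-12, here is a minimal sketch, assuming NumPy is available; the simulated data and variable names (x, y, b1_hat, b0_hat) are illustrative, not from the text:

```python
import numpy as np

# Illustrative data (not from the text): true beta0 = 2, beta1 = 3
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(size=100)

# Eqn. 10: beta1_hat = ((1/n) sum XiYi - Xbar*Ybar) / ((1/n) sum Xi^2 - Xbar^2)
b1_hat = (np.mean(x * y) - x.mean() * y.mean()) / (np.mean(x**2) - x.mean() ** 2)
# Eqns. 11-12: beta0_hat = Ybar - beta1_hat * Xbar (computed after beta1_hat)
b0_hat = y.mean() - b1_hat * x.mean()

# Cross-check against NumPy's least-squares solver
check = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
assert np.allclose([b0_hat, b1_hat], check)
print(b0_hat, b1_hat)
```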
1.2
The goal is to find parameter estimates by minimizing the sum of squared errors, as was done with the simple regression model above, i.e., minimize $\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$, where, say, $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$. We can do this by calculating the partial derivatives with respect to the three unknown parameters $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\beta}_2$, equating each to zero, and solving. The normal equations then become:
$$n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i} + \hat{\beta}_2 \sum_{i=1}^{n} X_{2i} = \sum_{i=1}^{n} Y_i$$

$$\hat{\beta}_0 \sum_{i=1}^{n} X_{1i} + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i}^2 + \hat{\beta}_2 \sum_{i=1}^{n} X_{1i} X_{2i} = \sum_{i=1}^{n} X_{1i} Y_i$$

$$\hat{\beta}_0 \sum_{i=1}^{n} X_{2i} + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i} X_{2i} + \hat{\beta}_2 \sum_{i=1}^{n} X_{2i}^2 = \sum_{i=1}^{n} X_{2i} Y_i$$
which can be easily solved using Cramer's rule or matrix algebra to find the formulas for the parameter estimates. An alternative approach is to begin by expressing all the data in the form of deviations from the sample means. The least-squares equation (for the three-variable regression model) is

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + e_i$$

Averaging over the sample observations gives

$$\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}_1 + \hat{\beta}_2 \bar{X}_2$$

which has no term in $e$, since $\bar{e}$ is zero. Now, subtracting the second equation from the first gives us the deviation form:
$$y_i = \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + e_i$$

where lowercase letters denote deviations from the sample means. Note the intercept $\hat{\beta}_0$ disappears from the deviation form of the equation, but it may be recovered from

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2 \tag{13}$$

So, to minimize
$$\mathrm{SSR} = \sum_{i=1}^{n}(y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i})^2$$

we take the partial derivatives with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$, set each equal to zero, and obtain

$$\sum_{i=1}^{n} x_{1i} y_i = \hat{\beta}_1 \sum_{i=1}^{n} x_{1i}^2 + \hat{\beta}_2 \sum_{i=1}^{n} x_{1i} x_{2i} \tag{14}$$

$$\sum_{i=1}^{n} x_{2i} y_i = \hat{\beta}_1 \sum_{i=1}^{n} x_{1i} x_{2i} + \hat{\beta}_2 \sum_{i=1}^{n} x_{2i}^2 \tag{15}$$
To solve this, we can multiply eqn. 14 by $\sum_{i=1}^{n} x_{2i}^2$, multiply eqn. 15 by $\sum_{i=1}^{n} x_{1i} x_{2i}$, and subtract the latter from the former to get
$$\sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} x_{2i}^2 - \sum_{i=1}^{n} x_{2i} y_i \sum_{i=1}^{n} x_{1i} x_{2i} = \hat{\beta}_1 \Big[\, \sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{2i}^2 - \Big(\sum_{i=1}^{n} x_{1i} x_{2i}\Big)^2 \Big]$$
or

$$\hat{\beta}_1 = \frac{\big(\sum_{i=1}^{n} x_{1i} y_i\big)\big(\sum_{i=1}^{n} x_{2i}^2\big) - \big(\sum_{i=1}^{n} x_{2i} y_i\big)\big(\sum_{i=1}^{n} x_{1i} x_{2i}\big)}{\big(\sum_{i=1}^{n} x_{1i}^2\big)\big(\sum_{i=1}^{n} x_{2i}^2\big) - \big(\sum_{i=1}^{n} x_{1i} x_{2i}\big)^2} \tag{16}$$

It follows that

$$\hat{\beta}_2 = \frac{\big(\sum_{i=1}^{n} x_{2i} y_i\big)\big(\sum_{i=1}^{n} x_{1i}^2\big) - \big(\sum_{i=1}^{n} x_{1i} y_i\big)\big(\sum_{i=1}^{n} x_{1i} x_{2i}\big)}{\big(\sum_{i=1}^{n} x_{1i}^2\big)\big(\sum_{i=1}^{n} x_{2i}^2\big) - \big(\sum_{i=1}^{n} x_{1i} x_{2i}\big)^2} \tag{17}$$
Hence, equations 16, 17, and 13 give us the OLS estimates of $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_0$, respectively (note: when doing the calculations, find $\hat{\beta}_0$ last).
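As with the simple-regression case, here is a minimal numerical sketch of eqns. 16, 17, and 13, assuming NumPy; the simulated data and names (X1, X2, b1, b2, b0) are illustrative only. It also cross-checks the result against a direct matrix-algebra solution of the normal equations, as mentioned above:

```python
import numpy as np

# Illustrative data (not from the text): true beta0 = 1, beta1 = 2, beta2 = -0.5
rng = np.random.default_rng(1)
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 0.5 * X2 + rng.normal(size=n)

# Deviations from the sample means (the lowercase letters in the text)
x1, x2, y = X1 - X1.mean(), X2 - X2.mean(), Y - Y.mean()

# Shared denominator of eqns. 16 and 17
den = np.sum(x1**2) * np.sum(x2**2) - np.sum(x1 * x2) ** 2
b1 = (np.sum(x1 * y) * np.sum(x2**2) - np.sum(x2 * y) * np.sum(x1 * x2)) / den  # eqn. 16
b2 = (np.sum(x2 * y) * np.sum(x1**2) - np.sum(x1 * y) * np.sum(x1 * x2)) / den  # eqn. 17
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()                                 # eqn. 13 (found last)

# Cross-check: solve the normal equations X'X beta = X'Y by matrix algebra
X = np.column_stack([np.ones(n), X1, X2])
assert np.allclose([b0, b1, b2], np.linalg.solve(X.T @ X, X.T @ Y))
print(b0, b1, b2)
```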