Lecture Notes 4
Assa Mulagha-Maganga
Dept of Agricultural and Applied Economics, LUANAR
Department of Mathematical Sciences (Statistics), Chancellor College
Summer 2022
4.1 Introduction
In the previous section, we looked at measures that express the strength and direction of the relationship between two variables. In this section, we develop an equation that expresses the linear (straight-line) relationship between two variables. In addition, we want to be able to estimate the value of the dependent variable $Y$ based on a selected value of the independent variable $X$. The technique used to develop the equation and provide the estimates is called regression analysis. Regression analysis is thus a statistical technique that utilizes the relationship between two or more quantitative variables. The methodology is widely used in business, the social, medical, and biological sciences, and many other disciplines. It is used for explaining or modeling the relationship between a single variable $Y$, called the response, output, or dependent variable, and one or more predictors, explanatory, or independent variables denoted by $X_1, X_2, \ldots, X_p$. When $p = 1$, the model is called a simple regression model; when $p > 1$, it is called a multiple regression model. Regression modeling has several objectives, including explaining the relationship between the response and the predictors and predicting the response for given values of the predictors.
A simple linear regression is a regression model with one explanatory variable whose regression function is linear in the parameters. The simple linear regression model is expressed as
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
where
$y_i$ is the response variable at the $i$-th trial;
$\beta_0$ is the intercept, the regression parameter giving the value of $y$ when $x = 0$;
$\beta_1$ is the slope, the regression coefficient giving the change in $y$ when $x$ changes by one unit;
$x_i$ is the explanatory variable at the $i$-th trial;
$\epsilon_i$ is the error term, assumed to be normally distributed with mean zero and variance $\sigma^2$, i.e. $\epsilon_i \sim N(0, \sigma^2)$.

In summary,
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2), \qquad i = 1, 2, \ldots, n.$$
Taking the expectation and variance of both sides gives $E(y_i) = \beta_0 + \beta_1 x_i$ and $\operatorname{Var}(y_i) = \sigma^2$. Therefore,
$$y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2).$$
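To make these assumptions concrete, here is a minimal Python simulation sketch (an illustration added to these notes; the values $\beta_0 = 2$, $\beta_1 = 0.5$, $\sigma = 1$ are arbitrary) that generates data from this normal model:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Arbitrary illustrative parameter values
beta0, beta1, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0, 10, size=n)       # explanatory variable values
eps = rng.normal(0, sigma, size=n)   # errors: eps_i ~ N(0, sigma^2)
y = beta0 + beta1 * x + eps          # hence y_i ~ N(beta0 + beta1 * x_i, sigma^2)
```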
The ordinary least-squares (OLS) method is a technique for fitting the "best" straight line to the sample of $(X_i, Y_i)$ observations, $i = 1, 2, \ldots, n$. For each observation, the method of least squares considers the deviation of $Y_i$ from its expected value,
$$\epsilon_i = y_i - (\beta_0 + \beta_1 x_i).$$
It involves minimizing the sum of the squared (vertical) deviations of the points from the line:
$$Q = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
Differentiating $Q$ with respect to each parameter and setting the derivatives to zero:
$$\frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0$$
$$\frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)\, x_i = 0$$
Dividing through by $-2$ gives
$$\sum_{i=1}^{n} y_i - n\beta_0 - \beta_1 \sum_{i=1}^{n} x_i = 0$$
$$\sum_{i=1}^{n} x_i y_i - \beta_0 \sum_{i=1}^{n} x_i - \beta_1 \sum_{i=1}^{n} x_i^2 = 0$$
which rearrange into the normal equations
$$\sum_{i=1}^{n} y_i = n\beta_0 + \beta_1 \sum_{i=1}^{n} x_i \qquad (1)$$
$$\sum_{i=1}^{n} x_i y_i = \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 \qquad (2)$$
Solving Eqs. (1) and (2) simultaneously using Cramer's rule yields
$$\hat{\beta}_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
which can equivalently be written as
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\operatorname{cov}(x_i, y_i)}{\sigma_x^2}.$$
In a similar manner, it can be shown that the value of $\hat{\beta}_0$ is then given by
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
so the fitted regression line is
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i.$$
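These closed-form estimators translate directly into code. Below is a minimal Python sketch (the function name `ols_fit` is an illustrative choice, not from the notes) computing $\hat{\beta}_0$ and $\hat{\beta}_1$ from any paired sample:

```python
import numpy as np

def ols_fit(x, y):
    """Closed-form OLS estimates for simple linear regression."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    # beta1_hat = sum (x_i - xbar)(y_i - ybar) / sum (x_i - xbar)^2
    b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    # beta0_hat = ybar - beta1_hat * xbar
    b0 = ybar - b1 * xbar
    return b0, b1

# Example with the simulated data from the earlier sketch:
# b0_hat, b1_hat = ols_fit(x, y)   # should recover beta0 ~ 2 and beta1 ~ 0.5
```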
The $i$-th residual, denoted $\epsilon_i$, is the difference between the observed value $y_i$ and the fitted value $\hat{y}_i$; that is, $\epsilon_i = y_i - \hat{y}_i$. The residuals have the following properties (checked numerically in the sketch after the variance estimator below):
1. $\sum_{i=1}^{n} \epsilon_i = 0$
2. $\sum_{i=1}^{n} \epsilon_i x_i = 0$
3. $\sum_{i=1}^{n} \epsilon_i \hat{y}_i = 0$
For the estimation of $\sigma^2$, recall that for a sample $y_1, y_2, \ldots, y_n$ the sample variance is given by
$$\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n - 1}.$$
Analogously, the error variance is estimated from the residuals:
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2}{n - 2} = \frac{\sum_{i=1}^{n}\epsilon_i^2}{n - 2}$$
We use $n - 2$ since we have lost 2 degrees of freedom in estimating $\beta_0$ and $\beta_1$.
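Both the residual properties listed above and the $n - 2$ divisor can be verified numerically. A minimal sketch (the helper name and tolerances are illustrative choices):

```python
import numpy as np

def ols_residual_summary(x, y):
    """Fit by OLS, check the three residual properties, and estimate sigma^2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    yhat = b0 + b1 * x
    e = y - yhat                                # residuals e_i = y_i - yhat_i
    # The three properties hold up to floating-point rounding:
    assert abs(np.sum(e)) < 1e-6                # (1) sum of e_i is zero
    assert abs(np.sum(e * x)) < 1e-6            # (2) sum of e_i * x_i is zero
    assert abs(np.sum(e * yhat)) < 1e-6         # (3) sum of e_i * yhat_i is zero
    sigma2_hat = np.sum(e ** 2) / (len(y) - 2)  # divide by n - 2: two df lost
    return sigma2_hat
```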
When the functional form of the probability distribution of the error terms is specified, the estimators of $\beta_0$, $\beta_1$, and $\sigma^2$ can be obtained by the method of maximum likelihood. The likelihood function of the $n$ observations $y_1, y_2, \ldots, y_n$ is the product of the individual densities. Recall from the previous section that in simple linear regression each $y_i$ is distributed as $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$. This implies that
$$f(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right\}$$
Since the $y_i$ are independent, the likelihood function, denoted $L(\beta_0, \beta_1, \sigma^2)$, is just the product of these individual densities, i.e.
$$L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} f(y_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right\} = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2\right\}$$
The values of $\beta_0$, $\beta_1$, and $\sigma^2$ that maximize this likelihood function are the maximum likelihood estimators (MLEs), denoted $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}^2$. In practice, it is often easier to maximize the log-likelihood rather than the likelihood function itself.
In this case the log-likelihood is given by
$$l = \ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$$
Differentiating with respect to each parameter,
$$\frac{\partial l}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)$$
$$\frac{\partial l}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)\, x_i$$
$$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$$
Setting each derivative equal to zero yields the system
$$\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0$$
$$\sum_{i=1}^{n}(y_i x_i - \beta_0 x_i - \beta_1 x_i^2) = 0$$
$$\frac{\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2}{n} = \sigma^2$$
As an exercise, show that solving this system of equations yields the estimators
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\operatorname{cov}(x_i, y_i)}{\sigma_x^2}$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2}{n} = \frac{\sum_{i=1}^{n}\epsilon_i^2}{n}$$
Note that the MLEs of $\beta_0$ and $\beta_1$ coincide with the OLS estimators, while the MLE of $\sigma^2$ divides by $n$ rather than $n - 2$.
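As a numerical cross-check (a sketch added to these notes, not part of the derivation), the log-likelihood can be maximized directly with `scipy`, confirming that the MLEs of $\beta_0$ and $\beta_1$ coincide with OLS while $\hat{\sigma}^2$ uses the divisor $n$:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, x, y):
    """Negative log-likelihood of the simple linear regression model."""
    b0, b1, log_sigma2 = params        # optimize log(sigma^2) so sigma^2 stays positive
    sigma2 = np.exp(log_sigma2)
    resid = y - b0 - b1 * x
    n = len(y)
    ll = (-0.5 * n * np.log(2 * np.pi)
          - 0.5 * n * np.log(sigma2)
          - np.sum(resid ** 2) / (2 * sigma2))
    return -ll

# Usage sketch (x and y are any paired sample, e.g. the simulated data above):
# res = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0], args=(x, y))
# b0_mle, b1_mle, sigma2_mle = res.x[0], res.x[1], np.exp(res.x[2])
# b0_mle and b1_mle match OLS; sigma2_mle is approximately sum(e^2) / n.
```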
Example

The table below gives the kilograms of popcorn per acre, $Y$, resulting from the use of various amounts of fertilizer in kilograms per acre, $X$, produced on a farm in each of the 10 years from 2007 to 2016. These are plotted in the scatter diagram of Fig. 4-1, where the relationship between $X$ and $Y$ is approximately linear (i.e., the points fall on or near a straight line).
Table 4.1: Popcorn output ($Y_i$) and fertilizer use ($X_i$)

Year    n    Yi    Xi
2007    1    40     6
2008    2    44    10
2009    3    46    12
2010    4    48    14
2011    5    52    16
2012    6    58    18
2013    7    60    22
2014    8    68    24
2015    9    74    26
2016   10    80    32
Solution

Using the OLS formulas with the popcorn-fertilizer data in Table 4.1 ($\bar{x} = 18$, $\bar{y} = 57$, $\sum(x_i - \bar{x})(y_i - \bar{y}) = 956$, $\sum(x_i - \bar{x})^2 = 576$),
$$\hat{b}_1 = \frac{956}{576} \approx 1.66, \qquad \hat{b}_0 = 57 - 1.66(18) = 27.12,$$
so the estimated regression equation is $\hat{y}_i = 27.12 + 1.66 x_i$. To test the statistical significance of the parameter estimates, we use
$$s_{\hat{b}_0}^2 = \frac{\sum e_i^2}{n - k} \cdot \frac{\sum x^2}{n\sum(x - \bar{x})^2}$$
$$s_{\hat{b}_1}^2 = \frac{\sum e_i^2}{n - k} \cdot \frac{1}{\sum(x - \bar{x})^2}$$
where $k$ is the number of parameters estimated (here $k = 2$).
The square roots $s_{\hat{b}_0}$ and $s_{\hat{b}_1}$ are the standard errors of the estimates. Since $u_i$ is normally distributed, $Y_i$, and therefore $\hat{b}_0$ and $\hat{b}_1$, are also normally distributed, so we can use the $t$ distribution with $n - k$ degrees of freedom to test hypotheses about, and construct confidence intervals for, $\hat{b}_0$ and $\hat{b}_1$. Table 4.3 (an extension of Table 4.2) shows the calculations required to test the statistical significance of $\hat{b}_0$ and $\hat{b}_1$. The values of $\hat{y}_i$ in Table 4.3 are obtained by substituting the values of $X_i$ into the estimated regression equation found above.
Table 4.3: Popcorn-Fertilizer Calculations to Test the Significance of the Parameters

 n    Yi    Xi    ŷi      ei      ei²      Xi²   (x-x̄)²   (y-ȳ)
 1    40     6   37.08    2.92    8.526     36     144     -17
 2    44    10   43.72    0.28    0.078    100      64     -13
 3    46    12   47.04   -1.04    1.082    144      36     -11
 4    48    14   50.36   -2.36    5.570    196      16      -9
 5    52    16   53.68   -1.68    2.822    256       4      -5
 6    58    18   57.00    1.00    1.000    324       0       1
 7    60    22   63.64   -3.64   13.250    484      16       3
 8    68    24   66.96    1.04    1.082    576      36      11
 9    74    26   70.28    3.72   13.838    676      64      17
10    80    32   80.24   -0.24    0.058   1024     196      23

Sums ($n = 10$): $\sum y = 570$ ($\bar{y} = 57$), $\sum x = 180$ ($\bar{x} = 18$), $\sum e = 0$, $\sum e^2 = 47.3056$, $\sum x^2 = 3816$, $\sum(x - \bar{x})^2 = 576$, $\sum(y - \bar{y}) = 0$. For the computation of $R^2$ below, $\sum(y - \bar{y})^2 = 1634$.
From Table 4.3, $s^2 = \sum e_i^2/(n - k) = 47.3056/8 \approx 5.91$, so
$$s_{\hat{b}_0}^2 = 5.91 \times \frac{3816}{10(576)} \approx 3.92, \qquad s_{\hat{b}_0} \approx 1.98, \qquad t_0 = \frac{27.12}{1.98} \approx 13.7$$
$$s_{\hat{b}_1}^2 = \frac{5.91}{576} \approx 0.0103, \qquad s_{\hat{b}_1} \approx 0.10, \qquad t_1 = \frac{1.66}{0.10} \approx 16.6$$
Since both $t_0$ and $t_1$ exceed $t = 2.306$ with 8 degrees of freedom at the 5% level of significance (from $t$-distribution tables), we conclude that both $b_0$ and $b_1$ are statistically significant at the 5% level.
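These calculations can be reproduced in a few lines of Python. The sketch below recomputes the estimates, standard errors, and $t$ statistics from the raw data (small differences from the figures above arise only from rounding):

```python
import numpy as np

# Popcorn-fertilizer data from Table 4.1
x = np.array([6, 10, 12, 14, 16, 18, 22, 24, 26, 32], dtype=float)   # fertilizer
y = np.array([40, 44, 46, 48, 52, 58, 60, 68, 74, 80], dtype=float)  # popcorn

n, k = len(y), 2
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # ~1.66
b0 = y.mean() - b1 * x.mean()                                               # ~27.12
e = y - (b0 + b1 * x)                                                       # residuals

s2 = np.sum(e ** 2) / (n - k)                                              # ~5.91
se_b0 = np.sqrt(s2 * np.sum(x ** 2) / (n * np.sum((x - x.mean()) ** 2)))   # ~1.98
se_b1 = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))                          # ~0.10

t0, t1 = b0 / se_b0, b1 / se_b1   # ~13.7 and ~16.4 (16.6 with rounded s.e.); both exceed 2.306
```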
The total variation in $Y$ can be decomposed as
$$\sum(y_i - \bar{y})^2 = \sum(\hat{y}_i - \bar{y})^2 + \sum(y_i - \hat{y}_i)^2$$
where
$\sum(\hat{y}_i - \bar{y})^2$ is the explained variation in $Y$, or the regression sum of squares (RSS);
$\sum(y_i - \hat{y}_i)^2$ is the residual (unexplained) variation in $Y$, or the error sum of squares (ESS);
and $\sum(y_i - \bar{y})^2$ is the total variation in $Y$, or the total sum of squares (TSS). Dividing through by TSS,
$$1 = \frac{RSS}{TSS} + \frac{ESS}{TSS}$$
The coefficient of determination, or $R^2$, is then defined as the proportion of the total variation in $Y$ "explained" by the regression of $Y$ on $X$:
$$R^2 = \frac{RSS}{TSS} = 1 - \frac{ESS}{TSS}$$
$R^2$ ranges in value from 0 (when the estimated regression equation explains none of the variation in $Y$) to 1 (when all points lie on the regression line). The coefficient of determination for the popcorn-fertilizer example can be found from Table 4.3:
$$R^2 = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2} = 1 - \frac{\sum e_i^2}{\sum(y_i - \bar{y})^2} = 1 - \frac{47.31}{1634} \approx 0.97 = 97\%$$
Thus, the regression equation explains about 97% of the total variation in popcorn output. The remaining 3% is attributed to factors captured by the error term.
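Continuing the numerical sketch from the significance test above (reusing its `e` and `y`), the coefficient of determination takes one line:

```python
# Coefficient of determination, continuing the popcorn-fertilizer sketch:
r_squared = 1 - np.sum(e ** 2) / np.sum((y - y.mean()) ** 2)   # ~0.971, about 97%
```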
We can organize the results of a simple linear regression analysis in an ANOVA table.

Source of Variation    DF       Sum of Squares    Mean Squares          F
Regression             k - 1    RSS               MSR = RSS/(k - 1)     MSR/MSE
Error                  n - k    ESS               MSE = ESS/(n - k)
Total                  n - 1    TSS
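The ANOVA entries for the popcorn-fertilizer example follow by continuing the same sketch; note that with a single predictor the $F$ statistic equals $t_1^2$:

```python
# ANOVA quantities, continuing the popcorn-fertilizer sketch:
tss = np.sum((y - y.mean()) ** 2)   # total sum of squares, ~1634
ess = np.sum(e ** 2)                # error sum of squares, ~47.31
rss = tss - ess                     # regression sum of squares, ~1586.7

msr = rss / (k - 1)                 # mean square for regression
mse = ess / (n - k)                 # mean square for error
F = msr / mse                       # ~268; equals t1**2 in simple regression
```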
Activity 9.1
The data in Table 4.4 report the aggregate consumption ($Y$, in billions of MK) and disposable income ($X$, also in billions of MK) for a developing economy for the 12 years from 2005 to 2016.
a) State the general relationship between consumption $Y$ and disposable income $X$ in exact linear form.
b) State the relationship in stochastic form.
c) Why would you expect most observed values of $Y$ not to fall exactly on a straight line?
d) Find the regression equation for the consumption schedule in Table 4.4.
Solution
a. The general exact linear relationship is
$$y_i = b_0 + b_1 x_i$$
where $i$ refers to each year in time-series analysis (as with the data in Table 4.4) or to each economic unit (such as a family) in cross-sectional analysis. $b_0$ and $b_1$ are unknown constants called parameters. Parameter $b_0$ is the constant or $Y$ intercept, while $b_1$ measures $\Delta Y / \Delta X$, which, in the context of income and consumption, refers to the marginal propensity to consume (MPC). The specific linear relationship corresponding to the general linear relationship is obtained by estimating the values of $b_0$ and $b_1$ (represented by $\hat{b}_0$ and $\hat{b}_1$ and read as "b sub zero hat" and "b sub one hat").
b. The exact linear relationship in part (a) can be made stochastic by adding a random disturbance or error term, $u_i$, giving
$$y_i = b_0 + b_1 x_i + u_i$$
c. Most observed values of $Y$ are not expected to fall precisely on a straight line because consumption is affected by factors other than disposable income; these omitted influences are captured by the random error term $u_i$.