
Week 2

Ordinary Least Squares (OLS)

Applied Econometrics
ECO440/ECO640
Niagara University
Lecture Outline
• Goal of OLS Regression: Estimating Empirical Regression Equation
• Mechanics of OLS Estimator and Parameter Estimates in Univariate Regression
• OLS and Multivariate Regression: Intuition and Interpretation
• Importance of Fit: Decomposition of Variance and R²
From Theory to Empirics: The Value of OLS Regression
• The purpose of regression analysis is to take a theoretical equation like:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$   (2.1)

• and use data to create an estimated equation:

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$   (2.2)

• Big question of today's lecture:
– What empirical regression line should we use?
– Alternatively, how can we decide between several alternative versions of equation (2.2)?
Selecting an Empirical Regression Equation
• Ultimately, we want a regression equation that matches the form of our theoretical regression equation.
– A linear function relating schooling to wages
• However, this equation should also best "fit" the underlying relationship in the data.
– Based on our data, what should the value of $\hat{\beta}_1$ be?
– How does an additional year of schooling translate into additional wages?
Possible Empirical Regression Candidates
• Do any of these possible empirical regression equations seem to fit the data well?
• Consider the errors in prediction (residuals) between the data and the regression line ($e_i = Y_i - \hat{Y}_i$)
• $\hat{Y}_i$ = the level of wages that is predicted, given a specific level of education
• Consider the slope and what it implies for the relationship between education and wages
Preferred Empirical Regression Equation
• The best-fitting regression line, seen below, is chosen using a method called OLS regression.
• Why is this equation preferable to the others considered on the previous slide?
Ordinary Least Squares Regression
• Ordinary Least Squares (OLS) is the most widely used method to obtain estimates.
– OLS is a regression technique that chooses the empirical regression line that minimizes the squared value of the errors (residuals).
▪ Why minimize the squared value rather than just the overall sum of errors?
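To see why, here is a minimal numeric sketch (the three data points are invented for illustration): two very different candidate lines can tie on the raw sum of errors, because positive and negative residuals cancel, while the sum of squared errors still tells them apart.

```python
import numpy as np

# Three invented data points that lie exactly on the line Y = X
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 2.0])

candidates = {
    "Y = X (perfect fit)": x,           # predictions from the 45-degree line
    "Y = 1 (flat line)":   np.ones(3),  # predictions from a horizontal line
}
for label, y_hat in candidates.items():
    e = y - y_hat
    print(f"{label}: sum of errors = {e.sum():+.1f}, "
          f"sum of squared errors = {(e**2).sum():.1f}")
# Both lines achieve a zero SUM of errors (the -1 and +1 cancel for the flat
# line), but only the squared criterion exposes the flat line's poor fit.
```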
Distinguishing Between Estimators and Estimates
• OLS is an example of an estimator.
– An estimator is a mathematical technique applied to a sample of data to produce an estimate of the true population regression coefficient.
• Using OLS, we can compute estimates of the parameters in our regression equation.
– An estimate is the value of a population regression coefficient computed by applying an estimator to a sample.
– Remember: parameters are the unknown betas (β); variables are known from our data (Y, X).
– The β̂'s produced by OLS are estimates.
Why use OLS?
• OLS is not the only regression estimation technique.
• Why should we use OLS? Three reasons:
1) OLS is relatively easy to use.
2) The goal of minimizing $\sum_{i=1}^{N} e_i^2$ has intuitive appeal.
3) OLS estimates have at least two nice properties:
a. The sum of the residuals is exactly 0.
b. Under certain assumptions, OLS can be proven to be the "best" estimator (discussed in Chapter 4): OLS is BLUE (Best Linear Unbiased Estimator).
Lecture Outline
• Goal of OLS Regression: Estimating Empirical Regression Equation
• Mechanics of OLS Estimator and Parameter Estimates in Univariate Regression
• OLS and Multivariate Regression: Intuition and Interpretation
• Importance of Fit: Decomposition of Variance and R²
Ordinary Least Squares Mechanics
• We have already noted that OLS picks the values of the β̂'s that minimize the squared value of the errors:

$\min \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2$   (i = 1, 2, …, N)

• Let us consider the mathematical form of this approach and its solutions for $\hat{\beta}_0$ and $\hat{\beta}_1$.
OLS Estimate Solutions for Single-Variable Regression
• The OLS solution estimates for the parameters in our theoretical regression equation (2.1) are seen below:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$   (2.1)

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$   (2.4)

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$   (2.5)
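As a concrete sketch, equations (2.4) and (2.5) translate directly into a few lines of Python (the function name and the toy data below are mine, not the textbook's):

```python
import numpy as np

def ols_univariate(x, y):
    """Return (beta0_hat, beta1_hat) from equations (2.4) and (2.5)."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)  # (2.4)
    beta0 = y_bar - beta1 * x_bar                                         # (2.5)
    return beta0, beta1

# Toy usage: data generated from Y = 2 + 3X + noise should recover roughly (2, 3)
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)
print(ols_univariate(x, y))
```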
Intuition of the OLS Slope Coefficient Estimate
• Let's consider the OLS estimate for $\hat{\beta}_1$ to add some intuition behind the mathematical form of our estimates.
• The numerator represents the covariance between X and Y:
– Covariance of X and Y: statistically tells us how related the movements of our independent and dependent variables are
– Important, as $\hat{\beta}_1$ captures how changing the X variable by 1 unit impacts the variable of interest, Y
– Can the numerator be positive or negative?
– As the relationship between X and Y becomes stronger, what should that mean for the magnitude of $\hat{\beta}_1$?

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
Intuition of the OLS Slope Coefficient Estimate
• The denominator represents the variance of X:
– Variance of X: statistically tells us the dispersion of the values of X (i.e., does X take on a wide or narrow range of values?)
▪ Think about X as gender, years of education, or age of worker: which X will likely have the greatest (smallest) variance?
– Can the denominator be positive or negative?
– Would your choice of units likely matter?
▪ Let's say X is one's salary, and it is measured in dollars or in thousands of dollars
▪ Will that choice impact the variance of X?

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
Intuition of the OLS Slope Coefficient Estimate
• $\hat{\beta}_1$ represents the ratio of the covariance of X and Y to the variance of X.
– Interpretation: we are weighting the relationship between X and Y by the dispersion in the X variable
▪ A strong relationship between X and Y leads to larger values of $\hat{\beta}_1$ (in absolute value terms)
▪ However, as $\hat{\beta}_1$ is an estimate of the marginal effect of increasing X by 1 unit on Y, this relationship between X and Y should be scaled to account for the range of values of X
• Impact of the sample used to estimate $\hat{\beta}_1$:
– We see that the means of the variables are included in $\hat{\beta}_1$. How might a different sample impact our estimate of $\hat{\beta}_1$?
– How could a different sample impact the
▪ Numerator?
▪ Denominator?

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
Mechanically Calculating OLS Estimates
• Generally, you will use statistical software (EViews, R, etc.) to compute the estimates of $\hat{\beta}_0$ and $\hat{\beta}_1$ (and the other betas in multivariate regression).
• However, using our OLS solutions for a single-variable equation, we can manually compute these estimates as well.
• Steps:
1.) Compute the means $\bar{X}$ and $\bar{Y}$.
2.) For each observation:
▪ Compute $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$
▪ Use these to compute $(X_i - \bar{X})(Y_i - \bar{Y})$ and $(X_i - \bar{X})^2$
3.) Across all the observations:
▪ Add up the values of $(X_i - \bar{X})(Y_i - \bar{Y})$ and $(X_i - \bar{X})^2$
4.) Plug these sums into equation (2.4) to find the estimate of $\hat{\beta}_1$.

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
Example of Mechanically Calculating OLS Estimates
Table 2.1 The Calculation of Estimated Regression Coefficients for the Weight/Height Example
Example of Mechanically Calculating OLS Estimates
Table 2.1 [Continued]
Example of Mechanically Calculating OLS Estimates
• Using the computations on the last two slides, we can find the values of $\hat{\beta}_0$ and $\hat{\beta}_1$.
– Furthermore, substituting these values into our regression equation gives us the deterministic component of the empirical regression equation:

$\hat{\beta}_1 = \dfrac{590.20}{92.50} = 6.38$

$\hat{\beta}_0 = 169.4 - (6.38 \times 10.35) = 103.4$

$\hat{Y}_i = 103.4 + 6.38 X_i$
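As a quick check of the arithmetic, the sums reported from Table 2.1 reproduce the estimates (a sketch; only the sums and means, not the underlying observations, appear on the slides):

```python
# Sums and means reported from Table 2.1 (weight/height example)
sum_xy_dev = 590.20   # sum of (Xi - Xbar)(Yi - Ybar)
sum_x_dev2 = 92.50    # sum of (Xi - Xbar)^2
y_bar, x_bar = 169.4, 10.35

beta1_hat = sum_xy_dev / sum_x_dev2     # equation (2.4): 6.38
beta0_hat = y_bar - beta1_hat * x_bar   # equation (2.5): ~103.4
print(round(beta1_hat, 2), round(beta0_hat, 1))   # 6.38 103.4
```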
Lecture Outline
• Goal of OLS Regression: Estimating Empirical Regression Equation
• Mechanics of OLS Estimator and Parameter Estimates in Univariate Regression
• OLS and Multivariate Regression: Intuition and Interpretation
• Importance of Fit: Decomposition of Variance and R²
Estimating Multivariate Regression Models with OLS
• As we discussed before, only a few dependent variables can be explained fully by a single independent variable.
– Furthermore, we may not arrive at reliable estimates of the β's if we include only one variable.
• As such, it's vital to move to multivariate regression models.
– The general multivariate regression model with K independent variables:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_K X_{Ki} + \varepsilon_i$   (1.11)
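In matrix form, the multivariate OLS estimates solve the normal equations, $\hat{\beta} = (X'X)^{-1}X'Y$. A minimal sketch of that standard closed form (the slides do not derive it, so take this as background):

```python
import numpy as np

def ols_multivariate(X, y):
    """OLS estimates for Y = b0 + b1*X1 + ... + bK*XK + e.

    X is an (N, K) array of regressors; an intercept column is added here.
    Solves the normal equations (X'X) b = X'y rather than inverting X'X.
    """
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.solve(X.T @ X, X.T @ y)   # [b0_hat, b1_hat, ..., bK_hat]
```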


Interpreting OLS Estimates from Multivariate Regression Models
• The biggest difference between single-independent-variable and multivariate regression models is in the interpretation of the coefficients.
• The interpretation of a multivariate regression coefficient is that it:
– Indicates the change in the dependent variable (Y)
▪ Example: the impact on wages
– Associated with a one-unit increase in the independent variable (X) in question
▪ Example: from increasing schooling by 1 year
– Holding constant the other independent variables in the equation (V, W, Z)
▪ Example: holding the values of experience, tenure at your employer, and gender constant
Multivariate Regression Interpretation Example 1: Demand for Beef in the US
Example: per capita beef consumption in the U.S.

$\widehat{CB}_t = 37.54 - 0.88 P_t + 11.9 Yd_t$   (2.7)

where:
CBt = per capita consumption of beef (in pounds) in year t
Pt = the price of beef in year t
Ydt = per capita disposable income (in thousands of dollars) in year t
• Income's estimated coefficient of 11.9 indicates that:
– Beef consumption will increase by 11.9 pounds per person
– If per capita disposable income goes up by $1,000,
– Holding constant the price of beef.
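To make the interpretation concrete, evaluating fitted equation (2.7) at two income levels one unit ($1,000) apart, with price held fixed, changes predicted consumption by exactly the coefficient. The price and income values below are hypothetical, chosen only for illustration:

```python
def beef_consumption(P, Yd):
    """Fitted equation (2.7); P = beef price, Yd = income in $1,000s."""
    return 37.54 - 0.88 * P + 11.9 * Yd

# Hypothetical values: raise income by one unit ($1,000), hold price fixed
diff = beef_consumption(P=3.0, Yd=30.0) - beef_consumption(P=3.0, Yd=29.0)
print(round(diff, 1))   # 11.9 pounds per person
```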
Mechanics of OLS Multivariate Regression
• The application of OLS to multivariate models is similar to single-independent-variable models.
• OLS still chooses the β̂'s to minimize the summed squared residuals.
– The multivariate procedure is more cumbersome.
– The same intuition for the estimates [covariance(X, Y) scaled by variance(X)] holds from the univariate case.
– However, in order to hold Z, V, and W constant (see the sketch after this slide):
▪ The numerator adjusts the covariance(X, Y) to remove the impact of the other variables on Y and X
▪ The denominator adjusts the variance(X) to remove the impact of the other variables on X
• Luckily, computer software can calculate the estimates in less than a second.
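This "adjusting" is exactly a partialling-out. Here is a small sketch of the idea (a version of the Frisch-Waugh-Lovell result, which the slides describe informally rather than by name): the multivariate coefficient on X equals the univariate slope computed after the other regressors' influence has been removed from both X and Y.

```python
import numpy as np

def residualize(v, Z):
    """Return the residuals from regressing v on a constant and Z."""
    Z = np.column_stack([np.ones(len(v)), Z])
    return v - Z @ np.linalg.solve(Z.T @ Z, Z.T @ v)

def partialled_out_slope(x, others, y):
    """Multivariate coefficient on x: apply the univariate formula (2.4)
    to x and y after removing the other regressors' influence (FWL)."""
    ex = residualize(x, others)   # the "adjusted" variance of X lives in ex
    ey = residualize(y, others)   # the "adjusted" covariance uses ex and ey
    return np.sum(ex * ey) / np.sum(ex ** 2)
```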
Multivariate Regression Interpretation Example 2: Financial Aid
Example: Financial aid at a liberal arts college

$FINAID_i = \beta_0 + \beta_1 PARENT_i + \beta_2 HSRANK_i + \varepsilon_i$   (2.9)

where:
PARENTi = the amount (in dollars per year) that the parents of the ith student are judged able to contribute to college expenses
HSRANKi = the ith student's GPA rank in high school, measured as a percentage ranging from a low of 0 to a high of 100

$\widehat{FINAID}_i = 8927 - 0.36\, PARENT_i + 87.4\, HSRANK_i$   (2.10)
Multivariate Regression Interpretation Example 2: Financial Aid

$\widehat{FINAID}_i = 8927 - 0.36\, PARENT_i + 87.4\, HSRANK_i$   (2.10)

• The coefficient on PARENT means that the ith student's financial aid grant will fall by $0.36 for every one-dollar increase in parental ability to pay, holding constant high school rank.
– Is HSRANK more important than PARENT?
• To consider the impact of unit choice, we can alternatively measure PARENT in thousands of dollars and find the following estimates:

$\widehat{FINAID}_i = 8927 - 357\, PARENT_i + 87.4\, HSRANK_i$   (2.11)

– The two equations describe the same relationship in different units: the per-dollar coefficient (roughly 0.357, shown rounded as 0.36) scales by 1,000 to 357 when PARENT is measured in thousands of dollars.
Multivariate Regression Interpretation Example 2: Financial Aid
Figure 2.1 Financial Aid as a Function of Parents' Ability to Pay
Multivariate Regression Interpretation Example 2: Financial Aid
Figure 2.2 Financial Aid as a Function of High School Rank
Lecture Outline
• Goal of OLS Regression: Estimating Empirical Regression Equation
• Mechanics of OLS Estimator and Parameter Estimates in Univariate Regression
• OLS and Multivariate Regression: Intuition and Interpretation
• Importance of Fit: Decomposition of Variance and R²
The Fit of a Regression
• Econometricians often focus on the fit of a regression, particularly when their focus is on prediction.
• In order to discuss fit, we need to introduce some conceptual tools developed by econometricians to describe the dispersion of the dependent variable.
– Econometricians use the squared variation of Y around its mean, $\sum (Y_i - \bar{Y})^2$, as a measure of the amount of variation in Y to be explained by the estimated regression equation.
– This computed quantity is usually called the total sum of squares, or TSS.

$TSS = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$   (2.12)
Decomposing the Total Sum of Squares
• For OLS, TSS has two components:
– Explained sum of squares (ESS): variation that can be explained by the regression (i.e., by the variables included in your regression equation)
– Residual sum of squares (RSS): variation that cannot be explained by the regression (i.e., all remaining variation in Y)

$\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \sum_{i=1}^{N} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{N} e_i^2$   (2.13)

TSS = ESS + RSS

• This is usually called the decomposition of variance.
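A minimal sketch of the decomposition in code (the function and variable names are mine); for an OLS fit that includes an intercept, the identity TSS = ESS + RSS holds up to floating-point error:

```python
import numpy as np

def decompose_variance(y, y_hat):
    """Return (TSS, ESS, RSS) per equations (2.12) and (2.13)."""
    tss = np.sum((y - y.mean()) ** 2)      # total variation of Y
    ess = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the fit
    rss = np.sum((y - y_hat) ** 2)         # leftover (residual) variation
    return tss, ess, rss
```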
Graphical Depiction of the Decomposition of Variance
Figure 2.3 Decomposition of the Variance in Y
– Where are TSS, ESS, and RSS shown in the graph?
Measuring the Overall Fit of an Estimated OLS Regression
• What are we looking for when we want to evaluate the fit of an estimated regression equation?
– Does our model do a good job of predicting values of Y given different values of X?
– In a decomposition-of-variance framework, this means that the ESS (explained sum of squares) represents a large fraction of the TSS (total sum of squares) from our estimated regression equation
▪ Alternatively, the RSS (residual sum of squares) represents a small portion of TSS
• To measure the relative size of ESS to TSS, econometricians use a measure called R², the coefficient of determination.

R² and the Overall Fit of an Estimated OLS Regression

$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS}$   (2.14)
Describing the Overall Fit of the Estimated Model with R²: Example 1
Figure 2.4 X and Y are not related; in such a case, R² would be 0.
Describing the Overall Fit of the Estimated Model with R²: Example 2
Figure 2.5 A set of data for X and Y that can be "explained" quite well with a regression line (R² = .95).
Adding Variables and the Interpretative Limitations of R²
• A major problem with R² is that adding another independent variable to an equation can never decrease R².
• Recall equation (2.14):

$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\sum e_i^2}{\sum (Y_i - \bar{Y})^2}$   (2.14)

• Adding a variable will not change TSS. Why? (TSS depends only on the values of Y, not on the regressors.)
– Adding a variable will, in most cases, decrease RSS and increase R².
– Even if the added variable is nonsensical, R² will increase, unless the OLS coefficient for that added variable is exactly zero.
Example of the Interpretative Limitations of R²
Example: the Chapter 1 weight-guessing regression

$\widehat{Weight} = 103.40 + 6.38\, Height\ (over\ 5\ ft)$   (1.19)
$R^2 = 0.74$

How will R² change if we add a new, nonsensical variable to our weight equation?
• Ex: a student's campus post office box number

$\widehat{Weight} = 102.35 + 6.36\, Height\ (over\ 5\ ft) + 0.02\, Box\#$
$R^2 = 0.75$
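A simulated sketch of the same point (the data below are random draws, not the textbook's observations; the box-number column is pure noise, yet R² still creeps up):

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on a constant and the columns of X."""
    X = np.column_stack([np.ones(len(y)), X])
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    return 1 - np.sum(e ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
height = rng.uniform(2, 18, 20)                    # inches over 5 ft (simulated)
weight = 103.4 + 6.38 * height + rng.normal(0, 10, 20)
box_no = rng.integers(1, 1000, 20).astype(float)   # nonsensical regressor

print(r_squared(height[:, None], weight))                     # baseline R^2
print(r_squared(np.column_stack([height, box_no]), weight))   # never lower
```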
Additional Variables, Degrees of Freedom, and R̄²
• Including the post office box variable demonstrates the problems of R² and the need for a related alternative.
• Furthermore, the inclusion of the post office box variable requires the estimation of a coefficient.
– This lessens the degrees of freedom, or the excess of the number of observations (N) over the number of coefficients (including the intercept) estimated (K + 1).
– The lower the degrees of freedom, the less reliable the estimates are likely to be.
• Thus, the increase in the quality of fit needs to be compared to the decrease in the degrees of freedom.
– R̄² (adjusted R²) was developed for this purpose.
Using R̄² To Describe the Overall Fit of the Estimated Model
• R̄² measures the percentage of the variation of Y around its mean that is explained by the regression equation, adjusted for degrees of freedom.

$\bar{R}^2 = 1 - \dfrac{\sum e_i^2 / (N - K - 1)}{\sum (Y_i - \bar{Y})^2 / (N - 1)}$   (2.15)

• R̄² will increase, decrease, or stay the same when a variable is added to an equation, depending on whether the improvement in fit outweighs the loss of degrees of freedom.
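Equation (2.15) is algebraically equivalent to a simple degrees-of-freedom penalty on R². A one-function sketch (names and the example values are mine):

```python
def adj_r_squared(r2, n, k):
    """R-bar squared per equation (2.15): penalizes R^2 for using up
    degrees of freedom (n observations, k slope coefficients)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical usage: R^2 = 0.74 with n = 20 observations and one slope
print(adj_r_squared(0.74, n=20, k=1))   # ~0.726, slightly below R^2
```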
Appropriate and Inappropriate Uses of R̄²
• R̄² can be used to compare the fits of equations with the same dependent variable.
• R̄² cannot be used to compare the fits of equations with different dependent variables, or with dependent variables measured differently.
• A warning: the quality of fit of an estimated equation is only one measure of its overall quality.
– Do NOT pick variables just to maximize R̄² (R-bar squared).
– Measures of fit are usually concerned with the entire regression equation, whereas we are often focused more on one parameter of interest.
Example of Misusing Fit and R-bar Squared
Example: estimating the consumption of mozzarella cheese

$\widehat{MOZZARELLA}_t = 0.85 + 0.378\, INCOME_t$   (2.16)
$N = 10 \quad \bar{R}^2 = 0.88$

where:
MOZZARELLAt = U.S. per capita consumption of mozzarella cheese (in pounds) in year t
INCOMEt = U.S. real disposable per capita income (in thousands of dollars) in year t
• On a hunch, add in a new variable:
DROWNINGSt = U.S. deaths due to drowning after falling out of a fishing boat in year t
Example of Misusing Fit and R-bar Squared

$\widehat{MOZZARELLA}_t = 3.33 + 0.248\, INCOME_t + 0.04\, DROWNINGS_t$   (2.17)
$N = 10 \quad \bar{R}^2 = 0.97$

• Equation (2.17) has a higher R̄², but no reasonable theory could link drownings to cheese consumption!
• Researchers should not use R̄² as the sole measure of the quality of an equation.
