Lecture 2
Regression Analysis with Cross-Sectional Data
Prepared by Quanquan Liu
Fall 2024
Lecture 2. The Simple Regression Model
Definition
The simple linear regression model is $y = \beta_0 + \beta_1 x + u$. Under the zero conditional mean assumption $E(u|x) = 0$, taking expectations gives $E(y|x) = \beta_0 + \beta_1 x$. This means that the average value of the dependent variable can be expressed as a linear function of the explanatory variable.
Definition
For a random sample $\{(x_i, y_i): i = 1, \dots, n\}$, the model reads $y_i = \beta_0 + \beta_1 x_i + u_i$, where $u_i$ is the error term for observation i; it contains all factors affecting $y_i$ other than $x_i$.
Deriving the Ordinary Least Squares Estimates
Ordinary Least Squares (OLS): A method for estimating the parameters of a linear regression model. The ordinary least squares estimates are obtained by minimizing the sum of squared residuals.
For any $\hat\beta_0$ and $\hat\beta_1$, define a fitted value for y when $x = x_i$ as
$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$.
The residual for observation i is the difference between the actual $y_i$ and its fitted value:
$\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$.
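In Stata, fitted values and residuals are available through predict after any regression; a minimal sketch, using the CEOSAL1 regression discussed later in this lecture:
* fitted values and residuals after OLS
use CEOSAL1, clear
quietly reg salary roe
predict salaryhat              // fitted values
predict uhat, residuals        // residuals = actual minus fitted
list salary salaryhat uhat in 1/5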
Deriving the Ordinary Least Squares Estimates
Figure. Fitted values and residuals.
Deriving the Ordinary Least Squares Estimates
We choose $\hat\beta_0$ and $\hat\beta_1$ to minimize the sum of squared residuals,
$\sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2.$
Solution:
$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$
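The slope formula is the sample covariance of x and y divided by the sample variance of x, so it can be checked by hand; a sketch using the CEOSAL1 data from the example below:
* compute the OLS slope and intercept by hand and compare with -regress-
use CEOSAL1, clear
quietly correlate salary roe, covariance
scalar b1 = r(cov_12) / r(Var_2)    // sample cov(x,y) / sample var(x)
quietly summarize roe
scalar xbar = r(mean)
quietly summarize salary
scalar ybar = r(mean)
scalar b0 = ybar - b1*xbar
display "b1 = " b1 "    b0 = " b0
reg salary roe                      // reports the same two estimates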
Deriving the Ordinary Least Squares Estimates
Once we have determined the OLS intercept and slope estimates, we form the OLS regression line:
$\hat y = \hat\beta_0 + \hat\beta_1 x.$
It is also called the sample regression function (SRF): it is the estimated version of the population regression function $E(y|x) = \beta_0 + \beta_1 x$.
In most cases, the slope estimate
$\hat\beta_1 = \Delta \hat y / \Delta x$
is of primary interest. It tells us the amount by which $\hat y$ changes when x increases by one unit.
Deriving the Ordinary Least Squares Estimates
Example. CEO Salary and Return on Equity
$salary = \beta_0 + \beta_1 roe + u,$
where salary is annual CEO salary (in thousands of dollars) and roe is the firm's return on equity (in percent).
Using CEOSAL1, a dataset that contains information on 209 CEOs for the year 1990, the OLS regression line relating salary to roe is
$\widehat{salary} = 963.191 + 18.501\, roe, \qquad n = 209.$
* load the CEO salary data and estimate the regression of salary on roe
use CEOSAL1, clear
reg salary roe
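After reg, the stored coefficients _b[_cons] and _b[roe] can be used for quick predictions; for instance, the predicted salary at roe = 30:
* predicted salary (in thousands of dollars) when roe = 30
display _b[_cons] + _b[roe]*30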
Properties of OLS
Property 1. The OLS residuals sum to zero: $\sum_{i=1}^{n} \hat u_i = 0$.
Property 2. The sample covariance between the regressors and the OLS residuals is zero: $\sum_{i=1}^{n} x_i \hat u_i = 0$.
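Both properties are easy to verify numerically; a minimal check using the CEOSAL1 regression above:
* verify the algebraic properties of the OLS residuals
use CEOSAL1, clear
quietly reg salary roe
predict uhat, residuals
summarize uhat          // mean is zero up to rounding error (Property 1)
correlate roe uhat      // correlation with the regressor is zero (Property 2)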
Goodness-of-Fit
The R-squared of the regression, or the coefficient of determination, is defined as
$R^2 \equiv SSE/SST = 1 - SSR/SST,$
where $SST = \sum_{i=1}^{n} (y_i - \bar y)^2$ is the total sum of squares, $SSE = \sum_{i=1}^{n} (\hat y_i - \bar y)^2$ the explained sum of squares, and $SSR = \sum_{i=1}^{n} \hat u_i^2$ the residual sum of squares.
$R^2$ is the ratio of the explained variation to the total variation; it is interpreted as the fraction of the sample variation in y that is explained by x.
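Equivalently, $R^2$ is the squared sample correlation between $y_i$ and $\hat y_i$, which gives a quick numerical check (again using the CEOSAL1 example):
* R-squared equals the squared correlation between y and its fitted values
use CEOSAL1, clear
quietly reg salary roe
predict salaryhat
quietly correlate salary salaryhat
display "R-squared = " r(rho)^2 " = e(r2) = " e(r2)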
In particular, if $\log(wage) = \beta_0 + \beta_1 educ + u$, then, holding u fixed,
$\%\Delta wage \approx (100 \cdot \beta_1)\, \Delta educ.$
We multiply $\beta_1$ by 100 to get the percentage change in wage given one additional year of education.
Units of Measurement and Functional Form
Using the data in WAGE1 we obtain:
$\widehat{\log(wage)} = 0.584 + 0.083\, educ, \qquad n = 526, \; R^2 = 0.186.$
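A sketch of how this estimate is produced in Stata, assuming WAGE1 contains wage and educ:
* estimate the log wage equation
use WAGE1, clear
gen logwage = log(wage)    // illustrative name; the dataset may already include a log-wage variable
reg logwage educ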
The question is what the estimators estimate on average and how large their variability is in repeated samples:
Unbiasedness of OLS
Assumption SLR.3. Sample Variation in the Explanatory Variable
The sample values of the explanatory variable are not all the same (otherwise it would be impossible to study how different values of the explanatory variable lead to different values of the dependent variable).
Assumption SLR.4. Zero Conditional Mean
The error u has an expected value of zero given any value of the explanatory variable. In other words,
$E(u|x) = 0.$
The value of the explanatory variable must contain no information about the mean of the unobserved
factors.
For a random sample, this assumption implies that $E(u_i|x_i) = 0$ for all $i = 1, \dots, n$.
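Under the assumptions above, OLS is unbiased; a small simulation sketch (the data-generating process and parameter values here are illustrative, not from the lecture):
* simulate a model satisfying the zero conditional mean assumption;
* OLS should recover beta0 = 1 and beta1 = 2 up to sampling error
clear
set seed 2024
set obs 500
gen x = rnormal()
gen u = rnormal()      // drawn independently of x, so E(u|x) = 0
gen y = 1 + 2*x + u
reg y x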
Variances of the OLS Estimators
Heteroskedasticity: The variance of the error term, given the explanatory variable, is not constant; that is, $Var(u|x)$ depends on x, in contrast to the homoskedasticity assumption $Var(u|x) = \sigma^2$.
An example of heteroskedasticity: wage and education, where the variability of wages tends to increase with the level of education.
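A small simulation sketch of heteroskedastic data (all names and numbers here are illustrative):
* generate errors whose variance grows with x
clear
set seed 2024
set obs 500
gen x = runiform(0, 16)            // e.g., years of education
gen u = rnormal(0, 0.5 + 0.2*x)    // error standard deviation increases with x
gen y = 1 + 0.1*x + u
scatter y x                        // the spread of y widens as x grows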
Regression on a Binary Explanatory Variable
Consider $y = \beta_0 + \beta_1 x + u$, where now x is a binary variable. If we impose the zero conditional mean assumption SLR.4, then we obtain
$E(y|x) = \beta_0 + \beta_1 x.$
The difference now is that x can take on only two values. By plugging the values zero and one into the equation above, it is easily seen that
$E(y|x=0) = \beta_0$ and $E(y|x=1) = \beta_0 + \beta_1$.
It follows immediately that
$\beta_1 = E(y|x=1) - E(y|x=0)$
is the difference in the average value of y over the subpopulations with $x = 1$ and $x = 0$.
Regression on a Binary Explanatory Variable
Example. Wage and Race
$wage = \beta_0 + \beta_1 white + u,$
where $white = 1$ if a person is classified as white and zero otherwise. Then
$\beta_1 = E(wage|white = 1) - E(wage|white = 0)$
is the difference in average hourly wages between white and nonwhite workers.
The mechanics of OLS do not change just because x is binary.
The statistical properties of OLS are also unchanged when x is binary.
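A minimal sketch in Stata (assuming the dataset contains wage and a 0/1 variable named white; adjust the names to your data): the OLS slope on a binary regressor reproduces the difference in group means.
* OLS with a binary regressor vs. a direct comparison of group means
use WAGE1, clear                    // assumes WAGE1 includes a 0/1 variable white
reg wage white                      // slope = mean(wage | white=1) - mean(wage | white=0)
tabulate white, summarize(wage)     // group means for comparison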
Summary