Multiple Regression Model
The Data Generating Process (DGP), or the population, is described by the following linear model.
Y_j = β_0 + β_1 X_{j,1} + β_2 X_{j,2} + ε_j
where:
Y_j is the j-th observation of the dependent variable Y (it is known)
X_{j,1} and X_{j,2} are the j-th observations of the independent variables X_1 and X_2 (they are known)
β_0 is the intercept term (it is unknown)
β_1 and β_2 are parameters (they are unknown)
ε_j is the j-th error, the j-th unobserved factor that, besides X_1 and X_2, affects Y (it is unknown)
So β_1 is the sensitivity of Y_j to variations in X_{j,1}, assuming that the other factors remain constant:
β_1 = ΔY_j / ΔX_{j,1}
If X_{j,2} varies by ΔX_{j,2}, then Y_j varies by
ΔY_j = β_2 ΔX_{j,2}
So β_2 is the sensitivity of Y_j to variations in X_{j,2}, assuming that the other factors remain constant:
β_2 = ΔY_j / ΔX_{j,2}
In other words, β_j measures the effect of a unit variation of the j-th regressor on Y, keeping the other regressors (and the unknown factors) constant.
Then
E[Y_j | X_{j,1}, X_{j,2} = 0] = β_0 + β_1 X_{j,1} + β_2 · 0 = β_0 + β_1 X_{j,1}
E[Y_j | X_{j,1}, X_{j,2} = 1] = β_0 + β_1 X_{j,1} + β_2 · 1 = β_0 + β_1 X_{j,1} + β_2
β_2 measures the difference between the expected value of Y_j when X_{j,2} = 1 and the expected value of Y_j when X_{j,2} = 0, keeping the other regressors constant.
Multiple regression model: the matrix notation
Collect the observations {Y_1, …, Y_T}, {X_{1,1}, …, X_{T,1}} and {X_{1,2}, …, X_{T,2}} into vectors, so that the model can be written compactly as
Y = Xβ + ε
where Y and ε are T×1 vectors, X is the T×3 matrix whose first column is a column of ones (the constant regressor associated to β_0) and β = (β_0, β_1, β_2)'.
This is our new DGP. Let’s derive the OLS estimators by minimizing the sum of the squared errors.
First of all, the objective function is
O(β_0, β_1, β_2) = Σ_{t=1}^{T} ε_t²
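The notes jump from the objective function to the invertibility of X'X, so here is the standard derivation of the closed form in matrix notation (a textbook step added for completeness, not taken from the notes):

O(\beta) = \sum_{t=1}^{T} \varepsilon_t^2 = (Y - X\beta)'(Y - X\beta)
\frac{\partial O}{\partial \beta} = -2\,X'Y + 2\,X'X\,\beta = 0
\;\Rightarrow\; X'X\,\hat{\beta} = X'Y
\;\Rightarrow\; \hat{\beta} = (X'X)^{-1} X'Y \quad \text{(provided that } X'X \text{ is invertible)}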
You always have to verify that det(X’X) ≠ 0, because if it is not, the matrix is not invertible.
Sometimes the determinant can be very close to 0 (e.g. when there are regressors that are collinear, i.e. highly correlated), so even if the inverse can be computed and Matlab computes it, there will be huge numerical errors and very poor results (the closer the determinant is to 0, the worse the performance of the algorithm that Matlab uses to calculate the inverse). For this reason, when the determinant is very small, Matlab gives you a “Warning”.
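To make the check concrete, here is a minimal Matlab sketch, assuming made-up data and coefficients (T, X1, X2, the "true" β and the 1e-12 threshold are illustrative, not values from the notes); rcond is used as a scaled alternative to looking at det(X'X) directly.

% Minimal sketch (hypothetical data): build the regressor matrix, check that X'X is
% numerically invertible, then compute the OLS estimator.
T  = 200;
X1 = randn(T,1);
X2 = randn(T,1);
eps_t = 0.5*randn(T,1);
Y  = 1 + 2*X1 - 0.7*X2 + eps_t;      % illustrative "true" beta = [1; 2; -0.7]

X = [ones(T,1) X1 X2];               % T x (K+1) matrix, first column = constant regressor

if rcond(X'*X) < 1e-12               % rcond near 0 <=> X'X close to singular
    warning('X''X is close to singular: the inverse is numerically unreliable');
end

beta_hat = (X'*X) \ (X'*Y);          % OLS estimator, a (K+1) x 1 vector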
Unbiasedness of the OLS estimators
Assume that
1) the DGP is Y = Xβ + ε
2) det(X’X) ≠ 0, so X’X is invertible
3) E[ε_t] = 0
4) E[ε_t | X_1, …, X_K] = E[ε_t] = 0
These 4 assumptions are enough to prove that the OLS estimator is unbiased.
β̂ can also be expressed as
β̂ = β + (X'X)^{-1}(X'ε)
Taking the X-conditional expectation and using assumption 4), E[(X'X)^{-1} X'ε | X] = (X'X)^{-1} X' E[ε | X] = 0, so
E[β̂] = β
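A quick way to see unbiasedness in practice is a Monte Carlo sketch (the true β, the sample size and the number of simulations below are arbitrary illustrative choices); X is kept fixed across simulations, i.e. we work conditionally on X.

% Monte Carlo sketch: the average of beta_hat over many simulated samples should be
% close to the true beta, illustrating E[beta_hat] = beta.
T = 100;  nSim = 5000;
beta_true = [1; 2; -0.7];
X = [ones(T,1) randn(T,1) randn(T,1)];   % fixed design: we condition on X
beta_hats = zeros(3, nSim);
for s = 1:nSim
    eps_t = randn(T,1);                  % errors with E[eps | X] = 0
    Y = X*beta_true + eps_t;
    beta_hats(:,s) = (X'*X) \ (X'*Y);
end
disp(mean(beta_hats, 2))                 % close to beta_true = [1; 2; -0.7]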
The X-conditional variance-covariance matrix of β̂, V[β̂ | X], is a 3x3 symmetric matrix which has on the main diagonal the conditional variances of the estimators (note the “|X” inside the square brackets) and in all the other positions the conditional covariances between the estimators.
If we assume that
1) the DGP is Y = Xβ + ε
2) det(X’X) ≠ 0, so X’X is invertible
3) E[ε_t] = 0
4) E[ε_t | X_1, …, X_K] = E[ε_t] = 0
5) V[ε_j | X_1, …, X_n] = E[ε_j² | X_1, …, X_n] − (E[ε_j | X_1, …, X_n])² = E[ε_j² | X_1, …, X_n] = E[ε_j²] = σ_ε²
Then, the X-conditional variance-covariance matrix of the estimator β̂ coincides with
Σ_{β̂|X} = E[(β̂ − β)(β̂ − β)' | X] = σ_ε² (X'X)^{-1}
If we call ε̂ = Y − Xβ̂ = Y − Ŷ the residuals of the OLS regression, then the random variable
σ̂_ε² = ( Σ_{j=1}^{T} ε̂_j² ) / ( T − (K+1) ) = SSR / ( T − (K+1) )
(where K is the number of known regressors, and we add 1 for the known constant regressor associated to β_0) is an unbiased estimator of the error’s variance σ_ε², in the sense that
E[σ̂_ε²] = σ_ε²
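As a sketch, reusing the hypothetical X, Y and beta_hat from the snippets above, the unbiased variance estimator and the estimated variance-covariance matrix can be computed as follows.

% Sketch: sigma_hat^2 = SSR / (T - (K+1)) and the estimated var-cov matrix of beta_hat.
[T, Kp1] = size(X);                  % Kp1 = K + 1 (constant column included)
Y_hat   = X*beta_hat;                % fitted values
eps_hat = Y - Y_hat;                 % OLS residuals
SSR     = sum(eps_hat.^2);           % sum of squared residuals
sigma2_hat = SSR / (T - Kp1);        % unbiased estimator of sigma_eps^2

V_hat = sigma2_hat * inv(X'*X);      % estimate of sigma_eps^2 * (X'X)^(-1)
se    = sqrt(diag(V_hat));           % standard errors of the estimated coefficients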
If we have the additional assumption that the errors are i.i.d. r.v.s with normal distribution ε_j ~ N(0, σ_ε²), then, from the fact that β̂ is a linear combination of the unknown errors ε, it follows that β̂ is itself normally distributed (conditionally on X).
Perfect multicollinearity
M out of K regressors X_{t,j} (where j = 1, …, M) have perfect multicollinearity if you can find coefficients, not all equal to 0, such that a linear combination of the M regressors (plus a constant) is equal to 0 for all t (namely, if one regressor can be expressed as a linear function of the others):
∃ λ_0, …, λ_M ∈ R, not all zero, such that λ_0 + λ_1 X_{t,1} + … + λ_M X_{t,M} = 0 for t = 1, …, T
If among the regressors X_1, …, X_K two or more of them have perfect multicollinearity, then assumption 2) is not respected, because det(X’X) = 0 and (X’X)^{-1} does not exist.
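A toy Matlab example (invented numbers; W is a hypothetical regressor matrix, kept separate from the X used above) showing how perfect multicollinearity makes the cross-product matrix singular:

% Toy sketch: W2 is an exact linear function of the constant and W1, so W'W is singular.
T0 = 50;
W1 = randn(T0,1);
W2 = 3*W1 + 2;                       % perfect multicollinearity
W  = [ones(T0,1) W1 W2];
disp(rank(W))                        % 2 instead of 3: the columns are linearly dependent
disp(det(W'*W))                      % numerically (close to) 0, so (W'W)^(-1) does not exist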
Some observations:
- Multicollinearity is a data problem
- Multicollinearity is more recurrent in small samples (micronumerosity) -> Solution: increase the number of observations
- The more regressors you include in your regression, the higher the probability of having multicollinearity (if a model with 10 regressors gives more or less the same results as a model with 100 regressors, it’s better to choose the more parsimonious model, but you pay a price: the SSR increases)
- Multicollinearity implies inflated variances, and inflated variances imply deflated t-statistics (therefore, you could conclude that a regressor isn’t significant while it actually is)
In order to find out if there is multicollinearity between the regressors, take the j-th regressor and regress it
on the other explanatory variables:
X_{t,j} = β_0 + β_1 X_{t,1} + … + β_{j−1} X_{t,j−1} + β_{j+1} X_{t,j+1} + … + β_K X_{t,K} + ε_t
If the regressor that I regress on the others is multicollinear with one of the other explanatory variables, the
coefficient of determination of the regression will be very high, because a huge part of the variance of the
multicollinear regressor is explained by another one.
Then the OLS estimator minus its expected value is a Gaussian distributed r.v. with variance equal to the j-th element on the diagonal of the variance-covariance matrix:
β̂_j − β_j ~ N(0, σ²_{β̂_j})
where
σ²_{β̂_j} = σ_ε² / ( SST_j (1 − R_j²) )
SST_j = Σ_{t=1}^{T} ( X_{t,j} − (1/T) Σ_{s=1}^{T} X_{s,j} )²   (the SST of the j-th regressor)
R_j² is the R² of the regression X_{t,j} = β_0 + β_1 X_{t,1} + … + β_{j−1} X_{t,j−1} + β_{j+1} X_{t,j+1} + … + β_K X_{t,K} + ε_t, i.e. the R² obtained regressing one regressor on the others.
If there is multicollinearity, R_j² → 1 and σ²_{β̂_j} → +∞, so the estimates of β_j are very noisy!
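A sketch of these quantities in Matlab, reusing the hypothetical X and sigma2_hat from the OLS snippets above (the choice j = 2 is arbitrary; 1/(1 − R_j²) is the usual variance inflation factor, a name the notes do not use explicitly):

% Sketch: auxiliary regression of the j-th regressor on the others, then R_j^2, SST_j
% and the variance of beta_hat_j from the formula above.
j      = 2;                                    % pick the j-th non-constant regressor
col    = j + 1;                                % its column in X (column 1 is the constant)
Xj     = X(:, col);
others = X;  others(:, col) = [];              % constant + all the other regressors

gamma  = (others'*others) \ (others'*Xj);      % auxiliary OLS regression
Xj_hat = others*gamma;
R2_j   = 1 - sum((Xj - Xj_hat).^2) / sum((Xj - mean(Xj)).^2);

SST_j      = sum((Xj - mean(Xj)).^2);          % total sum of squares of the j-th regressor
var_beta_j = sigma2_hat / (SST_j * (1 - R2_j));% estimated sigma^2_{beta_hat_j}
VIF_j      = 1 / (1 - R2_j);                   % variance inflation factor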
The OLS estimator β̂ is BLUE (Best Linear Unbiased Estimator): among all the linear unbiased estimators, it is the one with the least variance.