Lecture 1_Multiple Regression Models

This document outlines the first lecture of the Econometrics course at the National Economics University, focusing on multiple regression models. It includes the instructor's profile, course objectives, and key topics such as estimating parameters, model specification, and issues like collinearity and insignificance in data. The lecture also emphasizes practical applications using statistical software and the importance of understanding economic models and their assumptions.


5/31/2023

National Economics University


E-PhD, Cohort 6

EPHD 3121: Econometrics

LECTURE 1: MULTIPLE REGRESSION MODELS

Bach Ngoc Thang


[email protected]

Hanoi, 2023

Instructor’s brief profile


• Education
• Ph.D. in Economics (UQ, 2014); M.A. and B.A. in
Development Economics (NEU, 2005 and 2002).
• Teaching
• Macroeconomics 1 & 2 (NEU, since 2014); Econometrics
(E-PhD, since 2019); Principles of Macroeconomics (BBAE,
since 2020); International Trade (UQ, 2010 – 14).
• Research interests
• Development economics, governance, SMEs.
• Some notable publications in World Development, Journal of
Development Studies, Economics of Transition and
Institutional Change, The Developing Economies, etc.
• Google Scholar:
https://scholar.google.com/citations?hl=en&user=nydtUkcAAAAJ&view_op=list_works&sortby=pubdate


Outline
• Course introduction
• From economic to econometric model
• Estimating the parameters
• Hamburger chain data (andy.dta)
• Sampling properties
• Model specification
• Family income equation (edu_inc.dta)
• Poor data, collinearity, and insignificance
• Cars data (cars.dat)
• Required reading: Chap. 5&6 (Hill et al., 2011)

Course introduction
• Course objectives
• Changes in this semester
• Students’ assessment
• Course syllabus
• Students’ expectations?
• What do you expect to learn from this course?
• What could the instructor do to improve students’
learning outcomes?


Economic model
• The interplay between sales and advertising
expenditure:
$Sales = \beta_1 + \beta_2 Price + \beta_3 Advert$ (1)
where $\beta_1, \beta_2, \beta_3$ are the unknown parameters.
• A quantitative inference:
Marginal analysis: the change in Sales when Advert
increases by one unit, with Price held constant:
$\beta_3 = \left.\dfrac{\Delta Sales}{\Delta Advert}\right|_{Price\ \text{held constant}} = \dfrac{\partial Sales}{\partial Advert}$ (2)


Where does an economic model come from?
• From economic/management theory:
• The relationship between advertising and sales?
• The responsiveness of sales to advertising? Which
goods/services are more responsive to advertising?
• From empirical investigation:
• A toolkit model: lack of theoretical foundations?
• Data mining?
• From practices or observations:
• Are all the facts/objects observable?
• A combination of all the above?
• A due literature review is needed.


The econometric model


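• Again, the sales-advertising nexus:
$Sales = E(Sales) + e = \beta_1 + \beta_2 Price + \beta_3 Advert + e$ (3)
• The general model:
$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_K x_K + e$ (4)
where $\beta_1, \beta_2, \ldots, \beta_K$ are the unknown coefficients to be
estimated.
• The marginal analysis:
$\beta_k = \left.\dfrac{\Delta E(y)}{\Delta x_k}\right|_{\text{other } x\text{s held constant}} = \dfrac{\partial E(y)}{\partial x_k}$ (5)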


The multiple regression plane


Monthly sales, price, and advertising in Big Andy’s Burger Barn


Estimating the unknown parameters
• Minimizing the sum of squares function:
$S(\beta_1, \beta_2, \beta_3) = \sum_{i=1}^{N} \left(y_i - E(y_i)\right)^2 = \sum_{i=1}^{N} \left(y_i - \beta_1 - \beta_2 x_{i2} - \beta_3 x_{i3}\right)^2$ (5)
• The least squares estimators:
𝑏1 , 𝑏2 , 𝑏3 correspond to the unknown
parameters/coefficients 𝛽1 , 𝛽2 , 𝛽3 .
• Are the above OLS estimators BLUE?
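
The minimization above can be sketched numerically in Python. This is only an illustration under stated assumptions: the data are synthetic (not the Big Andy's Burger Barn file), the "true" coefficients 120, -8 and 2 are made up, and the helper names `ols_fit` and `sum_of_squares` are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Return the coefficients b that minimize sum((y - X @ b)**2).

    X must already contain a column of ones for the intercept.
    """
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

def sum_of_squares(beta, X, y):
    """S(beta): sum of squared deviations of y from X @ beta."""
    resid = y - X @ beta
    return float(resid @ resid)

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
n = 75
price = rng.uniform(4.8, 6.5, n)
advert = rng.uniform(0.5, 3.1, n)
e = rng.normal(0.0, 5.0, n)
sales = 120.0 - 8.0 * price + 2.0 * advert + e  # hypothetical betas

X = np.column_stack([np.ones(n), price, advert])
b = ols_fit(X, sales)

# Moving away from b in any direction increases the sum of squares.
perturbed = b + np.array([0.5, -0.1, 0.2])
print("b1, b2, b3 =", b)
print("S(b) =", sum_of_squares(b, X, sales))
print("S(perturbed) =", sum_of_squares(perturbed, X, sales))
```

At the minimum, the residuals are orthogonal to every column of `X`, which is the first-order condition behind the least squares estimators.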


Stata practice: Sales equation


• Data source: andy.dat
• To do:
• Variable description and summary statistics
• Estimating the OLS regression of Sales on Price and Advert.
• Interpreting estimated results.


OLS estimates for Sales equation


The error variance and standard error (s.e.)
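• The error variance:
$\hat{\sigma}^2 = \dfrac{SSE}{N-K} = \dfrac{\sum_{i=1}^{N} \hat{e}_i^2}{N-K} = \dfrac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{N-K}$ (6)
• The variance of the estimated coefficient $b_2$ (its standard
error is the square root of the estimated variance):
$var(b_2) = \dfrac{\sigma^2}{(1 - r_{23}^2) \sum_{i=1}^{N} (x_{i2} - \bar{x}_2)^2}$ (7)
where $r_{23}$ is the sample correlation between $x_2$ and $x_3$:
$r_{23} = \dfrac{\sum (x_{i2} - \bar{x}_2)(x_{i3} - \bar{x}_3)}{\sqrt{\sum (x_{i2} - \bar{x}_2)^2 \sum (x_{i3} - \bar{x}_3)^2}}$ (8)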
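
A short Python sketch can check these formulas against each other. It assumes synthetic data and hypothetical helper names (`error_variance`, `var_b2_formula`). It computes the error variance as SSE / (N - K) and then compares the standard error of b2 from the scalar formula with the one from the general matrix expression, the estimated error variance times the inverse of X'X, for a model with an intercept and two regressors.

```python
import numpy as np

def error_variance(X, y, b):
    """sigma_hat^2 = SSE / (N - K), where K is the number of columns of X."""
    resid = y - X @ b
    n, k = X.shape
    return float(resid @ resid) / (n - k)

def var_b2_formula(x2, x3, sigma2):
    """Scalar formula for var(b2) with an intercept and two regressors."""
    r23 = np.corrcoef(x2, x3)[0, 1]  # sample correlation between x2 and x3
    return sigma2 / ((1.0 - r23 ** 2) * np.sum((x2 - x2.mean()) ** 2))

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(1)
n = 60
x2 = rng.normal(size=n)
x3 = 0.6 * x2 + rng.normal(size=n)  # x3 is correlated with x2
y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
b = np.linalg.solve(X.T @ X, X.T @ y)  # least squares estimates
s2 = error_variance(X, y, b)

# Standard error of b2 from the matrix expression and from the scalar formula.
cov_b = s2 * np.linalg.inv(X.T @ X)
se_b2_matrix = float(np.sqrt(cov_b[1, 1]))
se_b2_formula = float(np.sqrt(var_b2_formula(x2, x3, s2)))
print(se_b2_matrix, se_b2_formula)
```

The two numbers agree up to floating-point error, since the scalar formula is the (2, 2) element of the matrix expression written out for K = 3.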


The standard error (cont’d)

$var(b_2) = \dfrac{\sigma^2}{(1 - r_{23}^2) \sum_{i=1}^{N} (x_{i2} - \bar{x}_2)^2}$ (7a)

• Factors affecting the variance of $b_2$:
• A larger error variance $\sigma^2$ increases the variance of $b_2$.
• A larger sample size N reduces it.
• More variation in an explanatory variable around its
mean reduces it.
• A larger correlation between x2 and x3 increases it.
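
The direction of each factor can be read off the variance formula directly. The sketch below simply evaluates that formula for a few made-up input values; the function name `variance_of_b2` and all numbers are assumptions chosen for illustration.

```python
def variance_of_b2(sigma2, sum_sq_dev_x2, r23):
    """var(b2) from the error variance, the variation of x2 around its
    mean, and the correlation r23 between x2 and x3."""
    return sigma2 / ((1.0 - r23 ** 2) * sum_sq_dev_x2)

# Hypothetical baseline: unit error variance, no correlation.
baseline = variance_of_b2(sigma2=1.0, sum_sq_dev_x2=100.0, r23=0.0)
# Doubling the error variance doubles var(b2).
noisier = variance_of_b2(sigma2=2.0, sum_sq_dev_x2=100.0, r23=0.0)
# More variation in x2 (for example, from a larger sample) reduces var(b2).
more_variation = variance_of_b2(sigma2=1.0, sum_sq_dev_x2=200.0, r23=0.0)
# A high correlation between x2 and x3 inflates var(b2).
collinear = variance_of_b2(sigma2=1.0, sum_sq_dev_x2=100.0, r23=0.9)

print(baseline, noisier, more_variation, collinear)
```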


Sampling properties:
Assumptions of the MRM
• MR1: $y_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i$, $i = 1, \ldots, N$
• MR2: $E(y_i) = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} \Leftrightarrow E(e_i) = 0$
• MR3: $var(y_i) = var(e_i) = \sigma^2$
• MR4: $cov(y_i, y_j) = cov(e_i, e_j) = 0$, $i \neq j$
• MR5: The values of each $x_{ik}$ are not random and are not
exact linear functions of the other explanatory variables.
• MR6: $y_i \sim N[(\beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK}), \sigma^2] \Leftrightarrow e_i \sim N(0, \sigma^2)$


Model specification:
Omitted variables
• Essential features of model choice:
• Choice of functional forms
• Choice of explanatory variables to be included in the model
• Whether the assumptions of MR1 – MR6 hold
• Omitted variables:
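• The econometric model of family income regressed on
husband’s and wife’s years of education:
$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + e$
• The omitted-variable bias from omitting wife’s years of
education ($x_3$), where $b_2^*$ is the least squares estimator
of $\beta_2$ in the regression of $y$ on $x_2$ alone:
$bias(b_2^*) = E(b_2^*) - \beta_2 = \beta_3 \dfrac{cov(x_2, x_3)}{var(x_2)}$ (9)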
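
The bias formula has an exact in-sample counterpart that is easy to verify numerically: the coefficient on x2 in the short regression equals the coefficient from the full regression plus the estimated coefficient on x3 times the sample cov(x2, x3) / var(x2). The Python sketch below uses synthetic stand-ins for the family income variables. The variable names, coefficients, and distributions are assumptions, not the edu_inc data.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Synthetic stand-ins, made up for illustration only.
rng = np.random.default_rng(2)
n = 400
hedu = rng.normal(12.0, 3.0, n)                        # x2
wedu = 4.0 + 0.6 * hedu + rng.normal(0.0, 2.0, n)      # x3, correlated with x2
faminc = 5.0 + 3.0 * hedu + 4.0 * wedu + rng.normal(0.0, 10.0, n)

ones = np.ones(n)
b_long = ols(np.column_stack([ones, hedu, wedu]), faminc)  # full model
b_short = ols(np.column_stack([ones, hedu]), faminc)       # x3 omitted

# Sample analogue of beta3 * cov(x2, x3) / var(x2)
cov23 = np.cov(hedu, wedu)[0, 1]
var2 = np.var(hedu, ddof=1)
implied_short = b_long[1] + b_long[2] * cov23 / var2

print("full model b2:     ", b_long[1])
print("short model b2*:   ", b_short[1])
print("b2 + b3*cov/var:   ", implied_short)
```

Because the simulated effect of x3 and its covariance with x2 are both positive here, the short-regression coefficient overstates the effect of x2 (an upward bias).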


Correlation matrix:
Family income data


Stata practice:
Family income data
• Data source: edu_inc.dat
• To do:
• Regressing family income (Faminc) on both husband’s
and wife’s years of education (Hedu and Wedu).
• Omitting wife’s years of education in the above
specification.
• Determining whether the resulting estimate is biased upward or downward.
• Adding the number of young children (Kl6) as another
regressor.


Family income data: estimated models
• On both husband’s and wife’s years of education:

• On husband’s years of education only:

• Adding the number of young children:


Model specification:
Irrelevant variables
• Adding two artificially generated variables X5 and
X6:

• The consequences of adding irrelevant explanatory variables:
• Reducing the precision of the estimated coefficients.
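
The loss of precision can be illustrated with the theoretical coefficient variances, the error variance times the diagonal of the inverse of X'X. In the Python sketch below, the two extra regressors `x5` and `x6` are artificial and are built to be correlated with the original regressors. This construction, the data, and the function name `coef_variances` are assumptions for illustration.

```python
import numpy as np

def coef_variances(X, sigma2=1.0):
    """Diagonal of sigma^2 (X'X)^{-1}: the variances of the OLS coefficients."""
    return sigma2 * np.diag(np.linalg.inv(X.T @ X))

# Synthetic regressors, made up for illustration only.
rng = np.random.default_rng(3)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
# Artificial regressors correlated with x2 and x3 but irrelevant to y.
x5 = 0.8 * x2 + rng.normal(scale=0.5, size=n)
x6 = 0.8 * x3 + rng.normal(scale=0.5, size=n)

X_small = np.column_stack([np.ones(n), x2, x3])
X_big = np.column_stack([X_small, x5, x6])

v_small = coef_variances(X_small)
v_big = coef_variances(X_big)[:3]  # same three coefficients, larger model

print("variances without x5, x6:", v_small)
print("variances with x5, x6:   ", v_big)
```

For a given error variance, the variances of the original coefficients can only stay the same or increase when extra regressors are added, and they increase strictly when the added variables are correlated with the original ones.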


Choosing the model


• The basis of theoretical and general understanding
of the relationship.
• Unobserved heterogeneity: omitted-variable bias,…
• Significance tests: F- and t-tests.
• Using model selection criteria: Adjusted R-squared,
Akaike information criterion (AIC), Schwarz
criterion (BIC): see pages 237 - 8.
• The general specification test (RESET): see pages
238 - 9.
• Violation of the MRM assumptions MR1 – 6?
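
The selection criteria listed above can be computed from SSE, N, and K alone. The Python sketch below uses one common textbook form, AIC = ln(SSE/N) + 2K/N and BIC = ln(SSE/N) + K ln(N)/N. That the course uses exactly this form is an assumption, and statistical packages often report differently scaled, likelihood-based versions. The data and the function name `selection_criteria` are made up for illustration.

```python
import numpy as np

def selection_criteria(X, y):
    """R^2, adjusted R^2, AIC and BIC for an OLS fit of y on X.

    X must already contain a column of ones for the intercept.
    """
    n, k = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sse = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    return {
        "R2": 1.0 - sse / sst,
        "adj_R2": 1.0 - (sse / (n - k)) / (sst / (n - 1)),
        "AIC": float(np.log(sse / n) + 2.0 * k / n),
        "BIC": float(np.log(sse / n) + k * np.log(n) / n),
    }

# Synthetic data: 'junk' does not enter the equation for y.
rng = np.random.default_rng(4)
n = 100
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
junk = rng.normal(size=n)
y = 1.0 + 2.0 * x2 - 1.5 * x3 + rng.normal(size=n)

ones = np.ones(n)
small = selection_criteria(np.column_stack([ones, x2, x3]), y)
big = selection_criteria(np.column_stack([ones, x2, x3, junk]), y)
print("without junk:", small)
print("with junk:   ", big)
```

R-squared never falls when a regressor is added, which is why the penalized criteria are preferred for comparing models; smaller AIC or BIC values indicate the preferred model.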


Poor data, collinearity and insignificance
• The survey data issues:
• Most economic data is non-experimental, or
“uncontrolled” data.
• Possible solutions: randomized controlled trials (RCTs),
quasi-experiments, field or laboratory experiments. But
what are their disadvantages?
• More systemic issues: regressors are no longer
exogenous or independent:
• Correlated by definition/construction.
• Co-movement or confounding factors.
• Unobserved heterogeneity.

Collinearity
• Consequences:
• The standard errors are large, leading to insignificant
estimates of the coefficients/parameters.
• Estimates are sensitive to the addition or deletion of a few
observations or variables.
• An example:
• Data source: cars.dat
• To do: (i) regressing fuel economy (miles per gallon,
MPG) on the number of cylinders; (ii) adding engine
displacement (ENG) and vehicle weight (WGT).


Collinearity:
Estimated models of car data
• On the number of cylinders:

• Adding on engine displacement and vehicle weight:


Identifying and mitigating collinearity
• Identifying:
• An excessively high pairwise correlation between
explanatory variables, or a high R-squared from the
auxiliary OLS regression of one explanatory variable on all
the remaining ones. As a rule of thumb, above 0.8.
• Mitigating:
• A better sample.
• Using non-sample information, for example, using priors
to impose some restrictions on the parameters.
• Production technology: CTS, IRT, …
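
The auxiliary-regression check described above can be sketched in Python as follows. The data are synthetic stand-ins loosely named after the car variables (not the cars.dat file), the function name `auxiliary_r2` is hypothetical, and the 0.8 cut-off is the rule of thumb quoted above.

```python
import numpy as np

def auxiliary_r2(X, j):
    """R^2 from regressing column j of X on the other columns plus an intercept."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(target)), others])
    fitted = Z @ np.linalg.lstsq(Z, target, rcond=None)[0]
    sse = np.sum((target - fitted) ** 2)
    sst = np.sum((target - target.mean()) ** 2)
    return float(1.0 - sse / sst)

# Synthetic regressors, made up for illustration only.
rng = np.random.default_rng(5)
n = 300
cyl = rng.normal(size=n)
eng = 0.95 * cyl + rng.normal(scale=0.2, size=n)  # strongly related to cyl
wgt = 0.90 * cyl + rng.normal(scale=0.3, size=n)  # strongly related to cyl
unrelated = rng.normal(size=n)                    # independent of the rest

names = ["cyl", "eng", "wgt", "unrelated"]
X = np.column_stack([cyl, eng, wgt, unrelated])   # intercept is added inside

r2 = {name: auxiliary_r2(X, j) for j, name in enumerate(names)}
flagged = [name for name in names if r2[name] > 0.8]
print(r2)
print("possible collinearity:", flagged)
```

A high auxiliary R-squared means most of the variation in that regressor is explained by the others, so little independent variation is left to estimate its own coefficient precisely.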


Next week
• Lecture 2: Using indicator variables
• Indicator and qualitative factors.
• Application.
• Log-linear and log-log model.
• Treatment effects.
• Required reading: Chap. 4&7 (Hill et al., 2011).
