
Econ 11020: Introduction to Econometrics

Instructor: Murilo Ramos

Problem Set 4
Please include the names of the people who worked with you on this assignment.
Make sure that your answers are clear and easily readable.

1. Two researchers independently collect samples of size N for an outcome y and regressor x from the same
population. They suggest that averaging their estimates will give a more precise estimate than either researcher's
estimate alone for the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$. Let researcher A's estimate be $\dot{\beta}_1$ and
researcher B's estimate be $\ddot{\beta}_1$. Assume that each of them collected a sample of size N.
We learned in class that the variance associated with the OLS slope estimator is given by:

$$Var(\dot{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{N}(x_i - \bar{x})^2}, \qquad Var(\ddot{\beta}_1) = \frac{\sigma^2}{\sum_{j=1}^{N}(x_j - \bar{x})^2}$$

Notice that each denominator varies with the sample collected from the regressor $x_i$.
The “averaged” estimator they suggested is:

$$\bar{\beta}_1 = \frac{\dot{\beta}_1 + \ddot{\beta}_1}{2}$$

i) Find the variance of this estimator. Hint: the two researchers used independent samples, so $\dot{\beta}_1$ and $\ddot{\beta}_1$ are
also independent from each other.

ii) Suppose that, instead of using the "averaged" estimator, the two researchers combine their data into
one sample of size 2N and obtain the combined estimator, denoted by $\beta_1^*$. Show that the variance of $\bar{\beta}_1$ is
at least as big as the variance of $\beta_1^*$.
Hint 1: Notice that $\sum_{k=1}^{2N}(x_k - \bar{x})^2 = \sum_{i=1}^{N}(x_i - \bar{x})^2 + \sum_{j=1}^{N}(x_j - \bar{x})^2$.
Hint 2: You will need the result that $4zy \leq (z + y)^2$.

iii) What have you learned in this question?
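
For intuition, a minimal Monte Carlo sketch of the setup in parts i) and ii), written in Python. The data-generating process below (the values of $\beta_0$, $\beta_1$, $\sigma$, and the distribution of x) is an assumption chosen purely for illustration and is not part of the problem.

```python
# Hypothetical simulation of Question 1: two independent samples of size N,
# the "averaged" slope estimator versus the slope from the pooled sample of size 2N.
# All parameter values below are made-up assumptions.
import numpy as np

rng = np.random.default_rng(0)
N, R = 100, 5_000                     # sample size per researcher, number of replications
beta0, beta1, sigma = 1.0, 2.0, 1.0   # assumed DGP parameters

def ols_slope(x, y):
    """Slope coefficient of a simple OLS regression of y on x (with intercept)."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

avg_est, pooled_est = [], []
for _ in range(R):
    xA, xB = rng.normal(size=N), rng.normal(size=N)
    yA = beta0 + beta1 * xA + sigma * rng.normal(size=N)
    yB = beta0 + beta1 * xB + sigma * rng.normal(size=N)
    avg_est.append(0.5 * (ols_slope(xA, yA) + ols_slope(xB, yB)))   # averaged estimator
    pooled_est.append(ols_slope(np.concatenate([xA, xB]),
                                np.concatenate([yA, yB])))          # combined-sample estimator

print("simulated Var(averaged estimator):", np.var(avg_est, ddof=1))
print("simulated Var(pooled estimator):  ", np.var(pooled_est, ddof=1))
```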

2. Suppose you estimate the population model $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ by OLS, where the outcome is house prices and
the regressor is household income. Assume you observe an estimate $\hat{\beta}_1 = 50$ and that Stata reports
$s.e.(\hat{\beta}_1) = 20$, calculated assuming homoskedasticity (i.e., $Var(\epsilon_i \mid x_i) = \sigma^2$ for all $x$).

a) If the true value of $\beta_1$ is assumed to be 0, what is the associated (two-sided) p-value? What is the p-value if
the true value of this parameter is assumed to be 40? Explain in your own words what the p-value means.
(You may assume that the sample size is large, so you can use the normal approximation.)

b) Explain why it might not be reasonable to assume homoskedasticity in this example.


Hint: use your economic intuition and not extensive math here.

c) Now you ask Stata to take heteroskedasticity into account in the estimation and get a new $s.e.(\hat{\beta}_1) = 50$.
If the true value of $\beta_1$ is assumed to be zero, what is the associated (two-sided) p-value? Are you more or
less confident of an association between house prices and household income? Explain.
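
For reference, a minimal sketch (not part of the assignment) of how a two-sided p-value can be computed from an estimate, a hypothesized value, and a standard error under the large-sample normal approximation. The numbers in the example call are made up and are not the ones from this question.

```python
# Two-sided p-value under the normal approximation: p = 2 * (1 - Phi(|z|)),
# where z = (estimate - hypothesized value) / standard error.
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_sided_p(beta_hat, beta_null, se):
    """Two-sided p-value for H0: beta = beta_null (large-sample normal approximation)."""
    z = (beta_hat - beta_null) / se
    return 2.0 * (1.0 - normal_cdf(abs(z)))

# Hypothetical illustration (made-up numbers): estimate 3, null value 0, s.e. 2
print(two_sided_p(beta_hat=3.0, beta_null=0.0, se=2.0))   # z = 1.5, p ≈ 0.134
```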
3. Suppose you are estimating the model for the population of data scientists in Chicago:

$reports_i = \beta_0 + \beta_1 senior_i + \beta_2 hours_i + u_i$,

where $reports_i$ measures the number of reports produced in a month by employee i; $senior_i$ is a dummy
variable equal to 1 if worker i has been with the company for 10 years or more and zero otherwise; and $hours_i$
denotes the weekly hours worked at the company. Assume $\beta_1 > 0$, $\beta_2 > 0$, and $cov(senior_i, hours_i) < 0$. (A
hypothetical simulation sketch of this setup is given after the true/false items below.)

True or False questions (if false, provide a justification):


i) The coefficient $\beta_1$ measures the average difference in reports produced by a senior worker versus
a junior.
ii) If we estimate the short model $reports_i = \hat{\alpha}_0 + \hat{\alpha}_1 hours_i + \hat{\epsilon}_i$, the estimator $\hat{\alpha}_1$ will
overestimate the effect of hours on reports.
iii) Assume that the number of observations is 40. If we perform a two-sided hypothesis test for $\beta_1$ at
the significance level of 10%, the critical value for the test will be 1.69, coming from a t-Student
distribution with 37 degrees of freedom.
iv) If we know that: there are no average differences in the error term between seniors and juniors;
the difference in average reports between seniors and juniors is 0; the partial effect of hours
on reports is 0.2; and average hours worked by seniors is 40 and by juniors is 50. Then $\beta_1$ should be
2.
v) If the $CI^{95\%}_{\beta_1} = [-0.5, 3]$, then the 90% C.I. for the same parameter will be wider than this interval.
vi) If the $CI^{95\%}_{\beta_1} = [-0.5, 3]$, then we reject the null that $\beta_1 = 4$ for the two-sided alternative at 5%
significance.
vii) If the $CI^{95\%}_{\beta_1} = [-0.5, 3]$, then we reject the null that $\beta_1 = 4$ for the two-sided alternative at 10%
significance.
viii) If the $CI^{95\%}_{\beta_1} = [-0.5, 3]$, then we reject the null that $\beta_1 = 4$ for the two-sided alternative at 1%
significance.
ix) If we omit from the main model a regressor (other than hours or seniority) that is negatively
correlated with seniority, then our OVB for the estimation of $\beta_1$ will be negative.
x) In order to avoid the dummy trap, we need to omit one dummy from the regression, if an intercept
is included and the dummies cover all (mutually exclusive) groups in the population.
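
A purely hypothetical simulation sketch of the setup in Question 3 (referenced above). Every numerical choice below, such as the share of seniors, the hours equation, the error scale, and the coefficient values, is an assumption made only for illustration; none of it is given in the problem.

```python
# Hypothetical data-generating process consistent with Question 3:
# senior is a dummy, cov(senior, hours) < 0, and beta1 > 0, beta2 > 0.
import numpy as np

rng = np.random.default_rng(1)
n = 40
senior = (rng.random(n) < 0.3).astype(float)         # 1 if at the firm 10+ years (assumed share)
hours = 50.0 - 10.0 * senior + rng.normal(0, 3, n)   # seniors tend to work fewer hours (assumed)
reports = 2.0 + 1.5 * senior + 0.2 * hours + rng.normal(0, 1, n)  # assumed beta1 = 1.5, beta2 = 0.2

print("sample cov(senior, hours):", np.cov(senior, hours, ddof=1)[0, 1])  # negative by construction

# OLS for the long model: reports on a constant, senior, and hours
X = np.column_stack([np.ones(n), senior, hours])
print("long-model OLS coefficients:", np.linalg.lstsq(X, reports, rcond=None)[0])
```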

4. Prove the OVB formula presented in lecture. Instead of simply copying the lecture notes, reflect on the assumptions
used in every single step of the proof. Also, explain in your own words why we have positive (or negative)
OVB depending on the signs of $\beta_2$ and $cov(x_{1i}, x_{2i})$. Give an intuition for the sign of the OVB.
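
Question 4 asks for a proof; as a complement, here is a minimal numerical sketch (not a proof) that checks one standard form of the OVB formula in a hypothetical two-regressor model: the short-regression slope equals $\hat{\beta}_1$ plus $\hat{\beta}_2$ times the slope from the auxiliary regression of $x_2$ on $x_1$. All parameter values are assumptions made up for illustration, and the two printed lines should agree up to floating-point error.

```python
# Numerical check of the omitted-variable-bias decomposition in a made-up model:
# slope from the short regression of y on x1  =  beta1_hat + beta2_hat * delta1_hat,
# where delta1_hat is the slope from the auxiliary regression of x2 on x1.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                   # x2 correlated with x1 (assumed)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)   # assumed beta1 = 2, beta2 = 3

def ols(X, y):
    """OLS coefficients with an intercept prepended to the regressors."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_long = ols(np.column_stack([x1, x2]), y)   # long regression: y on x1 and x2
alpha_short = ols(x1, y)                        # short regression: y on x1 only
delta = ols(x1, x2)                             # auxiliary regression: x2 on x1

print("short-regression slope on x1:      ", alpha_short[1])
print("beta1_hat + beta2_hat * delta1_hat:", beta_long[1] + beta_long[2] * delta[1])
```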
Standard Normal table
t-Student table – part 1

one tail 0.100 0.050 0.025 0.010 0.005
two tails 0.200 0.100 0.050 0.020 0.010
df
1 3.08 6.31 12.71 31.82 63.66
2 1.89 2.92 4.30 6.96 9.92
3 1.64 2.35 3.18 4.54 5.84
4 1.53 2.13 2.78 3.75 4.60
5 1.48 2.02 2.57 3.36 4.03
6 1.44 1.94 2.45 3.14 3.71
7 1.41 1.89 2.36 3.00 3.50
8 1.40 1.86 2.31 2.90 3.36
9 1.38 1.83 2.26 2.82 3.25
10 1.37 1.81 2.23 2.76 3.17
11 1.36 1.80 2.20 2.72 3.11
12 1.36 1.78 2.18 2.68 3.05
13 1.35 1.77 2.16 2.65 3.01
14 1.35 1.76 2.14 2.62 2.98
15 1.34 1.75 2.13 2.60 2.95
16 1.34 1.75 2.12 2.58 2.92
17 1.33 1.74 2.11 2.57 2.90
18 1.33 1.73 2.10 2.55 2.88
19 1.33 1.73 2.09 2.54 2.86
20 1.33 1.72 2.09 2.53 2.85
21 1.32 1.72 2.08 2.52 2.83
22 1.32 1.72 2.07 2.51 2.82
23 1.32 1.71 2.07 2.50 2.81
24 1.32 1.71 2.06 2.49 2.80
25 1.32 1.71 2.06 2.49 2.79
26 1.31 1.71 2.06 2.48 2.78
27 1.31 1.70 2.05 2.47 2.77
28 1.31 1.70 2.05 2.47 2.76
29 1.31 1.70 2.05 2.46 2.76
30 1.31 1.70 2.04 2.46 2.75
t-Student table – part 2

one tail 0.100 0.050 0.025 0.010 0.005
two tails 0.200 0.100 0.050 0.020 0.010
df
31 1.31 1.70 2.04 2.45 2.74
32 1.31 1.69 2.04 2.45 2.74
33 1.31 1.69 2.03 2.44 2.73
34 1.31 1.69 2.03 2.44 2.73
35 1.31 1.69 2.03 2.44 2.72
36 1.31 1.69 2.03 2.43 2.72
37 1.30 1.69 2.03 2.43 2.72
38 1.30 1.69 2.02 2.43 2.71
39 1.30 1.68 2.02 2.43 2.71
40 1.30 1.68 2.02 2.42 2.70
41 1.30 1.68 2.02 2.42 2.70
42 1.30 1.68 2.02 2.42 2.70
43 1.30 1.68 2.02 2.42 2.70
44 1.30 1.68 2.02 2.41 2.69
45 1.30 1.68 2.01 2.41 2.69
46 1.30 1.68 2.01 2.41 2.69
47 1.30 1.68 2.01 2.41 2.68
48 1.30 1.68 2.01 2.41 2.68
49 1.30 1.68 2.01 2.40 2.68
50 1.30 1.68 2.01 2.40 2.68
60 1.30 1.67 2.00 2.39 2.66
70 1.29 1.67 1.99 2.38 2.65
80 1.29 1.66 1.99 2.37 2.64
90 1.29 1.66 1.99 2.37 2.63
100 1.29 1.66 1.98 2.36 2.63
150 1.29 1.66 1.98 2.35 2.61
200 1.29 1.65 1.97 2.35 2.60
300 1.28 1.65 1.97 2.34 2.59
400 1.28 1.65 1.97 2.34 2.59
500 1.28 1.65 1.96 2.33 2.59
∞ 1.28 1.65 1.96 2.33 2.58
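
The table entries above can be reproduced numerically. A minimal sketch, assuming the scipy library is available, using the inverse CDF of the t distribution (the ∞ row corresponds to the standard normal):

```python
# Reproduce t-table critical values: for a one-tail area alpha and df degrees of freedom,
# the critical value is the (1 - alpha) quantile of the t distribution.
from scipy.stats import norm, t

alphas = (0.100, 0.050, 0.025, 0.010, 0.005)
for df in (1, 10, 37, 100):
    print(df, [round(t.ppf(1 - a, df), 3) for a in alphas])

# Limiting (infinite-df) row equals the standard normal quantiles:
print("inf", [round(norm.ppf(1 - a), 3) for a in alphas])
```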
