Problem Set 1
University of Navarra
Academic year: 2022/23
Econometrics I
Problem Set I: Ch. 1, 2a & 2b
NOTE: Problem sets do not count toward the final grade, so you do not need to
hand in solutions to this problem set. However, you are strongly encouraged to
try to solve the questions before the practice class.
PROBLEMS
2. You are asked to study the causal effect of hours spent on employee training
(measured in hours per worker per week) in a manufacturing plant on the productivity
of its workers (output per worker per hour). Describe:
3. The following table gives the joint probability distribution between employment
status and college graduation among those either employed or looking for work
(unemployed) in the working-age population of South Africa.
(e) A randomly selected member of this population reports being unemployed.
What is the probability that this worker is a college graduate? And a non-
college graduate?
(f) Are educational achievement and employment status independent? Explain.
4. In any year, the weather can inflict storm damage to a home. From year to year,
the damage is random. Let Y denote the dollar value of damage in any given year.
Suppose that in 95% of the years Y = $0, but in 5% of the years Y = $30,000.
(a) What are the mean and standard deviation of the damage in any year?
(b) Consider an “insurance pool” of 120 people whose homes are sufficiently
dispersed so that, in any year, the damage to different homes can be viewed as
independently distributed random variables. Let Ȳ denote the average damage
to these 120 homes in a year.
i. What is the expected value of the average damage Ȳ ?
ii. What is the probability that Ȳ exceeds $3,000?
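The arithmetic behind Problem 4 can be checked with a short sketch. It assumes that the central limit theorem gives an adequate normal approximation for the average of 120 independent homes, which is an assumption because Y is strongly skewed. The helper name `normal_cdf` and the variable names are ours, not part of the problem set.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Distribution of Y as stated in Problem 4.
values = [0.0, 30_000.0]
probs = [0.95, 0.05]

# (a) Mean and standard deviation of the damage in one year.
mean_y = sum(p * y for p, y in zip(probs, values))
var_y = sum(p * (y - mean_y) ** 2 for p, y in zip(probs, values))
sd_y = math.sqrt(var_y)

# (b) Average damage over n independent homes.
n = 120
mean_ybar = mean_y                # E(Ybar) = E(Y)
se_ybar = sd_y / math.sqrt(n)     # sd(Ybar) = sd(Y) / sqrt(n)

# Normal (CLT) approximation to P(Ybar > 3000); an approximation, not exact.
z = (3_000.0 - mean_ybar) / se_ybar
prob_exceeds = 1.0 - normal_cdf(z)

print(f"E(Y) = {mean_y:.2f}, sd(Y) = {sd_y:.2f}")
print(f"E(Ybar) = {mean_ybar:.2f}, sd(Ybar) = {se_ybar:.2f}")
print(f"P(Ybar > 3000) is approximately {prob_exceeds:.4f}")
```

Because Ȳ is a scaled binomial count, the exact probability could also be computed from the binomial distribution. The normal approximation is used here only because it matches the approach the question seems to have in mind.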
(a) The authors plan to administer the test to all third-grade students in Pamplona.
Construct a 99% confidence interval for the mean score of all third graders in
Pamplona.
(b) Suppose the same test is given to 300 randomly selected third graders from
San Sebastián, producing a sample average Ȳ2 of 48 points and sample standard
deviation sY2 of 10 points. Construct a 95% confidence interval for the
difference in mean scores between both cities.
(c) Can you conclude with a high degree of confidence that the population means
for San Sebastián and Pamplona students are different? (Hint: Think about
the standard error of the difference in the two sample means, and also about
the p-value of the test of no difference in means versus some difference).
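The calculation suggested by the hint in (c) can be sketched as follows. The function names are ours, and the large-sample normal approximation is assumed. The San Sebastián figures come from part (b). The Pamplona sample statistics do not appear in the text above, so the values used in the example call are hypothetical placeholders and must be replaced with the actual figures.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def diff_means_summary(mean1, sd1, n1, mean2, sd2, n2, z_crit=1.96):
    """Large-sample CI and two-sided p-value for mu1 - mu2 (unequal variances)."""
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_stat = diff / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(t_stat)))
    return {
        "diff": diff,
        "se": se,
        "ci": (diff - z_crit * se, diff + z_crit * se),
        "t": t_stat,
        "p_value": p_value,
    }

# HYPOTHETICAL placeholders for the Pamplona sample; NOT taken from the problem set.
pamplona_mean, pamplona_sd, pamplona_n = 50.0, 10.0, 100

# San Sebastián sample, as given in part (b).
san_sebastian_mean, san_sebastian_sd, san_sebastian_n = 48.0, 10.0, 300

summary = diff_means_summary(
    pamplona_mean, pamplona_sd, pamplona_n,
    san_sebastian_mean, san_sebastian_sd, san_sebastian_n,
)
print(summary)
```

With the real Pamplona figures substituted, a confidence interval that excludes zero, or equivalently a small p-value, would support the conclusion that the population means differ.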
7. To investigate possible gender discrimination in a Spanish firm, a sample of 120 men
and 150 women with similar job descriptions is selected at random. A summary
of the resulting monthly salaries (in euros) follows:
(a) Let us assume that the monthly salary is normally distributed in both
populations with equal variances. What do these data suggest about wage differences
in the firm? Do they represent statistically significant evidence that average
wages of men are higher than those for women?
(b) Is it possible to perform the analysis without assuming normality? If so, explain
how.
(c) Do these data suggest that the firm is guilty of gender discrimination in its
compensation policies? Explain.
8. Suppose Yi ∼ i.i.d. N(µY, σY²) for i = 1, 2, ..., n. With σY² known, the t-statistic
for testing H0: µY = 0 vs H1: µY > 0 is t = (Ȳ − 0)/SE(Ȳ), where SE(Ȳ) = σY/√n.
Suppose σY = 10 and n = 100, so that SE(Ȳ) = 1. Using a test with a size of 5%, the null
hypothesis H0 is rejected if z* > 1.64.
(a) Suppose µY = 0, so the null hypothesis is true. What is the probability that
the null hypothesis is rejected?
(b) Suppose µY = 2, so the alternative hypothesis is true. What is the probability
that the null hypothesis is rejected?
(c) Suppose that in 90% of cases the data are drawn from a population where the
null is true (µY = 0) and in 10% of cases the data come from a population
where the alternative is true and µY = 2. Your data came from either the first
or the second population, but you do not know which.
i. You compute the t-statistic. What is the probability that z* > 1.64, that
is, that you reject the null hypothesis?
ii. Suppose you reject the null hypothesis; that is, z* > 1.64. What is the
probability that the sample data were drawn from the µY = 0 population?
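The quantities in Problem 8 follow from the normal distribution of the test statistic and, for part (c), from Bayes' rule. The sketch below is one way to check them. It assumes the statistic is exactly N(µY/SE, 1), which holds here because the data are normal with known variance. The function names are ours.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rejection_prob(mu, se=1.0, critical=1.64):
    """P(statistic > critical) when statistic = Ybar / se and Ybar ~ N(mu, se^2)."""
    return 1.0 - normal_cdf(critical - mu / se)

# (a) Probability of rejecting when the null is true (the size of the test).
size = rejection_prob(0.0)

# (b) Probability of rejecting when mu_Y = 2 (the power against that alternative).
power = rejection_prob(2.0)

# (c) Mixture of the two populations, with the weights given in the problem.
prior_null, prior_alt = 0.90, 0.10
p_reject = prior_null * size + prior_alt * power        # law of total probability
p_null_given_reject = prior_null * size / p_reject      # Bayes' rule

print(f"size = {size:.4f}, power = {power:.4f}")
print(f"P(reject) = {p_reject:.4f}")
print(f"P(null true | reject) = {p_null_given_reject:.4f}")
```

The last figure illustrates why a rejection at the 5% level does not mean the null has only a 5% chance of being true: the answer depends on how often the null holds in the first place.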
9. Analyse whether the following statements are true or false. If a statement is
true, prove it; if it is false, explain why.
(a) If X1, X2, X3 are three independent random variables such that X1 ∼ N(1, 2),
X2 ∼ N(1, 1), X3 ∼ N(2, 1), and we denote

Y = [(X1 − 1)/√2] / √{[(X2 − 1)² + (X3 − 2)²]/2},

then the probability that Y is below 1.89 is 0.90.
(b) The formula defining the bounds of a confidence interval for parameter θ can
never contain the parameter θ.
(c) In a hypothesis test, if the probability of the type I error is 0.01 and the null
hypothesis is true, then it will not be rejected 1% of the time.
(d) To test the null hypothesis θ = 8 against the alternative hypothesis θ < 8, we
decide to use {θ̂ < 7} as a critical region, where θ̂ is an estimator of θ. Then,
if the true value of θ is 7.5 and for the sample given θ̂ is 7.5, we are making a
Type II error.
(e) Ten researchers separately test the null hypothesis that the mean difference
between two normal populations with known variances is 0, with a significance
level of α = 0.05. The samples used by these ten researchers are independent
from each other. If the null hypothesis is true, then the probability that at
least one of the ten researchers will reject the null hypothesis is greater than
40%.
(f) Given two independent samples from two normal populations with equal
variances, if we reject H0 in the test of equal means with a one-sided alternative
hypothesis and a 0.05 significance level, then we will reject H0 in the test of
equal means with a two-sided alternative hypothesis and a 0.10 significance
level.
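Statements 9(a) and 9(e) can be checked numerically, which is a useful sanity check before writing a proof. The sketch below rests on two assumptions: that N(1, 2) denotes mean 1 and variance 2 (consistent with the division by √2 in the statement), and that each researcher's test in 9(e) has exactly size 0.05. The function names are ours. The closed-form CDF used is that of Student's t with 2 degrees of freedom, the distribution of Y under these assumptions.

```python
import math
import random

def t2_cdf(t):
    """CDF of Student's t with 2 degrees of freedom (closed form)."""
    return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))

def simulate_prob_y_below(threshold, draws=200_000, seed=0):
    """Monte Carlo estimate of P(Y < threshold) for Y as defined in 9(a)."""
    rng = random.Random(seed)
    below = 0
    for _ in range(draws):
        x1 = rng.gauss(1.0, math.sqrt(2.0))  # N(1, 2), second argument read as a variance
        x2 = rng.gauss(1.0, 1.0)
        x3 = rng.gauss(2.0, 1.0)
        numerator = (x1 - 1.0) / math.sqrt(2.0)
        denominator = math.sqrt(((x2 - 1.0) ** 2 + (x3 - 2.0) ** 2) / 2.0)
        if numerator / denominator < threshold:
            below += 1
    return below / draws

# 9(a): exact value from the t(2) CDF and a simulation cross-check.
exact_9a = t2_cdf(1.89)
simulated_9a = simulate_prob_y_below(1.89)

# 9(e): probability that at least one of ten independent 5% tests rejects a true null.
prob_at_least_one = 1.0 - (1.0 - 0.05) ** 10

print(f"9(a): exact = {exact_9a:.4f}, simulated = {simulated_9a:.4f}")
print(f"9(e): P(at least one rejection) = {prob_at_least_one:.4f}")
```

A numerical check of this kind does not replace the argument the question asks for. It only indicates which way the proof or counterexample should go.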