Worksheet 7 Olympic Swimmers

The document summarizes regressions analyzing various datasets: 1) Regressions of Olympic swimmers' weight on age and height found that age has a smaller effect on weight when controlling for height, since age captures some of height's effect as people get taller with age. 2) A regression of NCAA tournament team winning percentage on seed number found the #5 seed has a lower probability of winning than expected based on its ranking. 3) A regression of expected future MBA salaries found gender and international status significantly impact salaries, with international students expecting higher pay than non-international peers with the same previous salary.

Uploaded by

Kathy Thanh PK
Worksheet 7

Olympic Swimmers

Recall the Olympics case in which we were exploring the determinants of weight (in kilos) for
Olympic swimmers – in particular the role of height (in centimeters) and age.

You run the following three regressions:

(1) Weight = 64.95 + 0.638*Age + e
             (2.48)  (0.103)

(2) Weight = -75.63 + 0.837*Height + e
             (6.48)   (0.035)

(3) Weight = -77.46 + 0.326*Age + 0.806*Height + e
             (6.34)   (0.070)     (0.035)

Standard errors are in parentheses below each coefficient.

a. Interpret the coefficient on age in the first regression.

Each additional year of age is associated with Olympic swimmers weighing 0.64 kg more on average.

b. Interpret the coefficient of age in the third regression:

Holding height constant, for every additional year, Olympic swimmers gain 0.33 kg on
average.

c. Can you intuitively explain why the coefficient on age is much smaller in the third
regression?

When height is excluded, the age coefficient picks up some of the positive effect of
height on weight (see regression 2), because age and height are positively correlated:
people usually get taller as they age (assuming they are young enough, of course).
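The intuition in part c can be checked with the standard omitted-variable identity. The sketch below is illustrative only: it assumes regressions (1) and (3) were estimated on the same sample, and the name `delta` (the slope from a regression of Height on Age) is introduced here, not taken from the worksheet.

```python
# Coefficients copied from regressions (1) and (3) in the worksheet.
b_age_short = 0.638     # Age coefficient when Height is omitted
b_age_long = 0.326      # Age coefficient when Height is included
b_height_long = 0.806   # Height coefficient in regression (3)

# Omitted-variable identity (assumes the same sample in both regressions):
#   b_age_short = b_age_long + b_height_long * delta
# where delta is the slope from regressing Height on Age (hypothetical name).
delta = (b_age_short - b_age_long) / b_height_long

# Part of the short-regression age coefficient that really belongs to height.
through_height = b_height_long * delta

print(f"Implied height difference per year of age: {delta:.3f} cm")
print(f"Age effect that runs through height: {through_height:.3f} kg per year")
```

Under that assumption, roughly half of the 0.638 kg per year in regression (1) is height in disguise.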
d. Which of the three models will you use to predict the annual increase in weight for a
swimmer who is 30 years old? What would your prediction be?

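You would use regression 3 because it controls for height: a 30-year-old swimmer is no
longer growing taller, so the age effect with height held constant applies. The
prediction is a gain of 0.33 kg per year.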

e. What is your prediction for difference in weight between two fraternal twins, one who is
2 cm taller than the other? Provide a 95% confidence interval.

0.806 * 2 = 1.612 kg

95% CI: 1.612 ± 2 * 0.035 * 2 = [1.472, 1.752]
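The arithmetic above can be reproduced with a small helper. This is a minimal sketch: the function name `scaled_interval` is hypothetical, and it uses the worksheet's rule of thumb of plus or minus 2 standard errors rather than the exact 1.96 critical value.

```python
def scaled_interval(coef, se, scale, z=2.0):
    """Estimate and +/- z*SE band for scale * coef.

    Scaling a coefficient by a constant scales its standard error
    by the absolute value of the same constant.
    """
    estimate = coef * scale
    margin = z * se * abs(scale)
    return estimate, estimate - margin, estimate + margin


# Height coefficient and standard error from regression (3); 2 cm difference.
est, low, high = scaled_interval(coef=0.806, se=0.035, scale=2)
print(f"Difference: {est:.3f} kg, 95% CI [{low:.3f}, {high:.3f}]")
```

Regression (3) is the natural choice here because twins are the same age, so the comparison holds age constant.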

f. Somebody argues that once you control for height, age does not predict weight for this
population, as metabolism only starts to slow down in your forties. Can you prove him
wrong?

H0: β_age = 0 (the coefficient on Age in regression 3)

95% CI: 0.326 ± 2 * 0.070 = [0.186, 0.466]

t-stat = (0.326 - 0) / 0.070 = 4.66

Since the confidence interval does not contain 0 and the t-statistic is greater than 2,
we can reject the null and prove him wrong.
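The same test can be written out in a few lines. This is a sketch using the worksheet's cutoff of 2 for a 5% two-sided test; the helper name `t_stat` is an assumption made for illustration.

```python
def t_stat(coef, se, null_value=0.0):
    """t-statistic for H0: coefficient == null_value."""
    return (coef - null_value) / se


coef_age, se_age = 0.326, 0.070   # Age in regression (3)

t = t_stat(coef_age, se_age)
ci_low = coef_age - 2 * se_age
ci_high = coef_age + 2 * se_age

# Two equivalent checks: |t| > 2, or 0 lies outside the interval.
reject_by_t = abs(t) > 2
reject_by_ci = not (ci_low <= 0 <= ci_high)

print(f"t = {t:.2f}, CI = [{ci_low:.3f}, {ci_high:.3f}]")
print(f"Reject H0: {reject_by_t and reject_by_ci}")
```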

MARCH MADNESS!
Some people claim that the No. 5 seed teams in the NCAA tournament are jinxed. Look at the
graph below to see why! It shows the average win percentage for each seed over this period.
(Don’t know what NCAA or seed means? It doesn’t matter. Just think of it as a variable)
a. Let’s start with a simple regression. Define win_probability as the outcome variable and
seed as the explanatory variable (seed=1,2,3…8). What is your best estimate of the slope
based on the graph above?
Around -7. Win probability falls by close to 50 points as seed rises by 7 (from 1 to 8).

Now you construct dummy variables for the eight seeds: seed1=1 if the team is seed 1, and
seed1=0 if the team is not seed 1. Similarly define seed2, seed3, seed4, etc., through seed8.

You run the following regression:

Win_probability = b1 + b2*seed2 + b3*seed3 + b4*seed4 + b5*seed5 + b6*seed6 + b7*seed7 + b8*seed8 + e

Provide your best estimate of:

b. b1 = 100
c. b8 = -47

d. b2 = -6

e. Based on your data, why do you think people say that “No. 5 seeds are jinxed?”
Because even though they are higher ranked than the six seed (and thus face a weaker
opponent), their probability of winning is smaller.

Now you run the following regression:

Win_probability = c1 + c2*seed1 + c3*seed3 + c4*seed4 + c5*seed5 + c6*seed6 + c7*seed7 + c8*seed8 + e

Provide your best estimate of:

f. c1 = 94

g. c8 = -41

h. c2 = 6
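The link between the two sets of answers is that a dummy-variable regression only reproduces group means: the intercept is the mean of the omitted seed, and each dummy coefficient is the gap to that mean. A minimal sketch using only the three estimates given in the worksheet (the helper `rebase` is a hypothetical name):

```python
# Worksheet estimates with seed 1 as the omitted category.
b1, b2, b8 = 100, -6, -47

# Implied average win percentage per seed (only seeds 1, 2 and 8 are given).
means = {1: b1, 2: b1 + b2, 8: b1 + b8}


def rebase(group_means, baseline):
    """Intercept and dummy coefficients when `baseline` is the omitted group."""
    intercept = group_means[baseline]
    coefs = {g: m - intercept for g, m in group_means.items() if g != baseline}
    return intercept, coefs


# Second regression: seed 2 is omitted, seed 1 gets its own dummy.
c1, c = rebase(means, baseline=2)
print(f"c1 = {c1}, c2 (seed1) = {c[1]}, c8 = {c[8]}")
```

Changing the omitted category shifts every coefficient by the same amount but leaves the fitted win percentage for each seed unchanged.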

MBA Salaries (based on Question 7.25 of the book).


During the first day of an MBA statistics course, students were asked the annual salary from
their most recent job, the salary they expect in their next job after graduation with an
MBA, whether or not they are an international student, and their gender.
The regression equation Y = 48 + 0.9X1 + 6X2 - 17X3 is computed, where Y is the expected annual
salary in the next job (in $1000s), X1 is the annual salary from the most recent job (in $1000s),
X2 equals one if the student is an international student and zero otherwise, and X3 equals one
if the student is female and zero otherwise. The standard error of the regression is 20 and
the R-squared is 30%.
a. Give managerial interpretations, if appropriate, for the coefficients, intercept, and R-
squared.
0.9: For students of the same gender and international status, each extra thousand
dollars in previous salary is associated with an extra $900 on average in future expected
salary.
-17: For students with the same previous salary and international status, women
expect to earn $17,000 less on average than men in their next job.
6: For students with the same previous salary and gender, international students expect
$6,000 more in future salary than non-international students.
48: There is no managerial interpretation for the intercept, as a zero-dollar previous
salary is way outside the range of the data.
R-squared of 30%: 30% of the variation in future expected salaries can be explained by
variation in previous salaries, gender, and international status.
b. A second regression equation Y=102 -20X2 – 19X3 is computed. What could explain why
the coefficient for X2 in this equation is negative, while in the first equation above it is
positive?
The -20 coefficient tells us that international students overall expect to earn less on
average than non-international students but the +6 coefficient tells us that among
people with the same previous salary, international students expect a higher salary in
their next job compared with non-international students. In other words, though
international students have lower previous salaries, they have higher expectations
compared with non-international students with the same previous salary.
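The sign flip can be quantified with the omitted-variable identity: each coefficient in the short regression equals its counterpart in the long regression plus 0.9 times the matching coefficient from an auxiliary regression of previous salary on X2 and X3. The sketch below assumes both equations were fitted to the same students; the dictionary names are introduced here for illustration.

```python
# Long regression (controls for previous salary) and short regression.
long_eq = {"intercept": 48, "international": 6, "female": -17}
short_eq = {"intercept": 102, "international": -20, "female": -19}
b_prev_salary = 0.9  # coefficient on previous salary (X1) in the long regression

# short = long + b_prev_salary * aux  =>  aux = (short - long) / b_prev_salary
# aux holds the implied coefficients of previous salary (in $1000s)
# regressed on the international and female dummies.
aux = {k: (short_eq[k] - long_eq[k]) / b_prev_salary for k in long_eq}

for name, value in aux.items():
    print(f"{name}: {value:.1f}")
```

Under that assumption, international students' previous salaries were roughly $29,000 lower than those of non-international students of the same gender, which is enough to turn the +6 into -20.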
c. Predict the expected salary for an international male student who earned 70K in his
previous job. Give a 95% Prediction Interval.
Prediction = 48 + 0.9*70 + 6*1 = 117
95% Interval = 117 ± 2*20 = (77, 157)
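The recipe used here (point prediction plus or minus two standard errors of the regression) can be sketched as a pair of small functions. The function names are hypothetical, and the interval is the worksheet's approximation: it ignores the extra uncertainty from estimating the coefficients. The two previous salaries in the loop are illustrative inputs.

```python
SE_REGRESSION = 20  # standard error of the regression, in $1000s


def predict_expected_salary(prev_salary, international, female):
    """Expected next-job salary in $1000s from Y = 48 + 0.9*X1 + 6*X2 - 17*X3."""
    return 48 + 0.9 * prev_salary + 6 * international - 17 * female


def approx_prediction_interval(prediction, se=SE_REGRESSION, z=2.0):
    """Approximate 95% prediction interval: prediction +/- z * SE."""
    return prediction - z * se, prediction + z * se


# International (X2 = 1) male (X3 = 0) students with two previous salaries.
for prev in (60, 70):
    y_hat = predict_expected_salary(prev, international=1, female=0)
    low, high = approx_prediction_interval(y_hat)
    print(f"previous = {prev}K: prediction = {y_hat:.0f}K, interval = ({low:.0f}K, {high:.0f}K)")
```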
