Exercises Topic 2 Solutions
ST22
A staff restaurant conducted a survey, collecting data from a random sample of 32 clients.
They were asked, among other things, how many times they ate at the restaurant during the
last month (variable: FREQUENCY), how much they spent on a meal (variable:
SPENDING), and how old they were (variable: AGE).
The restaurant manager would like to construct a model that explains the spending
amount in terms of the frequency and the age for all clients.
You are asked to carry out the tests required to validate the linear
regression model for the whole population of clients from which the sample was drawn.
[Taken from chap 10 Méthodes Statistiques pour le Management - Hahn & Macé – Pearson 2016
Translated by Lynn FARAH]
Appendix 2 (2nd scatterplot)
Response variable: SPENDING
Explanatory variable: AGE

Variable          Mean        Corrected Standard Deviation
#1 (SPENDING)     4.17188     1.53249
#2 (FREQUENCY)    10.59375    6.76738
#3 (AGE)          35.75       11.6453

Count n: 32
R-square: 0.58999

ANALYSIS OF VARIANCE
              Sum of Squares
Regression    ?
Residual      29.8504
Total         72.8047
1. Model pertinence:
\[
F_{STAT} = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} = \frac{0.59 / 2}{(1 - 0.59)/29} = 20.865
\]
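The F statistic can be re-derived from the ANOVA figures quoted in the appendix. The sketch below is only an illustration: the function name `f_statistic` is hypothetical, and the regression sum of squares is obtained as total minus residual, which is the standard decomposition rather than a value printed in the table.

```python
def f_statistic(r2, n, k):
    """Global F statistic for a linear model with k explanatory variables."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Figures quoted in the appendix.
ss_total = 72.8047
ss_residual = 29.8504
n, k = 32, 2  # 32 clients, 2 explanatory variables (FREQUENCY, AGE)

# Regression sum of squares by difference (assumes SST = SSR + SSE).
ss_regression = ss_total - ss_residual
r2 = ss_regression / ss_total
f_stat = f_statistic(r2, n, k)

print(f"SS regression = {ss_regression:.4f}")
print(f"R-square      = {r2:.5f}")
print(f"F-stat        = {f_stat:.3f}")
```

The resulting statistic is then compared with the critical value of the F distribution with k and n − k − 1 degrees of freedom, read from a statistical table.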
Exercise 2: Choosing a model and using it
The HR director of an industrial group would like to construct a model explaining the
monthly salary of all employees.
Using data collected from a random sample of 36 employees, he tests two explanatory
variables that he deems relevant: the number of years of graduate studies (X1) and the
number of years of service (X2).
You can find below results from three regression models he tested using Excel.
2) Can you help the HR director choose the most suitable model?
a. Using the information provided, which model would you suggest using? Justify
your choice.
b. Estimate the parameters of the chosen model.
3) Pierre Durand, an employee of this group, is 38 years old, with 10 years of service
and 4 years of graduate studies. His monthly salary is 2050 Euros and he thinks he is
underpaid.
Calculate a 95% confidence interval for the mean salary of an employee with Pierre
Durand’s profile.
If you were the HR director of that firm, what
would you tell Pierre Durand about his salary?
Regression of Y w.r.t. X1
ANALYSIS OF VARIANCE
               Sum of Squares
Regression     21 475 075
Residual       694 925
Total          22 170 000

               Coefficients    Standard error
Constant       706
Variable X1    326.9           10.08

R² = 0.969
F-stat = 1050.7 and crit. value = 4.17 => Model pertinent
T-stat = 32.43 and crit. value = 2.045 => Coeff significant
Regression of Y w.r.t. X2
ANALYSIS OF VARIANCE
               Sum of Squares
Regression     13 975
Residual       22 156 025
Total          22 170 000

               Coefficients    Standard error
Constant       1 811.2
Variable X2    9.32            63.62

R² = 0.00063
F-stat = 0.021 and crit. value = 4.17 => Model NOT pertinent
T-stat = 0.1465 and crit. value = 2.045 => Coeff NOT significant
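The R², F and T figures quoted for the two single-variable regressions above can be checked directly from the sums of squares and coefficients. This is a minimal sketch: the helper `simple_regression_checks` is a hypothetical name, and the critical values are taken as quoted in the solution rather than recomputed.

```python
def simple_regression_checks(ss_reg, ss_res, n, coef, se_coef):
    """Return (R^2, F, t) for a regression with a single explanatory variable."""
    r2 = ss_reg / (ss_reg + ss_res)
    f_stat = ss_reg / (ss_res / (n - 2))  # 1 and n - 2 degrees of freedom
    t_stat = coef / se_coef
    return r2, f_stat, t_stat

N = 36          # sample size given in the exercise
F_CRIT = 4.17   # critical values as quoted in the solution
T_CRIT = 2.045

models = {
    "Y on X1": simple_regression_checks(21_475_075, 694_925, N, 326.9, 10.08),
    "Y on X2": simple_regression_checks(13_975, 22_156_025, N, 9.32, 63.62),
}

for name, (r2, f_stat, t_stat) in models.items():
    print(f"{name}: R2={r2:.5f}  F={f_stat:.3f}  t={t_stat:.4f}  "
          f"model pertinent={f_stat > F_CRIT}  "
          f"coef significant={abs(t_stat) > T_CRIT}")
```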
Regression of Y w.r.t. X1 and X2
               Coefficients    Standard error
Constant       742
Variable X1    327.27          10.15
Variable X2    -8.98           11.35

X2: T-stat = -0.79 and crit. value = 2.042 => Coeff NOT significant
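The significance test on each coefficient of the two-variable model amounts to dividing the coefficient by its standard error and comparing the absolute value with the critical value. The sketch below uses the figures from the table above; the T statistic for X1 is not stated in the solution and is computed here only for comparison.

```python
T_CRIT = 2.042  # critical value as quoted in the solution

# (coefficient, standard error) from the two-variable regression table
coefficients = {
    "X1": (327.27, 10.15),
    "X2": (-8.98, 11.35),
}

t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_stats.items()}

for name in coefficients:
    print(f"{name}: t = {t_stats[name]:.2f}, significant = {significant[name]}")
```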
Q2. A. The chosen model is the first model (containing only variable X1): it is
pertinent and its coefficient is significant, whereas X2 is not significant either
on its own or alongside X1, so including it would add nothing to the model.
Q3. A.
Chosen model = model 1 => computations made with model 1
\[
CI_{95\%} = \text{Point Estimate} \pm \text{Crit. Value} \times \text{Standard Error}
\]
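As a hedged illustration of this formula with model 1, the sketch below computes the point estimate for 4 years of graduate studies and the standard error of the mean response. Two things are assumptions rather than figures from the solution: the sample mean of X1 is not given in this excerpt, so `x_bar` is left as a parameter, and the function name `mean_response_ci` is hypothetical. The sum of squared deviations of X1 is recovered from the identity SE(b1) = s / sqrt(Sxx), and the critical value 2.045 is the one quoted in the solution.

```python
import math

# Figures quoted for model 1 (Y regressed on X1).
B0, B1 = 706.0, 326.9
SE_B1 = 10.08
SS_RESIDUAL = 694_925.0
N = 36
CRIT = 2.045  # critical value as quoted in the solution, not recomputed

def mean_response_ci(x0, x_bar):
    """Return (point estimate, lower, upper) for the mean of Y at X1 = x0."""
    s = math.sqrt(SS_RESIDUAL / (N - 2))   # residual standard error
    sxx = (s / SE_B1) ** 2                 # from SE(b1) = s / sqrt(Sxx)
    point = B0 + B1 * x0
    se = s * math.sqrt(1.0 / N + (x0 - x_bar) ** 2 / sxx)
    return point, point - CRIT * se, point + CRIT * se

# ASSUMPTION: x_bar is unknown here. Setting x_bar = x0 gives the narrowest
# possible interval; any other value of x_bar makes the interval wider.
point, lower, upper = mean_response_ci(x0=4, x_bar=4)
print(f"point estimate = {point:.1f}")
print(f"narrowest 95% CI = [{lower:.1f}, {upper:.1f}]")
```

Under these assumptions the narrowest interval is roughly [1964.9, 2062.3] around a point estimate of 2013.6 Euros. Pierre Durand's salary of 2050 Euros lies inside it, and a different sample mean of X1 would only widen the interval, so his salary is consistent with the mean salary the model predicts for his profile.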