TEST1

The document discusses a regression analysis of sales and advertising data, highlighting the impact of an outlier from week 12 on the regression results. After removing this outlier, the coefficients and statistical significance improved significantly, indicating a stronger relationship between advertising and sales. The final model shows that advertising explains 51.54% of the variance in sales, compared to only 2.7% in the initial model.

Uploaded by

cumaalihuyuk0
Copyright
© All Rights Reserved

a) Make the scatter diagram with sales on the vertical axis and advertising on the horizontal axis. What do you expect to find if you would fit a regression line to these data?

Advertising and Sales

Observation  Advertising  Sales
 1           12           24
 2           12           27
 3            9           25
 4           11           27
 5            6           23
 6            9           25
 7           15           27
 8            6           25
 9           11           26
10           16           27
11           11           25
12            6           50
13           13           26
14           11           23
15           13           26
16            7           23
17            8           23
18            8           24
19           12           26
20            9           24

[Figure: scatter diagram with Sales on the vertical axis and Advertising on the horizontal axis. Legend: Data, Linear (Data). Trend line: y = -0.3246x + 29.627, R² = 0.027]

Normally we would expect to find a positive relationship between advertising and sales.
However, the extreme value (6; 50) of the 12th week skews the regression results.
b) Estimate the coefficients a and b in the simple regression model with sales as dependent variable and advertising as explanatory factor. Also compute the
standard error and t-value of b. Is b significantly different from 0?

Computing the regression in Excel gives:

a = 29.63
b = -0.3246
Standard error of b = 0.4589
t-value of b = -0.707
p-value of b = 0.4885 > 0.05

Thus b is not significantly different from 0. Therefore, in this model, it appears that the number of advertisements does not affect the number of sales.

Regression statistics
Multiple R         0.1644364
R²                 0.02703933
Adjusted R²       -0.027014041
Standard error     5.836474462
Observations      20

ANALYSIS OF VARIANCE
             Degrees of freedom  Sum of squares  Mean square  F            Significance F
Regression    1                  17.04018547     17.04018547  0.500233921  0.488454014
Residual     18                  613.1598145     34.06443414
Total        19                  630.2

               Coefficients   Standard error  t statistic   p-value      Lower 95%     Upper 95%
Intercept      29.62689335    4.881527318      6.069185201  9.7841E-06   19.37118502   39.88260169
X Variable 1   -0.324574961   0.458910976     -0.707272169  0.488454014  -1.288711145   0.639561222
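The Excel figures for part (b) can be cross-checked by hand. The following is a minimal sketch in plain Python, assuming the textbook formulas for simple least squares (b = Sxy/Sxx, a = mean(y) - b*mean(x), SE(b) = s/sqrt(Sxx)). The helper name `simple_ols` is hypothetical and not part of the original solution.

```python
from math import sqrt

# Data from the "Advertising and Sales" table (20 weeks).
advertising = [12, 12, 9, 11, 6, 9, 15, 6, 11, 16, 11, 6, 13, 11, 13, 7, 8, 8, 12, 9]
sales = [24, 27, 25, 27, 23, 25, 27, 25, 26, 27, 25, 50, 26, 23, 26, 23, 23, 24, 26, 24]


def simple_ols(x, y):
    """Fit y = a + b*x by least squares (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))            # residual standard error
    se_b = s / sqrt(sxx)               # standard error of the slope
    return {"a": a, "b": b, "se_b": se_b, "t_b": b / se_b, "s": s}


fit = simple_ols(advertising, sales)
for name, value in fit.items():
    print(f"{name} = {value:.4f}")
```

The printed values should agree with the Intercept, X Variable 1, Standard error and t statistic entries of the Excel output up to rounding.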
c) Compute the residuals and draw a histogram of these residuals. What conclusion do you draw from this histogram?
Looking at the graphs below, we can clearly state that the 12th week is the special week.

Observation  Advertising  Sales  Prediction  Residual             Residual
                                 (a + b*Ad)  (Sales - (a + b*Ad))  squared
 1           12           24     25.7318     -1.7318                2.99913124
 2           12           27     25.7318      1.2682                1.60833124
 3            9           25     26.7056     -1.7056                2.90907136
 4           11           27     26.0564      0.9436                0.89038096
 5            6           23     27.6794     -4.6794               21.89678436
 6            9           25     26.7056     -1.7056                2.90907136
 7           15           27     24.758       2.242                 5.026564
 8            6           25     27.6794     -2.6794                7.17918436
 9           11           26     26.0564     -0.0564                0.00318096
10           16           27     24.4334      2.5666                6.58743556
11           11           25     26.0564     -1.0564                1.11598096
12            6           50     27.6794     22.3206              498.2091844
13           13           26     25.4072      0.5928                0.35141184
14           11           23     26.0564     -3.0564                9.34158096
15           13           26     25.4072      0.5928                0.35141184
16            7           23     27.3548     -4.3548               18.96428304
17            8           23     27.0302     -4.0302               16.24251204
18            8           24     27.0302     -3.0302                9.18211204
19           12           26     25.7318      0.2682                0.07193124
20            9           24     26.7056     -2.7056                7.32027136
[Figure: "Residuals", residual for each observation 1 to 20]

The 12th week clearly stands out.

[Figure: "Residuals squared", squared residual for each observation 1 to 20]
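The residual column of part (c) can be reproduced from the rounded coefficients of part (b). This is a minimal sketch. The cut-off of three residual standard deviations is only an assumed rule of thumb for flagging unusual weeks and does not come from the original solution. The scaled residuals are also a crude measure, since they ignore leverage.

```python
from math import sqrt

# Data from the "Advertising and Sales" table (20 weeks).
advertising = [12, 12, 9, 11, 6, 9, 15, 6, 11, 16, 11, 6, 13, 11, 13, 7, 8, 8, 12, 9]
sales = [24, 27, 25, 27, 23, 25, 27, 25, 26, 27, 25, 50, 26, 23, 26, 23, 23, 24, 26, 24]

# Rounded coefficients reported in part (b).
a, b = 29.627, -0.3246

predictions = [a + b * ad for ad in advertising]
residuals = [sale - pred for sale, pred in zip(sales, predictions)]

# Residual standard deviation with n - 2 degrees of freedom.
n = len(sales)
resid_sd = sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Assumption: flag weeks whose residual exceeds 3 residual standard deviations.
THRESHOLD = 3.0
flagged_weeks = [
    week for week, r in enumerate(residuals, start=1)
    if abs(r) > THRESHOLD * resid_sd
]

for week, r in enumerate(residuals, start=1):
    marker = "  <-- flagged" if week in flagged_weeks else ""
    print(f"week {week:2d}: residual {r:8.4f}{marker}")
```

With these data only week 12 is flagged, which matches the conclusion drawn from the graphs.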
d) Apparently, the regression result of part (b) is not satisfactory. Once you realize that the large residual corresponds to the week with opening hours during
the evening, how would you proceed to get a more satisfactory regression model?

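To obtain a more satisfactory regression model, we should delete the 12th week's values. This leaves only 19 observations, but because the deleted week was atypical (the shop was open during the evening), the regression on the remaining weeks describes the usual relationship between advertising and sales more accurately.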

e) Delete this special week from the sample and use the remaining 19 weeks to estimate the coefficients a and b in the simple regression model with sales as
dependent variable and advertising as explanatory factor. Also compute the standard error and t-value of b. Is b significantly different from 0?

Observation  Advertising  Sales  Prediction  Residual             Residual
                                 (a + b*Ad)  (Sales - (a + b*Ad))  squared
 1           12           24     25.625      -1.625                2.640625
 2           12           27     25.625       1.375                1.890625
 3            9           25     24.5         0.5                  0.25
 4           11           27     25.25        1.75                 3.0625
 5            6           23     23.375      -0.375                0.140625
 6            9           25     24.5         0.5                  0.25
 7           15           27     26.75        0.25                 0.0625
 8            6           25     23.375       1.625                2.640625
 9           11           26     25.25        0.75                 0.5625
10           16           27     27.125      -0.125                0.015625
11           11           25     25.25       -0.25                 0.0625
12           13           26     26           0                    0
13           11           23     25.25       -2.25                 5.0625
14           13           26     26           0                    0
15            7           23     23.75       -0.75                 0.5625
16            8           23     24.125      -1.125                1.265625
17            8           24     24.125      -0.125                0.015625
18           12           26     25.625       0.375                0.140625
19            9           24     24.5        -0.5                  0.25
[Figure: "Data", scatter diagram with Sales on the vertical axis and Advertising on the horizontal axis. Trend line: y = 0.375x + 21.125, R² = 0.5154]

[Figure: "Residuals squared", squared residual for each observation 1 to 19]

[Figure: "Residuals", residual for each observation 1 to 19]
Regression statistics
Multiple R         0.717893879
R²                 0.515371622
Adjusted R²        0.48686407
Standard error     1.053704948
Observations      19

ANALYSIS OF VARIANCE
             Degrees of freedom  Sum of squares  Mean square  F            Significance F
Regression    1                  20.07236842     20.07236842  18.07842454  0.000537945
Residual     17                  18.875          1.110294118
Total        18                  38.94736842

               Coefficients  Standard error  t statistic  p-value      Lower 95%    Upper 95%
Intercept      21.125        0.954848094     22.12393797  5.71577E-14  19.11044662  23.13955338
X Variable 1   0.375         0.088196424     4.251873062  0.000537945  0.18892181   0.56107819
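Whether b differs significantly from 0 depends on the two-sided p-value of its t statistic. The sketch below recomputes the p-values reported by Excel in parts (b) and (e), assuming a Student t distribution with n - 2 degrees of freedom. It integrates the density numerically with Simpson's rule so that only the standard library is needed. The function names are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_value, df, steps=2000):
    """P(|T| > |t_value|) via Simpson's rule on [0, |t_value|]; steps must be even."""
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # P(-|t| < T < |t|), using symmetry
    return 1 - central_area


# t statistics of b taken from the two Excel outputs.
p_with_week_12 = two_sided_p_value(-0.707272169, df=18)     # part (b), n = 20
p_without_week_12 = two_sided_p_value(4.251873062, df=17)   # part (e), n = 19

print(f"part (b): p = {p_with_week_12:.6f}")
print(f"part (e): p = {p_without_week_12:.6f}")
```

In part (e) the p-value is far below 0.05, so b is significantly different from 0 once the special week is removed, whereas in part (b) it is not.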


f) Discuss the differences between your findings in parts (b) and (e). Describe in words what you have learned from these results.

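Comparing the outcomes computed in Excel for the cases with 20 observations (before) and 19 observations (after):

                     Before    After
a                    29.627    21.125
b                    -0.3246   0.375
R²                   0.027     0.5154
standard error of b  0.459     0.088
t-value of b         -0.707    4.252
p-value of b         0.489     0.00054

We can see that by removing this special week, the number of sales is better explained by the number of advertisements made (51.54% explanatory power, against 2.7% before). The slope b also changes from negative and not significant (p = 0.489) to positive and significant (p = 0.00054), so a single extreme observation was enough to hide the positive relationship between advertising and sales.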
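As a check on the comparison, the R² and t-values follow from the Excel outputs of parts (b) and (e). The worked arithmetic below uses the rounded values reported there, with R² taken as the regression sum of squares divided by the total sum of squares and t as the slope divided by its standard error.

```latex
\begin{align*}
R^2_{\text{before}} &= \frac{SS_{\text{regression}}}{SS_{\text{total}}}
                     = \frac{17.0402}{630.2} \approx 0.0270 \\
R^2_{\text{after}}  &= \frac{20.0724}{38.9474} \approx 0.5154 \\
t_{\text{before}}   &= \frac{b}{SE(b)} = \frac{-0.3246}{0.4589} \approx -0.707 \\
t_{\text{after}}    &= \frac{0.375}{0.0882} \approx 4.252
\end{align*}
```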
