Problems For Exercise
1. Conduct the t test for the coefficients of the given multiple regression model, with
8 df at the α = 0.05 level of significance. The standard errors for “Duration” and
“Importance” are 0.22 and 0.12, respectively. (2 marks)
Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)
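As a rough check for question 1, each t statistic is the coefficient divided by its standard error, compared against a critical value with 8 df. The sketch below assumes a two-tailed test and takes the critical value 2.306 from a standard t table. The names `t_statistics` and `T_CRIT` are illustrative only.

```python
# Coefficients and standard errors as given in the exercise.
coefs = {"Duration": 0.48108, "Importance": 0.28865}
std_errs = {"Duration": 0.22, "Importance": 0.12}

# Assumption: two-tailed test, so the critical value is t(0.025, 8) = 2.306
# as read from a standard t table.
T_CRIT = 2.306


def t_statistics(coefs, std_errs):
    """Return t = b / SE(b) for each predictor."""
    return {name: coefs[name] / std_errs[name] for name in coefs}


t_stats = t_statistics(coefs, std_errs)
for name, t in t_stats.items():
    verdict = "reject H0" if abs(t) > T_CRIT else "fail to reject H0"
    print(f"{name}: t = {t:.3f} ({verdict})")
```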
Find the missing values (?) and compute R² from the information in the given ANOVA table.
(2 marks)
Source        SS        df    MS    Calculated F
Regression    ?         1     ?     ?
Residual      18.223    ?     ?
Total         63.815    24
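The missing entries follow from the usual ANOVA identities: SS and df are additive, MS = SS / df, F = MS(regression) / MS(residual), and R² = SS(regression) / SS(total). A minimal sketch under those identities, with `complete_anova` as an illustrative helper name:

```python
def complete_anova(ss_total, ss_residual, df_regression, df_total):
    """Fill in a one-way regression ANOVA table from the known entries."""
    ss_regression = ss_total - ss_residual       # SS is additive
    df_residual = df_total - df_regression       # df is additive
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "ss_regression": ss_regression,
        "df_residual": df_residual,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "r2": ss_regression / ss_total,
    }


# Known entries taken from the table in the exercise.
table = complete_anova(ss_total=63.815, ss_residual=18.223,
                       df_regression=1, df_total=24)
for key, value in table.items():
    print(f"{key}: {value:.4f}")
```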
6. For the given data, two multiple linear regression models were constructed.
(5 marks)
Model 1: Y = 62.405 - 0.144 X4 + 0.102 X3 + 0.510 X2 + 1.551 X1
Model 2: Y = 71.648 + 0.416 X2 + 1.452 X1 - 0.2365 X4
X1 X2 X3 X4 Y
7.0 26.0 6.0 60.0 78.5
1.0 29.0 15.0 52.0 74.3
11.0 56.0 8.0 20.0 104.3
11.0 31.0 8.0 47.0 87.6
7.0 52.0 6.0 33.0 95.9
11.0 55.0 9.0 22.0 109.2
3.0 71.0 17.0 6.0 102.7
1.0 31.0 22.0 44.0 72.5
2.0 54.0 18.0 22.0 93.1
21.0 47.0 4.0 26.0 115.9
1.0 40.0 23.0 34.0 83.8
11.0 66.0 9.0 12.0 113.3
10.0 68.0 8.0 12.0 109.4
Compare the two models and determine which one fits the data better. Justify your answer.
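One possible way to compare the two models is to evaluate each equation on the 13 observations and compute SSE, R² and adjusted R². Adjusted R² penalises the extra predictor in Model 1. This is only a sketch: the choice of adjusted R² as the criterion is an assumption, and `fit_summary` is an illustrative name.

```python
# Rows copied from the data table in question 6: (X1, X2, X3, X4, Y).
rows = [
    (7.0, 26.0, 6.0, 60.0, 78.5), (1.0, 29.0, 15.0, 52.0, 74.3),
    (11.0, 56.0, 8.0, 20.0, 104.3), (11.0, 31.0, 8.0, 47.0, 87.6),
    (7.0, 52.0, 6.0, 33.0, 95.9), (11.0, 55.0, 9.0, 22.0, 109.2),
    (3.0, 71.0, 17.0, 6.0, 102.7), (1.0, 31.0, 22.0, 44.0, 72.5),
    (2.0, 54.0, 18.0, 22.0, 93.1), (21.0, 47.0, 4.0, 26.0, 115.9),
    (1.0, 40.0, 23.0, 34.0, 83.8), (11.0, 66.0, 9.0, 12.0, 113.3),
    (10.0, 68.0, 8.0, 12.0, 109.4),
]


def model_1(x1, x2, x3, x4):
    return 62.405 - 0.144 * x4 + 0.102 * x3 + 0.510 * x2 + 1.551 * x1


def model_2(x1, x2, x3, x4):
    # Model 2 does not use X3.
    return 71.648 + 0.416 * x2 + 1.452 * x1 - 0.2365 * x4


def fit_summary(model, n_predictors):
    """Return SSE, R^2 and adjusted R^2 for a model on the rows above."""
    ys = [row[4] for row in rows]
    n = len(ys)
    mean_y = sum(ys) / n
    sse = sum((row[4] - model(*row[:4])) ** 2 for row in rows)
    sst = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / (n - n_predictors - 1)) / (sst / (n - 1))
    return {"sse": sse, "r2": r2, "adj_r2": adj_r2}


summary_1 = fit_summary(model_1, n_predictors=4)
summary_2 = fit_summary(model_2, n_predictors=3)
print("Model 1:", summary_1)
print("Model 2:", summary_2)
```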
7. For the given data, apply the hierarchical agglomerative clustering algorithm using the
single linkage, complete linkage and average linkage methods. Create the confusion matrix
using the actual labels and the predicted labels. (Dataset name: IRIS dataset)
Sepal Length  Sepal Width  Petal Length  Petal Width  Iris Group Label
50 33 14 2 1
64 28 56 22 1
65 28 46 15 2
67 31 56 24 2
63 28 51 15 2
46 34 14 3 1
69 31 51 23 1
62 22 45 15 2
48 35 15 2 1
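A minimal pure-Python sketch of agglomerative clustering on the nine rows above, followed by a cross-tabulation of actual against predicted labels. It assumes Euclidean distance and a cut at two clusters, since two group labels appear in the table. Neither choice is stated in the exercise. The predicted cluster ids are arbitrary, and `agglomerate` and `confusion_matrix` are illustrative names.

```python
from itertools import combinations
from math import dist

# Rows copied from the table above: four measurements plus the group label.
data = [
    (50, 33, 14, 2, 1), (64, 28, 56, 22, 1), (65, 28, 46, 15, 2),
    (67, 31, 56, 24, 2), (63, 28, 51, 15, 2), (46, 34, 14, 3, 1),
    (69, 31, 51, 23, 1), (62, 22, 45, 15, 2), (48, 35, 15, 2, 1),
]
points = [row[:4] for row in data]
actual = [row[4] for row in data]

# Each linkage reduces the list of pairwise distances between two clusters.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}


def agglomerate(points, linkage, n_clusters=2):
    """Naive agglomerative clustering; returns one cluster id per point."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: combine(
                [dist(points[i], points[j])
                 for i in clusters[pair[0]] for j in clusters[pair[1]]]
            ),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = cluster_id
    return labels


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs as a nested dict."""
    cols = sorted(set(predicted))
    matrix = {a: {p: 0 for p in cols} for a in sorted(set(actual))}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


for name in LINKAGES:
    predicted = agglomerate(points, name)
    print(name, confusion_matrix(actual, predicted))
```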
8. A market researcher wants to understand the characteristics of people that determine
which one of two brands of toothpaste they prefer. Data are collected from 13 people (7 of
whom prefer Brand A and 6 of whom prefer Brand B), with age, gender and income as the three
independent variables.
(10 marks)