2015 No Memo Test 3
CLASS TEST 2
06 MAY 2015
QUESTION 1 [5 marks]
(a) What is test-retest reliability? (1)
(b) What is internal consistency reliability? (1)
(c) How do you measure internal consistency? Provide three formulas or explanations, not just
the names of the methods. (3)
The current study aims to identify what factors make some people believe that they are lucky and others
believe that they are unlucky. The study is based on a survey of 62 STA3022F students who answered the
following questions in an online questionnaire (possible responses for categorical variables are given in
brackets).
3. What is your gender? (1 = Male; 0 = Female)
4. Have you ever won a competition before? (1 = Yes; 0 = No)
5. How many economics courses have you completed?
A discriminant analysis model has been constructed with the aim of identifying which, if any, of the four
independent variables are able to distinguish between the two groups (groups labelled as “Yes” and “No”).
Questions:
b) Is the discriminant model able to significantly discriminate between the two groups? Provide
statistical evidence at the 5% level to support your answer. Clearly state all null and alternate
hypotheses. (4)
c) Use the cut-off value rule to classify Respondent 4. Clearly indicate the classification rule. Is this a
correct classification? (5.5)
d) Compare the overall hit rate with two chance criteria and use these comparisons to evaluate the
overall quality of the discriminant model. (4)
e) Evaluate whether the discriminant model is better at predicting some groups than others. (Hint:
Calculate the correct classification rate for each group) (1.5)
ID Q1 Q2 Q3 Q4 Q5
1 Yes 21 Female No 3
2 Yes 21 Male Yes 3
3 No 21 Female No 3
4 Yes 20 Male No 2
5 No 20 Male No 2
6 Yes 20 Female No 2
7 No 21 Male Yes 2
8 No 21 Female No 3
9 No 19 Male No 2
10 Yes 21 Female Yes 4
11 Yes 21 Male Yes 2
12 No 20 Male No 2
13 No 20 Male Yes 2
14 No 20 Male No 2
15 Yes 20 Female Yes 2
Group means:
Q2 Q3d Q4d Q5
Yes 20.75 0.428 0.2857143 3.20000
No 20.20 0.750 0.3636364 2.52273
Coefficients of linear discriminants:
LD1
Constant 0.254
Q2 -2.948
Q3d 0.085
Q4d 1.383
Q5 -0.011
Classification Table
                        Predicted Groups
                        yes    no     Total
Observed Groups  yes    28     6      34
                 no     4      24     28
Total                   32     30     62
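The comparisons asked for in questions d) and e) can be sketched in Python as follows. This is a minimal illustration, not a model answer: it assumes the two chance criteria are the proportional chance criterion (sum of squared group proportions) and the maximum chance criterion (proportion of the largest group), and the function and key names are hypothetical. The counts are copied from the classification table.

```python
# Counts copied from the classification table (outer key = observed, inner key = predicted).
table = {
    "yes": {"yes": 28, "no": 6},
    "no": {"yes": 4, "no": 24},
}


def classification_summary(table):
    """Return the overall hit rate, per-group hit rates and two chance criteria.

    Assumption: the chance criteria are the proportional chance criterion
    and the maximum chance criterion.
    """
    groups = list(table)
    group_sizes = {g: sum(table[g].values()) for g in groups}
    n = sum(group_sizes.values())

    # Overall hit rate: correctly classified cases (diagonal) over all cases.
    hit_rate = sum(table[g][g] for g in groups) / n

    # Correct classification rate within each observed group.
    group_rates = {g: table[g][g] / group_sizes[g] for g in groups}

    # Proportional chance criterion: sum of squared observed group proportions.
    proportional_chance = sum((size / n) ** 2 for size in group_sizes.values())

    # Maximum chance criterion: proportion of the largest observed group.
    maximum_chance = max(group_sizes.values()) / n

    return {
        "hit_rate": hit_rate,
        "group_rates": group_rates,
        "proportional_chance": proportional_chance,
        "maximum_chance": maximum_chance,
    }


summary = classification_summary(table)
for name, value in summary.items():
    print(name, value)
```

A hit rate that clearly exceeds both criteria suggests the model classifies better than chance; the per-group rates show whether one group is predicted more accurately than the other.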
> centroidYes
[1] -1.0242
> centroidNo
[1] 1.0974
In a 2001 paper titled “Variable precision rough set theory and data discretisation: an application to corporate
failure prediction”, Beynon and Peel use a number of financial performance ratios to build a model that is
able to discriminate between firms in the UK that fail and those that do not fail. Data on 60 randomly chosen
firms were collected on the following set of financial variables.
Refer to the attached Classification tree and answer the following questions.
Questions:
a) Define a set of decision rules indicating the circumstances under which firms can be predicted as
failing or not failing. (3)
b) Which group would Firm 2 be classified to? Is this a correct classification? (2)
c) Calculate the diversity index for node 1 (Root Node) and comment on why the CACL variable is chosen as
a splitting variable. (2)
d) Briefly explain the differences between the Bonsai and Pruning techniques. (1)
Node 1: NotFail (29/31)
  CACL <= 1.1694 -> Node 2: Fail (23/8)
    ROCS <= 4.4486
    ROCS > 4.4486
  CACL > 1.1694 -> Node 3: NotFail (7/22)
    CLTA <= 0.70635
    CLTA > 0.70635
\[ \text{WADI} = \frac{n_1 DI_1 + n_2 DI_2}{n_1 + n_2} \]
\[ \text{Reduction in diversity} = DI - \text{WADI} \]
\[ F = \frac{(n - 1 - p)\, n_1 n_2}{p\,(n - 2)(n_1 + n_2)}\, d^2 \]
\[ \text{Cut-off value} = \frac{n_1 \bar{Z}_2 + n_2 \bar{Z}_1}{n_1 + n_2} \]
𝐹, , . = 2.557
𝐹, , . = 2.513
\[ DI = 1 - \sum_i p_i^2 \]
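As a worked illustration of the diversity-index calculation in question c), here is a minimal Python sketch. It assumes the diversity index is the Gini form (one minus the sum of squared class proportions) and that WADI is the size-weighted average of the child-node indices; the function names are hypothetical. The node counts are copied as printed on the classification tree (note that the child counts sum to 30/30 while the root node reads 29/31).

```python
def diversity_index(counts):
    """Gini diversity index: 1 minus the sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_average_diversity(children):
    """Average of the child-node diversity indices, weighted by node size."""
    sizes = [sum(child) for child in children]
    weighted = sum(n * diversity_index(child) for n, child in zip(sizes, children))
    return weighted / sum(sizes)


# Class counts per node, copied as printed on the tree.
root = (29, 31)    # node 1
left = (23, 8)     # node 2 (CACL <= 1.1694)
right = (7, 22)    # node 3 (CACL > 1.1694)

di_root = diversity_index(root)
wadi = weighted_average_diversity([left, right])
reduction = di_root - wadi  # reduction in diversity achieved by the CACL split

print("DI (node 1):", round(di_root, 4))
print("WADI (CACL split):", round(wadi, 4))
print("Reduction:", round(reduction, 4))
```

Under these assumptions, a split is attractive when it gives a large reduction in diversity, which is the usual basis for commenting on why a particular splitting variable was chosen.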