STA3022 Test2 Solutions

The document discusses a statistical analysis of factors that influence whether people consider themselves lucky or unlucky. It describes a survey that collected data on 62 students, including whether they consider themselves lucky, their age, gender, history of competition wins, and economics courses completed. A discriminant analysis was conducted to identify which variables distinguish between those who say they are lucky versus unlucky. The analysis found that the model could significantly discriminate between the two groups. It also provides details on evaluating the model, such as calculating hit rates and classification accuracy for each group.

Uploaded by

alutakaunda


UNIVERSITY OF CAPE TOWN

DEPARTMENT OF STATISTICAL SCIENCES

STA3022F
TEST 2

Question 1 [5 marks]
(a) What is test-retest reliability? (1)
(b) What is internal consistency reliability? (1)
(c) How do you measure internal consistency? Provide three formulas or explanations, not just
the names of the methods. (3)

Answer to Q1
(a) One of:
A reliable measuring instrument in this context is one that gives consistent scores when used
repeatedly.
Or
There should be high correlations between test scores taken over multiple trials.

(b) The group of questions is internally consistent, or reliable, if the questions all measure the
same underlying construct.

(c) Cronbach's alpha:
\[ \alpha = \frac{k}{k-1} \times \frac{var(Q_1+\cdots+Q_k) - \{var(Q_1)+\cdots+var(Q_k)\}}{var(Q_1+\cdots+Q_k)} \]
α-if-deleted: calculate Cronbach's alpha without each question in turn.
Item-total correlation: calculate the correlation between each question and the sum of all
the other questions.
Half mark for each name and half mark for formula/description.
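
The Cronbach's alpha formula in (c) can be checked numerically. The sketch below is only an illustration: the function name `cronbach_alpha` and the three score lists are hypothetical and are not taken from the test.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, one per question, each holding the scores
    of the same respondents in the same order."""
    k = len(items)
    # Total score Q1 + ... + Qk for every respondent
    totals = [sum(scores) for scores in zip(*items)]
    var_total = variance(totals)
    sum_item_vars = sum(variance(q) for q in items)
    return k / (k - 1) * (var_total - sum_item_vars) / var_total

# Hypothetical scores: 3 questions answered by 5 respondents
q1 = [1, 2, 3, 4, 5]
q2 = [2, 3, 4, 5, 6]
q3 = [1, 3, 3, 5, 5]
alpha = cronbach_alpha([q1, q2, q3])
print(round(alpha, 3))
```

Recomputing `cronbach_alpha` with one question left out at a time gives the α-if-deleted values.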

Question 2 [16 marks]

(a) In the painters data set in the R package MASS the subjective assessment, on a 0 to 20 integer
scale, of 54 classical painters is given. The painters were assessed on four characteristics:
composition, drawing, colour and expression. Calculate the Euclidean distance between the
following two samples:
> painters[1:2,]
         Composition Drawing Colour Expression
Da Udine          10       8     16          3
Da Vinci          15      16      4         14
(3)
(b) Why is there no need to scale the data set before calculating the Euclidean distance? (1)
(c) Define 𝑠𝑡𝑟𝑒𝑠𝑠 and explain how it is used. (5)
(d) Explain step by step how to perform hierarchical clustering with the centroid method. (7)

Answer to Q2
(a)
\[ d_{12} = \sqrt{\sum_{j=1}^{p}(x_{1j}-x_{2j})^2} = \sqrt{(10-15)^2+(8-16)^2+(16-4)^2+(3-14)^2} \] ✓ ✓
\[ = \sqrt{(-5)^2+(-8)^2+(12)^2+(-11)^2} = \sqrt{25+64+144+121} = \sqrt{354} = 18.8 \] ✓
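
The arithmetic in (a) can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the two rows printed in the question; the helper name `euclidean` is an assumption made for illustration.

```python
import math

# Scores for Composition, Drawing, Colour, Expression (from the question)
da_udine = [10, 8, 16, 3]
da_vinci = [15, 16, 4, 14]

def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

d12 = euclidean(da_udine, da_vinci)
print(round(d12, 1))  # 18.8
```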

(b) All variables are measured on the same 0 to 20 integer scale.

(c)
\[ stress = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} (d_{ij} - \delta_{ij})^2 \] ✓
The aim of MDS is to find a representation of the samples so that the dissimilarities between
them in the plot, given by $\delta_{ij}$, match the given dissimilarities $d_{ij}$ as closely as possible
(optimally), that is, so that the stress is as small as possible. ✓ ✓
If the symbols are reversed, no marks are deducted as long as the descriptions are correct.
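
As an illustration of the stress formula in (c), the sketch below sums the squared differences over all pairs i < j. The two 3×3 dissimilarity matrices are hypothetical and serve only to show the calculation.

```python
def stress(d, delta):
    """Sum over all pairs i < j of (d_ij - delta_ij)^2."""
    n = len(d)
    return sum(
        (d[i][j] - delta[i][j]) ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )

# Hypothetical given dissimilarities d and plot dissimilarities delta
d = [[0, 3, 4],
     [3, 0, 5],
     [4, 5, 0]]
delta = [[0, 3, 4],
         [3, 0, 4],
         [4, 4, 0]]
print(stress(d, delta))  # 1: only the pair (2, 3) differs, by 1
```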

(d) Start with all objects each in its own cluster. ✓
Merge the two closest clusters. ✓
Repeat
    Calculate the dissimilarity between the newly merged cluster and each other cluster
    by calculating the distance between the cluster means. ✓
    Merge the two closest clusters. ✓
Until all objects are merged into the same cluster. ✓
Use the clustering tree to cut the tree at a specific height or into a specific number of
clusters. ✓
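
The steps in (d) can be sketched in plain Python. This is a simplified illustration, not the routine used in the course: the function names and the five sample points are hypothetical, and cluster means are recomputed from the raw points at every pass rather than updated incrementally.

```python
import math

def centroid(points):
    """Mean of a list of points, coordinate by coordinate."""
    return [sum(c) / len(points) for c in zip(*points)]

def centroid_clustering(data):
    # Start with every object in its own cluster
    clusters = {i: [p] for i, p in enumerate(data)}
    next_id = len(data)
    merges = []  # (cluster a, cluster b, merge height)
    while len(clusters) > 1:
        ids = sorted(clusters)
        pairs = [(a, b) for n, a in enumerate(ids) for b in ids[n + 1:]]

        def gap(pair):
            # Dissimilarity = distance between the two cluster means
            return math.dist(centroid(clusters[pair[0]]),
                             centroid(clusters[pair[1]]))

        a, b = min(pairs, key=gap)  # the two closest clusters
        merges.append((a, b, gap((a, b))))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges

# Hypothetical two-dimensional objects
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
merges = centroid_clustering(points)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

The recorded merge heights are what a dendrogram would display; cutting at a chosen height keeps only the merges below it.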

QUESTION 3 [17 marks]

The current study aims to identify what factors make some people believe that they are lucky and others
believe that they are unlucky. The study is based on a survey of 62 STA3022F students who answered the
following questions in an online questionnaire (possible responses for categorical variables are given in
brackets).

1. Do you consider yourself to be a lucky person? (Yes/No)

2. What is your age?
3. What is your gender? (1 = Male; 0 = Female)
4. Have you ever won a competition before? (1 = Yes; 0 = No)
5. How many economics courses have you completed?

A discriminant analysis model has been constructed with the aim of identifying which, if any, of the four
independent variables are able to distinguish between the two groups (labelled "Yes" and "No").
Questions:

a) Write down the discriminant function. (2)

b) Which groups is the discriminant model able to significantly discriminate between? Provide
statistical evidence at the 5% level to support your answer. Clearly state all null and alternate
hypotheses. (4)

c) Use the cut-off value rule to classify Respondent 4. Clearly indicate the classification rule. Is this a
correct classification? (5.5)

d) Compare the overall hit rate with two chance criteria and use these comparisons to evaluate the
overall quality of the discriminant model (4)

e) Evaluate whether the discriminant model is better at predicting some groups than others. (Hint:
Calculate the correct classification rate for each group)
(1.5)

Q3-a) Write down the discriminant function.

\[ Z_1 = 0.254 - 2.948 \times Q2 + 0.085 \times Q3d + 1.383 \times Q4d - 0.011 \times Q5 \]
[½ ½ ½ ½]
Q3-b)

$H_0$: There is no difference between the centroids of the Yes and No categories.

$H_1$: There is a difference between the centroids of the Yes and No categories.

First we need to calculate the distance: [½ ½]

\[ d^2 = (-1.0242 - 1.0974)^2 = 4.501187 \]

Then the F statistic: [½ ratio] [½ answer]

\[ F_{yes,no} = \frac{(n-1-p)\,n_1 n_2}{p\,(n-2)(n_1+n_2)}\, d^2 = \frac{(62-1-4) \times 34 \times 28}{4 \times (62-2) \times (34+28)} \times 4.501187 = 16.41481 \]

\[ F_{critical} = F_{p,\,n-1-p,\,\alpha} = F_{4,\,62-1-4,\,0.05} = F_{4,\,57,\,0.05} = 2.533 \]

[½ comparison] [½ conclusion]

Since the calculated F is greater than the critical F value, the centroids are significantly different from
each other at the 5% significance level.

(Alternatively, students can say that the calculated F is very high.)
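
The test statistic in Q3-b can be recomputed from the group sizes and centroids quoted in the memo. The sketch below hard-codes those values; the critical value 2.533 is copied from the memo rather than computed, and the variable names are chosen here for illustration.

```python
# Values quoted in the memo
n1, n2 = 34, 28                # group sizes (Yes, No)
p = 4                          # number of independent variables
z_yes, z_no = -1.0242, 1.0974  # group centroids
n = n1 + n2

d2 = (z_yes - z_no) ** 2  # squared distance between the centroids
f_calc = (n - 1 - p) * n1 * n2 / (p * (n - 2) * (n1 + n2)) * d2
f_crit = 2.533            # F(4, 57) at the 5% level, as given in the memo

print(round(d2, 6), round(f_calc, 5))
print("reject H0" if f_calc > f_crit else "do not reject H0")
```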

Q3-c) First we need to calculate the cut-off value: [½ ratio] [½ answer]

\[ \text{Cut-off} = \frac{n_1 \bar{Z}_2 + n_2 \bar{Z}_1}{n_1 + n_2} = \frac{34 \times 1.0974 + 28 \times (-1.0242)}{34 + 28} = 0.1392581 \]

Then we need to specify the rule: [½]

If Z < 0.1392581 then classify as "YES"; otherwise classify as "NO".

Calculate the Z value for the 4th respondent: [½ ½ ½ ½ ½]

\[ Z_4 = 0.254 - 2.948 \times 20 + 0.085 \times 1 + 1.383 \times 0 - 0.011 \times 2 = -58.643 \]

Since $Z_4 < 0.1392581$, classify as "YES", because the centroid for Yes is negative. [½ ½]

Therefore it is a correct classification. [½]
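
The cut-off rule in Q3-c can be expressed as a short function. This is a sketch under the memo's own numbers: the coefficients, centroids, group sizes and Respondent 4's values (age 20, male, no competition win, 2 economics courses) are those substituted above, while the function and variable names are hypothetical.

```python
n1, n2 = 34, 28                # group sizes (Yes, No)
z_yes, z_no = -1.0242, 1.0974  # group centroids

# Weighted cut-off: each centroid is weighted by the other group's size
cutoff = (n1 * z_no + n2 * z_yes) / (n1 + n2)

def z_score(age, male, won, econ_courses):
    """Discriminant score using the coefficients from Q3-a."""
    return (0.254 - 2.948 * age + 0.085 * male
            + 1.383 * won - 0.011 * econ_courses)

def classify(z):
    # The Yes centroid is negative, so scores below the cut-off are "Yes"
    return "Yes" if z < cutoff else "No"

z4 = z_score(age=20, male=1, won=0, econ_courses=2)
print(round(cutoff, 7), round(z4, 3), classify(z4))
```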

Q3-d) Evaluate the hit-rate. [½]

\[ \text{Hit-rate} = \frac{28 + 24}{62} = 83.87\% \] [½]

\[ H_{max} = \max(34/62,\ 28/62) = 54.84\% \] [½ ½]

\[ H_{prop} = \left(\frac{34}{62}\right)^2 + \left(\frac{28}{62}\right)^2 = 50.47\% \] [½ ½]

The hit-rate is greater than both $H_{max}$ and $H_{prop}$, which indicates a good hit-rate. [½ ½]

Q3-e) Evaluate the hit-rate for each category.

\[ \text{Hit-rate(yes)} = \frac{28}{34} = 82.4\% \] [½]

\[ \text{Hit-rate(no)} = \frac{24}{28} = 85.7\% \] [½]

Both correct classification rates are similar and very good. [½]
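
The hit-rate calculations in Q3-d and Q3-e can be reproduced from the four counts used in the memo (34 Yes and 28 No respondents, of whom 28 and 24 were classified correctly). The variable names below are chosen for illustration only.

```python
n_yes, n_no = 34, 28              # observed group sizes
correct_yes, correct_no = 28, 24  # correctly classified in each group
n = n_yes + n_no

hit_rate = (correct_yes + correct_no) / n
h_max = max(n_yes / n, n_no / n)             # maximum chance criterion
h_prop = (n_yes / n) ** 2 + (n_no / n) ** 2  # proportional chance criterion

rate_yes = correct_yes / n_yes  # correct classification rate, Yes group
rate_no = correct_no / n_no     # correct classification rate, No group

for name, value in [("hit rate", hit_rate), ("H_max", h_max),
                    ("H_prop", h_prop), ("yes", rate_yes), ("no", rate_no)]:
    print(f"{name}: {100 * value:.2f}%")
```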

Q4-a) Interpret the Classification Tree and define an appropriate decision rule for selecting a
positive return.
(1) If CACL <= 1.1694 & ROCS <= 4.4486 & WCTA <= -0.3326, then classify as Not Fail [½]
(2) If CACL <= 1.1694 & ROCS <= 4.4486 & WCTA > -0.3326, then classify as Fail [½]
(3) If CACL <= 1.1694 & ROCS > 4.4486, then classify as Not Fail [½]
(4) If CACL > 1.1694 & CLTA <= 0.70635 & Sales <= 3091.5, then classify as Fail [½]
(5) If CACL > 1.1694 & CLTA <= 0.70635 & Sales > 3091.5, then classify as Not Fail [½]
(6) If CACL > 1.1694 & CLTA > 0.70635, then classify as Fail [½]

Q4-b)
Firm   SALES   ROCS    CLTA   CACL   WCTA    FAIL
2      16149   -1.07   1.22   0.62   -0.46   0

CACL = 0.62 < 1.1694 & ROCS = -1.07 < 4.4486 & WCTA = -0.46 < -0.3326 [½ ½ ½]

Therefore classify as Not Fail. [½]
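
The six rules from Q4-a can be written as nested conditions and applied to Firm 2. This is a minimal sketch using the split values listed in the memo; the function name `classify_firm` is an assumption made for illustration.

```python
def classify_firm(sales, rocs, clta, cacl, wcta):
    """Apply the six tree rules from Q4-a, using the memo's split values."""
    if cacl <= 1.1694:
        if rocs <= 4.4486:
            # Rules (1) and (2)
            return "Not Fail" if wcta <= -0.3326 else "Fail"
        return "Not Fail"  # Rule (3)
    if clta <= 0.70635:
        # Rules (4) and (5)
        return "Fail" if sales <= 3091.5 else "Not Fail"
    return "Fail"  # Rule (6)

# Firm 2 from Q4-b
firm2 = classify_firm(sales=16149, rocs=-1.07, clta=1.22, cacl=0.62, wcta=-0.46)
print(firm2)  # Not Fail
```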

Q4-c)

[½ ½ ½]

\[ DI_1 = 1 - \left( \left(\frac{29}{60}\right)^2 + \left(\frac{31}{60}\right)^2 \right) = 0.4994444 \]

OR

\[ DI_1 = 1 - \left( \left(\frac{30}{60}\right)^2 + \left(\frac{30}{60}\right)^2 \right) = 0.5 \]

The variable is chosen according to the reduction in the DI. The variable that creates the maximum
reduction in the index is chosen for splitting the node. [½]
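
The index in Q4-c is one minus the sum of squared class proportions, which is the form of the Gini index. The sketch below evaluates it for the two root-node splits accepted in the memo; the function name is hypothetical.

```python
def diversity_index(counts):
    """1 minus the sum of squared class proportions at a node."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

di_29_31 = diversity_index([29, 31])  # 29 Fail and 31 Not Fail
di_30_30 = diversity_index([30, 30])  # the memo's alternative counts
print(round(di_29_31, 7), di_30_30)
```

A candidate split can be scored the same way: compute the index for each child node, weight by child size, and compare the weighted total with the parent's index.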

Q4-d)

Bonsai techniques check several stopping criteria before letting the tree grow fully. [½]
Pruning techniques let the tree grow fully and then start pruning it. [½]

Q4-d)

Classification Table
                        Predicted Groups
                        Fail [½]          NotF              Total
Observed     Fail       21+2+2=25 [½]     29-25=4 [½]       29
Groups       NotF       31-28=3 [½]       2+4+22=28 [½]     31 [½]     [½ (totals)]
             Total      28                32                60         [½ (totals)]

OR

Classification Table
                        Predicted Groups
                        Fail [½]          NotF              Total
Observed     Fail       21+2+2=25 [½]     1+1+3=5 [½]       30
Groups       NotF       2+0+0=2 [½]       2+4+22=28 [½]     30 [½]     [½ (totals)]
             Total      27                33                60         [½ (totals)]

OR

Classification Table
                        Predicted Groups
                        Fail [½]          NotF              Total
Observed     Fail       21+2+2=25 [½]     1+1+3=5 [½]       29
Groups       NotF       2+0+0=2 [½]       2+4+22=28 [½]     31 [½]     [½ (totals)]
             Total      27                33                60         [½ (totals)]