Tutorial 2 2024/09/24
Review
Notation
p – number of variables in the data set.
m – number of common factors.
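$\mu = (\mu_1, \dots, \mu_p)^T$ – mean of X.
$\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1j} & \cdots & \sigma_{1p} \\ \vdots & & \vdots & & \vdots \\ \sigma_{i1} & \cdots & \sigma_{ij} & \cdots & \sigma_{ip} \\ \vdots & & \vdots & & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pj} & \cdots & \sigma_{pp} \end{bmatrix}$ – p x p covariance matrix for X.
$\rho = (\rho_{ij})$ – p x p correlation matrix for X.
$L = \begin{bmatrix} l_{11} & \dots & l_{1m} \\ \vdots & & \vdots \\ l_{p1} & \dots & l_{pm} \end{bmatrix}$ – p x m factor loading matrix.
$l_{ij}$ – factor loading of the ith variable on the jth factor.
$h_i^2 = l_{i1}^2 + \cdots + l_{im}^2$ – communality (amount of variance explained by the m factors) of $X_i$.
$\Psi = \begin{bmatrix} \psi_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \psi_p \end{bmatrix}$ – covariance matrix of $\varepsilon$.
$\psi_i$ – uniqueness or specific variance for $X_i$.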
Factor analysis
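$X_i - \mu_i = l_{i1}F_1 + \cdots + l_{im}F_m + \varepsilon_i$
$X - \mu = LF + \varepsilon$
Assumptions
o $E(F) = 0$
o $\mathrm{Cov}(F) = I \implies \mathrm{Var}(F_j) = 1,\ \mathrm{Cov}(F_j, F_k) = 0$ for $j \ne k$
o $E(\varepsilon) = 0$
o $\mathrm{Cov}(\varepsilon) = \Psi$, so $\mathrm{Cov}(\varepsilon_i, \varepsilon_k) = 0$ for $i \ne k$
o F and $\varepsilon$ are independent, so $\mathrm{Cov}(\varepsilon, F) = 0$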
Properties
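$\mathrm{Cov}(X) = LL^T + \Psi$
o $\mathrm{Var}(X_i) = l_{i1}^2 + \cdots + l_{im}^2 + \psi_i$
o $\mathrm{Cov}(X_i, X_k) = l_{i1}l_{k1} + \cdots + l_{im}l_{km}$
$\mathrm{Cov}(X, F) = L$
o $\mathrm{Cov}(X_i, F_j) = l_{ij}$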
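The properties above can be checked numerically. The following is a minimal sketch, not part of the original tutorial: the loading values are hypothetical, and the specific variances are chosen (as an assumption) so that every variable has unit variance. It builds the covariance matrix implied by the model and compares it with the sample covariance of simulated data.

```python
import numpy as np

# Hypothetical loadings: p = 4 variables, m = 2 factors (illustrative values only).
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.9],
              [0.1, 0.6]])

h2 = (L ** 2).sum(axis=1)        # communalities h_i^2 (row sums of squared loadings)
psi = 1.0 - h2                   # specific variances, assuming Var(X_i) = 1
Sigma = L @ L.T + np.diag(psi)   # covariance implied by the model: L L^T + Psi

# Simulate X - mu = L F + eps with Cov(F) = I, Cov(eps) = Psi, F independent of eps.
rng = np.random.default_rng(0)
n = 200_000
F = rng.standard_normal((n, 2))
eps = rng.standard_normal((n, 4)) * np.sqrt(psi)
X = F @ L.T + eps

S = np.cov(X, rowvar=False)                               # sample Cov(X)
cov_XF = (X - X.mean(0)).T @ (F - F.mean(0)) / (n - 1)    # sample Cov(X, F)

print(np.round(Sigma, 3))    # model covariance
print(np.round(S, 3))        # should be close to Sigma
print(np.round(cov_XF, 3))   # should be close to L
```

With a large simulated sample, the sample covariance should agree with $LL^T + \Psi$ and the sample Cov(X, F) with L, up to simulation noise.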
Rotation
For any m × m orthogonal matrix T, i.e. such that
$TT^T = T^TT = I$,
the rotated loadings and factors are
$L^* = LT$ and $F^* = T^TF$.
$L^*$ and $F^*$ satisfy the same assumptions as L and F. Most importantly,
$\Sigma = LL^T + \Psi = L^*L^{*T} + \Psi$.
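A small numerical illustration of this invariance, as a sketch only: the loadings and the 30-degree rotation angle are hypothetical choices, not values from the tutorial.

```python
import numpy as np

# Hypothetical loadings for p = 4 variables and m = 2 factors.
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.9],
              [0.1, 0.6]])

# A 2 x 2 rotation matrix is orthogonal: T T^T = T^T T = I.
theta = np.deg2rad(30.0)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_star = L @ T   # rotated loadings L* = L T

print(np.allclose(T @ T.T, np.eye(2)))           # True: T is orthogonal
print(np.allclose(L @ L.T, L_star @ L_star.T))   # True: L L^T is unchanged
print(np.round((L ** 2).sum(axis=1), 3))         # communalities before rotation
print(np.round((L_star ** 2).sum(axis=1), 3))    # the same values after rotation
```

The individual loadings change under rotation, but $LL^T$ (and therefore each communality and $\Psi$) does not. This is why rotation can be used to improve interpretability without changing the fit.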
Principal component method
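Spectral decomposition: $\Sigma = P\Lambda P^T = (P\Lambda^{1/2})(P\Lambda^{1/2})^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$ holds the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_p$ and the columns of $P = (e_1, \dots, e_p)$ are the corresponding orthonormal eigenvectors.
Setting $L = P\Lambda^{1/2}$ gives
$\Sigma = LL^T$, $L = (\sqrt{\lambda_1}\,e_1, \dots, \sqrt{\lambda_p}\,e_p)$.
By ignoring the (m+1)th, …, pth eigenvalues,
$\Sigma \approx LL^T + \Psi$,
$L = (\sqrt{\lambda_1}\,e_1, \dots, \sqrt{\lambda_m}\,e_m)$, $\Psi = \mathrm{diag}(\Sigma - LL^T)$.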
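The principal component method can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function name `pc_factor_solution` and the 3 x 3 correlation matrix are hypothetical and are not taken from the tutorial data.

```python
import numpy as np

def pc_factor_solution(R, m):
    """Principal component solution with m factors for a covariance or correlation matrix R."""
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder so that lambda_1 >= ... >= lambda_p
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    L = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # columns are sqrt(lambda_j) * e_j
    Psi = np.diag(np.diag(R - L @ L.T))         # Psi = diag(R - L L^T)
    return eigvals, L, Psi

# Hypothetical correlation matrix (illustrative values only).
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 1.0]])

eigvals, L, Psi = pc_factor_solution(R, m=1)
residual = R - (L @ L.T + Psi)

print(np.round(eigvals, 3))    # eigenvalues, largest first
print(np.round(L, 3))          # loadings (the sign of each column is arbitrary)
print(np.round(residual, 3))   # zero diagonal; off-diagonals show the approximation error
```

By construction the diagonal of the residual matrix is zero, so the quality of the m-factor approximation is judged from the off-diagonal residuals.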
Total variances
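Total variance: $\sum_{i=1}^p \sigma_{ii} = \mathrm{tr}(\Sigma) = \mathrm{tr}(P\Lambda P^T) = \mathrm{tr}(\Lambda P^TP) = \mathrm{tr}(\Lambda) = \sum_{i=1}^p \lambda_i$, where $\Sigma = P\Lambda P^T$ is the spectral decomposition of $\Sigma$.
For a correlation matrix: $\mathrm{tr}(\rho) = \sum_{i=1}^p 1 = p$.
Variance explained by the jth factor: $l_{1j}^2 + \cdots + l_{pj}^2 = \sum_{i=1}^p l_{ij}^2 = (\sqrt{\lambda_j}\,e_j)^T(\sqrt{\lambda_j}\,e_j) = \lambda_j e_j^Te_j = \lambda_j$, where $e_j$ is the unit eigenvector for $\lambda_j$.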
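$s_{ij} = \mathrm{Cov}(X_i, X_j) = \frac{1}{n-1}\sum_{k=1}^n (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$
$r_{ij} = \mathrm{Corr}(X_i, X_j) = \frac{s_{ij}}{s_i s_j}$
where $s_i = \sqrt{s_{ii}}$ is the standard deviation for $X_i$.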
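The relations between the sample covariance matrix, the correlation matrix and the total variance can be illustrated as follows. This is a sketch on synthetic data generated only for the example. It assumes the n - 1 divisor for the sample covariance, which is also the default of `np.cov`.

```python
import numpy as np

# Synthetic data: n = 50 observations on p = 4 variables (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X[:, 1] += X[:, 0]                      # induce some correlation between X1 and X2
n, p = X.shape

# Sample covariance with divisor n - 1 (assumed convention).
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (n - 1)

# Correlation r_ij = s_ij / (s_i s_j), with s_i = sqrt(s_ii).
s = np.sqrt(np.diag(S))
R = S / np.outer(s, s)

# For a correlation matrix the total variance is tr(R) = p = sum of eigenvalues.
eigvals = np.linalg.eigvalsh(R)[::-1]   # largest first
prop = eigvals / p                      # proportion of total variance per factor
cum = np.cumsum(prop)                   # cumulative proportion

print(np.round(np.trace(R), 6), np.round(eigvals.sum(), 6))
print(np.round(prop, 3))
print(np.round(cum, 3))
```

The cumulative proportions are the quantities used when choosing the number of factors m by a "% of total variance explained" rule.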
Exercises
1. Do not use computer for this question. The covariance matrix for 3 random variables
$Z_1$, $Z_2$ and $Z_3$ is given as
$\Sigma = \begin{bmatrix} 1 & 0.63 & 0.45 \\ 0.63 & 1 & 0.35 \\ 0.45 & 0.35 & 1 \end{bmatrix}$.
a. Show that the covariance matrix can be generated by the 1-factor model:
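$Z_1 = 0.9F_1 + \varepsilon_1$,
$Z_2 = 0.7F_1 + \varepsilon_2$,
$Z_3 = 0.5F_1 + \varepsilon_3$,
where $\mathrm{Var}(F_1) = 1$, $\mathrm{Cov}(\varepsilon_i, F_1) = 0$ (i = 1, 2, 3) and
$\Psi = \begin{bmatrix} 0.19 & 0 & 0 \\ 0 & 0.51 & 0 \\ 0 & 0 & 0.75 \end{bmatrix}$.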
b. Calculate communalities and interpret these quantities.
c. Calculate $\mathrm{Corr}(Z_i, F_1)$, i = 1, 2, 3. Which variable might carry the greatest weight
in “naming” the common factor? Why?
2. A marketing researcher did a factor analysis using the sample correlation matrix of
rating scores on five attributes of a new product, obtaining the following factor
loadings (three of them have been omitted from the table) for two common factors by
the ML method:
                 Before rotation      After rotation
Attribute        F1       F2          F1       F2
Taste            0.976    -0.139      0.025    ?
Money’s worth    0.150    0.860       0.873    0.005
Flavour          ?        -0.032      0.131    0.972
Good for snack   0.535    0.739       ?        0.405
Nutrition        0.146    0.963       0.974    -0.016
a. Can you construct the common factors as linear combinations of the Attributes
using the loadings as coefficients? Why or why not?
b. Find the three missing values in the above table, given that they are all positive.
(Can you find the values without the prior knowledge of the signs?)
c. Roughly speaking, how much of the variability in Flavour cannot be explained
by the common factors?
d. Before rotation, how much of the total variability of the Attributes can be
explained by the first factor? By the two factors together?
e. Make an attempt to interpret the rotated factors.
f. The determinants of the correlation matrix based on a sample of 150 respondents
and of the correlation matrix reproduced by the factors are respectively
$3.480 \times 10^{-3}$ and $3.560 \times 10^{-3}$. Perform the likelihood ratio test for the adequacy of
the two factor model at the 5% significance level.
3. Conduct an FA using the principal component method to explore the behaviour of internet
usage of the 20 respondents. Data are stored in TUT201.csv.
a. Import the data.
b. Conduct a FA using the PC method on the correlation matrix of internet usage
data (X1 – X5) to obtain an initial solution.
c. Choose the number of common factors m such that the % of total variance
explained is greater than 70%.
d. Based on the chosen model, calculate the communality $h_i^2$ of each variable $X_i$.
e. Rotate the factors using the VARIMAX method. Do the total % of variance
explained by the factors and the communality of each variable remain unchanged?
Also, compare the loading plots before and after factor rotations.
f. Write the factor analysis model. Interpret and label the factors.
g. Write the factor equations, and calculate the factor scores for the 20 customers.
h. Scatter plot the factor scores using the rotated factors $F_1^*$ and $F_2^*$ as axes.
i. Describe the characteristics of users of three Internet service providers using
results in (f)-(h).
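For part (e), the effect of a varimax rotation can be explored with a short NumPy sketch. The assumptions are as follows: the function names `varimax` and `varimax_criterion` are hypothetical helpers written for this illustration, the loadings are made-up values rather than results from TUT201.csv, and the routine is the raw varimax iteration without Kaiser row normalization. Statistical packages may normalize the rows first and can therefore give slightly different loadings.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings (raw varimax criterion)."""
    return (L ** 2).var(axis=0).sum()

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Raw varimax rotation via repeated SVD steps. Returns rotated loadings and T."""
    p, m = L.shape
    T = np.eye(m)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Gradient-type matrix of the varimax criterion at the current rotation.
        G = L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # closest orthogonal matrix to G
        d_new = s.sum()
        if d_old != 0 and d_new / d_old < 1 + tol:
            break                       # no further meaningful improvement
        d_old = d_new
    return L @ T, T

# Hypothetical unrotated loadings: a simple structure turned by 30 degrees.
simple = np.array([[0.9, 0.0],
                   [0.8, 0.1],
                   [0.1, 0.8],
                   [0.0, 0.7],
                   [0.5, 0.5]])
theta = np.deg2rad(30.0)
T0 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
L = simple @ T0

L_rot, T = varimax(L)

print(np.round(L_rot, 3))                      # rotated loadings
print(np.round((L ** 2).sum(axis=1), 3))       # communalities before rotation
print(np.round((L_rot ** 2).sum(axis=1), 3))   # communalities after rotation
print(np.round((L ** 2).sum(axis=0), 3))       # variance explained per factor, before
print(np.round((L_rot ** 2).sum(axis=0), 3))   # variance explained per factor, after
```

Because T is orthogonal, the communalities and the total variance explained are the same before and after rotation. Only the split of that variance between the factors changes.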