Factor Analysis Preview - Rev1
The factors typically are viewed as broad concepts or ideas that may describe an observed phenomenon.

[Diagram: observed behaviors such as sky diving, car racing, and speculative investment point back to broader latent concepts such as Risk Taker and Wealthy.]
Factor Analysis Explained
[Path diagram: latent factors f1 (Risk Taker) and f2 (Wealthy) each point to the observed variables x1 (Sky Diver), x2 (Race Car Driver), and x3 (Speculative Investment) through loadings $l_{11}, l_{12}, l_{21}, l_{22}, l_{31}, l_{32}$; each variable also receives its own error term $\varepsilon_1, \varepsilon_2, \varepsilon_3$.]

Variables: Observed. Latent Factors: Hidden.

Both f1 and f2 have an impact on EACH of the three variables. The factors f1 and f2 can be combined in a linear form to explain the common variance of each variable. This is known as Communality.

However, f1 and f2 together cannot explain all the variance in each of the variables, so we have error terms $\varepsilon_1$, $\varepsilon_2$, and $\varepsilon_3$. These error terms are unique to each of the variables x1, x2, and x3. This is called Uniqueness.
Variance Relationship
Total variance observed in a variable = Communality + Uniqueness
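Written as an equation for the two-factor example above, with the communality written $h_i^2$:

$$\operatorname{Var}(x_i) = \underbrace{l_{i1}^2 + l_{i2}^2}_{\text{Communality } h_i^2} + \underbrace{\psi_i}_{\text{Uniqueness}}$$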
Model:
$$x_1 - u_1 = l_{11} f_1 + l_{12} f_2 + \varepsilon_1$$
$$x_2 - u_2 = l_{21} f_1 + l_{22} f_2 + \varepsilon_2$$
$$x_3 - u_3 = l_{31} f_1 + l_{32} f_2 + \varepsilon_3$$
In the previous model there were 3 variables, and they were reduced to 2 factors.
Essentially, FA acts to reduce dimensionality.
General Model
$$\begin{pmatrix} x_1 - u_1 \\ x_2 - u_2 \\ \vdots \\ x_p - u_p \end{pmatrix} = \begin{pmatrix} l_{11} & l_{12} & \cdots & l_{1m} \\ l_{21} & l_{22} & \cdots & l_{2m} \\ \vdots & & & \vdots \\ l_{p1} & l_{p2} & \cdots & l_{pm} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_m \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_p \end{pmatrix}$$

Observed variables = factor loading matrix $\times$ common factors + specific factors.

Matrix form: $X - u = L f + \varepsilon$
Model Assumptions
Error terms:
E(ε) = 0
Var(ε) = Ψ, a diagonal matrix with entries $\psi_1, \ldots, \psi_p$
Common factors:
E(F) = 0
Var(F) = I
Cov($f_i$, $f_j$) = 0 for $i \neq j$ (in this model it is assumed that the factors are not correlated)
Cov(ε, F) = 0
Variance – Covariance Relationship
$$y_1 = x_1 - u_1 = l_{11} f_1 + l_{12} f_2 + \varepsilon_1$$
Therefore,
$$\operatorname{Var}(y_1) = l_{11}^2 + l_{12}^2 + \psi_1$$
In general,
$$\operatorname{Var}(y_i) = \sigma_i^2 = \sum_{j=1}^{m} l_{ij}^2 + \psi_i$$
$$\operatorname{Cov}(y_i, y_j) = \sigma_{ij} = \sum_{k=1}^{m} l_{ik}\, l_{jk}$$
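These formulas can be sanity-checked by simulation. The sketch below is a minimal illustration with made-up loadings and error variances (not from the lecture data): it draws data from a two-factor model and compares the empirical variance of $y_1$ against $l_{11}^2 + l_{12}^2 + \psi_1$.

# minimal simulation check of Var(y_i) = sum_j l_ij^2 + psi_i
# (illustrative numbers only)
set.seed(1)
n   = 100000
L   = matrix(c(0.5,  0.5,      # loadings of y1 on f1, f2
               0.3,  0.3,      # loadings of y2
               0.5, -0.5),     # loadings of y3
             nrow = 3, byrow = TRUE)
psi = c(0.25, 0.25, 0.25)      # error variances psi_1..psi_3
f   = matrix(rnorm(n * 2), n, 2)                      # factors: mean 0, Var = I
e   = matrix(rnorm(n * 3), n, 3) %*% diag(sqrt(psi))  # specific errors
y   = f %*% t(L) + e           # y = L f + eps
var(y[, 1])                    # empirical, approximately 0.75
sum(L[1, ]^2) + psi[1]         # theoretical: 0.25 + 0.25 + 0.25 = 0.75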
Variance-Covariance
The factor loadings will be extracted from the variance-covariance matrix; that is why the variance-covariance matrix has to be constructed.
We now have the variance-covariance matrix, along with its eigenvalues and eigenvectors. The question is: how do we get the factor loadings?
Example
Assume that there are only 2 variables and 2 factors.
The model equations are
$$y_1 = x_1 - u_1 = l_{11} f_1 + l_{12} f_2 + \varepsilon_1$$
$$y_2 = x_2 - u_2 = l_{21} f_1 + l_{22} f_2 + \varepsilon_2$$
The variance-covariance matrix is given by
$$\Sigma = \begin{pmatrix} l_{11}^2 + l_{12}^2 + \psi_1 & l_{11} l_{21} + l_{12} l_{22} \\ l_{21} l_{11} + l_{22} l_{12} & l_{21}^2 + l_{22}^2 + \psi_2 \end{pmatrix}$$
Simplifying the Variance and Covariance Matrix
$$\Sigma = \begin{pmatrix} l_{11}^2 + l_{12}^2 + \psi_1 & l_{11} l_{21} + l_{12} l_{22} \\ l_{21} l_{11} + l_{22} l_{12} & l_{21}^2 + l_{22}^2 + \psi_2 \end{pmatrix} = \begin{pmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} \\ l_{12} & l_{22} \end{pmatrix} + \begin{pmatrix} \psi_1 & 0 \\ 0 & \psi_2 \end{pmatrix}$$

With the loading matrix L, this is

$$\Sigma = L L^T + \Psi$$
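This factorization is easy to check numerically in R. The loadings and uniquenesses below are hypothetical values chosen just for illustration:

# verify Sigma = L %*% t(L) + Psi for 2 variables and 2 factors
L     = matrix(c(0.7, 0.2,
                 0.5, 0.4), nrow = 2, byrow = TRUE)   # hypothetical loadings
Psi   = diag(c(0.3, 0.2))                             # uniquenesses on the diagonal
Sigma = L %*% t(L) + Psi
Sigma
# Sigma[1,1] = 0.7^2 + 0.2^2 + 0.3 = 0.83   (= l11^2 + l12^2 + psi1)
# Sigma[1,2] = 0.7*0.5 + 0.2*0.4  = 0.43    (= l11*l21 + l12*l22)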
Spectral Decomposition of Variance-Covariance
In PCA: $A v = \lambda v$, where $A$ is a matrix, $v$ is an eigenvector, and $\lambda$ is the corresponding eigenvalue.
Re-written as: $A v v^T = \lambda v v^T$.
It is known that for every eigenvalue there is an associated eigenvector. For the first eigenvalue $\lambda_1$ the eigenvector is $v_1 = [v_{11}\; v_{12} \ldots v_{1p}]^T$, and its term in the decomposition of $A$ is $\lambda_1 v_1 v_1^T$. Summing over all eigen pairs gives the spectral decomposition:

$$A = \sum_{i=1}^{p} \lambda_i v_i v_i^T$$
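This identity can be confirmed with R's eigen(); the matrix below is an arbitrary symmetric example:

# rebuild a symmetric matrix from its eigen pairs: A = sum_i lambda_i v_i v_i^T
A  = matrix(c(4, 1,
              1, 3), nrow = 2)
ev = eigen(A)
recon = matrix(0, 2, 2)
for (i in 1:2) {
  v = ev$vectors[, i]
  recon = recon + ev$values[i] * (v %*% t(v))   # lambda_i * v_i v_i^T
}
all.equal(A, recon)   # TRUE: the eigen pairs reconstruct A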
Spectral Decomposition
For our example there are 3 variables (Finance, Marketing, Policy), so there are 3 eigenvalues and 3 eigenvectors.

Variance-covariance matrix:

           Finance  Marketing  Policy
Finance       9.84      -0.36    0.44
Marketing    -0.36       5.04    3.84
Policy        0.44       3.84    3.04

Eigenvectors ($vectors):

            [,1]        [,2]        [,3]
[1,]  0.99828780 0.008410623 -0.05788553
[2,] -0.04206905 0.790808808 -0.61061578
[3,]  0.04064073 0.612005467  0.78980861
Example: Compute Factor Loading
# take the eigenvalues to a variable
# (ev holds the eigen decomposition computed earlier, e.g. ev = eigen(cov_matrix))
eigenvalues = ev$values
eigenvalues
# attach the eigenvectors to a variable
eigenvector = ev$vectors
eigenvector
# each column of the loading matrix is an eigenvector scaled by the
# square root of its eigenvalue: L = V %*% diag(sqrt(lambda))
factor_loading = eigenvector %*% diag(sqrt(eigenvalues))
factor_loading
Bartlett's Test
install.packages("psych")
library(psych)
data = read.csv("survey.csv")
attach(data)
new_data = data[,-1]
cortest.bartlett(new_data, n = 329, diag = TRUE)
Bartlett's test checks the null hypothesis that the correlation matrix is an identity matrix (i.e., the variables are uncorrelated). If the p-value is less than 0.05, we can reject the null hypothesis and proceed with factor analysis.
KMO
install.packages("psych")
library(psych)
data = read.csv("survey.csv")
attach(data)
new_data = data[,-1]
new_data
KMO(new_data)
The Overall Measure of Sampling Adequacy (MSA) should be between 0.5 and 1.
Model A:
$$Y_1 = 0.5 F_1 + 0.5 F_2 + e_1$$
$$Y_2 = 0.3 F_1 + 0.3 F_2 + e_2$$
$$Y_3 = 0.5 F_1 - 0.5 F_2 + e_3$$

Its variance-covariance matrix:

$$\Sigma = \begin{pmatrix} 0.5 + \sigma_1^2 & 0.3 & 0 \\ 0.3 & 0.18 + \sigma_2^2 & 0 \\ 0 & 0 & 0.5 + \sigma_3^2 \end{pmatrix}$$

Rotating the factor axes by 45° (for example, the loading point $(0.5, -0.5)$ maps to $(0, -\sqrt{2}/2)$) gives Model B:

$$Y_1 = (\sqrt{2}/2)\, F_1 + 0 F_2 + e_1$$
$$Y_2 = 0.3\sqrt{2}\, F_1 + 0 F_2 + e_2$$
$$Y_3 = 0 F_1 - (\sqrt{2}/2)\, F_2 + e_3$$
The variance and covariance have not changed, but the factor loadings have.
In Model A: each variable is associated with 2 factors.
In Model B: each variable is associated with only 1 factor.
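This invariance is easy to verify in R: rotating the Model A loadings by 45 degrees produces the Model B loadings, while $L L^T$ (and hence $\Sigma$) stays the same.

# an orthogonal rotation R of the loadings leaves L %*% t(L) unchanged
LA = matrix(c(0.5,  0.5,
              0.3,  0.3,
              0.5, -0.5), nrow = 3, byrow = TRUE)   # Model A loadings
theta = pi / 4                                      # 45-degree rotation
R  = matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)
LB = LA %*% R   # Model B loadings: rows (sqrt(2)/2, 0), (0.3*sqrt(2), 0), (0, -sqrt(2)/2)
all.equal(LA %*% t(LA), LB %*% t(LB))   # TRUE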
Varimax Rotation
The goal is for each factor to have a few large loadings and many near-zero loadings, so that few variables load on each factor.
V is the sum of the variances of the squared factor loadings for each factor.
Maximizing it causes the large coefficients to become larger and the small coefficients to approach 0.
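A compact sketch of the criterion, using the loadings from the table below; the rotation itself can be done with R's built-in varimax() from the stats package:

# V = sum over factors of the variance of the squared loadings
varimax_V = function(L) sum(apply(L^2, 2, var))
L = matrix(c( 0.2, -0.3,
             -0.1, -0.2,
              0.2,  0.1), nrow = 3, byrow = TRUE)
varimax_V(L)           # varimax seeks the rotation maximizing this value
varimax(L)$loadings    # stats::varimax computes the rotation numerically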
Varimax explained
Factor 1:
Variable     Loading  Loading^2
Variable 1     0.2      0.04
Variable 2    -0.1      0.01
Variable 3     0.2      0.04

Factor 2:
Variable     Loading  Loading^2
Variable 1    -0.3      0.09
Variable 2    -0.2      0.04
Variable 3     0.1      0.01

For each factor, compute the variance of the Loading^2 column; the varimax rotation seeks the orientation that maximizes V, the sum of these variances.
Rotation: How to do it manually
Notice that in the last two rotations the factor loadings have not changed significantly. The sum of the squares of the factor loadings for EACH factor is almost the same in the last 2 rotations, so there is no need to rotate further.

             Factor 1  Factor 2
Variable 1     0.19
Variable 2     0.38     -0.21
Variable 3    -0.51
Variable 4     0.1      -0.1
Maximum Factors Possible
[Diagram: factors f1 and f2 each load on 3 variables through loadings $l_{11}, l_{12}, l_{21}, l_{22}, l_{31}, l_{32}$.]

With 3 variables and 2 factors, $3 \times 2 = 6$ loadings must be determined. Answer: 6.

If there are p variables and m factors, we need to determine $p \cdot m$ factor loadings.
Example: Error Terms
Model: $\Sigma = L L^T + \Psi$

$$\Psi = \begin{pmatrix} \psi_1 & 0 & 0 \\ 0 & \psi_2 & 0 \\ 0 & 0 & \psi_3 \end{pmatrix}$$

The covariance terms mirror each other across the diagonal, and for the error term they are all zero. In the above matrix there are 3 variance terms to be determined.
In general, if there is a p x p variance-covariance matrix for the ERROR term, then p elements must be determined.
Maximum Factors
$$pm + p \le \frac{p(p+1)}{2}$$
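The left side counts the unknown parameters ($pm$ loadings plus $p$ uniquenesses); the right side is the number of distinct elements in the symmetric $p \times p$ matrix $\Sigma$. For example, with $p = 6$ variables:

$$6m + 6 \le \frac{6 \cdot 7}{2} = 21 \;\Rightarrow\; m \le 2.5 \;\Rightarrow\; m \le 2$$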
Goodness of fit test
After fitting the model, its goodness of fit is tested.
H0: The factor model holds.
Ha: The factor model does not hold.
This ends up being a chi-square test where the degrees of freedom are computed by $\{(p-m)^2 - (p+m)\}/2$.
Ideally, we would like the p-value to be as high as possible so that the null hypothesis is not rejected.
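In R, maximum-likelihood factor analysis with factanal() reports exactly this test. A minimal sketch, assuming the new_data data frame from the earlier slides and a 2-factor model:

# factanal() prints "Test of the hypothesis that 2 factors are sufficient"
fit = factanal(new_data, factors = 2)
fit$STATISTIC   # chi-square statistic
fit$dof         # degrees of freedom, {(p - m)^2 - (p + m)}/2
fit$PVAL        # a high p-value means the factor model is not rejected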
What happened!
Johnson and Wichern state that the "vast majority of attempted factor analyses do not yield clear-cut results."
A good place to start is examining the correlation matrix of your data. If there are few or no instances of high correlation, there really is no use in pursuing a factor analysis.
PCA and FA
In PCA, we get components that are outcomes built from linear combinations of the variables; we look at how each variable relates to the principal components.
In FA, we get factors that are thought to be the cause of the observed variables; we look at how the factor loadings correlate with the variables.
PCA is exploratory and makes no model assumptions; FA is based on a statistical model with assumptions.
In FA, orthogonal rotations of the factor loadings give equivalent models. This does not hold in PCA.
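The contrast can be seen side by side in R (again assuming the new_data data frame from the earlier slides):

# PCA: components are linear combinations of the observed variables
pca = prcomp(new_data, scale. = TRUE)
pca$rotation           # weight of each variable in each component

# FA: factors are modeled as underlying causes of the variables
fa = factanal(new_data, factors = 2)
fa$loadings            # loadings relating variables to factors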