
Factor Analysis: Foundations, Methods, and Interpretations
Chapter 3
Assumptions:
1. No outliers: the data are assumed to contain no outliers.
2. Adequate sample size: the number of cases must be greater than the number of factors.
3. No perfect multicollinearity: factor analysis is an interdependency technique, so there must be no perfect multicollinearity between the variables.
4. Homoscedasticity: since factor analysis is a linear function of the measured variables, it does not require homoscedasticity between the variables.
5. Linearity: factor analysis is also based on the linearity assumption. Non-linear variables can be used as well, but only after they have been transformed into linear ones.
6. Interval data: interval-scaled data are assumed. A short sketch for screening some of these assumptions follows this list.
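A minimal screening sketch in Python (pandas/NumPy); the DataFrame `items` standing in for the survey responses is a hypothetical placeholder:

```python
import numpy as np
import pandas as pd

# Hypothetical item battery: rows = respondents, columns = survey items
rng = np.random.default_rng(1)
items = pd.DataFrame(rng.integers(1, 8, (30, 6)),
                     columns=[f"item{i}" for i in range(1, 7)])

# 1. Outliers: flag respondents with |z| > 3 on any item
z = (items - items.mean()) / items.std(ddof=1)
print("Potential outliers:", items.index[(z.abs() > 3).any(axis=1)].tolist())

# 2. Adequate sample size: more cases than expected factors
n_factors_expected = 2
assert len(items) > n_factors_expected

# 3. No perfect multicollinearity: the correlation matrix must be invertible
R = items.corr().to_numpy()
print("det(R) =", np.linalg.det(R))  # ~0 signals (near-)perfect multicollinearity
```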
Frequently, empirical studies rely on a wide variety of variables, so-called item batteries, to describe a certain state of affairs. An example of such a collection of variables is the study of preferred toothpaste attributes by Malhotra (2010, p. 639). Thirty people were asked the questions shown in the following figure (toothpaste attributes):
Assuming these statements accurately describe the original object, the preferred toothpaste attributes, we can reduce their complexity by tracing them back to a few underlying dimensions or factors. Empirical researchers use two basic approaches for doing so:
1. The first method adds up the individual item values to produce a total index for each person. The statement scores, which in our example range from one to seven, are simply summed for each person. One problem with this method arises when questions are formulated negatively, as with question 5; such items must be reverse-coded before summing. Another problem is that this method assumes the unidimensionality of the object being investigated or of the item battery being applied. In practice, this is almost never the case. In our example, the first, third, and fifth statements describe health benefits of toothpaste, while the others describe social benefits. Hence, this method should only be used for item batteries or scales that have already been checked for unidimensionality.
2. A second method of data reduction, known as factor analysis, is almost always used to carry out this check. Factor analysis uses the correlations among individual items to reduce them to a small number of independent dimensions or factors, without presuming the unidimensionality of the scale. The correlation matrix of the items indicates which statements exhibit similar response patterns; these items are then bundled into factors. (Both steps, the sum index with reverse coding and the inspection of the correlation matrix, are sketched in code below.)
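A minimal Python sketch of both approaches, assuming a hypothetical DataFrame `items` with the six toothpaste statements scored from 1 to 7 (the column names are illustrative, not Malhotra's originals):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
cols = ["prevent_cavities", "shiny_teeth", "strengthen_gums",
        "freshen_breath", "not_prevent_decay", "attractive_teeth"]
items = pd.DataFrame(rng.integers(1, 8, (30, 6)), columns=cols)

# Approach 1: total index per person, after reverse-coding the negatively
# formulated statement (question 5 on a 1..7 scale: x -> 8 - x)
items_rc = items.copy()
items_rc["not_prevent_decay"] = 8 - items_rc["not_prevent_decay"]
total_index = items_rc.sum(axis=1)

# Approach 2: inspect the correlation matrix for bundles of items
# that show similar response patterns
print(items.corr().round(2))
```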
The following figure shows that the health attributes (preventing cavities, strengthening gums, and not preventing tooth decay) are highly correlated. The same is true for the social attributes (whitening teeth, freshening breath, and making teeth attractive). Hence, the preferred toothpaste attributes should be represented by two factors, not one.
If those surveyed do not show similar patterns in their responses, the data are too heterogeneous and too weakly correlated for factor analysis to yield usable results. Backhaus et al. (2016, p. 395) give five criteria for determining whether a correlation matrix is suitable for factor analysis:
1. Most of the correlation coefficients in the matrix must be significant.
2. The inverse of the correlation matrix should be close to a diagonal matrix, i.e. as many of its off-diagonal elements as possible should be close to zero.
3. The Bartlett test (sphericity test) verifies whether the variables correlate at all. It assumes normally distributed item values and a χ2 distribution of the test statistic, and it checks whether the deviations of the correlation matrix from an identity matrix are random. A clear disadvantage of this test is that it requires a normal distribution; for any other form of distribution, the Bartlett test should not be used.
4. A factor analysis should not be performed when more than 25% of the elements below the diagonal of the anti-image covariance matrix (AIC) have values larger than 0.09.
5. The Kaiser-Meyer-Olkin measure (KMO measure) is generally considered by researchers to be the best method for testing the suitability of the correlation matrix for factor analysis, and it is recommended before every factor analysis. It expresses a measure of sampling adequacy (MSA) between zero and one. Calculated by all standard statistics software packages, the MSA can be used to test the sampling adequacy of the entire correlation matrix as well as of each individual item. The KMO/MSA should be greater than or equal to 0.5. The table below suggests how the KMO might be interpreted.
Measure of sampling adequacy (MSA) score intervals (interpretation after Kaiser):
  MSA >= 0.9        marvelous
  0.8 <= MSA < 0.9  meritorious
  0.7 <= MSA < 0.8  middling
  0.6 <= MSA < 0.7  mediocre
  0.5 <= MSA < 0.6  miserable
  MSA < 0.5         unacceptable
Correlation matrix check


If the correlation matrix turns out to be suitable for factor analysis, we can assume that regular patterns exist between responses and questions. This turns out to be the case for our toothpaste attribute survey, which has an acceptable MSA (0.660) and a significant result for the Bartlett test (p < 0.05).
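As a rough sketch of how the Bartlett test (criterion 3) and the KMO/MSA (criterion 5) can be computed by hand, assuming the item correlation matrix `R` (as a NumPy array) and the number of respondents `n` are available from the survey data:

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(R, n):
    """Bartlett's sphericity test: H0 says R is an identity matrix."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, stats.chi2.sf(chi2, df)    # test statistic and p-value

def kmo(R):
    """Overall KMO/MSA: correlations vs. partial correlations."""
    inv_R = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
    partial = -inv_R / d                    # partial correlation matrix
    off = ~np.eye(R.shape[0], dtype=bool)   # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)
```

Applied to the toothpaste correlation matrix, these functions should return values consistent with the MSA of 0.660 and the significant Bartlett result reported above.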
After checking the correlation matrix, we must determine the communalities of the items. The communalities depend on the method of factor extraction. There are many types of factor analysis; two are used most frequently:
Principal component analysis (PCA) assumes that the individual variables can be described as a linear combination of the factors, i.e. that the factors represent the variables' variances in their entirety. If the entire variance of a variable is determined by the factors, a communality of 100% (or 1) results. This desirable outcome seldom occurs in practice, as item batteries can rarely be reduced to a few factors that represent the full variance of all items. With principal component analysis, a communality of less than 1 therefore indicates a loss of information in the representation.
Principal factor analysis (PFA), by contrast, assumes that each variable's variance can be separated into two parts: one part determined by the variance shared jointly by all variables in the analysis, the other determined by the variance specific to the variable in question. The total variance of an observed variable therefore cannot be fully accounted for by its common factors. With principal factor analysis, the factors explain only the first variance component, the share of variance common to all variables, which means that the communality must be less than 1.
• The difference in assumptions implied by the two extraction methods can be summarized as follows: in principal component analysis, the priority is placed on representing each item exactly; in principal factor analysis, the hypothetical dimensions behind the items are determined, so that the correlations of the individual items can be interpreted.
• This difference serves as the theoretical starting point for many empirical studies. For instance, the point of our toothpaste example is to identify the hypothetical factors behind the survey statements. A number of authors therefore see an advantage of principal factor analysis over principal component analysis (see Russel 2002; Widaman 1993). For this purpose, one should use the principal factor analysis technique; a sketch of this extraction follows the list below.
• To check the quality of the item representations by the factors, we use the factor loading matrix. A factor loading indicates the extent to which an item is determined by a factor. The sum of all squared factor loadings of a factor is called its eigenvalue.
• Eigenvalues allow us to weight the factors based on the empirical data. When we divide the eigenvalue of an individual factor by the total variance (the sum of the eigenvalues of all factors, which for standardized items equals the number of items), we get a percentage value reflecting the factor's importance across all surveyed persons.
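Since the aim here is to interpret the hypothetical dimensions behind the items, here is a minimal NumPy sketch of iterative principal axis factoring, assuming the correlation matrix `R` from the earlier sketches; unlike PCA, the diagonal of R is replaced by communality estimates rather than fixed at 1:

```python
import numpy as np

def principal_axis(R, k, n_iter=50):
    """Iterative principal (axis) factor extraction of k factors."""
    Rh = R.copy()
    # initial communality estimates: squared multiple correlations
    smc = 1 - 1 / np.diag(np.linalg.inv(R))
    np.fill_diagonal(Rh, smc)
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(Rh)
        idx = np.argsort(vals)[::-1][:k]
        L = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))
        np.fill_diagonal(Rh, (L ** 2).sum(axis=1))  # update communalities
    return L

L = principal_axis(R, k=2)   # loading matrix: items x factors
```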
Say we extract two factors from the toothpaste example, one with an eigenvalue of 2.57 and the other with an eigenvalue of 1.87. With six items (a total variance of 6), this results in an importance of 42.84% for factor one and 31.13% for factor two.
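Reusing the loading matrix `L` from the sketch above, the eigenvalues, communalities, and importance shares follow directly from their definitions:

```python
import numpy as np

eigvals = (L ** 2).sum(axis=0)        # eigenvalue: squared loadings per factor
communalities = (L ** 2).sum(axis=1)  # communality: squared loadings per item
importance = eigvals / L.shape[0]     # share of total variance (= no. of items)
print(importance.round(4))            # e.g. roughly [0.4284, 0.3113] as above
```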

A factor's eigenvalue strongly depends on the selection of items. Multiplying the factor loading matrix by its transpose reproduces the variables' correlation matrix. If there are no large deviations (> 0.05) between the reproduced and the original correlation matrix, the reproduction, that is, the representability of the original data, is considered very good. The table shows the reproduced correlation matrix and the residuals from the original matrix for the toothpaste attribute survey. There is only one deviation above the level of difference (0.05), and it is minor (0.051). This means that both factors are highly representative of the original data.
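The reproduction check from this paragraph, again as a sketch based on `R` and `L` from the earlier snippets:

```python
import numpy as np

R_reproduced = L @ L.T                  # reproduced correlation matrix
off = ~np.eye(R.shape[0], dtype=bool)   # ignore the diagonal
residuals = np.abs((R - R_reproduced)[off])
print("Residuals above 0.05:", int((residuals > 0.05).sum() // 2))
```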
Though the number of factors can be set by the researcher (which is why factor analysis is often accused of being susceptible to manipulation), some rules have crystallized over time. The most important of these is the Kaiser criterion, which retains all factors with an eigenvalue greater than one. Since eigenvalues less than 1 describe factors that explain less variance than a single item does, this criterion is well justified, hence its widespread acceptance. For instance, in our toothpaste example the extraction of a third factor would add less explanatory value than any single one of the six items. Hence, a two-factor solution is preferable in this case.

Eigenvalues and explained total variance for toothpaste attributes
The Kaiser criterion is often accompanied by a scree plot, in which the eigenvalues are plotted against the number of factors in a coordinate system, in order of decreasing eigenvalues and increasing number of factors. When the curve forms an elbow towards a less steep decline, all further factors after the one starting the elbow are omitted. The plot in the following figure suggests a three-factor solution.
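Both rules are straightforward to apply in code; a sketch with NumPy and Matplotlib, again assuming the correlation matrix `R`:

```python
import numpy as np
import matplotlib.pyplot as plt

eigvals_all = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first

# Kaiser criterion: retain factors with an eigenvalue greater than 1
print("Factors retained:", int((eigvals_all > 1).sum()))

# Scree plot: look for the elbow in the eigenvalue curve
xs = np.arange(1, len(eigvals_all) + 1)
plt.plot(xs, eigvals_all, marker="o")
plt.axhline(1.0, linestyle="--")    # Kaiser cut-off
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot of the toothpaste attributes")
plt.show()
```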

Scree plot of the desirable toothpaste attributes

After we set the number of factors, we interpret the results based on the individual items. Each item whose factor loading is greater than 0.5 is assigned to a factor. The following figure shows the factor loadings for the attributes from our toothpaste example. Each variable is assigned to exactly one factor. The variables prevent cavities, strengthen gums, and not prevent tooth decay load on factor 1, which describes the toothpaste's health-related attributes.
Unrotated and rotated factor matrix for toothpaste attributes
With positive factor loadings, high factor values accompany high item values. With negative factor loadings, low item values lead to high factor values and vice versa. This explains the negative sign in front of the factor loading for the variable not prevent tooth decay: people who assigned high values to prevent cavities and strengthen gums assigned low values to not prevent tooth decay. That is to say, those surveyed strongly prefer a toothpaste with health-related attributes.

The second factor describes the social benefits of toothpaste: whiten teeth, freshen breath, and make teeth attractive. Here too, the items correlate strongly, allowing the surveyed responses to be expressed by the second factor. Sometimes an individual item possesses factor loadings greater than 0.5 on several factors at the same time, resulting in a multiple loading. In these cases, the item must be taken into account for all of those factors. If an item possesses factor loadings of less than 0.5 on all factors, we must either reconsider the number of factors or assign the item to the factor with the highest loading.
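A small sketch of this assignment rule, assuming `loadings` is a pandas DataFrame with items as rows and factors as columns (names illustrative):

```python
import pandas as pd

def assign_items(loadings: pd.DataFrame, cutoff: float = 0.5) -> pd.Series:
    """Assign each item to all factors whose |loading| exceeds the cutoff;
    fall back to the highest-loading factor if none does."""
    assignments = {}
    for item, row in loadings.iterrows():
        hits = row.index[row.abs() > cutoff].tolist()
        assignments[item] = hits if hits else [row.abs().idxmax()]
    return pd.Series(assignments)
```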
The factor matrix is normally rotated to facilitate interpretation. In most cases, it is rotated orthogonally. This is known as a varimax rotation, and it preserves the statistical independence of the factors. The following figure shows the effect of the varimax rotation on the values of a factor matrix.
The item freshen breath has an unrotated factor loading of -0.246 on factor one (health attributes) and of 0.734 on factor two (social attributes). The varimax method rotates the entire coordinate system from its original position but preserves the relationships between the individual variables. The rotation calibrates the coordinate system anew: factor one now has a loading of -0.090 and factor two a loading of 0.769 for the item freshen breath. The varimax rotation reduces the loading on factor one and increases the loading on factor two, making the factor assignment of items more obvious.
This is the basic idea of the varimax method: the coordinate system is rotated until the sum of the variances of the squared loadings is maximized. In most cases, this simplifies the interpretation.
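A compact NumPy sketch of the classic iterative varimax algorithm, where `L` is the unrotated loading matrix from the extraction sketch above (gamma = 1 gives the varimax criterion):

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal (varimax) rotation of a loading matrix L (items x factors)."""
    p, k = L.shape
    R = np.eye(k)                           # accumulated rotation matrix
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R                          # currently rotated loadings
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0)))
        )
        R = u @ vt
        d_old, d = d, s.sum()
        if d_old != 0 and d / d_old < 1 + tol:   # converged
            break
    return L @ R

L_rotated = varimax(L)
```

Applied to the unrotated toothpaste loadings, such a rotation would move the freshen breath loading on factor one towards zero, as described above.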
After setting the number of factors and interpreting the results, we must explain how the factor scores differ among the surveyed individuals. Factor scores generated by regression analysis provide some indication. The factor score of factor i is calculated as a linear combination of the n original z-scores (zj) of a surveyed person, weighted with the respective values (αij) from the factor score coefficient matrix:

Fi = αi1·z1 + αi2·z2 + αi3·z3 + … + αin·zn
For each factor, every person receives a standardized value that assesses the scores given by that individual relative to the average scores given by all individuals. When the standardized factor score is positive, the individual's scores are greater than the average of all responses, and vice versa. In the toothpaste dataset, person #3 has a value of 1.14 for factor 1, F1, and a value of -0.84 for factor 2, F2. This indicates a higher-than-average preference for health benefits and a lower-than-average preference for social benefits.
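A final sketch of the regression-method factor scores (score coefficient matrix B = R⁻¹L), assuming `Z` (respondents by items) holds the z-standardized responses and reusing `R` and the rotated loadings `L_rotated` from the sketches above:

```python
import numpy as np

# Factor score coefficient matrix (regression method): B = R^-1 @ L
B = np.linalg.solve(R, L_rotated)

# Standardized factor scores for every respondent (one row per person)
F = Z @ B
print("Scores of person #3:", F[2].round(2))   # e.g. roughly [1.14, -0.84]
```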
Steps with SPSS

Output
