
5 Exploratory Factor Analysis

5.1 Introduction
In many areas of psychology, and other disciplines in the behavioural sciences,
often it is not possible to measure directly the concepts of primary interest.
Two obvious examples are intelligence and social class. In such cases, the re-
searcher is forced to examine the concepts indirectly by collecting information
on variables that can be measured or observed directly and can also realis-
tically be assumed to be indicators, in some sense, of the concepts of real
interest. The psychologist who is interested in an individual’s “intelligence”,
for example, may record examination scores in a variety of different subjects in
the expectation that these scores are dependent in some way on what is widely
regarded as “intelligence” but are also subject to random errors. And a soci-
ologist, say, concerned with people’s “social class” might pose questions about
a person’s occupation, educational background, home ownership, etc., on the
assumption that these do reflect the concept he or she is really interested in.
Both “intelligence” and “social class” are what are generally referred to as
latent variables–i.e., concepts that cannot be measured directly but can be as-
sumed to relate to a number of measurable or manifest variables. The method
of analysis most generally used to help uncover the relationships between the
assumed latent variables and the manifest variables is factor analysis. The
model on which the method is based is essentially that of multiple regression,
except now the manifest variables are regressed on the unobservable latent
variables (often referred to in this context as common factors), so that direct
estimation of the corresponding regression coefficients (factor loadings) is not
possible.
A point to be made at the outset is that factor analysis comes in two dis-
tinct varieties. The first is exploratory factor analysis, which is used to inves-
tigate the relationship between manifest variables and factors without making
any assumptions about which manifest variables are related to which factors.
The second is confirmatory factor analysis which is used to test whether a
specific factor model postulated a priori provides an adequate fit for the
covariances or correlations between the manifest variables. In this chapter, we
shall consider only exploratory factor analysis. Confirmatory factor analysis
will be the subject of Chapter 7.
Exploratory factor analysis is often said to have been introduced by Spear-
man (1904), but this is only partially true because Spearman proposed only
the one-factor model as described in the next section. Fascinating accounts of
the history of factor analysis are given in Thorndike (2005) and Bartholomew
(2005).

5.2 A simple example of a factor analysis model

To set the scene for the k-factor analysis model to be described in the next
section, we shall in this section look at a very simple example in which there
is only a single factor.
Spearman considered a sample of children’s examination marks in three
subjects, Classics (x1 ), French (x2 ), and English (x3 ), from which he calculated
the following correlation matrix for a sample of children:

R =
    Classics  1.00
    French    0.83  1.00
    English   0.78  0.67  1.00

If we assume a single factor, then the single-factor model is specified as follows:

x1 = λ1 f + u1 ,
x2 = λ2 f + u2 ,
x3 = λ3 f + u3 .

We see that the model essentially involves the simple linear regression of each
observed variable on the single common factor. In this example, the under-
lying latent variable or common factor, f , might possibly be equated with
intelligence or general intellectual ability. The terms λ1, λ2, and λ3, which are
essentially regression coefficients, are known in this context as factor load-
ings, and the terms u1, u2, and u3 represent random disturbance terms and
will have small variances if their associated observed variable is closely related
to the underlying latent variable. The variation in ui actually consists of two
parts, the extent to which an individual’s ability at Classics, say, differs from
his or her general ability and the extent to which the examination in Classics
is only an approximate measure of his or her ability in the subject. In practice
no attempt is made to disentangle these two parts.
We shall return to this simple example later when we consider how to
estimate the parameters in the factor analysis model. Before this, however, we
need to describe the factor analysis model itself in more detail. The description
follows in the next section.

5.3 The k-factor analysis model


The basis of factor analysis is a regression model linking the manifest vari-
ables to a set of unobserved (and unobservable) latent variables. In essence the
model assumes that the observed relationships between the manifest variables
(as measured by their covariances or correlations) are a result of the relation-
ships of these variables to the latent variables. (Since it is the covariances or
correlations of the manifest variables that are central to factor analysis, we
can, in the description of the mathematics of the method given below, assume
that the manifest variables all have zero mean.)
To begin, we assume that we have a set of observed or manifest variables,
x⊤ = (x1, x2, . . . , xq), assumed to be linked to k unobserved latent variables
or common factors f1, f2, . . . , fk, where k < q, by a regression model of the
form

x1 = λ11 f1 + λ12 f2 + · · · + λ1k fk + u1,
x2 = λ21 f1 + λ22 f2 + · · · + λ2k fk + u2,
⋮
xq = λq1 f1 + λq2 f2 + · · · + λqk fk + uq.

The coefficients λij are essentially the regression coefficients of the x-variables
on the common factors, but in the context of factor analysis these regression coeffi-
cients are known as the factor loadings and show how each observed variable,
xi , depends on the common factors. The factor loadings are used in the inter-
pretation of the factors; i.e., larger values relate a factor to the corresponding
observed variables and from these we can often, but not always, infer a mean-
ingful description of each factor (we will give examples later).
The regression equations above may be written more concisely as

x = Λf + u,

where Λ is the q × k matrix of factor loadings with (i, j)th element λij,
f⊤ = (f1, . . . , fk), and u⊤ = (u1, . . . , uq).
We assume that the random disturbance terms u1 , . . . , uq are uncorrelated
with each other and with the factors f1 , . . . , fk . (The elements of u are spe-
cific to each xi and hence are generally better known in this context as specific
variates.) The two assumptions imply that, given the values of the common
factors, the manifest variables are independent; that is, the correlations of
the observed variables arise from their relationships with the common factors.
Because the factors are unobserved, we can fix their locations and scales arbi-
trarily and we shall assume they occur in standardised form with mean zero
and standard deviation one. We will also assume, initially at least, that the
factors are uncorrelated with one another, in which case the factor loadings
are the correlations of the manifest variables and the factors. With these ad-
ditional assumptions about the factors, the factor analysis model implies that
the variance of variable xi, σi², is given by

σi² = λi1² + λi2² + · · · + λik² + ψi,

where ψi is the variance of ui. Consequently, we see that the factor analysis
model implies that the variance of each observed variable can be split into
two parts: the first, hi², given by hi² = λi1² + · · · + λik², is known as the communality
of the variable and represents the variance shared with the other variables
via the common factors. The second part, ψi , is called the specific or unique
variance and relates to the variability in xi not shared with other variables. In
addition, the factor model leads to the following expression for the covariance
of variables xi and xj :
σij = λi1 λj1 + λi2 λj2 + · · · + λik λjk.

We see that the covariances are not dependent on the specific variates in any
way; it is the common factors only that aim to account for the relationships
between the manifest variables.
The results above show that the k-factor analysis model implies that the
population covariance matrix, Σ, of the observed variables has the form

Σ = ΛΛ⊤ + Ψ,

where
Ψ = diag(ψi).
The converse also holds: if Σ can be decomposed into the form given above,
then the k-factor model holds for x. In practice, Σ will be estimated by the
sample covariance matrix S and we will need to obtain estimates of Λ and Ψ
so that the observed covariance matrix takes the form required by the model
(see later in the chapter for an account of estimation methods). We will also
need to determine the value of k, the number of factors, so that the model
provides an adequate fit for S.
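
A quick numerical check of this decomposition can be made by simulating a large sample from a k = 2 factor model and comparing the sample covariance matrix of the simulated data with ΛΛ⊤ + Ψ. This is only a sketch; the loadings below are arbitrary illustrative values, not taken from any example in this chapter:

R> set.seed(1)
R> Lambda <- matrix(c(0.9, 0.8, 0.7, 0, 0, 0,
+                     0, 0, 0, 0.8, 0.7, 0.6), ncol = 2)
R> Psi <- diag(1 - rowSums(Lambda^2))   # unit-variance manifest variables
R> f <- matrix(rnorm(2 * 100000), ncol = 2)              # standardised factors
R> u <- matrix(rnorm(6 * 100000), ncol = 6) %*% sqrt(Psi) # specific variates
R> x <- f %*% t(Lambda) + u
R> max(abs(cov(x) - (tcrossprod(Lambda) + Psi)))

The maximum absolute difference is close to zero, of the order of the sampling error, as the model implies.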

5.4 Scale invariance of the k-factor model

Before describing both estimation for the k-factor analysis model and how to
determine the appropriate value of k, we will consider how rescaling the x
variables affects the factor analysis model. Rescaling the x variables is equiv-
alent to letting y = Cx, where C = diag(ci ) and the ci , i = 1, . . . , q are the
scaling values. If the k-factor model holds for x with Λ = Λx and Ψ = Ψx,
then

y = CΛx f + Cu
and the covariance matrix of y implied by the factor analysis model for x is

Var(y) = CΣC = CΛx Λx⊤ C + CΨx C.

So we see that the k-factor model also holds for y with factor loading matrix
Λy = CΛx and specific variances Ψy = CΨx C, whose diagonal elements are ci² ψi. So the factor loading
matrix for the scaled variables y is found by scaling the factor loading matrix
of the original variables by multiplying the ith row of Λx by ci and similarly
for the specific variances. Thus factor analysis is essentially unaffected by the
rescaling of the variables. In particular, if the rescaling factors are such that
ci = 1/si , where si is the standard deviation of the xi , then the rescaling is
equivalent to applying the factor analysis model to the correlation matrix of
the x variables and the factor loadings and specific variances that result can
be found simply by scaling the corresponding loadings and variances obtained
from the covariance matrix. Consequently, the factor analysis model can be
applied to either the covariance matrix or the correlation matrix because the
results are essentially equivalent. (Note that this is not the same as when
using principal components analysis, as pointed out in Chapter 3, and we will
return to this point later in the chapter.)
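
The identity underlying this argument is easy to verify numerically; in the sketch below, C, Λx, and Ψx are arbitrary illustrative values for q = 3 variables and a single factor:

R> C <- diag(c(2, 0.5, 10))
R> Lambda_x <- matrix(c(0.9, 0.8, 0.7), ncol = 1)
R> Psi_x <- diag(c(0.19, 0.36, 0.51))
R> Sigma <- tcrossprod(Lambda_x) + Psi_x
R> all.equal(C %*% Sigma %*% C,
+            tcrossprod(C %*% Lambda_x) + C %*% Psi_x %*% C)
[1] TRUE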

5.5 Estimating the parameters in the k-factor analysis model

To apply the factor analysis model outlined in the previous section to a sample
of multivariate observations, we need to estimate the parameters of the model
in some way. These parameters are the factor loadings and specific variances,
and so the estimation problem in factor analysis is essentially that of finding Λ̂
(the estimated factor loading matrix) and Ψ̂ (the diagonal matrix containing
the estimated specific variances), which, assuming the factor model outlined in
Section 5.3, reproduce as accurately as possible the sample covariance matrix,
S. This implies
S ≈ Λ̂Λ̂⊤ + Ψ̂.
Given an estimate of the factor loading matrix, Λ̂, it is clearly sensible to
estimate the specific variances as
ψ̂i = si² − (λ̂i1² + λ̂i2² + · · · + λ̂ik²),   i = 1, . . . , q,

so that the diagonal terms in S are estimated exactly.


Before looking at methods of estimation used in practice, we shall for the
moment return to the simple single-factor model considered in Section 5.2
because in this case estimation of the factor loadings and specific variances is
very simple, the reason being that in this case the number of parameters in
the model, 6 (three factor loadings and three specific variances), is equal to
the number of independent elements in R (the three correlations and the three
diagonal standardised variances), and so by equating elements of the observed
correlation matrix to the corresponding values predicted by the single-factor
model, we will be able to find estimates of λ1 , λ2 , λ3 , ψ1 , ψ2 , and ψ3 such that
the model fits exactly. The six equations derived from the matrix equality
implied by the factor analysis model,

R = λλ⊤ + Ψ,   where λ⊤ = (λ1, λ2, λ3) and Ψ = diag(ψ1, ψ2, ψ3),
are

λ̂1 λ̂2 = 0.83,
λ̂1 λ̂3 = 0.78,
λ̂2 λ̂3 = 0.67,
ψ̂1 = 1.0 − λ̂1²,
ψ̂2 = 1.0 − λ̂2²,
ψ̂3 = 1.0 − λ̂3².

The solutions of these equations are

λ̂1 = 0.99,  λ̂2 = 0.84,  λ̂3 = 0.79,
ψ̂1 = 0.02,  ψ̂2 = 0.30,  ψ̂3 = 0.38.
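
These values are easily reproduced in R: multiplying the first two equations and dividing by the third gives λ1² = (0.83 × 0.78)/0.67, and the remaining parameters follow. Small differences from the values quoted above are due to rounding:

R> r12 <- 0.83; r13 <- 0.78; r23 <- 0.67
R> lambda <- c(sqrt(r12 * r13 / r23), NA, NA)
R> lambda[2] <- r12 / lambda[1]
R> lambda[3] <- r13 / lambda[1]
R> round(lambda, 2)        # loadings, approximately 0.98 0.84 0.79
R> round(1 - lambda^2, 2)  # specific variances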

Suppose now that the observed correlations had been

R =
    Classics  1.00
    French    0.84  1.00
    English   0.60  0.35  1.00

In this case, the solution for the parameters of a single-factor model is

λ̂1 = 1.2,  λ̂2 = 0.7,  λ̂3 = 0.5,
ψ̂1 = −0.44,  ψ̂2 = 0.51,  ψ̂3 = 0.75.

Clearly this solution is unacceptable because of the negative estimate for the
first specific variance.
In the simple example considered above, the factor analysis model does
not give a useful description of the data because the number of parameters in
the model equals the number of independent elements in the correlation ma-
trix. In practice, where the k-factor model has fewer parameters than there
are independent elements of the covariance or correlation matrix (see Sec-
tion 5.6), the fitted model represents a genuinely parsimonious description of
the data and methods of estimation are needed that try to make the covari-
ance matrix predicted by the factor model as close as possible in some sense to
the observed covariance matrix of the manifest variables. There are two main
methods of estimation leading to what are known as principal factor anal-
ysis and maximum likelihood factor analysis, both of which are now briefly
described.

5.5.1 Principal factor analysis


Principal factor analysis is an eigenvalue and eigenvector technique similar in
many respects to principal components analysis (see Chapter 3) but operating
not directly on S (or R) but on what is known as the reduced covariance
matrix, S∗, defined as
S∗ = S − Ψ̂ ,
where Ψ̂ is a diagonal matrix containing estimates of the ψi. The diagonal
elements of S are replaced in S∗ by the estimated communalities,
λ̂i1² + · · · + λ̂ik², the parts of the variance of each observed variable that can be
explained by the common factors. Unlike principal components analysis, factor
analysis does not try to account for all the observed variance, only that shared
through the common factors. Of more concern in factor analysis is accounting
for the covariances or correlations between the manifest variables.
To calculate S∗ (or with R replacing S, R∗ ) we need values for the com-
munalities. Clearly we cannot calculate them on the basis of factor loadings
because these loadings still have to be estimated. To get around this seem-
ingly “chicken and egg” situation, we need to find a sensible way of finding
initial values for the communalities that does not depend on knowing the fac-
tor loadings. When the factor analysis is based on the correlation matrix of
the manifest variables, two frequently used methods are:
- Take the communality of a variable xi as the square of the multiple correlation coefficient of xi with the other observed variables.
- Take the communality of xi as the largest of the absolute values of the correlation coefficients between xi and one of the other variables.
Each of these possibilities will lead to higher values for the initial communality
when xi is highly correlated with at least some of the other manifest variables,
which is essentially what is required.
Given the initial communality values, a principal components analysis is
performed on S∗ and the first k eigenvectors used to provide the estimates of
the loadings in the k-factor model. The estimation process can stop here or
the loadings obtained at this stage can provide revised communality estimates
calculated as λ̂i1² + · · · + λ̂ik², where the λ̂ij are the loadings estimated in the pre-
vious step. The procedure is then repeated until some convergence criterion
is satisfied. Difficulties can sometimes arise with this iterative approach if at
any time a communality estimate exceeds the variance of the corresponding
manifest variable, resulting in a negative estimate of the variable’s specific
variance. Such a result is known as a Heywood case (see Heywood 1931) and
is clearly unacceptable since we cannot have a negative specific variance.
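
The non-iterated version of this procedure is short enough to write out directly. The sketch below is for illustration only (it is not the algorithm of any particular R package) and uses squared multiple correlations as the initial communalities:

R> pfa <- function(R, k) {
+      h2 <- 1 - 1 / diag(solve(R))   # squared multiple correlations
+      Rstar <- R
+      diag(Rstar) <- h2              # reduced correlation matrix
+      e <- eigen(Rstar, symmetric = TRUE)
+      L <- e$vectors[, 1:k, drop = FALSE] %*%
+           diag(sqrt(pmax(e$values[1:k], 0)), k)
+      list(loadings = L, communalities = rowSums(L^2))
+ }

Iterating the method simply replaces h2 by rowSums(L^2) and repeats until the communalities converge.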

5.5.2 Maximum likelihood factor analysis


Maximum likelihood is regarded, by statisticians at least, as perhaps the most
respectable method of estimating the parameters in the factor analysis model. The
essence of this approach is to assume that the data being analysed have a
multivariate normal distribution (see Chapter 1). Under this assumption and
assuming the factor analysis model holds, the likelihood function L can be
shown to be −½nF plus a function of the observations, where F is given by

F = ln|ΛΛ⊤ + Ψ| + trace(S(ΛΛ⊤ + Ψ)⁻¹) − ln|S| − q.

The function F takes the value zero if ΛΛ⊤ + Ψ is equal to S and values greater
than zero otherwise. Estimates of the loadings and the specific variances are
found by minimising F with respect to these parameters. A number of iterative
numerical algorithms have been suggested; for details see Lawley and Maxwell
(1963), Mardia et al. (1979), Everitt (1984, 1987), and Rubin and Thayer
(1982).
Initial values of the factor loadings and specific variances can be found in
a number of ways, including that described above in Section 5.5.1. As with
iterated principal factor analysis, the maximum likelihood approach can also
experience difficulties with Heywood cases.
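
To make the estimation idea concrete, F can be minimised directly with optim() for the one-factor Spearman example. This is a naive sketch (real implementations are considerably more careful), with the specific variances parameterised on the log scale to keep them positive:

R> Fml <- function(par, S) {
+      q <- nrow(S)
+      Sigma <- tcrossprod(matrix(par[1:q], ncol = 1)) +
+               diag(exp(par[-(1:q)]))   # one-factor implied covariance
+      log(det(Sigma)) + sum(diag(S %*% solve(Sigma))) -
+          log(det(S)) - q
+ }
R> S <- matrix(c(1, 0.83, 0.78, 0.83, 1, 0.67, 0.78, 0.67, 1), 3, 3)
R> fit <- optim(c(rep(0.5, 3), rep(log(0.5), 3)), Fml, S = S)
R> round(fit$par[1:3], 2)  # close to the loadings found by hand above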

5.6 Estimating the number of factors

The decision over how many factors, k, are needed to give an adequate rep-
resentation of the observed covariances or correlations is generally critical
when fitting an exploratory factor analysis model. Solutions with k = m and
k = m + 1 will often produce quite different factor loadings for all factors,
unlike a principal components analysis, in which the first m components will
be identical in each solution. And, as pointed out by Jolliffe (2002), with too
few factors there will be too many high loadings, and with too many factors,
factors may be fragmented and difficult to interpret convincingly.
Choosing k might be done by examining solutions corresponding to dif-
ferent values of k and deciding subjectively which can be given the most
convincing interpretation. Another possibility is to use the scree diagram ap-
proach described in Chapter 3, although the usefulness of this method is not
so clear in factor analysis since the eigenvalues represent variances of principal
components, not factors.
An advantage of the maximum likelihood approach is that it has an associ-
ated formal hypothesis testing procedure that provides a test of the hypothesis
Hk that k common factors are sufficient to describe the data against the alter-
native that the population covariance matrix of the data has no constraints.
The test statistic is
U = N min(F),

where N = n + 1 − (2q + 5)/6 − 2k/3. If k common factors are adequate to
account for the observed covariances or correlations of the manifest variables
(i.e., Hk is true), then U has, asymptotically, a chi-squared distribution with
ν degrees of freedom, where
ν = ½(q − k)² − ½(q + k).
In most exploratory studies, k cannot be specified in advance and so a sequen-
tial procedure is used. Starting with some small value for k (usually k = 1),
the parameters in the corresponding factor analysis model are estimated using
maximum likelihood. If U is not significant, the current value of k is accepted;
otherwise k is increased by one and the process is repeated. If at any stage the
degrees of freedom of the test become zero, then either no non-trivial solution
is appropriate or alternatively the factor model itself, with its assumption of
linearity between observed and latent variables, is questionable. (This proce-
dure is open to criticism because the critical values of the test criterion have
not been adjusted to allow for the fact that a set of hypotheses are being
tested in sequence.)
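
The degrees of freedom are easily computed. For instance, the life expectancy data analysed in Section 5.9.1 have q = 8 manifest variables, so a k = 3 solution is tested on ν = 7 degrees of freedom, matching the output shown there:

R> nu <- function(q, k) 0.5 * (q - k)^2 - 0.5 * (q + k)
R> nu(q = 8, k = 3)
[1] 7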

5.7 Factor rotation

Up until now, we have conveniently ignored one problematic feature of the
factor analysis model, namely that, as formulated in Section 5.3, there is no
unique solution for the factor loading matrix. We can see that this is so by
introducing an orthogonal matrix M of order k × k and rewriting the basic
regression equation linking the observed and latent variables as

x = (ΛM)(M⊤f) + u.

This “new” model satisfies all the requirements of a k-factor model as previ-
ously outlined with new factors f ∗ = Mf and the new factor loadings ΛM.
This model implies that the covariance matrix of the observed variables is

Σ = (ΛM)(ΛM)⊤ + Ψ,
which, since MM⊤ = I, reduces to Σ = ΛΛ⊤ + Ψ as before. Consequently,
factors f with loadings Λ and factors f ∗ with loadings ΛM are, for any or-
thogonal matrix M, equivalent for explaining the covariance matrix of the
observed variables. Essentially then there are an infinite number of solutions
to the factor analysis model as previously formulated.
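
A small numerical illustration of this indeterminacy: for an arbitrary 4 × 2 loading matrix and a rotation M through 30 degrees, Λ and ΛM generate exactly the same covariance structure (the values below are illustrative only):

R> Lambda <- matrix(c(0.8, 0.7, 0.6, 0.5,
+                     0.3, -0.2, 0.4, -0.1), ncol = 2)
R> theta <- pi / 6
R> M <- matrix(c(cos(theta), sin(theta),
+               -sin(theta), cos(theta)), ncol = 2)  # orthogonal: M M' = I
R> all.equal(tcrossprod(Lambda), tcrossprod(Lambda %*% M))
[1] TRUE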
The problem is generally solved by introducing some constraints in the
original model. One possibility is to require the matrix G given by

G = Λ⊤Ψ⁻¹Λ

to be diagonal, with its elements arranged in descending order of magnitude.
Such a requirement sets the first factor to have maximal contribution to the
common variance of the observed variables, and the second has maximal con-
tribution to this variance subject to being uncorrelated with the first and so
on (cf. principal components analysis in Chapter 3). The constraint above
ensures that Λ is uniquely determined, except for a possible change of sign of
the columns. (When k = 1, the constraint is irrelevant.)
The constraints on the factor loadings imposed by a condition such as
that given above need to be introduced to make the parameter estimates in
the factor analysis model unique, and they lead to orthogonal factors that
are arranged in descending order of importance. These properties are not,
however, inherent in the factor model, and merely considering such a solution
may lead to difficulties of interpretation. For example, two consequences of a
factor solution found when applying the constraint above are:
- The factorial complexity of variables is likely to be greater than one regardless of the underlying true model; consequently variables may have substantial loadings on more than one factor.
- Except for the first factor, the remaining factors are often bipolar; i.e., they have a mixture of positive and negative loadings.
It may be that a more interpretable orthogonal solution can be achieved using
the equivalent model with loadings Λ∗ = ΛM for some particular orthogonal
matrix, M. Such a process is generally known as factor rotation, but before
we consider how to choose M (i.e., how to “rotate” the factors), we need to
address the question “is factor rotation an acceptable process?”
Certainly factor analysis has in the past been the subject of severe criticism
because of the possibility of rotating factors. Critics have suggested that this
apparently allows investigators to impose on the data whatever type of solu-
tion they are looking for; some have even gone so far as to suggest that factor
analysis has become popular in some areas precisely because it does enable
users to impose their preconceived ideas of the structure behind the observed
correlations (Blackith and Reyment 1971). But, on the whole, such suspicions
are not justified and factor rotation can be a useful procedure for simplifying
an exploratory factor analysis. Factor rotation merely allows the fitted factor
analysis model to be described as simply as possible; rotation does not alter
the overall structure of a solution but only how the solution is described. Ro-
tation is a process by which a solution is made more interpretable without
changing its underlying mathematical properties. Initial factor solutions with
variables loading on several factors and with bipolar factors can be difficult
to interpret. Interpretation is more straightforward if each variable is highly
loaded on at most one factor and if all factor loadings are either large and
positive or near zero, with few intermediate values. The variables are thus
split into disjoint sets, each of which is associated with a single factor. This
aim is essentially what Thurstone (1931) referred to as simple structure. In
more detail, such structure has the following properties:
- Each row of the factor loading matrix should contain at least one zero.
- Each column of the loading matrix should contain at least k zeros.
- Every pair of columns of the loading matrix should contain several variables whose loadings vanish in one column but not in the other.
- If the number of factors is four or more, every pair of columns should contain a large number of variables with zero loadings in both columns.
- Conversely, for every pair of columns of the loading matrix only a small number of variables should have non-zero loadings in both columns.
When simple structure is achieved, the observed variables will fall into mu-
tually exclusive groups whose loadings are high on single factors, perhaps
moderate to low on a few factors, and of negligible size on the remaining
factors. Medium-sized, equivocal loadings are to be avoided.
The search for simple structure or something close to it begins after an
initial factoring has determined the number of common factors necessary and
the communalities of each observed variable. The factor loadings are then
transformed by post-multiplication by a suitably chosen orthogonal matrix.
Such a transformation is equivalent to a rigid rotation of the axes of the origi-
nally identified factor space. And during the rotation phase of the analysis, we
might choose to abandon one of the assumptions made previously, namely that
factors are orthogonal, i.e., independent (the condition was assumed initially
simply for convenience in describing the factor analysis model). Consequently,
two types of rotation are possible:
- orthogonal rotation, in which methods restrict the rotated factors to being uncorrelated, or
- oblique rotation, where methods allow correlated factors.
As we have seen above, orthogonal rotation is achieved by post-multiplying
the original matrix of loadings by an orthogonal matrix. For oblique rotation,
the original loadings matrix is post-multiplied by a matrix that is no longer
constrained to be orthogonal. With an orthogonal rotation, the matrix of
correlations between factors after rotation is the identity matrix. With an
oblique rotation, the corresponding matrix of correlations is restricted to have
unit elements on its diagonal, but there are no restrictions on the off-diagonal
elements.
So the first question that needs to be considered when rotating factors
is whether we should use an orthogonal or an oblique rotation. As for many
questions posed in data analysis, there is no universal answer to this ques-
tion. There are advantages and disadvantages to using either type of rotation
procedure. As a general rule, if a researcher is primarily concerned with get-
ting results that “best fit” his or her data, then the factors should be rotated
obliquely. If, on the other hand, the researcher is more interested in the gen-
eralisability of his or her results, then orthogonal rotation is probably to be
preferred.
One major advantage of an orthogonal rotation is simplicity since the
loadings represent correlations between factors and manifest variables. This
is not the case with an oblique rotation because of the correlations between
the factors. Here there are two parts of the solution to consider:
- factor pattern coefficients, which are regression coefficients that multiply with factors to produce measured variables according to the common factor model, and
- factor structure coefficients, correlation coefficients between manifest variables and the factors.
Additionally there is a matrix of factor correlations to consider. In many cases
where these correlations are relatively small, researchers may prefer to return
to an orthogonal solution.
There are a variety of rotation techniques, although only relatively few
are in general use. For orthogonal rotation, the two most commonly used
techniques are known as varimax and quartimax.
- Varimax rotation, originally proposed by Kaiser (1958), has as its rationale the aim of factors with a few large loadings and as many near-zero loadings as possible. This is achieved by iterative maximisation of a quadratic function of the loadings; details are given in Mardia et al. (1979). It produces factors that have high correlations with one small set of variables and little or no correlation with other sets. There is a tendency for any general factor to disappear because the factor variance is redistributed.
- Quartimax rotation, originally suggested by Carroll (1953), forces a given variable to correlate highly on one factor and either not at all or very low on other factors. It is far less popular than varimax.
For oblique rotation, the two methods most often used are oblimin and promax.
- Oblimin rotation, invented by Jennrich and Sampson (1966), attempts to find simple structure with regard to the factor pattern matrix through a parameter that is used to control the degree of correlation between the factors. Fixing a value for this parameter is not straightforward, but Pett, Lackey, and Sullivan (2003) suggest that values between about −0.5 and 0.5 are sensible for many applications.
- Promax rotation, a method due to Hendrickson and White (1964), operates by raising the loadings in an orthogonal solution (generally a varimax rotation) to some power. The goal is to obtain a solution that provides the best structure using the lowest possible power loadings and the lowest correlation between the factors.
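
In R, orthogonal and oblique rotations of a given loading matrix are available through the varimax() and promax() functions in the stats package. As a sketch (assuming the life data frame introduced in Section 5.9.1 is available), the default varimax rotation used by factanal() can be reproduced, or replaced by promax, as follows:

R> fa <- factanal(life, factors = 3, rotation = "none")
R> varimax(loadings(fa))   # orthogonal rotation
R> promax(loadings(fa))    # oblique rotation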
Factor rotation is often regarded as controversial since it apparently allows
the investigator to impose on the data whatever type of solution is required.
But this is clearly not the case since although the axes may be rotated about
their origin or may be allowed to become oblique, the distribution of the points
will remain invariant. Rotation is simply a procedure that allows new axes to
be chosen so that the positions of the points can be described as simply as
possible.
(It should be noted that rotation techniques are also often applied to the
results from a principal components analysis in the hope that they will aid
in their interpretability. Although in some cases this may be acceptable, it
does have several disadvantages, which are listed by Jolliffe (1989). The main
problem is that the defining property of principal components, namely that
of accounting for maximal proportions of the total variation in the observed
variables, is lost after rotation.)

5.8 Estimating factor scores


The first stage of an exploratory factor analysis consists of the estimation
of the parameters in the model and the rotation of the factors, followed by
an (often heroic) attempt to interpret the fitted model. The second stage is
concerned with estimating latent variable scores for each individual in the
data set; such factor scores are often useful for a number of reasons:
1. They represent a parsimonious summary of the original data possibly use-
ful in subsequent analyses (cf. principal component scores in Chapter 3).
2. They are likely to be more reliable than the observed variable values.
3. The factor score is a “pure” measure of a latent variable, while an observed
value may be ambiguous because we do not know what combination of
latent variables may be represented by that observed value.
But the calculation of factor scores is not as straightforward as the calcula-
tion of principal component scores. In the original equation defining the factor
analysis model, the variables are expressed in terms of the factors, whereas
to calculate scores we require the relationship to be in the opposite direction.
Bartholomew and Knott (1987) make the point that to talk about “estimat-
ing” factor scores is essentially misleading since they are random variables
and the issue is really one of prediction. But if we make the assumption of
normality, the conditional distribution of f given x can be found. It is

N(Λ⊤Σ⁻¹x, (Λ⊤Ψ⁻¹Λ + I)⁻¹).


Consequently, one plausible way of calculating factor scores would be to use
the sample version of the mean of this distribution, namely

f̂ = Λ̂⊤S⁻¹x,

where the vector of scores for an individual, x, is assumed to have mean
zero; i.e., sample means for each variable have already been subtracted. Other
possible methods for deriving factor scores are described in Rencher (1995),
and helpful detailed calculations of several types of factor scores are given
in Hershberger (2005). In many respects, the most damaging problem with
factor analysis is not the rotational indeterminacy of the loadings but the
indeterminacy of the factor scores.
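
A sketch of this "regression" score computation, applied to the life data of Section 5.9.1: standardise the observations and post-multiply by S⁻¹Λ̂, here with S the sample correlation matrix. This is essentially what factanal()'s scores = "regression" option computes, although the internal details may differ slightly:

R> fa <- factanal(life, factors = 3)
R> z <- scale(life)   # standardised manifest variables
R> head(z %*% solve(cor(life), loadings(fa)))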

5.9 Two examples of exploratory factor analysis

5.9.1 Expectations of life


The data in Table 5.1 show life expectancy in years by country, age, and sex.
The data come from Keyfitz and Flieger (1971) and relate to life expectancies
in the 1960s.

Table 5.1: life data. Life expectancies for different countries by age and gender.

m0 m25 m50 m75 w0 w25 w50 w75
Algeria 63 51 30 13 67 54 34 15
Cameroon 34 29 13 5 38 32 17 6
Madagascar 38 30 17 7 38 34 20 7
Mauritius 59 42 20 6 64 46 25 8
Reunion 56 38 18 7 62 46 25 10
Seychelles 62 44 24 7 69 50 28 14
South Africa (C) 50 39 20 7 55 43 23 8
South Africa (W) 65 44 22 7 72 50 27 9
Tunisia 56 46 24 11 63 54 33 19
Canada 69 47 24 8 75 53 29 10
Costa Rica 65 48 26 9 68 50 27 10
Dominican Rep. 64 50 28 11 66 51 29 11
El Salvador 56 44 25 10 61 48 27 12
Greenland 60 44 22 6 65 45 25 9
Grenada 61 45 22 8 65 49 27 10
Guatemala 49 40 22 9 51 41 23 8
Honduras 59 42 22 6 61 43 22 7
Jamaica 63 44 23 8 67 48 26 9
Mexico 59 44 24 8 63 46 25 8

Table 5.1: life data (continued).

m0 m25 m50 m75 w0 w25 w50 w75


Nicaragua 65 48 28 14 68 51 29 13
Panama 65 48 26 9 67 49 27 10
Trinidad (62) 64 63 21 7 68 47 25 9
Trinidad (67) 64 43 21 6 68 47 24 8
United States (66) 67 45 23 8 74 51 28 10
United States (NW66) 61 40 21 10 67 46 25 11
United States (W66) 68 46 23 8 75 52 29 10
United States (67) 67 45 23 8 74 51 28 10
Argentina 65 46 24 9 71 51 28 10
Chile 59 43 23 10 66 49 27 12
Colombia 58 44 24 9 62 47 25 10
Ecuador 57 46 28 9 60 49 28 11

To begin, we will use the formal test for the number of factors incorporated
into the maximum likelihood approach. We can apply this test to the data,
assumed to be contained in the data frame life with the country names
labelling the rows and variable names as given in Table 5.1, using the following
R code:
R> sapply(1:3, function(f)
+      factanal(life, factors = f, method = "mle")$PVAL)
objective objective objective
1.880e-24 1.912e-05 4.578e-01
These results suggest that a three-factor solution might be adequate to account
for the observed covariances in the data, although it has to be remembered
that, with only 31 countries, use of an asymptotic test result may be rather
suspect. The three-factor solution is as follows (note that the solution is that
resulting from a varimax rotation, the default for the factanal() function):
R> factanal(life, factors = 3, method = "mle")

Call:
factanal(x = life, factors = 3, method = "mle")

Uniquenesses:
m0 m25 m50 m75 w0 w25 w50 w75
0.005 0.362 0.066 0.288 0.005 0.011 0.020 0.146

Loadings:
Factor1 Factor2 Factor3
m0 0.964 0.122 0.226
m25 0.646 0.169 0.438
m50 0.430 0.354 0.790
m75 0.525 0.656
w0 0.970 0.217
w25 0.764 0.556 0.310
w50 0.536 0.729 0.401
w75 0.156 0.867 0.280

Factor1 Factor2 Factor3
SS loadings 3.375 2.082 1.640
Proportion Var 0.422 0.260 0.205
Cumulative Var 0.422 0.682 0.887

Test of the hypothesis that 3 factors are sufficient.
The chi square statistic is 6.73 on 7 degrees of freedom.
The p-value is 0.458

(“Blanks” replace negligible loadings.) Examining the estimated factor load-
ings, we see that the first factor is dominated by life expectancy at birth for
both males and females; perhaps this factor could be labelled “life force at
birth”. The second reflects life expectancies at older ages, and we might label
it “life force amongst the elderly”. The third factor from the varimax rotation
has its highest loadings for the life expectancies of men aged 50 and 75 and in
the same vein might be labelled “life force for elderly men”. (When labelling
factors in this way, factor analysts can often be extremely creative!)
The estimated factor scores are found as follows:
R> (scores <- factanal(life, factors = 3, method = "mle",
+ scores = "regression")$scores)

Factor1 Factor2 Factor3
Algeria -0.258063 1.90096 1.91582
Cameroon -2.782496 -0.72340 -1.84772
Madagascar -2.806428 -0.81159 -0.01210
Mauritius 0.141005 -0.29028 -0.85862
Reunion -0.196352 0.47430 -1.55046
Seychelles 0.367371 0.82902 -0.55214
South Africa (C) -1.028568 -0.08066 -0.65422
South Africa (W) 0.946194 0.06400 -0.91995
Tunisia -0.862494 3.59177 -0.36442
Canada 1.245304 0.29564 -0.27343
Costa Rica 0.508736 -0.50500 1.01329
Dominican Rep. 0.106044 0.01111 1.83872
El Salvador -0.608156 0.65101 0.48836
Greenland 0.235114 -0.69124 -0.38559
Grenada 0.132008 0.25241 -0.15221
Guatemala -1.450336 -0.67766 0.65912
Honduras 0.043253 -1.85176 0.30633
Jamaica 0.462125 -0.51918 0.08033
Mexico -0.052333 -0.72020 0.44418
Nicaragua 0.268974 0.08407 1.70568
Panama 0.442333 -0.73778 1.25219
Trinidad (62) 0.711367 -0.95989 -0.21545
Trinidad (67) 0.787286 -1.10729 -0.51958
United States (66) 1.128331 0.16390 -0.68177
United States (NW66) 0.400059 -0.36230 -0.74299
United States (W66) 1.214345 0.40877 -0.69225
United States (67) 1.128331 0.16390 -0.68177
Argentina 0.731345 0.24812 -0.12818
Chile 0.009752 0.75223 -0.49199
Colombia -0.240603 -0.29544 0.42920
Ecuador -0.723452 0.44246 1.59165

We can use the scores to provide the plot of the data shown in Figure 5.1.
Ordering along the first axis reflects life force at birth ranging from
Cameroon and Madagascar to countries such as the USA. And on the third
axis Algeria is prominent because it has high life expectancy amongst men
at higher ages, with Cameroon at the lower end of the scale with a low life
expectancy for men over 50.

[Figure 5.1 appears here: pairwise scatterplots of the three estimated factor scores, with panels Factor 2 against Factor 1, Factor 3 against Factor 1, and Factor 3 against Factor 2.]

Fig. 5.1. Individual scatterplots of three factor scores for life expectancy data, with points labelled by abbreviated country names.

5.9.2 Drug use by American college students


The majority of adult and adolescent Americans regularly use psychoactive
substances during an increasing proportion of their lifetimes. Various forms
of licit and illicit psychoactive substance use are prevalent, suggesting that
patterns of psychoactive substance taking are a major part of the individual’s
behavioural repertory and have pervasive implications for the performance of
other behaviours. In an investigation of these phenomena, Huba, Wingard,
and Bentler (1981) collected data on drug usage rates for 1634 students in
the seventh to ninth grades in 11 schools in the greater metropolitan area of
Los Angeles. Each participant completed a questionnaire about the number of
times a particular substance had ever been used. The substances asked about
were as follows:
- cigarettes;
- beer;
- wine;
- liquor;
- cocaine;
- tranquillizers;
- drug store medications used to get high;

- heroin and other opiates;
- marijuana;
- hashish;
- inhalants (glue, gasoline, etc.);
- hallucinogenics (LSD, mescaline, etc.);
- amphetamine stimulants.
Responses were recorded on a five-point scale: never tried, only once, a few
times, many times, and regularly. The correlations between the usage rates
of the 13 substances are shown in Figure 5.2. The plot was produced using
the levelplot() function from the package lattice with a somewhat lengthy
panel function, so we refer the interested reader to the R code contained in
the demo for this chapter (see the Preface for how to access this document).
The figure depicts each correlation by an ellipse whose shape tends towards
a line with slope 1 for correlations near 1, to a circle for correlations near
zero, and to a line with negative slope −1 for negative correlations near −1.
In addition, 100 times the correlation coefficient is printed inside the ellipse,
and a colour coding indicates strong negative (dark) to strong positive (light)
correlations.

[Figure 5.2 appears here.]

Fig. 5.2. Visualisation of the correlation matrix of drug use. The numbers in the cells correspond to 100 times the correlation coefficient. The colour and the shape of the plotting symbols also correspond to the correlation in this cell.
We first try to determine the number of factors using the maximum likeli-
hood test. The R code for finding the results of the test for number of factors
here is:
R> sapply(1:6, function(nf)
+ factanal(covmat = druguse, factors = nf,
+ method = "mle", n.obs = 1634)$PVAL)
objective objective objective objective objective objective
0.000e+00 9.786e-70 7.364e-28 1.795e-11 3.892e-06 9.753e-02
These values suggest that only the six-factor solution provides an adequate
fit. The results from the six-factor varimax solution are obtained from
R> (factanal(covmat = druguse, factors = 6,
+ method = "mle", n.obs = 1634))
Call:
factanal(factors = 6, covmat = druguse, n.obs = 1634)

Uniquenesses:
cigarettes beer
0.563 0.368
wine liquor
0.374 0.412
cocaine tranquillizers
0.681 0.522
drug store medication heroin
0.785 0.669

marijuana hashish
0.318 0.005
inhalants hallucinogenics
0.541 0.620
amphetamine
0.005

Loadings:
Factor1 Factor2 Factor3 Factor4 Factor5
cigarettes 0.494 0.407
beer 0.776 0.112
wine 0.786
liquor 0.720 0.121 0.103 0.115 0.160
cocaine 0.519 0.132
tranquillizers 0.130 0.564 0.321 0.105 0.143
drug store medication 0.255
heroin 0.532 0.101
marijuana 0.429 0.158 0.152 0.259 0.609
hashish 0.244 0.276 0.186 0.881 0.194
inhalants 0.166 0.308 0.150 0.140
hallucinogenics 0.387 0.335 0.186
amphetamine 0.151 0.336 0.886 0.145 0.137
Factor6
cigarettes 0.110
beer
wine
liquor
cocaine 0.158
tranquillizers
drug store medication 0.372
heroin 0.190
marijuana 0.110
hashish 0.100
inhalants 0.537
hallucinogenics 0.288
amphetamine 0.187

Factor1 Factor2 Factor3 Factor4 Factor5 Factor6
SS loadings 2.301 1.415 1.116 0.964 0.676 0.666
Proportion Var 0.177 0.109 0.086 0.074 0.052 0.051
Cumulative Var 0.177 0.286 0.372 0.446 0.498 0.549

Test of the hypothesis that 6 factors are sufficient.
The chi square statistic is 22.41 on 15 degrees of freedom.
The p-value is 0.0975
Substances that load highly on the first factor are cigarettes, beer, wine, liquor,
and marijuana and we might label it “social/soft drug use”. Cocaine, tranquil-
lizers, and heroin load highly on the second factor; the obvious label for the
factor is “hard drug use”. Factor three is essentially simply amphetamine use,
and factor four hashish use. We will not try to interpret the last two factors,
even though the formal test for number of factors indicated that a six-factor
solution was necessary. It may be that we should not take the results of the
formal test too literally; rather, it may be a better strategy to consider the
value of k indicated by the test to be an upper bound on the number of factors
with practical importance. Certainly a six-factor solution for a data set with
only 13 manifest variables might be regarded as not entirely satisfactory, and
clearly we would have some difficulties interpreting all the factors.

One of the problems is that with the large sample size in this example, even
small discrepancies between the correlation matrix predicted by a proposed
model and the observed correlation matrix may lead to rejection of the model.
One way to investigate this possibility is simply to look at the differences
between the observed and predicted correlations. We shall do this first for the
six-factor model using the following R code:
R> pfun <- function(nf) {
+      fa <- factanal(covmat = druguse, factors = nf,
+                     method = "mle", n.obs = 1634)
+      est <- tcrossprod(fa$loadings) + diag(fa$uniquenesses)
+      ret <- round(druguse - est, 3)
+      colnames(ret) <- rownames(ret) <- abbreviate(rownames(ret), 3)
+      ret
+ }
R> pfun(6)
R> pfun(6)
cgr ber win lqr ccn trn dsm hrn
cgr 0.000 -0.001 0.014 -0.018 0.010 0.001 -0.020 -0.004
ber -0.001 0.000 -0.002 0.004 0.004 -0.011 -0.001 0.007
win 0.014 -0.002 0.000 -0.001 -0.001 -0.005 0.008 0.008
lqr -0.018 0.004 -0.001 0.000 -0.008 0.021 -0.006 -0.018
ccn 0.010 0.004 -0.001 -0.008 0.000 0.000 0.008 0.004
trn 0.001 -0.011 -0.005 0.021 0.000 0.000 0.006 -0.004
dsm -0.020 -0.001 0.008 -0.006 0.008 0.006 0.000 -0.015
hrn -0.004 0.007 0.008 -0.018 0.004 -0.004 -0.015 0.000
mrj 0.001 0.002 -0.004 0.003 -0.004 -0.004 0.008 0.006
hsh 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
inh 0.010 -0.004 -0.007 0.012 -0.003 0.002 0.004 -0.002
hll -0.005 0.005 -0.001 -0.005 -0.008 -0.008 -0.002 0.020
amp 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
mrj hsh inh hll amp
cgr 0.001 0 0.010 -0.005 0
ber 0.002 0 -0.004 0.005 0
win -0.004 0 -0.007 -0.001 0
lqr 0.003 0 0.012 -0.005 0
ccn -0.004 0 -0.003 -0.008 0
trn -0.004 0 0.002 -0.008 0
dsm 0.008 0 0.004 -0.002 0
hrn 0.006 0 -0.002 0.020 0
mrj 0.000 0 -0.006 0.003 0
hsh 0.000 0 0.000 0.000 0
inh -0.006 0 0.000 -0.002 0
hll 0.003 0 -0.002 0.000 0
amp 0.000 0 0.000 0.000 0

The differences are all very small, underlining that the six-factor model does
describe the data very well. Now let us look at the corresponding matrices for
the three- and four-factor solutions found in a similar way in Figure 5.3. Again,
in both cases the residuals are all relatively small, suggesting perhaps that use
of the formal test for number of factors leads, in this case, to overfitting. The
three-factor model appears to provide a perfectly adequate fit for these data.

5.10 Factor analysis and principal components analysis compared

Factor analysis, like principal components analysis, is an attempt to explain a
set of multivariate data using a smaller number of dimensions than one begins
with, but the procedures used to achieve this goal are essentially quite different
in the two approaches. Some differences between the two are as follows:
- Factor analysis tries to explain the covariances or correlations of the observed variables by means of a few common factors. Principal components analysis is primarily concerned with explaining the variance of the observed variables.
- If the number of retained components is increased, say from m to m + 1, the first m components are unchanged. This is not the case in factor analysis, where there can be substantial changes in all factors if the number of factors is changed.
- The calculation of principal component scores is straightforward, but the calculation of factor scores is more complex, and a variety of methods have been suggested.
- There is usually no relationship between the principal components of the sample correlation matrix and the sample covariance matrix. For maximum likelihood factor analysis, however, the results of analysing either matrix are essentially equivalent (which is not true of principal factor analysis).
Despite these differences, the results from both types of analyses are fre-
quently very similar. Certainly, if the specific variances are small, we would
expect both forms of analyses to give similar results. However, if the specific
variances are large, they will be absorbed into all the principal components,
both retained and rejected, whereas factor analysis makes special provision
for them.
Lastly, it should be remembered that both principal components analy-
sis and factor analysis are similar in one important respect: they are both
pointless if the observed variables are almost uncorrelated. In this case, factor
analysis has nothing to explain and principal components analysis will simply
lead to components that are similar to the original variables.
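
These points can be illustrated by setting the two sets of loadings for the life expectancy data side by side. In the sketch below, the principal component loadings are scaled by the component standard deviations so that, for standardised data, they are the correlations between the variables and the components:

R> pc <- prcomp(life, scale. = TRUE)
R> round(pc$rotation[, 1:3] %*% diag(pc$sdev[1:3]), 2)  # PCA "loadings"
R> round(loadings(factanal(life, factors = 3)), 2)      # ML factor loadings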
R> pfun(3)

cgr ber win lqr ccn trn dsm hrn mrj hsh inh hll amp
cgr 0.000 -0.001 0.009 -0.013 0.011 0.009 -0.011 -0.004 0.003 -0.027 0.039 -0.017 0.002
ber -0.001 0.000 -0.002 0.002 0.002 -0.014 0.000 0.005 -0.001 0.019 -0.002 0.009 -0.007
win 0.009 -0.002 0.000 0.000 -0.002 -0.004 0.012 0.013 0.001 -0.017 -0.007 0.004 0.002
lqr -0.013 0.002 0.000 0.000 -0.008 0.024 -0.017 -0.020 -0.001 0.014 -0.002 -0.015 0.006
ccn 0.011 0.002 -0.002 -0.008 0.000 0.031 0.038 0.082 -0.002 0.041 0.023 -0.030 -0.075
trn 0.009 -0.014 -0.004 0.024 0.031 0.000 -0.021 0.026 -0.002 -0.016 -0.038 -0.058 0.044
dsm -0.011 0.000 0.012 -0.017 0.038 -0.021 0.000 0.021 0.007 -0.040 0.113 0.000 -0.038
hrn -0.004 0.005 0.013 -0.020 0.082 0.026 0.021 0.000 0.006 -0.035 0.031 -0.005 -0.049
mrj 0.003 -0.001 0.001 -0.001 -0.002 -0.002 0.007 0.006 0.000 0.001 0.003 -0.002 -0.002
hsh -0.027 0.019 -0.017 0.014 0.041 -0.016 -0.040 -0.035 0.001 0.000 -0.035 0.034 0.010
inh 0.039 -0.002 -0.007 -0.002 0.023 -0.038 0.113 0.031 0.003 -0.035 0.000 0.007 -0.015
hll -0.017 0.009 0.004 -0.015 -0.030 -0.058 0.000 -0.005 -0.002 0.034 0.007 0.000 0.041
amp 0.002 -0.007 0.002 0.006 -0.075 0.044 -0.038 -0.049 -0.002 0.010 -0.015 0.041 0.000

R> pfun(4)
cgr ber win lqr ccn trn dsm hrn mrj hsh inh hll amp
cgr 0.000 -0.001 0.008 -0.012 0.009 0.008 -0.015 -0.007 0.001 -0.023 0.037 -0.020 0.000
ber -0.001 0.000 -0.001 0.001 0.000 -0.016 -0.002 0.003 -0.001 0.018 -0.005 0.006 0.000
win 0.008 -0.001 0.000 0.000 -0.001 -0.005 0.012 0.014 0.001 -0.020 -0.008 0.001 0.000
lqr -0.012 0.001 0.000 0.000 -0.004 0.029 -0.015 -0.015 -0.001 0.018 0.001 -0.010 -0.001
ccn 0.009 0.000 -0.001 -0.004 0.000 0.024 -0.014 0.007 -0.003 0.035 -0.022 -0.028 0.000
trn 0.008 -0.016 -0.005 0.029 0.024 0.000 -0.020 0.027 -0.001 0.001 -0.032 -0.028 0.001
dsm -0.015 -0.002 0.012 -0.015 -0.014 -0.020 0.000 -0.018 0.003 -0.042 0.090 0.008 0.000
hrn -0.007 0.003 0.014 -0.015 0.007 0.027 -0.018 0.000 0.003 -0.037 -0.001 0.005 0.000
mrj 0.001 -0.001 0.001 -0.001 -0.003 -0.001 0.003 0.003 0.000 0.000 0.001 -0.002 0.000
hsh -0.023 0.018 -0.020 0.018 0.035 0.001 -0.042 -0.037 0.000 0.000 -0.031 0.055 -0.001
inh 0.037 -0.005 -0.008 0.001 -0.022 -0.032 0.090 -0.001 0.001 -0.031 0.000 0.021 0.000
hll -0.020 0.006 0.001 -0.010 -0.028 -0.028 0.008 0.005 -0.002 0.055 0.021 0.000 0.000
amp 0.000 0.000 0.000 -0.001 0.000 0.001 0.000 0.000 0.000 -0.001 0.000 0.000 0.000

Fig. 5.3. Differences between three- and four-factor solutions and actual correlation matrix for the drug use data.

5.11 Summary
Factor analysis has probably attracted more critical comments than any other
statistical technique. Hills (1977), for example, has gone so far as to suggest
that factor analysis is not worth the time necessary to understand it and
carry it out. And Chatfield and Collins (1980) recommend that factor analysis
should not be used in most practical situations. The reasons for such an openly
sceptical view about factor analysis arise first from the central role of latent
variables in the factor analysis model and second from the lack of uniqueness of
the factor loadings in the model, which gives rise to the possibility of rotating
factors. It certainly is the case that, since the common factors cannot be
measured or observed, the existence of these hypothetical variables is open to
question. A factor is a construct operationally defined by its factor loadings,
and overly enthusiastic reification is not recommended. And it is the case that,
given one factor loading matrix, there are an infinite number of factor loading
matrices that could equally well (or equally badly) account for the variances
and covariances of the manifest variables. Rotation methods are designed to
find an easily interpretable solution from among this infinitely large set of
alternatives by finding a solution that exhibits the best simple structure.
Factor analysis can be a useful tool for investigating particular features of
the structure of multivariate data. Of course, like many models used in data
analysis, the one used in factor analysis may be only a very idealised approx-
imation to the truth. Such an approximation may, however, prove a valuable
starting point for further investigations, particularly for the confirmatory fac-
tor analysis models that are the subject of Chapter 7.
For exploratory factor analysis, similar comments apply about the size of
n and q needed to get convincing results, such as those given in Chapter 3
for principal components analysis. And the maximum likelihood method for
the estimation of factor loading and specific variances used in this chapter is
only suitable for data having a multivariate normal distribution (or at least a
reasonable approximation to such a distribution). Consequently, for the factor
analysis of, in particular, binary variables, special methods are needed; see,
for example, Muthen (1978).

5.12 Exercises
Ex. 5.1 Show how the result Σ = ΛΛ⊤ + Ψ arises from the assumptions of
uncorrelated factors, independence of the specific variates, and indepen-
dence of common factors and specific variances. What form does Σ take
if the factors are allowed to be correlated?
Ex. 5.2 Show that the communalities in a factor analysis model are unaffected
by the transformation Λ∗ = ΛM.
Ex. 5.3 Give a formula for the proportion of variance explained by the jth
factor estimated by the principal factor approach.

Ex. 5.4 Apply the factor analysis model separately to the life expectancies
of men and women and compare the results.
Ex. 5.5 The correlation matrix given below arises from the scores of 220 boys
in six school subjects: (1) French, (2) English, (3) History, (4) Arithmetic,
(5) Algebra, and (6) Geometry. Find the two-factor solution from a max-
imum likelihood factor analysis. By plotting the derived loadings, find an
orthogonal rotation that allows easier interpretation of the results.

R =
    French      1.00
    English     0.44  1.00
    History     0.41  0.35  1.00
    Arithmetic  0.29  0.35  0.16  1.00
    Algebra     0.33  0.32  0.19  0.59  1.00
    Geometry    0.25  0.33  0.18  0.47  0.46  1.00

Ex. 5.6 The matrix below shows the correlations between ratings on nine
statements about pain made by 123 people suffering from extreme pain.
Each statement was scored on a scale from 1 to 6, ranging from agreement
to disagreement. The nine pain statements were as follows:
1. Whether or not I am in pain in the future depends on the skills of the
doctors.
2. Whenever I am in pain, it is usually because of something I have done
or not done.
3. Whether or not I am in pain depends on what the doctors do for me.
4. I cannot get any help for my pain unless I go to seek medical advice.
5. When I am in pain I know that it is because I have not been taking
proper exercise or eating the right food.
6. People’s pain results from their own carelessness.
7. I am directly responsible for my pain.
8. Relief from pain is chiefly controlled by the doctors.
9. People who are never in pain are just plain lucky.

      1.00
     −0.04  1.00
      0.61 −0.07  1.00
      0.45 −0.12  0.59  1.00
      0.03  0.49  0.03 −0.08  1.00
     −0.29  0.43 −0.13 −0.21  0.47  1.00
     −0.30  0.30 −0.24 −0.19  0.41  0.63  1.00
      0.45 −0.31  0.59  0.63 −0.14 −0.13 −0.26  1.00
      0.30 −0.17  0.32  0.37 −0.24 −0.15 −0.29  0.40  1.00
(a) Perform a principal components analysis on these data, and exam-
ine the associated scree plot to decide on the appropriate number of
components.
(b) Apply maximum likelihood factor analysis, and use the test described
in the chapter to select the necessary number of common factors.

(c) Rotate the factor solution selected using both an orthogonal and an
oblique procedure, and interpret the results.
