An Introduction to Spatial
Data Science with GeoDa
Volume 2 – Clustering Spatial Data
This book is the second in a two-volume series that introduces the field of spatial data sci-
ence. It moves beyond pure data exploration to the organization of observations into meaningful
groups, i.e., spatial clustering. This constitutes an important component of so-called unsuper-
vised learning, a major aspect of modern machine learning.
The distinctive aspects of the book are the spatializing of classic clustering methods through
linked maps and graphs and the explicit introduction of spatial contiguity constraints into
clustering algorithms. Leveraging a large number of real-world empirical il-
lustrations, readers will gain an understanding of the main concepts and techniques and their
relative advantages and disadvantages. The book also constitutes the definitive user’s guide for
these methods as implemented in the GeoDa open-source software for spatial analysis.
It is organized into three major parts, dealing with dimension reduction (principal components,
multidimensional scaling, stochastic network embedding), classic clustering methods (hierar-
chical clustering, k-means, k-medians, k-medoids and spectral clustering) and spatially con-
strained clustering methods (both hierarchical and partitioning). It closes with an assessment of
spatial and non-spatial cluster properties.
The book is intended for readers interested in going beyond simple mapping of geographical
data to gain insight into interesting patterns as expressed in spatial clusters of observations.
Familiarity with the material in Volume 1 is assumed, especially the analysis of local spatial au-
tocorrelation and the full range of visualization methods.
Luc Anselin is the Founding Director of the Center for Spatial Data Science at the University
of Chicago, where he is also Stein-Freiler Distinguished Service Professor of Sociology and the
College, as well as a member of the Committee on Data Science. He is the creator of the GeoDa
software and an active contributor to the PySAL Python open-source software library for spatial
analysis. He has written widely on topics dealing with the methodology of spatial data analysis,
including his classic 1988 text on Spatial Econometrics. His work has been recognized by many
awards, such as his election to the U.S. National Academy of Sciences and the American Academy
of Arts and Sciences.
An Introduction to Spatial
Data Science with GeoDa
Volume 2 – Clustering Spatial Data
Luc Anselin
Designed cover image: © Luc Anselin
Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot as-
sume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have
attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders
if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please
write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho-
tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the
Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are
not available on CCC please contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for iden-
tification and explanation without intent to infringe.
DOI: 10.1201/9781032713175
Contents

List of Figures
Preface
    Acknowledgments

1 Introduction
    1.1 Overview of Volume 2
    1.2 Sample Data Sets

I Dimension Reduction

2 Principal Component Analysis (PCA)
    2.1 Topics Covered
    2.2 Matrix Algebra Review
        2.2.1 Eigenvalues and eigenvectors
        2.2.2 Matrix decompositions
    2.3 Principal Components
        2.3.1 Implementation
        2.3.2 Interpretation
    2.4 Visualizing principal components
        2.4.1 Scatter plot
        2.4.2 Multivariate decomposition
    2.5 Spatializing Principal Components
        2.5.1 Principal component map
        2.5.2 Univariate cluster map
        2.5.3 Principal components as multivariate cluster maps

II Classic Clustering

5 Hierarchical Clustering Methods
    5.1 Topics Covered
    5.2 Dissimilarity
    5.3 Agglomerative Clustering
        5.3.1 Linkage and Updating Formula
        5.3.2 Dendrogram
    5.4 Implementation
        5.4.1 Variable Settings Dialog
        5.4.2 Ward’s method
        5.4.3 Single linkage
        5.4.4 Complete linkage
        5.4.5 Average linkage
        5.4.6 Sensitivity Analysis

IV Assessment

12 Cluster Validation
    12.1 Topics Covered
    12.2 Internal Validity
        12.2.1 Traditional Measures of Fit
        12.2.2 Balance
        12.2.3 Join Count Ratio
        12.2.4 Compactness
        12.2.5 Connectedness
        12.2.6 Implementation
    12.3 External Validity
        12.3.1 Classic Measures
        12.3.2 Visualizing Cluster Match
    12.4 Beyond Clustering

Bibliography
Index

List of Figures

12.1 Clusters > Cluster Match Map | Make Spatial | Validation
12.2 Hierarchical Clustering – Ward’s method, Ceará
12.3 Internal Validation Measures
12.4 Internal Validation Result – Hierarchical Clustering
12.5 Internal Validation Result – AZP with Initial Region
12.6 Adjusted Rand Index
12.7 Normalized Information Distance
12.8 K-Means and SKATER overlap
12.9 SCHC and REDCAP overlap
12.10 Cluster Match Map – SKATER and K-MEANS
Preface
In contrast to the materials covered in Volume 1, this second volume has no precedent in
an earlier workbook. Much of its content has been added in recent years to the GeoDa
documentation pages, as the topics were gradually incorporated into my Introduction to Spatial
Data Science course and implemented in GeoDa. At one point, the material became too much
to constitute a single course and was split off into a separate Spatial Clustering course. The
division of the content between the two volumes follows this organization.
In contrast to the first volume, where the focus is almost exclusively on data exploration,
here attention switches to the delineation of groupings of observations, i.e., clusters. Both
traditional and spatially constrained methods are considered. Again, the emphasis is on how
a spatial perspective can contribute to additional insight, both by considering the spatial
aspects explicitly (as in spatially constrained clustering) as well as through spatializing
classic techniques.
Compared to Volume 1, the treatment is slightly more mathematical and familiarity with the
methods covered in the first volume is assumed. As before, extensive references are provided.
However, in contrast to the first volume, several methods included here are new and have
not been treated extensively in earlier publications. They were typically introduced as part
of the documentation of new features in GeoDa.
The empirical illustrations use the same sample data sets as in Volume 1. These are included
in the software.
All applications are based on Version 1.22 of the software, available in Summer 2023. Later
versions may include slight changes as well as additional features, but the treatment provided
here should remain valid. The software is free, cross-platform and open source, and can be
downloaded from https://geodacenter.github.io/download.html.
Acknowledgments
This second volume is based on enhancements in the GeoDa software implemented in the
past five or so years, with Xun Li as the lead software engineer and Julia Koschinsky as
a constant source of inspiration and constructive comments. The software development
received institutional support from the University of Chicago to the Center for Spatial Data
Science.
Help and suggestions with the production process from Lara Spieker of Chapman & Hall are
greatly appreciated.
As for the first volume, Emily has been patiently living with my GeoDa obsession for many
years. This volume is also dedicated to her.
Shelby, MI, Summer 2023
About the Author
Luc Anselin is the Founding Director of the Center for Spatial Data Science at the University
of Chicago, where he is also Stein-Freiler Distinguished Service Professor of Sociology and the
College. He previously held faculty appointments at Arizona State University, the University
of Illinois at Urbana-Champaign, the University of Texas at Dallas, the Regional Research
Institute at West Virginia University, the University of California, Santa Barbara, and The
Ohio State University. He also was a visiting professor at Brown University and MIT. He
holds a PhD in Regional Science from Cornell University.
Over the past four decades, he has developed new methods for exploratory spatial data
analysis and spatial econometrics, including the widely used local indicators of spatial
autocorrelation. His 1988 Spatial Econometrics text has been cited some 17,000 times. He
has implemented these methods into software, including the original SpaceStat software, as
well as GeoDa, and as part of the Python PySAL library for spatial analysis.
His work has been recognized by several awards, including election to the U.S. National
Academy of Sciences and the American Academy of Arts and Sciences.
1 Introduction
This second volume in the Introduction to Spatial Data Science is devoted to the topic
of spatial clustering. More specifically, it deals with the grouping of observations into a
smaller number of clusters, which are designed to be representative of their members. The
techniques considered constitute an important part of so-called unsupervised learning in
modern machine learning. Purely statistical methods to discover spatial clusters in data are
beyond the scope of this volume.
In contrast to Volume 1, which assumed very little prior (spatial) knowledge, the current
volume is somewhat more advanced. At a minimum, it requires familiarity with the scope
of the exploratory toolbox included in the GeoDa software. In that sense, it clearly builds
upon the material covered in Volume 1. Important principles that are a main part of the
discussion in Volume 1 are assumed known. This includes linking and brushing, the various
types of maps and graphs, spatial weights and spatial autocorrelation statistics.
Much of the material covered in this volume pertains to methods that have been incorporated
into the GeoDa software only in the past few years, so as to support the second part of an
Introduction to Spatial Data Science course sequence. The particular perspective offered is the
tight integration of the clustering results with a spatial representation, through customized
cluster maps and by exploiting linking and brushing.
The treatment is slightly more technical than in the previous volume, but the mathematical
details can readily be skipped if the main interest is in application and interpretation.
Necessarily, the discussion relies on somewhat more formal concepts. Some examples are the
treatment of matrix eigenvalues and matrix decomposition, the concept of graph Laplacian,
essentials of information theory, elements of graph theory, advanced spatial data structures
such as quadtree and vantage point tree, and optimization algorithms like gradient search,
iterative greedy descent, simulated annealing and tabu search. These concepts are not
assumed known but will be explained in the text.
While many of the methods covered constitute part of mainstream data science, the perspec-
tive offered here is rather unique, with an enduring attempt at spatializing the respective
methods. In addition, the treatment of spatially constrained clustering introduces contiguity
as an additional element into clustering algorithms.
Most methods discussed are familiar from the literature, but some are new. Examples
include the common coverage percentage, a local measure of goodness of fit between distance-
preserving dimension reduction methods; two new spatial measures to assess cluster quality,
i.e., the join count ratio and the cluster match map; a heuristic to obtain contiguous results
from classic clustering; and a hybrid approach toward spatially constrained clustering,
whereby the outcome of a given method is used as the initial feasible region in a second
method. These techniques are the result of refinements in the software and in the presentation
of cluster results, and have not been published previously. In addition, the various methods
to spatialize cluster results are mostly also unique to the treatment in this volume.
DOI: 10.1201/9781032713175-1
As in Volume 1, the coverage here also constitutes the definitive user’s guide to the GeoDa
software, complementing the previous discussion.
In the remainder of this introduction, I provide a broad overview of the organization of
Volume 2, followed by a listing of the sample data sets used. As was the case for Volume 1,
these data sets are included as part of the GeoDa software and do not need to be downloaded
separately. For a quick tour of the GeoDa software, I refer to the Introduction of Volume 1.
Dimension Reduction
2 Principal Component Analysis (PCA)
The familiar curse of dimensionality affects analysis across two dimensions. One is the
number of observations (big data) and the other is the number of variables considered.
The methods included in Volume 2 address this problem by reducing the dimensionality,
either in the number of observations (clustering) or in the number of variables (dimension
reduction). The three chapters in Part I address the latter problem. This chapter covers
principal components analysis (PCA), a core method of both multivariate statistics and
machine learning. Dimension reduction is particularly relevant in situations where many
variables are available that are highly intercorrelated. In essence, the original variables are
replaced by a smaller number of proxies that represent them well in terms of their statistical
properties.
Before delving into the formal derivation of principal components, a brief review is included
of some basic concepts from matrix algebra, focusing in particular on matrix decomposition.
Next follows a discussion of the mathematical properties of principal components and their
implementation and interpretation.
A distinct characteristic of this chapter is the attention paid to spatializing the inherently
non-spatial concept of principal components. This is achieved by exploiting geovisualization,
linking and brushing to represent the dimension reduction in geographic space. Of particular
interest are principal component maps and the connection between univariate local cluster
maps for principal components and their multivariate counterpart.
The methods are illustrated using the Italy Community Banks sample data set.
DOI: 10.1201/9781032713175-2
Toolbar Icons
Multiplying a matrix by a vector is slightly more complex, but again corresponds to a simple
geometric transformation. For example, consider the 2 × 2 matrix A:
$$A = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix}.$$
The result of a multiplication of a 2 × 2 matrix by a 2 × 1 column vector is a 2 × 1 column
vector. The first element of this vector is obtained as the product of the matching elements
of the first row with the vector, the second element similarly as the product of the matching
elements of the second row with the vector. In the example, this boils down to:
(1 × 1) + (3 × 2) 7
Av = = .
(3 × 1) + (2 × 2) 5
Geometrically, this consists of a combination of rescaling and rotation. For example, in
Figure 2.2, first the slope of the vector is changed, followed by a rescaling to the point (7,5),
as shown by the blue dashed arrows.
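The arithmetic can be verified with a few lines of code. The sketch below is purely illustrative (it is not part of the GeoDa workflow, which is menu driven) and simply reproduces the example values with NumPy.

```python
import numpy as np

# the example 2 x 2 matrix A and the vector v = (1, 2)
A = np.array([[1.0, 3.0],
              [3.0, 2.0]])
v = np.array([1.0, 2.0])

# pre-multiplying the point (1, 2) by A: a combination of rescaling and rotation
print(A @ v)  # [7. 5.]
```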
A case of particular interest is for any matrix A to find a vector v, such that when post-
multiplied by that vector, there is only rescaling and no rotation. In other words, instead of
finding what happens to the point (1,2) after pre-multiplying by the matrix A, the interest
focuses on finding a particular vector that just moves a point up or down on the same slope
for that particular matrix. As it turns out, there are several such solutions. This problem
is known as finding eigenvectors and eigenvalues for a matrix. It has a broad range of
applications, including in the computation of principal components.
Formally, an eigenvector v of A and its associated eigenvalue λ satisfy Av = λv. What does
this mean? For an eigenvector (i.e., arrow from the origin), the transformation by A does not
rotate the vector, but simply rescales it (i.e., moves it further or closer to the origin), by
exactly the factor λ.
For the example matrix A, the two eigenvectors turn out to be [0.6464 0.7630] and [-0.7630
0.6464], with associated eigenvalues 4.541 and -1.541. Each square matrix has as many
eigenvectors and matching eigenvalues as its rank, in this case 2 – for a 2 by 2 nonsingular
matrix. The actual computation of eigenvalues and eigenvectors is rather complicated, and
is beyond the scope of this discussion.
To further illustrate this concept, consider post-multiplying the matrix A with its eigenvector
[0.6464 0.7630]:
$$A \begin{bmatrix} 0.6464 \\ 0.7630 \end{bmatrix} = \begin{bmatrix} (1 \times 0.6464) + (3 \times 0.7630) \\ (3 \times 0.6464) + (2 \times 0.7630) \end{bmatrix} = \begin{bmatrix} 2.935 \\ 3.465 \end{bmatrix}$$
The eigenvector rescaled by the matching eigenvalue gives the same result:
$$4.541 \times \begin{bmatrix} 0.6464 \\ 0.7630 \end{bmatrix} = \begin{bmatrix} 2.935 \\ 3.465 \end{bmatrix}$$
In other words, for the point (0.6464 0.7630), a pre-multiplication by the matrix A just
moves it by a multiple of 4.541 to a new location on the same slope, without any rotation.
With the eigenvectors stacked as columns in a matrix V, it is easy to verify that they are orthogonal
and that the sum of squares of the coefficients sums to one, i.e., V′V = I (with I as the identity
matrix):
$$V'V = \begin{bmatrix} 0.6464 & 0.7630 \\ -0.7630 & 0.6464 \end{bmatrix} \begin{bmatrix} 0.6464 & -0.7630 \\ 0.7630 & 0.6464 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
In addition, it is easily verified that VV′ = I as well. This means that the transpose of V is
also its inverse (per the definition of an inverse matrix, i.e., a matrix for which the product
with the original matrix yields the identity matrix), or V⁻¹ = V′.
Eigenvectors and eigenvalues are central in many statistical analyses, but it is important
to realize they are not as complicated as they may seem at first sight. On the other hand,
computing them efficiently is complicated, and best left to specialized programs.
Finally, a couple of useful properties of eigenvalues are worth mentioning.
The sum of the eigenvalues equals the trace of the matrix. The trace is the sum of the
diagonal elements. For the matrix A in the example, the trace is 1 + 2 = 3. The sum of the
two eigenvalues is 4.541 − 1.541 = 3.
In addition, the product of the eigenvalues equals the determinant of the matrix. For a
2 × 2 matrix with rows (a, b) and (c, d), the determinant is ad − bc, or the product of the
diagonal elements minus the product of the off-diagonal elements. In the example, that is
(1 × 2) − (3 × 3) = −7. The product of the two eigenvalues is 4.541 × −1.541 = −7.0.
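As a check on these properties, and as an illustration of leaving the actual computation to specialized routines, the eigenvalues and eigenvectors of the example matrix can be obtained with NumPy. This is only a sketch; as discussed below, the signs of the computed eigenvectors may differ from those listed in the text.

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 2.0]])

# eigenvalues (ascending) and eigenvectors as columns; eigh is used since A is symmetric
eigenvalues, V = np.linalg.eigh(A)
print(eigenvalues)                    # approximately [-1.541, 4.541]

# verify Av = lambda v for the largest eigenvalue
v = V[:, 1]
print(A @ v, eigenvalues[1] * v)      # the two vectors coincide

# sum of eigenvalues equals the trace; product equals the determinant
print(eigenvalues.sum(), np.trace(A))         # 3.0 and 3.0
print(eigenvalues.prod(), np.linalg.det(A))   # both approximately -7.0
```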
With the eigenvectors stacked as columns in the matrix V and the matching eigenvalues on the
diagonal of a matrix G, the full set of eigenvalue equations can be written compactly as:
AV = VG.
Note that V goes first in the matrix multiplication on the right-hand side to ensure that
each column of V is multiplied by the corresponding eigenvalue on the diagonal of G to yield
λv. Taking advantage of the fact that the eigenvectors are orthogonal, namely that V′V = I,
post-multiplying each side of the equation by V′ yields AVV′ = VGV′, or
A = VGV′.
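The decomposition can be verified numerically in the same way. In the sketch below, V holds the eigenvectors as columns and G is the diagonal matrix of eigenvalues, so that VGV′ reproduces A; this is a small check, not GeoDa functionality.

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 2.0]])

eigenvalues, V = np.linalg.eigh(A)
G = np.diag(eigenvalues)

# AV = VG, and, since the eigenvectors are orthonormal, A = V G V'
print(np.allclose(A @ V, V @ G))     # True
print(np.allclose(A, V @ G @ V.T))   # True
```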
X = UDV′,
where the diagonal elements of D (the singular values) are the square roots of the eigenvalues
of the correlation matrix. Or, equivalently, the diagonal elements of the matrix D are the
square roots of the eigenvalues of X′X. This property can be exploited to derive the principal
components of the matrix X.
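The connection between the singular values of X and the eigenvalues of X′X can be illustrated on a small standardized data matrix. The data in the sketch are randomly generated stand-ins; in the chapter itself, the standardized bank variables play this role.

```python
import numpy as np

rng = np.random.default_rng(12345)
X = rng.normal(size=(200, 4))                 # synthetic stand-in data
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each column

# singular value decomposition X = U D V'
U, d, Vt = np.linalg.svd(X, full_matrices=False)

# eigenvalues of the cross-product matrix X'X, sorted in decreasing order
eigenvalues = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]

# the squared singular values equal the eigenvalues of X'X
print(np.allclose(d ** 2, eigenvalues))       # True
```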
$$z_u = a_1 x_1 + a_2 x_2 + \cdots + a_k x_k$$
The mathematical problem is to find the coefficients a_h such that the new variables maximize
the explained variance of the original variables. In addition, to avoid an indeterminate
solution, the coefficients are scaled such that the sum of their squares equals 1.
A full mathematical treatment of the derivation of the optimal solution to this problem
is beyond the current scope (for details, see, e.g., Lee and Verleysen, 2007, Chapter 2).
Nevertheless, obtaining a basic intuition for the mathematical principles involved is useful.
The coefficients by which the original variables need to be multiplied to obtain each principal
component can be shown to correspond to the elements of the eigenvectors of X′X, with the
associated eigenvalue giving the explained variance. Even though the original data matrix X
is typically not square (of dimension n × k), the cross-product matrix X′X is of dimension
k × k, so it is square and symmetric. As a result, all the eigenvalues are real numbers, which
avoids having to deal with complex numbers.

2 The standardization should not be done mechanically, since there are instances where the variance
differences between the variables are actually meaningful, e.g., when the scales on which they are measured
have a strong substantive meaning (e.g., in psychology).
Operationally, the principal component coefficients are obtained by means of a matrix
decomposition. One option is to compute the spectral decomposition of the k × k matrix
X′X, i.e., of the correlation matrix. As shown in Section 2.2.2.1, this yields:
X′X = VGV′,
with the eigenvectors of X′X as the columns of V. The principal components then follow as XV.
A second and computationally preferred way to approach this is as a singular value decom-
position (SVD) of the n × k matrix X, i.e., the matrix of (standardized) observations. From
Section 2.2.2.2, this follows as
X = UDV′,
where again V′ is the transpose of the k × k matrix V, which contains the eigenvectors
of X′X as columns, and D is a k × k diagonal matrix, containing the square root of the
eigenvalues of X′X on the diagonal.3 Note that the number of eigenvalues used in the
spectral decomposition and in SVD is the same, and equals k, the column dimension of X.
Since V′V = I, the following result obtains when both sides of the SVD decomposition are
post-multiplied by V:
XV = UDV′V = UD.
In other words, the principal components XV can also be obtained as the product of the
orthonormal matrix U with a diagonal matrix containing the square root of the eigenvalues,
D. This result is important in the context of multidimensional scaling, considered in
Chapter 3.
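The equivalence of the two routes to the component scores, XV from the spectral decomposition and UD from the SVD, can be checked directly. The sketch again uses synthetic standardized data, and the components are aligned for possible sign flips before the comparison (see the discussion of signs below).

```python
import numpy as np

rng = np.random.default_rng(12345)
X = rng.normal(size=(200, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# route 1: eigenvectors of X'X, principal components as XV
eigenvalues, V = np.linalg.eigh(X.T @ X)
V = V[:, np.argsort(eigenvalues)[::-1]]       # order by decreasing eigenvalue
scores_xv = X @ V

# route 2: SVD of X, principal components as UD
U, d, Vt = np.linalg.svd(X, full_matrices=False)
scores_ud = U @ np.diag(d)

# identical up to an arbitrary sign per component
signs = np.sign((scores_xv * scores_ud).sum(axis=0))
print(np.allclose(scores_xv, scores_ud * signs))   # True
```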
It turns out that the SVD approach is the solution to viewing the principal components
explicitly as a dimension reduction problem, originally considered by Karl Pearson. The
observed vector on the k variables x can be expressed as a function of a number of unknown
latent variables z, such that there is a linear relationship between them:
x = Az.
Instead of maximizing explained variance, the objective is now to find A and z such that
the so-called reconstruction error is minimized.4
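A small numerical illustration of the reconstruction-error perspective: taking A as the first p eigenvectors of X′X (so that A′A = I), the latent scores are z = A′x, the reconstruction is AA′x and the average squared error shrinks as p grows. The data are synthetic and only meant to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(12345)
X = rng.normal(size=(200, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# eigenvectors of X'X, ordered by decreasing eigenvalue
eigenvalues, V = np.linalg.eigh(X.T @ X)
V = V[:, np.argsort(eigenvalues)[::-1]]

for p in range(1, 7):
    A = V[:, :p]                  # k x p coefficient matrix with A'A = I
    Z = X @ A                     # latent scores z = A'x for every observation
    X_hat = Z @ A.T               # reconstruction A A' x
    error = ((X - X_hat) ** 2).sum() / X.shape[0]
    print(p, round(error, 3))     # the reconstruction error decreases as p increases
```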
Importantly, different computational approaches to obtain the eigenvalues and eigenvectors
(there is no analytical solution) may yield opposite signs for the elements of the eigenvectors.
However, the eigenvalues will be the same. The sign of the eigenvectors will affect the sign
of the resulting component, i.e., positives become negatives. For example, this can be the
difference between results based on a spectral decomposition versus SVD.
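The sign indeterminacy is easy to reproduce by computing the same component with both routines. In the sketch, the correlation between the two versions of the second component comes out as either +1 or −1 depending on the signs returned; the data are synthetic, but the pattern is exactly what Figure 2.10 shows for PC2 and PC2e.

```python
import numpy as np

rng = np.random.default_rng(12345)
X = rng.normal(size=(200, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# second principal component from the SVD of X
U, d, Vt = np.linalg.svd(X, full_matrices=False)
pc2_svd = U[:, 1] * d[1]

# second principal component from the eigenvectors of X'X
eigenvalues, V = np.linalg.eigh(X.T @ X)
V = V[:, np.argsort(eigenvalues)[::-1]]
pc2_eig = X @ V[:, 1]

# eigenvalues agree, but the component may be flipped: correlation is +1 or -1
print(np.corrcoef(pc2_svd, pc2_eig)[0, 1])
```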
In a principal component analysis, the interest typically focuses on three main results. First,
the principal component scores are used as a replacement for the original variables. This
is particularly relevant when a small number of components explain a substantial share of
the original variance. Second, the relative contribution of each of the original variables to
each principal component is of interest. Finally, the variance proportion explained by each
component in and of itself is also important.
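These three sets of results, the component scores, the variable loadings and the explained variance shares, correspond to standard outputs of any PCA routine. A sketch with scikit-learn follows, using synthetic data in place of the ten community-bank variables (the GeoDa dialogs themselves are covered in the next section).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(12345)
data = rng.normal(size=(200, 10))               # stand-in for the ten variables

X = StandardScaler().fit_transform(data)        # standardized variables, as in the text
pca = PCA().fit(X)

scores = pca.transform(X)                       # 1. principal component scores
loadings = pca.components_                      # 2. variable contributions, one row per component
variance_share = pca.explained_variance_ratio_  # 3. proportion of variance explained

print(scores.shape, loadings.shape)
print(variance_share.round(3))
```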
2.3.1 Implementation
Principal components are invoked from the drop-down list created by the toolbar Clusters
icon (Figure 2.1) as the top item (more precisely, the first item in the dimension reduction
category). Alternatively, from the main menu, Clusters > PCA gets the process started.
The illustration uses ten variables that characterize the efficiency of community banks, based
on the observations for 2013 from the Italy Community Bank sample data set (see Algeri
et al., 2022):
• CAPRAT: ratio of capital over risk weighted assets
• Z: z score of return on assets (ROA) + leverage over the standard deviation of ROA
• LIQASS: ratio of liquid assets over total assets
• NPL: ratio of non performing loans over total loans
• LLP: ratio of loan loss provision over customer loans
• INTR: ratio of interest expense over total funds
• DEPO: ratio of total deposits over total assets
• EQLN: ratio of total equity over customer loans
4 The concept of reconstruction error is somewhat technical. If A were a square matrix, one could solve
for z as z = A⁻¹x, where A⁻¹ is the inverse of the matrix A. However, due to the dimension reduction, A is
not square, so something called a pseudo-inverse or Moore-Penrose inverse must be used. This is the p × k
matrix (A′A)⁻¹A′, such that z = (A′A)⁻¹A′x. Furthermore, because A′A = I, this simplifies to z = A′x
(of course, so far the elements of A are unknown). Since x = Az, if A were known, x could be found as Az,
or, as AA′x. The reconstruction error is then the squared difference between x and AA′x. The objective is
to find the coefficients for A that minimize this expression. For an extensive technical discussion, see Lee
and Verleysen (2007), Chapter 2.
2.3.2 Interpretation
The panel with summary results (Figure 2.5) provides several statistics pertaining to the
variance decomposition, the eigenvalues, the variable loadings and the contribution of each
of the original variables to the respective components.
When the original variables are all standardized, each eigenvector coefficient gives a measure
of the relative contribution of a variable to the component in question.
The squared correlation between each principal component and each of the original variables
provides a way to interpret how the multiple dimensions along the ten original variables are
summarized into the main principal components.
Since the correlations are squared, they do not depend on the sign of the eigenvector elements,
unlike the loadings.
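The squared correlations can be computed directly from the component scores and the standardized variables. The sketch below again relies on synthetic data, but the calculation is identical for the ten bank variables.

```python
import numpy as np

rng = np.random.default_rng(12345)
X = rng.normal(size=(200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# component scores via the SVD route (U times the singular values)
U, d, Vt = np.linalg.svd(X, full_matrices=False)
scores = U * d

# squared correlation of each component (rows) with each original variable (columns)
k = X.shape[1]
sq_corr = np.array([[np.corrcoef(scores[:, c], X[:, j])[0, 1] ** 2
                     for j in range(k)]
                    for c in range(k)])

# squaring removes the sign, so the table is unaffected by eigenvector sign flips
print(sq_corr.round(3))
```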
For example, the density cluster methods from Chapter 20 in Volume 1 could be employed
to identify clusters among the points in the PCA scatter plot. To achieve this, geographical
coordinates would be replaced by coordinates along the two principal component dimensions.
This provides an alternative perspective on multivariate local clusters.
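A sketch of that idea: treat the first two component scores as if they were coordinates and apply a density-based method such as DBSCAN to them. The eps and min_samples values below are arbitrary placeholders, and the data are synthetic; in practice the parameters would be tuned as discussed in Volume 1.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(12345)
data = rng.normal(size=(200, 10))             # stand-in for the original variables

X = StandardScaler().fit_transform(data)
pc = PCA(n_components=2).fit_transform(X)     # PC1 and PC2 take the place of coordinates

labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(pc)
print(np.unique(labels, return_counts=True))  # label -1 marks low-density noise points
```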
To illustrate the effect of the choice of eigenvalue computation, Figure 2.10 shows a scatter
plot of the second principal component using the SVD method (PC2) and the Eigen method
(PC2e). The sign change is reflected in the perfect negative correlation in the scatter plot.
Alternative approaches that incorporate spatial aspects into principal components explicitly are based
on global spatial autocorrelation, e.g., Jombart et al. (2008), Frichot et al. (2012).
Figure 2.13: Principal Component Local Geary Cluster Map and PCP
Figure 2.14: Principal Component and Multivariate Local Geary Cluster Map
43 in the multivariate map. Interestingly, the number of spatial outliers is almost identical,
with two of them identified for the same locations on the island of Sicily, highlighted by the
blue rectangle.
There is also close correspondence between several cluster locations. For example, the High-
High cluster in the north-east Trentino region and the Low-Low cluster in the region of
Marche are shared by both maps (highlighted within a green rectangle). While these maps
may give similar impressions, it should be noted that in the multivariate Local Geary each
variable receives the same weight, whereas the principal component is based on different
contributions by each variable.
These findings again suggest that in some instances, a local spatial autocorrelation analysis
for one or a few dominant principal components may provide a viable alternative to a
full-fledged multivariate analysis. This constitutes a spatial aspect of principal components
analysis that is absent in standard treatments of the method.