Multivariate Statistical Methods: A Primer
Third Edition

Bryan F.J. Manly


Western EcoSystems Technology, Inc.
Laramie, Wyoming, USA

CHAPMAN & HALL/CRC


A CRC Press Company
Boca Raton London New York Washington, D.C.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2004 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works


Version Date: 20140513

International Standard Book Number-13: 978-1-4822-8598-7 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
A journey of a thousand miles begins with a single step

Lao Tsu
Contents
Chapter 1 The material of multivariate analysis
1.1 Examples of multivariate data
1.2 Preview of multivariate methods
1.3 The multivariate normal distribution
1.4 Computer programs
1.5 Graphical methods
1.6 Chapter summary
References

Chapter 2 Matrix algebra
2.1 The need for matrix algebra
2.2 Matrices and vectors
2.3 Operations on matrices
2.4 Matrix inversion
2.5 Quadratic forms
2.6 Eigenvalues and eigenvectors
2.7 Vectors of means and covariance matrices
2.8 Further reading
2.9 Chapter summary
References

Chapter 3 Displaying multivariate data
3.1 The problem of displaying many variables in two dimensions
3.2 Plotting index variables
3.3 The draftsman’s plot
3.4 The representation of individual data points
3.5 Profiles of variables
3.6 Discussion and further reading
3.7 Chapter summary
References

Chapter 4 Tests of significance with multivariate data
4.1 Simultaneous tests on several variables
4.2 Comparison of mean values for two samples: the single variable case
4.3 Comparison of mean values for two samples: the multivariate case
4.4 Multivariate versus univariate tests
4.5 Comparison of variation for two samples: the single-variable case
4.6 Comparison of variation for two samples: the multivariate case
4.7 Comparison of means for several samples
4.8 Comparison of variation for several samples
4.9 Computer programs
4.10 Chapter summary
Exercise
References

Chapter 5 Measuring and testing multivariate distances
5.1 Multivariate distances
5.2 Distances between individual observations
5.3 Distances between populations and samples
5.4 Distances based on proportions
5.5 Presence–absence data
5.6 The Mantel randomization test
5.7 Computer programs
5.8 Discussion and further reading
5.9 Chapter summary
Exercise
References

Chapter 6 Principal components analysis
6.1 Definition of principal components
6.2 Procedure for a principal components analysis
6.3 Computer programs
6.4 Further reading
6.5 Chapter summary
Exercises
References

Chapter 7 Factor analysis
7.1 The factor analysis model
7.2 Procedure for a factor analysis
7.3 Principal components factor analysis
7.4 Using a factor analysis program to do principal components analysis
7.5 Options in analyses
7.6 The value of factor analysis
7.7 Computer programs
7.8 Discussion and further reading
7.9 Chapter summary
Exercise
References

Chapter 8 Discriminant function analysis
8.1 The problem of separating groups
8.2 Discrimination using Mahalanobis distances
8.3 Canonical discriminant functions
8.4 Tests of significance
8.5 Assumptions
8.6 Allowing for prior probabilities of group membership
8.7 Stepwise discriminant function analysis
8.8 Jackknife classification of individuals
8.9 Assigning of ungrouped individuals to groups
8.10 Logistic regression
8.11 Computer programs
8.12 Discussion and further reading
8.13 Chapter summary
Exercises
References

Chapter 9 Cluster analysis
9.1 Uses of cluster analysis
9.2 Types of cluster analysis
9.3 Hierarchic methods
9.4 Problems of cluster analysis
9.5 Measures of distance
9.6 Principal components analysis with cluster analysis
9.7 Computer programs
9.8 Discussion and further reading
9.9 Chapter summary
Exercises
References

Chapter 10 Canonical correlation analysis
10.1 Generalizing a multiple regression analysis
10.2 Procedure for a canonical correlation analysis
10.3 Tests of significance
10.4 Interpreting canonical variates
10.5 Computer programs
10.6 Further reading
10.7 Chapter summary
Exercise
References

Chapter 11 Multidimensional scaling
11.1 Constructing a map from a distance matrix
11.2 Procedure for multidimensional scaling
11.3 Computer programs
11.4 Further reading
11.5 Chapter summary
Exercise
References

Chapter 12 Ordination
12.1 The ordination problem
12.2 Principal components analysis
12.3 Principal coordinates analysis
12.4 Multidimensional scaling
12.5 Correspondence analysis
12.6 Comparison of ordination methods
12.7 Computer programs
12.8 Further reading
12.9 Chapter summary
Exercise
References

Chapter 13 Epilogue
13.1 The next step
13.2 Some general reminders
13.3 Missing values
References

Appendix Computer packages for multivariate analyses
References

Author Index

Subject Index


Preface
The purpose of this book is to introduce multivariate statistical methods to
non-mathematicians. It is not intended to be a comprehensive textbook.
Rather, the intention is to keep the details to a minimum while serving as a
practical guide that illustrates the possibilities of multivariate statistical
analysis. In other words, it is a book to “get you going” in a particular area
of statistical methods.
It is assumed that readers have a working knowledge of elementary
statistics, including tests of significance using normal-, t-, chi-squared, and
F-distributions; analysis of variance; and linear regression. The material
covered in a typical first-year university course in statistics should be quite
adequate in this respect. Some facility with algebra is also required to follow
the equations in certain parts of the text. Understanding the theory of
multivariate methods requires some matrix algebra. However, the amount
needed is not great if some details are accepted on faith. Matrix algebra is
summarized in Chapter 2, and anyone who masters this chapter will have a
reasonable competency in this area.
One of the reasons why multivariate methods are being used so often
these days is the ready availability of computer packages to do the
calculations. Indeed, access to suitable computer software is essential if the methods
are to be used. However, the details of the use of computer packages are not
stressed in this book because there are so many of these packages available.
It would be impossible to discuss them all, and it would be too restrictive
to concentrate on one or two of them. The approach taken here is to mention
which package was used for a particular example when this is appropriate.
In addition, the Appendix gives information about some of the packages in
terms of what analyses are available and how easy the programs are to use
for someone who is relatively inexperienced at carrying out multivariate
analyses.
To some extent, the chapters can be read independently of each other.
The first five are preliminary reading, focusing mainly on general aspects of
multivariate data rather than specific techniques. Chapter 1 introduces data
for several examples that are used to illustrate the application of analytical
methods throughout the book. Chapter 2 covers matrix algebra, and Chapter
3 discusses various graphical techniques. Chapter 4 discusses tests of
significance, and Chapter 5 addresses the measurement of relative “distances”
between objects based on variables measured on those objects. These
chapters should be reviewed before Chapters 6 to 12, which cover the most
important multivariate procedures in current use. The final Chapter 13
contains some general comments about the analysis of multivariate data.
The chapters in this third edition of the book are the same as those in
the second edition. The changes that have been made for the new edition
are the updating of references, some new examples, some examples carried
out using newer computer software, and changes in the text to reflect new
ideas about multivariate analyses.
In making changes, I have continually kept in mind the original intention
of the book, which was that it should be as short as possible and attempt to
do no more than take readers to the stage where they can begin to use
multivariate methods in an intelligent manner.
I am indebted to many people for commenting on the text of the three
editions of the book and for pointing out various errors. Particularly, I thank
Earl Bardsley, John Harraway, and Liliana Gonzalez for their help in this
respect. Any errors that remain are my responsibility alone.
I would like to express my appreciation to the Department of
Mathematics and Statistics at the University of Otago in New Zealand for hosting
me as a visitor twice in 2003, first in May and June, and later in November
and December. The excellent university library was particularly important
for my final updating of references.
In conclusion, I wish to thank the staff of Chapman and Hall and of CRC
for their work over the years in promoting the book and encouraging me to
produce the second and third editions.

Bryan F.J. Manly


Laramie, Wyoming
Chapter one: The material of multivariate analysis

1.1 Examples of multivariate data
The statistical methods that are described in elementary texts are mostly
univariate methods because they are only concerned with analyzing
variation in a single random variable. On the other hand, the whole point of a
multivariate analysis is to consider several related variables simultaneously,
with each one being considered to be equally important, at least initially.
The potential value of this more general approach can be seen by considering
a few examples.

Example 1.1 Storm survival of sparrows


After a severe storm on 1 February 1898, a number of moribund sparrows
were taken to Hermon Bumpus’ biological laboratory at Brown University
in Rhode Island. Subsequently about half of the birds died, and Bumpus
saw this as an opportunity to see whether he could find any support for
Charles Darwin’s theory of natural selection. To this end, he made eight
morphological measurements on each bird, and also weighed them. The
results for five of the measurements are shown in Table 1.1, for females only.
From the data that he obtained, Bumpus (1898) concluded that “the birds
which perished, perished not through accident, but because they were
physically disqualified, and that the birds which survived, survived because they
possessed certain physical characters.” Specifically, he found that the
survivors “are shorter and weigh less … have longer wing bones, longer legs,
longer sternums and greater brain capacity” than the nonsurvivors. He also
concluded that “the process of selective elimination is most severe with
extremely variable individuals, no matter in which direction the variation
may occur. It is quite as dangerous to be above a certain standard of organic
excellence as it is to be conspicuously below the standard.” This was saying
that stabilizing selection occurred, so that individuals with measurements


Table 1.1 Body Measurements of Female Sparrows


Bird X1 (mm) X2 (mm) X3 (mm) X4 (mm) X5 (mm)
1 156 245 31.6 18.5 20.5
2 154 240 30.4 17.9 19.6
3 153 240 31.0 18.4 20.6
4 153 236 30.9 17.7 20.2
5 155 243 31.5 18.6 20.3
6 163 247 32.0 19.0 20.9
7 157 238 30.9 18.4 20.2
8 155 239 32.8 18.6 21.2
9 164 248 32.7 19.1 21.1
10 158 238 31.0 18.8 22.0
11 158 240 31.3 18.6 22.0
12 160 244 31.1 18.6 20.5
13 161 246 32.3 19.3 21.8
14 157 245 32.0 19.1 20.0
15 157 235 31.5 18.1 19.8
16 156 237 30.9 18.0 20.3
17 158 244 31.4 18.5 21.6
18 153 238 30.5 18.2 20.9
19 155 236 30.3 18.5 20.1
20 163 246 32.5 18.6 21.9
21 159 236 31.5 18.0 21.5
22 155 240 31.4 18.0 20.7
23 156 240 31.5 18.2 20.6
24 160 242 32.6 18.8 21.7
25 152 232 30.3 17.2 19.8
26 160 250 31.7 18.8 22.5
27 155 237 31.0 18.5 20.0
28 157 245 32.2 19.5 21.4
29 165 245 33.1 19.8 22.7
30 153 231 30.1 17.3 19.8
31 162 239 30.3 18.0 23.1
32 162 243 31.6 18.8 21.3
33 159 245 31.8 18.5 21.7
34 159 247 30.9 18.1 19.0
35 155 243 30.9 18.5 21.3
36 162 252 31.9 19.1 22.2
37 152 230 30.4 17.3 18.6
38 159 242 30.8 18.2 20.5
39 155 238 31.2 17.9 19.3
40 163 249 33.4 19.5 22.8
41 163 242 31.0 18.1 20.7
42 156 237 31.7 18.2 20.3
43 159 238 31.5 18.4 20.3
44 161 245 32.1 19.1 20.8
45 155 235 30.7 17.7 19.6
46 162 247 31.9 19.1 20.4
47 153 237 30.6 18.6 20.4
48 162 245 32.5 18.5 21.1
49 164 248 32.3 18.8 20.9

Note: X1 = total length, X2 = alar extent, X3 = length of beak and head, X4 = length of
humerus, and X5 = length of keel of sternum. Birds 1 to 21 survived, and birds 22 to
49 died. The data source is Bumpus (1898), who measured in inches and millimeters.
Source: Adapted from Bumpus, H.C. (1898), Biological Lectures, 11th Lecture, Marine Biology
Laboratory, Woods Hole, MA, pp. 209–226.

close to the average survived better than individuals with measurements far
from the average.
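
As a rough illustration of the stabilizing-selection idea, the sketch below compares the sample variance of total length (X1 in Table 1.1) for the birds that survived and the birds that died. It is only a minimal check on one variable, using Python's standard `statistics` module; the variable names are chosen here for illustration and no test of significance is applied.

```python
from statistics import mean, variance

# Total length X1 (mm) copied from Table 1.1.
# Birds 1 to 21 survived; birds 22 to 49 died.
survivors = [156, 154, 153, 153, 155, 163, 157, 155, 164, 158, 158,
             160, 161, 157, 157, 156, 158, 153, 155, 163, 159]
nonsurvivors = [155, 156, 160, 152, 160, 155, 157, 165, 153, 162,
                162, 159, 159, 155, 162, 152, 159, 155, 163, 163,
                156, 159, 161, 155, 162, 153, 162, 164]

# Sample means and sample variances (divisor n - 1) for each group.
survivor_mean = mean(survivors)
nonsurvivor_mean = mean(nonsurvivors)
survivor_var = variance(survivors)
nonsurvivor_var = variance(nonsurvivors)

print(f"survivors:    n={len(survivors)}, mean={survivor_mean:.2f}, variance={survivor_var:.2f}")
print(f"nonsurvivors: n={len(nonsurvivors)}, mean={nonsurvivor_mean:.2f}, variance={nonsurvivor_var:.2f}")
```

On these values the nonsurvivors show the larger sample variance, which is at least consistent with Bumpus' description, although a single variable and an informal comparison cannot settle the question.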
In fact, the development of multivariate statistical methods had hardly
begun in 1898 when Bumpus was writing. The correlation coefficient as a
measure of the relationship between two variables was devised by Francis
Galton in 1877. However, it was another 56 years before Harold Hotelling
described a practical method for carrying out a principal components
analysis, which is one of the simplest multivariate analyses that can usefully be
applied to Bumpus’ data. Bumpus did not even calculate standard
deviations. Nevertheless, his methods of analysis were sensible. Many authors
have reanalyzed his data and, in general, have confirmed his conclusions.
Taking the data as an example for illustrating multivariate methods,
several interesting questions arise. In particular:

1. How are the various measurements related? For example, does a
large value for one of the variables tend to occur with large values
for the other variables?
2. Do the survivors and nonsurvivors have statistically significant
differences for their mean values of the variables?
3. Do the survivors and nonsurvivors show similar amounts of
variation for the variables?
4. If the survivors and nonsurvivors do differ in terms of the
distributions of the variables, then is it possible to construct some function
of these variables that separates the two groups? It would then be
convenient if large values of the function tended to occur with the
survivors, as the function would then apparently be an index of the
Darwinian fitness of the sparrows.
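
The first two questions can be explored informally with a few lines of code. The sketch below is one possible starting point, not the analysis used in this book: it reads the rows of Table 1.1, computes the Pearson correlation between every pair of variables (question 1), and lists the difference between the survivor and nonsurvivor means for each variable (question 2). The helper `pearson` and all variable names are assumptions made for the illustration, and no significance test is carried out.

```python
from math import sqrt
from statistics import mean

# Table 1.1: bird number, then X1 to X5 (mm). Birds 1 to 21 survived.
TABLE_1_1 = """
1 156 245 31.6 18.5 20.5
2 154 240 30.4 17.9 19.6
3 153 240 31.0 18.4 20.6
4 153 236 30.9 17.7 20.2
5 155 243 31.5 18.6 20.3
6 163 247 32.0 19.0 20.9
7 157 238 30.9 18.4 20.2
8 155 239 32.8 18.6 21.2
9 164 248 32.7 19.1 21.1
10 158 238 31.0 18.8 22.0
11 158 240 31.3 18.6 22.0
12 160 244 31.1 18.6 20.5
13 161 246 32.3 19.3 21.8
14 157 245 32.0 19.1 20.0
15 157 235 31.5 18.1 19.8
16 156 237 30.9 18.0 20.3
17 158 244 31.4 18.5 21.6
18 153 238 30.5 18.2 20.9
19 155 236 30.3 18.5 20.1
20 163 246 32.5 18.6 21.9
21 159 236 31.5 18.0 21.5
22 155 240 31.4 18.0 20.7
23 156 240 31.5 18.2 20.6
24 160 242 32.6 18.8 21.7
25 152 232 30.3 17.2 19.8
26 160 250 31.7 18.8 22.5
27 155 237 31.0 18.5 20.0
28 157 245 32.2 19.5 21.4
29 165 245 33.1 19.8 22.7
30 153 231 30.1 17.3 19.8
31 162 239 30.3 18.0 23.1
32 162 243 31.6 18.8 21.3
33 159 245 31.8 18.5 21.7
34 159 247 30.9 18.1 19.0
35 155 243 30.9 18.5 21.3
36 162 252 31.9 19.1 22.2
37 152 230 30.4 17.3 18.6
38 159 242 30.8 18.2 20.5
39 155 238 31.2 17.9 19.3
40 163 249 33.4 19.5 22.8
41 163 242 31.0 18.1 20.7
42 156 237 31.7 18.2 20.3
43 159 238 31.5 18.4 20.3
44 161 245 32.1 19.1 20.8
45 155 235 30.7 17.7 19.6
46 162 247 31.9 19.1 20.4
47 153 237 30.6 18.6 20.4
48 162 245 32.5 18.5 21.1
49 164 248 32.3 18.8 20.9
"""

NAMES = ["X1", "X2", "X3", "X4", "X5"]

# Drop the bird number and keep the five measurements for each bird.
rows = [[float(v) for v in line.split()[1:]]
        for line in TABLE_1_1.strip().splitlines()]
columns = list(zip(*rows))  # columns[j] holds one variable for all 49 birds


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Question 1: correlations between all pairs of variables.
corr = [[pearson(ci, cj) for cj in columns] for ci in columns]

# Question 2: mean of each variable for survivors and nonsurvivors.
survivor_means = [mean(col[:21]) for col in columns]
nonsurvivor_means = [mean(col[21:]) for col in columns]
mean_differences = [s - n for s, n in zip(survivor_means, nonsurvivor_means)]

for name, row in zip(NAMES, corr):
    print(name, " ".join(f"{r:6.3f}" for r in row))
for name, d in zip(NAMES, mean_differences):
    print(f"{name}: survivor mean minus nonsurvivor mean = {d:.3f}")
```

Looking at each variable separately in this way ignores the relationships between the variables, which is the limitation that multivariate methods are intended to overcome.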

Example 1.2 Egyptian skulls


For a second example, consider the data shown in Table 1.2 for measurements
made on male skulls from the area of Thebes in Egypt. There are five samples
of 30 skulls each from the early predynastic period (circa 4000 B.C.), the late
Table 1.2 Measurements on Male Egyptian Skulls (mm)
Early Predynastic | Late Predynastic | 12th and 13th Dynasties | Ptolemaic Period | Roman Period
Skull X1 X2 X3 X4 X1 X2 X3 X4 X1 X2 X3 X4 X1 X2 X3 X4 X1 X2 X3 X4
1 131 138 89 49 124 138 101 48 137 141 96 52 137 134 107 54 137 123 91 50
2 125 131 92 48 133 134 97 48 129 133 93 47 141 128 95 53 136 131 95 49
3 131 132 99 50 138 134 98 45 132 138 87 48 141 130 87 49 128 126 91 57
4 119 132 96 44 148 129 104 51 130 134 106 50 135 131 99 51 130 134 92 52
5 136 143 100 54 126 124 95 45 134 134 96 45 133 120 91 46 138 127 86 47
6 138 137 89 56 135 136 98 52 140 133 98 50 131 135 90 50 126 138 101 52
7 139 130 108 48 132 145 100 54 138 138 95 47 140 137 94 60 136 138 97 58
8 125 136 93 48 133 130 102 48 136 145 99 55 139 130 90 48 126 126 92 45
9 131 134 102 51 131 134 96 50 136 131 92 46 140 134 90 51 132 132 99 55
10 134 134 99 51 133 125 94 46 126 136 95 56 138 140 100 52 139 135 92 54
17 135 137 103 50 138 129 107 53 129 135 92 50 131 141 99 55 138 125 99 51
18 132 133 93 53 123 131 101 51 134 125 90 60 129 135 95 47 137 135 96 54
19 139 136 96 50 130 129 105 47 138 134 96 51 136 128 93 54 133 125 92 50
20 132 131 101 49 134 130 93 54 136 135 94 53 131 125 88 48 145 129 89 47
21 126 133 102 51 137 136 106 49 132 130 91 52 139 130 94 53 138 136 92 46
22 135 135 103 47 126 131 100 48 133 131 100 50 144 124 86 50 131 129 97 44

23 134 124 93 53 135 136 97 52 138 137 94 51 141 131 97 53 143 126 88 54
24 128 134 103 50 129 126 91 50 130 127 99 45 130 131 98 53 134 124 91 55
25 130 130 104 49 134 139 101 49 136 133 91 49 133 128 92 51 132 127 97 52
26 138 135 100 55 131 134 90 53 134 123 95 52 138 126 97 54 137 125 85 57
27 128 132 93 53 132 130 104 50 136 137 101 54 131 142 95 53 129 128 81 52
28 127 129 106 48 130 132 93 52 133 131 96 49 136 138 94 55 140 135 103 48
29 131 136 114 54 135 132 98 54 138 133 100 55 132 136 92 52 147 129 87 48
30 124 138 101 46 130 128 101 51 138 133 91 46 135 130 100 51 136 133 97 51

Note: X1 = maximum breadth, X2 = basibregmatic height, X3 = basialveolar length, X4 = nasal height.


Source: From Thomson, A. and Randall-Maciver, R. (1905), Ancient Races of the Thebaid, Oxford University Press, Oxford, U.K.

Figure 1.1 Four measurements made on Egyptian male skulls (diagram labels: x1, x2, x3, x4).

predynastic period (circa 3300 B.C.), the 12th and 13th dynasties (circa
1850 B.C.), the Ptolemaic period (circa 200 B.C.), and the Roman period
(circa A.D. 150). Four measurements are available on each skull, as illustrated
in Figure 1.1.
For this example, some interesting questions are:

1. How are the four measurements related?


2. Are there statistically significant differences in the sample means for
the variables, and if so, do these differences reflect gradual changes
over time in the shape and size of skulls?
3. Are there significant differences in the sample standard deviations
for the variables, and, if so, do these differences reflect gradual chang-
es over time in the amount of variation?
4. Is it possible to construct a function of the four variables that, in some
sense, describes the changes over time?

These questions are, of course, rather similar to the ones suggested for
Example 1.1.
It will be seen later that there are differences between the five samples
that can be explained partly as time trends. It must be said, however, that
the reasons for the apparent changes are unknown. Migration of other races
into the region may well have been the most important factor.

Example 1.3 Distribution of a butterfly


A study of 16 colonies of the butterfly Euphydryas editha in California and
Oregon produced the data shown in Table 1.3. Here there are four environ-
mental variables (altitude, annual precipitation, and the minimum and max-
imum temperatures) and six genetic variables (percentage frequencies for
different Pgi genes as determined by the technique of electrophoresis). For
the purposes of this example, there is no need to go into the details of how
the gene frequencies were determined, and, strictly speaking, they are not
exactly gene frequencies anyway. It is sufficient to say that the frequencies
Table 1.3 Environmental Variables and Phosphoglucose-Isomerase (Pgi) Gene Frequencies for Colonies of the Butterfly Euphydryas editha in California and Oregon (a)
Columns: Colony; Altitude (ft); Annual Precipitation (in.); Maximum Temperature (°F); Minimum Temperature (°F); Frequencies of Pgi Mobility Genes (%) (b) for mobilities 0.4, 0.6, 0.8, 1, 1.16, and 1.3
Colony Altitude Precip. Max. Min. 0.4 0.6 0.8 1 1.16 1.3
SS 500 43 98 17 0 3 22 57 17 1
SB 808 20 92 32 0 16 20 38 13 13
WSB 570 28 98 26 0 6 28 46 17 3
JRC 550 28 98 26 0 4 19 47 27 3
JRH 550 28 98 26 0 1 8 50 35 6
SJ 380 15 99 28 0 2 19 44 32 3
CR 930 21 99 28 0 0 15 50 27 8
UO 650 10 101 27 10 21 40 25 4 0
LO 600 10 101 27 14 26 32 28 0 0
DP 1,500 19 99 23 0 1 6 80 12 1
PZ 1,750 22 101 27 1 4 34 33 22 6
MC 2,000 58 100 18 0 7 14 66 13 0
IF 2,500 34 102 16 0 9 15 47 21 8
AF 2,000 21 105 20 3 7 17 32 27 14
GH 7,850 42 84 5 0 5 7 84 4 0
GL 10,500 50 81 –12 0 3 1 92 4 0
a The data source was McKechnie et al. (1975), with the environmental variables rounded to integers for simplicity. The original data

Figure 1.2 Colonies of Euphydryas editha in California and Oregon (map of the colony locations; SS is labeled as being in Oregon; scale bar 0 to 100 miles).

describe the genetic distribution of the butterfly to some extent. Figure 1.2
shows the geographical locations of the colonies.
In this example, questions that can be asked include:

1. Are the Pgi frequencies similar for colonies that are close in space?
2. To what extent, if any, are the Pgi frequencies related to the environ-
mental variables?

These are important questions in trying to decide how the Pgi frequen-
cies are determined. If the genetic composition of the colonies was largely
determined by past and present migration, then gene frequencies will tend
to be similar for colonies that are close in space, although they may show
little relationship with the environmental variables. On the other hand, if it
is the environment that is most important, then this should show up in
relationships between the gene frequencies and the environmental variables
(assuming that the right variables have been measured), but close colonies
will only have similar gene frequencies if they have similar environments.
Obviously, colonies that are close in space will usually have similar environ-
ments, so it may be difficult to reach a clear conclusion on this matter.

Table 1.4 Mean Mandible Measurements for Seven Canine Groups


X1 X2 X3 X4 X5 X6
Group (mm) (mm) (mm) (mm) (mm) (mm)
Modern dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1
Prehistoric dog 10.3 22.1 19.1 8.1 32.2 35.0

Note: X1 = breadth of mandible; X2 = height of mandible below the first molar; X3 = length of
the first molar; X4 = breadth of the first molar; X5 = length from first to third molar,
inclusive; and X6 = length from first to fourth premolar, inclusive.
Source: Adapted from Higham, C.F.W. et al. (1980), J. Archaeological Sci., 7, 149–165.

Example 1.4 Prehistoric dogs from Thailand


Excavations of prehistoric sites in northeast Thailand have produced a col-
lection of canid (dog) bones covering a period from about 3500 B.C. to the
present. However, the origin of the prehistoric dog is not certain. It could
descend from the golden jackal (Canis aureus) or the wolf, but the wolf is not
native to Thailand. The nearest indigenous sources are western China (Canis
lupus chanco) or the Indian subcontinent (Canis lupus pallipes).
In order to try to clarify the ancestors of the prehistoric dogs, mandible
(lower jaw) measurements were made on the available specimens. These
were then compared with the same measurements made on the golden
jackal, the Chinese wolf, and the Indian wolf. The comparisons were also
extended to include the dingo, which may have its origins in India; the cuon
(Cuon alpinus), which is indigenous to southeast Asia; and modern village
dogs from Thailand.
Table 1.4 gives mean values for the six mandible measurements for
specimens from all seven groups. The main question here is what the mea-
surements suggest about the relationships between the groups and, in par-
ticular, how the prehistoric dog seems to relate to the other groups.

Example 1.5 Employment in European countries


Finally, as a contrast to the previous biological examples, consider the data
in Table 1.5. This shows the percentages of the labor force in nine different
types of industry for 30 European countries. In this case, multivariate meth-
ods may be useful in isolating groups of countries with similar employment
patterns, and in generally aiding the understanding of the relationships
between the countries. Differences between countries that are related to
political grouping (EU, the European Union; EFTA, the European Free Trade
Association; the Eastern European countries; and other countries) may be of
particular interest.
Table 1.5 Percentages of the Workforce Employed in Nine Different Industry Groups in 30 Countries in Europe
Country Group AGR MIN MAN PS CON SER FIN SPS TC
Belgium EU 2.6 0.2 20.8 0.8 6.3 16.9 8.7 36.9 6.8
Denmark EU 5.6 0.1 20.4 0.7 6.4 14.5 9.1 36.3 7.0
France EU 5.1 0.3 20.2 0.9 7.1 16.7 10.2 33.1 6.4
Germany EU 3.2 0.7 24.8 1.0 9.4 17.2 9.6 28.4 5.6
Greece EU 22.2 0.5 19.2 1.0 6.8 18.2 5.3 19.8 6.9
Ireland EU 13.8 0.6 19.8 1.2 7.1 17.8 8.4 25.5 5.8
Italy EU 8.4 1.1 21.9 0.0 9.1 21.6 4.6 28.0 5.3
Luxembourg EU 3.3 0.1 19.6 0.7 9.9 21.2 8.7 29.6 6.8
Netherlands EU 4.2 0.1 19.2 0.7 0.6 18.5 11.5 38.3 6.8
Portugal EU 11.5 0.5 23.6 0.7 8.2 19.8 6.3 24.6 4.8
Spain EU 9.9 0.5 21.1 0.6 9.5 20.1 5.9 26.7 5.8
U.K. EU 2.2 0.7 21.3 1.2 7.0 20.2 12.4 28.4 6.5
Albania Eastern 55.5 19.4 0.0 0.0 3.4 3.3 15.3 0.0 3.0
Bulgaria Eastern 19.0 0.0 35.0 0.0 6.7 9.4 1.5 20.9 7.5
Czech/Slovak Republics Eastern 12.8 37.3 0.0 0.0 8.4 10.2 1.6 22.9 6.9
Hungary Eastern 15.3 28.9 0.0 0.0 6.4 13.3 0.0 27.3 8.8
Poland Eastern 23.6 3.9 24.1 0.9 6.3 10.3 1.3 24.5 5.2
Romania Eastern 22.0 2.6 37.9 2.0 5.8 6.9 0.6 15.3 6.8

USSR (former) Eastern 18.5 0.0 28.8 0.0 10.2 7.9 0.6 25.6 8.4
Yugoslavia (former) Eastern 5.0 2.2 38.7 2.2 8.1 13.8 3.1 19.1 7.8
Cyprus Other 13.5 0.3 19.0 0.5 9.1 23.7 6.7 21.2 6.0
Gibraltar Other 0.0 0.0 6.8 2.0 16.9 24.5 10.8 34.0 5.0
Malta Other 2.6 0.6 27.9 1.5 4.6 10.2 3.9 41.6 7.2
Turkey Other 44.8 0.9 15.3 0.2 5.2 12.4 2.4 14.5 4.4

Note: AGR, agriculture, forestry, and fishing; MIN, mining and quarrying; MAN, manufacturing; PS, power and water supplies; CON,
construction; SER, services; FIN, finance; SPS, social and personal services; TC, transport and communications. The data for the
individual countries are for various years from 1989 to 1995. Data from Euromonitor (1995), except for Germany and the U.K., where
more reasonable values were obtained from the United Nations Statistical Yearbook (2000).
Source: Adapted from Euromonitor (1995), European Marketing Data and Statistics, Euromonitor Publications, London; and from United
Nations (2000), Statistical Yearbook, 44th Issue, U.N. Department of Social Affairs, New York.

1.2 Preview of multivariate methods


The five examples just considered are typical of the raw material for multi-
variate statistical methods. In all cases, there are several variables of interest,
and these are clearly not independent of each other. At this point, it is useful
to give a brief preview of what is to come in the chapters that follow in
relationship to these examples.
Principal components analysis is designed to reduce the number of vari-
ables that need to be considered to a small number of indices (called the
principal components) that are linear combinations of the original variables.
For example, much of the variation in the body measurements of sparrows
(X1 to X5) shown in Table 1.1 will be related to the general size of the birds,
and the total

I1 = X1 + X2 + X3 + X4 + X5

should measure this aspect of the data quite well. This accounts for one
dimension of the data. Another index is

I2 = X1 + X2 + X3 – X4 – X5

which is a contrast between the first three measurements and the last two.
This reflects another dimension of the data. Principal components analysis
provides an objective way of finding indices of this type so that the variation
in the data can be accounted for as concisely as possible. It may well turn
out that two or three principal components provide a good summary of all
the original variables. Consideration of the values of the principal compo-
nents instead of the values of the original variables may then make it much
easier to understand what the data have to say. In short, principal compo-
nents analysis is a means of simplifying data by reducing the number of
variables.
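
As a rough illustration of how such indices are found, the following sketch applies principal components analysis to the seven group means of Table 1.4 rather than to the sparrow data, because Table 1.1 is not reproduced here. It assumes that NumPy is available, the variable names are illustrative only, and treating seven group means as a sample is done purely for demonstration.

```python
import numpy as np

# Mean mandible measurements X1..X6 (mm) for the seven groups in Table 1.4.
X = np.array([
    [9.7, 21.0, 19.4, 7.7, 32.0, 36.5],    # modern dog
    [8.1, 16.7, 18.3, 7.0, 30.3, 32.9],    # golden jackal
    [13.5, 27.3, 26.8, 10.6, 41.9, 48.1],  # Chinese wolf
    [11.5, 24.3, 24.5, 9.3, 40.0, 44.6],   # Indian wolf
    [10.7, 23.5, 21.4, 8.5, 28.8, 37.6],   # cuon
    [9.6, 22.6, 21.1, 8.3, 34.4, 43.1],    # dingo
    [10.3, 22.1, 19.1, 8.1, 32.2, 35.0],   # prehistoric dog
])

# Standardize each variable so that all six carry equal weight.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# The components come from the eigenvalues and eigenvectors
# of the correlation matrix of the variables.
R = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)

# eigh returns ascending order; put the largest component first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each column of `scores` is one index: a linear combination
# of the standardized variables.
scores = Z @ eigenvectors
explained = eigenvalues / eigenvalues.sum()

print("proportion of variation per component:", np.round(explained, 3))
print("coefficients of the first component:", np.round(eigenvectors[:, 0], 3))
```

If the first one or two proportions are close to one in total, the corresponding columns of `scores` summarize most of the variation in the six original variables.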
Factor analysis also attempts to account for the variation in a number of
original variables using a smaller number of index variables or factors. It is
assumed that each original variable can be expressed as a linear combination
of these factors, plus a residual term that reflects the extent to which the
variable is independent of the other variables. For example, a two-factor
model for the sparrow data assumes that

X1 = a11F1 + a12F2 + e1
X2 = a21F1 + a22F2 + e2
X3 = a31F1 + a32F2 + e3
X4 = a41F1 + a42F2 + e4
X5 = a51F1 + a52F2 + e5

where the aij values are constants, F1 and F2 are the factors, and ei represents
the variation in Xi that is independent of the variation in the other X variables.
Here F1 might be the factor of size. In that case, the coefficients a11, a21, a31,
a41, and a51 would all be positive, reflecting the fact that some birds tend to
be large and some birds tend to be small on all body measurements. The
second factor F2 might then measure an aspect of the shape of birds, with
some positive coefficients and some negative coefficients. If this two-factor
model fitted the data well, then it would provide a relatively straightforward
description of the relationship between the five body measurements being
considered.
One type of factor analysis starts by taking the first few principal com-
ponents as the factors in the data being considered. These initial factors are
then modified by a special transformation process called factor rotation in
order to make them easier to interpret. Other methods for finding initial
factors are also used. A rotation to simpler factors is almost always done.
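
To make the two-factor model more concrete, the sketch below uses made-up coefficients aij and residual variances (none of these numbers come from the sparrow data) and computes the covariance matrix that the model implies. It relies on the usual additional assumptions, which are not stated above, that F1 and F2 are uncorrelated with unit variances and that the residual terms are uncorrelated with the factors and with each other.

```python
# Hypothetical coefficients (a_i1, a_i2) for the five variables X1..X5.
# F1 acts as a size factor (all positive); F2 is a shape contrast.
loadings = [
    (0.9, 0.2),
    (0.8, 0.3),
    (0.7, 0.1),
    (0.8, -0.4),
    (0.7, -0.5),
]
# Hypothetical variances of the residual terms e1..e5.
residual_var = [0.15, 0.27, 0.50, 0.20, 0.26]

p = len(loadings)

# Under the stated assumptions, cov(Xi, Xj) = a_i1*a_j1 + a_i2*a_j2,
# with the residual variance added when i == j.
implied_cov = [
    [
        sum(a * b for a, b in zip(loadings[i], loadings[j]))
        + (residual_var[i] if i == j else 0.0)
        for j in range(p)
    ]
    for i in range(p)
]

for row in implied_cov:
    print(["%.2f" % value for value in row])
```

Fitting a factor model works in the opposite direction: the coefficients and residual variances are chosen so that the implied matrix is close to the one observed in the data.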
Discriminant function analysis is concerned with the problem of seeing
whether it is possible to separate different groups on the basis of the available
measurements. This could be used, for example, to see how well surviving
and nonsurviving sparrows can be separated using their body measurements
(Example 1.1), or how skulls from different epochs can be separated, again
using size measurements (Example 1.2). Like principal components analysis,
discriminant function analysis is based on the idea of finding suitable linear
combinations of the original variables to achieve the intended aim.
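
One standard way of finding such a linear combination for two groups is Fisher's linear discriminant function, sketched below for the first ten early predynastic and the first ten Roman skulls of Table 1.2. The use of this subset, the choice of Fisher's method, and the assumption that both groups share one covariance matrix are for illustration only and are not taken from the analysis described later in the book. NumPy is assumed to be available.

```python
import numpy as np

# First ten skulls of two samples from Table 1.2 (X1..X4 in mm).
early = np.array([
    [131, 138, 89, 49], [125, 131, 92, 48], [131, 132, 99, 50],
    [119, 132, 96, 44], [136, 143, 100, 54], [138, 137, 89, 56],
    [139, 130, 108, 48], [125, 136, 93, 48], [131, 134, 102, 51],
    [134, 134, 99, 51],
], dtype=float)
roman = np.array([
    [137, 123, 91, 50], [136, 131, 95, 49], [128, 126, 91, 57],
    [130, 134, 92, 52], [138, 127, 86, 47], [126, 138, 101, 52],
    [136, 138, 97, 58], [126, 126, 92, 45], [132, 132, 99, 55],
    [139, 135, 92, 54],
], dtype=float)

mean_early = early.mean(axis=0)
mean_roman = roman.mean(axis=0)

# Pooled within-group covariance matrix (equal covariances assumed).
n1, n2 = len(early), len(roman)
pooled = ((n1 - 1) * np.cov(early, rowvar=False)
          + (n2 - 1) * np.cov(roman, rowvar=False)) / (n1 + n2 - 2)

# Fisher's coefficients: the linear combination that separates the two
# mean vectors as much as possible relative to within-group variation.
coefficients = np.linalg.solve(pooled, mean_roman - mean_early)

scores_early = early @ coefficients
scores_roman = roman @ coefficients
cutoff = (scores_early.mean() + scores_roman.mean()) / 2

print("coefficients:", np.round(coefficients, 4))
print("early skulls scored above cutoff:", int((scores_early > cutoff).sum()))
print("Roman skulls scored above cutoff:", int((scores_roman > cutoff).sum()))
```

The counts printed at the end give a crude impression of how well the two samples are separated; with only ten skulls per group they should not be read as a reliable error rate.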
Cluster analysis is concerned with the identification of groups of similar
objects. There is not much point in doing this type of analysis with data like
those of Examples 1.1 and 1.2, as the groups (survivors/nonsurvivors and
epochs) are already known. However, in Example 1.3 there might be some
interest in grouping colonies on the basis of environmental variables or Pgi
frequencies, while in Example 1.4 the main point of interest is in the similarity
between prehistoric Thai dogs and other animals. Likewise, in Example 1.5
the European countries can possibly be grouped in terms of their similarity
in employment patterns.
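
Most clustering procedures start from a measure of how far apart each pair of objects is. The sketch below computes one such measure, the Euclidean distance between standardized values, for the seven groups of Table 1.4 and then finds the group closest to the prehistoric dog. The choice of distance measure and of standardization is an assumption made for illustration; other choices are possible and can give different results.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

# Mean mandible measurements X1..X6 (mm) from Table 1.4.
groups = {
    "Modern dog": [9.7, 21.0, 19.4, 7.7, 32.0, 36.5],
    "Golden jackal": [8.1, 16.7, 18.3, 7.0, 30.3, 32.9],
    "Chinese wolf": [13.5, 27.3, 26.8, 10.6, 41.9, 48.1],
    "Indian wolf": [11.5, 24.3, 24.5, 9.3, 40.0, 44.6],
    "Cuon": [10.7, 23.5, 21.4, 8.5, 28.8, 37.6],
    "Dingo": [9.6, 22.6, 21.1, 8.3, 34.4, 43.1],
    "Prehistoric dog": [10.3, 22.1, 19.1, 8.1, 32.2, 35.0],
}

names = list(groups)
columns = list(zip(*groups.values()))
means = [mean(col) for col in columns]
sds = [stdev(col) for col in columns]

# Standardize so that no single measurement dominates the distances.
standardized = {
    name: [(x - m) / s for x, m, s in zip(values, means, sds)]
    for name, values in groups.items()
}

def distance(a, b):
    """Euclidean distance between two groups on the standardized scale."""
    return sqrt(sum((u - v) ** 2
                    for u, v in zip(standardized[a], standardized[b])))

# One distance for every pair of groups.
distances = {pair: distance(*pair) for pair in combinations(names, 2)}

# Group with the smallest distance from the prehistoric dog.
target = "Prehistoric dog"
nearest = min((n for n in names if n != target),
              key=lambda n: distance(target, n))

print(len(distances), "pairwise distances")
print("closest to", target + ":", nearest)
```

A clustering algorithm would go on to merge the closest groups step by step; only the distance calculation is shown here.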
With canonical correlation, the variables (not the objects) are divided into
two groups, and interest centers on the relationship between these. Thus in
Example 1.3, the first four variables are related to the environment, while
the remaining six variables reflect the genetic distribution at the different
colonies of Euphydryas editha. Finding what relationships, if any, exist
between these two groups of variables is of considerable biological interest.
Multidimensional scaling begins with data on some measure of the dis-
tances apart of a number of objects. From these distances, a map is then
constructed showing how the objects are related. This is a useful facility, as
it is often possible to measure how far apart pairs of objects are without
having any idea of how the objects are related in a geometric sense. Thus in
Example 1.4, there are ways of measuring the distances between modern
dogs and golden jackals, modern dogs and Chinese wolves, etc. Considering
each pair of animal groups gives 21 distances altogether, and from these
distances multidimensional scaling can be used to produce a type of map of
the relationships between the groups. With a one-dimensional map, the
groups are placed along a straight line. With a two-dimensional map, they
are represented by points on a plane. With a three-dimensional map, they
are represented by points within a cube. Four-dimensional and higher solu-
tions are also possible, although these have limited use because they cannot
be visualized in a simple way. The value of a one-, two-, or three-dimensional
map is clear for Example 1.4, as such a map would immediately show which
groups prehistoric dogs are most similar to. Hence multidimensional scaling
may be a useful alternative to cluster analysis in this case. A map of European
countries based on their employment patterns might also be of interest in
Example 1.5.
Principal components analysis and multidimensional scaling are some-
times referred to as methods for ordination. That is to say, they are methods
for producing axes against which a set of objects of interest can be plotted.
Other methods of ordination are also available.
Principal coordinates analysis is like a type of principal components analy-
sis that starts off with information on the extent to which the pairs of objects
are different in a set of objects, instead of the values for measurements on
the objects. As such, it is intended to do the same as multidimensional
scaling. However, the assumptions made and the numerical methods used
are not the same.
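
A minimal sketch of principal coordinates analysis follows, using distances between the seven groups of Table 1.4. The steps shown (squaring the distances, double centering, and taking eigenvectors scaled by the square roots of the eigenvalues) are the usual textbook recipe rather than anything specified above, and the standardized Euclidean distance is an assumed choice. NumPy is assumed to be available.

```python
import numpy as np

# Mean mandible measurements X1..X6 (mm) from Table 1.4, in table order.
X = np.array([
    [9.7, 21.0, 19.4, 7.7, 32.0, 36.5],
    [8.1, 16.7, 18.3, 7.0, 30.3, 32.9],
    [13.5, 27.3, 26.8, 10.6, 41.9, 48.1],
    [11.5, 24.3, 24.5, 9.3, 40.0, 44.6],
    [10.7, 23.5, 21.4, 8.5, 28.8, 37.6],
    [9.6, 22.6, 21.1, 8.3, 34.4, 43.1],
    [10.3, 22.1, 19.1, 8.1, 32.2, 35.0],
])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Matrix of Euclidean distances between every pair of groups.
diff = Z[:, None, :] - Z[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=2))

# Double-center the squared distances.
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J

# Eigenvectors of B, scaled by the square roots of their eigenvalues,
# give coordinates for the objects; keep the two largest for a map.
eigenvalues, eigenvectors = np.linalg.eigh(B)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 2
coords = eigenvectors[:, :k] * np.sqrt(eigenvalues[:k])
print(np.round(coords, 2))
```

Plotting the two columns of `coords` against each other gives a two-dimensional map in which groups with similar measurements lie close together.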
Correspondence analysis starts with data on the abundance of each of
several characteristics for each of a set of objects. This is useful in ecology,
for example, where the objects of interest are often different sites, the char-
acteristics are different species, and the data consist of abundances of the
species in samples taken from the sites. The purpose of correspondence
analysis would then be to clarify the relationships between the sites as
expressed by species distributions, and the relationships between the species
as expressed by site distributions.

1.3 The multivariate normal distribution


The normal distribution for a single variable should be familiar to readers
of this book. It has the well-known bell-shaped frequency curve, and many
standard univariate statistical methods are based on the assumption that
data are normally distributed.
Knowing the prominence of the normal distribution with univariate
statistical methods, it will come as no surprise to discover that the multi-
variate normal distribution has a central position with multivariate statistical
methods. Many of these methods require the assumption that the data being
analyzed have multivariate normal distributions.
The exact definition of a multivariate normal distribution is not too
important. The approach of most people, for better or worse, seems to be to
regard data as being normally distributed unless there is some reason to
believe that this is not true. In particular, if all the individual variables being
studied appear to be normally distributed, then it is assumed that the joint
distribution is multivariate normal. This is, in fact, a minimum requirement
because the definition of multivariate normality requires more than this.
Cases do arise where the assumption of multivariate normality is clearly
invalid. For example, one or more of the variables being studied may have
a highly skewed distribution with several very high (or low) values; there
may be many repeated values; etc. This type of problem can sometimes be
overcome by an appropriate transformation of the data, as discussed in
elementary texts on statistics. If this does not work, then a rather special
form of analysis may be required.
One important aspect of a multivariate normal distribution is that it is
specified completely by a mean vector and a covariance matrix. The defini-
tions of a mean vector and a covariance matrix are given in Section 2.7.
Basically, the mean vector contains the mean values for all of the variables
being considered, while the covariance matrix contains the variances for all
of the variables plus the covariances, which measure the extent to which all
pairs of variables are related.
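
The following sketch illustrates the point that a mean vector and a covariance matrix are all that is needed to specify the distribution. It draws a large sample from a bivariate normal distribution with a hypothetical mean vector and covariance matrix (the numbers are invented) and checks that the sample values are close to the ones specified. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical mean vector and covariance matrix for two variables.
mean_vector = np.array([10.0, 20.0])
covariance = np.array([[4.0, 3.0],
                       [3.0, 9.0]])

# A fixed seed makes the illustration repeatable.
rng = np.random.default_rng(seed=1)
sample = rng.multivariate_normal(mean_vector, covariance, size=50_000)

# Estimates computed from the sample.
sample_mean = sample.mean(axis=0)
sample_cov = np.cov(sample, rowvar=False)

print("sample mean vector:", np.round(sample_mean, 2))
print("sample covariance matrix:")
print(np.round(sample_cov, 2))
```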

1.4 Computer programs


Practical methods for carrying out the calculations for multivariate analyses
have been developed over about the last 70 years. However, the application
of these methods for more than a small number of variables had to wait
until computers became available. Therefore, it is only in the last 30 years
or so that the methods have become reasonably easy to carry out for the
average researcher.
Nowadays there are many standard statistical packages and computer
programs available for calculations on computers of all types. It is intended
that this book should provide readers with enough information to use any
of these packages and programs intelligently, without saying much about
any particular one. However, where it is appropriate, the software used to
analyze example data will be mentioned.

1.5 Graphical methods


One of the outcomes of the greatly improved computer facilities in recent
times has been an increase in the variety of graphical methods that are
available for multivariate data. This includes contour plots and three-dimen-
sional surface plots for functions of two variables, and a variety of special
methods for showing the values that individual cases have for three or more
variables. These methods are being used more commonly as part of the
analysis of multivariate data, and they are therefore discussed at some length
in Chapter 3.

1.6 Chapter summary


• Five data sets are introduced, and these will be used for examples
throughout the remainder of the book. These data sets concern (1)
five body measurements on female sparrows that did or did not
survive a severe storm; (2) four measurements on skulls of Egyptian
males living at five different periods in the past; (3) four measure-
ments describing the environment and six measurements describing
the genetic characteristics of 16 colonies of a butterfly in California
and Oregon; (4) average values for six mandible measurements for
seven canine groups, including prehistoric dogs from Thailand; and
(5) percentages of people employed in nine different industry groups
for 30 countries in Europe.
• Several important multivariate methods are briefly described in re-
lationship to how they might be used with the data sets. These
methods are principal components analysis, factor analysis, discrim-
inant function analysis, cluster analysis, canonical correlation, mul-
tidimensional scaling, principal coordinates analysis, and correspon-
dence analysis.
• The importance of the multivariate normal distribution is mentioned.
• The use of statistical packages is discussed, and it is noted that the
individual packages used for example analyses will be mentioned
where this is appropriate.
• The importance of graphical methods is noted.

References
Bumpus, H.C. (1898), The elimination of the unfit as illustrated by the introduced
sparrow, Passer domesticus, Biological Lectures, 11th Lecture, Marine Biology
Laboratory, Woods Hole, MA, pp. 209–226.
Euromonitor (1995), European Marketing Data and Statistics, Euromonitor Publications,
London.
Higham, C.F.W., Kijngam, A., and Manly, B.F.J. (1980), An analysis of prehistoric
canid remains from Thailand, J. Archaeological Sci., 7, 149–165.
McKechnie, S.W., Ehrlich, P.R., and White, R.R. (1975), Population genetics of
     Euphydryas butterflies, I: genetic variation and the neutrality hypothesis,
     Genetics, 81, 571–594.
Thomson, A. and Randall-Maciver, R. (1905), Ancient Races of the Thebaid, Oxford
University Press, Oxford, U.K.
United Nations (2000), Statistical Yearbook, 44th Issue, U.N. Department of Social
Affairs, New York.
chapter two

Matrix algebra

2.1 The need for matrix algebra
The theory of multivariate statistical methods can be explained reasonably
well only with the use of some matrix algebra. For this reason it is helpful,
if not essential, to have at least some knowledge of this area of mathematics.
This is true even for those who are interested in using the methods only as
tools. At first sight, the notation of matrix algebra is somewhat daunting.
However, it is not difficult to understand the basic principles, providing that
some of the details are accepted on faith.

2.2 Matrices and vectors


An m × n matrix is an array of numbers with m rows and n columns,
considered as a single entity, of the form:

\[
\mathbf{A} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

If m = n, then it is a square matrix. If there is only one column, such as

\[
\mathbf{c} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix}
\]

then this is called a column vector. If there is only one row, such as

\[
\mathbf{r} = (r_1, r_2, \ldots, r_n)
\]

then this is called a row vector. Bold type is used to indicate matrices and
vectors.
The transpose of a matrix is obtained by interchanging the rows and the
columns. Thus the transpose of the matrix A above is

\[
\mathbf{A}' = \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots &        & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}
\]

Also, the transpose of the vector c is c′ = (c1, c2, ..., cm), and the transpose of
the row vector r is the column vector r′.
There are a number of special kinds of matrices that are important. A
zero matrix has all elements equal to zero, so that it is of the form

\[
\mathbf{0} = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}
\]

A diagonal matrix has zero elements except down the main diagonal, so that
it takes the form

\[
\mathbf{D} = \begin{bmatrix}
d_1 & 0 & \cdots & 0 \\
0 & d_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & d_n
\end{bmatrix}
\]

A symmetric matrix is a square matrix that is unchanged when it is transposed,
so that A′ = A. Finally, an identity matrix is a diagonal matrix with all of the
diagonal terms equal to one, so that

\[
\mathbf{I} = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\]

Two matrices are equal only if they are the same size and all of their
elements are equal. For example

\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
=
\begin{bmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23} \\
b_{31} & b_{32} & b_{33}
\end{bmatrix}
\]

only if a11 = b11, a12 = b12, a13 = b13, and so on.

The trace of a matrix is the sum of the diagonal terms, which is only
defined for a square matrix. For example, the 3 × 3 matrix A with
the elements aij shown above has trace(A) = a11 + a22 + a33.
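
The definitions of this section translate directly into a few lines of code. The sketch below stores a matrix as a list of rows in plain Python; the function names and the example matrices are illustrative only.

```python
def transpose(matrix):
    """Interchange the rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def is_symmetric(matrix):
    """A matrix is symmetric when it is equal to its own transpose."""
    return matrix == transpose(matrix)


def trace(matrix):
    """Sum of the diagonal terms; only defined for a square matrix."""
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("trace is only defined for a square matrix")
    return sum(matrix[i][i] for i in range(len(matrix)))


A = [[1, 2, 3],
     [4, 5, 6]]   # a 2 x 3 matrix
S = [[2, 7],
     [7, 5]]      # a symmetric 2 x 2 matrix

print(transpose(A))               # the 3 x 2 transpose of A
print(is_symmetric(S), trace(S))
```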

2.3 Operations on matrices


The ordinary arithmetic processes of addition, subtraction, multiplication,
and division have their counterparts with matrices. With addition and sub-
traction, it is just a matter of working element by element with two matrices
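of the same size. For example, if A and B are both of size 3 × 2, then

\[
\mathbf{A} + \mathbf{B} =
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
+
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
=
\begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} \\
a_{21} + b_{21} & a_{22} + b_{22} \\
a_{31} + b_{31} & a_{32} + b_{32}
\end{bmatrix}
\]

while

\[
\mathbf{A} - \mathbf{B} =
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
-
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
=
\begin{bmatrix}
a_{11} - b_{11} & a_{12} - b_{12} \\
a_{21} - b_{21} & a_{22} - b_{22} \\
a_{31} - b_{31} & a_{32} - b_{32}
\end{bmatrix}
\]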

Clearly, these operations only apply with two matrices of the same size.
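
A short sketch of element-by-element addition and subtraction is given below, with matrices stored as lists of rows. The function names and the numbers are invented for illustration; the size check mirrors the requirement that both matrices have the same size.

```python
def same_size(a, b):
    """True when both matrices have the same number of rows and columns."""
    return len(a) == len(b) and all(len(r) == len(s) for r, s in zip(a, b))


def add(a, b):
    """Element-by-element sum of two matrices of the same size."""
    if not same_size(a, b):
        raise ValueError("matrices must be the same size")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def subtract(a, b):
    """Element-by-element difference of two matrices of the same size."""
    if not same_size(a, b):
        raise ValueError("matrices must be the same size")
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


# Two hypothetical 3 x 2 matrices.
A = [[1, 2], [3, 4], [5, 6]]
B = [[6, 5], [4, 3], [2, 1]]

print(add(A, B))
print(subtract(A, B))
```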
In matrix algebra, an ordinary number such as 20 is called a scalar.
Multiplication of a matrix A by a scalar k is then defined to be the multiplication
of every element in A by k. Thus if A is the 3 × 2 matrix as shown above, then

\[
k\mathbf{A} = \begin{bmatrix}
k a_{11} & k a_{12} \\
k a_{21} & k a_{22} \\
k a_{31} & k a_{32}
\end{bmatrix}
\]

The multiplication of two matrices, denoted by A.B or A × B, is more
complicated. To begin with, A.B is defined only if the number of columns
of A is equal to the number of rows of B. Assume that this is the case, with
A having the size m × n and B having the size n × p. Then multiplication is
defined to produce the result:

\[
\begin{aligned}
\mathbf{A}.\mathbf{B} &=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
.
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots &        & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{np}
\end{bmatrix} \\
&=
\begin{bmatrix}
\sum a_{1j} b_{j1} & \sum a_{1j} b_{j2} & \cdots & \sum a_{1j} b_{jp} \\
\sum a_{2j} b_{j1} & \sum a_{2j} b_{j2} & \cdots & \sum a_{2j} b_{jp} \\
\vdots & \vdots & & \vdots \\
\sum a_{mj} b_{j1} & \sum a_{mj} b_{j2} & \cdots & \sum a_{mj} b_{jp}
\end{bmatrix}
\end{aligned}
\]

where the summations are for j running from 1 to n. Hence the element in
the ith row and kth column of A.B is

\[
\sum_{j=1}^{n} a_{ij} b_{jk} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}
\]

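When A and B are both square matrices of the same size, then A.B and B.A are both
defined. However, they are not generally equal. For example,

\[
\begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix}
.
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
=
\begin{bmatrix}
2 \times 1 - 1 \times 0 & 2 \times 1 - 1 \times 1 \\
1 \times 1 + 1 \times 0 & 1 \times 1 + 1 \times 1
\end{bmatrix}
=
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\]

whereas

\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
.
\begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix}
=
\begin{bmatrix}
1 \times 2 + 1 \times 1 & -1 \times 1 + 1 \times 1 \\
0 \times 2 + 1 \times 1 & -1 \times 0 + 1 \times 1
\end{bmatrix}
=
\begin{bmatrix} 3 & 0 \\ 1 & 1 \end{bmatrix}
\]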
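
The rule for the element in the ith row and kth column can be written as a short function. The sketch below, in plain Python with matrices stored as lists of rows, reproduces the two products of the example above; the function name is illustrative.

```python
def multiply(a, b):
    """Matrix product A.B for matrices stored as lists of rows."""
    n = len(b)  # number of rows of B
    if any(len(row) != n for row in a):
        raise ValueError("columns of A must equal rows of B")
    p = len(b[0])  # number of columns of B
    # Element (i, k) is the sum over j of a[i][j] * b[j][k].
    return [[sum(row[j] * b[j][k] for j in range(n)) for k in range(p)]
            for row in a]


A = [[2, -1],
     [1, 1]]
B = [[1, 1],
     [0, 1]]

print(multiply(A, B))  # A.B
print(multiply(B, A))  # B.A, which differs from A.B
```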

2.4 Matrix inversion


Matrix inversion is analogous to the ordinary arithmetic process of division.
For a nonzero scalar k, it is of course true that k × k⁻¹ = 1. In a similar way, if A is a
square matrix and

\[
\mathbf{A} \times \mathbf{A}^{-1} = \mathbf{I}
\]

where I is the identity matrix, then the matrix A⁻¹ is the inverse of the matrix
A. Inverses exist only for square matrices, but not all square matrices
have inverses. If an inverse does exist, then it is both a left inverse, so that
A⁻¹ × A = I, and a right inverse, so that A × A⁻¹ = I.
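An example of an inverse matrix is

\[
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}^{-1}
=
\begin{bmatrix} 2/3 & -1/3 \\ -1/3 & 2/3 \end{bmatrix}
\]

which can be verified by checking that

\[
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
.
\begin{bmatrix} 2/3 & -1/3 \\ -1/3 & 2/3 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]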

Actually, the inverse of a 2 × 2 matrix, if it exists, can be calculated fairly
easily. The equation is

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
=
\begin{bmatrix} d/\Delta & -b/\Delta \\ -c/\Delta & a/\Delta \end{bmatrix}
\]

where Δ = (a × d) − (b × c). Here the scalar Δ is called the determinant of the
matrix being inverted. Clearly, the inverse is not defined if Δ = 0 because
finding the elements of the inverse then involves a division by zero. For
3 × 3 and larger matrices, the calculation of the inverse is tedious and is best
done by using a computer program. Nowadays even spreadsheets include
a facility to compute an inverse.
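
The equation for the 2 × 2 case is simple enough to code directly. The sketch below uses exact fractions from the Python standard library so that the inverse of the example matrix comes out as 2/3 and −1/3 rather than rounded decimals; the function name is illustrative.

```python
from fractions import Fraction


def inverse_2x2(matrix):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] using its determinant."""
    (a, b), (c, d) = matrix
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("matrix is singular; no inverse exists")
    det = Fraction(determinant)
    return [[d / det, -b / det],
            [-c / det, a / det]]


A = [[2, 1],
     [1, 2]]
A_inv = inverse_2x2(A)
print(A_inv)
```

For larger matrices a library routine is the practical choice, as noted above.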
Any square matrix has a determinant, which can be calculated by a
generalization of the equation just given for the 2 × 2 case. If the determinant
is zero, then the inverse does not exist, and vice versa. A matrix with a zero
determinant is said to be singular.
Matrices sometimes arise for which the inverse is equal to the transpose.
They are then said to be orthogonal. Hence A is orthogonal if A⁻¹ = A′.
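
A familiar example of an orthogonal matrix, not taken from the text, is the matrix that rotates points in a plane through an angle θ. The sketch below checks numerically that multiplying its transpose by the matrix itself gives the identity matrix, which is what A⁻¹ = A′ requires. The angle chosen is arbitrary.

```python
from math import cos, sin, pi, isclose


def transpose(matrix):
    """Interchange rows and columns."""
    return [list(column) for column in zip(*matrix)]


def multiply(a, b):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(row[j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for row in a]


theta = pi / 6  # an arbitrary rotation angle
R = [[cos(theta), -sin(theta)],
     [sin(theta), cos(theta)]]

# If R'R equals the identity matrix, then R' is the inverse of R.
product = multiply(transpose(R), R)
is_orthogonal = all(
    isclose(product[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
determinant = R[0][0] * R[1][1] - R[0][1] * R[1][0]

print("orthogonal:", is_orthogonal)
print("determinant:", round(determinant, 12))
```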

2.5 Quadratic forms


Suppose that $\mathbf{A}$ is an n × n matrix and $\mathbf{x}$ is a column vector
of length n. Then the quantity

\[ Q = \mathbf{x}' \mathbf{A} \mathbf{x} \]

is a scalar that is called a quadratic form. This scalar can also be expressed as

\[ Q = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i a_{ij} x_j \]

where $x_i$ is the element in the ith row of $\mathbf{x}$ and $a_{ij}$ is the
element in the ith row and jth column of $\mathbf{A}$.
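
The equivalence of the matrix expression and the double sum can be checked
numerically. The sketch below is only an illustration: the function names and
the example values of the matrix and vector are assumptions chosen for
convenience.

```python
def quadratic_form_double_sum(A, x):
    """Q as the double sum of x_i * a_ij * x_j over all i and j."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def quadratic_form_matrix(A, x):
    """Q as x' (A x): form the vector A x, then its inner product with x."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(xi * axi for xi, axi in zip(x, Ax))


A = [[2, 1], [1, 2]]
x = [1, 3]

Q_sum = quadratic_form_double_sum(A, x)   # 2*1*1 + 1*1*3 + 1*3*1 + 2*3*3
Q_mat = quadratic_form_matrix(A, x)       # (1, 3) . (5, 7)
```

Both calculations give the same scalar, here Q = 26.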

2.6 Eigenvalues and eigenvectors


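Consider the set of linear equations

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= \lambda x_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= \lambda x_2 \\
&\;\;\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= \lambda x_n
\end{aligned}
\]

where $\lambda$ is a scalar. These can also be written in matrix form as

\[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x} \]

or

\[ (\mathbf{A} - \lambda \mathbf{I}) \mathbf{x} = \mathbf{0} \]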

where $\mathbf{I}$ is the n × n identity matrix, and $\mathbf{0}$ is an n × 1
vector of zeros. Then it can be shown that these equations can hold for a
nonzero vector $\mathbf{x}$ only for certain particular values of $\lambda$ that
are called the latent roots or eigenvalues of $\mathbf{A}$. There can be up to n
of these eigenvalues. Given the ith eigenvalue $\lambda_i$, the equations can
be solved by arbitrarily setting $x_1 = 1$ (assuming that this element is not
zero for the vector in question), and the resulting vector of x values
with transpose $\mathbf{x}' = (1, x_2, x_3, \ldots, x_n)$, or any multiple of
this vector, is called the ith latent vector or the ith eigenvector of the
matrix $\mathbf{A}$. Also, the sum of the eigenvalues is equal to the trace of
$\mathbf{A}$ defined above, so that

\[ \operatorname{trace}(\mathbf{A}) = \lambda_1 + \lambda_2 + \cdots + \lambda_n \]
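
For a 2 × 2 matrix the eigenvalues can be found by hand, which gives a simple
way to illustrate the definitions. The sketch below is not a general method: it
assumes a 2 × 2 matrix with real eigenvalues and a nonzero element $a_{12}$, it
uses the standard result that the eigenvalues are the roots of
$\lambda^2 - \operatorname{trace}(\mathbf{A}) \lambda + D = 0$ (where D is the
determinant), and the function name eigen_2x2 is hypothetical. Each eigenvector
is obtained by setting $x_1 = 1$ and solving the first equation for $x_2$.

```python
import math


def eigen_2x2(A):
    """Eigenvalues and eigenvectors (scaled so x1 = 1) of a 2 x 2 matrix."""
    (a11, a12), (a21, a22) = A
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace ** 2 - 4 * det          # discriminant of the quadratic
    if disc < 0:
        raise ValueError("the eigenvalues are not real for this matrix")
    if a12 == 0:
        raise ValueError("this sketch needs a12 != 0 to solve for x2")
    root = math.sqrt(disc)
    eigenvalues = [(trace + root) / 2, (trace - root) / 2]
    # First equation with x1 = 1: a11 + a12 * x2 = lambda, so solve for x2.
    eigenvectors = [[1.0, (lam - a11) / a12] for lam in eigenvalues]
    return eigenvalues, eigenvectors


A = [[2, 1], [1, 2]]
values, vectors = eigen_2x2(A)

# Check A x = lambda x for each pair, and that the eigenvalues sum to the trace.
residuals = [
    max(abs(sum(A[i][j] * x[j] for j in range(2)) - lam * x[i]) for i in range(2))
    for lam, x in zip(values, vectors)
]
trace_matches = sum(values) == A[0][0] + A[1][1]
```

For this matrix the eigenvalues are 3 and 1, with eigenvectors proportional to
(1, 1) and (1, -1), and the eigenvalues sum to the trace of 4.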

2.7 Vectors of means and covariance matrices


Population and sample values for a single random variable are often summarized
by the values for the mean and variance. Thus if a sample of size n yields the
values $x_1, x_2, \ldots, x_n$, then the sample mean is defined to be

\[ \bar{x} = (x_1 + x_2 + \cdots + x_n)/n = \sum_{i=1}^{n} x_i / n \]

while the sample variance is

\[ s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) \]

These are estimates of the corresponding population parameters, which are the
population mean $\mu$ and the population variance $\sigma^2$.
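
These two definitions translate directly into code. The following is a minimal
sketch with hypothetical function names and an invented five-value sample; note
the divisor n - 1 in the variance, as in the equation above.

```python
def sample_mean(values):
    """Sum of the values divided by the sample size n."""
    return sum(values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for a variance")
    mean = sample_mean(values)
    return sum((v - mean) ** 2 for v in values) / (n - 1)


sample = [1, 2, 3, 4, 5]
mean = sample_mean(sample)          # (1 + 2 + 3 + 4 + 5) / 5
variance = sample_variance(sample)  # (4 + 1 + 0 + 1 + 4) / 4
```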
In a similar way, multivariate populations and samples can be summarized by
mean vectors and covariance matrices. Suppose that there are p variables
$X_1, X_2, \ldots, X_p$ being considered, and that a sample of n values for each
of these variables is available. Let the sample mean and sample variance for
the ith variable be $\bar{x}_i$ and $s_i^2$, respectively, where these are
calculated using the equations given above. In addition, the sample covariance
between variables $X_j$ and $X_k$ is

\[ c_{jk} = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k) / (n - 1) \]

where $x_{ij}$ is the value of variable $X_j$ for the ith multivariate
observation. This covariance is then a measure of the extent to which there is a
linear relationship between $X_j$ and $X_k$, with a positive value indicating
that large values of $X_j$ and $X_k$ tend to occur together, and a negative
value indicating that large values for one variable tend to occur with small
values for the other variable. It is related to the ordinary correlation
coefficient between the two variables, which is defined to be

\[ r_{jk} = c_{jk} / (s_j s_k) \]

Furthermore, the definitions imply that $c_{kj} = c_{jk}$, $r_{kj} = r_{jk}$,
$c_{jj} = s_j^2$, and $r_{jj} = 1$.
With these definitions, the transpose of the sample mean vector is

\[ \bar{\mathbf{x}}' = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p) \]

which can be thought of as reflecting the center of the multivariate sample.
It is also an estimate of the transpose of the population vector of means

\[ \boldsymbol{\mu}' = (\mu_1, \mu_2, \ldots, \mu_p) \]

Furthermore, the sample matrix of variances and covariances, or the covariance
matrix, is

\[
\mathbf{C} =
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots &        & \vdots \\
c_{p1} & c_{p2} & \cdots & c_{pp}
\end{bmatrix}
\]

where $c_{ii} = s_i^2$. This is also sometimes called the sample dispersion
matrix, and it measures the amount of variation in the sample as well as the
extent to which the p variables are correlated. It is an estimate of the
population covariance matrix

\[
\boldsymbol{\Sigma} =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots      & \vdots      &        & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{bmatrix}
\]

Finally, the sample correlation matrix is

\[
\mathbf{R} =
\begin{bmatrix}
1      & r_{12} & \cdots & r_{1p} \\
r_{21} & 1      & \cdots & r_{2p} \\
\vdots & \vdots &        & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
\]

Again, this is an estimate of the corresponding population correlation matrix.
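
The sample covariance and correlation matrices can be computed directly from
their definitions. The sketch below is an illustration only: the function names
are hypothetical, the data are invented, and the layout (one row per
multivariate observation, one column per variable) is an assumption of this
example.

```python
import math


def column_means(data):
    """Sample mean of each variable (column)."""
    n = len(data)
    return [sum(row[j] for row in data) / n for j in range(len(data[0]))]


def covariance_matrix(data):
    """Matrix of c_jk = sum_i (x_ij - mean_j)(x_ik - mean_k) / (n - 1)."""
    n, p = len(data), len(data[0])
    means = column_means(data)
    return [[sum((row[j] - means[j]) * (row[k] - means[k]) for row in data) / (n - 1)
             for k in range(p)]
            for j in range(p)]


def correlation_matrix(data):
    """Matrix of r_jk = c_jk / (s_j s_k), with s_j the square root of c_jj."""
    C = covariance_matrix(data)
    p = len(C)
    s = [math.sqrt(C[j][j]) for j in range(p)]
    return [[C[j][k] / (s[j] * s[k]) for k in range(p)] for j in range(p)]


# Four observations on two variables (rows are observations).
data = [[1, 2], [2, 1], [3, 4], [4, 3]]
C = covariance_matrix(data)
R = correlation_matrix(data)
```

For these data the two variances are both 5/3 and the covariance is 1, so the
correlation is approximately 0.6. Both matrices are symmetric, and the diagonal
of R is 1 apart from rounding error.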


An important result for some analyses is that if the observations for each of
the variables are coded by subtracting the sample mean and dividing by the
sample standard deviation, then the coded values will have a mean of zero
and a standard deviation of one for each variable. In that case, the sample
covariance matrix will equal the sample correlation matrix, i.e.,
$\mathbf{C} = \mathbf{R}$.
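
This result can be checked numerically. The following sketch codes an invented
data set as described and compares the covariance matrix of the coded values
with the correlation matrix of the original values. All function names are
hypothetical, and rows are assumed to be observations and columns variables.

```python
import math


def column_means(data):
    """Sample mean of each variable (column)."""
    n = len(data)
    return [sum(row[j] for row in data) / n for j in range(len(data[0]))]


def covariance_matrix(data):
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(data), len(data[0])
    means = column_means(data)
    return [[sum((row[j] - means[j]) * (row[k] - means[k]) for row in data) / (n - 1)
             for k in range(p)]
            for j in range(p)]


def correlation_matrix(data):
    """Sample correlation matrix, r_jk = c_jk / (s_j s_k)."""
    C = covariance_matrix(data)
    p = len(C)
    s = [math.sqrt(C[j][j]) for j in range(p)]
    return [[C[j][k] / (s[j] * s[k]) for k in range(p)] for j in range(p)]


def standardize(data):
    """Subtract each column mean and divide by the column standard deviation."""
    p = len(data[0])
    means = column_means(data)
    C = covariance_matrix(data)
    sds = [math.sqrt(C[j][j]) for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]


# Two variables measured on very different scales.
data = [[1, 20], [2, 10], [3, 40], [4, 30]]
coded = standardize(data)
C_coded = covariance_matrix(coded)
R = correlation_matrix(data)

# Largest absolute difference between corresponding elements of C and R.
max_difference = max(abs(C_coded[j][k] - R[j][k])
                     for j in range(2) for k in range(2))
```

The difference is zero apart from floating-point rounding, and the coded
variables have means of zero and variances of one.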

2.8 Further reading


This short introduction to matrix algebra will suffice for understanding the
methods described in the remainder of this book and some of the theory
behind these methods. However, for a better understanding of the theory,
more knowledge and proficiency are required.
One possibility in this respect is just to read a university text giving an
introduction to matrix methods. Alternatively, there are several books of
various lengths that cover what is needed just for statistical applications.
Three of these are by Searle (1982), Healy (1986), and Harville (1997), with
Healy’s book being quite short (fewer than 100 pages), Searle’s book quite long
(438 pages), and Harville’s book the longest of all (630 pages). Another short
book with fewer than 100 pages is by Namboodiri (1984). The shorter books
should be more than adequate for most people.
Another possibility is to do a Web search on the topic of matrix algebra.
This yields much educational material including free books and course notes.

2.9 Chapter summary


• Matrices and vectors are defined, as are the special forms of the zero,
diagonal, and identity matrices. Definitions are also given for the
transpose of a matrix, the equality of two matrices, and the trace of
a matrix.
• The operations of addition, subtraction, and multiplication are defined
for two matrices.
• The meaning of matrix inversion is briefly explained, together with
the associated concepts of a determinant, a singular matrix, and an
orthogonal matrix.
• A quadratic form is defined.
• Eigenvalues and eigenvectors (latent roots and vectors) are defined.
• The calculation of the sample mean vector and the sample covariance
matrix is explained, together with the corresponding population
mean vector and population covariance matrix. The sample correlation
matrix and the corresponding population correlation matrix are
also defined.
• Suggestions are made about books and other sources of further
information about matrix algebra.

References
Harville, D.A. (1997), Matrix Algebra from a Statistician’s Perspective, Springer,
New York.
Healy, M.J.R. (1986), Matrices for Statistics, Clarendon Press, Oxford.
Namboodiri, K. (1984), Matrix Algebra: an Introduction, Sage Publications, Thousand
Oaks, CA.
Searle, S.R. (1982), Matrix Algebra Useful to Statisticians, Wiley, New York.

chapter three

Displaying multivariate data

3.1 The problem of displaying many variables in two dimensions

Graphs must be displayed in two dimensions, either on paper or on a computer
screen. It is therefore straightforward to show one variable plotted on
a vertical axis against a second variable plotted on a horizontal axis. For
example, Figure 3.1 shows the alar extent plotted against the total length for
the 49 female sparrows measured by Hermon Bumpus in his study of natural
selection (Table 1.1). Such plots allow one or more other characteristics of
the objects being studied to be shown as well. For example, in the case of
Bumpus’s sparrows, survival and nonsurvival are also indicated.
It is considerably more complicated to show one variable plotted against
another two, but still possible. Thus Figure 3.2 shows beak and head lengths
(as a single variable) plotted against total lengths and alar lengths for the 49
sparrows. Again, different symbols are used for survivors and nonsurvivors.
It is not possible to show one variable plotted against another three at
the same time in some extension of a three-dimensional plot. Hence there is
a major problem in showing in a simple way the relationships that exist
between the individual objects in a multivariate set of data where those
objects are each described by four or more variables. Various solutions to
this problem have been proposed and are discussed in this chapter.

3.2 Plotting index variables


One approach to making a graphical summary of the differences between
objects that are described by more than four variables involves plotting the
objects against the values of two or three index variables. Indeed, a major
objective of many multivariate analyses is to produce index variables that
can be used for this purpose, a process that is sometimes called ordination.
For example, a plot of the values of principal component 2 against the values
of principal component 1 can be used as a means of representing the
relationships between objects graphically, and a display of principal component
3 against the first two principal components can also be used if necessary.

Figure 3.1 Alar extent (mm) plotted against total length (mm) for the 49 female
sparrows measured by Hermon Bumpus, with different symbols for survivors and
nonsurvivors.

Figure 3.2 The length of the beak and head plotted against the total length and
alar extent (all measured in millimeters) for the 49 female sparrows measured by
Hermon Bumpus (• = survivor, ο = nonsurvivor).

The use of suitable index variables has the advantage of reducing the
problem of plotting many variables to two or three dimensions, but the
potential disadvantage is that some key difference between the objects may
be lost in the reduction. This approach is discussed in various different
contexts in the chapters that follow and will not be considered further here.

Figure 3.3 Draftsman’s plot of the bird number and five variables measured (in
millimeters) on 49 female sparrows. The variables are the total length, the alar
extent, the length of the beak and head, the length of the humerus, and the
length of the keel of the sternum (• = survivor, ο = nonsurvivor). Only the
extreme values are shown on each scale.

3.3 The draftsman’s plot


A draftsman’s display of multivariate data consists of a plot of the values
for each variable against the values for each of the other variables, with the
individual graphs being small enough so that they can all be viewed at the
same time. This has the advantage of only needing two-dimensional plots,
but the disadvantage is that it cannot depict aspects of the data that would
only be apparent when three or more variables are considered together.
An example is shown in Figure 3.3. Here, the five variables measured
by Hermon Bumpus on 49 sparrows (total length, alar extent, length of beak
and head, length of humerus, and length of the keel of the sternum, all in
mm) are plotted for the data given in Table 1.1, with an additional first
variable being the number of the sparrow, from 1 to 49. Different symbols
are used for the measurements on survivors (birds 1 to 21) and nonsurvivors
(birds 22 to 49). Regression lines are also sometimes added to the plots.
and good-looking Mexican servant exemplifies more than any other human
being the thing called “style.” As darkness comes on everyone returns to
town to drive in San Francisco Street until half past eight or nine. This is a
most extraordinary sight—the narrow thoroughfare in the heart of the city
so congested with carriages as to be more or less impassable for two hours
—the occupants under the electric lights more pallid than their powder—the
sidewalks packed with spectators constantly urged by the police to “move
on.” It all happens at the same hour every Sunday, and no one seems to tire.
When I said there were but few “sights” in Mexican cities I made, in the
case of the capital, a mental reservation. Here there are formal, official,
objective points sufficient to keep the intelligent tourist busy for a week; the
cathedral, the Viga canal, the shrine of Guadalupe, the Monte de Piedad—
the National Palace, and the Castle of Chapultepec, if one cares to measure
the red tape necessary to passing within their historic and deeply interesting
portals. Even if one doesn’t, it would, in my opinion, be a tragedy to leave
without seeing, at sunset, the view of the volcanoes from the top of the rock
on which the castle is built; especially as this can be done by following,
without a card of admission, the steep, winding road past the pretty
grottolike entrance to the President’s elevator, until it ends at the gateway of
the famous military school on the summit. One also goes, of course, to the
National Museum to inspect the small but immensely valuable collection of
Aztec remains (large compared to any other Aztec remains, but small, if one
pauses to recall the remains in general that have remained elsewhere) and to
receive the impression that the pre-Spanish inhabitants of the country,
interesting as they undoubtedly were, had by no means attained that facility
in the various arts which Prescott and other historians claim for them. After
examining their grotesque and terrifying gods, the incoherent calendar and
sacrificial stones, the pottery, the implements, and the few bits of crude,
gold jewelry, one strolls into the small room in which are left, perhaps, the
most tangible evidences of Maximilian’s “empire,” reflecting that Prescott’s
monumental effort is one of the most entrancing works of fiction one
knows. To the unarcheological, Maximilian’s state coach, almost as
overwhelmingly magnificent as the gilded sledge in which Lillian Russell
used to make her entrance in “The Grand Duchess,” his carriage for
ordinary occasions, the saddle he was in when captured, and the colored
fashion plates of his servants’ liveries, are sure to be the museum’s most
interesting possessions. Not without a pardonable touch of malice, in the
guise of a grave political lesson, is the fact that the severely simple, well-
worn, eminently republican vehicle of Benito Juarez is displayed in the
same room.
The four or five vast apartments of the Academy of San Carlos (the
national picture gallery) suggest certain aspects of the Louvre, but their
variously sized canvases suggest only the melancholy reflection that all
over the world so many perfectly well-painted pictures are so perfectly
uninteresting. One cannot but except, however, a dozen or more scattered
little landscapes—absolutely faultless examples of the kind of picture (a
very beautiful kind I have grown to think) that the grandparents of all good
Bostonians felt it becoming their means and station to acquire fifty or sixty
years ago in Rome. The Mexican Government, it no doubt will be
surprising to hear, encourages painting and music by substantial
scholarships. Talented students are sent abroad to study at government
expense. One young man I happened to know was given his opportunity on
the strength of an exquisite oil sketch of the patio of his parents’ house in
the white glare of noon. He is in Paris now, painting pictures of naked
women lying on their backs in vacant lots. Several of them, naturally, have
been hung in the Salon.
But the guidebook will enumerate the sights, and the “Seeing Mexico”
electric car will take one to them. Still there is one I do not believe the book
mentions, and I am sure the car does not include. That is the city itself
between five and six o’clock on a fair morning. It several times has been
my good fortune (in disguise) to be obliged to get up at this hour for the
purpose of saying good-by to people who were leaving on an early train,
and in returning all the way on foot from the station to the Zócalo (as the
stupendous square in front of the cathedral is called) I saw the place, I am
happy to remember, in what was literally as well as figuratively a new light.
Beyond a few laborers straggling to their work, and the men who were
making the toilet of the Alameda with large, green bushes attached to the
end of sticks, the city appeared to be blandly slumbering, and just as the
face of some one we know will, while asleep, surprise us by a rare and
unsuspected expression, the great, unfinished, unsympathetic capital smiled,
wisely and a trifle wearily, in its dreams. It is at this hour, before the
mongrel population has begun to swarm, that one should walk through the
Alameda, inhale the first freshness of the wet roses and lilies, the gardenias
and pansies and heliotrope in the flower market, and, undisturbed among
the trees in front of the majestic cathedral, listen to “the echoed sob of
history.”

THE END
Typographical errors corrected by
the etext transcriber:
Futhermore=> Furthermore {pg 73}
Oh que bonitas=> Oh qué bonitas
{pg 179}
a desert=> a dessert {pg 185}
she as giving=> she was giving {pg
210}
exclaims her hushand=> exclaims
her husband {pg 261}
innocent midemeanor=> innocent
misdemeanor {pg 272}
of preoccuption while=> of
preoccupation while {pg 281}
*** END OF THE PROJECT GUTENBERG EBOOK VIVA MEXICO! ***

Updated editions will replace the previous one—the old editions will
be renamed.

Creating the works from print editions not protected by U.S.


copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying
copyright royalties. Special rules, set forth in the General Terms of
Use part of this license, apply to copying and distributing Project
Gutenberg™ electronic works to protect the PROJECT GUTENBERG™
concept and trademark. Project Gutenberg is a registered trademark,
and may not be used if you charge for an eBook, except by following
the terms of the trademark license, including paying royalties for use
of the Project Gutenberg trademark. If you do not charge anything
for copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the free


distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund
from the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only be


used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law
in the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name associated
with the work. You can easily comply with the terms of this
agreement by keeping this work in the same format with its attached
full Project Gutenberg™ License when you share it without charge
with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the
terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived


from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted


with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.

1.E.4. Do not unlink or detach or remove the full Project


Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this


electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,


performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing


access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who


notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of


any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™


electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend


considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for


the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you


discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied


warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,


the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.

Section 2. Information about the Mission


of Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the


assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.

The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.

Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.
Welcome to our website – the ideal destination for book lovers and
knowledge seekers. With a mission to inspire endlessly, we offer a
vast collection of books, ranging from classic literary works to
specialized publications, self-development books, and children's
literature. Each book is a new journey of discovery, expanding
knowledge and enriching the soul of the reader.

Our website is not just a platform for buying books, but a bridge
connecting readers to the timeless values of culture and wisdom. With
an elegant, user-friendly interface and an intelligent search system,
we are committed to providing a quick and convenient shopping
experience. Additionally, our special promotions and home delivery
services ensure that you save time and fully enjoy the joy of reading.

Let us accompany you on the journey of exploring knowledge and
personal growth!
