Applied Matrix Algebra in the Statistical Sciences
About this ebook
Consisting of two interrelated parts, this volume begins with the basic structure of vectors and vector spaces. The latter part emphasizes the diverse properties of matrices and their associated linear transformations--and how these, in turn, depend upon results derived from linear vector spaces. An overview of introductory concepts leads to more advanced topics such as latent roots and vectors, generalized inverses, and nonnegative matrices. Each chapter concludes with a section on real-world statistical applications, plus exercises that offer concrete examples of the applications of matrix algebra.
Book preview
Applied Matrix Algebra in the Statistical Sciences - Alexander Basilevsky
Preface
In recent decades matrix algebra on the one hand, and statistics and probability on the other, have become two of the most important areas of theoretical and applied mathematics, both in university curricula and in actual research in the social, geographical, human, and life sciences. With a growing awareness of the usefulness of matrices in the statistical sciences has come a very elegant development of statistical theory, combining the principles of probability and linear algebra, and this has greatly increased our understanding of the algebraic linear structures and properties of statistical models. At the same time, however, many of the more specialized theorems of linear algebra, as well as particular types of matrices commonly used in statistics, are usually not discussed in the more theoretical linear algebra texts, whose main purpose is to give the student a broad perspective of linear algebraic structures rather than a more specialized view. Matrix results of interest to statisticians are usually scattered in the appendices and introductory chapters of advanced statistical publications, or are discussed in journal articles not readily available to the student and researcher. This in turn tends to lead university programs in statistics to place less emphasis on matrix algebra than it deserves, with the result that students in statistics and other quantitative programs are often forced to rely on courses taught in pure mathematics in order to obtain even the most elementary notions of applied linear algebra.
The present volume is therefore an attempt to collect into a single text some of the main results of matrix algebra that find wide use in both applied and theoretical branches of the statistical sciences, and to provide a bridge between linear algebra and statistical models. It is hoped that this self-contained book on applied matrix algebra, designed for statistical application, will be useful to students undertaking undergraduate or postgraduate degrees in quantitative areas such as statistics, probability, econometrics, psychometrics, sociometrics, biometrics, the life and earth sciences, and quantitative geography and demography. It is also the intention of the author to place at the disposal of the researcher a handy reference book to be consulted when the need arises.
The material has therefore been organized in a self-contained manner, beginning with introductory concepts and leading to more advanced topics such as latent roots and vectors, generalized inverses, and nonnegative matrices. Each chapter ends with a section on real-world statistical applications, together with exercises, which provide the reader with concrete examples of how (and why) matrices are useful in the statistical sciences and which illustrate the matrix algebra itself. Whenever possible, proofs of the theorems have been structured so as to avoid abstract algebraic notions. The only mathematical background needed is a good knowledge of high school mathematics and a first course in statistics, although certain sections, such as those dealing with quadratic forms, require differential calculus. The author has also attempted to relate algebraic results to geometric concepts and diagrams in order to make the material more intuitively understandable.
The main purpose of this book, however, is to provide the reader with a systematic development of applied matrix theory, and the author has included more-or-less complete proofs of the theorems. Formal proofs not only develop analytical skills that are indispensable in quantitative research, but also provide a sound understanding of the fundamental properties of the mathematical tools used.
This book consists of two interrelated parts: the first deals with the basic structure of vectors and vector spaces, while the second emphasizes the diverse properties of matrices and their associated linear transformations, and how these in turn depend on results derived from linear vector spaces. Chapter 1 introduces the notion of a real vector as a distinct mathematical entity, defines operations for vectors, and then considers scalar functions of vector elements such as length, distance, inner product, and the angle between two vectors. In turn, these scalar functions are used to define statistical measures of location, variation, and association such as the centroid (mean), variance, covariance, and correlation. Chapter 2 extends the notion of a vector to that of an n-dimensional vector space provided with orthogonal—or, more generally, oblique—coordinates. The discussion centers on sums of vector spaces, coordinate transformations, and projections of a vector onto other vectors, and on how coordinate transformations and vector projections provide the statistician with some of the basic tools used in curve fitting, principal components, and other associated topics in multivariate analysis.
Chapters 3–7 form the main body of this book; they deal with matrices and their properties, and with how these in turn are used in statistical analysis and model building. Chapter 3 defines general types of matrices, operations on matrices, scalar functions of the elements of a matrix, and how matrices are used to study various types of linear transformations. In Chapter 4 the discussion is narrowed to more specialized matrices such as idempotent, nilpotent, orthogonal, Grammian, and projection matrices, which play a fundamental role in statistics. Chapter 5 is devoted exclusively to latent roots and vectors, whereas Chapter 6 considers a relatively recent development in matrix theory and linear statistical models—generalized matrix inverses and their unifying role in statistical estimation. Finally, Chapter 7 considers some of the more important elements of graphs, nonnegative and diagonally dominant matrices, and how these can be employed in the study of Markov chains and other associated models used in projections and forecasting.
Not all chapters will be of equal interest to each reader. For those whose interest lies in multivariate models and estimation but who have not had any previous exposure to linear vector spaces, Chapters 1–5 will be of relevance (with Chapter 6 providing an additional topic of a more advanced nature), whereas readers already familiar with vectors and vector spaces can begin with Chapter 3. Students who are more concerned with discrete stochastic processes and time projections can replace Chapter 6 with Chapter 7.
The author wishes to express thanks to D. S. Grant of the Department of Mathematics at the University of Winnipeg, who read the entire manuscript and suggested many useful changes. I would also like to thank S. A. Hathout of the Department of Geography, who has provided original data for a numerical example, as well as T. J. Kuz of the same department and A. Anderson of the University of Massachusetts for many useful discussions. The final preparation of this volume was made possible by a research grant from the University of Winnipeg. I alone am responsible for any defects in the manuscript.
Alexander Basilevsky
Winnipeg, Manitoba
Chapter 1
Vectors
1.1 Introduction
In applied quantitative work matrices arise for two main reasons—to manipulate data arranged in tables and to solve systems of equations. A real matrix A is defined as an n × k rectangular array
    ⎡ a11  a12  ...  a1k ⎤
A = ⎢ a21  a22  ...  a2k ⎥
    ⎢ ...  ...       ... ⎥
    ⎣ an1  an2  ...  ank ⎦
where the real aij (i = 1, 2, ..., n; j = 1, 2, ..., k) comprising the elements of A have either known or unknown values. When n = 3 and k = 2, we have the 3 × 2 matrix
    ⎡ a11  a12 ⎤
A = ⎢ a21  a22 ⎥
    ⎣ a31  a32 ⎦
and when the elements are known we may have, for example,
    ⎡ 2  0 ⎤
A = ⎢ ·  · ⎥
    ⎣ ·  8 ⎦
where a11 = 2, a12 = 0, ..., a32 = 8. The subscripts i and j are convenient index numbers that indicate the row and column of aij, respectively. For the special case when n = k = 1, matrix A reduces to a single number, referred to as a scalar. When n = 1 (k = 1) we obtain a row (column) array, or a vector, which can be viewed as a particular type of matrix. Alternatively, a matrix can be considered as a set of vectors, which in turn consist of real scalar numbers. Each view has its own particular merit, but for the sake of exposition it is useful to first consider properties of vectors and to then extend these properties to matrices. Geometrically, a vector is represented as a point in a Cartesian system of coordinate axes and is frequently depicted by a straight arrow (Figure 1.1); however, it is important to keep in mind that a vector is in fact a point, and not a straight line.
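For readers who wish to experiment with these objects numerically, the definitions above map directly onto array-based software. The following minimal sketch uses Python with NumPy; the matrix entries are arbitrary illustrative values, not those of the example above.

import numpy as np

# A hypothetical 3 x 2 matrix (n = 3 rows, k = 2 columns); entries chosen arbitrarily.
A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

n, k = A.shape      # n = 3, k = 2
a11 = A[0, 0]       # the text's subscripts are 1-based; NumPy indexing is 0-based
a32 = A[2, 1]       # the element in row 3, column 2
row2 = A[1, :]      # the second row, a k-component vector
col1 = A[:, 0]      # the first column, an n-component vector
print(n, k, a11, a32, row2, col1)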
Figure 1.1 A parallel translation of a vector V to a new location V*.
1.2 Vector Operations
Although geometric representations of vectors can be intuitive aids and will be used frequently in the following chapters, they are less helpful for defining the basic properties of vectors, which is best achieved by algebra.
Let a set of n numbers ai (i = 1, 2, ..., n) be represented in the linear¹ array (a1, a2,...,an), where, in general, interchanging any two (or more) numbers results in a different set. The set (a2, a1,..., an), for example, is not the same as (a1, a2,...,an) unless a1 = a2. For this reason a vector is said to be ordered. Such ordered sets of numbers are generally referred to as vectors, and the scalars ai are known as components of a vector. The components are measured with respect to the zero vector 0 = (0, 0,..., 0), which serves as the origin point. Not all vector systems employ zero as the origin, but we confine our treatment to those systems that do. The total number of components n is known as the dimension of the vector. A more precise definition of the notion of dimensionality is given in Chapter 2.
A vector obeys the following algebraic rules.
1.2.1 Vector Equality
Let
A = (a1, a2,...,an), B = (b1,b2,...,bn)
denote any two n-dimensional vectors. Then vectors A and B are said to be equal if and only if ai = bi for all i = 1,2,...,n. Two equal vectors are written as A = B, and the equality therefore holds only if the corresponding elements of A and B are equal. Note that two vectors can be equal only when they contain the same number of components.
Example 1.1. The two vectors
A = (3,8,1), B = (3,8,1)
are equal.
1.2.2 Addition
Consider any three n-dimensional vectors
A= (a1, a2,..., an), B= (b1, b2,..., bn), C= (c1, c2,..., cn)
The addition of two vectors,
A = B + C,
is defined as
A = B + C = (b1 + c1, b2 + c2, ..., bn + cn).
The vector A is thus defined in terms of sums of corresponding elements of B and C. Note that in order to be conformable for addition, B and C must again contain the same number of components.
Vector addition obeys the following axioms.
1. The commutative law for addition:
B + C = C + B.
2. The associative law for addition:
A + (B + C) = (A + B) + C.
3. There exists a zero (null) vector 0 = (0, 0, ..., 0) such that
A + 0 = A.
4. For any vector A there exists a negative vector −A = (−a1, −a2, ..., −an) such that
A + (−A) = 0.
It is straightforward to show that the negative vector of − A is the vector A.
Theorem 1.1. The negative vector of − A is the vector A.
PROOF: Let A+ be the negative vector of A, and A* the negative vector of − A. We will now prove that A* = A. By rule 4 we have
− A + A* = A + A+ = 0.
Adding A to both sides of the equation yields
A + (− A) + A* = A + A+ + A
or
0 + A* = A + 0 (rule 4).
Thus
A* = A (rule 3). □
1.2.3 Scalar Multiplication
If A = (a1, a2, ..., an) is any n-dimensional vector and k any scalar, then the scalar product
kA = k (a1, a2, ..., an)
= (ka1, ka2, ..., kan)
is a uniquely determined vector that obeys the following laws.
The commutative law:
kA = Ak.
The associative law:
k1(k2A) = (k1k2)A,
where k1 and k2 are any two scalars.
The following products hold for the scalars 0 and 1:
0A = 0, 1A = A, (−1) A = − A.
Rule 4 of Section 1.2.2 can therefore be expressed in the alternative form
A − A = A + (−1)A = A + (−A) = 0,
which effectively establishes vector subtraction.
1.2.4 Distributive Laws
Combining vector addition and scalar multiplication, we obtain the following two distributive laws:
(k1+ k2)A = k1A + k2A,
k(A + B) = kA + kB,
where k, k1, and k2 are scalars and A and B are any two n-component vectors.
The following two examples provide an illustration of the above rules.
Example 1.2. Find the sum and difference of the vectors A = (3, −1, 0) and B = (1, 4, −7).
SOLUTION: We have
A + B = (3 + 1, −1 + 4, 0 + (−7)) = (4, 3, −7)
and
A − B = (3 − 1, −1 − 4, 0 − (−7)) = (2, −5, 7).
Example 1.3. Find, for a given vector A, the negative vector −A.
SOLUTION: The negative vector is obtained by reversing the sign of each component of A. Thus A + (−A) = A − A = 0.
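The rules of Sections 1.2.1–1.2.4 amount to componentwise arithmetic, and Examples 1.2 and 1.3 can be checked directly in any numerical language. A minimal sketch in Python with NumPy, using the data of Example 1.2:

import numpy as np

A = np.array([3.0, -1.0, 0.0])   # vector A of Example 1.2
B = np.array([1.0, 4.0, -7.0])   # vector B of Example 1.2

print(A + B)                      # componentwise sum: [ 4.  3. -7.]
print(A - B)                      # componentwise difference: [ 2. -5.  7.]
print(A + (-A))                   # rule 4: A + (-A) = 0
print(np.allclose(2.0 * (A + B), 2.0 * A + 2.0 * B))   # distributive law k(A + B) = kA + kB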
1.3 Coordinates of a Vector
Since basic operations can be defined in terms of their components, it is natural to extend this method to cover other properties of vectors. This can be achieved by defining a system of coordinate axes that provide numerical scales along which all possible values of the components of a vector are measured. It is more common to use orthogonal Cartesian coordinate systems, although such practice is not essential. Indeed, in the following chapters we shall have occasion to refer to oblique (nonorthogonal) systems.
With coordinate axes every component of a vector can be uniquely² associated with a point on an axis. The axes therefore serve as a reference system in terms of which concepts such as dimension, length, and linear dependency can be defined. Consider the two-dimensional parallel vectors V and V*, as in Figure 1.1. The two vectors are of equal length; components a1, b1, and x1 are measured along the horizontal axis, and a2, b2 and x2 along the vertical axis. A vector can originate and terminate at any point,³ but it is convenient to standardize the origin to coincide with the zero vector (0, 0) = 0. We then have
a1 = b1 − x1,    a2 = b2 − x2,    (1.1)
and setting x1 = x2 = 0, we have V = V*. Displacing a vector to a new parallel position (or, equivalently, shifting the vertical and horizontal axes in a parallel fashion) leaves that vector unchanged, and in such a system equal vectors possess equal magnitudes and direction. A parallel displacement of a vector (coordinate axes) is known as a translation.
Although the requirement that vectors originate at the zero point simplifies a coordinate system, this is achieved at a cost; it now becomes meaningless to speak of parallel vectors. However, an equivalent concept is that of collinearity. Two vectors are said to be collinear if they lie on the same straight line. Collinearity (and multicollinearity) will be dealt with more fully when we consider the linear dependence of vectors. For the moment we note that collinear vectors need not point in the same direction, since their terminal points can be separated by an angle of 180°. Thus if vector V1 is collinear with V2, then − V1 is also collinear with V2, since even though V1 and − V1 point in opposite directions, they nevertheless lie on the same straight line.
Once the coordinates of a vector are defined in terms of orthogonal axes, the basic vector operations of Section 1.2 can be given convenient geometric representation. For example, vector addition corresponds to constructing the diagonal vector of a parallelogram (Figure 1.2). Also, vector components can themselves be defined in terms of vector addition, since for an orthogonal three-dimensional system any vector Y = (y1, y2, y3) can be written as
Y = (y1, 0, 0) + (0, y2, 0) + (0, 0, y3)    (1.2)
(see Figure 1.3). More generally, the coordinate numbers of any n-dimensional vector can be easily visualized as a set of n component vectors that make up the vector Y.
Figure 1.2 Addition of two vectors V1 and V2 in two-dimensional space.
Figure 1.3 The components of a three-dimensional vector Y = (y1, y2, y3).
1.4 The Inner Product of Two Vectors
A vector product can be defined in several ways, the two better-known products being the vector (or cross) product and the scalar (inner) product.⁴ In what follows we shall be mainly concerned with the scalar product, which, as its name suggests, yields a scalar rather than a vector quantity. The importance of the inner product derives from its use as a measure of the association between two vectors and of the length (magnitude) of a vector. As a result, it is one of the more widely employed indexes of linear association in quantitative research. Vector spaces for which the inner product is defined are known as Euclidean vector spaces. For the time being we shall leave the concept of a vector space undefined, but, loosely speaking, a vector space consists of a collection of vectors that possess certain properties. Since the inner product makes use of right-angled triangles, we will briefly review some of their properties first.
1.4.1 The Pythagorean Theorem
In a two-dimensional Euclidean vector space the magnitude of a vector is defined by the Pythagorean theorem, which relates the length of the hypotenuse of a right-angled triangle to those of the remaining two orthogonal (perpendicular) sides. Consider the two squares of Figure 1.4, where the vertices of the inner square partition the four sides of the larger square into two constant lengths a and b. The area of the larger square is given by
(a + b)² = a² + 2ab + b².    (1.3)
Since the four right-angled triangles with sides a, b, and c are congruent, they form two rectangles when joined, each with area ab. The area of the inner square is then equal to the difference
c² = (a + b)² − 2ab = (a² + 2ab + b²) − 2ab
from Equation (1.3), so that
c² = a² + b².    (1.4)
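The algebra leading to Equation (1.4) can be confirmed for any concrete side lengths; a brief check in Python, with a and b chosen arbitrarily:

a, b = 3.0, 4.0
c_squared = (a + b) ** 2 - 2.0 * a * b   # area of the inner square, as in the derivation above
print(c_squared == a ** 2 + b ** 2)      # True: c^2 = a^2 + b^2 (Pythagorean theorem)
print(c_squared ** 0.5)                  # c = 5.0, the familiar 3-4-5 right triangle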
1.4.2 Length of a Vector and Distance Between Two Vectors
Given Equation (1.4), the length of a two-dimensional vector X = (x1, x2) can be expressed as
∥X∥ = (x1² + x2²)¹/².    (1.5)
Figure 1.4 The Pythagorean theorem illustrated with the areas of two squares.
Also, from Figure 1.5, the distance between the vectors (points) X1 = (x11, x12) and X2 = (x21, x22) is the length of vector X3 = X2 − X1, where
X3 = X2 − X1 = (x21 − x11, x22 − x12),
since translation of vectors does not alter length or direction (see Section 1.3). Thus
∥X3∥² = (x21 − x11)² + (x22 − x12)²
      = (x11² + x12²) + (x21² + x22²) − 2(x11x21 + x12x22)
      = ∥X1∥² + ∥X2∥² − 2X1 ∙ X2,    (1.6)
where we let X1 ∙ X2 = x11x21 + x12x22. The squared distance ∥X3∥² can therefore be decomposed into two parts: the first part, which consists of the squared lengths of X1 and X2, and a second part 2X1 ∙ X2, which depends on the interaction (nonorthogonality) between X1 and X2. It will be shown later that X1 ∙ X2 = 0 when the two vectors are orthogonal. In that particular case expression (1.6) becomes identical to Pythagoras’ theorem. This suggests that the scalar X1 ∙ X2 can be used to measure the extent of nonorthogonality between X1 and X2. Evidently, in a certain sense orthogonal vectors are independent of each other, so X1 ∙ X2 also provides us with a measure of the degree of association between X1 and X2. The scalar X1 ∙ X2 is known as the inner (scalar) product of the two vectors.
Figure 1.5 The distance between two vectors X1 and X2 as the length of the difference X3 = X2 − X1.
1.4.3 The Inner Product and Norm of a Vector
As illustrated by Equation (1.6), the inner product X1 ∙ X2 is given by the sum of the products of the vector components. More generally we have the following definitions.
Definition 1.1. Let X1 = (x11, x12,..., x1n) and X2 = (x21, x22,...,x2n) be any two n-dimensional (finite) vectors. Then the inner product of X1 and X2 is the real-valued scalar
X1 ∙ X2 = x11x21 + x12x22 + ... + x1nx2n.    (1.7)
Note that X1 ∙ X2 can assume both negative as well as positive values.
Definition 1.2. Let X1 = (x11, x12,..., x1n) and X2 = (x21, x22,...,x2n) be any two n-dimensional (finite) vectors. Then the distance between X1 and X2 is given by the (nonnegative) function
∥X1 − X2∥ = [(x11 − x21)² + (x12 − x22)² + ... + (x1n − x2n)²]¹/².    (1.8)
Definition 1.3. Let X1 = (x11, x12,..., x1n) be any n-dimensional (finite) vector. Then the length, or norm, of X1 is given by the (nonnegative) function
∥X1∥ = (X1 ∙ X1)¹/² = (x11² + x12² + ... + x1n²)¹/².    (1.9)
Note that the norm can be considered as the distance from the origin. Evidently the vector norm (1.9) provides a multidimensional generalization of Pythagoras’ theorem; however, both distance and vector magnitude depend on the inner product.
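Definitions 1.1–1.3, together with the decomposition (1.6), can be verified numerically. The following sketch uses Python with NumPy and two hypothetical two-dimensional vectors:

import numpy as np

X1 = np.array([3.0, 1.0])   # hypothetical vectors, chosen for illustration
X2 = np.array([1.0, 2.0])

inner = np.dot(X1, X2)               # inner product, Definition 1.1
dist = np.linalg.norm(X1 - X2)       # distance, Definition 1.2
norm1 = np.sqrt(np.dot(X1, X1))      # norm, Definition 1.3 (equals np.linalg.norm(X1))

# Decomposition (1.6): ||X2 - X1||^2 = ||X1||^2 + ||X2||^2 - 2 X1 . X2
lhs = dist ** 2
rhs = np.dot(X1, X1) + np.dot(X2, X2) - 2.0 * inner
print(inner, dist, norm1)
print(np.isclose(lhs, rhs))          # True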
The inner product obeys the following operations.
Theorem 1.2. The inner product is distributive over addition (subtraction), so that
X1 ∙ (X2 ± X3) = X1 ∙ X2 ± X1 ∙ X3,    (1.10)
where X1, X2, and X3 are any n-dimensional vectors.
PROOF: We have, for addition,
X1 ∙ (X2 + X3) = x11(x21 + x31) + x12(x22 + x32) + ... + x1n(x2n + x3n)
              = (x11x21 + x12x22 + ... + x1nx2n) + (x11x31 + x12x32 + ... + x1nx3n)
              = X1 ∙ X2 + X1 ∙ X3.
A similar result holds for subtraction. □
Theorem 1.3. The inner product is commutative, that is,
X1 ∙ X2 = X2 ∙ X1,
where X1 and X2 are any n-dimensional vectors.
PROOF: The proof consists in noting that any ith element x1ix2i of X1 ∙ X2 can also be written as x2ix1i, which is the ith element of X2 ∙X1. □
Theorem 1.4. The inner product is commutative with respect to scalar multiplication, so that
X1 ∙ (kX2) = k ( X1 ∙ X2) = ( X1 ∙ X2) k,
where k is any scalar and X1 and X2 are n-dimensional vectors.
PROOF: The proof again consists of expanding the inner products in terms of the vector components. □
Corollary. Multiplying a vector by a positive scalar alters magnitude but not direction.
PROOF: Let k > 0 be any scalar. Then the magnitude of the vector kX1 is
∥kX1∥ = [(kx11)² + (kx12)² + ... + (kx1n)²]¹/² = k(x11² + x12² + ... + x1n²)¹/² = k∥X1∥,
so that for k ≠ 1 scalar multiplication either magnifies or shrinks the length of X1, depending on the magnitude of k. Since the relative magnitudes and the positions of the components x1i are not changed, it follows that multiplication by k does not alter the direction of X1. □
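The corollary is easily illustrated numerically: a positive scalar k rescales the norm by k but leaves the normalized (direction) vector unchanged. A short Python/NumPy sketch with a hypothetical vector and scalar:

import numpy as np

X = np.array([2.0, 3.0, 4.0])   # hypothetical vector
k = 2.5                          # hypothetical positive scalar

print(np.isclose(np.linalg.norm(k * X), k * np.linalg.norm(X)))   # True: ||kX|| = k ||X||
print(np.allclose((k * X) / np.linalg.norm(k * X),
                  X / np.linalg.norm(X)))                          # True: direction unchanged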
The inner product of Definition 1.1 is a binary operation, since only two vectors are involved. The concept, however, may be extended to more than two vectors. For example, we can define the product
X1X2X3 = x11x21x31 + x12x22x32 + ... + x1nx2nx3n.
Such products, however, are not frequently employed.
1.5 The Dimension of a Vector: Unit Vectors
An important concept in linear algebra is that of dimension, which plays a key role in the theory of vector spaces and their application to quantitative research and statistical data analysis. We are all aware of the existence of spatial dimensions in our physical (geographical) space. Thus the usual physical space that we perceive, such as a room, is three dimensional; the flat surface of a wall is two dimensional, and the straight line where wall and ceiling intersect is one dimensional. Naturally this refers to perfect (ideal) geometric objects, since in practice there is always some random distortion. Thus our physical space lends itself naturally to dimensional description.
Other phenomena can also be depicted or modeled by means of dimensions. Consider two commuters A and B, who travel to work in the city center. Assume that A lives farther from the center of town than B, but has access to a more rapid means of transportation (train versus bus). It is then possible for A to reach the city center before B. Thus although A lives farther from the city center in terms of mileage, he is nevertheless closer in terms of the time dimension. Similarly, a third individual C, who lives even farther away from the city center than both A and B, but who can afford a more expensive mode of travel (helicopter, say), is closer to the city center in yet another dimension, a wealth dimension.
We therefore have three potential determinants of travel time: distance (mileage), speed of travel (miles per hour), and income (dollar amount per year). Travel time can therefore be described in terms of three dimensions.
The concept of dimensionality also permits us to organize and interpret data in a systematic manner. Consider four consecutive yearly time periods, the years 1977–1980, during which data are collected on the following variables:
Y = number of fatal road accidents per year (hundreds),
X1 = time (year),
X2 = volume of automobile sales in a given year (thousands),
X3 = maximum speed limit in miles per hour.
We may then be able to relate the accident rate data Y with X1, X2, and X3 as
Y = β1X1 + β2X2 + β3X3,    (1.11)
or, in terms of hypothetical data (column vectors),
[hypothetical data vectors Y, X1, X2, X3]
where β1, β2, and β3 are arbitrary coefficients to be determined. The vector Y = (12, 14, 11, 10) is then said to be embedded in a three-dimensional vector space. However, should X1 and X2 be collinear, then our accident space reduces to two dimensions, since one of the vectors (X1 or X2) becomes completely redundant.
A linear equation such as Equation (1.11) is also known as a linear combination of the vectors X1, X2, and X3. We have the following definitions concerning linear combinations of vectors.
Definition 1.4. An n-dimensional vector Y is said to be linearly dependent on a set of n-dimensional vectors X1, X2, ..., Xk if and only if Y can be expressed as a linear combination of these vectors.
Definition 1.5. A set of n-dimensional vectors X1, X2,..., Xk is said to be linearly interdependent if and only if there exist scalars β1, β2,..., βk, not all zero, such that
β1X1 + β2X2 + ... + βkXk = 0.    (1.12)
Evidently, if a set of vectors is linearly interdependent, then any one of the vectors, say X1, can be expressed as
X1 = −(β2/β1)X2 − (β3/β1)X3 − ... − (βk/β1)Xk,
and X1 therefore depends on X2, X3,..., Xk, assuming that β1 ≠ 0. Likewise, when
e9780486153377_i0034.jpg(1.13)
as per Definition 1.4, then Equation (1.13) can be rewritten as
Y − α1X1 − α2X2 − ... − αkXk = 0,
so that Y, X1, X2, ..., Xk are linearly interdependent. Linear dependency can therefore be viewed either in terms of Definition 1.4 or 1.5.
Example 1.4. The vector Y = (11, 16, 21 ) is linearly dependent on vectors X1 = (1, 2, 3) and X2 = (4,5,6), since
(11, 16, 21) = 3(1,2,3)+2(4,5,6);
also, it can be verified that
(11, 16, 21) − 3(1, 2, 3) − 2(4, 5, 6) = (0, 0, 0) = 0,
so that Definition 1.5 applies, where we let
β1 = 1, β2 = − 3, β3 = − 2.
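Example 1.4 can be verified directly; the sketch below, in Python with NumPy, checks both the linear-combination form of Definition 1.4 and the interdependence form of Definition 1.5 for the same data:

import numpy as np

Y  = np.array([11.0, 16.0, 21.0])
X1 = np.array([1.0, 2.0, 3.0])
X2 = np.array([4.0, 5.0, 6.0])

# Definition 1.4: Y is a linear combination of X1 and X2.
print(np.allclose(Y, 3.0 * X1 + 2.0 * X2))                       # True

# Definition 1.5: {Y, X1, X2} is linearly interdependent with
# beta1 = 1, beta2 = -3, beta3 = -2.
print(np.allclose(1.0 * Y - 3.0 * X1 - 2.0 * X2, np.zeros(3)))   # True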
Equation (1.11) is also an example of the linear dependence of vector Y on X1, X2, and X3, where coordinates consist of observed data. In applied work, however, it is frequently impossible to compare the magnitudes of coordinates or vectors. The reason for this is that coordinate magnitudes often reflect differences in measurement units, which in turn may not be comparable. For example, there is nothing significant in the fact that the components of vector X1 are much larger than those of X3, since the units associated with these vectors are not the same. Consequently X1 possesses a much larger magnitude than X3. Also, it is not possible to rescale any one of the two vectors, since their measurement units are not qualitatively comparable. For this reason vector components are frequently transformed to relative magnitudes by rescaling vectors to unit length.
Definition 1.6. A vector Y is termed a unit vector if and only if it possesses unit length, i.e., ∥Y∥ = 1.
Example 1.5.
i. The vectors
E1 = (1,0,0), E2 = (0, 1, 0), E3 = (0, 0, 1)
are three-dimensional unit vectors, since (see also Figure 1.3)
∥E1∥ = ∥E2∥ = ∥E3∥ = 1.
ii. The vector
E = (0.267,0.534,0.802)
is a three-dimensional unit vector, since
∥E∥ = (0.267² + 0.534² + 0.802²)¹/² = 1.
Definition 1.7. Two unit vectors Ei, and Ej are mutually orthogonal (perpendicular) if and only if
Ei ∙ Ej = 0    (i ≠ j).    (1.14)
Example 1.6. The vectors E1, E2, and E3 of the previous example are orthogonal unit vectors.
It will be shown later that any n-dimensional vector can be expressed as a linear combination of n orthogonal unit vectors. For example, when the unit vectors are of the form given by Example 1.5(i), we have the particularly simple form
Y = y1E1 + y2E2 + ... + ynEn.    (1.15)
When the unit vectors do not consist of zeros and units, we obtain the more general form Y = a1E1 + a2E2 + ... + anEn. The vector inner product can also be written in terms of Equation (1.15) as
Y ∙ Y = y1² + y2² + ... + yn² = ∥Y∥²,
which is the same as Definition 1.3.
Theorem 1.5. Any n-dimensional vector can be standardized to unit length.
PROOF: Let Y = (y1, y2,...,yn) be any n-dimensional vector with length ∥Y∥. Let
Y* = (1/∥Y∥)Y = (y1/∥Y∥, y2/∥Y∥,..., yn/∥Y∥).
Then Y* is a unit vector, since
∥Y*∥ = [(y1/∥Y∥)² + (y2/∥Y∥)² + ... + (yn/∥Y∥)²]¹/² = (y1² + y2² + ... + yn²)¹/²/∥Y∥ = ∥Y∥/∥Y∥ = 1. □
A vector that is standardized to unit length is also said to be normalized (to unit length). Orthogonal unit vectors are also referred to as orthonormal vectors.
Example 1.7. To normalize Y1 = (2, 3, 4) and a second vector Y2 to unit length we proceed as follows. We have
∥Y1∥ = (2² + 3² + 4²)¹/² = (29)¹/² ≈ 5.385,
and the corresponding unit vector Y1* is
Y1* = (2/(29)¹/², 3/(29)¹/², 4/(29)¹/²) ≈ (0.371, 0.557, 0.743),
since
∥Y1*∥ = [(2/(29)¹/²)² + (3/(29)¹/²)² + (4/(29)¹/²)²]¹/² = [(4 + 9 + 16)/29]¹/² = 1;
Y2 is normalized to Y2* in the same way. Unit vectors, however, need not be orthogonal, since the inner product Y1* ∙ Y2* is in general not zero. Also note that inner products between unit vectors lie in the closed interval [−1, 1], as do components of the unit vectors.
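Normalization as in Theorem 1.5 and Example 1.7 simply divides each component by the vector's length. A minimal Python/NumPy sketch, using Y1 = (2, 3, 4) and a second vector chosen arbitrarily for illustration:

import numpy as np

Y1 = np.array([2.0, 3.0, 4.0])   # from Example 1.7
Y2 = np.array([1.0, 0.0, 5.0])   # hypothetical second vector

Y1_star = Y1 / np.linalg.norm(Y1)   # Theorem 1.5: divide by the length ||Y1|| = sqrt(29)
Y2_star = Y2 / np.linalg.norm(Y2)

print(np.linalg.norm(Y1_star), np.linalg.norm(Y2_star))   # both equal 1
print(np.dot(Y1_star, Y2_star))   # nonzero in general: unit vectors need not be orthogonal,
                                  # and the value lies in the interval [-1, 1]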
Orthogonal unit vectors can be interpreted as orthogonal coordinate axes of unit length [see Equation (1.15)]; however, other representations of orthogonal axes are also possible.
Example 1.8. Let Y = (y1, y2) be a vector whose coordinates are given with respect to the unit vector axes E1 = (1, 0) and E2 = (0, 1).
i. Find the components of Y with respect to axes X1 = (2, 0) and X2 = (0, 3).
ii. Find the magnitude of Y with respect to both coordinate axes.
SOLUTION:
i. Y can be expressed as
Y = y1E1 + y2E2
with respect to E1 and E2. To find the components of Y relative to X1 = (2, 0) and X2 = (0, 3) we write
Y = aX1 + bX2,
where a and b are the unknown components. Then Y = (2a, 3b). Since equal vectors must have equal components, we obtain a = y1/2 and b = y2/3. Thus y1/2 and y2/3 are the components with respect to X1 and X2; that is,
Y = (y1/2)X1 + (y2/3)X2.
ii. Since changing the reference system cannot alter vector magnitude, we have
∥Y∥ = (Y ∙ Y)¹/² = (y1² + y2²)¹/²
with respect to E1 and E2, where E1 ∙ E2 = 0. Similarly, we find
∥Y∥ = [(y1/2)²∥X1∥² + (y2/3)²∥X2∥²]¹/² = [(y1²/4)(4) + (y2²/9)(9)]¹/² = (y1² + y2²)¹/²,
since X1 ∙ X2 = 0.
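The steps of Example 1.8 can be traced for any concrete vector; the sketch below uses Python with NumPy and a hypothetical vector Y = (4, 6) expressed first in the E1, E2 axes and then in the X1, X2 axes:

import numpy as np

y1, y2 = 4.0, 6.0                 # hypothetical coordinates relative to E1 = (1,0), E2 = (0,1)
Y = np.array([y1, y2])

X1 = np.array([2.0, 0.0])
X2 = np.array([0.0, 3.0])

a, b = y1 / 2.0, y2 / 3.0                # components of Y relative to X1 and X2
print(np.allclose(Y, a * X1 + b * X2))   # True: Y = aX1 + bX2
print(np.linalg.norm(Y))                 # magnitude relative to E1, E2
print(np.linalg.norm(a * X1 + b * X2))   # same value: a change of axes cannot alter magnitude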
1.6 Direction Cosines
An n-dimensional vector