Applied Matrix Algebra in the Statistical Sciences
About this ebook

This comprehensive text covers both applied and theoretical branches of matrix algebra in the statistical sciences. It also provides a bridge between linear algebra and statistical models. Appropriate for advanced undergraduate and graduate students, the self-contained treatment also constitutes a handy reference for researchers. The only mathematical background necessary is a sound knowledge of high school mathematics and a first course in statistics.
Consisting of two interrelated parts, this volume begins with the basic structure of vectors and vector spaces. The latter part emphasizes the diverse properties of matrices and their associated linear transformations--and how these, in turn, depend upon results derived from linear vector spaces. An overview of introductory concepts leads to more advanced topics such as latent roots and vectors, generalized inverses, and nonnegative matrices. Each chapter concludes with a section on real-world statistical applications, plus exercises that offer concrete examples of the applications of matrix algebra.
Language: English
Publisher: Dover Publications
Release date: Jan 18, 2013
ISBN: 9780486153377
    Book preview

    Applied Matrix Algebra in the Statistical Sciences - Alexander Basilevsky

    Preface

    In recent decades matrix algebra and statistics and probability have become two of the most important areas of theoretical and applied mathematics, in terms of university curricula as well as actual research in the social, geographical, human, and life sciences. With a growing awareness of the usefulness of matrices in the statistical sciences has come a very elegant development of statistical theory, combining the principles of probability and linear algebra. This has greatly increased our understanding of the algebraic linear structures and properties of statistical models. At the same time, however, many of the more specialized theorems of linear algebra, as well as particular types of matrices commonly used in statistics, are usually not discussed in the more theoretical linear algebra texts, whose main purpose is to give the student a broad perspective of linear algebraic structures rather than a more specialized view. Usually matrix results of interest to statisticians are scattered in the appendices and introductory chapters of advanced statistical publications or are discussed in journal articles not readily available to the student and researcher. This in turn tends to influence university programs in statistics to place less importance on matrix algebra than it deserves, with the result that students involved in statistics and other quantitative programs are often forced to rely on courses taught in pure mathematics in order to obtain even the most elementary notions of applied linear algebra.

    The present volume is therefore an attempt to collect into a single text some of the main results of matrix algebra that find wide use in both applied and theoretical branches of the statistical sciences, and to provide a bridge between linear algebra and statistical models. It is hoped that this volume will be useful to students undertaking undergraduate or postgraduate degrees in quantitative areas such as statistics, probability, econometrics, psychometrics, sociometrics, biometrics, the various life and earth sciences, as well as quantitative geography and demography, by providing a self-contained book on applied matrix algebra designed for statistical application. Also, it is the intention of the author to place at the disposal of the researcher a handy reference book to be consulted when the need arises.

    The material has therefore been organized in a self-contained manner, beginning with introductory concepts and leading to the more advanced topics such as latent roots and vectors, generalized inverses, and nonnegative matrices. At the end of each chapter is a section dealing with real-world statistical applications, as well as exercises, in order to provide the reader with concrete examples (and motivation) of how (and why) matrices are useful in the statistical sciences, as well as with an illustration of the matrix algebra. Whenever possible, proofs of the theorems have been structured in such a way as to obviate the use of abstract algebraic notions. The only mathematical background that is needed is a good knowledge of high school mathematics, as well as a first course in statistics, although certain sections, such as those dealing with quadratic forms, require differential calculus. The author has also attempted to relate algebraic results to geometric concepts and diagrams in order to render the material more understandable from the intuitive point of view.
The main purpose of this book, however, is to provide the reader with a systematic development of applied matrix theory, and the author has included more-or-less complete proofs of the theorems. Formal proofs not only develop analytical skills that are indispensable in quantitative research, but also provide a sound understanding of the fundamental properties of the mathematical tools used.

    This book consists of two interrelated parts: The first deals with the basic structure of vectors and vector spaces, while the latter part emphasizes the diverse properties of matrices and their associated linear transformations, and how these in turn depend on results derived from linear vector spaces. Chapter 1 introduces the notion of a real vector as a distinct mathematical entity, defines operations for vectors, and then considers scalar functions of vector elements such as length, distance, inner product, and the angle between two vectors. In turn, these scalar functions are used to define statistical measures of location, variation, and association such as the centroid (mean), variance, covariance, and correlation. Chapter 2 extends the notion of a vector to that of an n-dimensional vector space provided with orthogonal—or, more generally, oblique—coordinates. The discussion centers on sums of vector spaces, coordinate transformations, and projections of a vector onto other vectors and how coordinate transformations and vector projections provide the statistician with some of the basic tools used in curve fitting, principal components, and other associated topics in multivariate analysis.

    Chapters 3–7 form the main body of this book, namely, matrices and their properties and how these in turn are used in statistical analysis and model building. Chapter 3 defines general types of matrices, operations on matrices, scalar functions of the elements of a matrix, and how matrices are used to study various types of linear transformations. In Chapter 4 the discussion is narrowed down to more specialized matrices such as idempotent, nilpotent, orthogonal, Grammian, and projection matrices, which play a fundamental role in statistics. Chapter 5 is devoted exclusively to latent roots and vectors, whereas Chapter 6 considers a relatively recent development in matrix theory and linear statistical models—generalized matrix inverses and their unifying role in statistical estimation. Finally, Chapter 7 considers some of the more important elements of graphs, nonnegative and diagonally dominant matrices, and how these can be employed in the study of Markov chains and other associated models used in projections and forecasting.

    Not all chapters will be of equal interest to each reader. For those whose interest lies in multivariate models and estimations but who have not had any previous exposure to linear vector spaces, Chapters 1–5 will be of relevance (with Chapter 6 providing an additional topic of a more advanced nature), whereas readers already familiar with vectors and vector spaces can begin with Chapter 3. Students who are more concerned with discrete stochastic processes and time projections can replace Chapter 6 with Chapter 7.

    The author wishes to express thanks to D. S. Grant of the Department of Mathematics at the University of Winnipeg, who read the entire manuscript and suggested many useful changes. I would also like to thank S. A. Hathout of the Department of Geography, who has provided original data for a numerical example, as well as T. J. Kuz of the same department and A. Anderson of the University of Massachusetts for many useful discussions. The final preparation of this volume was made possible by a research grant from the University of Winnipeg. I alone am responsible for any defects in the manuscript.

    Alexander Basilevsky

    Winnipeg, Manitoba

    Chapter 1

    Vectors

    1.1 Introduction

    In applied quantitative work matrices arise for two main reasons—to manipulate data arranged in tables and to solve systems of equations. A real matrix A is defined as an n × k rectangular array

    A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{bmatrix}

    where the real aij (i = 1, 2,..., n; j = 1, 2,..., k) comprising the elements of A have either known or unknown values. When n = 3 and k = 2, we have the 3 × 2 matrix

    A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}

    and when the elements are known we may have, for example,

    A = \begin{bmatrix} 2 & 0 \\ \cdot & \cdot \\ \cdot & 8 \end{bmatrix}

    where a11 = 2, a12 = 0,..., a32 = 8. The subscripts i and j are convenient index numbers that indicate the row and column of aij, respectively. For the special case when n = k =1, matrix A reduces to a single number, referred to as a scalar. When n = 1 (k =1) we obtain a row (column) array, or a vector, which can be viewed as a particular type of matrix. Alternatively, a matrix can be considered as a set of vectors, which in turn consist of real scalar numbers. Each view has its own particular merit, but for the sake of exposition it is useful to first consider properties of vectors and to then extend these properties to matrices. Geometrically, a vector is represented as a point in a Cartesian system of coordinate axes and is frequently depicted by a straight arrow (Figure 1.1); however, it is important to keep in mind that a vector is in fact a point, and not a straight line.


    Figure 1.1 A parallel translation of a vector V to a new location V*.
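    As a small illustration (not part of the original text), a matrix of this kind can be held as a two-dimensional NumPy array; only a11 = 2, a12 = 0, and a32 = 8 are taken from the example above, and the remaining entries are made up.

```python
import numpy as np

# Hypothetical 3 x 2 matrix: a11 = 2, a12 = 0, a32 = 8 come from the text,
# the other entries are placeholders chosen for illustration.
A = np.array([[2, 0],
              [5, 1],
              [3, 8]])

print(A.shape)    # (3, 2): n = 3 rows, k = 2 columns
print(A[0, 0])    # a11 = 2  (NumPy indexes from 0, the text from 1)
print(A[2, 1])    # a32 = 8
print(A[0, :])    # the first row, a row vector
print(A[:, 1])    # the second column, a column vector
```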

    1.2 Vector Operations

    Although geometric representations of vectors can be intuitive aids and will be used frequently in the following chapters, they are less helpful for defining the basic properties of vectors, which is best achieved by algebra.

    Let a set of n numbers ai (i = 1, 2, ..., n) be represented in the linear¹ array (a1, a2,...,an) where, in general, interchanging any two (or more) numbers results in a different set. The set (a2, a1,..., an), for example, is not the same as (a1, a2,...,an), unless a1 = a2. For this reason a vector is said to be ordered. Such ordered sets of numbers are generally referred to as vectors, and the scalars ai are known as components of a vector. The components are measured with respect to the zero vector 0 = (0,0,..., 0), which serves as the origin point. Not all vector systems employ zero as the origin, but we confine our treatment to those systems that do. The total number of components n is known as the dimension of the vector. A more precise definition of the notion of dimensionality is given in Chapter 2.

    A vector obeys the following algebraic rules.

    1.2.1 Vector Equality

    Let

    A = (a1, a2,...,an), B = (b1,b2,...,bn)

    denote any two n-dimensional vectors. Then vectors A and B are said to be equal if and only if ai = bi for all i = 1,2,...,n. Two equal vectors are written as A = B, and the equality therefore holds only if the corresponding elements of A and B are equal. Note that two vectors can be equal only when they contain the same number of components.

    Example 1.1. The two vectors

    A = (3,8,1), B = (3,8,1)

    are equal.

    1.2.2 Addition

    Consider any three n-dimensional vectors

    A= (a1, a2,..., an), B= (b1, b2,..., bn), C= (c1, c2,..., cn)

    The addition of two vectors,

    A = B + C,

    is defined as

    (a1, a2, ..., an) = (b1 + c1, b2 + c2, ..., bn + cn).

    The vector A is thus defined in terms of sums of corresponding elements of B and C. Note that in order to be conformable for addition, B and C must again contain the same number of components.
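    A minimal Python sketch (not in the original) of the definition just given; the vectors anticipate Example 1.2 below, and the length check enforces the conformability requirement.

```python
def vector_add(B, C):
    """Componentwise sum A = B + C, as in the definition above."""
    if len(B) != len(C):
        raise ValueError("vectors must have the same number of components")
    return tuple(b + c for b, c in zip(B, C))

print(vector_add((3, -1, 0), (1, 4, -7)))   # (4, 3, -7)
```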

    Vector addition obeys the following axioms.

    1. The commutative law for addition,

    B + C = C + B.

    2. The associative law for addition,

    A + ( B + C ) = ( A + B ) + C.

    3. There exists a zero (null) vector 0 = (0, 0, ... , 0) such that

    A + 0 = A.

    4. For any vector A there exists a negative vector − A = (− a1, − a2, ..., − an) such that

    A + ( − A ) = 0.

    It is straightforward to show that the negative vector of − A is the vector A.

    Theorem 1.1. The negative vector of − A is the vector A.

    PROOF: Let A+ be the negative vector of A, and A* the negative vector of − A. We will now prove that A* = A. By rule 4 we have

    (− A) + A* = A + A+ = 0.

    Adding A to both sides of the equation yields

    A + (− A) + A* = A + A+ + A

    or

    0 + A* = A + 0 (rule 4).

    Thus

    A* = A (rule 3). □

    1.2.3 Scalar Multiplication

    If A = (a1, a2, ..., an) is any n-dimensional vector and k any scalar, then the scalar product

    kA = k (a1, a2, ..., an)

    = (ka1, ka2, ..., kan)

    is a uniquely determined vector that obeys the following laws.

    1. The commutative law:

    kA = Ak.

    2. The associative law:

    k1(k2A) = (k1k2)A,

    where k1 and k2 are any two scalars.

    3. The following products hold for the scalars 0 and 1:

    0A = 0, 1A = A, (−1) A = − A.

    Rule 4 of Section 1.2.2 can therefore be expressed in the alternative form

    A + (−1)A = A − A = 0,

    which effectively establishes vector subtraction.

    1.2.4 Distributive Laws

    Combining vector addition and scalar multiplication, we obtain the following two distributive laws:

    (k1+ k2)A = k1A + k2A,

    k(A + B) = kA + kB,

    where k, k1, and k2 are scalars and A and B are any two n-component vectors.
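    The rules of Sections 1.2.2–1.2.4 are easy to check numerically; the sketch below (not from the text) does so for a pair of arbitrary vectors using plain Python tuples.

```python
def add(A, B):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(A, B))

def scale(k, A):
    """Scalar multiple kA."""
    return tuple(k * a for a in A)

A, B = (3, -1, 0), (1, 4, -7)
k1, k2 = 2, 5

assert scale(k1 + k2, A) == add(scale(k1, A), scale(k2, A))     # (k1 + k2)A = k1A + k2A
assert scale(k1, add(A, B)) == add(scale(k1, A), scale(k1, B))  # k(A + B) = kA + kB
assert scale(0, A) == (0, 0, 0) and scale(1, A) == A            # 0A = 0, 1A = A
assert add(A, scale(-1, A)) == (0, 0, 0)                        # A + (-1)A = 0
print("all rules verified")
```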

    The following two examples provide an illustration of the above rules.

    Example 1.2 Find the sum and difference of the vectors A = (3, −1,0) and B = (1,4, − 7).

    SOLUTION: We have

    A + B = (3 + 1, −1 + 4, 0 + (−7)) = (4, 3, −7)

    and

    A − B = (3 − 1, −1 − 4, 0 − (−7)) = (2, −5, 7).

    Example 1.3 Find, for the vector e9780486153377_i0010.jpg , the negative vector − A.

    SOLUTION:

    e9780486153377_i0011.jpg

    Thus A + (− A) = A − A = 0.

    1.3 Coordinates of a Vector

    Since basic operations can be defined in terms of their components, it is natural to extend this method to cover other properties of vectors. This can be achieved by defining a system of coordinate axes that provide numerical scales along which all possible values of the components of a vector are measured. It is more common to use orthogonal Cartesian coordinate systems, although such practice is not essential. Indeed, in the following chapters we shall have occasion to refer to oblique (nonorthogonal) systems.

    With coordinate axes every component of a vector can be uniquely² associated with a point on an axis. The axes therefore serve as a reference system in terms of which concepts such as dimension, length, and linear dependency can be defined. Consider the two-dimensional parallel vectors V and V*, as in Figure 1.1. The two vectors are of equal length; components a1, b1, and x1 are measured along the horizontal axis, and a2, b2 and x2 along the vertical axis. A vector can originate and terminate at any point,³ but it is convenient to standardize the origin to coincide with the zero vector (0, 0) = 0. We then have

    e9780486153377_i0012.jpg

    (1.1)

    and setting x1 = x2 = 0, we have V = V*. Displacing a vector to a new parallel position (or, equivalently, shifting the vertical and horizontal axes in a parallel fashion) leaves that vector unchanged, and in such a system equal vectors possess equal magnitudes and direction. A parallel displacement of a vector (coordinate axes) is known as a translation.

    Although the requirement that vectors originate at the zero point simplifies a coordinate system, this is achieved at a cost; it now becomes meaningless to speak of parallel vectors. However, an equivalent concept is that of collinearity. Two vectors are said to be collinear if they lie on the same straight line. Collinearity (and multicollinearity) will be dealt with more fully when we consider the linear dependence of vectors. For the moment we note that collinear vectors need not point in the same direction, since their terminal points can be separated by an angle of 180°. Thus if vector V1 is collinear with V2, then − V1 is also collinear with V2, since even though V1 and − V1 point in opposite directions, they nevertheless lie on the same straight line.
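    As a sketch of the collinearity test described here (with made-up vectors, not the book's), two vectors lie on the same straight line through the origin exactly when the matrix formed from them has rank at most one:

```python
import numpy as np

def collinear(v1, v2):
    """True if v1 and v2 lie on the same straight line through the origin."""
    return np.linalg.matrix_rank(np.vstack([v1, v2])) <= 1

V1 = np.array([1.0, 2.0])
print(collinear(V1, -3 * V1))               # True: opposite directions, same line
print(collinear(V1, np.array([2.0, 5.0])))  # False
```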

    Once the coordinates of a vector are defined in terms of orthogonal axes, the basic vector operations of Section 1.2 can be given convenient geometric representation. For example, vector addition corresponds to constructing the diagonal vector of a parallelogram (Figure 1.2). Also, vector components can themselves be defined in terms of vector addition, since for an orthogonal three-dimensional system any vector Y = (y1, y2, y3) can be written as

    Y = (y1, y2, y3) = (y1, 0, 0) + (0, y2, 0) + (0, 0, y3)

    (1.2)

    (see Figure 1.3). More generally, the coordinate numbers of any n-dimensional vector can be easily visualized as a set of n component vectors that make up the vector Y.

    Figure 1.2 Addition of two vectors V1 and V2 in two-dimensional space.

    Figure 1.3 The components of a three-dimensional vector Y = (y1, y2, y3).
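    A short sketch (with a hypothetical vector, not one from the text) of the decomposition in Equation (1.2): a vector is the sum of its component vectors along the coordinate axes.

```python
import numpy as np

Y = np.array([2.0, -1.0, 4.0])   # hypothetical three-dimensional vector

# Component vectors along the three orthogonal axes, as in Equation (1.2)
components = [np.array([Y[0], 0.0, 0.0]),
              np.array([0.0, Y[1], 0.0]),
              np.array([0.0, 0.0, Y[2]])]

print(sum(components))           # [ 2. -1.  4.], i.e., Y is recovered
```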

    1.4 The Inner Product of Two Vectors

    A vector product can be defined in several ways, the two better known products being the vector (or cross) product and the scalar (inner) product.⁴ In what follows we shall be mainly concerned with the scalar product, which, as its name suggests yields a scalar rather than a vector quantity. The importance of the inner product is derived from its use as a measure of association between two vectors and the length (magnitude) of a vector. As a result, it is one of the more widely employed indexes of linear association in quantitative research. Vector spaces for which the inner product is defined are known as Euclidean vector spaces. For the time being we shall leave the concept of a vector space undefined, but, loosely speaking, a vector space consists of a collection of vectors that possess certain properties. Since the inner product makes use of right-angled triangles, we will briefly review some of their properties first.

    1.4.1 The Pythagorean Theorem

    In a two-dimensional Euclidean vector space the magnitude of a vector is defined by the Pythagorean theorem, which relates the length of the hypotenuse of a right-angled triangle to those of the remaining two orthogonal (perpendicular) sides. Consider the two squares of Figure 1.4, where the vertices of the inner square partition the four sides of the larger square into two constant lengths a and b. The area of the larger square is given by

    (a + b)² = a² + 2ab + b²

    (1.3)

    Since the four right-angled triangles with sides a, b, and c are congruent, they form two rectangles when joined, each with area ab. The area of the inner square is then equal to the difference

    c² = (a + b)² − 2ab

    from Equation (1.3), so that

    c² = a² + b²

    (1.4)
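    A quick numerical check of the argument (not in the original), using the familiar 3–4–5 right triangle:

```python
a, b = 3.0, 4.0
c_squared = (a + b) ** 2 - 2 * a * b   # outer square minus the four triangles, per Equation (1.3)
print(c_squared)                       # 25.0
print(c_squared == a ** 2 + b ** 2)    # True, as Equation (1.4) asserts
```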

    1.4.2 Length of a Vector and Distance Between Two Vectors

    Given Equation (1.4), the length of a two-dimensional vector X= (x1, x2) can be expressed as

    ∥X∥ = (x1² + x2²)¹/²

    (1.5)


    Figure 1.4 The Pythagorean theorem illustrated with the areas of two squares.

    Also, from Figure 1.5, the distance between the vectors (points) X1 = (x11, x12) and X2 = (x21, x22) is the length of vector X3 = X2 − X1, where

    X3 = X2 − X1 = (x21 − x11, x22 − x12),

    since translation of vectors does not alter length or direction (see Section 1.3). Thus

    ∥X3∥² = (x21 − x11)² + (x22 − x12)²
          = (x11² + x12²) + (x21² + x22²) − 2(x11x21 + x12x22)
          = ∥X1∥² + ∥X2∥² − 2X1 ∙ X2,

    (1.6)

    where we let X1 ∙ X2 = x11x21 + x12x22. The squared distance ∥X3∥² can therefore be decomposed into two parts: the first consists of the squared lengths of X1 and X2, and the second, 2X1 ∙ X2, depends on the interaction (nonorthogonality) between X1 and X2. It will be shown later that X1 ∙ X2 = 0 when the two vectors are orthogonal. In that particular case expression (1.6) becomes identical to Pythagoras’ theorem. This suggests that the scalar X1 ∙ X2 can be used to measure the extent of nonorthogonality between X1 and X2. Evidently, in a certain sense orthogonal vectors are independent of each other, so X1 ∙ X2 also provides us with a measure of the degree of association between X1 and X2. The scalar X1 ∙ X2 is known as the inner (scalar) product of the two vectors.


    Figure 1.5 The distance between two vectors X1 and X2 as the length of the difference X3 = X2 − X1.
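    The decomposition in Equation (1.6) is easy to verify numerically; the sketch below uses two hypothetical two-dimensional vectors (not taken from the text).

```python
import numpy as np

X1 = np.array([1.0, 2.0])     # hypothetical vectors
X2 = np.array([4.0, -1.0])

lhs = np.sum((X2 - X1) ** 2)                             # squared length of X3 = X2 - X1
rhs = np.sum(X1 ** 2) + np.sum(X2 ** 2) - 2 * (X1 @ X2)  # right-hand side of Equation (1.6)
print(lhs, rhs)                                          # 18.0 18.0
```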

    1.4.3 The Inner Product and Norm of a Vector

    As illustrated by Equation (1.6), the inner product X1 ∙ X2 is given by the sum of the products of the vector components. More generally we have the following definitions.

    Definition 1.1. Let X1 = (x11, x12,..., x1n) and X2 = (x21, x22,...,x2n) be any two n-dimensional (finite) vectors. Then the inner product of X1 and X2 is the real-valued scalar

    X1 ∙ X2 = x11x21 + x12x22 + ... + x1nx2n

    (1.7)

    Note that X1 ∙ X2 can assume negative as well as positive values.

    Definition 1.2. Let X1 = (x11, x12,..., x1n) and X2 = (x21, x22,...,x2n) be any two n-dimensional (finite) vectors. Then the distance between X1 and X2 is given by the (nonnegative) function

    d(X1, X2) = ∥X1 − X2∥ = [(x11 − x21)² + (x12 − x22)² + ... + (x1n − x2n)²]¹/²

    (1.8)

    Definition 1.3. Let X1 = (x11, x12,...,x1n) be any n-dimensional (finite) vector. Then the length, or norm, of X1 is given by the (nonnegative) function

    ∥X1∥ = (X1 ∙ X1)¹/² = (x11² + x12² + ... + x1n²)¹/²

    (1.9)

    Note that the norm can be considered as the distance from the origin. Evidently the vector norm (1.9) provides a multidimensional generalization of Pythagoras’ theorem; however, both distance and vector magnitude depend on the inner product.
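    Definitions 1.1–1.3 translate directly into code; a minimal sketch follows (the vectors are made up for illustration).

```python
import math

def inner(X1, X2):
    """Inner product of Definition 1.1."""
    return sum(a * b for a, b in zip(X1, X2))

def norm(X):
    """Norm of Definition 1.3, i.e., the distance from the origin."""
    return math.sqrt(inner(X, X))

def distance(X1, X2):
    """Distance of Definition 1.2."""
    return norm([a - b for a, b in zip(X1, X2)])

X1, X2 = (1.0, 2.0, 2.0), (4.0, 6.0, 2.0)   # hypothetical vectors
print(inner(X1, X2))      # 20.0
print(norm(X1))           # 3.0
print(distance(X1, X2))   # 5.0
```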

    The inner product obeys the following operations.

    Theorem 1.2. The inner product is distributive over addition (subtraction), so that

    X1 ∙ (X2 ± X3) = X1 ∙ X2 ± X1 ∙ X3,

    (1.10)

    where X1, X2, and X3 are any n-dimensional vectors.

    PROOF: We have, for addition,

    X1 ∙ (X2 + X3) = x11(x21 + x31) + x12(x22 + x32) + ... + x1n(x2n + x3n)
                   = (x11x21 + x12x22 + ... + x1nx2n) + (x11x31 + x12x32 + ... + x1nx3n)
                   = X1 ∙ X2 + X1 ∙ X3.

    A similar result holds for subtraction. □

    Theorem 1.3. The inner product is commutative, that is,

    X1 ∙ X2 = X2 ∙ X1,

    where X1 and X2 are any n-dimensional vectors.

    PROOF: The proof consists in noting that any ith element x1ix2i of X1 ∙ X2 can also be written as x2ix1i, which is the ith element of X2 ∙X1. □

    Theorem 1.4. The inner product is commutative with respect to scalar multiplication, so that

    X1 ∙ (kX2) = k ( X1 ∙ X2) = ( X1 ∙ X2) k,

    where k is any scalar and X1 and X2 are n-dimensional vectors.

    PROOF: The proof again consists of expanding the inner products in terms of the vector components. □

    Corollary. Multiplying a vector by a positive scalar alters magnitude but not direction.

    PROOF: Let k > 0 be any scalar. Then the magnitude of the vector kX1 is

    ∥kX1∥ = [(kx11)² + (kx12)² + ... + (kx1n)²]¹/² = k(x11² + x12² + ... + x1n²)¹/² = k∥X1∥,

    so that for k ≠ 1 scalar multiplication either magnifies or shrinks the length of X1, depending on the magnitude of k. Since the relative magnitudes and the positions of the components x1i are not changed, it follows that multiplication by k does not alter the direction of X1. □
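    The corollary can also be checked numerically: scaling by k > 0 multiplies the norm by k but leaves the normalized direction unchanged. A small sketch with a made-up vector:

```python
import numpy as np

X1 = np.array([3.0, -1.0, 2.0])   # hypothetical vector
k = 2.5

print(np.isclose(np.linalg.norm(k * X1), k * np.linalg.norm(X1)))   # True: length scales by k

u1 = X1 / np.linalg.norm(X1)                 # direction of X1
u2 = (k * X1) / np.linalg.norm(k * X1)       # direction of kX1
print(np.allclose(u1, u2))                   # True: direction unchanged
```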

    The inner product of Definition 1.1 is a binary operation, since only two vectors are involved. The concept, however, may be extended to more than two vectors. For example, we can define the product

    X1X2X3 = x11x21x31 + x12x22x32 + ... + x1nx2nx3n.

    Such products, however, are not frequently employed.

    1.5 The Dimension of a Vector: Unit Vectors

    An important concept in linear algebra is that of dimension, which plays a key role in the theory of vector spaces and their application to quantitative research and statistical data analysis. We are all aware of the existence of spatial dimensions in our physical (geographical) space. Thus the usual physical space that we perceive, such as a room, is three dimensional; the flat surface of a wall is two dimensional, and the straight line where wall and ceiling intersect is one dimensional. Naturally this refers to perfect (ideal) geometric objects, since in practice there is always some random distortion. Thus our physical space lends itself naturally to dimensional description.

    Other phenomena can also be depicted or modeled by means of dimensions. Consider two commuters A and B, who travel to work in the city center. Assume that A lives farther from the center of town than B, but has access to a more rapid means of transportation (train versus bus). It is then possible for A to reach the city center before B. Thus although A lives farther from the city center in terms of mileage, he is nevertheless closer in terms of the time dimension. Similarly, a third individual C, who lives even farther away from the city center than both A and B, but who can afford a more expensive mode of travel (helicopter, say), is closer to the city center in yet another dimension, a wealth dimension. We therefore have three potential determinants of travel time: distance (mileage), speed of travel (miles per hour), and income (dollar amount per year). Travel time can therefore be described in terms of three dimensions.

    The concept of dimensionality also permits us to organize and interpret data in a systematic manner. Consider four consecutive yearly time periods, the years 1977–1980, during which data are collected on the following variables:

    Y = number of fatal road accidents per year (hundreds),

    X1 = time (year),

    X2 = volume of automobile sales in a given year (thousands),

    X3 = maximum speed limit in miles per hour.

    We may then be able to relate the accident rate data Y with X1, X2, and X3 as

    Y = β1X1 + β2X2 + β3X3

    (1.11)

    or, in terms of hypothetical data (column vectors)

    e9780486153377_i0031.jpg

    where β1, β2, and β3 are arbitrary coefficients to be determined. The vector Y= (12, 14, 11, 10) is then said to be embedded in a three-dimensional vector space. However, should X1 and X2 be collinear, then our accident space reduces to two dimensions, since one of the vectors (X1 or X2) becomes completely redundant.

    A linear equation such as Equation (1.11) is also known as a linear combination of the vectors X1, X2, and X3. We have the following definitions concerning linear combinations of vectors.

    Definition 1.4. An n-dimensional vector Y is said to be linearly dependent on a set of n-dimensional vectors X1, X2, ... , Xk if and only if Y can be expressed as a linear combination of these vectors.

    Definition 1.5. A set of n-dimensional vectors X1, X2,...,Xk is said to be linearly interdependent if and only if there exist scalars β1, β2, ..., βk, not all zero, such that

    β1X1 + β2X2 + ... + βkXk = 0

    (1.12)

    Evidently, if a set of vectors is linearly interdependent, then any one of the vectors, say X1, can be expressed as

    X1 = −(β2/β1)X2 − (β3/β1)X3 − ... − (βk/β1)Xk,

    and X1 therefore depends on X2, X3,...,Xk, assuming that β1 ≠ 0. Likewise, when

    Y = α1X1 + α2X2 + ... + αkXk

    (1.13)

    as per Definition 1.4, then Equation (1.13) can be rewritten as

    Y − α1X1 − α2X2 − ... − αkXk = 0,

    so that Y, X1, X2, ..., Xk are linearly interdependent. Linear dependency can therefore be viewed either in terms of Definition 1.4 or 1.5.

    Example 1.4. The vector Y = (11, 16, 21 ) is linearly dependent on vectors X1 = (1, 2, 3) and X2 = (4,5,6), since

    (11, 16, 21) = 3(1,2,3)+2(4,5,6);

    also, it can be verified that

    (11, 16, 21) − 3(1, 2, 3) − 2(4, 5, 6) = (0, 0, 0) = 0,

    so that Definition 1.5 applies, where we let

    β1 = 1, β2 = − 3, β3 = − 2.
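    Example 1.4 can be confirmed with a few lines of NumPy; the rank computation at the end (an addition, not in the text) exhibits the same interdependence as a rank deficiency.

```python
import numpy as np

Y  = np.array([11, 16, 21])
X1 = np.array([1, 2, 3])
X2 = np.array([4, 5, 6])

print(np.array_equal(Y, 3 * X1 + 2 * X2))               # True: Y = 3X1 + 2X2 (Definition 1.4)
print(np.all(Y - 3 * X1 - 2 * X2 == 0))                 # True: Definition 1.5 with (1, -3, -2)
print(np.linalg.matrix_rank(np.vstack([Y, X1, X2])))    # 2, not 3: the vectors are interdependent
```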

    Equation (1.11) is also an example of the linear dependence of vector Y on X1, X2, and X3, where coordinates consist of observed data. In applied work, however, it is frequently impossible to compare the magnitudes of coordinates or vectors. The reason for this is that coordinate magnitudes often reflect differences in measurement units, which in turn may not be comparable. For example, there is nothing significant in the fact that the components of vector X1 are much larger than those of X3, since the units associated with these vectors are not the same. Consequently X1 possesses a much larger magnitude than X3. Also, it is not possible to rescale any one of the two vectors, since their measurement units are not qualitatively comparable. For this reason vector components are frequently transformed to relative magnitudes by rescaling vectors to unit length.

    Definition 1.6. A vector Y is termed a unit vector if and only if it possesses unit length, i.e., ∥Y∥ = 1.

    Example 1.5.

    i. The vectors

    E1 = (1,0,0), E2 = (0, 1, 0), E3 = (0, 0, 1)

    are three-dimensional unit vectors, since (see also Figure 1.3)

    ∥E1∥ = ∥E2∥ = ∥E3∥ = 1.

    ii. The vector

    E = (0.267,0.534,0.802)

    is a three-dimensional unit vector, since

    ∥E∥ = (0.267² + 0.534² + 0.802²)¹/² = 1.

    Definition 1.7. Two unit vectors Ei and Ej are mutually orthogonal (perpendicular) if and only if

    Ei ∙ Ej = 0 (i ≠ j).

    (1.14)

    Example 1.6. The vectors E1, E2, and E3 of the previous example are orthogonal unit vectors.

    It will be shown later that any n-dimensional vector can be expressed as a linear combination of n orthogonal unit vectors. For example, when the unit vectors are of the form given by Example 1.5(i), we have the particularly simple form

    Y = (y1, y2, ..., yn) = y1E1 + y2E2 + ... + ynEn

    (1.15)

    When the unit vectors do not consist of zeros and units, we obtain the more general form Y = a1E1 + a2E2 + ... + anEn. The vector inner product can also be written in terms of Equation (1.15) as

    Y ∙ Y = y1² + y2² + ... + yn² = ∥Y∥²,

    which is the same as Definition 1.3.

    Theorem 1.5. Any n-dimensional vector can be standardized to unit length.

    PROOF: Let Y = (y1, y2,...,yn) be any n-dimensional vector with length ∥Y∥. Let

    Y* = Y/∥Y∥ = (y1/∥Y∥, y2/∥Y∥, ..., yn/∥Y∥).

    Then Y* is a unit vector, since

    ∥Y*∥ = [(y1/∥Y∥)² + (y2/∥Y∥)² + ... + (yn/∥Y∥)²]¹/² = (1/∥Y∥)(y1² + y2² + ... + yn²)¹/² = ∥Y∥/∥Y∥ = 1. □

    A vector that is standardized to unit length is also said to be normalized (to unit length). Orthogonal unit vectors are also referred to as orthonormal vectors.

    Example 1.7. To normalize Y1 = (2,3,4) and e9780486153377_i0040.jpg to unit length we proceed as follows. We have

    e9780486153377_i0041.jpg

    and unit vectors e9780486153377_i0042.jpg and e9780486153377_i0043.jpg are given by

    e9780486153377_i0044.jpg

    since

    e9780486153377_i0045.jpg

    Unit vectors, however, need not be orthogonal, since

    e9780486153377_i0046.jpg

    Also note that inner products between unit vectors lie in the closed interval [−1,1], as do components of the unit vectors.
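    A short numerical sketch of Theorem 1.5 and the first part of Example 1.7 (the second vector Y2 is not reproduced in this preview, so only Y1 = (2, 3, 4) is used):

```python
import numpy as np

Y1 = np.array([2.0, 3.0, 4.0])          # the vector Y1 of Example 1.7
Y1_star = Y1 / np.linalg.norm(Y1)       # normalized to unit length (Theorem 1.5)

print(np.linalg.norm(Y1))               # 5.385... = sqrt(29)
print(Y1_star)                          # approx. [0.371  0.557  0.743]
print(np.linalg.norm(Y1_star))          # 1.0
```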

    Orthogonal unit vectors can be interpreted as orthogonal coordinate axes of unit length [see Equation (1.15)]; however, other representations of orthogonal axes are also possible.

    Example 1.8. Let e9780486153377_i0047.jpg be a vector whose coordinates are given with respect to the unit vector axes E1 = (1,0) and E2 = (0,1).

    i. Find the components of Y with respect to axes X1 = (2, 0) and X2 = (0,3).

    ii. Find the magnitude of Y with respect to both coordinate axes.

    SOLUTION:

    i. Y can be expressed as

    e9780486153377_i0048.jpg

    with respect to E1 and E2. To find the components of Y relative to X1 = (2, 0) and X2 = (0,3) we have

    e9780486153377_i0049.jpg

    where a and b are the unknown components. Then Y = (2a,3b) or e9780486153377_i0050.jpg . Since equal vectors must have equal components, we obtain e9780486153377_i0051.jpg and e9780486153377_i0052.jpg . Thus e9780486153377_i0053.jpg are the components with respect to X1 and X2; that is,

    e9780486153377_i0054.jpg

    ii. Since changing the reference system cannot alter vector magnitude, we have

    e9780486153377_i0055.jpg

    where e9780486153377_i0056.jpg . Similarly, we find

    e9780486153377_i0057.jpg

    since E1 ∙ E2 = 0.
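    The vector Y of Example 1.8 is not reproduced in this preview, so the sketch below repeats the computation for a hypothetical Y = (4, 6): solving Y = aX1 + bX2 gives the components relative to the new axes, and the magnitude is the same in either reference system.

```python
import numpy as np

Y  = np.array([4.0, 6.0])        # hypothetical vector; the book's Y is not shown in the preview
X1 = np.array([2.0, 0.0])
X2 = np.array([0.0, 3.0])

# Solve Y = a*X1 + b*X2 for the components (a, b) relative to the axes X1, X2
a, b = np.linalg.solve(np.column_stack([X1, X2]), Y)
print(a, b)                                  # 2.0 2.0

print(np.linalg.norm(Y))                     # sqrt(52) = 7.211...
print(np.linalg.norm(a * X1 + b * X2))       # the same magnitude in the new system
```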

    1.6 Direction Cosines

    An n-dimensional vector
